WorldWideScience

Sample records for classification algorithm aimed

  1. Development and Validation of a Spike Detection and Classification Algorithm Aimed at Implementation on Hardware Devices

    Directory of Open Access Journals (Sweden)

    E. Biffi

    2010-01-01

    Neurons cultured in vitro on MicroElectrode Array (MEA) devices connect to each other, forming a network. To study electrophysiological activity and long-term plasticity effects, long-period recording and spike-sorting methods are needed. Therefore, on-line and real-time analysis, optimization of memory use, and improvement of the data transmission rate become necessary. We developed an algorithm for amplitude-threshold spike detection, whose performance was verified with (a) statistical analysis on both simulated and real signals and (b) Big O notation. Moreover, we developed a PCA-hierarchical classifier, evaluated on simulated and real signals. Finally, we proposed a spike detection hardware design on FPGA, whose feasibility was verified in terms of number of CLBs, memory occupation, and temporal requirements; once realized, it will be able to execute on-line detection and real-time waveform analysis, reducing data storage problems.
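
    The amplitude-threshold detection named in this abstract can be illustrated with a minimal sketch. The record does not say how the threshold is chosen, so the noise estimate (median absolute deviation divided by 0.6745), the multiplier `k`, and the `refractory` window below are all assumptions made for illustration, not the authors' design.

```python
from statistics import median


def detect_spikes(signal, k=5.0, refractory=3):
    """Return sample indices where the signal crosses an amplitude threshold.

    Assumptions of this sketch (not taken from the abstract): the threshold
    is k times a robust noise estimate (MAD / 0.6745), and a detection
    blocks further detections for `refractory` samples.
    """
    center = median(signal)
    sigma = median(abs(x - center) for x in signal) / 0.6745
    threshold = k * sigma
    spikes = []
    last = -refractory - 1
    for i, x in enumerate(signal):
        # Keep the first crossing and skip samples inside the refractory window.
        if abs(x - center) > threshold and i - last > refractory:
            spikes.append(i)
            last = i
    return spikes


# Synthetic trace (made up): low-amplitude noise with two injected spikes.
trace = [0.1, -0.2, 0.15, -0.1, 0.05, 4.0, 3.0, -0.1,
         0.2, -0.15, 0.1, -0.05, -5.0, 0.1, -0.2, 0.1]
print(detect_spikes(trace))
```

    The loop makes a single pass over the samples, so detection itself is linear in the trace length once the threshold is known.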

  2. Recursive automatic classification algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Bauman, E V; Dorofeyuk, A A

    1982-03-01

    A variational statement of the automatic classification problem is given. The dependence of the form of the optimal partition surface on the form of the classification objective functional is investigated. A recursive algorithm is proposed for maximising a functional of reasonably general form. The convergence problem is analysed in connection with the proposed algorithm. 8 references.

  3. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Given the explosive growth of data, the increasing volume of text places higher demands on text categorization performance that existing classification methods cannot satisfy. Based on a study of existing text classification technology and semantics, this paper puts forward a SAW (Structural Auxiliary Word) algorithm oriented toward Chinese text classification. The algorithm uses the special space effect of Chinese text where words...

  4. Unsupervised Classification Using Immune Algorithm

    OpenAIRE

    Al-Muallim, M. T.; El-Kouatly, R.

    2012-01-01

    An unsupervised classification algorithm based on the clonal selection principle, named Unsupervised Clonal Selection Classification (UCSC), is proposed in this paper. The proposed algorithm is data-driven and self-adaptive: it adjusts its parameters to the data to make the classification operation as fast as possible. The performance of UCSC is evaluated by comparing it with the well-known K-means algorithm using several artificial and real-life data sets. The experiments show that the proposed U...

  5. A Semisupervised Cascade Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Stamatis Karlos

    2016-01-01

    Classification is one of the most important tasks of data mining techniques, which have been adopted by several modern applications. The shortage of labeled data in the majority of these applications has shifted interest towards semisupervised methods. Under such schemes, the use of collected unlabeled data combined with a clearly smaller set of labeled examples leads to similar or even better classification accuracy compared with supervised algorithms, which use labeled examples exclusively during the training phase. A novel approach for improving semisupervised classification using the Cascade Classifier technique is presented in this paper. The main characteristic of the Cascade Classifier strategy is the use of a base classifier to enlarge the feature space by adding either the predicted class or the class probability distribution of the initial data. The second-level classifier is supplied with the new dataset and extracts the decision for each instance. In this work, a self-trained NB∇C4.5 classifier algorithm is presented, which combines the characteristics of Naive Bayes as a base classifier and the speed of C4.5 for final classification. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets and concluded that the presented technique has better accuracy in most cases.
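
    The feature-augmentation step of a cascade, as described in this abstract, can be sketched as follows. This is not the NB∇C4.5 algorithm: as loudly labeled stand-ins, a nearest-centroid rule replaces Naive Bayes at the first level, a 1-nearest-neighbour rule replaces C4.5 at the second level, and the self-training loop over unlabeled data is omitted. All function names and data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_centroids(X, y):
    """Stand-in base classifier (replaces Naive Bayes): one mean per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, x):
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


def augment(centroids, X):
    """Cascade step: append the base classifier's predicted class as a feature."""
    return [list(x) + [predict_centroid(centroids, x)] for x in X]


def cascade_predict(X, y, query):
    """Second level (1-NN, replaces C4.5) trained on the augmented dataset."""
    centroids = fit_centroids(X, y)
    X_aug = augment(centroids, X)
    q_aug = augment(centroids, [query])[0]
    nearest = min(range(len(X_aug)), key=lambda j: sq_dist(X_aug[j], q_aug))
    return y[nearest]


# Made-up labeled data with numeric class labels 0 and 1.
X = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
y = [0, 0, 1, 1]
print(cascade_predict(X, y, [2.9, 3.0]))
```

    The abstract also mentions appending the class probability distribution instead of the predicted class; that variant would add one column per class rather than a single label column.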

  6. Classification algorithms using adaptive partitioning

    KAUST Repository

    Binev, Peter; Cohen, Albert; Dahmen, Wolfgang; DeVore, Ronald

    2014-01-01

    © 2014 Institute of Mathematical Statistics. Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335-1353; Mach. Learn. 66 (2007) 209-242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms of the parameter α of margin conditions and a rate s of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness β of the regression function that governs its approximability by piecewise polynomials on adaptive partitions. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Hölder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.

  7. Classification algorithms using adaptive partitioning

    KAUST Repository

    Binev, Peter

    2014-12-01

    © 2014 Institute of Mathematical Statistics. Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335-1353; Mach. Learn. 66 (2007) 209-242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms of the parameter α of margin conditions and a rate s of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness β of the regression function that governs its approximability by piecewise polynomials on adaptive partitions. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Hölder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.

  8. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on support vector machine (SVM) and gradient evolution (GE) algorithms. The SVM algorithm has been widely used in classification. However, its results are significantly influenced by its parameters. Therefore, this paper proposes an improved SVM algorithm that finds the best SVM parameters automatically. The proposed algorithm employs a GE algorithm as a global optimizer to determine the parameters used by the SVM algorithm. The proposed GE-SVM algorithm is verified using several benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than the other algorithms tested in this paper.

  9. Evolutionary Algorithms For Neural Networks Binary And Real Data Classification

    Directory of Open Access Journals (Sweden)

    Dr. Hanan A.R. Akkar

    2015-08-01

    Artificial neural networks are complex networks emulating the way human rational neurons process data. They have been widely used in prediction, clustering, classification, and association. The training algorithms used to determine the network weights are among the most important factors influencing neural network performance. Recently, many meta-heuristic and evolutionary algorithms have been employed to optimize neural network weights to achieve better performance. This paper uses recently proposed algorithms for optimizing neural network weights and compares their performance with that of classical meta-heuristic algorithms used for the same purpose. To evaluate these algorithms for training neural networks, we apply them to the classification of four opposite binary XOR clusters and of continuous real data sets such as Iris and Ecoli.

  10. Distribution Bottlenecks in Classification Algorithms

    NARCIS (Netherlands)

    Zwartjes, G.J.; Havinga, Paul J.M.; Smit, Gerardus Johannes Maria; Hurink, Johann L.

    2012-01-01

    The abundance of data available on Wireless Sensor Networks makes online processing necessary. In industrial applications, for example, the correct operation of equipment can be the point of interest while raw sampled data is of minor importance. Classification algorithms can be used to make state

  11. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    In this paper, aiming at the characteristics of Chinese text classification, we use ICTCLAS (the Chinese lexical analysis system of the Chinese Academy of Sciences) for document segmentation, clean the data and filter stop words, and apply information gain and document frequency feature selection algorithms to select document features. On this basis, a text classifier based on the Naive Bayes algorithm is implemented, and experiments and analysis are carried out on the system using the Chinese corpus of Fudan University.
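
    The Naive Bayes step of such a pipeline can be sketched on already-segmented documents. The sketch assumes a multinomial model with add-one (Laplace) smoothing, which the abstract does not specify; segmentation with ICTCLAS, stop-word filtering, and feature selection are outside its scope, and the toy tokens are English placeholders, not data from the Fudan corpus.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (tokens, label) pairs. Returns the counts used at prediction."""
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # per-class token frequencies
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior for the token list."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for t in tokens:
            if t not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one smoothing keeps unseen (class, token) pairs non-zero.
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up, pre-segmented training documents.
docs = [
    (["match", "team", "goal"], "sports"),
    (["team", "coach", "win"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "loan", "rate"], "finance"),
]
model = train_nb(docs)
print(predict_nb(model, ["team", "goal", "win"]))
```

    Working in log space avoids numeric underflow when documents contain many tokens.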

  12. Automatic modulation classification principles, algorithms and applications

    CERN Document Server

    Zhu, Zhechen

    2014-01-01

    Automatic Modulation Classification (AMC) has been a key technology in many military, security, and civilian telecommunication applications for decades. In military and security applications, modulation often serves as another level of encryption; in modern civilian applications, multiple modulation types can be employed by a signal transmitter to control the data rate and link reliability. This book offers comprehensive documentation of AMC models, algorithms and implementations for successful modulation recognition. It provides an invaluable theoretical and numerical comparison of AMC algo

  13. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  14. Detecting Hijacked Journals by Using Classification Algorithms.

    Science.gov (United States)

    Andoohgin Shahri, Mona; Jazi, Mohammad Davarpanah; Borchardt, Glenn; Dadkhah, Mehdi

    2018-04-01

    Invalid journals are recent challenges in the academic world and many researchers are unacquainted with the phenomenon. The number of victims appears to be accelerating. Researchers might be suspicious of predatory journals because they have unfamiliar names, but hijacked journals are imitations of well-known, reputable journals whose websites have been hijacked. Hijacked journals issue calls for papers via generally laudatory emails that delude researchers into paying exorbitant page charges for publication in a nonexistent journal. This paper presents a method for detecting hijacked journals by using a classification algorithm. The number of published articles exposing hijacked journals is limited and most of them use simple techniques that are limited to specific journals. Hence we needed to amass Internet addresses and pertinent data for analyzing this type of attack. We inspected the websites of 104 scientific journals by using a classification algorithm that used criteria common to reputable journals. We then prepared a decision tree that we used to test five journals we knew were authentic and five we knew were hijacked.

  15. Improved RMR Rock Mass Classification Using Artificial Intelligence Algorithms

    Science.gov (United States)

    Gholami, Raoof; Rasouli, Vamegh; Alimoradi, Andisheh

    2013-09-01

    Rock mass classification systems such as rock mass rating (RMR) are very reliable means to provide information about the quality of rocks surrounding a structure as well as to propose suitable support systems for unstable regions. Many correlations have been proposed to relate measured quantities such as wave velocity to rock mass classification systems to limit the associated time and cost of conducting the sampling and mechanical tests conventionally used to calculate RMR values. However, these empirical correlations have been found to be unreliable, as they usually overestimate or underestimate the RMR value. The aim of this paper is to compare the results of RMR classification obtained from the use of empirical correlations versus machine-learning methodologies based on artificial intelligence algorithms. The proposed methods were verified based on two case studies located in northern Iran. Relevance vector regression (RVR) and support vector regression (SVR), as two robust machine-learning methodologies, were used to predict the RMR for tunnel host rocks. RMR values already obtained by sampling and site investigation at one tunnel were taken into account as the output of the artificial networks during training and testing phases. The results reveal that use of empirical correlations overestimates the predicted RMR values. RVR and SVR, however, showed more reliable results, and are therefore suggested for use in RMR classification for design purposes of rock structures.

  16. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). A hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  17. Algorithms exploiting ultrasonic sensors for subject classification

    Science.gov (United States)

    Desai, Sachi; Quoraishee, Shafik

    2009-09-01

    Proposed here is a series of techniques exploiting micro-Doppler ultrasonic sensors capable of characterizing various detected mammalian targets based on their physiological movements, captured in a series of robust features. Employed is a combination of unique and conventional digital signal processing techniques arranged in such a manner that they become capable of classifying a series of walkers. These feature extraction processes develop a robust feature space capable of discriminating the movements generated by bipeds and quadrupeds, further subdivided into large or small. These movements can be exploited to provide specific information about a given signature, dividing it into a series of subset signatures and exploiting wavelets to generate start/stop times. After viewing a series of spectrograms of the signature we are able to see distinct differences, and utilizing kurtosis we generate an envelope detector capable of isolating each of the corresponding step cycles generated during a walk. The walk cycle is defined as one complete sequence of walking/running, starting when the foot pushes off the ground and concluding when it returns to the ground. This time information segments the events that are readily seen in the spectrogram but obstructed in the temporal domain into individual walk sequences. Each walking sequence is then translated into a three-dimensional waterfall plot defining the expected energy value associated with the motion at a particular instance of time and frequency. The value is repeatable for each particular class and can be employed to discriminate the events. Highly reliable classification is realized by exploiting a classifier trained on a candidate sample space derived from the associated gyrations created by motion from actors of interest. The classifier developed herein provides a capability to classify events as adult humans, children, horses, and dogs at potentially high rates based on the tested sample

  18. QUEST: Eliminating online supervised learning for efficient classification algorithms

    NARCIS (Netherlands)

    Zwartjes, Ardjan; Havinga, Paul J.M.; Smit, Gerard J.M.; Hurink, Johann L.

    2016-01-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting

  19. A Comparative Analysis of Classification Algorithms on Diverse Datasets

    Directory of Open Access Journals (Sweden)

    M. Alghobiri

    2018-04-01

    Data mining involves the computational process of finding patterns in large data sets. Classification, one of the main domains of data mining, involves generalizing known structure to apply it to a new dataset and predict its class. Various classification algorithms are used to classify various data sets. They are based on different methods such as probability, decision trees, neural networks, nearest neighbor, boolean and fuzzy logic, kernel-based methods, etc. In this paper, we apply three diverse classification algorithms to ten datasets. The datasets have been selected based on their size and/or number and nature of attributes. Results are discussed using performance evaluation measures such as precision, accuracy, F-measure, Kappa statistics, mean absolute error, relative absolute error, ROC Area, etc. Comparative analysis has been carried out using the performance evaluation measures of accuracy, precision, and F-measure. We specify features and limitations of the classification algorithms for datasets of diverse nature.

  20. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.; Secomb, Timothy W.; Pries, Axel R.; Smith, Nicolas P.; Shipley, Rebecca J.

    2015-01-01

    algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules

  1. Android Malware Classification Using K-Means Clustering Algorithm

    Science.gov (United States)

    Hamid, Isredza Rahmi A.; Syafiqah Khalid, Nur; Azma Abdullah, Nurul; Rahman, Nurul Hidayah Ab; Chai Wen, Chuah

    2017-08-01

    Malware is designed to gain access to or damage a computer system without the user's notice. In addition, attackers exploit malware to commit crime or fraud. This paper proposes an Android malware classification approach based on the K-Means clustering algorithm. We evaluate the proposed model in terms of accuracy using machine learning algorithms. Two datasets, Virus Total and Malgenome, were selected to demonstrate the application of the K-Means clustering algorithm. We classify the Android malware into three clusters: ransomware, scareware, and goodware. Nine features were considered for each dataset: Lock Detected, Text Detected, Text Score, Encryption Detected, Threat, Porn, Law, Copyright, and Moneypak. We used IBM SPSS Statistics software for data classification and WEKA tools to evaluate the built clusters. The proposed K-Means clustering algorithm shows promising results with high accuracy when tested using the Random Forest algorithm.
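
    As a rough illustration of the clustering step, the following is a minimal sketch of plain K-Means (Lloyd's algorithm) with three clusters. The study itself used IBM SPSS and WEKA rather than custom code; the 2-D points are made up and do not correspond to the nine features listed above, and starting the centroids at the first `k` points is a simplification chosen to keep the sketch deterministic.

```python
def kmeans(points, k, max_iters=100):
    """Plain Lloyd's algorithm. Centroids start at the first k points
    (a deterministic choice made for this sketch only)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are final
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up 2-D feature vectors forming three well-separated groups.
points = [(0, 0), (10, 10), (20, 0),
          (0, 1), (10, 11), (20, 1),
          (1, 0), (11, 10), (21, 0)]
labels, centroids = kmeans(points, k=3)
print(labels)
```

    K-Means only groups the samples; mapping each cluster to a name such as ransomware or goodware requires a separate labeling or evaluation step, which is the role the abstract assigns to the follow-up classifier.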

  2. A modular and parameterisable classification of algorithms

    NARCIS (Netherlands)

    Nugteren, C.; Corporaal, H.

    2011-01-01

    Multi-core and many-core were already major trends for the past six years, and are expected to continue for the next decades. With this trend of parallel computing, it becomes increasingly difficult to decide on which architecture to run a certain application or algorithm. Additionally, it brings

  3. An algorithm for the arithmetic classification of multilattices.

    Science.gov (United States)

    Indelicato, Giuliana

    2013-01-01

    A procedure for the construction and the classification of monoatomic multilattices in arbitrary dimension is developed. The algorithm allows one to determine the location of the points of all monoatomic multilattices with a given symmetry, or to determine whether two assigned multilattices are arithmetically equivalent. This approach is based on ideas from integral matrix theory, in particular the reduction to the Smith normal form, and can be coded to provide a classification software package.

  4. Implementation of several mathematical algorithms to breast tissue density classification

    International Nuclear Information System (INIS)

    Quintana, C.; Redondo, M.; Tirao, G.

    2014-01-01

    The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, where dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images according to the American College of Radiology classifications. These mathematical techniques are based on calculations of intrinsic properties and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and index Q) as categorization parameters. The algorithms were evaluated on 100 cases from the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina, Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with the expert medical diagnostics, showing good performance. The implemented algorithms revealed a high potential to classify breasts into tissue density categories. - Highlights: • Breast density classification can be obtained by suitable mathematical algorithms. • Mathematical processing helps radiologists to obtain the BI-RADS classification. • The entropy and joint entropy show high performance for density classification
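
    Two of the categorization parameters named in this abstract, joint entropy and mutual information, can be computed from gray-level histograms as sketched below. The images are assumed to be flattened lists of already-quantized gray levels; the quantization, the reference image, and the decision thresholds used in the paper are not given in this record, so the data here are made up.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def joint_entropy(a, b):
    """Entropy of the paired gray levels of two equally sized images."""
    return entropy(list(zip(a, b)))


def mutual_information(a, b):
    """I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(a) + entropy(b) - joint_entropy(a, b)


# Made-up 2x2 images, flattened, with two gray levels.
image = [0, 0, 1, 1]
same = [0, 0, 1, 1]
flat = [5, 5, 5, 5]  # perfectly homogeneous image

print(joint_entropy(image, same), mutual_information(image, same))
print(joint_entropy(image, flat), mutual_information(image, flat))
```

    Against a perfectly homogeneous reference the mutual information is zero and the joint entropy reduces to the entropy of the image itself, which follows directly from the definitions above.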

  5. Prediction of customer behaviour analysis using classification algorithms

    Science.gov (United States)

    Raju, Siva Subramanian; Dhandayudam, Prabha

    2018-04-01

    Customer relationship management plays a crucial role in analyzing customer behavior patterns and their value to an enterprise. Customer data can be analyzed efficiently using various data mining techniques, with the goal of developing business strategies and enhancing the business. In this paper, three classification models (NB, J48, and MLPNN) are studied and evaluated experimentally. The performance of the three classifiers is compared using three measures (accuracy, sensitivity, specificity), and the experimental results show that the J48 algorithm has better accuracy than the NB and MLPNN algorithms.
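
    The three measures used for the comparison have standard definitions in terms of the binary confusion matrix, sketched below. The labels and predictions are made up and are unrelated to the paper's customer data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall of the positive class) and specificity
    (recall of the negative class) from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Made-up example: 4 positive and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

    Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since accuracy alone can look high for a classifier that rarely predicts the minority class.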

  6. Algorithms for classification of astronomical object spectra

    Science.gov (United States)

    Wasiewicz, P.; Szuppe, J.; Hryniewicz, K.

    2015-09-01

    Obtaining interesting celestial objects from tens of thousands or even millions of recorded optical-ultraviolet spectra depends not only on the data quality but also on the accuracy of spectra decomposition. Additionally, rapidly growing data volumes demand higher computing power and/or more efficient algorithm implementations. In this paper we speed up the process of subtracting iron transitions and fitting Gaussian functions to emission peaks utilising C++ and OpenCL methods together with a NoSQL database. We also implemented typical astronomical peak-detection methods for comparison with our previous hybrid methods implemented with CUDA.

  7. Benchmarking protein classification algorithms via supervised cross-validation

    NARCIS (Netherlands)

    Kertész-Farkas, A.; Dhir, S.; Sonego, P.; Pacurar, M.; Netoteia, S.; Nijveen, H.; Kuzniar, A.; Leunissen, J.A.M.; Kocsor, A.; Pongor, S.

    2008-01-01

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold,

  8. Classification algorithm of Web document in ionization radiation

    International Nuclear Information System (INIS)

    Geng Zengmin; Liu Wanchun

    2005-01-01

    Resources on the Internet are numerous. How to mine the resources of a particular profession or trade more efficiently is one of the research directions of Web mining (WM). The paper studies the classification of Web documents in ionization radiation (IR) based on the Bayes, Rocchio, and Widrow-Hoff algorithms, and analyses the experimental results. (authors)

  9. An evaluation of classification algorithms for intrusion detection ...

    African Journals Online (AJOL)

    An evaluation of classification algorithms for intrusion detection. ... Most of the available IDSs use all the 41 features in the network to evaluate and search for intrusive pattern in which ...

  10. Experimental analysis of the performance of machine learning algorithms in the classification of navigation accident records

    Directory of Open Access Journals (Sweden)

    REIS, M V. S. de A.

    2017-06-01

    This paper aims to evaluate the use of machine learning techniques on a database of marine accidents. We analyzed and evaluated the main causes and types of marine accidents in the Northern Fluminense region using machine learning techniques. The study showed that the modeling can be done satisfactorily using different configurations of classification algorithms, varying the activation functions and training parameters. The SMO (Sequential Minimal Optimization) algorithm showed the best performance.

  11. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Ardjan Zwartjes

    2016-10-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network life time. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning based algorithms using sampled data. An important issue, however, is the training phase of these learning based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning on the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  12. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms.

    Science.gov (United States)

    Zwartjes, Ardjan; Havinga, Paul J M; Smit, Gerard J M; Hurink, Johann L

    2016-10-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network life time. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning based algorithms using sampled data. An important issue, however, is the training phase of these learning based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning on the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  13. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and Natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques viz. supervised, unsupervised and semi supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  14. Adaptive phase k-means algorithm for waveform classification

    Science.gov (United States)

    Song, Chengyun; Liu, Zhining; Wang, Yaojun; Xu, Feng; Li, Xingming; Hu, Guangmin

    2018-01-01

    Waveform classification is a powerful technique for seismic facies analysis that describes the heterogeneity and compartments within a reservoir. Horizon interpretation is a critical step in waveform classification. However, the horizon often produces inconsistent waveform phase, and thus results in an unsatisfactory classification. To alleviate this problem, an adaptive phase waveform classification method called the adaptive phase k-means is introduced in this paper. Our method improves the traditional k-means algorithm using an adaptive phase distance for waveform similarity measure. The proposed distance is a measure with variable phases as it moves from sample to sample along the traces. Model traces are also updated with the best phase interference in the iterative process. Therefore, our method is robust to phase variations caused by the interpretation horizon. We tested the effectiveness of our algorithm by applying it to synthetic and real data. The satisfactory results reveal that the proposed method tolerates certain waveform phase variation and is a good tool for seismic facies analysis.

  15. Unsupervised classification of multivariate geostatistical data: Two algorithms

    Science.gov (United States)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
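
    The first algorithm in the abstract above (agglomerative clustering in which two clusters may merge only if they satisfy a proximity condition on a graph in coordinate space) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the radius-based neighbor graph, the single scalar variable per sample, the mean-difference merge criterion, and the names `spatial_agglomerative`, `radius` and `n_clusters` are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def spatial_agglomerative(coords, values, radius, n_clusters):
    """Bottom-up clustering where only graph-adjacent clusters may merge."""
    n = len(coords)
    # Assumed proximity graph: an edge joins samples closer than `radius`.
    edges = {(i, j) for i, j in combinations(range(n), 2)
             if dist(coords[i], coords[j]) <= radius}
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices

    def mean(members):
        return sum(values[m] for m in members) / len(members)

    while len(clusters) > n_clusters and edges:
        # Among adjacent clusters, merge the pair with the closest mean value.
        a, b = min(edges, key=lambda e: (abs(mean(clusters[e[0]])
                                             - mean(clusters[e[1]])), e))
        clusters[a].extend(clusters.pop(b))
        # Redirect b's edges to a and drop self-loops.
        merged_edges = set()
        for i, j in edges:
            i = a if i == b else i
            j = a if j == b else j
            if i != j:
                merged_edges.add((min(i, j), max(i, j)))
        edges = merged_edges
    return sorted(sorted(c) for c in clusters.values())


coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
values = [1.0, 1.1, 5.0, 1.0, 1.05]
print(spatial_agglomerative(coords, values, radius=1.5, n_clusters=3))
```

    In this toy run, samples 0 and 3 have the same value but are never merged, because no graph edge connects them; that is the spatial coherence the proximity condition is meant to enforce.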

  16. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Full Text Available Audio segmentation is a basis for multimedia content analysis which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using a large amount of training data, which handles noise and is suitable for use for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used, bagged support vector machines (SVMs) with artificial neural networks (ANNs). Audio stream is classified, firstly, into speech and nonspeech segment by using bagged support vector machines; nonspeech segment is further classified into music and environment sound by using artificial neural networks and lastly, speech segment is classified into silence and pure-speech segments on the basis of rule-based classifier. Minimum data is used for training classifier; ensemble methods are used for minimizing misclassification rate and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.

  17. Implementation of several mathematical algorithms to breast tissue density classification

    Science.gov (United States)

    Quintana, C.; Redondo, M.; Tirao, G.

    2014-02-01

    The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, where a dense breast tissue can hide lesions causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images, according to the American College of Radiology classifications. These mathematical techniques are based on intrinsic properties calculations and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and index Q) as categorization parameters. The algorithms evaluation was performed on 100 cases of the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with the expert medical diagnostics, showing a good performance. The implemented algorithms revealed a high potentiality to classify breasts into tissue density categories.
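
    Two of the categorization parameters named above, joint entropy and mutual information, can be computed from the joint histogram of two images. The sketch below is only an illustration under stated assumptions: images are flattened lists of already-quantized gray levels of equal length, and the function names are hypothetical rather than taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(counts, total):
    """Shannon entropy (bits) of a histogram given as raw counts."""
    return -sum(c / total * log2(c / total) for c in counts if c)


def joint_entropy(img_a, img_b):
    """Entropy of the joint gray-level histogram of two equally sized images."""
    pairs = Counter(zip(img_a, img_b))
    return entropy(pairs.values(), len(img_a))


def mutual_information(img_a, img_b):
    """I(A;B) = H(A) + H(B) - H(A,B)."""
    n = len(img_a)
    h_a = entropy(Counter(img_a).values(), n)
    h_b = entropy(Counter(img_b).values(), n)
    return h_a + h_b - joint_entropy(img_a, img_b)


image = [0, 0, 1, 1]          # toy two-level image, flattened
homogeneous = [5, 5, 5, 5]    # toy homogeneous reference image
print(joint_entropy(image, homogeneous), mutual_information(image, image))
```

    Against a perfectly homogeneous reference, the joint entropy reduces to the entropy of the image itself and the mutual information is zero, so in this toy setting the joint entropy is the quantity that carries information about tissue texture.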

  18. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second based on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper including an Application Specific Instruction Set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.

  19. Classification of Internet banking customers using data mining algorithms

    Directory of Open Access Journals (Sweden)

    Reza Radfar

    2014-03-01

    Full Text Available Classifying customers using data mining algorithms enables banks to keep old customers' loyalty while attracting new ones. Using a decision tree as a data mining technique, we can optimize customer classification provided that the appropriate decision tree is selected. In this article we present an appropriate model to classify customers who use internet banking services. The model is developed based on the CRISP-DM standard and we have used real data from Sina bank’s Internet bank. Compared to other decision trees, ours is based on both optimization and accuracy factors and recognizes new potential internet banking customers using a three-level classification: low, medium and high. This is a practical, documentary-based research. Mining customer rules enables managers to make policies based on discovered patterns in order to have a better perception of what customers really desire.

  20. Toward optimal feature selection using ranking methods and classification algorithms

    Directory of Open Access Journals (Sweden)

    Novaković Jasmina

    2011-01-01

    Full Text Available We presented a comparison between several feature ranking methods used on two real datasets. We considered six ranking methods that can be divided into two broad categories: statistical and entropy-based. Four supervised learning algorithms are adopted to build models, namely, IB1, Naive Bayes, C4.5 decision tree and the RBF network. We showed that the selection of ranking methods could be important for classification accuracy. In our experiments, ranking methods with different supervised learning algorithms give quite different results for balanced accuracy. Our cases confirm that, in order to be sure that a subset of features giving the highest accuracy has been selected, the use of many different indices is recommended.
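
    As an illustration of the entropy-based category of ranking methods mentioned above, the following sketch ranks discrete features by information gain. Information gain is used here only as a commonly cited entropy-based score; the abstract does not name the six methods that were actually compared, and the function and feature names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def rank_features(columns, labels):
    """Return feature names ordered from most to least informative."""
    scores = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


labels = [0, 0, 1, 1]
columns = {"noise": [0, 1, 0, 1], "signal": [0, 0, 1, 1]}
print(rank_features(columns, labels))
```

    A classifier would then be trained on the top-ranked subset; repeating this with several scores is what allows the kind of comparison the abstract recommends.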

  1. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. Hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for dealing with incomplete information systems because of their missing values. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. Firstly, we initialize the hypergraph. Second, we classify the training set by neighborhood hypergraph. Third, under the guidance of rough set, we replace the poor hyperedges. After that, we can obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, Naive Bayes, and KNN. The experimental results show that the proposed algorithm has better performance in terms of Precision, Recall, AUC, and F-measure.

  2. CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH

    Directory of Open Access Journals (Sweden)

    V. A. Ayma

    2015-03-01

    Full Text Available For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA’s machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  3. Comparison of Unsupervised Vegetation Classification Methods from VHR Images after Shadows Removal by Innovative Algorithms

    Science.gov (United States)

    Movia, A.; Beinat, A.; Crosilla, F.

    2015-04-01

    The recognition of vegetation by the analysis of very high resolution (VHR) aerial images provides meaningful information about environmental features; nevertheless, VHR images frequently contain shadows that generate significant problems for the classification of the image components and for the extraction of the needed information. The aim of this research is to classify, from VHR aerial images, vegetation involved in the balance process of the environmental biochemical cycle, and to discriminate it with respect to urban and agricultural features. Three classification algorithms have been tested in order to better recognize vegetation, and compared to the NDVI index; unfortunately all these methods are conditioned by the presence of shadows on the images. The literature presents several algorithms to detect and remove shadows in the scene: most of them are based on the RGB to HSI transformations. In this work some of them have been implemented and compared with one based on RGB bands. Subsequently, in order to remove shadows and restore brightness on the images, some innovative algorithms, based on Procrustes theory, have been implemented and applied. Among these, we evaluate the capability of the so called "not-centered oblique Procrustes" and "anisotropic Procrustes" methods to efficiently restore brightness with respect to a linear correlation correction based on the Cholesky decomposition. Some experimental results obtained by different classification methods after shadows removal carried out with the innovative algorithms are presented and discussed.
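
    The NDVI baseline mentioned above is the standard normalized difference of the near-infrared and red reflectances, NDVI = (NIR - Red) / (NIR + Red). The sketch below computes it per pixel and thresholds it into a vegetation mask. The threshold of 0.3, the band layout as nested lists, and the function names are illustrative assumptions, not values from the paper.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Mark pixels whose NDVI exceeds an (assumed) vegetation threshold."""
    return [[ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


nir_band = [[0.5, 0.2]]
red_band = [[0.1, 0.2]]
print(vegetation_mask(nir_band, red_band))
```

    Shadowed pixels have low reflectance in both bands, which makes the ratio unstable; this is one way to see why the abstract treats shadow removal as a precondition for reliable classification.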

  4. Comparison analysis for classification algorithm in data mining and the study of model use

    Science.gov (United States)

    Chen, Junde; Zhang, Defu

    2018-04-01

    As a key technique in data mining, the classification algorithm has received extensive attention. Through an experiment with classification algorithms on UCI data sets, we give a comparison analysis method for the different algorithms, using a statistical test. Beyond that, an adaptive diagnosis model for the prevention of electricity stealing and leakage is given as a specific case in the paper.

  5. Land-cover classification with an expert classification algorithm using digital aerial photographs

    Directory of Open Access Journals (Sweden)

    José L. de la Cruz

    2010-05-01

    Full Text Available The purpose of this study was to evaluate the usefulness of the spectral information of digital aerial sensors in determining land-cover classification using new digital techniques. The land covers that have been evaluated are the following: (1) bare soil, (2) cereals, including maize (Zea mays L.), oats (Avena sativa L.), rye (Secale cereale L.), wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), (3) high protein crops, such as peas (Pisum sativum L.) and beans (Vicia faba L.), (4) alfalfa (Medicago sativa L.), (5) woodlands and scrublands, including holly oak (Quercus ilex L.) and common retama (Retama sphaerocarpa L.), (6) urban soil, (7) olive groves (Olea europaea L.) and (8) burnt crop stubble. The best result was obtained using an expert classification algorithm, achieving a reliability rate of 95%. This result showed that the images of digital airborne sensors hold considerable promise for the future in the field of digital classifications because these images contain valuable information that takes advantage of the geometric viewpoint. Moreover, new classification techniques reduce problems encountered using high-resolution images, while achieving reliabilities that are better than those achieved with traditional methods.

  6. Predicting disease risk using bootstrap ranking and classification algorithms.

    Directory of Open Access Journals (Sweden)

    Ohad Manor

    Full Text Available Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could either be used as a "black box" in order to promote changes in life-style and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs) by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping in order to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC) data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minor allele frequency (MAF), suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment.
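
    The general idea of bootstrapped feature ranking described above can be sketched as follows: resample individuals with replacement, rank SNPs by an association score in each resample, and aggregate the ranks. This is not the BootRank procedure itself, whose details the abstract does not give. The mean-difference association score (used in place of a p-value), the aggregation by summed rank, and all names are assumptions for illustration.

```python
import random


def association(genotypes, labels, snp):
    """Toy score: absolute difference in mean allele count, cases vs controls."""
    cases = [g[snp] for g, y in zip(genotypes, labels) if y == 1]
    controls = [g[snp] for g, y in zip(genotypes, labels) if y == 0]
    if not cases or not controls:
        return 0.0
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def bootstrap_rank(genotypes, labels, n_boot=100, seed=0):
    """Order SNP indices by their summed rank over bootstrap resamples."""
    rng = random.Random(seed)
    n, n_snps = len(labels), len(genotypes[0])
    rank_sums = [0] * n_snps
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        g = [genotypes[i] for i in idx]
        y = [labels[i] for i in idx]
        order = sorted(range(n_snps), key=lambda s: -association(g, y, s))
        for rank, snp in enumerate(order):
            rank_sums[snp] += rank
    return sorted(range(n_snps), key=lambda s: rank_sums[s])


labels = [1, 1, 1, 1, 0, 0, 0, 0]
genotypes = [[0, 2], [1, 2], [0, 2], [1, 2], [0, 0], [1, 0], [0, 0], [1, 0]]
print(bootstrap_rank(genotypes, labels))
```

    A SNP that ranks highly in most resamples is less likely to owe its position to a few individuals, which is the robustness argument the abstract makes for bootstrapping.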

  7. Improved wavelet packet classification algorithm for vibrational intrusions in distributed fiber-optic monitoring systems

    Science.gov (United States)

    Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo

    2015-05-01

    An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
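
    The feature-building step described above (wavelet packet decomposition at every level, followed by the Shannon entropy of each node's coefficients) can be sketched in plain Python. The Haar wavelet, the energy-normalized entropy definition, and the function names are assumptions, since the abstract does not specify them; background subtraction, component selection and the RBF network are omitted.

```python
from math import log, sqrt


def haar_split(signal):
    """One Haar step: return (approximation, detail) coefficients."""
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / sqrt(2) for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / sqrt(2) for i in pairs]
    return approx, detail


def shannon_entropy(coeffs):
    """Entropy of the normalized energy distribution of the coefficients."""
    energy = sum(c * c for c in coeffs)
    if energy == 0:
        return 0.0
    probs = [c * c / energy for c in coeffs if c != 0]
    return -sum(p * log(p) for p in probs)


def packet_entropy_features(signal, levels):
    """Concatenate node entropies from every level of the packet tree."""
    features, nodes = [], [list(signal)]
    for _ in range(levels):
        # Unlike a plain wavelet transform, both children are split again.
        nodes = [child for node in nodes for child in haar_split(node)]
        features.extend(shannon_entropy(node) for node in nodes)
    return features


print(packet_entropy_features([1.0, 1.0, 1.0, 1.0], levels=2))
```

    The signal length should be at least 2 to the power `levels`; the result has 2 + 4 + ... + 2^levels entries, one per node of the packet tree.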

  8. On using the Multiple Signal Classification algorithm to study microbaroms

    Science.gov (United States)

    Marcillo, O. E.; Blom, P. S.; Euler, G. G.

    2016-12-01

    Multiple Signal Classification (MUSIC) (Schmidt, 1986) is a well-known high-resolution algorithm used in array processing for parameter estimation. We report on the application of MUSIC to infrasonic array data in a study of the structure of microbaroms. Microbaroms can be globally observed and display energy centered around 0.2 Hz. Microbaroms are an infrasonic signal generated by the non-linear interaction of ocean surface waves that radiate into the ocean and atmosphere as well as the solid earth in the form of microseisms. Microbarom sources are dynamic and, in many cases, distributed in space and moving in time. We assume that the microbarom energy detected by an infrasonic array is the result of multiple sources (with different back-azimuths) in the same bandwidth and apply the MUSIC algorithm accordingly to recover the back-azimuth and trace velocity of the individual components. Preliminary results show that the multiple component assumption in MUSIC allows one to resolve the fine structure in the microbarom band that can be related to multiple ocean surface phenomena.

  9. Robust Semi-Supervised Manifold Learning Algorithm for Classification

    Directory of Open Access Journals (Sweden)

    Mingxia Chen

    2018-01-01

    Full Text Available In recent years, manifold learning methods have been widely used in data classification to tackle the curse of dimensionality problem, since they can discover the potential intrinsic low-dimensional structures of the high-dimensional data. Given partially labeled data, the semi-supervised manifold learning algorithms are proposed to predict the labels of the unlabeled points, taking into account label information. However, these semi-supervised manifold learning algorithms are not robust against noisy points, especially when the labeled data contain noise. In this paper, we propose a framework for robust semi-supervised manifold learning (RSSML) to address this problem. The noise levels of the labeled points are first predicted, and then a regularization term is constructed to reduce the impact of labeled points containing noise. A new robust semi-supervised optimization model is proposed by adding the regularization term to the traditional semi-supervised optimization model. Numerical experiments are given to show the improvement and efficiency of RSSML on noisy data sets.

  10. A method for classification of network traffic based on C5.0 Machine Learning Algorithm

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Riaz, M. Tahir; Pedersen, Jens Myrup

    2012-01-01

    current network traffic. To overcome the drawbacks of existing methods for traffic classification, usage of C5.0 Machine Learning Algorithm (MLA) was proposed. On the basis of statistical traffic information received from volunteers and C5.0 algorithm we constructed a boosted classifier, which was shown...... and classification, an algorithm for recognizing flow direction and the C5.0 itself. Classified applications include Skype, FTP, torrent, web browser traffic, web radio, interactive gaming and SSH. We performed subsequent tries using different sets of parameters and both training and classification options...

  11. Time series classification using k-Nearest neighbours, Multilayer Perceptron and Learning Vector Quantization algorithms

    Directory of Open Access Journals (Sweden)

    Jiří Fejfar

    2012-01-01

    Full Text Available In this paper we compare the results of three artificial intelligence algorithms in the classification of time series derived from musical excerpts. The algorithms were chosen to represent different principles of classification: the statistical approach, neural networks and competitive learning. The first algorithm is the classical k-Nearest neighbours algorithm, the second is the Multilayer Perceptron (MLP), an example of an artificial neural network, and the third is the Learning Vector Quantization (LVQ) algorithm, the supervised counterpart to the unsupervised Self Organizing Map (SOM). After our own earlier experiments with unlabelled data we moved on to using data labels, which generally led to better classification accuracy. As we need a huge data set of labelled time series (a priori knowledge of the correct class to which each time series instance belongs), we used musical excerpts, with which we had good experience in earlier studies, as a source of real-world time series. We use the standard deviation of the sound signal as a descriptor of a musical excerpt's volume level. We briefly describe the principle of each algorithm as well as its implementation, giving links for further research. The classification results of each algorithm are presented in a confusion matrix showing the numbers of misclassifications and allowing the overall accuracy of the algorithm to be evaluated. Results are compared and particular misclassifications are discussed for each algorithm. Finally the best solution is chosen and further research goals are given.
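
    Of the three algorithms compared above, Learning Vector Quantization is probably the least familiar, so a minimal sketch of its basic variant (LVQ1) follows. It is not the paper's implementation: the one-dimensional volume-level feature values, the class names, the learning rate and the number of epochs are made up for the example.

```python
from math import dist


def nearest(prototypes, x):
    """Index of the prototype whose weight vector is closest to x."""
    return min(range(len(prototypes)), key=lambda i: dist(prototypes[i][0], x))


def train_lvq1(samples, prototypes, lr=0.1, epochs=20):
    """LVQ1: pull the winning prototype toward x if labels match, else push away."""
    protos = [(list(w), label) for w, label in prototypes]
    for _ in range(epochs):
        for x, y in samples:
            w, label = protos[nearest(protos, x)]
            sign = 1.0 if label == y else -1.0
            for k in range(len(w)):
                w[k] += sign * lr * (x[k] - w[k])
    return protos


def classify(protos, x):
    """Label of the nearest prototype."""
    return protos[nearest(protos, x)][1]


samples = [([0.0], "quiet"), ([0.2], "quiet"), ([1.0], "loud"), ([1.2], "loud")]
initial = [([0.4], "quiet"), ([0.6], "loud")]
model = train_lvq1(samples, initial)
print(classify(model, [0.1]), classify(model, [1.1]))
```

    The use of labels in the update step is what makes LVQ the supervised counterpart of the Self Organizing Map, which moves prototypes toward samples regardless of class.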

  12. A fingerprint classification algorithm based on combination of local and global information

    Science.gov (United States)

    Liu, Chongjin; Fu, Xiang; Bian, Junjie; Feng, Jufu

    2011-12-01

    Fingerprint recognition is one of the most important technologies in biometric identification and has been widely applied in commercial and forensic areas. Fingerprint classification, as the fundamental procedure in fingerprint recognition, can sharply decrease the quantity for fingerprint matching and improve the efficiency of fingerprint recognition. Most fingerprint classification algorithms are based on the number and position of singular points. Because singular point detection methods commonly consider only local information, the classification algorithms are sensitive to noise. In this paper, we propose a novel fingerprint classification algorithm combining the local and global information of the fingerprint. Firstly we use local information to detect singular points and measure their quality considering orientation structure and image texture in adjacent areas. Furthermore the global orientation model is adopted to measure the reliability of the singular points group. Finally the local quality and global reliability are weighted to classify the fingerprint. Experiments demonstrate the accuracy and effectiveness of our algorithm, especially for poor quality fingerprint images.

  13. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. TEXT CLASSIFICATION USING NAIVE BAYES UPDATEABLE ALGORITHM IN SBMPTN TEST QUESTIONS

    Directory of Open Access Journals (Sweden)

    Ristu Saptono

    2017-01-01

    Full Text Available Document classification is of growing interest in text mining research. Classification can be done based on topics, languages, and so on. This study was conducted to determine how Naive Bayes Updateable performs in classifying SBMPTN exam questions by theme. The incremental model of the Naive Bayes classifier, a classification algorithm often used in text classification, can learn from new data introduced to the system even after the classifier has been built from the existing data. The Naive Bayes classifier classifies the exam questions by the theme of the field of study by analyzing keywords that appear in the exam questions. A feature selection method, DF-Thresholding, is implemented to improve classification performance. Evaluation of the classification with the Naive Bayes classifier algorithm yields 84.61% accuracy.
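
    The property emphasized above, that an updateable Naive Bayes classifier keeps learning after it has been built, follows from the model being nothing more than counts. The sketch below is a generic multinomial Naive Bayes with Laplace smoothing, not the implementation used in the study; the class name, the toy keywords and the themes are hypothetical, and DF-Thresholding is omitted.

```python
from collections import Counter, defaultdict
from math import log


class UpdateableNaiveBayes:
    """Multinomial Naive Bayes whose counts can be extended at any time."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # class -> word frequencies
        self.vocab = set()

    def update(self, words, label):
        """Incorporate one labelled document without retraining from scratch."""
        self.class_docs[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def predict(self, words):
        total_docs = sum(self.class_docs.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # Laplace smoothing
            prior = log(self.class_docs[label] / total_docs)
            return prior + sum(log((counts[w] + 1) / denom) for w in words)

        return max(self.class_docs, key=score)


nb = UpdateableNaiveBayes()
nb.update(["integral", "derivative"], "math")
nb.update(["cell", "enzyme"], "biology")
print(nb.predict(["derivative"]))
nb.update(["atom", "electron"], "chemistry")  # new data arrives later
print(nb.predict(["electron", "atom"]))
```

    Because `update` only increments counters, adding a new question, or even a new theme, costs time proportional to the length of that one document.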

  15. Improved algorithms for the classification of rough rice using a bionic electronic nose based on PCA and the Wilks distribution.

    Science.gov (United States)

    Xu, Sai; Zhou, Zhiyan; Lu, Huazhong; Luo, Xiwen; Lan, Yubin

    2014-03-19

    Principal Component Analysis (PCA) is one of the main methods used for electronic nose pattern recognition. However, poor classification performance is common in classification and recognition when using regular PCA. This paper aims to improve the classification performance of regular PCA based on the existing Wilks Λ-statistic (i.e., combined PCA with the Wilks distribution). The improved algorithms, which combine regular PCA with the Wilks Λ-statistic, were developed after analysing the functionality and defects of PCA. Verification tests were conducted using a PEN3 electronic nose. The collected samples consisted of the volatiles of six varieties of rough rice (Zhongxiang1, Xiangwan13, Yaopingxiang, WufengyouT025, Pin 36, and Youyou122), grown in the same area and season. The first two principal components used as analysis vectors cannot perform the rough rice varieties classification task based on a regular PCA. Using the improved algorithms, which combine the regular PCA with the Wilks Λ-statistic, many different principal components were selected as analysis vectors. The set of data points of the Mahalanobis distance between each of the varieties of rough rice was selected to estimate the performance of the classification. The result illustrates that the rough rice varieties classification task is achieved well using the improved algorithm. A Probabilistic Neural Network (PNN) was also established to test the effectiveness of the improved algorithms. The first two principal components (namely PC1 and PC2) and the first and fifth principal component (namely PC1 and PC5) were selected as the inputs of PNN for the classification of the six rough rice varieties. The results indicate that the classification accuracy based on the improved algorithm was improved by 6.67% compared to the results of the regular method. These results prove the effectiveness of using the Wilks Λ-statistic to improve the classification accuracy of the regular PCA approach. The results

  16. Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance

    Science.gov (United States)

    Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi

    2017-11-01

    K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we presented a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing Hamming distance between testing sample and each feature vector in the training set. Taking advantage of this method, we realized a good analog for classical KNN algorithm by setting a distance threshold value t to select k-nearest neighbors. As a result, QKNN achieves O(n^3) performance, which depends only on the dimension of the feature vectors, and high classification accuracy, and it outperforms Lloyd's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
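
    As an illustration only, the classical analog mentioned in this abstract (selecting neighbors by a Hamming distance threshold t and then voting) might be sketched as follows. This is not the quantum circuit from the paper; the function names, the majority vote, and the toy data are assumptions made for the example.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def threshold_knn(train, labels, sample, t):
    """Majority vote over training vectors within Hamming distance t of sample.

    Returns None when no training vector lies within the threshold.
    The majority vote is an assumption of this sketch.
    """
    votes = Counter(
        label for vec, label in zip(train, labels) if hamming(vec, sample) <= t
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two classes of 4-bit feature vectors.
train = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
labels = ["a", "a", "b", "b"]
print(threshold_knn(train, labels, (0, 0, 1, 0), t=1))  # prints: a
```

    Using a threshold instead of a fixed k means the number of voters varies per sample, so a caller has to handle the case where nothing falls within t.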

  17. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. A Weighted Block Dictionary Learning Algorithm for Classification

    OpenAIRE

    Shi, Zhongrong

    2016-01-01

    Discriminative dictionary learning, playing a critical role in sparse representation based classification, has led to state-of-the-art classification results. Among the existing discriminative dictionary learning methods, two different approaches, shared dictionary and class-specific dictionary, which associate each dictionary atom to all classes or a single class, have been studied. The shared dictionary is a compact method but with lack of discriminative information; the class-specific dict...

  19. A semi-supervised classification algorithm using the TAD-derived background as training data

    Science.gov (United States)

    Fan, Lei; Ambeau, Brittany; Messinger, David W.

    2013-05-01

    In general, spectral image classification algorithms fall into one of two categories: supervised and unsupervised. In unsupervised approaches, the algorithm automatically identifies clusters in the data without a priori information about those clusters (except perhaps the expected number of them). Supervised approaches require an analyst to identify training data to learn the characteristics of the clusters such that they can then classify all other pixels into one of the pre-defined groups. The classification algorithm presented here is a semi-supervised approach based on the Topological Anomaly Detection (TAD) algorithm. The TAD algorithm defines background components based on a mutual k-Nearest Neighbor graph model of the data, along with a spectral connected components analysis. Here, the largest components produced by TAD are used as regions of interest (ROIs), or training data for a supervised classification scheme. By combining those ROIs with a Gaussian Maximum Likelihood (GML) or a Minimum Distance to the Mean (MDM) algorithm, we are able to achieve a semi-supervised classification method. We test this classification algorithm against data collected by the HyMAP sensor over the Cooke City, MT area and University of Pavia scene.

  20. New Dandelion Algorithm Optimizes Extreme Learning Machine for Biomedical Classification Problems

    Directory of Open Access Journals (Sweden)

    Xiguang Li

    2017-01-01

    Full Text Available Inspired by the behavior of dandelion sowing, a novel swarm intelligence algorithm, namely, dandelion algorithm (DA), is proposed for global optimization of complex functions in this paper. In DA, the dandelion population will be divided into two subpopulations, and different subpopulations will undergo different sowing behaviors. Moreover, another sowing method is designed to jump out of local optimum. In order to demonstrate the validation of DA, we compare the proposed algorithm with other existing algorithms, including bat algorithm, particle swarm optimization, and enhanced fireworks algorithm. Simulations show that the proposed algorithm seems much superior to other algorithms. At the same time, the proposed algorithm can be applied to optimize extreme learning machine (ELM) for biomedical classification problems, and the effect is considerable. At last, we use different fusion methods to form different fusion classifiers, and the fusion classifiers can achieve higher accuracy and better stability to some extent.

  1. Energy-efficient algorithm for classification of states of wireless sensor network using machine learning methods

    Science.gov (United States)

    Yuldashev, M. N.; Vlasov, A. I.; Novikov, A. N.

    2018-05-01

    This paper focuses on the development of an energy-efficient algorithm for classification of states of a wireless sensor network using machine learning methods. The proposed algorithm reduces energy consumption by: 1) elimination of monitoring of parameters that do not affect the state of the sensor network, 2) reduction of communication sessions over the network (the data are transmitted only if their values can affect the state of the sensor network). The studies of the proposed algorithm have shown that at classification accuracy close to 100%, the number of communication sessions can be reduced by 80%.
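
    One way to picture the second saving listed above (transmitting only when a value can affect the network state) is a node that stays silent while its classified state is unchanged. The sketch below is a loose illustration under that assumption; the threshold rule standing in for the trained classifier, the function names, and the readings are all hypothetical and not taken from the paper.

```python
def classify(temp_c):
    """Hypothetical state rule: one threshold stands in for a trained classifier."""
    return "alarm" if temp_c >= 50.0 else "normal"


def transmissions(readings, classify_fn):
    """Yield (index, reading) only when the classified state differs from the last one sent."""
    last_state = None
    for i, value in enumerate(readings):
        state = classify_fn(value)
        if state != last_state:
            last_state = state
            yield i, value


# Hypothetical temperature readings from a single sensor node.
readings = [20.0, 21.5, 22.0, 55.0, 56.0, 30.0, 29.0, 28.5, 28.0, 27.0]
sent = list(transmissions(readings, classify))
print(len(sent), "of", len(readings), "readings transmitted")  # prints: 3 of 10 readings transmitted
```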

  2. Woven fabric defects detection based on texture classification algorithm

    International Nuclear Information System (INIS)

    Ben Salem, Y.; Nasri, S.

    2011-01-01

    In this paper we have compared two famous methods in texture classification to solve the problem of recognition and classification of defects occurring in a textile manufacture. We have compared local binary patterns method with co-occurrence matrix. The classifier used is the support vector machines (SVM). The system has been tested using TILDA database. The results obtained are interesting and show that LBP is a good method for the problems of recognition and classification of defects; it gives a good running time, especially for real-time applications.
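
    For readers unfamiliar with local binary patterns, a minimal sketch of the basic 3x3 operator is given below. It assumes the common formulation in which each of the eight neighbors contributes a 1 bit when it is at least as bright as the center pixel; the neighbor ordering and function names are choices made for this example, and the paper's exact LBP variant is not stated in the abstract.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code of pixel (r, c); neighbours >= centre contribute a 1 bit."""
    center = img[r][c]
    # Clockwise from the top-left neighbour; this ordering is a convention chosen here.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of a 2-D list."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Tiny grayscale patch (hypothetical values).
patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(lbp_code(patch, 1, 1))  # prints: 120
```

    A histogram of such codes is the kind of texture feature vector that could then be passed to a classifier such as an SVM.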

  3. Classification Formula and Generation Algorithm of Cycle Decomposition Expression for Dihedral Groups

    Directory of Open Access Journals (Sweden)

    Dakun Zhang

    2013-01-01

    Full Text Available The necessity of classification research on common formula of group (dihedral group) cycle decomposition expression is illustrated. It includes the reflection and rotation conversion, which derived six common formulae on cycle decomposition expressions of group; it designed the generation algorithm on the cycle decomposition expressions of group, which is based on the method of replacement conversion and the classification formula; algorithm analysis and the results of the process show that the generation algorithm which is based on the classification formula is outperformed by the general algorithm which is based on replacement conversion; it has great significance to solve the enumeration of the necklace combinational scheme, especially the structural problems of combinational scheme, by using group theory and computer.

  4. Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

    Directory of Open Access Journals (Sweden)

    Lixiong Xu

    2017-01-01

    Full Text Available As one of the most effective function mining algorithms, Gene Expression Programming (GEP) algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on the self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data research, GEP encounters a low efficiency issue due to its long mining processes. To improve the efficiency of GEP in big data research, especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.

  5. Comparison between Possibilistic c-Means (PCM) and Artificial Neural Network (ANN) Classification Algorithms in Land use/ Land cover Classification

    Directory of Open Access Journals (Sweden)

    Ganchimeg Ganbold

    2017-03-01

    Full Text Available There are several statistical classification algorithms available for land use/land cover classification. However, each has a certain bias or compromise. Some methods, like the parallelepiped approach in supervised classification, cannot classify continuous regions within a feature. On the other hand, while unsupervised classification method takes maximum advantage of spectral variability in an image, the maximally separable clusters in spectral space may not do much for our perception of important classes in a given study area. In this research, the output of an ANN algorithm was compared with the Possibilistic c-Means, an improvement of the fuzzy c-Means, on both moderate resolution Landsat 8 and high resolution Formosat 2 images. The Formosat 2 image comes with an 8m spectral resolution on the multispectral data. This multispectral image data was resampled to 10m in order to maintain a uniform ratio of 1:3 against Landsat 8 image. Six classes were chosen for analysis including: dense forest, eucalyptus, water, grassland, wheat and riverine sand. Using a standard false color composite (FCC), the six features reflected differently in the infrared region with wheat producing the brightest pixel values. Signature collection per class was therefore easily obtained for all classifications. The output of both ANN and FCM were analyzed separately for accuracy and an error matrix generated to assess the quality and accuracy of the classification algorithms. When you compare the results of the two methods on a per-class-basis, ANN had a crisper output compared to PCM which yielded clusters with pixels especially on the moderate resolution Landsat 8 imagery.

  6. Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data

    Directory of Open Access Journals (Sweden)

    Viswanath Satish

    2012-02-01

    Full Text Available Abstract Background Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of noise in the data. In this paper, we present a novel DR technique known as consensus embedding that aims to overcome these problems by generating and combining multiple low-dimensional embeddings, hence exploiting the variance among them in a manner similar to ensemble classifier schemes such as Bagging. We demonstrate theoretical properties of consensus embedding which show that it will result in a single stable embedding solution that preserves information more accurately as compared to any individual embedding (generated via DR schemes such as Principal Component Analysis, Graph Embedding, or Locally Linear Embedding). Intelligent sub-sampling (via mean-shift) and code parallelization are utilized to provide for an efficient implementation of the scheme. Results Applications of consensus embedding are shown in the context of classification and clustering as applied to: (1) image partitioning of white matter and gray matter on 10 different synthetic brain MRI images corrupted with 18 different combinations of noise and bias field inhomogeneity, (2) classification of 4 high-dimensional gene-expression datasets, (3) cancer detection (at a pixel-level) on 16 image slices obtained from 2 different high-resolution prostate MRI datasets. In over 200 different experiments concerning classification and segmentation of biomedical data, consensus embedding was found to consistently outperform both linear and non-linear DR methods within all applications considered. Conclusions We have presented a novel framework termed consensus embedding which leverages ensemble classification theory within dimensionality reduction, allowing for application to a wide range

  7. Packet Classification by Multilevel Cutting of the Classification Space: An Algorithmic-Architectural Solution for IP Packet Classification in Next Generation Networks

    Directory of Open Access Journals (Sweden)

    Motasem Aldiab

    2008-01-01

    Full Text Available Traditionally, the Internet provides only a “best-effort” service, treating all packets going to the same destination equally. However, providing differentiated services for different users based on their quality requirements is increasingly becoming a demanding issue. For this, routers need to have the capability to distinguish and isolate traffic belonging to different flows. This ability to determine the flow each packet belongs to is called packet classification. Technology vendors are reluctant to support algorithmic solutions for classification due to their nondeterministic performance. Although content addressable memories (CAMs) are favoured by technology vendors due to their deterministic high-lookup rates, they suffer from the problems of high-power consumption and high-silicon cost. This paper provides a new algorithmic-architectural solution for packet classification that mixes CAMs with algorithms based on multilevel cutting of the classification space into smaller spaces. The provided solution utilizes the geometrical distribution of rules in the classification space. It provides the deterministic performance of CAMs, support for dynamic updates, and added flexibility for system designers.

  8. A Supervised Classification Algorithm for Note Onset Detection

    Directory of Open Access Journals (Sweden)

    Douglas Eck

    2007-01-01

    Full Text Available This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or nononsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
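
    The peak-picking stage described in this abstract can be pictured with a short sketch. It assumes the classifier has already produced one onset score per frame, and it keeps frames that are local maxima and exceed a moving average of their surroundings by a margin. The window size, the margin `delta`, and the function name are assumptions for illustration, not the authors' settings.

```python
def pick_peaks(scores, window=3, delta=0.1):
    """Return indices that are local maxima and exceed the local moving average by delta."""
    peaks = []
    n = len(scores)
    for i in range(n):
        # Moving average over up to `window` frames on each side of frame i.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        local_mean = sum(scores[lo:hi]) / (hi - lo)
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i < n - 1 else float("-inf")
        if scores[i] > left and scores[i] >= right and scores[i] > local_mean + delta:
            peaks.append(i)
    return peaks


# Hypothetical per-frame onset scores from a classifier.
scores = [0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.8, 0.2, 0.0]
print(pick_peaks(scores, window=2, delta=0.1))  # prints: [2, 6]
```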

  9. Analysis and Evaluation of IKONOS Image Fusion Algorithm Based on Land Cover Classification

    Institute of Scientific and Technical Information of China (English)

    Xia JING; Yan BAO

    2015-01-01

    Each fusion algorithm has its own advantages and limitations, so it is very difficult to simply evaluate the good points and bad points of a fusion algorithm. Whether an algorithm is selected to fuse object images also depends upon the sensor types and special research purposes. Firstly, five fusion methods, i.e. IHS, Brovey, PCA, SFIM and Gram-Schmidt, were briefly described in the paper. And then visual judgment and quantitative statistical parameters were used to assess the five algorithms. Finally, in order to determine which one is the most suitable fusion method for land cover classification of IKONOS image, the maximum likelihood classification (MLC) was applied using the above five fusion images. The results showed that the fusion effect of SFIM transform and Gram-Schmidt transform were better than the other three image fusion methods in spatial details improvement and spectral information fidelity, and Gram-Schmidt technique was superior to SFIM transform in the aspect of expressing image details. The classification accuracy of the fused image using Gram-Schmidt and SFIM algorithms was higher than that of the other three image fusion methods, and the overall accuracy was greater than 98%. The IHS-fused image classification accuracy was the lowest; the overall accuracy and kappa coefficient were 83.14% and 0.76, respectively. Thus the IKONOS fusion images obtained by the Gram-Schmidt and SFIM were better for improving the land cover classification accuracy.
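
    The overall accuracy and kappa coefficient quoted in this abstract are standard summaries of a confusion (error) matrix. The sketch below shows the usual definitions, with rows taken as reference classes and columns as predicted classes; the matrix values are made up for the example and are not the paper's data.

```python
def overall_accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix (list of rows)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix.
cm = [[40, 10], [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(cm)
print(round(accuracy, 2), round(kappa, 2))  # prints: 0.8 0.6
```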

  10. Algorithms for the Automatic Classification and Sorting of Conifers in the Garden Nursery Industry

    DEFF Research Database (Denmark)

    Petri, Stig

    The ultimate purpose of this work is the development of general feature extraction algorithms useful for the classification and sorting of plants in the garden nursery industry. Narrowing the area of focus to bare-root plants, more specifically Nordmann firs, the scientific literature dealing with the classification and sorting of plants using machine vision have been discussed as an introduction to the work reported here. The use of Nordmann firs as a basis for evaluating the developed algorithms naturally introduces a bias towards this species in the algorithms, but steps have been taken throughout... was used as the basis for evaluating the constructed feature extraction algorithms. Through an analysis of the construction of a machine vision system suitable for classifying and sorting plants, the needs with regard to physical frame, lighting system, camera and software algorithms have been uncovered...

  11. An Improved Brain-Inspired Emotional Learning Algorithm for Fast Classification

    Directory of Open Access Journals (Sweden)

    Ying Mei

    2017-06-01

    Full Text Available Classification is an important task of machine intelligence in the field of information. The artificial neural network (ANN) is widely used for classification. However, the traditional ANN shows slow training speed, and it is hard to meet the real-time requirement for large-scale applications. In this paper, an improved brain-inspired emotional learning (BEL) algorithm is proposed for fast classification. The BEL algorithm was put forward to mimic the high speed of the emotional learning mechanism in mammalian brain, which has the superior features of fast learning and low computational complexity. To improve the accuracy of BEL in classification, the genetic algorithm (GA) is adopted for optimally tuning the weights and biases of amygdala and orbitofrontal cortex in the BEL neural network. The combinational algorithm named as GA-BEL has been tested on eight University of California at Irvine (UCI) datasets and two well-known databases (Japanese Female Facial Expression, Cohn–Kanade). The comparisons of experiments indicate that the proposed GA-BEL is more accurate than the original BEL algorithm, and it is much faster than the traditional algorithm.

  12. Classification of underground pipe scanned images using feature extraction and neuro-fuzzy algorithm.

    Science.gov (United States)

    Sinha, S K; Karray, F

    2002-01-01

    Pipeline surface defects such as holes and cracks cause major problems for utility managers, particularly when the pipeline is buried under the ground. Manual inspection for surface defects in the pipeline has a number of drawbacks, including subjectivity, varying standards, and high costs. An automatic inspection system using image processing and artificial intelligence techniques can overcome many of these disadvantages and offer utility managers an opportunity to significantly improve quality and reduce costs. A method for recognition and classification of pipe cracks using image analysis and a neuro-fuzzy algorithm is proposed. In the preprocessing step the scanned images of pipe are analyzed and crack features are extracted. In the classification step a neuro-fuzzy algorithm is developed that employs a fuzzy membership function and an error backpropagation algorithm. The idea behind the proposed approach is that the fuzzy membership function will absorb variation of feature values and the backpropagation network, with its learning ability, will show good classification efficiency.

  13. Optimization of Neuro-Fuzzy System Using Genetic Algorithm for Chromosome Classification

    Directory of Open Access Journals (Sweden)

    M. Sarosa

    2013-09-01

    Full Text Available Neuro-fuzzy system has been shown to provide a good performance on chromosome classification but does not offer a simple method to obtain the accurate parameter values required to yield the best recognition rate. This paper presents a neuro-fuzzy system where its parameters can be automatically adjusted using genetic algorithms. The approach combines the advantages of fuzzy logic theory, neural networks, and genetic algorithms. The structure consists of a four layer feed-forward neural network that uses a GBell membership function as the output function. The proposed methodology has been applied and tested on banded chromosome classification from the Copenhagen Chromosome Database. Simulation result showed that the proposed neuro-fuzzy system optimized by genetic algorithms offers advantages in setting the parameter values, improves the recognition rate significantly and decreases the training/testing time which makes genetic neuro-fuzzy system suitable for chromosome classification.

  14. A Support Vector Machine Hydrometeor Classification Algorithm for Dual-Polarization Radar

    Directory of Open Access Journals (Sweden)

    Nicoletta Roberto

    2017-07-01

    Full Text Available An algorithm based on a support vector machine (SVM) is proposed for hydrometeor classification. The training phase is driven by the output of a fuzzy logic hydrometeor classification algorithm, i.e., the most popular approach for hydrometeor classification algorithms used for ground-based weather radar. The performance of SVM is evaluated by resorting to a weather scenario, generated by a weather model; the corresponding radar measurements are obtained by simulation and by comparing results of SVM classification with those obtained by a fuzzy logic classifier. Results based on the weather model and simulations show a higher accuracy of the SVM classification. Objective comparison of the two classifiers applied to real radar data shows that SVM classification maps are spatially more homogenous (textural indices, energy and homogeneity, increase by 21% and 12% respectively) and do not present non-classified data. The improvements found by SVM classifier, even though it is applied pixel-by-pixel, can be attributed to its ability to learn from the entire hyperspace of radar measurements and to the accurate training. The reliability of results and higher computing performance make SVM attractive for some challenging tasks such as its implementation in Decision Support Systems for helping pilots to make optimal decisions about changes in the flight route caused by unexpected adverse weather.

  15. Sequential Classification of Palm Gestures Based on A* Algorithm and MLP Neural Network for Quadrocopter Control

    Directory of Open Access Journals (Sweden)

    Wodziński Marek

    2017-06-01

    Full Text Available This paper presents an alternative approach to the sequential data classification, based on traditional machine learning algorithms (neural networks, principal component analysis, multivariate Gaussian anomaly detector) and finding the shortest path in a directed acyclic graph, using A* algorithm with a regression-based heuristic. Palm gestures were used as an example of the sequential data and a quadrocopter was the controlled object. The study includes creation of a conceptual model and practical construction of a system using the GPU to ensure the real-time operation. The results present the classification accuracy of chosen gestures and comparison of the computation time between the CPU- and GPU-based solutions.

  16. PCIU: Hardware Implementations of an Efficient Packet Classification Algorithm with an Incremental Update Capability

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2011-01-01

    Full Text Available Packet classification plays a crucial role for a number of network services such as policy-based routing, firewalls, and traffic billing, to name a few. However, classification can be a bottleneck in the above-mentioned applications if not implemented properly and efficiently. In this paper, we propose PCIU, a novel classification algorithm, which improves upon previously published work. PCIU provides lower preprocessing time, lower memory consumption, ease of incremental rule update, and reasonable classification time compared to state-of-the-art algorithms. The proposed algorithm was evaluated and compared to RFC and HiCut using several benchmarks. Results obtained indicate that PCIU outperforms these algorithms in terms of speed, memory usage, incremental update capability, and preprocessing time. The algorithm, furthermore, was improved and made more accessible for a variety of applications through implementation in hardware. Two such implementations are detailed and discussed in this paper. The results indicate that a hardware/software codesign approach results in a slower, but easier to optimize and improve within time constraints, PCIU solution. A hardware accelerator based on an ESL approach using Handel-C, on the other hand, resulted in a 31x speed-up over a pure software implementation running on a state of the art Xeon processor.

  17. A novel evaluation of two related and two independent algorithms for eye movement classification during reading.

    Science.gov (United States)

    Friedman, Lee; Rigas, Ioannis; Abdulin, Evgeny; Komogortsev, Oleg V

    2018-05-15

    Nyström and Holmqvist have published a method for the classification of eye movements during reading (ONH) (Nyström & Holmqvist, 2010). When we applied this algorithm to our data, the results were not satisfactory, so we modified the algorithm (now the MNH) to better classify our data. The changes included: (1) reducing the amount of signal filtering, (2) excluding a new type of noise, (3) removing several adaptive thresholds and replacing them with fixed thresholds, (4) changing the way that the start and end of each saccade was determined, (5) employing a new algorithm for detecting PSOs, and (6) allowing a fixation period to either begin or end with noise. A new method for the evaluation of classification algorithms is presented. It was designed to provide comprehensive feedback to an algorithm developer, in a time-efficient manner, about the types and numbers of classification errors that an algorithm produces. This evaluation was conducted by three expert raters independently, across 20 randomly chosen recordings, each classified by both algorithms. The MNH made many fewer errors in determining when saccades start and end, and it also detected some fixations and saccades that the ONH did not. The MNH fails to detect very small saccades. We also evaluated two additional algorithms: the EyeLink Parser and a more current, machine-learning-based algorithm. The EyeLink Parser tended to find more saccades that ended too early than did the other methods, and we found numerous problems with the output of the machine-learning-based algorithm.

  18. Automated detection and classification of cryptographic algorithms in binary programs through machine learning

    OpenAIRE

    Hosfelt, Diane Duros

    2015-01-01

    Threats from the internet, particularly malicious software (i.e., malware) often use cryptographic algorithms to disguise their actions and even to take control of a victim's system (as in the case of ransomware). Malware and other threats proliferate too quickly for the time-consuming traditional methods of binary analysis to be effective. By automating detection and classification of cryptographic algorithms, we can speed program analysis and more efficiently combat malware. This thesis wil...

  19. Data classification using metaheuristic Cuckoo Search technique for Levenberg Marquardt back propagation (CSLM) algorithm

    Science.gov (United States)

    Nawi, Nazri Mohd.; Khan, Abdullah; Rehman, M. Z.

    2015-05-01

    Nature-inspired metaheuristic techniques provide derivative-free solutions to complex problems. One of the latest additions to the group of nature-inspired optimization procedures is the Cuckoo Search (CS) algorithm. Artificial Neural Network (ANN) training is an optimization task since it is desired to find the optimal weight set of a neural network in the training process. Traditional training algorithms have some limitations, such as getting trapped in local minima and a slow convergence rate. This study proposed a new technique, CSLM, by combining the best features of two known algorithms, back-propagation (BP) and Levenberg Marquardt algorithm (LM), for improving the convergence speed of ANN training and avoiding the local minima problem by training this network. Some selected benchmark classification datasets are used for simulation. The experimental results show that the proposed cuckoo search with Levenberg Marquardt algorithm has better performance than the other algorithms used in this study.

  20. Walking pattern classification and walking distance estimation algorithms using gait phase information.

    Science.gov (United States)

    Wang, Jeen-Shing; Lin, Che-Wei; Yang, Ya-Ting C; Ho, Yu-Jen

    2012-10-01

    This paper presents a walking pattern classification and a walking distance estimation algorithm using gait phase information. A gait phase information retrieval algorithm was developed to analyze the duration of the phases in a gait cycle (i.e., stance, push-off, swing, and heel-strike phases). Based on the gait phase information, a decision tree based on the relations between gait phases was constructed for classifying three different walking patterns (level walking, walking upstairs, and walking downstairs). Gait phase information was also used for developing a walking distance estimation algorithm. The walking distance estimation algorithm consists of the processes of step count and step length estimation. The proposed walking pattern classification and walking distance estimation algorithm have been validated by a series of experiments. The accuracy of the proposed walking pattern classification was 98.87%, 95.45%, and 95.00% for level walking, walking upstairs, and walking downstairs, respectively. The accuracy of the proposed walking distance estimation algorithm was 96.42% over a walking distance.

  1. Classification of remotely sensed images

    CSIR Research Space (South Africa)

    Dudeni, N

    2008-10-01

    Full Text Available For this research, the researchers examine various existing image classification algorithms with the aim of demonstrating how these algorithms can be applied to remote sensing images. These algorithms are broadly divided into supervised...

  2. Experiments in Discourse Analysis Impact on Information Classification and Retrieval Algorithms.

    Science.gov (United States)

    Morato, Jorge; Llorens, J.; Genova, G.; Moreiro, J. A.

    2003-01-01

    Discusses the inclusion of contextual information in indexing and retrieval systems to improve results and the ability to carry out text analysis by means of linguistic knowledge. Presents research that investigated whether discourse variables have an impact on information retrieval and classification algorithms. (Author/LRW)

  3. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2012-01-01

    Full Text Available Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. The other application, voice classification, which has an important role in grouping unlabelled voice samples, has however not been widely studied. Lately voice classification has been found useful in phone monitoring, classifying speakers’ gender, ethnicity and emotional states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warping transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.

  4. The surgical algorithm for the AOSpine thoracolumbar spine injury classification system

    NARCIS (Netherlands)

    Vaccaro, Alexander R.; Schroeder, Gregory D.; Kepler, Christopher K.; Cumhur Oner, F.; Vialle, Luiz R.; Kandziora, Frank; Koerner, John D.; Kurd, Mark F.; Reinhold, Max; Schnake, Klaus J.; Chapman, Jens; Aarabi, Bizhan; Fehlings, Michael G.; Dvorak, Marcel F.

    2016-01-01

    Purpose: The goal of the current study is to establish a surgical algorithm to accompany the AOSpine thoracolumbar spine injury classification system. Methods: A survey was sent to AOSpine members from the six AO regions of the world, and surgeons were asked if a patient should undergo an initial

  5. A Comparative Study of Classification and Regression Algorithms for Modelling Students' Academic Performance

    Science.gov (United States)

    Strecht, Pedro; Cruz, Luís; Soares, Carlos; Mendes-Moreira, João; Abreu, Rui

    2015-01-01

    Predicting the success or failure of a student in a course or program is a problem that has recently been addressed using data mining techniques. In this paper we evaluate some of the most popular classification and regression algorithms on this problem. We address two problems: prediction of approval/failure and prediction of grade. The former is…

  6. Classification and learning using genetic algorithms applications in Bioinformatics and Web Intelligence

    CERN Document Server

    Bandyopadhyay, Sanghamitra

    2007-01-01

    This book provides a unified framework that describes how genetic learning can be used to design pattern recognition and learning systems. It examines how a search technique, the genetic algorithm, can be used for pattern classification mainly through approximating decision boundaries. Coverage also demonstrates the effectiveness of the genetic classifiers vis-à-vis several widely used classifiers, including neural networks.

  7. A Novel Algorithm for Imbalance Data Classification Based on Neighborhood Hypergraph

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2014-01-01

    Full Text Available The classification problem for imbalanced data is receiving increasing attention. So far, many significant methods have been proposed and applied in many fields, but more efficient methods are still needed. Although the hypergraph is an efficient tool for knowledge discovery, it may not be powerful enough to deal with data in the boundary region. In this paper, the neighborhood hypergraph is presented, combining rough set theory and the hypergraph. A novel classification algorithm for imbalanced data based on the neighborhood hypergraph is then developed, which is composed of three steps: initialization of hyperedges, classification of the training data set, and substitution of hyperedges. In an experiment with 10-fold cross validation on 18 data sets, the proposed algorithm achieved higher average accuracy than the others.
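
    As a rough illustration of the 10-fold cross validation protocol mentioned in the abstract above, the following minimal sketch generates train/test index splits. The function name `k_fold_indices` and the round-robin assignment of samples to folds are assumptions made for illustration; the abstract does not say how the folds were built, and for imbalanced data a stratified split is often preferred.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross validation.

    Samples are assigned to folds round-robin (index i goes to fold i % k).
    This is an illustrative assumption, not the procedure of the paper.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set: every index that is not in the held-out fold.
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        yield train, test


# Each sample is held out exactly once across the k folds.
splits = list(k_fold_indices(20, k=10))
```

    The average accuracy reported for such an experiment is the mean of the accuracies measured on the held-out fold of each split.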

  8. Does a Diagnostic Classification Algorithm Help to Predict the Course of Low Back Pain?

    DEFF Research Database (Denmark)

    Hartvigsen, Lisbeth; Kongsted, Alice; Vach, Werner

    2018-01-01

    Study Design: A prospective observational study. Background: A diagnostic classification algorithm was developed by Petersen et al., consisting of 12 categories based on a standardized examination protocol with the primary purpose of identifying clinically homogeneous subgroups of low back pain (LBP). Objectives: To investigate if a diagnostic classification algorithm is associated with activity limitation and LBP intensity at 2-week and 3-month follow up, and 1-year trajectories of LBP intensity, and if it improves prediction of outcome when added to a set of known predictors. Methods: 934 consecutive adult patients, with new episodes of LBP, who were visiting chiropractic practices in primary care were categorized according to the Petersen classification. Outcomes were disability and pain intensity measured at 2 weeks and 3 months, and 1-year trajectories of LBP based on weekly responses to text...

  9. Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm

    Directory of Open Access Journals (Sweden)

    E. Parvinnia

    2014-01-01

    Full Text Available Electroencephalogram (EEG) signals are often used to diagnose conditions such as seizures, Alzheimer's disease, and schizophrenia. One main problem with recorded EEG samples is that they are not equally reliable, owing to artifacts at the time of recording. EEG signal classification algorithms should have a mechanism to handle this issue, and adaptive classifiers appear useful for biological signals such as EEG. In this paper, a general adaptive method named weighted distance nearest neighbor (WDNN) is applied to EEG signal classification to tackle this problem. This classification algorithm assigns a weight to each training sample to control its influence in classifying test samples. The weights of the training samples are used to find the nearest neighbor of an input query pattern. To assess the performance of this scheme, EEG signals of thirteen schizophrenic patients and eighteen normal subjects are analyzed for the classification of these two groups. Several features, including fractal dimension, band power and autoregressive (AR) model coefficients, are extracted from the EEG signals. The classification results are evaluated using leave-one-subject-out cross validation for reliable estimation. The results indicate that the combination of WDNN and the selected features can significantly outperform the basic nearest-neighbor method and the other methods proposed in the past for the classification of these two groups. Therefore, this method can be a complementary tool for specialists in identifying schizophrenia.
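
    The idea of letting a per-sample weight control a training sample's influence on the nearest-neighbor search can be sketched as follows. This is a minimal illustration, not the WDNN method of the paper: the abstract does not give the weighting formula or how the weights are learned, so dividing the Euclidean distance by a positive weight, and the function name `weighted_nearest_neighbor`, are assumptions.

```python
import math


def weighted_nearest_neighbor(query, samples, labels, weights):
    """Return the label of the training sample with the smallest weighted distance.

    Assumption for illustration: the weighted distance is the Euclidean
    distance divided by the sample's weight, so a larger (positive) weight
    makes a sample more influential. Weights must be greater than zero.
    """
    best_label, best_score = None, math.inf
    for x, y, w in zip(samples, labels, weights):
        score = math.dist(query, x) / w
        if score < best_score:
            best_label, best_score = y, score
    return best_label


# Toy feature vectors (hypothetical values, not EEG data from the study).
samples = [(0.0, 0.0), (1.0, 1.0)]
labels = ["normal", "patient"]
```

    With equal weights this reduces to the basic nearest-neighbor rule; down-weighting a sample (for example, one recorded with artifacts) makes it less likely to be selected as the nearest neighbor.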

  10. Statistical classification techniques in high energy physics (SDDT algorithm)

    International Nuclear Information System (INIS)

    Bouř, Petr; Kůs, Václav; Franc, Jiří

    2016-01-01

    We present our proposal of the supervised binary divergence decision tree with nested separation method based on the generalized linear models. A key insight we provide is the clustering driven only by a few selected physical variables. The proper selection consists of the variables achieving the maximal divergence measure between two different classes. Further, we apply our method to Monte Carlo simulations of physics processes corresponding to a data sample of top quark-antiquark pair candidate events in the lepton+jets decay channel. The data sample is produced in pp̅ collisions at √s = 1.96 TeV. It corresponds to an integrated luminosity of 9.7 fb⁻¹ recorded with the D0 detector during Run II of the Fermilab Tevatron Collider. The efficiency of our algorithm achieves 90% AUC in separating signal from background. We also briefly deal with the modification of statistical tests applicable to weighted data sets in order to test homogeneity of the Monte Carlo simulations and measured data. The justification of these modified tests is proposed through the divergence tests. (paper)

  11. Vision-based Human Action Classification Using Adaptive Boosting Algorithm

    KAUST Repository

    Zerrouki, Nabil; Harrou, Fouzi; Sun, Ying; Houacine, Amrane

    2018-01-01

    Precise recognition of human action is a key enabler for the development of many applications, including autonomous robots for medical diagnosis and surveillance of elderly people in the home environment. This paper addresses human action recognition based on variation in body shape. Specifically, we divide the human body into five partitions that correspond to five partial occupancy areas. For each frame, we calculate area ratios and use them as input data for the recognition stage. We consider six classes of activities: walking, standing, bending, lying, squatting, and sitting. We propose an efficient human action recognition scheme that takes advantage of the superior discrimination capacity of the AdaBoost algorithm. We validated the effectiveness of this approach using experimental data from two publicly available fall detection datasets, from the University of Rzeszow and the Universidad de Málaga. We compared the proposed approach with state-of-the-art classifiers based on the neural network, K-nearest neighbor, support vector machine and naïve Bayes, and showed that it achieves better results in discriminating human gestures.

  12. Vision-based Human Action Classification Using Adaptive Boosting Algorithm

    KAUST Repository

    Zerrouki, Nabil

    2018-05-07

    Precise recognition of human action is a key enabler for the development of many applications, including autonomous robots for medical diagnosis and surveillance of elderly people in the home environment. This paper addresses human action recognition based on variation in body shape. Specifically, we divide the human body into five partitions that correspond to five partial occupancy areas. For each frame, we calculate area ratios and use them as input data for the recognition stage. We consider six classes of activities: walking, standing, bending, lying, squatting, and sitting. We propose an efficient human action recognition scheme that takes advantage of the superior discrimination capacity of the AdaBoost algorithm. We validated the effectiveness of this approach using experimental data from two publicly available fall detection datasets, from the University of Rzeszow and the Universidad de Málaga. We compared the proposed approach with state-of-the-art classifiers based on the neural network, K-nearest neighbor, support vector machine and naïve Bayes, and showed that it achieves better results in discriminating human gestures.

  13. An arrhythmia classification algorithm using a dedicated wavelet adapted to different subjects.

    Science.gov (United States)

    Kim, Jinkwon; Min, Se Dong; Lee, Myoungho

    2011-06-27

    Numerous studies have been conducted regarding a heartbeat classification algorithm over the past several decades. However, many algorithms have also been studied to acquire robust performance, as biosignals have a large amount of variation among individuals. Various methods have been proposed to reduce the differences coming from personal characteristics, but these expand the differences caused by arrhythmia. In this paper, an arrhythmia classification algorithm using a dedicated wavelet adapted to individual subjects is proposed. We reduced the performance variation using dedicated wavelets, as in the ECG morphologies of the subjects. The proposed algorithm utilizes morphological filtering and a continuous wavelet transform with a dedicated wavelet. A principal component analysis and linear discriminant analysis were utilized to compress the morphological data transformed by the dedicated wavelets. An extreme learning machine was used as a classifier in the proposed algorithm. A performance evaluation was conducted with the MIT-BIH arrhythmia database. The results showed a high sensitivity of 97.51%, specificity of 85.07%, accuracy of 97.94%, and a positive predictive value of 97.26%. The proposed algorithm achieves better accuracy than other state-of-the-art algorithms with no intrasubject between the training and evaluation datasets. And it significantly reduces the amount of intervention needed by physicians.
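
    The four figures of merit quoted in the abstract above (sensitivity, specificity, accuracy, and positive predictive value) follow from the standard confusion-count definitions, sketched below. The function name `binary_metrics` and the example counts are illustrative assumptions; they are not taken from the MIT-BIH evaluation.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),            # share of actual positives detected
        "specificity": tn / (tn + fp),            # share of actual negatives rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                    # positive predictive value
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

    Because accuracy pools both classes, it can stay high even when specificity is noticeably lower, if one class dominates the evaluation set.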

  14. An arrhythmia classification algorithm using a dedicated wavelet adapted to different subjects

    Directory of Open Access Journals (Sweden)

    Min Se Dong

    2011-06-01

    Full Text Available Abstract Background Numerous studies have been conducted regarding a heartbeat classification algorithm over the past several decades. However, many algorithms have also been studied to acquire robust performance, as biosignals have a large amount of variation among individuals. Various methods have been proposed to reduce the differences coming from personal characteristics, but these expand the differences caused by arrhythmia. Methods In this paper, an arrhythmia classification algorithm using a dedicated wavelet adapted to individual subjects is proposed. We reduced the performance variation using dedicated wavelets, as in the ECG morphologies of the subjects. The proposed algorithm utilizes morphological filtering and a continuous wavelet transform with a dedicated wavelet. A principal component analysis and linear discriminant analysis were utilized to compress the morphological data transformed by the dedicated wavelets. An extreme learning machine was used as a classifier in the proposed algorithm. Results A performance evaluation was conducted with the MIT-BIH arrhythmia database. The results showed a high sensitivity of 97.51%, specificity of 85.07%, accuracy of 97.94%, and a positive predictive value of 97.26%. Conclusions The proposed algorithm achieves better accuracy than other state-of-the-art algorithms with no intrasubject between the training and evaluation datasets. And it significantly reduces the amount of intervention needed by physicians.

  15. Multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement

    Science.gov (United States)

    Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing

    2018-02-01

    For the problems of missing details and performance of the colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm can achieve a natural colorized effect for a gray-scale image, and it is consistent with the human vision. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy rate of the classification, the corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement based on Laplacian Pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only realizes the colorization of the visual gray-scale image, but also can be applied to the other areas, such as color transfer between color images, colorizing gray fusion images, and infrared images.

  16. Brake fault diagnosis using Clonal Selection Classification Algorithm (CSCA) – A statistical learning approach

    Directory of Open Access Journals (Sweden)

    R. Jegadeeshwaran

    2015-03-01

    Full Text Available In an automobile, the brake system is an essential part responsible for control of the vehicle. Any failure in the brake system affects the vehicle's motion and can have catastrophic effects on the safety of the vehicle and its passengers. Condition monitoring of the brake system is therefore essential. Vibration-based condition monitoring using machine learning techniques is gaining momentum. This study is one such attempt to perform condition monitoring of a hydraulic brake system through vibration analysis. In this research, the performance of a Clonal Selection Classification Algorithm (CSCA) for brake fault diagnosis is reported. A hydraulic brake system test rig was fabricated. Vibration signals were acquired using a piezoelectric transducer under good and faulty conditions of the brake system. Statistical parameters were extracted from the vibration signals, and the best feature set for classification was identified using an attribute evaluator. The selected features were then classified using CSCA. The classification accuracy of this artificial intelligence technique is compared with other machine learning approaches and discussed. The Clonal Selection Classification Algorithm performs better and gives the maximum classification accuracy (96%) for the fault diagnosis of a hydraulic brake system.

  17. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Full Text Available Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased notably in recent years. However, because of the large number of genes in microarray gene expression data and the mostly nonlinear relations among those data, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

  18. A Comprehensive Study of Features and Algorithms for URL-Based Topic Classification

    CERN Document Server

    Weber, I; Henzinger, M; Baykan, E

    2011-01-01

    Given only the URL of a Web page, can we identify its topic? We study this problem in detail by exploring a large number of different feature sets and algorithms on several datasets. We also show that the inherent overlap between topics and the sparsity of the information in URLs makes this a very challenging problem. Web page classification without a page's content is desirable when the content is not available at all, when a classification is needed before obtaining the content, or when classification speed is of utmost importance. For our experiments we used five different corpora comprising a total of about 3 million (URL, classification) pairs. We evaluated several techniques for feature generation and classification algorithms. The individual binary classifiers were then combined via boosting into metabinary classifiers. We achieve typical F-measure values between 80 and 85, and a typical precision of around 86. The precision can be pushed further over 90 while maintaining a typical level of recall betw...
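
    The F-measure quoted in the abstract above combines precision and recall into a single score. A minimal sketch of the standard definition follows; the function name `f_measure` is an assumption, and the abstract does not state which beta was used (beta = 1, the harmonic mean, is assumed here as the common default).

```python
def f_measure(precision, recall, beta=1.0):
    """Return the F-beta score; beta=1 gives the harmonic mean of precision and recall.

    Inputs are fractions in [0, 1]. Returns 0.0 when both are zero to avoid
    dividing by zero.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Perfect precision with 50% recall yields F1 = 2/3.
score = f_measure(1.0, 0.5)
```

    Since the harmonic mean is pulled toward the smaller of the two inputs, pushing precision higher at the cost of recall does not necessarily raise the F-measure.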

  19. Machine learning algorithms for meteorological event classification in the coastal area using in-situ data

    Science.gov (United States)

    Sokolov, Anton; Gengembre, Cyril; Dmitriev, Egor; Delbarre, Hervé

    2017-04-01

    We consider the problem of classifying local atmospheric meteorological events in a coastal area, such as sea breezes, fogs and storms. In-situ meteorological data such as wind speed and direction, temperature, humidity and turbulence are used as predictors. Local atmospheric events of 2013-2014 in the coastal area of the English Channel at Dunkirk (France) were analysed manually to train the classification algorithms. Ultrasonic anemometer data and LIDAR wind profiler data were then used as predictors. Several algorithms were applied to determine meteorological events from local data, including a decision tree, the nearest neighbour classifier, and a support vector machine. The classification algorithms were compared, and the most important predictors for each event type were determined. In more than 80 percent of the cases, the machine learning algorithms detect the meteorological class correctly. We expect that this methodology could also be applied to classify events from climatological in-situ data or from modelling data. It allows the frequency of each event to be estimated in the perspective of climate change.

  20. Feature Selection for Motor Imagery EEG Classification Based on Firefly Algorithm and Learning Automata

    Directory of Open Access Journals (Sweden)

    Aiming Liu

    2017-11-01

    Full Text Available Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain–computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain–computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain–computer interface systems.

  1. A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

    Directory of Open Access Journals (Sweden)

    Lavner Yizhar

    2009-01-01

    Full Text Available We present an efficient algorithm for segmentation of audio signals into speech or music. The central motivation to our study is consumer audio applications, where various real-time enhancements are often applied. The algorithm consists of a learning phase and a classification phase. In the learning phase, predefined training data is used for computing various time-domain and frequency-domain features, for speech and music signals separately, and estimating the optimal speech/music thresholds, based on the probability density functions of the features. An automatic procedure is employed to select the best features for separation. In the test phase, initial classification is performed for each segment of the audio signal, using a three-stage sieve-like approach, applying both Bayesian and rule-based methods. To avoid erroneous rapid alternations in the classification, a smoothing technique is applied, averaging the decision on each segment with past segment decisions. Extensive evaluation of the algorithm, on a database of more than 12 hours of speech and more than 22 hours of music showed correct identification rates of 99.4% and 97.8%, respectively, and quick adjustment to alternating speech/music sections. In addition to its accuracy and robustness, the algorithm can be easily adapted to different audio types, and is suitable for real-time operation.
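
    The smoothing step described in the abstract above, averaging each segment's decision with past segment decisions, can be sketched as a causal moving average over binary decisions. The window length, the 0.5 threshold, the tie-breaking toward speech, and the function name `smooth_decisions` are all assumptions for illustration; the abstract does not specify them.

```python
from collections import deque


def smooth_decisions(raw, window=3):
    """Smooth per-segment decisions (1 = speech, 0 = music).

    Each output is the thresholded average of the current raw decision and
    up to window-1 preceding raw decisions. Ties (average of exactly 0.5)
    resolve to speech; this is an arbitrary illustrative choice.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` decisions
    smoothed = []
    for decision in raw:
        recent.append(decision)
        average = sum(recent) / len(recent)
        smoothed.append(1 if average >= 0.5 else 0)
    return smoothed


# The isolated 0 at index 3 is suppressed; the sustained switch to music is kept.
result = smooth_decisions([1, 1, 1, 0, 1, 1, 0, 0, 0], window=3)
# → [1, 1, 1, 1, 1, 1, 1, 0, 0]
```

    A longer window suppresses more spurious alternations but delays the response to a genuine change between speech and music, so the window length trades stability against adjustment speed.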

  2. Feature Selection for Motor Imagery EEG Classification Based on Firefly Algorithm and Learning Automata.

    Science.gov (United States)

    Liu, Aiming; Chen, Kun; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi

    2017-11-08

    Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain-computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain-computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain-computer interface systems.

  3. Comparison of some classification algorithms based on deterministic and nondeterministic decision rules

    KAUST Repository

    Delimata, Paweł

    2010-01-01

    We discuss two, in a sense extreme, kinds of nondeterministic rules in decision tables. Rules of the first kind, called inhibitory rules, block only one decision value (i.e., they have all but one of the possible decisions on their right-hand sides). In contrast, any rule of the second kind, called a bounded nondeterministic rule, can have only a few decisions on its right-hand side. We show that both kinds of rules can be used for improving the quality of classification. In the paper, two lazy classification algorithms of polynomial time complexity are considered. These algorithms are based on deterministic and inhibitory decision rules, but the direct generation of rules is not required. Instead, for any new object the considered algorithms efficiently extract from a given decision table some information about the set of rules. Next, this information is used by a decision-making procedure. The reported results of experiments show that the algorithms based on inhibitory decision rules are often better than those based on deterministic decision rules. We also present an application of bounded nondeterministic rules in the construction of rule-based classifiers. We include the results of experiments showing that by combining rule-based classifiers based on minimal decision rules with bounded nondeterministic rules having confidence close to 1 and sufficiently large support, it is possible to improve the classification quality. © 2010 Springer-Verlag.

  4. Seasonal cultivated and fallow cropland mapping using MODIS-based automated cropland classification algorithm

    Science.gov (United States)

    Wu, Zhuoting; Thenkabail, Prasad S.; Mueller, Rick; Zakzeski, Audra; Melton, Forrest; Johnson, Lee; Rosevelt, Carolyn; Dwyer, John; Jones, Jeanine; Verdin, James P.

    2014-01-01

    Increasing drought occurrences and growing populations demand accurate, routine, and consistent cultivated and fallow cropland products to enable water and food security analysis. The overarching goal of this research was to develop and test an automated cropland classification algorithm (ACCA) that provides accurate, consistent, and repeatable information on seasonal cultivated as well as seasonal fallow cropland extents and areas based on Moderate Resolution Imaging Spectroradiometer (MODIS) remote sensing data. The seasonal ACCA development process involves writing a series of iterative decision tree codes to separate cultivated and fallow croplands from noncroplands, aiming to accurately mirror reliable reference data sources. A pixel-by-pixel accuracy assessment against the U.S. Department of Agriculture (USDA) cropland data showed, on average, a producer’s accuracy of 93% and a user’s accuracy of 85% across all months. Further, ACCA-derived cropland maps agreed well with the USDA Farm Service Agency crop acreage-reported data for both cultivated and fallow croplands, with R-square values over 0.7, and with field surveys, with an accuracy of ≥95% for cultivated croplands and ≥76% for fallow croplands. Our results demonstrated the ability of ACCA to generate cropland products, such as cultivated and fallow cropland extents and areas, accurately, automatically, and repeatedly throughout the growing season.
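
    Producer's and user's accuracy, the two measures reported in the abstract above, are standard remote sensing terms computed from a confusion matrix of reference versus mapped classes. The sketch below shows the usual definitions; the function name, the nested-dictionary layout, and the pixel counts are illustrative assumptions and do not come from the ACCA assessment.

```python
def producers_users_accuracy(confusion, cls):
    """Return (producer's accuracy, user's accuracy) for one class.

    confusion[r][m] is the number of pixels whose reference class is r and
    whose mapped (classified) class is m.
    """
    correct = confusion[cls][cls]
    reference_total = sum(confusion[cls].values())               # all pixels truly in cls
    mapped_total = sum(row[cls] for row in confusion.values())   # all pixels mapped as cls
    return correct / reference_total, correct / mapped_total


# Hypothetical pixel counts for a two-class map.
confusion = {
    "cropland": {"cropland": 80, "noncropland": 20},
    "noncropland": {"cropland": 10, "noncropland": 90},
}
producer, user = producers_users_accuracy(confusion, "cropland")
```

    Producer's accuracy answers how much of the true cropland the map captured (omission errors lower it), while user's accuracy answers how much of the mapped cropland is really cropland (commission errors lower it).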

  6. Classification of Aerosol Retrievals from Spaceborne Polarimetry Using a Multiparameter Algorithm

    Science.gov (United States)

    Russell, Philip B.; Kacenelenbogen, Meloe; Livingston, John M.; Hasekamp, Otto P.; Burton, Sharon P.; Schuster, Gregory L.; Johnson, Matthew S.; Knobelspiesse, Kirk D.; Redemann, Jens; Ramachandran, S.

    2013-01-01

    In this presentation, we demonstrate application of a new aerosol classification algorithm to retrievals from the POLDER-3 polarimeter on the PARASOL spacecraft. Motivation and method: Since the development of global aerosol measurements by satellites and AERONET, classification of observed aerosols into several types (e.g., urban-industrial, biomass burning, mineral dust, maritime, and various subtypes or mixtures of these) has proven useful for understanding aerosol sources, transformations, effects, and feedback mechanisms; improving the accuracy of satellite retrievals; and quantifying assessments of aerosol radiative impacts on climate.

  7. Spectral Classification of Similar Materials using the Tetracorder Algorithm: The Calcite-Epidote-Chlorite Problem

    Science.gov (United States)

    Dalton, J. Brad; Bove, Dana; Mladinich, Carol; Clark, Roger; Rockwell, Barnaby; Swayze, Gregg; King, Trude; Church, Stanley

    2001-01-01

    Recent work on automated spectral classification algorithms has sought to distinguish ever-more similar materials. From modest beginnings separating shade, soil, rock and vegetation to ambitious attempts to discriminate mineral types and specific plant species, the trend seems to be toward using increasingly subtle spectral differences to perform the classification. Rule-based expert systems that exploit the underlying physics of spectroscopy, such as the US Geological Survey Tetracorder system, are now taking advantage of the high spectral resolution and dimensionality of current imaging spectrometer designs to discriminate spectrally similar materials. The current paper details recent efforts to discriminate three minerals having absorptions centered at the same wavelength, with encouraging results.

  8. A Region-Based GeneSIS Segmentation Algorithm for the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    Stelios K. Mylonas

    2015-03-01

    This paper proposes an object-based segmentation/classification scheme for remotely sensed images, based on a novel variant of the recently proposed Genetic Sequential Image Segmentation (GeneSIS) algorithm. GeneSIS segments the image iteratively, extracting a single object at each iteration via a genetic-based object extraction algorithm. In the previous pixel-based GeneSIS, candidate objects were evaluated through the fuzzy content of their included pixels. In the newly developed region-based GeneSIS, by contrast, a watershed-driven fine segmentation map is first obtained from the original image and serves as the basis for the subsequent GeneSIS segmentation. Furthermore, to enhance the spatial search capabilities, we introduce a more descriptive encoding scheme in the object extraction algorithm, in which the structural search modules are represented by polygonal shapes. Our objectives in the new framework are to enhance the algorithm's ability to extract more flexible object shapes, ensure high classification accuracy, and reduce the execution time of the segmentation, while preserving all the inherent attributes of the GeneSIS approach. Finally, exploiting the inherent ability of GeneSIS to produce multiple segmentations, we also propose two segmentation fusion schemes that operate on the ensemble of segmentations generated by GeneSIS. Our approaches are tested on one urban and two agricultural images. The results show that region-based GeneSIS has considerably lower computational demands than the pixel-based version. Furthermore, the suggested methods achieve higher classification accuracies and good segmentation maps compared to a series of existing algorithms.

  9. An Incremental Classification Algorithm for Mining Data with Feature Space Heterogeneity

    Directory of Open Access Journals (Sweden)

    Yu Wang

    2014-01-01

    Feature space heterogeneity exists in many real-world data sets, so that some features are of different importance for classification over different subsets. Moreover, the pattern of feature space heterogeneity might change dynamically over time as more and more data are accumulated. In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem. In our approach, supervised clustering is implemented to obtain a number of clusters such that samples in each cluster are from the same class. After the removal of outliers, the relevance of features in each cluster is calculated based on their variations in this cluster. The feature relevance is incorporated into the distance calculation for classification. The main advantage of SCCFSH is that it can solve a classification problem with feature space heterogeneity in an incremental way, which is favorable for online classification tasks with continuously changing data. Experimental results on a series of data sets and an application to a database marketing problem show the efficiency and effectiveness of the proposed approach.

  10. Classification of upper limb disability levels of children with spastic unilateral cerebral palsy using K-means algorithm.

    Science.gov (United States)

    Raouafi, Sana; Achiche, Sofiane; Begon, Mickael; Sarcher, Aurélie; Raison, Maxime

    2018-01-01

    Treatment for cerebral palsy depends upon the severity of the child's condition and requires knowledge about upper limb disability. The aim of this study was to develop a systematic quantitative classification method of the upper limb disability levels for children with spastic unilateral cerebral palsy based on upper limb movements and muscle activation. Thirteen children with spastic unilateral cerebral palsy and six typically developing children participated in this study. Patients were matched on age and manual ability classification system levels I to III. Twenty-three kinematic and electromyographic (EMG) variables were collected from two tasks. Discriminant analysis and the K-means clustering algorithm were applied using the 23 kinematic and EMG variables of each participant. Among these 23 variables, discriminant analysis identified only two as containing the most relevant information for predicting the four levels of severity of spastic unilateral cerebral palsy fixed by the manual ability classification system: (1) the Falconer index (CAI_E), which represents the ratio of biceps to triceps brachii activity during extension, and (2) the maximal extension angle (θ_Extension,max). A good correlation (Kendall rank correlation coefficient = -0.53, p = 0.01) was found between the levels fixed by the manual ability classification system and the obtained classes. These findings suggest that the cost and effort needed to assess and characterize the disability level of a child can be further reduced.

  11. [Automatic Sleep Stage Classification Based on an Improved K-means Clustering Algorithm].

    Science.gov (United States)

    Xiao, Shuyuan; Wang, Bei; Zhang, Jian; Zhang, Qunfeng; Zou, Junzhong

    2016-10-01

    Sleep stage scoring is an active research topic in medicine and neuroscience. Visual inspection of sleep is laborious, and the results may vary between clinicians. An automatic sleep stage classification algorithm can be used to reduce the manual workload. However, there are still limitations when it encounters complicated and changeable clinical cases. The purpose of this paper is to develop an automatic sleep staging algorithm based on the characteristics of actual sleep data. In the proposed improved K-means clustering algorithm, points were selected as the initial centers by using a concept of density to avoid the randomness of the original K-means algorithm. Meanwhile, the cluster centers were updated according to the 'Three-Sigma Rule' during the iteration to reduce the influence of outliers. The proposed method was tested and analyzed on the overnight sleep data of healthy persons and of patients with sleep disorders after continuous positive airway pressure (CPAP) treatment. The automatic sleep stage classification results were compared with the visual inspection by qualified clinicians, and the average accuracy reached 76%. Analysis of the morphological diversity of sleep data showed that the proposed improved K-means algorithm was feasible and valid for clinical practice.
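
    The two modifications named in this abstract (density-based initial centers and a three-sigma center update) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `radius` parameter, the neighbour-count definition of density, and the exact form of the three-sigma cut-off (mean plus three standard deviations of the member-to-center distances) are all assumptions made here.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def density_init(points, k, radius):
    """Assumed density rule: pick up to k densest points that lie more than `radius` apart."""
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = []
    for i in order:
        if all(dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers


def three_sigma_mean(members, center):
    """Mean of the members whose distance to the current center is within mean + 3*std."""
    d = [dist(m, center) for m in members]
    mu = sum(d) / len(d)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in d) / len(d))
    kept = [m for m, x in zip(members, d) if x <= mu + 3 * sigma]
    return tuple(sum(c) / len(kept) for c in zip(*kept))


def kmeans(points, k, radius, iters=20):
    centers = density_init(points, k, radius)
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Empty clusters keep their previous center.
        centers = [three_sigma_mean(c, ctr) if c else ctr
                   for c, ctr in zip(clusters, centers)]
    return centers


# Toy 2-D data, hypothetical; real input would be per-epoch sleep features.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers = kmeans(points, k=2, radius=2.0)
print(sorted(centers))
```

    Because the initial centers are chosen deterministically from the data, repeated runs on the same input give the same result, which is the property the abstract contrasts with random initialisation.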

  12. Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

    OpenAIRE

    Otero, Fernando E.B.; Freitas, Alex A.

    2016-01-01

    The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create rules in a one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules); i.e., the ACO search is guided by the quality of a list of rules, instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorith...

  13. Algorithm for Optimizing Bipolar Interconnection Weights with Applications in Associative Memories and Multitarget Classification

    Science.gov (United States)

    Chang, Shengjiang; Wong, Kwok-Wo; Zhang, Wenwei; Zhang, Yanxin

    1999-08-01

    An algorithm for optimizing a bipolar interconnection weight matrix with the Hopfield network is proposed. The effectiveness of this algorithm is demonstrated by computer simulation and optical implementation. In the optical implementation of the neural network the interconnection weights are biased to yield a nonnegative weight matrix. Moreover, a threshold subchannel is added so that the system can realize, in real time, the bipolar weighted summation in a single channel. Preliminary experimental results obtained from the applications in associative memories and multitarget classification with rotation invariance are shown.

  14. Comparison of classification algorithms for various methods of preprocessing radar images of the MSTAR base

    Science.gov (United States)

    Borodinov, A. A.; Myasnikov, V. V.

    2018-04-01

    The present work compares the accuracy of known classification algorithms in the task of recognizing local objects in radar images under various image preprocessing methods. Preprocessing involves speckle noise filtering and normalization of the object orientation in the image, both by the method of image moments and by a method based on the Hough transform. The following classification algorithms are compared: decision tree, support vector machine, AdaBoost, and random forest. Principal component analysis is used to reduce the dimension. The research is carried out on objects from the MSTAR radar image database. The paper presents the results of the conducted studies.

  15. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by Combining Landsat, MODIS, and Secondary Data

    OpenAIRE

    Thenkabail, Prasad S.; Wu, Zhuoting

    2012-01-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan u...

  16. Linear Subpixel Learning Algorithm for Land Cover Classification from WELD using High Performance Computing

    Science.gov (United States)

    Ganguly, S.; Kumar, U.; Nemani, R. R.; Kalia, S.; Michaelis, A.

    2017-12-01

    In this work, we use a Fully Constrained Least Squares Subpixel Learning Algorithm to unmix global WELD (Web Enabled Landsat Data) to obtain fractions or abundances of substrate (S), vegetation (V) and dark objects (D) classes. Because of the sheer volume of data and the compute needs, we leveraged the NASA Earth Exchange (NEX) high performance computing architecture to optimize and scale our algorithm for large-scale processing. Subsequently, the S-V-D abundance maps were characterized into four classes, namely forest, farmland, water and urban areas (with NPP-VIIRS, National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite, nighttime lights data) over California, USA, using a Random Forest classifier. Validation of these land cover maps with NLCD (National Land Cover Database) 2011 products and NAFD (North American Forest Dynamics) static forest cover maps showed that an overall classification accuracy of over 91% was achieved, which is a 6% improvement in unmixing-based classification relative to per-pixel-based classification. As such, abundance maps continue to offer a useful alternative to classification maps derived from high-spatial-resolution data for forest inventory analysis, multi-class mapping for eco-climatic models and applications, fast multi-temporal trend analysis, and societal and policy-relevant applications needed at the watershed scale.

  17. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    Science.gov (United States)

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.
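
    The recursive partitioning described in this abstract rests on repeatedly choosing the split that makes the two resulting subgroups most homogeneous. The sketch below shows one such split for a single numeric predictor and a binary outcome, using the Gini impurity commonly associated with CART. The variable names and the toy data are hypothetical and are not drawn from the Tirilazad database.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) outcomes."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form `value <= threshold`."""
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical ages and dichotomized outcomes (1 = poor outcome), for illustration only.
ages = [35, 42, 50, 61, 68, 75]
outcome = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(ages, outcome)
print(threshold, impurity)
```

    A full tree applies the same search to each resulting subgroup, over all candidate predictors, until a stopping rule is met.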

  18. Machine learning algorithms for mode-of-action classification in toxicity assessment.

    Science.gov (United States)

    Zhang, Yile; Wong, Yau Shu; Deng, Jian; Anton, Cristina; Gabos, Stephan; Zhang, Weiping; Huang, Dorothy Yu; Jin, Can

    2016-01-01

    Real Time Cell Analysis (RTCA) technology is used to monitor cellular changes continuously over the entire exposure period. Combined with different testing concentrations, the profiles have potential for probing the mode of action (MOA) of the tested substances. In this paper, we present machine learning approaches for MOA assessment. Computational tools based on artificial neural networks (ANN) and support vector machines (SVM) are developed to analyze the time-concentration response curves (TCRCs) of human cell lines responding to tested chemicals. The techniques are capable of learning from given TCRCs with known MOA information and then classifying the MOA of substances of unknown toxicity. A novel data processing step based on the wavelet transform is introduced to extract important features from the original TCRC data. From the dose response curves, the time interval leading to a higher classification success rate can be selected as input to enhance the performance of the machine learning algorithm. This is particularly helpful when handling cases with limited and imbalanced data. The proposed method is validated by applying the supervised learning algorithm to the exposure data of the HepG2 cell line for 63 chemicals with 11 concentrations in each test case. Classification success rates in the range of 85 to 95% are obtained using SVM for MOA classification in cases with two to four clusters. The wavelet transform is capable of capturing important features of TCRCs for MOA classification. The proposed SVM scheme combined with the wavelet transform has great potential for large-scale MOA classification and high-throughput chemical screening.

  19. A Hybrid Multiobjective Differential Evolution Algorithm and Its Application to the Optimization of Grinding and Classification

    Directory of Open Access Journals (Sweden)

    Yalin Wang

    2013-01-01

    Grinding-classification is the prerequisite process for full recovery of nonrenewable minerals, with both production quality and quantity objectives concerned. Its natural formulation is a constrained multiobjective optimization problem of complex expression, since the process is composed of one grinding machine and two classification machines. In this paper, a hybrid differential evolution (DE) algorithm with multi-population is proposed. Some infeasible solutions with better performance are allowed to be saved, and they participate randomly in the evolution. In order to exploit the meaningful infeasible solutions, a functionally partitioned multi-population mechanism is designed to find an optimal solution from all possible directions. Meanwhile, a simplex method for local search is inserted into the evolution process to enhance the search strategy. Simulation results on several benchmark problems indicate that the proposed algorithm tends to converge quickly and effectively to the Pareto frontier with better distribution. Finally, the proposed algorithm is applied to solve a multiobjective optimization model of a grinding and classification process. Based on the technique for order performance by similarity to ideal solution (TOPSIS), the satisfactory solution is obtained by using a decision-making method for multiple attributes.
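
    TOPSIS, which the abstract uses to pick one satisfactory solution from the Pareto set, ranks alternatives by their closeness to an ideal point. The sketch below is the textbook form with vector normalisation; the weights, criteria, and candidate values are hypothetical, and the paper's own decision-making variant may differ.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns); higher score is better.

    benefit[j] is True when larger values of criterion j are preferred.
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    # Weighted, vector-normalised decision matrix.
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_plus = math.sqrt(sum((x - i) ** 2 for x, i in zip(row, ideal)))
        d_minus = math.sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Hypothetical Pareto candidates: (product quality index, throughput), both to be maximised.
candidates = [[0.9, 50.0], [0.8, 70.0], [0.6, 90.0]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[True, True])
best_index = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_index)
```

    The score is 1 for an alternative that coincides with the ideal point and 0 for one that coincides with the anti-ideal point, so the candidate with the highest score is taken as the compromise solution.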

  20. Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification

    Directory of Open Access Journals (Sweden)

    Jie Hu

    2018-02-01

    Many text mining tasks, such as text retrieval, text summarization, and text comparison, depend on the extraction of representative keywords from the main text. Most existing keyword extraction algorithms are based on a discrete bag-of-words type of word representation of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for keyword extraction evaluation based on information gain and cross-validation using Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard benchmark dataset and a homemade patent dataset to evaluate the performance of PKEA. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank, and Rapid Automatic Keyword Extraction (RAKE). The experimental results show that our proposed algorithm provides a promising way to extract keywords from patent texts for patent classification.
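
    For context, TF-IDF, one of the bag-of-words baselines that PKEA is compared against, can be sketched in a few lines. This is the plain textbook weighting (term frequency times the log of inverse document frequency), not PKEA itself; the toy documents and the function name are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """docs: list of token lists. Returns the top_k (term, score) pairs for each document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        scores = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        # Sort by descending score, breaking ties alphabetically for reproducibility.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result.append(ranked[:top_k])
    return result


# Hypothetical tokenised patent snippets, for illustration only.
docs = [
    ["lidar", "sensor", "range", "lidar"],
    ["radar", "sensor", "range"],
    ["gps", "sensor", "position"],
]
keywords = tfidf_keywords(docs, top_k=3)
print(keywords[0])
```

    A term that occurs in every document, such as "sensor" here, receives a score of zero, which is the kind of purely frequency-driven behaviour that distributed representations aim to improve on.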

  1. An up-to-date comparison of state-of-the-art classification algorithms

    KAUST Repository

    Zhang, Chongsheng

    2017-04-05

    Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top-5, alongside GBDT, RF, SVM, and C4.5 but this performance varies widely across all data sets. Unsurprisingly, top accuracy performers have average or slow training time efficiency. DL is the worst performer in terms of accuracy but second fastest in prediction efficiency. SRC shows good accuracy performance but it is the slowest classifier in both training and testing.

  2. An up-to-date comparison of state-of-the-art classification algorithms

    KAUST Repository

    Zhang, Chongsheng; Liu, Changchang; Zhang, Xiangliang; Almpanidis, George

    2017-01-01

    Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top-5, alongside GBDT, RF, SVM, and C4.5 but this performance varies widely across all data sets. Unsurprisingly, top accuracy performers have average or slow training time efficiency. DL is the worst performer in terms of accuracy but second fastest in prediction efficiency. SRC shows good accuracy performance but it is the slowest classifier in both training and testing.

  3. Kernel Clustering with a Differential Harmony Search Algorithm for Scheme Classification

    Directory of Open Access Journals (Sweden)

    Yu Feng

    2017-01-01

    This paper presents a kernel fuzzy clustering method with a novel differential harmony search algorithm for classifying diversion scheduling schemes. First, we employed a self-adaptive solution generation strategy and a differential evolution-based population update strategy to improve the classical harmony search. Second, we applied the differential harmony search algorithm to kernel fuzzy clustering to help the clustering method obtain better solutions. Finally, the combination of kernel fuzzy clustering and differential harmony search is applied to water diversion scheduling in East Lake. The proposed method is compared with other methods. The results show that kernel clustering with the differential harmony search algorithm performs well on water diversion scheduling problems.

  4. The efficiency of the RULES-4 classification learning algorithm in predicting the density of agents

    Directory of Open Access Journals (Sweden)

    Ziad Salem

    2014-12-01

    Learning is the act of obtaining new or modifying existing knowledge, behaviours, skills or preferences. The ability to learn is found in humans, other organisms and some machines. Learning is always based on some sort of observations or data, such as examples, direct experience or instruction. This paper presents a classification algorithm that learns the density of agents in an arena from the measurements of the six proximity sensors of a combined actuator sensor unit (CASU). Rules are presented that were induced by the learning algorithm after training on data sets based on the CASU's sensor data streams, collected during a number of experiments with "Bristlebots" (agents) in the arena (environment). It was found that a set of rules generated by the learning algorithm is able to predict the number of bristlebots in the arena from the CASU's sensor readings with satisfactory accuracy.

  5. SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES

    Directory of Open Access Journals (Sweden)

    Hugo Leonardo Pereira Rufino

    2016-04-01

    Most classification tools assume that the data distribution is balanced or that misclassification costs are similar across classes. In practice, however, databases with unbalanced classes are commonplace, as in the diagnosis of diseases, where confirmed cases are usually rare compared with the healthy population. Other examples are the detection of fraudulent calls and the detection of system intruders. In these cases, misclassifying a minority-class instance (for instance, diagnosing a person with cancer as healthy) may have more serious consequences than misclassifying a majority-class instance. Therefore, it is important to treat databases in which unbalanced classes occur. This paper presents the SMOTE_Easy algorithm, which can classify data even if there is a high level of imbalance between classes. To demonstrate its efficiency, it was compared with the main algorithms for classification problems involving unbalanced data. The approach was successful in nearly all tested databases.

  6. Classification of Ultrasonic NDE Signals Using the Expectation Maximization (EM) and Least Mean Square (LMS) Algorithms

    International Nuclear Information System (INIS)

    Kim, Dae Won

    2005-01-01

    Ultrasonic inspection methods are widely used for detecting flaws in materials. The signal analysis step plays a crucial part in the data interpretation process. A number of signal processing methods have been proposed to classify ultrasonic flaw signals. One of the more popular methods involves the extraction of an appropriate set of features followed by the use of a neural network to classify the signals in the feature space. This paper describes an alternative approach that uses the least mean square (LMS) method and the expectation maximization (EM) algorithm, together with model-based deconvolution, to classify nondestructive evaluation (NDE) signals from steam generator tubes in a nuclear power plant. The signals due to cracks and deposits are not significantly different, yet they must be discriminated to prevent a major accident such as water contamination or an explosion. A model-based deconvolution is described to facilitate comparison of classification results. The method uses the space alternating generalized expectation maximization (SAGE) algorithm in conjunction with the Newton-Raphson method, which uses the Hessian parameter and results in fast convergence, to estimate the time of flight and the distance between the tube wall and the ultrasonic sensor. Results using these schemes for the classification of ultrasonic signals from cracks and deposits within steam generator tubes are presented and show reasonable performance.

  7. Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

    Science.gov (United States)

    Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

    2017-07-01

    In order to improve the reliability of the spectrum feature extracted by wavelet transform, a method combining wavelet transform (WT) with bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. Besides, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel and rapid non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples of five different kinds of pesticide residues on the surface of lettuce were obtained using Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending), Savitzky-Golay coupled with Standard normalized variable detrending (SG-SNV detrending) were used to preprocess the raw spectra, respectively. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. Moreover, WTC were selected by WT. The results showed that the accuracy of training set, calibration set and the prediction set of the best optimal classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish diagnostic model of different kinds of pesticide residues on lettuce leaves.

  8. Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems.

    Science.gov (United States)

    Geraci, Joseph; Dharsee, Moyez; Nuin, Paulo; Haslehurst, Alexandria; Koti, Madhuri; Feilotter, Harriet E; Evans, Ken

    2014-03-01

    We introduce a novel method for visualizing high dimensional data via a discrete dynamical system. This method provides a 2D representation of the relationship between subjects according to a set of variables without geometric projections, transformed axes or principal components. The algorithm exploits a memory-type mechanism inherent in a certain class of discrete dynamical systems collectively referred to as the chaos game that are closely related to iterative function systems. The goal of the algorithm was to create a human readable representation of high dimensional patient data that was capable of detecting unrevealed subclusters of patients from within anticipated classifications. This provides a mechanism to further pursue a more personalized exploration of pathology when used with medical data. For clustering and classification protocols, the dynamical system portion of the algorithm is designed to come after some feature selection filter and before some model evaluation (e.g. clustering accuracy) protocol. In the version given here, a univariate features selection step is performed (in practice more complex feature selection methods are used), a discrete dynamical system is driven by this reduced set of variables (which results in a set of 2D cluster models), these models are evaluated for their accuracy (according to a user-defined binary classification) and finally a visual representation of the top classification models are returned. Thus, in addition to the visualization component, this methodology can be used for both supervised and unsupervised machine learning as the top performing models are returned in the protocol we describe here. Butterfly, the algorithm we introduce and provide working code for, uses a discrete dynamical system to classify high dimensional data and provide a 2D representation of the relationship between subjects. We report results on three datasets (two in the article; one in the appendix) including a public lung cancer

  9. A COMPARISON OF HAZE REMOVAL ALGORITHMS AND THEIR IMPACTS ON CLASSIFICATION ACCURACY FOR LANDSAT IMAGERY

    Directory of Open Access Journals (Sweden)

    Yang Xiao

    Full Text Available The quality of Landsat images in humid areas is considerably degraded by haze in terms of their spectral response pattern, which limits the possibility of their application in using visible and near-infrared bands. A variety of haze removal algorithms have been proposed to correct these unsatisfactory illumination effects caused by the haze contamination. The purpose of this study was to illustrate the difference between two major algorithms (the improved homomorphic filtering (HF) and the virtual cloud point (VCP)) in their effectiveness in solving spatially varying haze contamination, and to evaluate the impacts of haze removal on land cover classification. A case study exploiting large quantities of Landsat TM images and climates (clear and haze) in the most humid areas in China proved that these haze removal algorithms both perform well in processing Landsat images contaminated by haze. The outcome of the application of VCP appears to be more similar to the reference images compared to HF. Moreover, the Landsat image with VCP haze removal can improve the classification accuracy effectively in comparison to that without haze removal, especially in the cloudy contaminated area

  10. Unraveling cognitive traits using the Morris water maze unbiased strategy classification (MUST-C) algorithm.

    Science.gov (United States)

    Illouz, Tomer; Madar, Ravit; Louzon, Yoram; Griffioen, Kathleen J; Okun, Eitan

    2016-02-01

    The assessment of spatial cognitive learning in rodents is a central approach in neuroscience, as it enables one to assess and quantify the effects of treatments and genetic manipulations from a broad perspective. Although the Morris water maze (MWM) is a well-validated paradigm for testing spatial learning abilities, manual categorization of performance in the MWM into behavioral strategies is subject to individual interpretation, and thus to biases. Here we offer a support vector machine (SVM) - based, automated, MWM unbiased strategy classification (MUST-C) algorithm, as well as a cognitive score scale. This model was examined and validated by analyzing data obtained from five MWM experiments with changing platform sizes, revealing a limitation in the spatial capacity of the hippocampus. We have further employed this algorithm to extract novel mechanistic insights on the impact of members of the Toll-like receptor pathway on cognitive spatial learning and memory. The MUST-C algorithm can greatly benefit MWM users as it provides a standardized method of strategy classification as well as a cognitive scoring scale, which cannot be derived from typical analysis of MWM data. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  11. Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results.

    Science.gov (United States)

    Otero, Fernando E B; Freitas, Alex A

    2016-01-01

    Most ant colony optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create a rule in a one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-Miner[Formula: see text] algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules), i.e., the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-Miner[Formula: see text] algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines, and the cAnt-Miner[Formula: see text] producing ordered rules are also presented.

  12. Multi-objective evolutionary algorithms for fuzzy classification in survival prediction.

    Science.gov (United States)

    Jiménez, Fernando; Sánchez, Gracia; Juárez, José M

    2014-03-01

    This paper presents a novel rule-based fuzzy classification methodology for survival/mortality prediction in severely burnt patients. Due to the ethical aspects involved in this medical scenario, physicians tend not to accept a computer-based evaluation unless they understand why and how such a recommendation is given. Therefore, any fuzzy classifier model must be both accurate and interpretable. The proposed methodology is a three-step process: (1) multi-objective constrained optimization of a patient's data set, using Pareto-based elitist multi-objective evolutionary algorithms to maximize accuracy and minimize the complexity (number of rules) of classifiers, subject to interpretability constraints; this step produces a set of alternative (Pareto) classifiers; (2) linguistic labeling, which assigns a linguistic label to each fuzzy set of the classifiers; this step is essential to the interpretability of the classifiers; (3) decision making, whereby a classifier is chosen, if it is satisfactory, according to the preferences of the decision maker. If no classifier is satisfactory for the decision maker, the process starts again in step (1) with a different input parameter set. The performance of three multi-objective evolutionary algorithms, niched pre-selection multi-objective algorithm, elitist Pareto-based multi-objective evolutionary algorithm for diversity reinforcement (ENORA) and the non-dominated sorting genetic algorithm (NSGA-II), was tested using a patient's data set from an intensive care burn unit and a standard machine learning data set from a standard machine learning repository. The results are compared using the hypervolume multi-objective metric. Besides, the results have been compared with other non-evolutionary techniques and validated with a multi-objective cross-validation technique. Our proposal improves the classification rate obtained by other non-evolutionary techniques (decision trees, artificial neural networks, Naive Bayes, and case

  13. Study of Image Analysis Algorithms for Segmentation, Feature Extraction and Classification of Cells

    Directory of Open Access Journals (Sweden)

    Margarita Gamarra

    2017-08-01

    Full Text Available Recent advances in microscopy and improvements in image processing algorithms have allowed the development of computer-assisted analytical approaches in cell identification. Several applications could be mentioned in this field: cellular phenotype identification, disease detection and treatment, identifying virus entry in cells and virus classification; these applications could help to complement the opinion of medical experts. Although many surveys have been presented in medical image analysis, they focus mainly on tissues and organs, and none of the surveys about cell images consider an analysis following the stages of typical image processing: segmentation, feature extraction and classification. The goal of this study is to provide comprehensive and critical analyses of the trends in each stage of cell image processing. In this paper, we present a literature survey about cell identification using different image processing techniques.

  14. Algorithms and data structures for automated change detection and classification of sidescan sonar imagery

    Science.gov (United States)

    Gendron, Marlin Lee

    During Mine Warfare (MIW) operations, MIW analysts perform change detection by visually comparing historical sidescan sonar imagery (SSI) collected by a sidescan sonar with recently collected SSI in an attempt to identify objects (which might be explosive mines) placed at sea since the last time the area was surveyed. This dissertation presents a data structure and three algorithms, developed by the author, that are part of an automated change detection and classification (ACDC) system. MIW analysts at the Naval Oceanographic Office are currently using ACDC to reduce the time needed to perform change detection. The introductory chapter gives background information on change detection and ACDC, and describes how SSI is produced from raw sonar data. Chapter 2 presents the author's Geospatial Bitmap (GB) data structure, which is capable of storing information geographically and is utilized by the three algorithms. This chapter shows that a GB data structure used in a polygon-smoothing algorithm ran between 1.3 and 48.4 times faster than a sparse matrix data structure. Chapter 3 describes the GB clustering algorithm, which is the author's repeatable, order-independent method for clustering. Results from tests performed in this chapter show that the time to cluster a set of points is not affected by the distribution or the order of the points. In Chapter 4, the author presents his real-time computer-aided detection (CAD) algorithm that automatically detects mine-like objects on the seafloor in SSI. The author ran his GB-based CAD algorithm on real SSI data, and results of these tests indicate that his real-time CAD algorithm performs comparably to or better than other non-real-time CAD algorithms. The author presents his computer-aided search (CAS) algorithm in Chapter 5. CAS helps MIW analysts locate mine-like features that are geospatially close to previously detected features. A comparison between the CAS and a great circle distance algorithm shows that the

  15. Classification and authentication of unknown water samples using machine learning algorithms.

    Science.gov (United States)

    Kundu, Palash K; Panchariya, P C; Kundu, Madhusree

    2011-07-01

    This paper proposes a machine-learning approach to the classification and authentication of real-life water samples. The proposed techniques use experimental measurements from a pulse voltammetry method based on an electronic tongue (E-tongue) instrumentation system with silver and platinum electrodes. An E-tongue includes arrays of solid-state ion sensors, transducers (possibly of different types), data collectors and data analysis tools, all oriented to the classification of liquid samples and the authentication of unknown liquid samples. The time series signal and the corresponding raw data represent the measurement from a multi-sensor system. The E-tongue system, implemented in a laboratory environment for six different ISI (Bureau of Indian Standards) certified water samples (Aquafina, Bisleri, Kingfisher, Oasis, Dolphin, and McDowell), was the data source for developing two types of machine learning algorithms: classification and regression. A water data set consisting of six sample classes with 4402 features was considered. A PCA (principal component analysis) based classification and authentication tool was developed in this study as the machine learning component of the E-tongue system. A partial least squares (PLS) based classifier, dedicated to authenticating a specific category of water sample, was also developed as an integral part of the E-tongue instrumentation system. The developed PCA and PLS based E-tongue system achieved encouraging overall authentication accuracy for the aforesaid categories of water samples. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.

  16. A real-time classification algorithm for EEG-based BCI driven by self-induced emotions.

    Science.gov (United States)

    Iacoviello, Daniela; Petracca, Andrea; Spezialetti, Matteo; Placidi, Giuseppe

    2015-12-01

    The aim of this paper is to provide an efficient, parametric, general, and completely automatic real-time classification method for electroencephalography (EEG) signals obtained from self-induced emotions. The particular characteristics of the considered low-amplitude signals (a self-induced emotion produces a signal whose amplitude is about 15% of a really experienced emotion) require exploring and adapting strategies like the Wavelet Transform, the Principal Component Analysis (PCA) and the Support Vector Machine (SVM) for signal processing, analysis and classification. Moreover, the method is thought to be used in a multi-emotions based Brain Computer Interface (BCI) and, for this reason, an ad hoc shrewdness is assumed. The peculiarity of the brain activation requires ad hoc signal processing by wavelet decomposition, and the definition of a set of features for signal characterization in order to discriminate different self-induced emotions. The proposed method is a two-stage algorithm, completely parameterized, aiming at a multi-class classification, and may be considered in the framework of machine learning. The first stage, the calibration, is off-line and is devoted to signal processing, the determination of the features and the training of a classifier. The second stage, the real-time one, is the test on new data. The PCA theory is applied to avoid redundancy in the set of features whereas the classification of the selected features, and therefore of the signals, is obtained by the SVM. Some experimental tests have been conducted on EEG signals proposing a binary BCI, based on the self-induced disgust produced by remembering an unpleasant odor. Since in literature it has been shown that this emotion mainly involves the right hemisphere and in particular the T8 channel, the classification procedure is tested by using just T8, though the average accuracy is calculated and reported also for the whole set of the measured channels. The obtained

  17. Classification of large-sized hyperspectral imagery using fast machine learning algorithms

    Science.gov (United States)

    Xia, Junshi; Yokoya, Naoto; Iwasaki, Akira

    2017-07-01

    We present a framework of fast machine learning algorithms in the context of large-sized hyperspectral images classification from the theoretical to a practical viewpoint. In particular, we assess the performance of random forest (RF), rotation forest (RoF), and extreme learning machine (ELM) and the ensembles of RF and ELM. These classifiers are applied to two large-sized hyperspectral images and compared to the support vector machines. To give the quantitative analysis, we pay attention to comparing these methods when working with high input dimensions and a limited/sufficient training set. Moreover, other important issues such as the computational cost and robustness against the noise are also discussed.

  18. Comparison of Different Classification Algorithms for the Detection of User's Interaction with Windows in Office Buildings

    DEFF Research Database (Denmark)

    Markovic, Romana; Wolf, Sebastian; Cao, Jun

    2017-01-01

    Occupant behavior in terms of interactions with windows and heating systems is seen as one of the main sources of discrepancy between predicted and measured heating, ventilation and air conditioning (HVAC) building energy consumption. Thus, this work analyzes the performance of several...... classification algorithms for detecting occupant's interactions with windows, while taking the imbalanced properties of the available data set into account. The tested methods include support vector machines (SVM), random forests, and their combination with dynamic Bayesian networks (DBN). The results will show...

  19. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameter optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, Specificity, Precision, and ROC curve, and so forth, are adopted to appraise the performance of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkably higher efficiency compared with traditional SVM algorithms.
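
    The abstract does not spell out how the Multilevel Grid Search works. One common reading, assumed here and not necessarily the authors' procedure, is a coarse-to-fine search: evaluate a coarse grid over (log2 C, log2 gamma), re-center on the best point, shrink the window, and repeat. In the sketch below, `score` is a hypothetical callable (for example, cross-validated SVM accuracy) and `toy_score` is a stand-in objective used only so the example runs without a dataset.

```python
import itertools
import math


def multilevel_grid_search(score, center=(0.0, 0.0), half_width=8.0,
                           points=5, levels=3):
    """Coarse-to-fine search over (log2 C, log2 gamma).

    `score` is a hypothetical callable returning a value to maximize,
    e.g. the cross-validated accuracy of an SVM with those parameters.
    """
    best = None
    for _ in range(levels):
        # Evenly spaced offsets in [-half_width, +half_width].
        step = 2.0 * half_width / (points - 1)
        offsets = [-half_width + i * step for i in range(points)]
        for dc, dg in itertools.product(offsets, offsets):
            log_c, log_g = center[0] + dc, center[1] + dg
            value = score(2.0 ** log_c, 2.0 ** log_g)
            if best is None or value > best[0]:
                best = (value, log_c, log_g)
        # Re-center on the best point so far and shrink the window
        # to one step of the grid that was just evaluated.
        center = (best[1], best[2])
        half_width = step
    return {"score": best[0], "C": 2.0 ** best[1], "gamma": 2.0 ** best[2]}


def toy_score(c, gamma):
    # Stand-in objective with a single peak at C = 4, gamma = 0.25.
    return -((math.log2(c) - 2.0) ** 2 + (math.log2(gamma) + 2.0) ** 2)


result = multilevel_grid_search(toy_score)
print(result["C"], result["gamma"])
```

    Compared with a single fine grid over the same range, this evaluates far fewer parameter pairs, at the cost of possibly missing a narrow optimum that the coarse level does not land near.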

  20. Application of multiple signal classification algorithm to frequency estimation in coherent dual-frequency lidar

    Science.gov (United States)

    Li, Ruixiao; Li, Kun; Zhao, Changming

    2018-01-01

    Coherent dual-frequency Lidar (CDFL) is a new development of Lidar which dramatically enhances the ability to decrease the influence of atmospheric interference by using a dual-frequency laser to measure the range and velocity with high precision. Based on the nature of CDFL signals, we propose to apply the multiple signal classification (MUSIC) algorithm in place of the fast Fourier transform (FFT) to estimate the phase differences in dual-frequency Lidar. In the presence of Gaussian white noise, the simulation results show that the signal peaks are more evident when using the MUSIC algorithm instead of the FFT under low signal-to-noise ratio (SNR) conditions, which helps to improve the precision of range and velocity detection, especially for long-distance measurement systems.
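
    For orientation, the textbook form of the MUSIC frequency estimator is sketched below. The signal model and the symbols (p sinusoids, window length M, frequency f in cycles per sample, noise-subspace matrix E_n) are standard notation assumed here; the abstract does not give the authors' exact formulation for CDFL signals.

```latex
% Assumed signal model: p complex sinusoids in white noise w(n)
x(n) = \sum_{k=1}^{p} A_k e^{j 2\pi f_k n} + w(n)

% M x M autocorrelation matrix split into signal and noise subspaces
R = E\{\mathbf{x}\mathbf{x}^{H}\} = E_s \Lambda_s E_s^{H} + E_n \Lambda_n E_n^{H}

% Steering vector and MUSIC pseudospectrum
\mathbf{a}(f) = [\,1,\ e^{j 2\pi f},\ \dots,\ e^{j 2\pi f (M-1)}\,]^{T}
P_{\mathrm{MUSIC}}(f) = \frac{1}{\mathbf{a}^{H}(f)\, E_n E_n^{H}\, \mathbf{a}(f)}
```

    Here E_n holds the eigenvectors of R belonging to the M - p smallest eigenvalues. Because a(f_k) is orthogonal to the noise subspace at the true frequencies, the denominator approaches zero there and the pseudospectrum shows sharp peaks, which is consistent with the more evident peaks the abstract reports at low SNR.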

  1. A Fast Algorithm of Convex Hull Vertices Selection for Online Classification.

    Science.gov (United States)

    Ding, Shuguang; Nie, Xiangli; Qiao, Hong; Zhang, Bo

    2018-04-01

    Reducing samples through convex hull vertices selection (CHVS) within each class is an important and effective method for online classification problems, since the classifier can be trained rapidly with the selected samples. However, the process of CHVS is NP-hard. In this paper, we propose a fast algorithm to select the convex hull vertices, based on the convex hull decomposition and the property of projection. In the proposed algorithm, the quadratic minimization problem of computing the distance between a point and a convex hull is converted into a linear equation problem with a low computational complexity. When the data dimension is high, an approximate, instead of exact, convex hull is allowed to be selected by setting an appropriate termination condition in order to delete more nonimportant samples. In addition, the impact of outliers is also considered, and the proposed algorithm is improved by deleting the outliers in the initial procedure. Furthermore, a dimension convention technique via the kernel trick is used to deal with nonlinearly separable problems. An upper bound is theoretically proved for the difference between the support vector machines based on the approximate convex hull vertices selected and all the training samples. Experimental results on both synthetic and real data sets show the effectiveness and validity of the proposed algorithm.
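
    To make the idea of keeping only convex hull vertices concrete, here is a minimal planar sketch using Andrew's monotone chain. This is a swapped-in, low-dimensional illustration, not the fast algorithm proposed in the paper, which targets higher dimensions through hull decomposition and projection. The sample points are invented for the example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_vertices(points):
    """Return the hull vertices of 2D points in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Invented samples of one class: four corners plus three interior points.
samples = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0.5), (0.5, 1.5)]
kept = convex_hull_vertices(samples)
print(kept)
```

    Only the four corner samples survive; the interior points are discarded, which is the sample reduction that lets a classifier be retrained quickly in the online setting.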

  2. A Constructive Data Classification Version of the Particle Swarm Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Alexandre Szabo

    2013-01-01

    Full Text Available The particle swarm optimization algorithm was originally introduced to solve continuous parameter optimization problems. It was soon modified to solve other types of optimization tasks and also to be applied to data analysis. In the latter case, however, there are few works in the literature that deal with the problem of dynamically building the architecture of the system. This paper introduces new particle swarm algorithms specifically designed to solve classification problems. The first proposal, named Particle Swarm Classifier (PSClass), is a derivation of a particle swarm clustering algorithm and its architecture, as in most classifiers, is pre-defined. The second proposal, named Constructive Particle Swarm Classifier (cPSClass), uses ideas from the immune system to automatically build the swarm. A sensitivity analysis of the growing procedure of cPSClass and an investigation into a proposed pruning procedure for this algorithm are performed. The proposals were applied to a wide range of databases from the literature and the results show that they are competitive in relation to other approaches, with the advantage of having a dynamically constructed architecture.

  3. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  4. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  5. Slow Learner Prediction Using Multi-Variate Naïve Bayes Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Shiwani Rana

    2017-01-01

    Full Text Available Machine Learning is a field of computer science that learns from data by studying algorithms and their constructions. In machine learning, for specific inputs, algorithms help to make predictions. Classification is a supervised learning approach, which maps a data item into predefined classes. For predicting slow learners in an institute, a modified Naïve Bayes algorithm is implemented. The implementation is carried out using Python. It takes into account a combination of likewise multi-valued attributes. A dataset of the 60 students of BE (Information Technology) Third Semester for the subject of Digital Electronics of University Institute of Engineering and Technology (UIET), Panjab University (PU), Chandigarh, India is taken to carry out the simulations. The analysis is done by choosing the most significant forty-eight attributes. The experimental results have shown that the modified Naïve Bayes model has outperformed the Naïve Bayes Classifier in accuracy but requires significant improvement in terms of elapsed time. By using the Modified Naïve Bayes approach, the accuracy is found to be 71.66% whereas it is calculated as 66.66% using the existing Naïve Bayes model. Further, a comparison is drawn by using the WEKA tool. Here, an accuracy of 58.33% is obtained for Naïve Bayes.
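
    As a point of reference for the baseline the abstract compares against, a minimal sketch of a standard categorical Naïve Bayes classifier with Laplace smoothing follows. It does not reproduce the authors' modification for combined multi-valued attributes, which the abstract does not detail. The attribute names, values and labels are hypothetical toy data, not the UIET dataset.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute index, class)][value] -> count
    value_counts = defaultdict(Counter)
    # values_seen[attribute index] -> set of distinct values
    values_seen = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            score += math.log((count + 1) / (n_label + len(values_seen[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (attendance, assignment marks) -> learner type.
rows = [("low", "poor"), ("low", "good"), ("high", "good"),
        ("high", "good"), ("low", "poor"), ("high", "poor")]
labels = ["slow", "slow", "regular", "regular", "slow", "regular"]
model = train_naive_bayes(rows, labels)
prediction = predict(model, ("low", "poor"))
print(prediction)
```

    Working in log space avoids numerical underflow when many attributes are multiplied together, which matters with the forty-eight attributes mentioned above.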

  6. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine.

    Directory of Open Access Journals (Sweden)

    Fei Gao

    Full Text Available For current computational intelligence techniques, a major challenge is how to learn new concepts in changing environment. Traditional learning schemes could not adequately address this problem due to a lack of dynamic data selection mechanism. In this paper, inspired by human learning process, a novel classification algorithm based on incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of prediction confidence of samples and data distribution in a changing environment, a "soft-start" approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computation complexity is reduced effectively. In addition, for the possible appearance of some new labeled samples in the learning process, a detailed analysis is also carried out. The results show that our algorithm does not rely on the model of sample distribution, has an extremely low rate of introducing wrong semi-labeled samples and can effectively make use of the unlabeled samples to enrich the knowledge system of classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome the concept drift in a changing environment.

  7. SVM-based multimodal classification of activities of daily living in Health Smart Homes: sensors, algorithms, and first experimental results.

    Science.gov (United States)

    Fleury, Anthony; Vacher, Michel; Noury, Norbert

    2010-03-01

    By 2050, about one third of the French population will be over 65. Our laboratory's current research focuses on the monitoring of elderly people at home, to detect a loss of autonomy as early as possible. Our aim is to quantify criteria such as the international activities of daily living (ADL) or the French Autonomie Gerontologie Groupes Iso-Ressources (AGGIR) scales, by automatically classifying the different ADL performed by the subject during the day. A Health Smart Home is used for this. Our Health Smart Home includes, in a real flat, infrared presence sensors (location), door contacts (to control the use of some facilities), a temperature and hygrometry sensor in the bathroom, and microphones (sound classification and speech recognition). A wearable kinematic sensor also reports postural transitions (using pattern recognition) and walk periods (frequency analysis). The data collected from the various sensors are then used to classify each temporal frame into one of the ADL that was previously acquired (seven activities: hygiene, toilet use, eating, resting, sleeping, communication, and dressing/undressing). This is done using support vector machines. We performed a 1-h experiment with 13 young and healthy subjects to determine the models of the different activities, and then we tested the classification algorithm (cross validation) with real data.

  8. An Automated Algorithm to Screen Massive Training Samples for a Global Impervious Surface Classification

    Science.gov (United States)

    Tan, Bin; Brown de Colstoun, Eric; Wolfe, Robert E.; Tilton, James C.; Huang, Chengquan; Smith, Sarah E.

    2012-01-01

    An algorithm is developed to automatically screen the outliers from massive training samples for the Global Land Survey - Imperviousness Mapping Project (GLS-IMP). GLS-IMP is to produce a global 30 m spatial resolution impervious cover data set for years 2000 and 2010 based on the Landsat Global Land Survey (GLS) data set. This unprecedented high resolution impervious cover data set is not only significant to urbanization studies but also desired by global carbon, hydrology, and energy balance research. A supervised classification method, regression tree, is applied in this project. A set of accurate training samples is the key to supervised classification. Here we developed the global scale training samples from fine resolution (about 1 m) satellite data (Quickbird and Worldview2), and then aggregated the fine resolution impervious cover map to 30 m resolution. In order to improve the classification accuracy, the training samples should be screened before being used to train the regression tree. It is impossible to manually screen 30 m resolution training samples collected globally. For example, in Europe alone, there are 174 training sites. The size of the sites ranges from 4.5 km by 4.5 km to 8.1 km by 3.6 km. The number of training samples is over six million. Therefore, we develop this automated statistics-based algorithm to screen the training samples at two levels: site and scene. At the site level, all the training samples are divided into 10 groups according to the percentage of impervious surface within a sample pixel. The samples falling in each 10% interval form one group. For each group, both univariate and multivariate outliers are detected and removed. Then the screening process escalates to the scene level. A similar screening process but with a looser threshold is applied at the scene level, considering the possible variance due to site differences. We do not perform the screening process across the scenes because the scenes might vary due to

  9. Development of an algorithm for heartbeats detection and classification in Holter records based on temporal and morphological features

    International Nuclear Information System (INIS)

    García, A; Romano, H; Laciar, E; Correa, R

    2011-01-01

    In this work a detection and classification algorithm for heartbeat analysis in Holter records was developed. First, a QRS complex detector was implemented and the temporal and morphological characteristics of the complexes were extracted. A vector was built with these features; this vector is the input of the classification module, based on discriminant analysis. The beats were classified in three groups: Premature Ventricular Contraction beat (PVC), Atrial Premature Contraction beat (APC) and Normal Beat (NB). These beat categories represent the most important groups of commercial Holter systems. The developed algorithms were evaluated in 76 ECG records of two validated open-access databases: 'arrhythmias MIT BIH database' and 'MIT BIH supraventricular arrhythmias database'. A total of 166343 beats were detected and analyzed, where the QRS detection algorithm provides a sensitivity of 99.69% and a positive predictive value of 99.84%. The classification stage gives sensitivities of 97.17% for NB, 97.67% for PVC and 92.78% for APC.
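
    The two detection metrics quoted above follow the usual definitions: sensitivity is TP / (TP + FN) and positive predictive value is TP / (TP + FP). A minimal sketch of the arithmetic is given below; the counts are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def sensitivity(tp, fn):
    """Fraction of true beats that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Fraction of detections that were true beats: TP / (TP + FP)."""
    return tp / (tp + fp)


# Hypothetical counts for illustration; they are not taken from the paper.
tp, fn, fp = 9950, 50, 25
se = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
print(f"Se = {se:.2%}, PPV = {ppv:.2%}")
```

    Reporting both values matters for a QRS detector: sensitivity alone would not penalize false detections, and positive predictive value alone would not penalize missed beats.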

  10. Support vector machines and evolutionary algorithms for classification single or together?

    CERN Document Server

    Stoean, Catalin

    2014-01-01

    When discussing classification, support vector machines are known to be a capable and efficient technique to learn and predict with high accuracy within a quick time frame. Yet, their black box means to do so make the practical users quite circumspect about relying on it, without much understanding of the how and why of its predictions. The question raised in this book is how can this ‘masked hero’ be made more comprehensible and friendly to the public: provide a surrogate model for its hidden optimization engine, replace the method completely or appoint a more friendly approach to tag along and offer the much desired explanations? Evolutionary algorithms can do all these and this book presents such possibilities of achieving high accuracy, comprehensibility, reasonable runtime as well as unconstrained performance.

  11. Traumatic subarachnoid pleural fistula in children: case report, algorithm and classification proposal

    Directory of Open Access Journals (Sweden)

    Moscote-Salazar Luis Rafael

    2016-06-01

    Full Text Available Subarachnoid pleural fistulas are rare. They have been described as complications of thoracic surgery, penetrating injuries and spinal surgery, among others. We present the case of a 3-year-old female child, who suffer spinal cord trauma secondary to a car accident, developing a posterior subarachnoid pleural fistula. To our knowledge this is the first reported case of a pediatric patient with subarachnoid pleural fistula resulting from closed trauma, requiring intensive multimodal management. We also present a management algorithm and a proposed classification. The diagnosis of this pathology is difficult when not associated with neurological deficit. A high degree of suspicion, multidisciplinary management and timely surgical intervention allow optimal management.

  12. A multiresolution hierarchical classification algorithm for filtering airborne LiDAR data

    Science.gov (United States)

    Chen, Chuanfa; Li, Yanyan; Li, Wei; Dai, Honglei

    2013-08-01

    We present a multiresolution hierarchical classification (MHC) algorithm for differentiating ground from non-ground points in a LiDAR point cloud, based on point residuals from the interpolated raster surface. MHC includes three levels of hierarchy, with the cell resolution and the residual threshold increasing simultaneously from the low to the high level of the hierarchy. At each level, the surface is iteratively interpolated towards the ground using a thin plate spline (TPS) until no more ground points are classified, and the classified ground points are used to update the surface in the next iteration. Fifteen benchmark datasets, provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) commission, were used to compare the performance of MHC with that of 17 other published filtering methods. Results indicated that MHC, with an average total error of 4.11% and an average Cohen’s kappa coefficient of 86.27%, performs better than all other filtering methods.
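
    Cohen's kappa, one of the two scores reported above, measures agreement beyond chance between reference and predicted labels. The sketch below computes it from a confusion matrix; the function name and the example matrix are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def cohens_kappa(confusion: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes (e.g. ground, non-ground) and columns are
    predicted classes.
    """
    n_classes = len(confusion)
    if any(len(row) != n_classes for row in confusion):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: row marginal times column marginal, summed over classes.
    p_expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / total ** 2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical 2x2 matrix: observed agreement 0.7, chance agreement 0.5.
example_kappa = cohens_kappa([[20, 5], [10, 15]])
```

    For this made-up matrix the observed agreement is 35/50 and the chance agreement is 1250/2500, so kappa is 0.2/0.5 = 0.4.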

  13. Classification Model for Forest Fire Hotspot Occurrences Prediction Using ANFIS Algorithm

    Science.gov (United States)

    Wijayanto, A. K.; Sani, O.; Kartika, N. D.; Herdiyeni, Y.

    2017-01-01

    This study proposed the application of a data mining technique, namely the Adaptive Neuro-Fuzzy Inference System (ANFIS), to forest fire hotspot data in order to develop classification models for hotspot occurrence in Central Kalimantan. A hotspot is a point that is indicated as the location of a fire. In this study, hotspot distribution is categorized as true alarm and false alarm. ANFIS is a soft computing method in which a given input-output data set is expressed in a fuzzy inference system (FIS). The FIS implements a nonlinear mapping from its input space to the output space. The method of this study classified hotspots as target objects by correlating spatial attribute data, using three folds in the ANFIS algorithm to obtain the best model. The best result, obtained from the 3rd fold, provided a low training error (error = 0.0093676) and also a low testing error (error = 0.0093676). Distance to road is the most determining factor influencing the probability of true and false alarms, where the level of human activities in this attribute is higher. This classification model can be used to develop an early warning system for forest fires.

  14. Application of Classification Algorithm of Machine Learning and Buffer Analysis in Torism Regional Planning

    Science.gov (United States)

    Zhang, T. H.; Ji, H. W.; Hu, Y.; Ye, Q.; Lin, Y.

    2018-04-01

    Remote Sensing (RS) and Geography Information System (GIS) technologies are widely used in ecological analysis and regional planning. With the advantages of large-scale monitoring, combination of point and area, multiple time-phases and repeated observation, they are suitable for monitoring and analyzing environmental information over a large range. In this study, the support vector machine (SVM) classification algorithm is used to monitor land use and land cover change (LUCC), and then to perform a quantitative ecological evaluation of the Chaohu Lake tourism area. The automatic classification and the quantitative spatial-temporal analysis for the Chaohu Lake basin are realized by the analysis of multi-temporal and multispectral satellite images, DEM data and slope information data. Furthermore, ecological buffer zone analysis is also studied to set up the buffer width for each catchment area surrounding Chaohu Lake. The results of LUCC monitoring from 1992 to 2015 show obvious effects of human activities. Since the construction of the Chaohu Lake basin is in the crucial stage of the rapid development of urbanization, the application of RS and GIS techniques can effectively provide a scientific basis for land use planning, ecological management, environmental protection and tourism resources development in the Chaohu Lake Basin.

  15. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty is that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  16. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty is that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  17. Automatic Classification of Sub-Techniques in Classical Cross-Country Skiing Using a Machine Learning Algorithm on Micro-Sensor Data

    Directory of Open Access Journals (Sweden)

    Ole Marius Hoel Rindal

    2017-12-01

    Full Text Available The automatic classification of sub-techniques in classical cross-country skiing provides unique possibilities for analyzing the biomechanical aspects of outdoor skiing. This is currently possible due to the miniaturization and flexibility of wearable inertial measurement units (IMUs) that allow researchers to bring the laboratory to the field. In this study, we aimed to optimize the accuracy of the automatic classification of classical cross-country skiing sub-techniques by using two IMUs attached to the skier’s arm and chest together with a machine learning algorithm. The novelty of our approach is the reliable detection of individual cycles using a gyroscope on the skier’s arm, while a neural network machine learning algorithm robustly classifies each cycle to a sub-technique using sensor data from an accelerometer on the chest. In this study, 24 datasets from 10 different participants were separated into the categories training-, validation- and test-data. Overall, we achieved a classification accuracy of 93.9% on the test-data. Furthermore, we illustrate how an accurate classification of sub-techniques can be combined with data from standard sports equipment including position, altitude, speed and heart rate measuring systems. Combining this information has the potential to provide novel insight into physiological and biomechanical aspects valuable to coaches, athletes and researchers.

  18. Multi-step EMG Classification Algorithm for Human-Computer Interaction

    Science.gov (United States)

    Ren, Peng; Barreto, Armando; Adjouadi, Malek

    A three-electrode human-computer interaction system, based on digital processing of the Electromyogram (EMG) signal, is presented. This system can effectively help disabled individuals paralyzed from the neck down to interact with computers or communicate with people through computers using point-and-click graphic interfaces. The three electrodes are placed on the right frontalis, the left temporalis and the right temporalis muscles in the head, respectively. The signal processing algorithm used translates the EMG signals during five kinds of facial movements (left jaw clenching, right jaw clenching, eyebrows up, eyebrows down, simultaneous left & right jaw clenching) into five corresponding types of cursor movements (left, right, up, down and left-click), to provide basic mouse control. The classification strategy is based on three principles: the EMG energy of one channel is typically larger than the others during one specific muscle contraction; the spectral characteristics of the EMG signals produced by the frontalis and temporalis muscles during different movements are different; the EMG signals from adjacent channels typically have correlated energy profiles. The algorithm is evaluated on 20 pre-recorded EMG signal sets, using Matlab simulations. The results show that this method provides improvements and is more robust than other previous approaches.

  19. A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update

    Science.gov (United States)

    Lotte, F.; Bougrain, L.; Cichocki, A.; Clerc, M.; Congedo, M.; Rakotomamonjy, A.; Yger, F.

    2018-06-01

    Objective. Most current electroencephalography (EEG)-based brain–computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types that are used in this field, as described in our 2007 review paper. Now, approximately ten years after this review publication, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. Approach. We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs, what were the outcomes, and to identify their pros and cons. Main results. We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful although the benefits of transfer learning remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performances on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training samples settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. Significance. This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, presents the principles of these methods and guidelines on when and how to use them. 
It also identifies a number of challenges

  20. A review of classification algorithms for EEG-based brain-computer interfaces: a 10 year update.

    Science.gov (United States)

    Lotte, F; Bougrain, L; Cichocki, A; Clerc, M; Congedo, M; Rakotomamonjy, A; Yger, F

    2018-06-01

    Most current electroencephalography (EEG)-based brain-computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types that are used in this field, as described in our 2007 review paper. Now, approximately ten years after this review publication, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs, what were the outcomes, and to identify their pros and cons. We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful although the benefits of transfer learning remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performances on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training samples settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, presents the principles of these methods and guidelines on when and how to use them. It also identifies a number of challenges to further advance EEG classification in BCI.

  1. Classification of Atrial Septal Defect and Ventricular Septal Defect with Documented Hemodynamic Parameters via Cardiac Catheterization by Genetic Algorithms and Multi-Layered Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Mustafa Yıldız

    2012-08-01

    Full Text Available Introduction: We aimed to develop a classification method to discriminate ventricular septal defect and atrial septal defect by using several hemodynamic parameters. Patients and Methods: Forty-three patients (30 atrial septal defect, 13 ventricular septal defect; 26 female, 17 male) with documented hemodynamic parameters via cardiac catheterization were included in the study. Parameters such as blood pressure values of different areas, gender, age and Qp/Qs ratios were used for classification. The parameters used in classification were determined by the divergence analysis method. Those parameters are: (i) pulmonary artery diastolic pressure, (ii) Qp/Qs ratio, (iii) right atrium pressure, (iv) age, (v) pulmonary artery systolic pressure, (vi) left ventricular systolic pressure, (vii) aorta mean pressure, (viii) left ventricular diastolic pressure, (ix) aorta diastolic pressure, (x) aorta systolic pressure. These parameters, obtained from our study population, were fed to a multi-layered artificial neural network, and the network was trained by a genetic algorithm. Results: The training cluster consists of 14 factors (7 atrial septal defect and 7 ventricular septal defect). The overall success ratio is 79.2%, and with proper training of the artificial neural network this ratio increases up to 89%. Conclusion: Parameters of the artificial neural network that must be determined by the investigator in classical methods can easily be determined with the help of genetic algorithms. During the training of the artificial neural network by genetic algorithms, both the topology and the factors of the network can be determined. During the test stage, elements not included in the training cluster are assigned to the test cluster. As a result of this study, we observed that a multi-layered artificial neural network can be trained properly and that the neural network is a successful method for the intended classification.

  2. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    CSIR Research Space (South Africa)

    Adelabu, S

    2013-11-01

    Full Text Available in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine learning algorithms with limited training samples. We performed...

  3. A Novel User Classification Method for Femtocell Network by Using Affinity Propagation Algorithm and Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Afaz Uddin Ahmed

    2014-01-01

    Full Text Available An artificial neural network (ANN) and affinity propagation (AP) algorithm based user categorization technique is presented. The proposed algorithm is designed for a closed access femtocell network. ANN is used for the user classification process and the AP algorithm is used to optimize the ANN training process. AP selects the best possible training samples for a faster ANN training cycle. The users are distinguished by using the difference of received signal strength in a multielement femtocell device. A previously developed directive microstrip antenna is used to configure the femtocell device. Simulation results show that, for a particular house pattern, the categorization technique without the AP algorithm takes 5 indoor users and 10 outdoor users to attain an error-free operation. When the AP algorithm is integrated with ANN, the system takes 60% fewer training samples, reducing the training time by up to 50%. This procedure makes the femtocell more effective for closed access operation.

  4. A Novel User Classification Method for Femtocell Network by Using Affinity Propagation Algorithm and Artificial Neural Network

    Science.gov (United States)

    Ahmed, Afaz Uddin; Tariqul Islam, Mohammad; Ismail, Mahamod; Kibria, Salehin; Arshad, Haslina

    2014-01-01

    An artificial neural network (ANN) and affinity propagation (AP) algorithm based user categorization technique is presented. The proposed algorithm is designed for a closed access femtocell network. ANN is used for the user classification process and the AP algorithm is used to optimize the ANN training process. AP selects the best possible training samples for a faster ANN training cycle. The users are distinguished by using the difference of received signal strength in a multielement femtocell device. A previously developed directive microstrip antenna is used to configure the femtocell device. Simulation results show that, for a particular house pattern, the categorization technique without the AP algorithm takes 5 indoor users and 10 outdoor users to attain an error-free operation. When the AP algorithm is integrated with ANN, the system takes 60% fewer training samples, reducing the training time by up to 50%. This procedure makes the femtocell more effective for closed access operation. PMID:25133214

  5. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification

    Directory of Open Access Journals (Sweden)

    Mustafa Serter Uzer

    2013-01-01

    Full Text Available This paper offers a hybrid approach that uses the artificial bee colony (ABC) algorithm for feature selection and support vector machines (SVM) for classification. The purpose of this paper is to test the effect of eliminating the unimportant and obsolete features of the datasets on the success of the classification, using the SVM classifier. The approach is developed for the diagnosis of liver diseases and diabetes, which are commonly observed and reduce the quality of life. For the diagnosis of these diseases, the hepatitis, liver disorders and diabetes datasets from the UCI database were used, and the proposed system reached classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. For these datasets, the classification accuracies were obtained with the 10-fold cross-validation method. The results show that the performance of the method is highly successful compared to other results attained and seems very promising for pattern recognition applications.
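
    The 10-fold cross-validation mentioned above can be sketched in a few lines. This is a generic illustration rather than the authors' code: the helper names, the unshuffled contiguous folds, and the `train_and_predict` callback are all assumptions made for the example.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    start = 0
    for fold in range(k):
        # The first (n_samples % k) folds get one extra sample.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validated_accuracy(
    train_and_predict: Callable[[Sequence, Sequence, Sequence], Sequence],
    features: Sequence,
    labels: Sequence,
    k: int = 10,
) -> float:
    """Mean accuracy over k folds.

    `train_and_predict(train_x, train_y, test_x)` is a hypothetical callback
    that fits any classifier on the training fold and returns predictions.
    """
    scores = []
    for train, test in k_fold_indices(len(features), k):
        predictions = train_and_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

    Each sample is used for testing exactly once and for training in the other nine folds, so the reported accuracy is an average over ten held-out evaluations. In practice the samples are usually shuffled (and often stratified by class) before splitting.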

  6. Automatic classification of endogenous seismic sources within a landslide body using random forest algorithm

    Science.gov (United States)

    Provost, Floriane; Hibert, Clément; Malet, Jean-Philippe; Stumpf, André; Doubre, Cécile

    2016-04-01

    Different studies have shown the presence of microseismic activity in soft-rock landslides. The seismic signals exhibit significantly different features in the time and frequency domains which allow their classification and interpretation. Most of the classes could be associated with different mechanisms of deformation occurring within and at the surface (e.g. rockfall, slide-quake, fissure opening, fluid circulation). However, some signals remain not fully understood and some classes contain too few examples to allow any interpretation. To move toward a more complete interpretation of the links between the dynamics of soft-rock landslides and the physical processes controlling their behaviour, a complete catalog of the endogenous seismicity is needed. We propose a multi-class detection method based on the random forests algorithm to automatically classify the source of seismic signals. Random forests is a supervised machine learning technique that is based on the computation of a large number of decision trees. The multiple decision trees are constructed from training sets including each of the target classes. In the case of seismic signals, these attributes may encompass spectral features but also waveform characteristics, multi-station observations and other relevant information. The Random Forest classifier is used because it provides state-of-the-art performance when compared with other machine learning techniques (e.g. SVM, Neural Networks) and requires no fine tuning. Furthermore it is relatively fast, robust, easy to parallelize, and inherently suitable for multi-class problems. In this work, we present the first results of the classification method applied to the seismicity recorded at the Super-Sauze landslide between 2013 and 2015. We selected a dozen seismic signal features that precisely characterize its spectral content (e.g. 
central frequency, spectrum width, energy in several frequency bands, spectrogram shape, spectrum local and global maxima

  7. Classification of caesarean section and normal vaginal deliveries using foetal heart rate signals and advanced machine learning algorithms.

    Science.gov (United States)

    Fergus, Paul; Hussain, Abir; Al-Jumeily, Dhiya; Huang, De-Shuang; Bouguila, Nizar

    2017-07-06

    Visual inspection of cardiotocography traces by obstetricians and midwives is the gold standard for monitoring the wellbeing of the foetus during antenatal care. However, inter- and intra-observer variability is high with only a 30% positive predictive value for the classification of pathological outcomes. This has a significant negative impact on the perinatal foetus and often results in cardio-pulmonary arrest, brain and vital organ damage, cerebral palsy, hearing, visual and cognitive defects and in severe cases, death. This paper shows that using machine learning and foetal heart rate signals provides direct information about the foetal state and helps to filter the subjective opinions of medical practitioners when used as a decision support tool. The primary aim is to provide a proof-of-concept that demonstrates how machine learning can be used to objectively determine when medical intervention, such as caesarean section, is required and help avoid preventable perinatal deaths. This is evidenced using an open dataset that comprises 506 controls (normal vaginal deliveries) and 46 cases (caesarean due to pH ≤ 7.20-acidosis, n = 18; pH > 7.20 and pH machine-learning algorithms are trained, and validated, using binary classifier performance measures. The findings show that deep learning classification achieves sensitivity = 94%, specificity = 91%, Area under the curve = 99%, F-score = 100%, and mean square error = 1%. The results demonstrate that machine learning significantly improves the efficiency for the detection of caesarean section and normal vaginal deliveries using foetal heart rate signals compared with obstetrician and midwife predictions and systems reported in previous studies.

  8. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  9. Classification of Noisy Data: An Approach Based on Genetic Algorithms and Voronoi Tessellation

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Knudsen, Torben

    Classification is one of the major constituents of the data-mining toolkit. The well-known methods for classification are built on either the principle of logic or statistical/mathematical reasoning for classification. In this article we propose: (1) a different strategy, which is based on the po...

  10. An evaluation of scanpath-comparison and machine-learning classification algorithms used to study the dynamics of analogy making.

    Science.gov (United States)

    French, Robert M; Glady, Yannick; Thibaut, Jean-Pierre

    2017-08-01

    In recent years, eyetracking has begun to be used to study the dynamics of analogy making. Numerous scanpath-comparison algorithms and machine-learning techniques are available that can be applied to the raw eyetracking data. We show how scanpath-comparison algorithms, combined with multidimensional scaling and a classification algorithm, can be used to resolve an outstanding question in analogy making, namely, whether or not children's and adults' strategies in solving analogy problems are different. (They are.) We show which of these scanpath-comparison algorithms is best suited to the kinds of analogy problems that have formed the basis of much analogy-making research over the years. Furthermore, we use machine-learning classification algorithms to examine the item-to-item saccade vectors making up these scanpaths. We show which of these algorithms best predicts, from very early on in a trial, on the basis of the frequency of various item-to-item saccades, whether a child or an adult is doing the problem. This type of analysis can also be used to predict, on the basis of the item-to-item saccade dynamics in the first third of a trial, whether or not a problem will be solved correctly.

  11. Performance of fusion algorithms for computer-aided detection and classification of mines in very shallow water obtained from testing in navy Fleet Battle Exercise-Hotel 2000

    Science.gov (United States)

    Ciany, Charles M.; Zurawski, William; Kerfoot, Ian

    2001-10-01

    The performance of Computer Aided Detection/Computer Aided Classification (CAD/CAC) Fusion algorithms on side-scan sonar images was evaluated using data taken at the Navy's Fleet Battle Exercise-Hotel held in Panama City, Florida, in August 2000. A 2-of-3 binary fusion algorithm is shown to provide robust performance. The algorithm accepts the classification decisions and associated contact locations from three different CAD/CAC algorithms, clusters the contacts based on Euclidean distance, and then declares a valid target when a clustered contact is declared by at least 2 of the 3 individual algorithms. This simple binary fusion provided a 96 percent probability of correct classification at a false alarm rate of 0.14 false alarms per image per side. The performance represented a 3.8:1 reduction in false alarms over the best performing single CAD/CAC algorithm, with no loss in probability of correct classification.
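
    The 2-of-3 fusion rule described above can be illustrated with a short sketch. The abstract does not specify the clustering procedure, so the code assumes a simple distance-threshold (single-linkage) grouping; the function name `fuse_contacts`, the `radius` parameter, and the use of the cluster centroid as the fused location are likewise assumptions made for illustration.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fuse_contacts(
    contacts_by_algorithm: Sequence[Sequence[Point]],
    radius: float,
    min_votes: int = 2,
) -> List[Point]:
    """Return fused target locations confirmed by at least `min_votes` algorithms.

    contacts_by_algorithm[i] holds the contact locations declared by algorithm i.
    Contacts closer than `radius` (Euclidean distance) end up in one cluster.
    """
    clusters: List[List[Tuple[int, Point]]] = []
    for algorithm, points in enumerate(contacts_by_algorithm):
        for point in points:
            near, far = [], []
            for cluster in clusters:
                touches = any(dist(point, other) <= radius for _, other in cluster)
                (near if touches else far).append(cluster)
            # Merge the new contact with every cluster it touches.
            merged = [(algorithm, point)] + [m for cluster in near for m in cluster]
            clusters = far + [merged]

    targets = []
    for cluster in clusters:
        voters = {algorithm for algorithm, _ in cluster}
        if len(voters) >= min_votes:  # e.g. 2 of the 3 algorithms agree
            xs = [p[0] for _, p in cluster]
            ys = [p[1] for _, p in cluster]
            targets.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return targets


# Hypothetical contacts from three detectors; only the pair near the origin agrees.
example_targets = fuse_contacts(
    [[(0.0, 0.0), (10.0, 10.0)], [(0.5, 0.0)], [(50.0, 50.0)]],
    radius=1.0,
)
```

    Votes are counted per distinct algorithm, so two nearby contacts from the same detector do not confirm each other. Requiring agreement between detectors is what suppresses the false alarms that any single detector produces on its own.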

  12. A Quick Negative Selection Algorithm for One-Class Classification in Big Data Era

    Directory of Open Access Journals (Sweden)

    Fangdong Zhu

    2017-01-01

    Full Text Available The negative selection algorithm (NSA) is an important kind of one-class classification model, but it is limited in the big data era due to its low efficiency. In this paper, we propose a new NSA based on Voronoi diagrams: VorNSA. VorNSA changes the scheme of the detector generation process from the traditional “Random-Discard” model to the “Computing-Designated” model. Furthermore, we present an immune detection process of VorNSA under the Map/Reduce framework (VorNSA/MR) to further reduce the time consumption on massive data in the testing stage. Theoretical analyses show that the time complexity of VorNSA decreases from the exponential level to the logarithmic level. Experiments are performed to compare the proposed technique with other NSAs and one-class classifiers. The results show that the time cost of VorNSA is decreased by 87.5% on average compared with traditional NSAs on the UCI skin dataset.

  13. Multiple Signal Classification Algorithm Based Electric Dipole Source Localization Method in an Underwater Environment

    Directory of Open Access Journals (Sweden)

    Yidong Xu

    2017-10-01

    A novel localization method based on the multiple signal classification (MUSIC) algorithm is proposed for positioning an electric dipole source in a confined underwater environment by using an electric dipole-receiving antenna array. In this method, the boundary element method (BEM) is introduced to analyze the boundary of the confined region by use of a matrix equation. The voltage of each dipole pair is used as spatial-temporal localization data, and, unlike the conventional field-based localization method, it does not need to obtain the field component in each direction, so it can be easily implemented in practical engineering applications. Then, a global-multiple region-conjugate gradient (CG) hybrid search method is used to reduce the computation burden and to improve the operation speed. Two localization simulation models and a physical experiment are conducted. Both the simulation results and the physical experiment result show accurate positioning performance, which helps verify the effectiveness of the proposed localization method in underwater environments.

  14. Fuzzy Expert System based on a Novel Hybrid Stem Cell (HSC) Algorithm for Classification of Micro Array Data.

    Science.gov (United States)

    Vijay, S Arul Antran; GaneshKumar, P

    2018-02-21

    In the growing scenario, microarray data is extensively used since it provides a more comprehensive understanding of genetic variants among diseases. As the gene expression samples have high dimensionality, it becomes tedious to analyze the samples manually. Hence an automated system is needed to analyze these samples. The fuzzy expert system offers a clear classification when compared to the machine learning and statistical methodologies. In fuzzy classification, knowledge acquisition would be a major concern. Despite several existing approaches for knowledge acquisition, much effort is necessary to enhance the learning process. This paper proposes an innovative Hybrid Stem Cell (HSC) algorithm that utilizes Ant Colony optimization and Stem Cell algorithm for designing a fuzzy classification system to extract the informative rules to form the membership functions from the microarray dataset. The HSC algorithm uses a novel Adaptive Stem Cell Optimization (ASCO) to improve the points of the membership function and Ant Colony Optimization to produce the near optimum rule set. In order to extract the most informative genes from the large microarray dataset, a method called Mutual Information is used. The performance of the proposed technique is evaluated in simulations on five microarray datasets. These results prove that the proposed Hybrid Stem Cell (HSC) algorithm produces a more precise fuzzy system than the existing methodologies.
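
    The gene-selection step mentioned in this abstract relies on mutual information. A minimal sketch of ranking genes by mutual information with the class label is given below; the discretization into "low"/"high" expression levels, the gene names, and the data are hypothetical, and the abstract does not state which estimator the authors used.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Hypothetical class labels and discretized expression levels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = {
    "gene_a": ["low", "low", "low", "low", "high", "high", "high", "high"],
    "gene_b": ["low", "high", "low", "high", "low", "high", "low", "high"],
    "gene_c": ["low", "low", "low", "high", "high", "high", "high", "low"],
}

scores = {name: mutual_information(levels, labels) for name, levels in genes.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

    Genes at the top of such a ranking carry the most information about the class and would be the ones passed on to the classifier.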

  15. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,” “categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  16. Columbia Classification Algorithm of Suicide Assessment (C-CASA): classification of suicidal events in the FDA's pediatric suicidal risk analysis of antidepressants.

    Science.gov (United States)

    Posner, Kelly; Oquendo, Maria A; Gould, Madelyn; Stanley, Barbara; Davies, Mark

    2007-07-01

    To evaluate the link between antidepressants and suicidal behavior and ideation (suicidality) in youth, adverse events from pediatric clinical trials were classified in order to identify suicidal events. The authors describe the Columbia Classification Algorithm for Suicide Assessment (C-CASA), a standardized suicidal rating system that provided data for the pediatric suicidal risk analysis of antidepressants conducted by the Food and Drug Administration (FDA). Adverse events (N=427) from 25 pediatric antidepressant clinical trials were systematically identified by pharmaceutical companies. Randomly assigned adverse events were evaluated by three of nine independent expert suicidologists using the Columbia classification algorithm. Reliability of the C-CASA ratings and agreement with pharmaceutical company classification were estimated. Twenty-six new, possibly suicidal events (behavior and ideation) that were not originally identified by pharmaceutical companies were identified in the C-CASA, and 12 events originally labeled as suicidal by pharmaceutical companies were eliminated, which resulted in a total of 38 discrepant ratings. For the specific label of "suicide attempt," a relatively low level of agreement was observed between the C-CASA and pharmaceutical company ratings, with the C-CASA reporting a 50% reduction in ratings. Thus, although the C-CASA resulted in the identification of more suicidal events overall, fewer events were classified as suicide attempts. Additionally, the C-CASA ratings were highly reliable (intraclass correlation coefficient [ICC]=0.89). Utilizing a methodical, anchored approach to categorizing suicidality provides an accurate and comprehensive identification of suicidal events. The FDA's audit of the C-CASA demonstrated excellent transportability of this approach. 
    The Columbia algorithm was used to classify suicidal adverse events in the recent FDA adult antidepressant safety analyses and has also been mandated to be applied to all

  17. An Improved Cloud Classification Algorithm for China's FY-2C Multi-Channel Images Using Artificial Neural Network.

    Science.gov (United States)

    Liu, Yu; Xia, Jun; Shi, Chun-Xiang; Hong, Yang

    2009-01-01

    The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China's first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3-11.3 μm; IR2, 11.5-12.5 μm and WV 6.3-7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN networks identified in this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that the SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products.

  18. An Improved Cloud Classification Algorithm for China’s FY-2C Multi-Channel Images Using Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Chun-Xiang Shi

    2009-07-01

    The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China’s first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3-11.3 μm; IR2, 11.5-12.5 μm and WV 6.3-7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN networks identified in this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that the SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products.

  19. Evaluation of machine learning algorithms for classification of primary biological aerosol using a new UV-LIF spectrometer

    Science.gov (United States)

    Ruske, Simon; Topping, David O.; Foot, Virginia E.; Kaye, Paul H.; Stanley, Warren R.; Crawford, Ian; Morse, Andrew P.; Gallagher, Martin W.

    2017-03-01

    Characterisation of bioaerosols has important implications within environment and public health sectors. Recent developments in ultraviolet light-induced fluorescence (UV-LIF) detectors such as the Wideband Integrated Bioaerosol Spectrometer (WIBS) and the newly introduced Multiparameter Bioaerosol Spectrometer (MBS) have allowed for the real-time collection of fluorescence, size and morphology measurements for the purpose of discriminating between bacteria, fungal spores and pollen. This new generation of instruments has enabled ever larger data sets to be compiled with the aim of studying more complex environments. In real world data sets, particularly those from an urban environment, the population may be dominated by non-biological fluorescent interferents, bringing into question the accuracy of measurements of quantities such as concentrations. It is therefore imperative that we validate the performance of different algorithms which can be used for the task of classification. For unsupervised learning we tested hierarchical agglomerative clustering with various different linkages. For supervised learning, 11 methods were tested, including decision trees, ensemble methods (random forests, gradient boosting and AdaBoost), two implementations for support vector machines (libsvm and liblinear) and Gaussian methods (Gaussian naïve Bayesian, quadratic and linear discriminant analysis, the k-nearest neighbours algorithm and artificial neural networks). The methods were applied to two different data sets produced using the new MBS, which provides multichannel UV-LIF fluorescence signatures for single airborne biological particles. The first data set contained mixed PSLs and the second contained a variety of laboratory-generated aerosol. Clustering in general performs slightly worse than the supervised learning methods, correctly classifying, at best, only 67.6 and 91.1 % for the two data sets respectively. For supervised learning the gradient boosting algorithm was

  20. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    Science.gov (United States)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically-relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care
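
    The 10-fold cross-validation procedure used to compare the classifiers can be illustrated with a small sketch. It uses a 1-nearest-neighbor classifier (KNN with K=1) on synthetic two-class data; the data, the seeds, and the helper names are assumptions for illustration and have nothing to do with the MSI database of the study.

```python
import random


def k_fold_splits(n_samples, k, rng):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def nearest_neighbor_predict(train_x, train_y, point):
    """Return the label of the training sample closest to `point`."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], point)),
    )
    return train_y[best]


def cross_validated_accuracy(xs, ys, k=10, seed=0):
    """Fraction of samples classified correctly when each is held out once."""
    rng = random.Random(seed)
    correct = 0
    for train, test in k_fold_splits(len(xs), k, rng):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            correct += nearest_neighbor_predict(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)


# Synthetic, well-separated two-class data (made up for this example).
data_rng = random.Random(1)
xs = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(50)]
xs += [(data_rng.gauss(5, 1), data_rng.gauss(5, 1)) for _ in range(50)]
ys = [0] * 50 + [1] * 50

accuracy = cross_validated_accuracy(xs, ys, k=10)
print(accuracy)
```

    Repeating the call with different `seed` values and averaging the results mirrors the idea of testing each algorithm many times.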

  1. Direction of Radio Finding via MUSIC (Multiple Signal Classification) Algorithm for Hardware Design System

    Science.gov (United States)

    Zhang, Zheng

    2017-10-01

    Radio direction finding systems are based on digital signal processing algorithms, which make the system capable of both locating and tracking signals. The performance of radio direction finding therefore depends significantly on the effectiveness of these algorithms. Direction of Arrival (DOA) algorithms are used to estimate the number of plane waves incident on the antenna array and their angles of incidence. This manuscript investigates the implementation of a DOA algorithm (MUSIC) on a uniform linear array in the presence of white noise. The experimental results show that the MUSIC algorithm followed changes in the radio direction well.
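
    A compact sketch of MUSIC on a uniform linear array in white noise is shown below. It assumes a single narrowband source, half-wavelength element spacing, and a known source count of one, so the signal subspace is just the dominant eigenvector of the sample covariance (found here by power iteration). All parameter values are illustrative assumptions and not taken from the manuscript.

```python
import cmath
import math
import random


def steering_vector(theta_deg, n_elements, spacing=0.5):
    """ULA steering vector; spacing is in wavelengths."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(n_elements)]


def simulate(theta_deg, n_elements=8, n_snapshots=200, noise_std=0.1, seed=0):
    """Snapshots of one random-amplitude source plus white Gaussian noise."""
    rng = random.Random(seed)
    a = steering_vector(theta_deg, n_elements)
    snapshots = []
    for _ in range(n_snapshots):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        snapshots.append([
            s * a_m + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for a_m in a
        ])
    return snapshots


def sample_covariance(snapshots):
    """Average of the outer products x x^H over all snapshots."""
    m = len(snapshots[0])
    cov = [[0j] * m for _ in range(m)]
    for x in snapshots:
        for i in range(m):
            for j in range(m):
                cov[i][j] += x[i] * x[j].conjugate()
    return [[value / len(snapshots) for value in row] for row in cov]


def dominant_eigenvector(matrix, iterations=200):
    """Power iteration: unit eigenvector of the largest eigenvalue."""
    m = len(matrix)
    v = [1.0 + 0j] * m
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in w))
        v = [c / norm for c in w]
    return v


def music_spectrum(signal_vec, angles, n_elements):
    """Pseudo-spectrum 1 / (energy of a(theta) in the noise subspace)."""
    spectrum = []
    for theta in angles:
        a = steering_vector(theta, n_elements)
        projection = sum(u.conjugate() * c for u, c in zip(signal_vec, a))
        # ||a||^2 equals n_elements; subtract the signal-subspace part.
        noise_energy = n_elements - abs(projection) ** 2
        spectrum.append(1.0 / max(noise_energy, 1e-12))
    return spectrum


snapshots = simulate(theta_deg=20.0)
signal_vec = dominant_eigenvector(sample_covariance(snapshots))
angles = [k * 0.5 for k in range(-180, 181)]  # scan -90 to 90 degrees
spectrum = music_spectrum(signal_vec, angles, n_elements=8)
doa_estimate = angles[spectrum.index(max(spectrum))]
print(doa_estimate)
```

    With several sources the signal subspace would need a full eigendecomposition of the covariance matrix rather than a single power iteration.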

  2. New FIGO and Swedish intrapartum cardiotocography classification systems incorporated in the fetal ECG ST analysis (STAN) interpretation algorithm: agreements and discrepancies in cardiotocography classification and evaluation of significant ST events.

    Science.gov (United States)

    Olofsson, Per; Norén, Håkan; Carlsson, Ann

    2018-02-01

    The updated intrapartum cardiotocography (CTG) classification system by FIGO in 2015 (FIGO2015) and the FIGO2015-approached classification by the Swedish Society of Obstetricians and Gynecologists in 2017 (SSOG2017) are not harmonized with the fetal ECG ST analysis (STAN) algorithm from 2007 (STAN2007). The study aimed to reveal homogeneity and agreement between the systems in classifying CTG and ST events, and relate them to maternal and perinatal outcomes. Among CTG traces with ST events, 100 traces originally classified as normal, 100 as suspicious and 100 as pathological were randomly selected from a STAN database and classified by two experts in consensus. Homogeneity and agreement statistics between the CTG classifications were performed. Maternal and perinatal outcomes were evaluated in cases with clinically hidden ST data (n = 151). A two-tailed p ST events, heterogeneities were significant and agreements moderate to almost perfect (STAN2007 vs. FIGO2015 0.86, 0.72; STAN2007 vs. SSOG2017 0.92, 0.84; FIGO2015 vs. SSOG2017 0.94, 0.87). Significant ST events occurred more often combined with STAN2007 than with FIGO2015 classification, but not with SSOG2017; correct identification of adverse outcomes was not significantly different between the systems. There are discrepancies in the classification of CTG patterns and significant ST events between the old and new systems. The clinical relevance of the findings remains to be shown. © 2017 The Authors. Acta Obstetricia et Gynecologica Scandinavica published by John Wiley & Sons Ltd on behalf of Nordic Federation of Societies of Obstetrics and Gynecology (NFOG).

  3. Comparison of some classification algorithms based on deterministic and nondeterministic decision rules

    KAUST Repository

    Delimata, Paweł; Marszał-Paszek, Barbara; Moshkov, Mikhail; Paszek, Piotr; Skowron, Andrzej; Suraj, Zbigniew

    2010-01-01

    the considered algorithms extract from a given decision table efficiently some information about the set of rules. Next, this information is used by a decision-making procedure. The reported results of experiments show that the algorithms based on inhibitory

  4. Driver drowsiness classification using fuzzy wavelet-packet-based feature-extraction algorithm.

    Science.gov (United States)

    Khushaba, Rami N; Kodagoda, Sarath; Lal, Sara; Dissanayake, Gamini

    2011-01-01

    Driver drowsiness and loss of vigilance are a major cause of road accidents. Monitoring physiological signals while driving provides the possibility of detecting and warning of drowsiness and fatigue. The aim of this paper is to maximize the amount of drowsiness-related information extracted from a set of electroencephalogram (EEG), electrooculogram (EOG), and electrocardiogram (ECG) signals during a simulation driving test. Specifically, we develop an efficient fuzzy mutual-information (MI)-based wavelet packet transform (FMIWPT) feature-extraction method for classifying the driver drowsiness state into one of predefined drowsiness levels. The proposed method estimates the required MI using a novel approach based on fuzzy memberships, providing an accurate information-content estimation measure. The quality of the extracted features was assessed on datasets collected from 31 drivers on a simulation test. The experimental results proved the significance of FMIWPT in extracting features that highly correlate with the different drowsiness levels, achieving a classification accuracy of 95%-97% on average across all subjects.

  5. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays relies on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  6. Algorithms

    Indian Academy of Sciences (India)

    polynomial) division have been found in Vedic Mathematics which are dated much before Euclid's algorithm. A programming language is used to describe an algorithm for execution on a computer. An algorithm expressed using a programming.

  7. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by combining Landsat, MODIS, and secondary data

    Science.gov (United States)

    Thenkabail, Prasad S.; Wu, Zhuoting

    2012-01-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan using mega file data cubes (MFDCs) involving data from Landsat Global Land Survey (GLS), Landsat Enhanced Thematic Mapper Plus (ETM+) 30 m, Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m time-series, a suite of secondary data (e.g., elevation, slope, precipitation, temperature), and in situ data. First, the process involved producing an accurate reference (or truth) cropland layer (TCL), consisting of cropland extent, areas, and irrigated vs. rainfed cropland areas, for the entire country of Tajikistan based on MFDC of year 2005 (MFDC2005). The methods involved in producing TCL included using ISOCLASS clustering, Tasseled Cap bi-spectral plots, spectro-temporal characteristics from MODIS 250 m monthly normalized difference vegetation index (NDVI) maximum value composites (MVC) time-series, and textural characteristics of higher resolution imagery. The TCL statistics accurately matched with the national statistics of Tajikistan for irrigated and rainfed croplands, where about 70% of croplands were irrigated and the rest rainfed. Second, a rule-based ACCA was developed to replicate the TCL accurately (~80% producer’s and user’s accuracies or within 20% quantity disagreement involving about 10 million Landsat 30 m sized cropland pixels of Tajikistan). Development of ACCA was an iterative process involving series of rules that are coded, refined, tweaked, and re-coded till ACCA derived croplands (ACLs) match accurately with TCLs. Third, the ACCA derived cropland

  8. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by Combining Landsat, MODIS, and Secondary Data

    Directory of Open Access Journals (Sweden)

    Prasad S. Thenkabail

    2012-09-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan using mega file data cubes (MFDCs) involving data from Landsat Global Land Survey (GLS), Landsat Enhanced Thematic Mapper Plus (ETM+) 30 m, Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m time-series, a suite of secondary data (e.g., elevation, slope, precipitation, temperature), and in situ data. First, the process involved producing an accurate reference (or truth) cropland layer (TCL), consisting of cropland extent, areas, and irrigated vs. rainfed cropland areas, for the entire country of Tajikistan based on MFDC of year 2005 (MFDC2005). The methods involved in producing TCL included using ISOCLASS clustering, Tasseled Cap bi-spectral plots, spectro-temporal characteristics from MODIS 250 m monthly normalized difference vegetation index (NDVI) maximum value composites (MVC) time-series, and textural characteristics of higher resolution imagery. The TCL statistics accurately matched with the national statistics of Tajikistan for irrigated and rainfed croplands, where about 70% of croplands were irrigated and the rest rainfed. Second, a rule-based ACCA was developed to replicate the TCL accurately (~80% producer’s and user’s accuracies or within 20% quantity disagreement involving about 10 million Landsat 30 m sized cropland pixels of Tajikistan). Development of ACCA was an iterative process involving series of rules that are coded, refined, tweaked, and re-coded till ACCA derived croplands (ACLs) match accurately with TCLs. Third, the ACCA derived

  9. A canonical correlation analysis based EMG classification algorithm for eliminating electrode shift effect.

    Science.gov (United States)

    Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang

    2016-08-01

    Motion classification systems based on surface Electromyography (sEMG) pattern recognition have achieved good results under experimental conditions, but clinical implementation and practical application remain a challenge. Many factors contribute to the difficulty of clinical use of EMG based dexterous control. The most obvious and important is the noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of the signal and biological signals such as the Electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction of classification accuracy caused by electrode shift. The average classification accuracy of our method was above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and discovered a strong correlation, with a correlation coefficient of >0.9, between shift position data and normal position data.

  10. Embedded vision equipment of industrial robot for inline detection of product errors by clustering–classification algorithms

    Directory of Open Access Journals (Sweden)

    Kamil Zidek

    2016-10-01

    The article deals with the design of embedded vision equipment of industrial robots for inline diagnosis of product error during the manipulation process. The vision equipment can be attached to the end effector of robots or manipulators, and it provides an image snapshot of the part surface before grasp, searches for error during manipulation, and separates products with error from the next operation of manufacturing. The new approach is a methodology based on machine teaching for the automated identification, localization, and diagnosis of systematic errors in products of high-volume production. To achieve this, we used two main data mining algorithms: clustering for accumulation of similar errors and classification methods for the prediction of any new error to a proposed class. The presented methodology consists of three separate processing levels: image acquisition for fail parameterization, data clustering for categorizing errors to separate classes, and new pattern prediction with a proposed class model. We chose main representatives of clustering algorithms, for example, K-means from quantization of vectors, fast library for approximate nearest neighbor from hierarchical clustering, and density-based spatial clustering of applications with noise from algorithms based on the density of the data. For machine learning, we selected six major algorithms of classification: support vector machines, normal Bayesian classifier, K-nearest neighbor, gradient boosted trees, random trees, and neural networks. The selected algorithms were compared for speed and reliability and tested on two platforms: a desktop-based computer system and an embedded system based on System on Chip (SoC) with vision equipment.

  11. TESTING THE GENERALIZATION EFFICIENCY OF OIL SLICK CLASSIFICATION ALGORITHM USING MULTIPLE SAR DATA FOR DEEPWATER HORIZON OIL SPILL

    Directory of Open Access Journals (Sweden)

    C. Ozkan

    2012-07-01

    Marine oil spills due to releases of crude oil from tankers, offshore platforms, drilling rigs and wells, etc. seriously affect the fragile marine and coastal ecosystem and cause political and environmental concern. A catastrophic explosion and subsequent fire in the Deepwater Horizon oil platform caused the platform to burn and sink, and oil leaked continuously between April 20th and July 15th of 2010, releasing about 780,000 m³ of crude oil into the Gulf of Mexico. Today, space-borne SAR sensors are extensively used for the detection of oil spills in the marine environment, as they are independent of sunlight, not affected by cloudiness, and more cost-effective than air patrolling because they cover large areas. In this study, the generalization extent of an object based classification algorithm was tested for oil spill detection using multiple SAR imagery data. Among many geometrical, physical and textural features, some more distinctive ones were selected to distinguish oil and look-alike objects from each other. The tested classifier was constructed from a Multilayer Perceptron Artificial Neural Network trained by ABC, LM and BP optimization algorithms. The training data for the classifier were drawn from SAR data of an oil spill originating from Lebanon in 2007. The classifier was then applied to the Deepwater Horizon oil spill data in the Gulf of Mexico on RADARSAT-2 and ALOS PALSAR images to demonstrate the generalization efficiency of the oil slick classification algorithm.

  12. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

    Directory of Open Access Journals (Sweden)

    Arran Schlosberg

    2014-05-01

    Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV.
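
    The abstract does not spell out how DEFLATE enters the adjusted Grantham measure, so the following is only a generic illustration of the underlying intuition: the DEFLATE-compressed length of a sequence grows with its variability. The alignment columns, the `deflate_size` helper, and the use of compressed length as a variability proxy are assumptions made here, not the CompressGV method.

```python
import random
import zlib


def deflate_size(text):
    """Length in bytes of the zlib (DEFLATE) compressed ASCII text."""
    return len(zlib.compress(text.encode("ascii"), 9))


rng = random.Random(0)
amino_acids = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical alignment columns across 60 species.
conserved_column = "A" * 60
variable_column = "".join(rng.choice(amino_acids) for _ in range(60))

conserved_size = deflate_size(conserved_column)
variable_size = deflate_size(variable_column)
print(conserved_size, variable_size)
```

    A fully conserved column compresses to far fewer bytes than a highly variable one, which is the kind of signal a compression-based correction could exploit.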

  13. Evaluation of a treatment-based classification algorithm for low back pain: a cross-sectional study.

    Science.gov (United States)

    Stanton, Tasha R; Fritz, Julie M; Hancock, Mark J; Latimer, Jane; Maher, Christopher G; Wand, Benedict M; Parent, Eric C

    2011-04-01

    Several studies have investigated criteria for classifying patients with low back pain (LBP) into treatment-based subgroups. A comprehensive algorithm was created to translate these criteria into a clinical decision-making guide. This study investigated the translation of the individual subgroup criteria into a comprehensive algorithm by studying the prevalence of patients meeting the criteria for each treatment subgroup and the reliability of the classification. This was a cross-sectional, observational study. Two hundred fifty patients with acute or subacute LBP were recruited from the United States and Australia to participate in the study. Trained physical therapists performed standardized assessments on all participants. The researchers used these findings to classify participants into subgroups. Thirty-one participants were reassessed to determine interrater reliability of the algorithm decision. Based on individual subgroup criteria, 25.2% (95% confidence interval [CI]=19.8%-30.6%) of the participants did not meet the criteria for any subgroup, 49.6% (95% CI=43.4%-55.8%) of the participants met the criteria for only one subgroup, and 25.2% (95% CI=19.8%-30.6%) of the participants met the criteria for more than one subgroup. The most common combination of subgroups was manipulation + specific exercise (68.4% of the participants who met the criteria for 2 subgroups). Reliability of the algorithm decision was moderate (kappa=0.52, 95% CI=0.27-0.77, percentage of agreement=67%). Due to a relatively small patient sample, reliability estimates are somewhat imprecise. These findings provide important clinical data to guide future research and revisions to the algorithm. The finding that 25% of the participants met the criteria for more than one subgroup has important implications for the sequencing of treatments in the algorithm. Likewise, the finding that 25% of the participants did not meet the criteria for any subgroup provides important information regarding
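
    The confidence intervals quoted in this abstract can be checked with a normal-approximation (Wald) interval for a proportion. The sketch below assumes that this is how the intervals were computed, which the abstract does not state; the function name wald_ci is hypothetical.

```python
import math


def wald_ci(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion p observed in n subjects."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se


# 25.2% of 250 participants met the criteria for no subgroup.
low, high = wald_ci(0.252, 250)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")  # prints "19.8% to 30.6%"
```

    Under this assumption the result agrees with the reported interval of 19.8%-30.6%.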

  14. Classification of Suicide Attempts through a Machine Learning Algorithm Based on Multiple Systemic Psychiatric Scales

    Directory of Open Access Journals (Sweden)

    Jihoon Oh

    2017-09-01

    Full Text Available Classification and prediction of suicide attempts in high-risk groups is important for preventing suicide. The purpose of this study was to investigate whether the information from multiple clinical scales has classification power for identifying actual suicide attempts. Patients with depression and anxiety disorders (N = 573) were included, and each participant completed 31 self-report psychiatric scales and questionnaires about their history of suicide attempts. We then trained an artificial neural network classifier with 41 variables (31 psychiatric scales and 10 sociodemographic elements) and ranked the contribution of each variable for the classification of suicide attempts. To evaluate the clinical applicability of our model, we measured classification performance with top-ranked predictors. Our model had an overall accuracy of 93.7% in 1-month, 90.8% in 1-year, and 87.4% in lifetime suicide attempts detection. The area under the receiver operating characteristic curve (AUROC) was the highest for 1-month suicide attempts detection (0.93), followed by lifetime (0.89), and 1-year detection (0.87). Among all variables, the Emotion Regulation Questionnaire had the highest contribution, and the positive and negative characteristics of the scales similarly contributed to classification performance. Performance on suicide attempts classification was largely maintained when we only used the top five ranked variables for training (AUROC; 1-month, 0.75, 1-year, 0.85, lifetime suicide attempts detection, 0.87). Our findings indicate that information from self-report clinical scales can be useful for the classification of suicide attempts. Based on the reliable performance of the top five predictors alone, this machine learning approach could help clinicians identify high-risk patients in clinical settings.

  15. Classification of Suicide Attempts through a Machine Learning Algorithm Based on Multiple Systemic Psychiatric Scales.

    Science.gov (United States)

    Oh, Jihoon; Yun, Kyongsik; Hwang, Ji-Hyun; Chae, Jeong-Ho

    2017-01-01

    Classification and prediction of suicide attempts in high-risk groups is important for preventing suicide. The purpose of this study was to investigate whether the information from multiple clinical scales has classification power for identifying actual suicide attempts. Patients with depression and anxiety disorders (N = 573) were included, and each participant completed 31 self-report psychiatric scales and questionnaires about their history of suicide attempts. We then trained an artificial neural network classifier with 41 variables (31 psychiatric scales and 10 sociodemographic elements) and ranked the contribution of each variable for the classification of suicide attempts. To evaluate the clinical applicability of our model, we measured classification performance with top-ranked predictors. Our model had an overall accuracy of 93.7% in 1-month, 90.8% in 1-year, and 87.4% in lifetime suicide attempts detection. The area under the receiver operating characteristic curve (AUROC) was the highest for 1-month suicide attempts detection (0.93), followed by lifetime (0.89), and 1-year detection (0.87). Among all variables, the Emotion Regulation Questionnaire had the highest contribution, and the positive and negative characteristics of the scales similarly contributed to classification performance. Performance on suicide attempts classification was largely maintained when we only used the top five ranked variables for training (AUROC; 1-month, 0.75, 1-year, 0.85, lifetime suicide attempts detection, 0.87). Our findings indicate that information from self-report clinical scales can be useful for the classification of suicide attempts. Based on the reliable performance of the top five predictors alone, this machine learning approach could help clinicians identify high-risk patients in clinical settings.

  16. DOA Estimation of Low Altitude Target Based on Adaptive Step Glowworm Swarm Optimization-multiple Signal Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Zhou Hao

    2015-06-01

    Full Text Available The traditional MUltiple SIgnal Classification (MUSIC) algorithm requires significant computational effort and cannot be employed for the Direction Of Arrival (DOA) estimation of targets in a low-altitude multipath environment. As such, a novel MUSIC approach is proposed on the basis of the Adaptive Step Glowworm Swarm Optimization (ASGSO) algorithm. The virtual spatial smoothing of the matrix formed by each snapshot is used to realize the decorrelation of the multipath signal and the establishment of a full-order correlation matrix. ASGSO optimizes the function and estimates the elevation of the target. The simulation results suggest that the proposed method can overcome the low-altitude multipath effect and estimate the DOA of the target readily and precisely without radar effective aperture loss.

  17. Automated detection and classification of the proximal humerus fracture by using deep learning algorithm.

    Science.gov (United States)

    Chung, Seok Won; Han, Seung Seog; Lee, Ji Whan; Oh, Kyung-Soo; Kim, Na Ra; Yoon, Jong Pil; Kim, Joon Yub; Moon, Sung Hoon; Kwon, Jieun; Lee, Hyo-Jin; Noh, Young-Min; Kim, Youngjun

    2018-03-26

    Background and purpose - We aimed to evaluate the ability of artificial intelligence (a deep learning algorithm) to detect and classify proximal humerus fractures using plain anteroposterior shoulder radiographs. Patients and methods - 1,891 images (1 image per person) of normal shoulders (n = 515) and 4 proximal humerus fracture types (greater tuberosity, 346; surgical neck, 514; 3-part, 269; 4-part, 247) classified by 3 specialists were evaluated. We trained a deep convolutional neural network (CNN) after augmentation of a training dataset. The ability of the CNN, as measured by top-1 accuracy, area under receiver operating characteristics curve (AUC), sensitivity/specificity, and Youden index, in comparison with humans (28 general physicians, 11 general orthopedists, and 19 orthopedists specialized in the shoulder) to detect and classify proximal humerus fractures was evaluated. Results - The CNN showed a high performance of 96% top-1 accuracy, 1.00 AUC, 0.99/0.97 sensitivity/specificity, and 0.97 Youden index for distinguishing normal shoulders from proximal humerus fractures. In addition, the CNN showed promising results with 65-86% top-1 accuracy, 0.90-0.98 AUC, 0.88/0.83-0.97/0.94 sensitivity/specificity, and 0.71-0.90 Youden index for classifying fracture type. When compared with the human groups, the CNN showed superior performance to that of general physicians and orthopedists, similar performance to orthopedists specialized in the shoulder, and the superior performance of the CNN was more marked in complex 3- and 4-part fractures. Interpretation - The use of artificial intelligence can accurately detect and classify proximal humerus fractures on plain shoulder AP radiographs. Further studies are necessary to determine the feasibility of applying artificial intelligence in the clinic and whether its use could improve care and outcomes compared with current orthopedic assessments.
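
    The Youden index reported above is conventionally defined as sensitivity + specificity - 1. The sketch below applies that standard definition to one of the sensitivity/specificity pairs quoted in the abstract; the function name youden_index is hypothetical.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: 0 means no better than chance, 1 means perfect."""
    return sensitivity + specificity - 1.0


# Lower end of the fracture-type range quoted in the abstract (0.88/0.83).
j = youden_index(0.88, 0.83)
print(round(j, 2))  # prints 0.71
```

    Because the abstract reports rounded values, recomputing the index from other quoted pairs can differ from the published figure in the last digit.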

  18. Active-passive data fusion algorithms for seafloor imaging and classification from CZMIL data

    Science.gov (United States)

    Park, Joong Yong; Ramnath, Vinod; Feygels, Viktor; Kim, Minsu; Mathur, Abhinav; Aitken, Jennifer; Tuell, Grady

    2010-04-01

    CZMIL will simultaneously acquire lidar and passive spectral data. These data will be fused to produce enhanced seafloor reflectance images from each sensor, and combined at a higher level to achieve seafloor classification. In the DPS software, the lidar data will first be processed to solve for depth, attenuation, and reflectance. The depth measurements will then be used to constrain the spectral optimization of the passive spectral data, and the resulting water column estimates will be used recursively to improve the estimates of seafloor reflectance from the lidar. Finally, the resulting seafloor reflectance cube will be combined with texture metrics estimated from the seafloor topography to produce classifications of the seafloor.

  19. Algorithms for Hyperspectral Endmember Extraction and Signature Classification with Morphological Dendritic Networks

    Science.gov (United States)

    Schmalz, M.; Ritter, G.

    Accurate multispectral or hyperspectral signature classification is key to the nonimaging detection and recognition of space objects. Additionally, signature classification accuracy depends on accurate spectral endmember determination [1]. Previous approaches to endmember computation and signature classification were based on linear operators or neural networks (NNs) expressed in terms of the algebra (R, +, x) [1,2]. Unfortunately, class separation in these methods tends to be suboptimal, and the number of signatures that can be accurately classified often depends linearly on the number of NN inputs. This can lead to poor endmember distinction, as well as potentially significant classification errors in the presence of noise or densely interleaved signatures. In contrast to traditional CNNs, autoassociative morphological memories (AMM) are a construct similar to Hopfield autoassociative memories defined on the (R, +, ∨, ∧) lattice algebra [3]. Unlimited storage and perfect recall of noiseless real-valued patterns has been proven for AMMs [4]. However, AMMs suffer from sensitivity to specific noise models, which can be characterized as erosive and dilative noise. On the other hand, the prior definition of a set of endmembers corresponds to material spectra lying on vertices of the minimum convex region covering the image data. These vertices can be characterized as morphologically independent patterns. It has further been shown that AMMs can be based on dendritic computation [3,6]. These techniques yield improved accuracy and class segmentation/separation ability in the presence of highly interleaved signature data. In this paper, we present a procedure for endmember determination based on AMM noise sensitivity, which employs morphological dendritic computation. We show that detected endmembers can be exploited by AMM-based classification techniques to achieve accurate signature classification in the presence of noise, closely spaced or interleaved signatures, and

  20. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data

    Directory of Open Access Journals (Sweden)

    Sheng Yang

    2015-01-01

    Full Text Available Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genome Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes.

  1. A Fast Logdet Divergence Based Metric Learning Algorithm for Large Data Sets Classification

    Directory of Open Access Journals (Sweden)

    Jiangyuan Mei

    2014-01-01

    the basis of classifiers, for example, the k-nearest neighbors classifier. Experiments on benchmark data sets demonstrate that the proposed algorithm compares favorably with the state-of-the-art methods.

  2. Autonomous Time-Frequency Cropping and Feature-Extraction Algorithms for Classification of LPI Radar Modulations

    National Research Council Canada - National Science Library

    Zilberman, Eric R

    2006-01-01

    ...), uses the marginal frequency distribution and the adaptive threshold binarization algorithm to determine the start and stop frequencies of the modulation energy to locate and adapt the size of the cropping window...

  3. A Coupled k-Nearest Neighbor Algorithm for Multi-Label Classification

    Science.gov (United States)

    2015-05-22

    classification, an image may contain several concepts simultaneously, such as beach, sunset and kangaroo. Such tasks are usually denoted as multi-label...informatics, a gene can belong to both metabolism and transcription classes; and in music categorization, a song may be labeled as Mozart and sad. In the

  4. A Public Image Database for Benchmark of Plant Seedling Classification Algorithms

    DEFF Research Database (Denmark)

    Giselsson, Thomas Mosgaard; Nyholm Jørgensen, Rasmus; Jensen, Peter Kryger

    A database of images of approximately 960 unique plants belonging to 12 species at several growth stages is made publicly available. It comprises annotated RGB images with a physical resolution of roughly 10 pixels per mm. To standardise the evaluation of classification results obtained...

  5. Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

    Science.gov (United States)

    Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

    2016-01-01

    With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.

  6. Algorithms

    Indian Academy of Sciences (India)

    to as 'divide-and-conquer'. Although there has been a large effort in realizing efficient algorithms, there are not many universally accepted algorithm design paradigms. In this article, we illustrate algorithm design techniques such as balancing, greedy strategy, dynamic programming strategy, and backtracking or traversal of ...

  7. Reproducibility of measurements and variability of the classification algorithm of Stratus OCT in normal, hypertensive, and glaucomatous patients

    Directory of Open Access Journals (Sweden)

    Alfonso Antón

    2009-01-01

    Full Text Available Alfonso Antón [1,2,3], Marta Castany [1,2], Marta Pazos-Lopez [1,2], Ruben Cuadrado [3], Ana Flores [3], Miguel Castilla [1]. [1] Hospital de la Esperanza-Hospital del Mar (IMAS), Barcelona, Spain; [2] Institut Català de la Retina (ICR), Glaucoma Department, Barcelona, Spain; [3] Instituto Universitario de Oftalmobiología Aplicada (IOBA), Universidad de Valladolid, Valladolid, Spain. Purpose: To assess the reproducibility of retinal nerve fiber layer (RNFL) measurements and the variability of the probabilistic classification algorithm in normal, hypertensive and glaucomatous eyes using Stratus optical coherence tomography (OCT). Methods: Forty-nine eyes (13 normal, 17 ocular hypertensive [OHT] and 19 glaucomatous) of 49 subjects were included in this study. RNFL was determined with Stratus OCT using the standard protocol RNFL thickness 3.4. Three different images of each eye were taken consecutively during the same session. To evaluate OCT reproducibility, the coefficient of variation (COV) and intraclass correlation coefficient (ICC) were calculated for the average thickness (AvgT), superior average thickness (Savg), and inferior average thickness (Iavg) parameters. The variability of the results of the probabilistic classification algorithm, based on the OCT normative database, was also analyzed. The percentage of eyes with changes in the category assigned was calculated for each group. Results: The 50th percentile of COV was 2.96%, 4.00%, and 4.31% for AvgT, Savg, and Iavg, respectively. The glaucoma group presented the largest COV for all three parameters (3.87%, 5.55%, 7.82%). ICCs were greater than 0.75 for almost all measures (except for the inferior thickness parameter in the normal group; ICC = 0.64, 95% CI 0.334–0.857). Regarding the probabilistic classification algorithm for the three parameters (AvgT, Savg, Iavg), the percentage of eyes without color-code category changes among the three images was as follows: normal group, 100%, 84.6% and 92%; OHT group, 89.5%, 52.7%, 79%; and
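
    The coefficient of variation used above as a reproducibility measure is the standard deviation of repeated measurements divided by their mean, expressed as a percentage. A minimal sketch follows, assuming the sample standard deviation is used and using made-up thickness readings (the abstract gives no raw data):

```python
import statistics


def coefficient_of_variation(readings: list[float]) -> float:
    """COV in percent: sample standard deviation relative to the mean."""
    return 100.0 * statistics.stdev(readings) / statistics.mean(readings)


# Hypothetical average RNFL thickness (micrometres) from three consecutive scans of one eye.
scans = [98.0, 100.0, 102.0]
print(coefficient_of_variation(scans))
```

    For these three readings the mean is 100 and the sample standard deviation is 2, so the COV is 2%.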

  8. Genetic algorithm based feature selection combined with dual classification for the automated detection of proliferative diabetic retinopathy.

    Science.gov (United States)

    Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A

    2015-07-01

    Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel maps, each of which holds vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Developing a Random Forest Algorithm for MODIS Global Burned Area Classification

    Directory of Open Access Journals (Sweden)

    Rubén Ramo

    2017-11-01

    Full Text Available This paper aims to develop a global burned area (BA) algorithm for MODIS BRDF-corrected images based on the Random Forest (RF) classifier. Two RF models were generated, including: (1) all MODIS reflective bands; and (2) only the red (R) and near infrared (NIR) bands. Active fire information, vegetation indices and auxiliary variables were taken into account as well. Both RF models were trained using a statistically designed sample of 130 reference sites, which took into account the global diversity of fire conditions. For each site, fire perimeters were obtained from multitemporal pairs of Landsat TM/ETM+ images acquired in 2008. Those fire perimeters were used to extract burned and unburned areas to train the RF models. Using the standard MCD43A4 resolution (500 × 500 m), the training dataset included 48,365 burned pixels and 6,293,205 unburned pixels. Different combinations of number of trees and number of parameters were tested. The final RF models included 600 trees and 5 attributes. The RF full model (considering all bands) provided a balanced accuracy of 0.94, while the RF RNIR model had 0.93. As a first assessment of these RF models, they were used to classify daily MCD43A4 images in three test sites for three consecutive years (2006–2008). The selected sites included different ecosystems: Tropical (Australia), Boreal (Canada) and Temperate (California), and extended coverage (totaling more than 2,500,000 km2). Results from both RF models for those sites were compared with national fire perimeters, as well as with two existing BA MODIS products, the MCD45 and MCD64. Considering all three years and three sites, commission error for the RF Full model was 0.16, with an omission error of 0.23. For the RF RNIR model, these errors were 0.19 and 0.21, respectively. The existing MODIS BA products had lower commission errors, but higher omission errors (0.09 and 0.33 for the MCD45 and 0.10 and 0.29 for the MCD64) than those obtained with the RF models, and
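
    The accuracy measures quoted in this abstract are usually derived from a burned/unburned confusion matrix. The sketch below uses the common remote-sensing definitions (commission error as the share of pixels classified as burned that are not burned, omission error as the share of truly burned pixels that were missed); whether the paper uses exactly these formulas is an assumption, and the pixel counts are invented for illustration.

```python
def commission_error(tp: int, fp: int) -> float:
    """Share of pixels classified as burned that are actually unburned."""
    return fp / (tp + fp)


def omission_error(tp: int, fn: int) -> float:
    """Share of truly burned pixels that the classifier missed."""
    return fn / (tp + fn)


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of the true positive rate and the true negative rate."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Hypothetical pixel counts: burned class is tp + fn, unburned class is tn + fp.
tp, fn, fp, tn = 77, 23, 15, 885
print(commission_error(tp, fp), omission_error(tp, fn), balanced_accuracy(tp, fn, tn, fp))
```

    Balanced accuracy is a natural choice when, as in the training set described above, unburned pixels vastly outnumber burned ones, because plain accuracy would be dominated by the majority class.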

  10. Pap-smear Classification Using Efficient Second Order Neural Network Training Algorithms

    DEFF Research Database (Denmark)

    Ampazis, Nikolaos; Dounias, George; Jantzen, Jan

    2004-01-01

    In this paper we make use of two highly efficient second order neural network training algorithms, namely the LMAM (Levenberg-Marquardt with Adaptive Momentum) and OLMAM (Optimized Levenberg-Marquardt with Adaptive Momentum), for the construction of an efficient pap-smear test classifier. The algorithms are methodologically similar, and are based on iterations of the form employed in the Levenberg-Marquardt (LM) method for non-linear least squares problems with the inclusion of an additional adaptive momentum term arising from the formulation of the training task as a constrained optimization...

  11. An Algorithm Based on the Self-Organized Maps for the Classification of Facial Features

    Directory of Open Access Journals (Sweden)

    Gheorghe Gîlcă

    2015-12-01

    Full Text Available This paper deals with an algorithm based on Self-Organized Map (SOM) networks which classifies facial features. The proposed algorithm can categorize the facial features defined by the input variables (eyebrow, mouth and eyelids) into a map of their grouping. The group map is based on calculating the distance between each input vector and each neuron of the output layer, the neuron with the minimum distance being declared the winner neuron. The network structure consists of two levels: the first level contains three input vectors, each having forty-one values, while the second level contains the SOM competitive network, which consists of 100 neurons. The proposed system can classify facial features quickly and easily using the proposed algorithm based on SOMs.
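
    The winner-selection step described above can be sketched as follows. This is a generic illustration of picking the output neuron with the minimum Euclidean distance to an input vector, not the authors' code; the function name winner_neuron and the tiny two-dimensional weights are assumptions (the paper uses forty-one input values and 100 neurons).

```python
import math


def winner_neuron(x: list[float], weights: list[list[float]]) -> int:
    """Return the index of the neuron whose weight vector is closest to x."""
    distances = [math.dist(x, w) for w in weights]
    return min(range(len(weights)), key=distances.__getitem__)


# Toy competitive layer with three neurons (assumed values).
weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print(winner_neuron([0.9, 1.2], weights))  # prints 1
```

    In a full SOM the winner and its map neighbours would then have their weights moved toward the input; that training step is omitted here.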

  12. A Machine-Learning Algorithm Toward Color Analysis for Chronic Liver Disease Classification, Employing Ultrasound Shear Wave Elastography.

    Science.gov (United States)

    Gatos, Ilias; Tsantis, Stavros; Spiliopoulos, Stavros; Karnabatidis, Dimitris; Theotokas, Ioannis; Zoumpoulis, Pavlos; Loupas, Thanasis; Hazle, John D; Kagadis, George C

    2017-09-01

    The purpose of the present study was to employ a computer-aided diagnosis system that classifies chronic liver disease (CLD) using ultrasound shear wave elastography (SWE) imaging, with a stiffness value-clustering and machine-learning algorithm. A clinical data set of 126 patients (56 healthy controls, 70 with CLD) was analyzed. First, an RGB-to-stiffness inverse mapping technique was employed. A five-cluster segmentation was then performed associating corresponding different-color regions with certain stiffness value ranges acquired from the SWE manufacturer-provided color bar. Subsequently, 35 features (7 for each cluster), indicative of physical characteristics existing within the SWE image, were extracted. A stepwise regression analysis toward feature reduction was used to derive a reduced feature subset that was fed into the support vector machine classification algorithm to classify CLD from healthy cases. The highest accuracy in classification of healthy to CLD subject discrimination from the support vector machine model was 87.3% with sensitivity and specificity values of 93.5% and 81.2%, respectively. Receiver operating characteristic curve analysis gave an area under the curve value of 0.87 (confidence interval: 0.77-0.92). A machine-learning algorithm that quantifies color information in terms of stiffness values from SWE images and discriminates CLD from healthy cases is introduced. New objective parameters and criteria for CLD diagnosis employing SWE images provided by the present study can be considered an important step toward color-based interpretation, and could assist radiologists' diagnostic performance on a daily basis after being installed in a PC and employed retrospectively, immediately after the examination. Copyright © 2017 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  13. CLASSIFICATION OF NEURAL NETWORK FOR TECHNICAL CONDITION OF TURBOFAN ENGINES BASED ON HYBRID ALGORITHM

    Directory of Open Access Journals (Sweden)

    Valentin Potapov

    2016-12-01

    Full Text Available Purpose: This work presents a method for diagnosing the technical condition of turbofan engines using a hybrid neural network algorithm, based on software developed for the analysis of data obtained over the aircraft's service life. Methods: The method allows engine diagnostics with recognition down to the structural assembly level, in the presence of both single and multiple damage to structural components of the running engine. Results: The neural network structure was optimized, using genetic algorithms, to solve the problem of evaluating the technical state of the bypass turbofan engine.

  14. Pap-smear Classification Using Efficient Second Order Neural Network Training Algorithms

    DEFF Research Database (Denmark)

    Ampazis, Nikolaos; Dounias, George; Jantzen, Jan

    2004-01-01

    In this paper we make use of two highly efficient second order neural network training algorithms, namely the LMAM (Levenberg-Marquardt with Adaptive Momentum) and OLMAM (Optimized Levenberg-Marquardt with Adaptive Momentum), for the construction of an efficient pap-smear test classifier...

  15. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    importance in the projection (VIP) information of the DPLS method. The power of the gene selection method and the proposed supervised hierarchical clustering method is illustrated on three microarray data sets of leukemia, breast, and colon cancer. Supervised machine learning algorithms thus enable...

  16. Algorithm for the classification of multi-modulating signals on the electrocardiogram.

    Science.gov (United States)

    Mita, Mitsuo

    2007-03-01

    This article discusses an algorithm that measures the electrocardiogram (ECG) and respiration simultaneously and has diagnostic potential for sleep apnoea from ECG recordings. The algorithm combines three particular scale transforms, a(j)(t), u(j)(t) and o(j)(a(j)), with the statistical Fourier transform (SFT). The time and magnitude scale transforms a(j)(t) and u(j)(t) change the source into a periodic signal, and tau(j) = o(j)(a(j)) confines its harmonics into a few instantaneous components at tau(j), a common instant on the two scales t and tau(j). As a result, the multi-modulating source is decomposed by the SFT and reconstructed into ECG, respiration and other signals by the inverse transform. The algorithm is expected to obtain the partial ventilation and the heart rate variability from the scale transforms among a(j)(t), a(j+1)(t) and u(j+1)(t) associated with each modulation. The algorithm has high potential for clinical checkups in the diagnosis of sleep apnoea from ECG recordings.

  17. A comparison of two open source LiDAR surface classification algorithms

    Science.gov (United States)

    With the progression of LiDAR (Light Detection and Ranging) towards a mainstream resource management tool, it has become necessary to understand how best to process and analyze the data. While most ground surface identification algorithms remain proprietary and have high purchase costs, a few are op...

  18. A comparison of two open source LiDAR surface classification algorithms

    Science.gov (United States)

    Wade T. Tinkham; Hongyu Huang; Alistair M.S. Smith; Rupesh Shrestha; Michael J. Falkowski; Andrew T. Hudak; Timothy E. Link; Nancy F. Glenn; Danny G. Marks

    2011-01-01

    With the progression of LiDAR (Light Detection and Ranging) towards a mainstream resource management tool, it has become necessary to understand how best to process and analyze the data. While most ground surface identification algorithms remain proprietary and have high purchase costs, a few are openly available, free to use, and are supported by published results....

  19. Feature Selection for Object-Based Classification of High-Resolution Remote Sensing Images Based on the Combination of a Genetic Algorithm and Tabu Search

    Directory of Open Access Journals (Sweden)

    Lei Shi

    2018-01-01

    Full Text Available In object-based image analysis of high-resolution images, the number of features can reach hundreds, so it is necessary to perform feature reduction prior to classification. In this paper, a feature selection method based on the combination of a genetic algorithm (GA) and tabu search (TS) is presented. The proposed GATS method aims to reduce the premature convergence of the GA by the use of TS. A prematurity index is first defined to judge the convergence situation during the search. When premature convergence does take place, an improved mutation operator is executed, in which TS is performed on individuals with higher fitness values. As for the other individuals with lower fitness values, mutation with a higher probability is carried out. Experiments using the proposed GATS feature selection method and three other methods, a standard GA, the multistart TS method, and ReliefF, were conducted on WorldView-2 and QuickBird images. The experimental results showed that the proposed method outperforms the other methods in terms of the final classification accuracy.

  20. Feature Selection for Object-Based Classification of High-Resolution Remote Sensing Images Based on the Combination of a Genetic Algorithm and Tabu Search

    Science.gov (United States)

    Shi, Lei; Wan, Youchuan; Gao, Xianjun

    2018-01-01

    In object-based image analysis of high-resolution images, the number of features can reach hundreds, so it is necessary to perform feature reduction prior to classification. In this paper, a feature selection method based on the combination of a genetic algorithm (GA) and tabu search (TS) is presented. The proposed GATS method aims to reduce the premature convergence of the GA by the use of TS. A prematurity index is first defined to judge the convergence situation during the search. When premature convergence does take place, an improved mutation operator is executed, in which TS is performed on individuals with higher fitness values. As for the other individuals with lower fitness values, mutation with a higher probability is carried out. Experiments using the proposed GATS feature selection method and three other methods, a standard GA, the multistart TS method, and ReliefF, were conducted on WorldView-2 and QuickBird images. The experimental results showed that the proposed method outperforms the other methods in terms of the final classification accuracy. PMID:29581721

  1. Evaluation of feature selection algorithms for classification in temporal lobe epilepsy based on MR images

    Science.gov (United States)

    Lai, Chunren; Guo, Shengwen; Cheng, Lina; Wang, Wensheng; Wu, Kai

    2017-02-01

    It is very important to differentiate temporal lobe epilepsy (TLE) patients from healthy people and to localize the abnormal brain regions of TLE patients. Cortical features and changes can reveal the unique anatomical patterns of brain regions from structural MR images. In this study, structural MR images from 28 normal controls (NC), 18 left TLE (LTLE), and 21 right TLE (RTLE) were acquired, and four types of cortical feature, namely cortical thickness (CTh), cortical surface area (CSA), gray matter volume (GMV), and mean curvature (MCu), were explored for discriminative analysis. Three feature selection methods, the independent sample t-test filtering, the sparse-constrained dimensionality reduction model (SCDRM), and the support vector machine-recursive feature elimination (SVM-RFE), were investigated to extract dominant regions with significant differences among the compared groups for classification using the SVM classifier. The results showed that the SVM-RFE achieved the highest performance (most classifications with more than 92% accuracy), followed by the SCDRM and the t-test. In particular, the surface area and gray matter volume exhibited prominent discriminative ability, and the performance of the SVM was improved significantly when the four cortical features were combined. Additionally, the dominant regions with higher classification weights were mainly located in the temporal and frontal lobes, including the inferior temporal, entorhinal cortex, fusiform, parahippocampal cortex, middle frontal and frontal pole. It was demonstrated that the cortical features provided effective information to determine the abnormal anatomical pattern and that the proposed method has the potential to improve the clinical diagnosis of TLE.

  2. An application of the Self Organizing Map Algorithm to computer aided classification of ASTER multispectral data

    Directory of Open Access Journals (Sweden)

    Ferdinando Giacco

    2008-01-01

    Full Text Available In this paper we employ Kohonen's Self Organizing Map (SOM) as a strategy for an unsupervised analysis of ASTER multispectral (MS) images. In order to obtain an accurate clustering, we introduce as input for the network, in addition to spectral data, some texture measures extracted from IKONOS images, which contribute to the classification of manmade structures. After clustering the SOM outcomes, we associated each cluster with a major land cover and compared them with prior knowledge of the scene analyzed.

  3. Different Apple Varieties Classification Using kNN and MLP Algorithms

    OpenAIRE

    Sabancı, Kadir

    2016-01-01

    In this study, three different apple varieties grown in Karaman province are classified using kNN and MLP algorithms. 90 apples in total, 30 Golden Delicious, 30 Granny Smith and 30 Starking Delicious have been used in the study. DFK 23U445 USB 3.0 (with Fujinon C Mount Lens) industrial camera has been used to capture apple images. 4 size properties (diameter, area, perimeter and fullness) and 3 color properties (red, green, blue) have been decided using image processing techniques through analyzin...
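
    As a rough illustration of the kNN step named in this record, the sketch below classifies a sample by majority vote among its k nearest neighbours in feature space. The feature values in `TOY_APPLES` are hypothetical placeholders, not measurements from the study, and the function name `knn_predict` is an assumption made for this example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (diameter_mm, mean_red, mean_green) triples, NOT data from the study.
TOY_APPLES = [
    ((70.0, 200.0, 190.0), "Golden Delicious"),
    ((72.0, 205.0, 185.0), "Golden Delicious"),
    ((68.0, 90.0, 180.0), "Granny Smith"),
    ((69.0, 95.0, 175.0), "Granny Smith"),
    ((75.0, 180.0, 40.0), "Starking Delicious"),
    ((77.0, 175.0, 45.0), "Starking Delicious"),
]

print(knn_predict(TOY_APPLES, (71.0, 202.0, 188.0), k=3))
```

    In practice the features would usually be scaled to comparable ranges first, since an unscaled Euclidean distance lets the largest-valued feature dominate the vote.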

  4. Decision making in double-pedicled DIEP and SIEA abdominal free flap breast reconstructions: An algorithmic approach and comprehensive classification.

    Directory of Open Access Journals (Sweden)

    Charles M Malata

    2015-10-01

    Full Text Available Introduction: The deep inferior epigastric artery perforator (DIEP) free flap is the gold standard for autologous breast reconstruction. However, using a single vascular pedicle may not yield sufficient tissue in patients with midline scars or insufficient lower abdominal pannus. Double-pedicled free flaps overcome this problem using different vascular arrangements to harvest the entire lower abdominal flap. The literature is, however, sparse regarding technique selection. We therefore reviewed our experience in order to formulate an algorithm and comprehensive classification for this purpose. Methods: All patients undergoing unilateral double-pedicled abdominal perforator free flap breast reconstruction (AFFBR) by a single surgeon (CMM) over 40 months were reviewed from a prospectively collected database. Results: Of the 112 consecutive breast free flaps performed, 25 (22%) utilised two vascular pedicles. The mean patient age was 45 years (range=27-54). All flaps but one (which used the thoracodorsal system) were anastomosed to the internal mammary vessels using the rib-preservation technique. The surgical duration was 656 minutes (range=468-690 mins). The median flap weight was 618g (range=432-1275g) and the mastectomy weight was 445g (range=220-896g). All flaps were successful and only three patients requested minor liposuction to reduce and reshape their reconstructed breasts. Conclusion: Bipedicled free abdominal perforator flaps, employed in a fifth of all our AFFBRs, are a reliable and safe option for unilateral breast reconstruction. They, however, necessitate clear indications to justify the additional technical complexity and surgical duration. Our algorithm and comprehensive classification facilitate technique selection for the anastomotic permutations and successful execution of these operations.

  5. Algorithm for predicting macular dysfunction based on moment invariants classification of the foveal avascular zone in functional retinal images

    Directory of Open Access Journals (Sweden)

    Angélica Moises Arthur

    2017-12-01

    Full Text Available Abstract. Introduction: A new method for segmenting and quantifying the macular area based on morphological alternating sequential filtering (ASF) is proposed. Previous studies show that persons with diabetes present alterations in the foveal avascular zone (FAZ) prior to the appearance of retinopathy. Thus, a proper characterization of FAZ using a method of automatic classification and prediction is a supportive and complementary tool for medical evaluation of the macular region, and may be useful for possible early treatment of eye diseases in persons without diabetic retinopathy. Methods: We obtained high-resolution retinal images using a non-invasive functional imaging system called Retinal Function Imager to generate a series of combined capillary perfusion maps. We sequentially filtered the macular images with ASF to reduce their complexity. Then we segmented the FAZ using the watershed transform from an automatic selection of markers. Using Hu's moment invariants as a descriptor, we can automatically classify and categorize each FAZ. Results: The FAZ differences between non-diabetic volunteers and diabetic subjects were automatically distinguished by the proposed system with an accuracy of 81%. Conclusion: This is an innovative method to classify FAZ using a fully automatic algorithm for segmentation (based on morphological operators) and for classification (based on a descriptor formed by Hu's moments) despite the presence of edema or other structures. This is an alternative tool for eye exams, which may contribute to the analysis and evaluation of FAZ morphology, promoting the prevention of macular impairment in diabetics without retinopathy.
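
    To make the descriptor mentioned in this record more concrete, here is a minimal sketch of the first of Hu's seven moment invariants, computed on a binary mask. It is only an illustration under simplifying assumptions (a non-empty mask stored as a list of 0/1 rows); the helper names `hu_phi1` and `rotate90` are made up for this example and do not come from the paper, which presumably uses the full set of invariants.

```python
def hu_phi1(mask):
    """First Hu moment invariant, eta20 + eta02, of a non-empty binary mask.

    `mask` is a list of rows containing 0/1 values.
    """
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = len(pixels)                                # zeroth moment = area
    cx = sum(x for x, _ in pixels) / m00             # centroid
    cy = sum(y for _, y in pixels) / m00
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)     # central moments
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    # Normalised central moments: eta_pq = mu_pq / m00 ** (1 + (p + q) / 2).
    # For p + q = 2 the divisor is m00 ** 2.
    return (mu20 + mu02) / m00 ** 2


def rotate90(mask):
    """Rotate a mask by 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]


shape = [[1, 1, 1, 1],
         [1, 1, 0, 0]]
print(hu_phi1(shape), hu_phi1(rotate90(shape)))
```

    The two printed values agree up to floating-point rounding, which is the property that makes such invariants usable as a shape descriptor regardless of position or orientation.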

  6. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classification (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007) to classifications of drinking games (e.g. LaBrie et...... al. 2013). The analysis aims at three goals: The classifications’ internal consistency, the abstraction of classification criteria and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors...... into the topic of game classifications....

  7. Unraveling the linguistic nature of specific autobiographical memories using a computerized classification algorithm.

    Science.gov (United States)

    Takano, Keisuke; Ueno, Mayumi; Moriya, Jun; Mori, Masaki; Nishiguchi, Yuki; Raes, Filip

    2017-06-01

    In the present study, we explored the linguistic nature of specific memories generated with the Autobiographical Memory Test (AMT) by developing a computerized classifier that distinguishes between specific and nonspecific memories. The AMT is regarded as one of the most important assessment tools to study memory dysfunctions (e.g., difficulty recalling the specific details of memories) in psychopathology. In Study 1, we utilized the Japanese corpus data of 12,400 cue-recalled memories tagged with observer-rated specificity. We extracted linguistic features of particular relevance to memory specificity, such as past tense, negation, and adverbial words and phrases pertaining to time and location. On the basis of these features, a support vector machine (SVM) was trained to classify the memories into specific and nonspecific categories, which achieved an area under the curve (AUC) of .92 in a performance test. In Study 2, the trained SVM was tested in terms of its robustness in classifying novel memories (n = 8,478) that were retrieved in response to cue words that were different from those used in Study 1. The SVM showed an AUC of .89 in classifying the new memories. In Study 3, we extended the binary SVM to a five-class classification of the AMT, which achieved 64%-65% classification accuracy, against the chance level (20%) in the performance tests. Our data suggest that memory specificity can be identified with a relatively small number of words, capturing the universal linguistic features of memory specificity across memories in diverse contents.

  8. A new avenue for classification and prediction of olive cultivars using supervised and unsupervised algorithms.

    Directory of Open Access Journals (Sweden)

    Amir H Beiki

    Full Text Available Various methods have been used to identify cultivars of olive trees; herein we used different bioinformatics algorithms to propose new tools to classify 10 cultivars of olive based on RAPD and ISSR genetic marker datasets generated from PCR reactions. Five RAPD markers (OPA0a21, OPD16a, OP01a1, OPD16a1 and OPA0a8) and five ISSR markers (UBC841a4, UBC868a7, UBC841a14, U12BC807a and UBC810a13) were selected as the most important markers by all attribute weighting models. K-Medoids unsupervised clustering run on the SVM dataset was fully able to cluster each olive cultivar into the right class. All trees (176) induced by decision tree models generated meaningful trees, and the UBC841a4 attribute clearly distinguished between foreign and domestic olive cultivars with 100% accuracy. Predictive machine learning algorithms (SVM and Naïve Bayes) were also able to predict the right class of olive cultivars with 100% accuracy. For the first time, our results showed that data mining techniques can be effectively used to distinguish between plant cultivars, and that the machine learning based systems proposed in this study can predict new olive cultivars with the best possible accuracy.

  9. Algorithms

    Indian Academy of Sciences (India)

    ticians but also forms the foundation of computer science. Two ... with methods of developing algorithms for solving a variety of problems but ... applications of computers in science and engineer- ... numerical calculus are as important. We will ...

  10. Analysis and Classification of Stride Patterns Associated with Children Development Using Gait Signal Dynamics Parameters and Ensemble Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Meihong Wu

    2016-01-01

    Full Text Available Measuring stride variability and dynamics in children is useful for the quantitative study of gait maturation and neuromotor development in childhood and adolescence. In this paper, we computed the sample entropy (SampEn) and average stride interval (ASI) parameters to quantify the stride series of 50 gender-matched child participants in three age groups. We also normalized the SampEn and ASI values by leg length and body mass for each participant, respectively. Results show that the original and normalized SampEn values decrease consistently and significantly (Mann-Whitney U test, p<0.01) in children aged 3–14 years, which indicates that stride irregularity is significantly ameliorated with body growth. The original and normalized ASI values also change significantly between any two of the groups of young (aged 3–5 years), middle (aged 6–8 years), and elder (aged 10–14 years) children. Such results suggest that healthy children may better modulate their gait cadence rhythm with the development of their musculoskeletal and neurological systems. In addition, the AdaBoost.M2 and Bagging algorithms were used to effectively distinguish the children's gait patterns. These ensemble learning algorithms both provided excellent gait classification results in terms of overall accuracy (≥90%), recall (≥0.8), and precision (≥0.8077).
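
    The sample entropy (SampEn) measure used in this record can be sketched in a few lines. The version below follows the common textbook definition: count pairs of templates of length m and m + 1 whose Chebyshev distance is within a tolerance r, then take the negative log of their ratio. The parameter defaults and the use of an absolute tolerance are assumptions for illustration; the paper's exact settings are not stated in the abstract, and r is often chosen as a fraction of the series' standard deviation.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy of a 1-D series with template length m and absolute tolerance r."""
    n = len(series)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):   # i < j excludes self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)        # matches of length m
    a = count_matches(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined for this series; reported as infinity here
    return -math.log(a / b)


regular = [1, 2] * 5
irregular = [1, 2, 1, 2, 1, 3, 1, 2, 1, 2]
print(sample_entropy(regular), sample_entropy(irregular))
```

    A perfectly periodic series scores zero, and a single disturbance raises the value, which matches the abstract's reading of lower SampEn as lower stride irregularity.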

  11. Classification of the Clinical Images for Benign and Malignant Cutaneous Tumors Using a Deep Learning Algorithm.

    Science.gov (United States)

    Han, Seung Seog; Kim, Myoung Shin; Lim, Woohyung; Park, Gyeong Hun; Park, Ilwoo; Chang, Sung Eun

    2018-02-08

    We tested the use of a deep learning algorithm to classify the clinical images of 12 skin diseases: basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, actinic keratosis, seborrheic keratosis, malignant melanoma, melanocytic nevus, lentigo, pyogenic granuloma, hemangioma, dermatofibroma, and wart. The convolutional neural network (Microsoft ResNet-152 model; Microsoft Research Asia, Beijing, China) was fine-tuned with images from the training portion of the Asan dataset, MED-NODE dataset, and atlas site images (19,398 images in total). The trained model was validated with the testing portion of the Asan, Hallym and Edinburgh datasets. With the Asan dataset, the area under the curve for the diagnosis of basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, and melanoma was 0.96 ± 0.01, 0.83 ± 0.01, 0.82 ± 0.02, and 0.96 ± 0.00, respectively. With the Edinburgh dataset, the area under the curve for the corresponding diseases was 0.90 ± 0.01, 0.91 ± 0.01, 0.83 ± 0.01, and 0.88 ± 0.01, respectively. With the Hallym dataset, the sensitivity for basal cell carcinoma diagnosis was 87.1% ± 6.0%. The tested algorithm performance with 480 Asan and Edinburgh images was comparable to that of 16 dermatologists. To improve the performance of the convolutional neural network, additional images with a broader range of ages and ethnicities should be collected. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  12. Phase Clustering Based Modulation Classification Algorithm for PSK Signal over Wireless Environment

    Directory of Open Access Journals (Sweden)

    Qi An

    2016-01-01

    Full Text Available Prompt and accurate non-data-aided (NDA) identification of signals is one of the key technology demands in noncooperative wireless communication networks, especially in information monitoring and other electronic warfare. Based on this background, this paper proposes a new signal classifier for phase shift keying (PSK) signals. The periodicity of the signal's phase is utilized as the distinguishing feature, with which a fractional function is constituted for phase clustering. Classification and the modulation order of intercepted signals can be obtained through the Fast Fourier Transform (FFT) of the phase clustering function. Frequency offset is also considered for practical conditions. The accuracy of frequency offset estimation has a direct impact on its correction. Thus, a feasible solution is supplied. In this paper, an advanced estimator is proposed for estimating the frequency offset and balancing estimation accuracy and range under low signal-to-noise ratio (SNR) conditions. The influence on estimation range brought by the maximum correlation interval is removed through the differential operation of the autocorrelation of the normalized baseband signal raised to the power of Q. Then, a weighted summation is adopted for an effective frequency estimation. Details of equations and relevant simulations are subsequently presented. The estimator proposed can reach an estimation accuracy of 10^-4 even when the SNR is as low as -15 dB. Analytical formulas are expressed, and the corresponding simulations illustrate that the classifier proposed is more efficient than its counterparts even at low SNRs.

  13. Classification of amyotrophic lateral sclerosis disease based on convolutional neural network and reinforcement sample learning algorithm.

    Science.gov (United States)

    Sengur, Abdulkadir; Akbulut, Yaman; Guo, Yanhui; Bajaj, Varun

    2017-12-01

    Electromyogram (EMG) signals contain useful information about neuromuscular diseases such as amyotrophic lateral sclerosis (ALS). ALS is a well-known brain disease, which can progressively degenerate the motor neurons. In this paper, we propose a deep learning based method for efficient classification of ALS and normal EMG signals. Spectrogram, continuous wavelet transform (CWT), and smoothed pseudo Wigner-Ville distribution (SPWVD) have been employed for time-frequency (T-F) representation of EMG signals. A convolutional neural network (CNN) is employed to classify these features. The CNN architecture comprises two convolution layers, two pooling layers, a fully connected layer and a loss function layer. The CNN architecture is trained with the reinforcement sample learning strategy. The efficiency of the proposed implementation is tested on a publicly available EMG dataset. The dataset contains 89 ALS and 133 normal EMG signals with 24 kHz sampling frequency. Experimental results show 96.80% accuracy. The obtained results are also compared with other methods, which shows the superiority of the proposed method.

  14. Classification of ulnar triangular fibrocartilage complex tears. A treatment algorithm for Palmer type IB tears.

    Science.gov (United States)

    Atzei, A; Luchetti, R; Garagnani, L

    2017-05-01

    The classical definition of 'Palmer Type IB' triangular fibrocartilage complex tear includes a spectrum of clinical conditions. This review highlights the clinical and arthroscopic criteria that enable us to categorize five classes in a treatment-oriented classification system of triangular fibrocartilage complex peripheral tears. Class 1 lesions represent isolated tears of the distal triangular fibrocartilage complex without distal radio-ulnar joint instability and are amenable to arthroscopic suture. Class 2 tears include rupture of both the distal triangular fibrocartilage complex and proximal attachments of the triangular fibrocartilage complex to the fovea. Class 3 tears constitute isolated ruptures of the proximal attachment of the triangular fibrocartilage complex to the fovea; they are not visible at radio-carpal arthroscopy. Both Class 2 and Class 3 tears are diagnosed with a positive hook test and are typically associated with distal radio-ulnar joint instability. If required, treatment is through reattachment of the distal radio-ulnar ligament insertions to the fovea. Class 4 lesions are irreparable tears due to the size of the defect or to poor tissue quality and, if required, treatment is through distal radio-ulnar ligament reconstruction with tendon graft. Class 5 tears are associated with distal radio-ulnar joint arthritis and can only be treated with salvage procedures. This subdivision of type IB triangular fibrocartilage complex tear provides more insight into the pathomechanics and treatment strategies. II.

  15. Comparative analysis of classification based algorithms for diabetes diagnosis using iris images.

    Science.gov (United States)

    Samant, Piyush; Agarwal, Ravinder

    2018-01-01

    Photo-diagnosis has always been an intriguing area for researchers; with the advancement of image processing and computer vision techniques, it has become more reliable and popular in recent years. The objective of this paper is to study the change in the features of the iris, particularly irregularities in the pigmentation of certain areas of the iris, with respect to the diabetic health of an individual. Whereas iris recognition concentrates on the overall structure of the iris, diagnostic techniques emphasise the local variations in a particular area of the iris. Image pre-processing techniques have been applied to extract the iris, and thereafter the region of interest has been cropped out of the extracted iris. In order to observe the changes in the tissue pigmentation of the region of interest, statistical, textural and wavelet features have been extracted. Finally, a comparison of the accuracies of five different classifiers in classifying two subject groups, diabetic and non-diabetic, is presented. The best classification accuracy, 89.66%, was obtained by the random forest classifier. The results show the effectiveness and diagnostic significance of the proposed methodology. The presented work offers a novel systemic perspective on non-invasive and automatic diabetic diagnosis.

  16. Application of classification algorithms for analysis of road safety risk factor dependencies.

    Science.gov (United States)

    Kwon, Oh Hoon; Rhee, Wonjong; Yoon, Yoonjin

    2015-02-01

    Transportation continues to be an integral part of modern life, and the importance of road traffic safety cannot be overstated. Consequently, recent road traffic safety studies have focused on analysis of risk factors that impact fatality and injury level (severity) of traffic accidents. While some of the risk factors, such as drug use and drinking, are widely known to affect severity, an accurate modeling of their influences is still an open research topic. Furthermore, there are innumerable risk factors that are waiting to be discovered or analyzed. A promising approach is to investigate historical traffic accident data that have been collected in the past decades. This study inspects traffic accident reports that have been accumulated by the California Highway Patrol (CHP) since 1973 for which each accident report contains around 100 data fields. Among them, we investigate 25 fields between 2004 and 2010 that are most relevant to car accidents. Using two classification methods, the Naive Bayes classifier and the decision tree classifier, the relative importance of the data fields, i.e., risk factors, is revealed with respect to the resulting severity level. Performances of the classifiers are compared to each other and a binary logistic regression model is used as the basis for the comparisons. Some of the high-ranking risk factors are found to be strongly dependent on each other, and their incremental gains on estimating or modeling severity level are evaluated quantitatively. The analysis shows that only a handful of the risk factors in the data dominate the severity level and that dependency among the top risk factors is an imperative trait to consider for an accurate analysis. Copyright © 2014 Elsevier Ltd. All rights reserved.
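
    As a rough sketch of how a Naive Bayes classifier of the kind named in this record relates categorical risk factors to a severity label, the example below trains on a handful of rows and scores a new one with Laplace (add-one) smoothing. The two fields (alcohol involved, lighting) and all rows are hypothetical toy values, not CHP data, and the function names are assumptions for this illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Collect the counts a categorical Naive Bayes classifier needs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (field index, label) -> value counts
    field_values = defaultdict(set)       # field index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            field_values[i].add(value)
    return class_counts, value_counts, field_values


def predict_nb(model, row):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, value_counts, field_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)                      # log prior
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(field_values[i])
            score += math.log(numerator / denominator)       # log likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy rows: (alcohol involved, lighting). NOT taken from the study.
ROWS = [("yes", "night"), ("yes", "day"), ("no", "day"),
        ("no", "day"), ("no", "night"), ("yes", "night")]
LABELS = ["severe", "severe", "minor", "minor", "minor", "severe"]

model = train_nb(ROWS, LABELS)
print(predict_nb(model, ("yes", "night")))
```

    The classifier treats fields as conditionally independent given the label, which is exactly the assumption the abstract flags as problematic when top-ranked risk factors depend strongly on each other.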

  17. Application of a kernel-based online learning algorithm to the classification of nodule candidates in computer-aided detection of CT lung nodules

    International Nuclear Information System (INIS)

    Matsumoto, S.; Ohno, Y.; Takenaka, D.; Sugimura, K.; Yamagata, H.

    2007-01-01

    Classification of the nodule candidates in computer-aided detection (CAD) of lung nodules in CT images was addressed by constructing a nonlinear discriminant function using a kernel-based learning algorithm called the kernel recursive least-squares (KRLS) algorithm. Using the nodule candidates derived from the processing by a CAD scheme of 100 CT datasets containing 253 non-calcified nodules of 3 mm or larger, as determined by the consensus of two thoracic radiologists, the following trial was carried out 100 times: by randomly selecting 50 datasets for training, a nonlinear discriminant function was obtained using the nodule candidates in the training datasets and tested with the remaining candidates; for comparison, a rule-based classification was tested in a similar manner. At about 5 false positives per case, the nonlinear classification method showed an improved sensitivity of 80% (mean over the 100 trials) compared with 74% for the rule-based method. (orig.)

  18. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC) model is used to improve classification accuracy, convergence speed to maximum accuracy, and maximum bitrates in a brain computer interface (BCI) system based on extracting EEG-P300 signals. First, the EEG signal is filtered in order to eliminate high frequency noise. Then, the parameters of the filtered EEG signal are extracted using the LPC model. Finally, the samples are reconstructed from the LPC coefficients, and two classifiers, (a) Bayesian linear discriminant analysis (BLDA) and (b) the ν-support vector machine (ν-SVM), are applied for classification. The performance of the proposed algorithm is compared with Fisher linear discriminant analysis (FLDA). Results show that our algorithm is considerably better at improving classification accuracy and convergence speed to maximum accuracy. For example, with the 8-electrode configuration for subject S1, the proposed algorithms (BLDA with the LPC model and ν-SVM with the LPC model) improve the total classification accuracy by 9.4% and 1.7%, respectively. Also, for subject 7, the LPC+BLDA and LPC+ν-SVM algorithms converged to maximum accuracy after the 11th block, whereas the FLDA algorithm did not converge to maximum accuracy (with the same configuration). So, it can be used as a promising tool in designing BCI systems.

  19. Algorithms

    Indian Academy of Sciences (India)

    algorithm design technique called 'divide-and-conquer'. One of ... Turtle graphics, September. 1996. 5. ... whole list named 'PO' is a pointer to the first element of the list; ..... Program for computing matrices X and Y and placing the result in C *).

  20. Algorithms

    Indian Academy of Sciences (India)

    algorithm that it is implicitly understood that we know how to generate the next natural ..... Explicit comparisons are made in line (1) where maximum and minimum is ... It can be shown that the function T(n) = 3/2n -2 is the solution to the above ...
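
    The closed form quoted in this snippet, T(n) = 3n/2 - 2, is the comparison count for finding the maximum and minimum of n items by divide and conquer when n is a power of two, with recurrence T(n) = 2T(n/2) + 2 and T(2) = 1. The sketch below counts comparisons explicitly so the formula can be checked; it is an illustrative reconstruction, not the program from the article, and the name `min_max` is an assumption.

```python
def min_max(values):
    """Return (minimum, maximum, comparisons) for a non-empty list."""
    n = len(values)
    if n == 1:
        return values[0], values[0], 0
    if n == 2:
        a, b = values
        return (a, b, 1) if a < b else (b, a, 1)     # one comparison
    mid = n // 2
    lo1, hi1, c1 = min_max(values[:mid])
    lo2, hi2, c2 = min_max(values[mid:])
    # Combining the halves costs two comparisons: one for min, one for max.
    return min(lo1, lo2), max(hi1, hi2), c1 + c2 + 2


lowest, highest, comparisons = min_max([5, 3, 8, 1, 9, 2, 7, 4])
print(lowest, highest, comparisons)   # n = 8, so 3 * 8 / 2 - 2 = 10 comparisons
```

    A naive scan that tracks minimum and maximum separately needs 2(n - 1) comparisons, so the divide-and-conquer version saves roughly a quarter of the work.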

  1. Improved binary dragonfly optimization algorithm and wavelet packet based non-linear features for infant cry classification.

    Science.gov (United States)

    Hariharan, M; Sindhu, R; Vijean, Vikneswaran; Yazid, Haniza; Nadarajaw, Thiyagar; Yaacob, Sazali; Polat, Kemal

    2018-03-01

    The infant cry signal carries several levels of information about the reason for crying (hunger, pain, sleepiness and discomfort) or the pathological status (asphyxia, deaf, jaundice, premature condition and autism, etc.) of an infant and is therefore suited for early diagnosis. In this work, a combination of wavelet packet based features and an Improved Binary Dragonfly Optimization based feature selection method was proposed to classify the different types of infant cry signals. Cry signals from 2 different databases were utilized. The first database contains 507 cry samples of normal (N), 340 cry samples of asphyxia (A), 879 cry samples of deaf (D), 350 cry samples of hungry (H) and 192 cry samples of pain (P). The second database contains 513 cry samples of jaundice (J), 531 samples of premature (Prem) and 45 samples of normal (N). Wavelet packet transform based energy and non-linear entropies (496 features), Linear Predictive Coding (LPC) based cepstral features (56 features), and Mel-frequency Cepstral Coefficients (MFCCs) (16 features) were extracted. The combined feature set consists of 568 features. To overcome the curse of dimensionality issue, an improved binary dragonfly optimization algorithm (IBDFO) was proposed to select the most salient attributes or features. Finally, an Extreme Learning Machine (ELM) kernel classifier was used to classify the different types of infant cry signals using all the features and the highly informative features as well. Several experiments of two-class and multi-class classification of cry signals were conducted. In binary or two-class experiments, maximum accuracy of 90.18% for H Vs P, 100% for A Vs N, 100% for D Vs N and 97.61% for J Vs Prem was achieved using the features selected (only 204 features out of 568) by IBDFO. For the classification of multiple cry signals (multi-class problem), the selected features could differentiate between three classes (N, A & D) with the accuracy of 100% and seven classes with the accuracy of 97.62%. The experimental

  2. Classification Technique of Interviewer-Bot Result using Naïve Bayes and Phrase Reinforcement Algorithms

    Directory of Open Access Journals (Sweden)

    Moechammad Sarosa

    2018-02-01

    Students with hectic college schedules tend not to have enough time to review the course material. Meanwhile, after they graduate, to be accepted at a foreign company with a higher salary, they must be ready for an English-language interview. To meet this need, they try to practice conversing with someone who is proficient in English. However, it is not easy to find someone who is not only proficient in English but also understands job-interview topics. This paper presents the development of a machine that provides practice for English-language interviews, specifically job interviews. The interviewer machine (interviewer bot) is expected to help students practice speaking English on the particular topic of finding a suitable job. The interviewer machine design uses words from a chat bot database named ALICE to mimic human intelligence that can be applied to a search engine using AIML. The Naïve Bayes algorithm is used to classify the interview results into three categories: POTENTIAL, TALENT and INTEREST students. Furthermore, based on the classification result, a summary is made at the end of the interview session using phrase reinforcement algorithms. By using this bot, students are expected to practice their listening and speaking skills and to become familiar with the questions often asked in job interviews so that they can prepare proper answers. In addition, the bot's users can learn their potential, talent and interest in finding a job, so they can apply to appropriate companies. Based on validation with 50 respondents, the response accuracy of the interviewer chat-bot (interviewer engine) was 86.93%.

  3. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

  4. Algorithms

    Indian Academy of Sciences (India)

    will become clear in the next article when we discuss a simple Logo-like programming language. ... Rod B may be used as an auxiliary store. The problem is to find an algorithm which performs this task. ... No disks are moved from A to B using C as auxiliary rod. • move_disk (A, C);. (No + 1)th disk is moved from A to C directly ...
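
    The fragments above (rods A, B and C, an auxiliary rod, and a direct move of the largest disk from A to C) match the classic Towers of Hanoi recursion. The following is a minimal Python sketch of that recursion, not the article's own listing: the function name `hanoi` and its argument order are assumptions made for illustration.

```python
def hanoi(n, source, target, auxiliary, moves=None):
    """Return the list of (from_rod, to_rod) moves that transfers n disks.

    Hypothetical helper: the name and signature are not taken from the article.
    """
    if moves is None:
        moves = []
    if n == 0:
        return moves
    # Move the top n-1 disks out of the way onto the auxiliary rod.
    hanoi(n - 1, source, auxiliary, target, moves)
    # Move the largest remaining disk directly from source to target.
    moves.append((source, target))
    # Move the n-1 disks from the auxiliary rod onto the target rod.
    hanoi(n - 1, auxiliary, target, source, moves)
    return moves


moves = hanoi(3, "A", "C", "B")
print(len(moves))  # 7, i.e. 2**3 - 1
```

    Each call with n disks makes two recursive calls with n-1 disks plus one direct move, which gives 2**n - 1 moves in total.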

  5. Emotion Recognition of Weblog Sentences Based on an Ensemble Algorithm of Multi-label Classification and Word Emotions

    Science.gov (United States)

    Li, Ji; Ren, Fuji

    Weblogs have greatly changed the way people communicate. Affective analysis of blog posts is valuable for many applications such as text-to-speech synthesis or computer-assisted recommendation. Traditional emotion recognition in text based on single-label classification cannot satisfy the higher requirements of affective computing. In this paper, the automatic identification of sentence emotion in weblogs is modeled as a multi-label text categorization task. Experiments are carried out on 12273 blog sentences from the Chinese emotion corpus Ren_CECps with 8-dimension emotion annotation. An ensemble algorithm, RAKEL, is used to recognize dominant emotions from the writer's perspective. Our emotion feature, which uses a detailed intensity representation for word emotions, outperforms the other main features such as the word frequency feature and the traditional lexicon-based feature. In order to deal with relatively complex sentences, we integrate grammatical characteristics of punctuation, disjunctive connectives, modification relations and negation into the features. This achieves 13.51% and 12.49% increases in Micro-averaged F1 and Macro-averaged F1, respectively, compared to the traditional lexicon-based feature. The results show that multiple-dimension emotion representation with grammatical features can efficiently classify sentence emotion in a multi-label problem.
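
    The abstract reports both Micro-averaged and Macro-averaged F1 for a multi-label task. As a rough illustration of how the two averages differ, here is a small Python sketch; the label names and the toy sentences are hypothetical and are not taken from Ren_CECps.

```python
def f1(tp, fp, fn):
    """F1 score from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred, labels):
    """y_true and y_pred are lists of label sets, one set per sentence."""
    counts = {lab: [0, 0, 0] for lab in labels}  # [tp, fp, fn] per label
    for true_set, pred_set in zip(y_true, y_pred):
        for lab in labels:
            if lab in pred_set and lab in true_set:
                counts[lab][0] += 1
            elif lab in pred_set:
                counts[lab][1] += 1
            elif lab in true_set:
                counts[lab][2] += 1
    # Macro: average the per-label F1 scores, so every label weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn), macro


# Hypothetical toy data: three emotion labels, four sentences.
labels = ["joy", "anger", "sorrow"]
y_true = [{"joy"}, {"joy", "anger"}, {"sorrow"}, {"joy"}]
y_pred = [{"joy"}, {"joy"}, {"joy"}, {"joy", "anger"}]
micro, macro = micro_macro_f1(y_true, y_pred, labels)
print(round(micro, 3), round(macro, 3))  # 0.6 0.286
```

    In this toy case the frequent label is predicted well and the rare ones are not, so the micro average is much higher than the macro average.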

  6. A Classification Detection Algorithm Based on Joint Entropy Vector against Application-Layer DDoS Attack

    Directory of Open Access Journals (Sweden)

    Yuntao Zhao

    2018-01-01

    The application-layer distributed denial of service (AL-DDoS) attack poses a great threat to cyberspace security. Attack detection is an important part of security protection, providing effective support for the defense system through the rapid and accurate identification of attacks. According to how the attacker chooses URLs of the Web service, AL-DDoS attacks are divided into three categories: random URL attacks, fixed URL attacks and traverse URL attacks. To identify attacks, a mapping matrix of the joint entropy vector is constructed. By defining and computing the values of EUPI and jEIPU, a visual coordinate discrimination diagram of the entropy vector is proposed, which also reduces the data dimension from N to two. Based on boundary discrimination and the region in which the entropy vectors fall, the class of AL-DDoS attack can be distinguished. Through the study of a training data set and classification, the results show that the novel algorithm can effectively distinguish web server DDoS attacks from normal burst traffic.
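
    The abstract does not define EUPI or jEIPU, so the sketch below does not attempt to reproduce them. It only shows, under that caveat, how plain Shannon entropy and joint entropy can be computed over a request log; the `(ip, url)` log format and the sample requests are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical request log: (source IP, requested URL) pairs.
requests = [
    ("10.0.0.1", "/a"),
    ("10.0.0.1", "/b"),
    ("10.0.0.2", "/a"),
    ("10.0.0.2", "/b"),
]

url_entropy = entropy(url for _, url in requests)  # spread over URLs
ip_entropy = entropy(ip for ip, _ in requests)     # spread over source IPs
joint_entropy = entropy(requests)                  # spread over (IP, URL) pairs
print(url_entropy, ip_entropy, joint_entropy)      # 1.0 1.0 2.0
```

    Intuitively, a fixed-URL flood drives the URL entropy toward zero while a random-URL flood drives it up, which is the kind of separation an entropy-based discrimination diagram relies on.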

  7. Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface.

    Science.gov (United States)

    Siuly; Li, Yan; Paul Wen, Peng

    2014-03-01

    Motor imagery (MI) tasks classification provides an important basis for designing brain-computer interface (BCI) systems. If the MI tasks are reliably distinguished through identifying typical patterns in electroencephalography (EEG) data, people with motor disabilities could communicate with a device by composing sequences of these mental states. In our earlier study, we developed a cross-correlation based logistic regression (CC-LR) algorithm for the classification of MI tasks for BCI applications, but its performance was not satisfactory. This study develops a modified version of the CC-LR algorithm, exploring a suitable feature set that can improve the performance. The modified CC-LR algorithm uses the C3 electrode channel (in the international 10-20 system) as a reference channel for the cross-correlation (CC) technique and applies three diverse feature sets separately as the input to the logistic regression (LR) classifier. The present algorithm investigates which feature set best characterizes the distribution of MI-task-based EEG data. This study also provides insight into how to select a reference channel for the CC technique with EEG signals, considering the anatomical structure of the human brain. The proposed algorithm is compared with eight of the most recently reported well-known methods, including the BCI III Winner algorithm. The findings of this study indicate that the modified CC-LR algorithm has the potential to improve the identification performance of MI tasks in BCI systems. The results demonstrate that the proposed technique provides a classification improvement over the existing methods tested. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  8. Hybrid Optimization of Object-Based Classification in High-Resolution Images Using Continous ANT Colony Algorithm with Emphasis on Building Detection

    Science.gov (United States)

    Tamimi, E.; Ebadi, H.; Kiani, A.

    2017-09-01

    Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Because HSR images have a limited number of spectral bands, using additional features can improve accuracy. However, adding these features increases the probability that dependent features are present, which reduces accuracy. In addition, some parameters must be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine the classification parameters and select independent features according to image type. An optimization algorithm is an efficient way to solve this problem. On the other hand, pixel-based classification faces several challenges, such as producing salt-and-pepper results and high computational time on high-dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying a continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are a relatively high automation level, independence from image scene and type, reduced post-processing for building edge reconstruction, and improved accuracy. The proposed method was compared with pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the proposed method improved the quality factor and overall accuracy by 17% and 10%, respectively. The Kappa coefficient of the proposed method was also 6% higher than that of RF classification. The processing time of the proposed method was relatively low because the unit of image analysis is the image object. These results show the superiority of the proposed method in terms of time and accuracy.

  9. HYBRID OPTIMIZATION OF OBJECT-BASED CLASSIFICATION IN HIGH-RESOLUTION IMAGES USING CONTINOUS ANT COLONY ALGORITHM WITH EMPHASIS ON BUILDING DETECTION

    Directory of Open Access Journals (Sweden)

    E. Tamimi

    2017-09-01

    Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Because HSR images have a limited number of spectral bands, using additional features can improve accuracy. However, adding these features increases the probability that dependent features are present, which reduces accuracy. In addition, some parameters must be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine the classification parameters and select independent features according to image type. An optimization algorithm is an efficient way to solve this problem. On the other hand, pixel-based classification faces several challenges, such as producing salt-and-pepper results and high computational time on high-dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying a continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are a relatively high automation level, independence from image scene and type, reduced post-processing for building edge reconstruction, and improved accuracy. The proposed method was compared with pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the proposed method improved the quality factor and overall accuracy by 17% and 10%, respectively. The Kappa coefficient of the proposed method was also 6% higher than that of RF classification. The processing time of the proposed method was relatively low because the unit of image analysis is the image object. These results show the superiority of the proposed method in terms of time and accuracy.

  10. Effectiveness of Partition and Graph Theoretic Clustering Algorithms for Multiple Source Partial Discharge Pattern Classification Using Probabilistic Neural Network and Its Adaptive Version: A Critique Based on Experimental Studies

    Directory of Open Access Journals (Sweden)

    S. Venkatesh

    2012-01-01

    Partial discharge (PD) is a major cause of failure of power apparatus and hence its measurement and analysis have emerged as a vital field in assessing the condition of the insulation system. Several efforts have been undertaken by researchers to classify PD pulses utilizing artificial intelligence techniques. Recently, the focus has shifted to the identification of multiple sources of PD since it is often encountered in real-time measurements. Studies have indicated that classification of multi-source PD becomes difficult with the degree of overlap and that several techniques such as mixed Weibull functions, neural networks, and wavelet transformation have been attempted with limited success. Since digital PD acquisition systems record data for a substantial period, the database becomes large, posing considerable difficulties during classification. This research work aims firstly at analyzing aspects concerning classification capability during the discrimination of multisource PD patterns. Secondly, it attempts to extend the previous work of the authors in utilizing the novel approach of probabilistic neural network versions for classifying moderate sets of PD sources to that of large sets. The third focus is on comparing the ability of partition-based algorithms, namely, the labelled (learning vector quantization) and unlabelled (K-means) versions, with that of a novel hypergraph-based clustering method in providing parsimonious sets of centers during classification.

  11. ENTERPRISE RESTRUCTURING AIM AND TYPES

    Directory of Open Access Journals (Sweden)

    S. P. Baranenko

    2011-01-01

    Enterprise restructuring aims to adapt an enterprise to market conditions and improve its competitiveness by selecting the most effective model for using material, technical, technological, organizational, commercial, economic, financial, tax-related and other resources with due account of demand. The paper presents the classification criteria and types of restructuring, as well as restructuring aims specific to industrial enterprises.

  12. Decoding the encoding of functional brain networks: An fMRI classification comparison of non-negative matrix factorization (NMF), independent component analysis (ICA), and sparse coding algorithms.

    Science.gov (United States)

    Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E

    2017-04-15

    Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (pcoding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (pcoding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    Directory of Open Access Journals (Sweden)

    Li Zhen

    2008-05-01

    Background: Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-toxicology data sets and use this model to evaluate the relative performance of different machine learning (ML) methods. Results: The classification performance of Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Support Vector Machines (SVM) in the presence and absence of filter-based feature selection was analyzed using K-way cross-validation testing and independent validation on simulated in vitro assay data sets with varying levels of model complexity, number of irrelevant features and measurement noise. While the prediction accuracy of all ML methods decreased as non-causal (irrelevant) features were added, some ML methods performed better than others. In the limit of using a large number of features, ANN and SVM were always in the top performing set of methods while RPART and KNN (k = 5) were always in the poorest performing set. The addition of measurement noise and irrelevant features decreased the classification accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA performance is especially sensitive to the use of feature selection. Filter-based feature selection generally improved performance, most strikingly for LDA. Conclusion: We have developed a novel simulation model to evaluate machine learning methods for the

  14. An algorithm for the classification of mRNA patterns in eosinophilic esophagitis: Integration of machine learning.

    Science.gov (United States)

    Sallis, Benjamin F; Erkert, Lena; Moñino-Romero, Sherezade; Acar, Utkucan; Wu, Rina; Konnikova, Liza; Lexmond, Willem S; Hamilton, Matthew J; Dunn, W Augustine; Szepfalusi, Zsolt; Vanderhoof, Jon A; Snapper, Scott B; Turner, Jerrold R; Goldsmith, Jeffrey D; Spencer, Lisa A; Nurko, Samuel; Fiebiger, Edda

    2018-04-01

    Diagnostic evaluation of eosinophilic esophagitis (EoE) remains difficult, particularly the assessment of the patient's allergic status. This study sought to establish an automated medical algorithm to assist in the evaluation of EoE. Machine learning techniques were used to establish a diagnostic probability score for EoE, p(EoE), based on esophageal mRNA transcript patterns from biopsies of patients with EoE, gastroesophageal reflux disease and controls. Dimensionality reduction in the training set established weighted factors, which were confirmed by immunohistochemistry. Following weighted factor analysis, p(EoE) was determined by random forest classification. Accuracy was tested in an external test set, and predictive power was assessed with equivocal patients. Esophageal IgE production was quantified with epsilon germ line (IGHE) transcripts and correlated with serum IgE and the Th2-type mRNA profile to establish an IGHE score for tissue allergy. In the primary analysis, a 3-class statistical model generated a p(EoE) score based on common characteristics of the inflammatory EoE profile. A p(EoE) ≥ 25 successfully identified EoE with high accuracy (sensitivity: 90.9%, specificity: 93.2%, area under the curve: 0.985) and improved diagnosis of equivocal cases by 84.6%. The p(EoE) changed in response to therapy. A secondary analysis loop in EoE patients defined an IGHE score of ≥37.5 for a patient subpopulation with increased esophageal allergic inflammation. The development of intelligent data analysis from a machine learning perspective provides exciting opportunities to improve diagnostic precision and improve patient care in EoE. The p(EoE) and the IGHE score are steps toward the development of decision trees to define EoE subpopulations and, consequently, will facilitate individualized therapy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  15. Prediction models discriminating between nonlocomotive and locomotive activities in children using a triaxial accelerometer with a gravity-removal physical activity classification algorithm.

    Science.gov (United States)

    Hikihara, Yuki; Tanaka, Chiaki; Oshima, Yoshitake; Ohkawara, Kazunori; Ishikawa-Takata, Kazuko; Tanaka, Shigeho

    2014-01-01

    The aims of our study were to examine whether a gravity-removal physical activity classification algorithm (GRPACA) is applicable for discrimination between nonlocomotive and locomotive activities for various physical activities (PAs) of children and to prove that this approach improves the estimation accuracy of a prediction model for children using an accelerometer. Japanese children (42 boys and 26 girls) attending primary school were invited to participate in this study. We used a triaxial accelerometer with a sampling frequency of 32 Hz and a measurement range of ±6 G. Participants were asked to perform 6 nonlocomotive and 5 locomotive activities. We measured raw synthetic acceleration with the triaxial accelerometer and monitored oxygen consumption and carbon dioxide production during each activity with the Douglas bag method. In addition, the resting metabolic rate (RMR) was measured with the subject sitting on a chair to calculate metabolic equivalents (METs). When the ratio of unfiltered synthetic acceleration (USA) to filtered synthetic acceleration (FSA) was 1.12, the rate of correct discrimination between nonlocomotive and locomotive activities was excellent, at 99.1% on average. As a result, a strong linear relationship was found for both nonlocomotive (METs = 0.013 × synthetic acceleration + 1.220, R² = 0.772) and locomotive (METs = 0.005 × synthetic acceleration + 0.944, R² = 0.880) activities, except for climbing down and up. The mean differences between the values predicted by our model and measured METs were -0.50 to 0.23 for moderate to vigorous intensity (>3.5 METs) PAs like running, ball throwing and washing the floor, which were regarded as unpredictable PAs. In addition, the difference was within 0.25 METs for sedentary to mild moderate PAs (model that discriminates between nonlocomotive and locomotive activities for children can be useful to evaluate the sedentary to vigorous PAs intensity of both nonlocomotive and
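
    The two regression equations and the 1.12 ratio threshold quoted in the abstract can be combined into a small two-step predictor. The Python sketch below is only an illustration under stated assumptions: the abstract does not say which side of the threshold corresponds to which activity type (the code assumes a ratio above the threshold means nonlocomotive), nor which acceleration signal or unit feeds the regressions, and the function names are hypothetical.

```python
RATIO_THRESHOLD = 1.12  # USA/FSA ratio reported in the abstract


def classify_activity(usa, fsa, threshold=RATIO_THRESHOLD):
    """Label an activity from unfiltered (USA) and filtered (FSA) acceleration.

    ASSUMPTION: a ratio above the threshold is treated as nonlocomotive.
    The abstract does not state the direction of the comparison.
    """
    return "nonlocomotive" if usa / fsa > threshold else "locomotive"


def predict_mets(synthetic_acceleration, activity_type):
    """Apply the linear equations quoted in the abstract."""
    if activity_type == "nonlocomotive":
        return 0.013 * synthetic_acceleration + 1.220
    if activity_type == "locomotive":
        return 0.005 * synthetic_acceleration + 0.944
    raise ValueError(f"unknown activity type: {activity_type!r}")


# Hypothetical readings in the accelerometer's own units.
kind = classify_activity(usa=260.0, fsa=200.0)  # ratio 1.3
print(kind, round(predict_mets(200.0, kind), 3))
```

    For the same acceleration value the nonlocomotive equation gives a higher METs estimate than the locomotive one, which is why the discrimination step matters before the regression is applied.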

  16. Comparison of machine learning and semi-quantification algorithms for (I123)FP-CIT classification: the beginning of the end for semi-quantification?

    Science.gov (United States)

    Taylor, Jonathan Christopher; Fenner, John Wesley

    2017-11-29

    Semi-quantification methods are well established in the clinic for assisted reporting of (I123) Ioflupane images. Arguably, these are limited diagnostic tools. Recent research has demonstrated the potential for improved classification performance offered by machine learning algorithms. A direct comparison between methods is required to establish whether a move towards widespread clinical adoption of machine learning algorithms is justified. This study compared three machine learning algorithms with a range of semi-quantification methods, using the Parkinson's Progression Markers Initiative (PPMI) research database and a locally derived clinical database for validation. Machine learning algorithms were based on support vector machine classifiers with three different sets of features: (1) voxel intensities; (2) principal components of image voxel intensities; (3) striatal binding ratios from the putamen and caudate. Semi-quantification methods were based on striatal binding ratios (SBRs) from both putamina, with and without consideration of the caudates. Normal limits for the SBRs were defined through four different methods: (1) minimum of age-matched controls; (2) mean minus 1/1.5/2 standard deviations from age-matched controls; (3) linear regression of normal patient data against age (minus 1/1.5/2 standard errors); (4) selection of the optimum operating point on the receiver operator characteristic curve from normal and abnormal training data. Each machine learning and semi-quantification technique was evaluated with stratified, nested 10-fold cross-validation, repeated 10 times. The mean accuracy of the semi-quantitative methods for classification of local data into Parkinsonian and non-Parkinsonian groups varied from 0.78 to 0.87, contrasting with 0.89 to 0.95 for classifying PPMI data into healthy controls and Parkinson's disease groups. The machine learning algorithms gave mean accuracies between 0.88 and 0.92 and between 0.95 and 0.97 for local and PPMI data respectively. Classification
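
    Two of the normal-limit definitions listed in the abstract (minimum of controls, and mean minus a multiple of the standard deviation) are simple enough to sketch. The Python below is an illustration only: the control SBR values are invented, the function names are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import statistics


def limit_minimum(control_sbrs):
    """Normal limit defined as the minimum SBR among the controls."""
    return min(control_sbrs)


def limit_mean_minus_sd(control_sbrs, k):
    """Normal limit defined as mean minus k standard deviations.

    ASSUMPTION: sample standard deviation (n - 1 denominator).
    """
    return statistics.mean(control_sbrs) - k * statistics.stdev(control_sbrs)


def is_abnormal(sbr, limit):
    """Flag a scan whose SBR falls below the chosen normal limit."""
    return sbr < limit


# Invented SBRs for age-matched controls (not data from the study).
controls = [2.0, 2.5, 3.0, 2.5, 2.0, 3.0]
for k in (1, 1.5, 2):
    print(k, round(limit_mean_minus_sd(controls, k), 3))
print(is_abnormal(1.5, limit_mean_minus_sd(controls, 2)))
```

    A larger k lowers the limit, so fewer scans are flagged as abnormal; this is the trade-off between sensitivity and specificity that the different cut-offs explore.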

  17. Laser Raman detection of platelets for early and differential diagnosis of Alzheimer’s disease based on an adaptive Gaussian process classification algorithm

    International Nuclear Information System (INIS)

    Luo, Yusheng; Du, Z W; Yang, Y J; Chen, P; Wang, X H; Cheng, Y; Peng, J; Shen, A G; Hu, J M; Tian, Q; Shang, X L; Liu, Z C; Yao, X Q; Wang, J Z

    2013-01-01

    Early and differential diagnosis of Alzheimer’s disease (AD) has puzzled many clinicians. In this work, laser Raman spectroscopy (LRS) was developed to diagnose AD from platelet samples from AD transgenic mice and non-transgenic controls of different ages. An adaptive Gaussian process (GP) classification algorithm was used to re-establish the classification models of early AD, advanced AD and the control group with just two features and the capacity for noise reduction. Compared with the previous multilayer perceptron network method, the GP showed much better classification performance with the same feature set. Besides, spectra of platelets isolated from AD and Parkinson’s disease (PD) mice were also discriminated. Spectral data from 4 month AD (n = 39) and 12 month AD (n = 104) platelets, as well as control data (n = 135), were collected. Prospective application of the algorithm to the data set resulted in a sensitivity of 80%, a specificity of about 100% and a Matthews correlation coefficient of 0.81. Samples from PD (n = 120) platelets were also collected for differentiation from 12 month AD. The results suggest that platelet LRS detection analysis with the GP appears to be an easier and more accurate method than current ones for early and differential diagnosis of AD.
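
    Sensitivity, specificity and the Matthews correlation coefficient (MCC) all derive from the same confusion matrix. The Python sketch below shows the standard formulas; the counts in the example are hypothetical and are not the study's confusion matrix, which the abstract does not give.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec, mcc = confusion_metrics(tp=80, fp=0, tn=100, fn=20)
print(sens, spec, round(mcc, 4))  # 0.8 1.0 0.8165
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the classes are of unequal size.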

  18. Knowledge discovery from patients' behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services.

    Science.gov (United States)

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction and ultimately maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information. Data mining techniques can be used to analyze these data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public sector hospital in Iran, with the idea that patient loyalty differs from customer loyalty, to estimate the customer lifetime value (CLV) of each patient. We used the Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and identify target, potential and loyal customers in order to strengthen CRM. Two approaches are used for classification: first, the result of clustering is used as the decision attribute in the classification process; second, the result of segmentation based on the CLV of patients (estimated by RFML) is used as the decision attribute. Finally, the results of the CHAID algorithm show the significant hidden rules and identify existing patterns of hospital consumers.

  19. An Evaluation of Different Training Sample Allocation Schemes for Discrete and Continuous Land Cover Classification Using Decision Tree-Based Algorithms

    Directory of Open Access Journals (Sweden)

    René Roland Colditz

    2015-07-01

    Land cover mapping for large regions often employs satellite images of medium to coarse spatial resolution, which complicates mapping of discrete classes. Class memberships, which estimate the proportion of each class for every pixel, have been suggested as an alternative. This paper compares different strategies of training data allocation for discrete and continuous land cover mapping using classification and regression tree algorithms. In addition to measures of discrete and continuous map accuracy, correct estimation of the area is another important criterion. A subset of the 30 m national land cover dataset of 2006 (NLCD2006) of the United States was used as the reference set to classify NADIR BRDF-adjusted surface reflectance time series of MODIS at 900 m spatial resolution. Results show that sampling of heterogeneous pixels and sample allocation according to the expected area of each class is best for classification trees. Regression trees for continuous land cover mapping should be trained with random allocation, and predictions should be normalized with a linear scaling function to correctly estimate the total area. Among the tested algorithms, random forest classification yields lower errors than boosted trees of C5.0, and Cubist shows higher accuracies than random forest regression.

  20. Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services

    Science.gov (United States)

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and ultimately maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information. Data mining techniques can be used to analyze these data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public sector hospital in Iran, with the idea that there is a contrast between patient and customer loyalty, to estimate customer lifetime value (CLV) for each patient. We used Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and identify target, potential and loyal customers in order to strengthen CRM. Two approaches are used for classification: first, the result of clustering is used as the decision attribute in the classification process; second, the result of segmentation based on the CLV of patients (estimated by RFML) is used as the decision attribute. Finally, the results of the CHAID algorithm reveal significant hidden rules and identify existing patterns among hospital consumers. PMID:27610177

  1. Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification.

    Science.gov (United States)

    Li, Jinyan; Fong, Simon; Sung, Yunsick; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2016-01-01

    An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Although the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class. In this paper, we use a novel class-balancing method named adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique (ASCB_DmSMOTE) to solve this imbalanced dataset problem, which is common in biomedical applications. The proposed method combines under-sampling and over-sampling into a swarm optimisation algorithm. It adaptively selects suitable parameters for the rebalancing algorithm to find the best solution. Compared with the other versions of the SMOTE algorithm, significant improvements, which include higher accuracy and credibility, are observed with ASCB_DmSMOTE. Our proposed method tactfully combines two rebalancing techniques. It reasonably re-allocates the majority class in the details and dynamically optimises the two parameters of SMOTE to synthesise a reasonable scale of minority class for each clustered sub-imbalanced dataset. The proposed method ultimately outperforms other conventional methods and attains higher credibility with even greater accuracy of the classification model.
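
    For orientation, the basic SMOTE interpolation that the method above builds on can be sketched as follows. This is plain SMOTE only, not the ASCB_DmSMOTE method of the abstract: there is no clustering, under-sampling or swarm optimisation. The function name `smote_samples` and its parameters `n_new`, `k` and `seed` are hypothetical, and the two parameters SMOTE exposes here (amount of oversampling and number of neighbours) are fixed by hand rather than optimised.

```python
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by linear interpolation.

    minority: list of feature vectors (lists of floats), at least two points.
    k: number of nearest minority neighbours considered for each base point.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(smote_samples(minority_class, n_new=3))
```

    Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region already covered by the minority class.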

  2. Content-based and algorithmic classifications of journals: perspectives on the dynamics of scientific communication and indexer effects

    NARCIS (Netherlands)

    Rafols, I.; Leydesdorff, L.; Larsen, B.; Leta, J.

    2009-01-01

    The aggregated journal-journal citation matrix—based on the Journal Citation Reports (JCR) of the Science Citation Index—can be decomposed by indexers and/or algorithmically. In this study, we test the results of two recently available algorithms for the decomposition of large matrices against two

  3. Content-based and algorithmic classifications of journals: Perspectives on the dynamics of scientific communication and indexer effects

    NARCIS (Netherlands)

    Rafols, I; Leydesdorff, L.

    2009-01-01

    The aggregated journal-journal citation matrix—based on the Journal Citation Reports (JCR) of the Science Citation Index—can be decomposed by indexers or algorithmically. In this study, we test the results of two recently available algorithms for the decomposition of large matrices against two

  4. A search algorithm to meta-optimize the parameters for an extended Kalman filter to improve classification on hyper-temporal images

    CSIR Research Space (South Africa)

    Salmon

    2012-07-01

    Full Text Available IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22-27 July 2012. A search algorithm to meta-optimize the parameters for an extended Kalman filter to improve classification on hyper-temporal images. B.P. Salmon, ...

  5. Normed kernel function-based fuzzy possibilistic C-means (NKFPCM) algorithm for high-dimensional breast cancer database classification with feature selection is based on Laplacian Score

    Science.gov (United States)

    Lestari, A. W.; Rustam, Z.

    2017-07-01

    In the last decade, breast cancer has become the focus of world attention as this disease is one of the leading causes of death for women. Therefore, it is necessary to have the correct precautions and treatment. In previous studies, the Fuzzy Kernel K-Medoid algorithm has been used for multi-class data. This paper proposes an algorithm to classify high-dimensional breast cancer data using Fuzzy Possibilistic C-means (FPCM) and a new method based on clustering analysis using Normed Kernel Function-Based Fuzzy Possibilistic C-Means (NKFPCM). The objective of this paper is to obtain the best accuracy in classification of breast cancer data. In order to improve the accuracy of the two methods, the candidate features are evaluated using feature selection, where the Laplacian Score is used. The results compare the accuracy and running time of FPCM and NKFPCM with and without feature selection.

  6. Simultaneous data pre-processing and SVM classification model selection based on a parallel genetic algorithm applied to spectroscopic data of olive oils.

    Science.gov (United States)

    Devos, Olivier; Downey, Gerard; Duponchel, Ludovic

    2014-04-01

    Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
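
    McNemar's test, used above to compare classifiers on the same samples, can be sketched in a few lines. This is a generic illustration and not the authors' code: the function name `mcnemar` is hypothetical, the chi-square form with continuity correction is assumed, and the p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x / 2)).

```python
import math


def mcnemar(y_true, pred_a, pred_b):
    """McNemar chi-square statistic (continuity-corrected) and p-value.

    Only discordant samples matter: those that exactly one of the two
    classifiers labels correctly.
    """
    a_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    b_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    discordant = a_only + b_only
    if discordant == 0:
        return 0.0, 1.0  # the classifiers never disagree on correctness
    stat = max(abs(a_only - b_only) - 1, 0) ** 2 / discordant
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square upper tail, 1 d.o.f.
    return stat, p_value


# Toy labels: classifier A is right on all ten samples, B on only two.
truth = [1] * 10
model_a = [1] * 10
model_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(mcnemar(truth, model_a, model_b))
```

    The chi-square approximation is usually considered unreliable when the number of discordant samples is small; an exact binomial version is the common alternative in that case.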

  7. Leukemia and colon tumor detection based on microarray data classification using momentum backpropagation and genetic algorithm as a feature selection method

    Science.gov (United States)

    Wisesty, Untari N.; Warastri, Riris S.; Puspitasari, Shinta Y.

    2018-03-01

    Cancer is one of the major causes of morbidity and mortality worldwide. There is therefore a need for a system that can analyze and identify whether a person is suffering from cancer by using microarray data derived from the patient’s deoxyribonucleic acid (DNA). However, microarray data have thousands of attributes, which makes data processing challenging. This is often referred to as the curse of dimensionality. Therefore, in this study a system was built that is capable of detecting whether or not a patient has cancer. The algorithms used are a Genetic Algorithm for feature selection and a Momentum Backpropagation Neural Network as the classification method, with data taken from the Kent Ridge Bio-medical Dataset. Based on system testing, the system can detect leukemia and colon tumor with a best accuracy of 98.33% for colon tumor data and 100% for leukemia data. The Genetic Algorithm as a feature selection algorithm improves system accuracy, from 64.52% to 98.33% for colon tumor data and from 65.28% to 100% for leukemia data, and the use of a momentum parameter accelerates convergence in the training process of the neural network.
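
    The role of the momentum parameter can be shown with a minimal sketch of the classical momentum update applied to a single weight. This is not the network from the study: the function name `momentum_descent`, the learning rate, the momentum value and the toy quadratic loss are all assumptions chosen for illustration.

```python
def momentum_descent(grad, w0, lr=0.1, momentum=0.9, steps=300):
    """Gradient descent with classical momentum on one scalar weight.

    Each step blends the previous update (velocity) with the new gradient:
        velocity = momentum * velocity - lr * grad(w)
        w        = w + velocity
    """
    w, velocity = w0, 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w


# Toy loss (w - 3)^2 with gradient 2 * (w - 3); the minimum is at w = 3.
print(momentum_descent(lambda w: 2.0 * (w - 3.0), w0=0.0))
```

    In backpropagation the same update is applied to every weight, with the gradient supplied by the backward pass.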

  8. Paper 5: Surveillance of multiple congenital anomalies: implementation of a computer algorithm in European registers for classification of cases

    DEFF Research Database (Denmark)

    Garne, Ester; Dolk, Helen; Loane, Maria

    2011-01-01

    Surveillance of multiple congenital anomalies is considered to be more sensitive for the detection of new teratogens than surveillance of all or isolated congenital anomalies. Current literature proposes the manual review of all cases for classification into isolated or multiple congenital anomal...

  9. Evaluating the statistical performance of less applied algorithms in classification of worldview-3 imagery data in an urbanized landscape

    Science.gov (United States)

    Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa

    2018-03-01

    In the recent decade, analyzing remotely sensed imagery has become one of the most common and widely used procedures in environmental studies, and supervised image classification techniques play a central role in it. Hence, taking a high-resolution Worldview-3 image over a mixed urbanized landscape in Iran, three less applied image classification methods (Bagged CART, stochastic gradient boosting model and neural network with feature extraction) were tested and compared with two prevalent methods: random forest and support vector machine with linear kernel. To do so, each method was run ten times and three validation techniques were used to estimate the accuracy statistics: cross validation, independent validation and validation with the full training data. Moreover, ANOVA and the Tukey test were used to assess the statistical significance of differences between the classification methods. In general, the results showed that random forest, with a marginal difference compared to Bagged CART and the stochastic gradient boosting model, is the best performing method, whilst based on independent validation there was no significant difference between the performances of the classification methods. Finally, it should be noted that the neural network with feature extraction and the linear support vector machine had better processing speed than the others.

  10. An algorithm for identification and classification of individuals with type 1 and type 2 diabetes mellitus in a large primary care database

    Directory of Open Access Journals (Sweden)

    Sharma M

    2016-10-01

    Full Text Available Manuj Sharma,1 Irene Petersen,1,2 Irwin Nazareth,1 Sonia J Coton,1 1Department of Primary Care and Population Health, University College London, London, UK; 2Department of Clinical Epidemiology, Aarhus University, Aarhus, Denmark Background: Research into diabetes mellitus (DM) often requires a reproducible method for identifying and distinguishing individuals with type 1 DM (T1DM) and type 2 DM (T2DM). Objectives: To develop a method to identify individuals with T1DM and T2DM using UK primary care electronic health records. Methods: Using data from The Health Improvement Network primary care database, we developed a two-step algorithm. The first algorithm step identified individuals with potential T1DM or T2DM based on diagnostic records, treatment, and clinical test results. We excluded individuals with records for rarer DM subtypes only. For individuals to be considered diabetic, they needed to have at least two records indicative of DM, one of which was required to be a diagnostic record. We then classified individuals with T1DM and T2DM using the second algorithm step. A combination of diagnostic codes, medication prescribed, age at diagnosis, and whether the case was incident or prevalent were used in this process. We internally validated this classification algorithm through comparison against an independent clinical examination of The Health Improvement Network electronic health records for a random sample of 500 DM individuals. Results: Out of 9,161,866 individuals aged 0–99 years from 2000 to 2014, we classified 37,693 individuals with T1DM and 418,433 with T2DM, while 1,792 individuals remained unclassified. A small proportion were classified with some uncertainty (1,155 [3.1%] of all individuals with T1DM and 6,139 [1.5%] with T2DM) due to unclear health records. During validation, manual assignment of DM type based on clinical assessment of the entire electronic record and algorithmic assignment led to equivalent classification
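
    The inclusion rule of the first algorithm step (at least two records indicative of DM, one of them diagnostic) can be written as a short predicate. This is only a sketch of that one sentence: the record-type labels and the function name `is_potential_dm` are hypothetical, and the exclusion of rarer DM subtypes and the whole second classification step are not modelled.

```python
# Hypothetical record-type labels; the real database uses coded entries.
INDICATIVE_TYPES = {"diagnosis", "treatment", "test"}


def is_potential_dm(record_types):
    """Return True if a patient meets the step-one inclusion rule.

    record_types: list of record-type labels found for one patient.
    Rule from the abstract: at least two records indicative of DM,
    one of which must be a diagnostic record.
    """
    indicative = [r for r in record_types if r in INDICATIVE_TYPES]
    return len(indicative) >= 2 and "diagnosis" in indicative


print(is_potential_dm(["diagnosis", "treatment"]))   # meets the rule
print(is_potential_dm(["treatment", "test"]))        # no diagnostic record
```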

  11. Artificial Mangrove Species Mapping Using Pléiades-1: An Evaluation of Pixel-Based and Object-Based Classifications with Selected Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Dezhi Wang

    2018-02-01

    Full Text Available In the dwindling natural mangrove today, mangrove reforestation projects are conducted worldwide to prevent further losses. Due to monoculture and the low survival rate of artificial mangroves, it is necessary to pay attention to mapping and monitoring them dynamically. Remote sensing techniques have been widely used to map mangrove forests due to their capacity for large-scale, accurate, efficient, and repetitive monitoring. This study evaluated the capability of a 0.5-m Pléiades-1 in classifying artificial mangrove species using both pixel-based and object-based classification schemes. For comparison, three machine learning algorithms, decision tree (DT), support vector machine (SVM), and random forest (RF), were used as the classifiers in the pixel-based and object-based classification procedure. The results showed that both the pixel-based and object-based approaches could recognize the major discriminations between the four major artificial mangrove species. However, the object-based method had a better overall accuracy than the pixel-based method on average. For pixel-based image analysis, SVM produced the highest overall accuracy (79.63%); for object-based image analysis, RF could achieve the highest overall accuracy (82.40%), and it was also the best machine learning algorithm for classifying artificial mangroves. The patches produced by object-based image analysis approaches presented a more generalized appearance and could contiguously depict mangrove species communities. When the same machine learning algorithms were compared by McNemar’s test, a statistically significant difference in overall classification accuracy between the pixel-based and object-based classifications only existed in the RF algorithm. Regarding species, monoculture and dominant mangrove species Sonneratia apetala group 1 (SA1) as well as partly mixed and regular shape mangrove species Hibiscus tiliaceus (HT) could well be identified. However, for complex and easily

  12. CIN classification and prediction using machine learning methods

    Science.gov (United States)

    Chirkina, Anastasia; Medvedeva, Marina; Komotskiy, Evgeny

    2017-06-01

    The aim of this paper is to compare existing classification algorithms with different parameters and to select those that allow solving the problem of primary diagnosis of cervical intraepithelial neoplasia (CIN), as it characterizes the condition of the body in the precancerous stage. The paper describes a feature selection process, as well as the selection of the best models for multiclass classification.

  13. Impact of Reducing Polarimetric SAR Input on the Uncertainty of Crop Classifications Based on the Random Forests Algorithm

    DEFF Research Database (Denmark)

    Loosvelt, Lien; Peters, Jan; Skriver, Henning

    2012-01-01

    Although the use of multidate polarimetric synthetic aperture radar (SAR) data for highly accurate land cover classification has been acknowledged in the literature, the high dimensionality of the data set remains a major issue. This study presents two different strategies to reduce the number...... acquired by the Danish EMISAR on four dates within the period April to July in 1998. The predictive capacity of each feature is analyzed by the importance score generated by random forests (RF). Results show that according to the variation in importance score over time, a distinction can be made between...... general and specific features for crop classification. Based on the importance ranking, features are gradually removed from the single-date data sets in order to construct several multidate data sets with decreasing dimensionality. In the accuracy-oriented and efficiency-oriented reduction, the input...

  14. Partial imputation to improve predictive modelling in insurance risk classification using a hybrid positive selection algorithm and correlation-based feature selection

    CSIR Research Space (South Africa)

    Duma, M

    2013-09-01

    Full Text Available of missing data, with a decline in performance as the amount of missing data increases. Wagner et al.18 presented a study aimed at constructing a multimodal, ensemble of classifiers for emotion recognition with missing values in one or multiple... classification accuracies of 55%, which includes certain generic fusion schemes and emotion adapted strategies like arousal, valence and cross-axis. There are four kinds of missing data mechanisms found in the literature, namely missing at random (MAR), miss...

  15. COMBINATION OF GENETIC ALGORITHM AND DEMPSTER-SHAFER THEORY OF EVIDENCE FOR LAND COVER CLASSIFICATION USING INTEGRATION OF SAR AND OPTICAL SATELLITE IMAGERY

    Directory of Open Access Journals (Sweden)

    H. T. Chu

    2012-07-01

    Full Text Available The integration of different kinds of remotely sensed data, in particular Synthetic Aperture Radar (SAR) and optical satellite imagery, is considered a promising approach for land cover classification because of the complementary properties of each data source. However, the challenges are how to fully exploit the capabilities of these multiple data sources, which combined datasets should be used, and which data processing and classification techniques are most appropriate in order to achieve the best results. In this paper an approach that makes synergistic use of feature selection (FS) with a Genetic Algorithm (GA) and multiple-classifier combination based on the Dempster-Shafer Theory of Evidence is proposed and evaluated for classifying land cover features in New South Wales, Australia. Multi-date SAR data, including ALOS/PALSAR and ENVISAT/ASAR, and optical (Landsat 5 TM+) images were used for this study. Textural information was also derived and integrated with the original images. Various combined datasets were generated for classification. Three classifiers, namely Artificial Neural Network (ANN), Support Vector Machines (SVMs) and Self-Organizing Map (SOM), were employed. Firstly, feature selection using GA was applied for each classifier and dataset to determine the optimal input features and parameters. Then the results of the three classifiers on particular datasets were combined using the Dempster-Shafer Theory of Evidence. Results of this study demonstrate the advantages of the proposed method for land cover mapping using complex datasets. It is revealed that the use of GA in conjunction with the Dempster-Shafer Theory of Evidence can significantly improve the classification accuracy. Furthermore, integration of SAR and optical data often outperforms single-type datasets.
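
    Dempster's rule of combination, the core of the classifier fusion described above, can be sketched for two sources of evidence. This is a generic textbook form and not the authors' implementation: the function name `dempster_combine`, the class labels and the mass values are hypothetical, and how the study derives masses from ANN, SVM and SOM outputs is not covered.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozenset of class labels -> mass, each summing to 1.
    Mass falling on empty intersections is conflict and is normalised away.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            common = set_a & set_b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


WATER = frozenset({"water"})
FOREST = frozenset({"forest"})
EITHER = WATER | FOREST  # ignorance: mass not committed to a single class

# Hypothetical per-pixel evidence from two classifiers.
classifier_1 = {WATER: 0.6, EITHER: 0.4}
classifier_2 = {WATER: 0.5, FOREST: 0.3, EITHER: 0.2}
print(dempster_combine(classifier_1, classifier_2))
```

    A third classifier can be folded in by combining its mass function with the result, since the rule is associative.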

  16. Statistics-based optimization of the polarimetric radar hydrometeor classification algorithm and its application for a squall line in South China

    Science.gov (United States)

    Wu, Chong; Liu, Liping; Wei, Ming; Xi, Baozhu; Yu, Minghui

    2018-03-01

    A modified hydrometeor classification algorithm (HCA) is developed in this study for Chinese polarimetric radars. This algorithm is based on the U.S. operational HCA. Meanwhile, the methodology of statistics-based optimization is proposed including calibration checking, datasets selection, membership functions modification, computation thresholds modification, and effect verification. Zhuhai radar, the first operational polarimetric radar in South China, applies these procedures. The systematic bias of calibration is corrected, the reliability of radar measurements deteriorates when the signal-to-noise ratio is low, and correlation coefficient within the melting layer is usually lower than that of the U.S. WSR-88D radar. Through modification based on statistical analysis of polarimetric variables, the localized HCA especially for Zhuhai is obtained, and it performs well over a one-month test through comparison with sounding and surface observations. The algorithm is then utilized for analysis of a squall line process on 11 May 2014 and is found to provide reasonable details with respect to horizontal and vertical structures, and the HCA results—especially in the mixed rain-hail region—can reflect the life cycle of the squall line. In addition, the kinematic and microphysical processes of cloud evolution and the differences between radar-detected hail and surface observations are also analyzed. The results of this study provide evidence for the improvement of this HCA developed specifically for China.

  17. Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

    DEFF Research Database (Denmark)

    Marinakis, Yannis; Dounias, Georgios; Jantzen, Jan

    2009-01-01

    The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify t...... other previously applied intelligent approaches....

  18. Dual-energy cone-beam CT with a flat-panel detector: Effect of reconstruction algorithm on material classification

    International Nuclear Information System (INIS)

    Zbijewski, W.; Gang, G. J.; Xu, J.; Wang, A. S.; Stayman, J. W.; Taguchi, K.; Carrino, J. A.; Siewerdsen, J. H.

    2014-01-01

    Purpose: Cone-beam CT (CBCT) with a flat-panel detector (FPD) is finding application in areas such as breast and musculoskeletal imaging, where dual-energy (DE) capabilities offer potential benefit. The authors investigate the accuracy of material classification in DE CBCT using filtered backprojection (FBP) and penalized likelihood (PL) reconstruction and optimize contrast-enhanced DE CBCT of the joints as a function of dose, material concentration, and detail size. Methods: Phantoms consisting of a 15 cm diameter water cylinder with solid calcium inserts (50–200 mg/ml, 3–28.4 mm diameter) and solid iodine inserts (2–10 mg/ml, 3–28.4 mm diameter), as well as a cadaveric knee with intra-articular injection of iodine were imaged on a CBCT bench with a Varian 4343 FPD. The low energy (LE) beam was 70 kVp (+0.2 mm Cu), and the high energy (HE) beam was 120 kVp (+0.2 mm Cu, +0.5 mm Ag). Total dose (LE+HE) was varied from 3.1 to 15.6 mGy with equal dose allocation. Image-based DE classification involved a nearest distance classifier in the space of LE versus HE attenuation values. Recognizing the differences in noise between LE and HE beams, the LE and HE data were differentially filtered (in FBP) or regularized (in PL). Both a quadratic (PLQ) and a total-variation penalty (PLTV) were investigated for PL. The performance of DE CBCT material discrimination was quantified in terms of voxelwise specificity, sensitivity, and accuracy. Results: Noise in the HE image was primarily responsible for classification errors within the contrast inserts, whereas noise in the LE image mainly influenced classification in the surrounding water. For inserts of diameter 28.4 mm, DE CBCT reconstructions were optimized to maximize the total combined accuracy across the range of calcium and iodine concentrations, yielding values of ∼88% for FBP and PLQ, and ∼95% for PLTV at 3.1 mGy total dose, increasing to ∼95% for FBP and PLQ, and ∼98% for PLTV at 15.6 mGy total dose. For a
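
    The nearest distance classifier in the space of LE versus HE attenuation values can be sketched as a nearest-reference lookup in two dimensions. The reference attenuation pairs below are made-up placeholders, not values from the study, and the function name `classify_voxel` is hypothetical; the differential filtering and regularization of the LE and HE data are not modelled.

```python
import math

# Hypothetical (LE, HE) reference attenuation pairs for each material.
REFERENCES = {
    "water": (0.20, 0.18),
    "iodine": (0.30, 0.22),
    "calcium": (0.35, 0.28),
}


def classify_voxel(le, he, references=REFERENCES):
    """Assign a voxel to the material whose reference point is closest.

    le, he: the voxel's attenuation in the low- and high-energy images.
    Distance is Euclidean in the (LE, HE) plane.
    """
    return min(
        references,
        key=lambda name: math.hypot(le - references[name][0], he - references[name][1]),
    )


print(classify_voxel(0.21, 0.18))
```

    Voxelwise sensitivity, specificity and accuracy then follow from comparing these labels against the known insert locations.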

  19. Use of Pattern Classification Algorithms to Interpret Passive and Active Data Streams from a Walking-Speed Robotic Sensor Platform

    Science.gov (United States)

    Dieckman, Eric Allen

    In order to perform useful tasks for us, robots must have the ability to notice, recognize, and respond to objects and events in their environment. This requires the acquisition and synthesis of information from a variety of sensors. Here we investigate the performance of a number of sensor modalities in an unstructured outdoor environment, including the Microsoft Kinect, thermal infrared camera, and coffee can radar. Special attention is given to acoustic echolocation measurements of approaching vehicles, where an acoustic parametric array propagates an audible signal to the oncoming target and the Kinect microphone array records the reflected backscattered signal. Although useful information about the target is hidden inside the noisy time domain measurements, the Dynamic Wavelet Fingerprint process (DWFP) is used to create a time-frequency representation of the data. A small-dimensional feature vector is created for each measurement using an intelligent feature selection process for use in statistical pattern classification routines. Using our experimentally measured data from real vehicles at 50 m, this process is able to correctly classify vehicles into one of five classes with 94% accuracy. Fully three-dimensional simulations allow us to study the nonlinear beam propagation and interaction with real-world targets to improve classification results.

  20. [Magnetic resonance semiotics of prostate cancer according to the PI-RADS classification. The clinical diagnostic algorithm of a study].

    Science.gov (United States)

    Korobkin, A S; Shariya, M A; Chaban, A S; Voskanvan, G A; Vinarov, A Z

    2015-01-01

    to elaborate the magnetic resonance imaging (MRI) signs of prostate cancer (PC) in accordance with the PI-RADS classification during multiparametric MRI (mpMRI). A total of 89 men aged 20 to 82 years were examined. A control group consisted of 8 (9%) healthy volunteers younger than 30 years of age with no urological history to obtain control images and MRI plots and 20 (22.5%) men aged 26-76 years, whose morphological changes were inflammatory and hyperplastic. The second age-matched group included 61 (68.5%) patients diagnosed with prostate cancer at morphological examination. A set of studies included digital rectal examination, serum prostate-specific antigen, and transrectal ultrasound-guided prostate biopsy. All the patients underwent prostate mpMRI applying a 3.0 T Achieva MRI scanner (Philips, the Netherlands). The patients have been found to have mpMRI signs that were typical of PC; its MRI semiotics according to the PI-RADS classification is presented. Each mpMRI procedure has been determined to be of importance and informative value in detecting PC. The comprehensive mpMRI approach to diagnosing PC improves the quality and diagnostic value of prostate MRI.

  1. Classification of bladder cancer cell lines using Raman spectroscopy: a comparison of excitation wavelength, sample substrate and statistical algorithms

    Science.gov (United States)

    Kerr, Laura T.; Adams, Aine; O'Dea, Shirley; Domijan, Katarina; Cullen, Ivor; Hennelly, Bryan M.

    2014-05-01

    Raman microspectroscopy can be applied to the urinary bladder for highly accurate classification and diagnosis of bladder cancer. This technique can be applied in vitro to bladder epithelial cells obtained from urine cytology or in vivo as an "optical biopsy" to provide results in real time with higher sensitivity and specificity than current clinical methods. However, there exists a high degree of variability across experimental parameters which need to be standardised before this technique can be utilized in an everyday clinical environment. In this study, we investigate different laser wavelengths (473 nm and 532 nm), sample substrates (glass, fused silica and calcium fluoride) and multivariate statistical methods in order to gain insight into how these various experimental parameters impact on the sensitivity and specificity of Raman cytology.

  2. Automated classifications of topography from DEMs by an unsupervised nested-means algorithm and a three-part geometric signature

    Science.gov (United States)

    Iwahashi, J.; Pike, R.J.

    2007-01-01

    An iterative procedure that implements the classification of continuous topography as a problem in digital image-processing automatically divides an area into categories of surface form; three taxonomic criteria (slope gradient, local convexity, and surface texture) are calculated from a square-grid digital elevation model (DEM). The sequence of programmed operations combines twofold-partitioned maps of the three variables converted to greyscale images, using the mean of each variable as the dividing threshold. To subdivide increasingly subtle topography, grid cells sloping at less than mean gradient of the input DEM are classified by designating mean values of successively lower-sloping subsets of the study area (nested means) as taxonomic thresholds, thereby increasing the number of output categories from the minimum 8 to 12 or 16. Program output is exemplified by 16 topographic types for the world at 1-km spatial resolution (SRTM30 data), the Japanese Islands at 270 m, and part of Hokkaido at 55 m. Because the procedure is unsupervised and reflects frequency distributions of the input variables rather than pre-set criteria, the resulting classes are undefined and must be calibrated empirically by subsequent analysis. Maps of the example classifications reflect physiographic regions, geological structure, and landform as well as slope materials and processes; fine-textured terrain categories tend to correlate with erosional topography or older surfaces, coarse-textured classes with areas of little dissection. In Japan the resulting classes approximate landform types mapped from airphoto analysis, while in the Americas they create map patterns resembling Hammond's terrain types or surface-form classes; SRTM30 output for the United States compares favorably with Fenneman's physical divisions. Experiments are suggested for further developing the method; the Arc/Info AML and the map of terrain classes for the world are available as online downloads. © 2006 Elsevier
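
    The nested-means idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' Arc/Info AML: the function name, the flat per-cell lists, and the choice to recompute the convexity and texture thresholds on each lower-sloping subset are assumptions made for the example. With `levels` set to 1, 2, or 3 the sketch yields at most 8, 12, or 16 classes.

```python
from statistics import mean


def nested_means_classify(slope, convexity, texture, levels=1):
    """Label each cell with an integer class in range(4 * (levels + 1)).

    Hypothetical sketch: at every level the cells that are still unlabelled
    are split at the mean slope of that subset. The steeper half gets its
    final label; the gentler half is passed on to the next level.
    """
    labels = [None] * len(slope)
    remaining = list(range(len(slope)))  # indices of cells not yet labelled
    for level in range(levels):
        if not remaining:
            break
        # Thresholds are the means of the current subset (assumption).
        s_mean = mean(slope[i] for i in remaining)
        c_mean = mean(convexity[i] for i in remaining)
        t_mean = mean(texture[i] for i in remaining)
        last = level == levels - 1
        gentler = []
        for i in remaining:
            steep = slope[i] > s_mean
            if steep or last:
                group = level if steep else level + 1
                # Two binary splits (convexity, texture) give 4 classes per group.
                labels[i] = (4 * group
                             + 2 * int(convexity[i] > c_mean)
                             + int(texture[i] > t_mean))
            else:
                gentler.append(i)
        remaining = gentler
    return labels


# Toy four-cell "DEM"; the values are invented for illustration only.
slope = [10, 8, 4, 2]
convexity = [1, 0, 1, 0]
texture = [0, 1, 0, 1]
eight_class = nested_means_classify(slope, convexity, texture, levels=1)
twelve_class = nested_means_classify(slope, convexity, texture, levels=2)
```

    Because every threshold is a mean of the data being classified, the labels carry no fixed physical meaning, which matches the abstract's remark that the classes must be calibrated empirically afterwards.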

  3. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    2016-01-01

    Full Text Available Among non-small cell lung cancer (NSCLC) cases, adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to an NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example, LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. SAMGSR can therefore indeed be used as a feature selection algorithm. Additionally, we applied SAMGSR to the AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. The small overlap between the two resulting gene signatures illustrates that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.

  4. The classification of hunger behaviour of Lates Calcarifer through the integration of image processing technique and k-Nearest Neighbour learning algorithm

    Science.gov (United States)

    Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.

    2018-04-01

    Fish hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide accurate feeding routines (under-feeding or over-feeding) may lead to the death of the fish and consequently reduce the quantity of fish produced. Moreover, excess food that is not consumed by the fish dissolves in the water and accordingly reduces the water quality by lowering the oxygen content. This problem can also lead to the death of the fish or even spur fish diseases. In the present study, a correlation of Barramundi fish-school behaviour with hunger condition is established through the hybrid data integration of image processing techniques. The behaviour is clustered with respect to the position of the school as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified through the k-Nearest Neighbour (k-NN) learning algorithm. Three different variations of the algorithm, namely cosine, cubic and weighted, are assessed on their ability to classify the aforementioned fish hunger behaviour. It was found from the study that the weighted k-NN variation provides the best classification, with an accuracy of 86.5%. Therefore, it could be concluded that the proposed integration technique may assist fish farmers in ascertaining the fish feeding routine.
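
    A rough sketch of the three k-NN variations named above is given below. The abstract does not define them, so the sketch assumes a common convention: "cosine" uses cosine distance, "cubic" uses a Minkowski distance with exponent 3, and "weighted" uses Euclidean distance with inverse-squared-distance vote weights. The feature vectors and labels are invented for illustration and are not the study's data.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def minkowski_distance(a, b, p):
    """Minkowski distance; p=2 is Euclidean, p=3 is the 'cubic' variant."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train, query, k, distance, weighted=False):
    """Vote among the k nearest training points.

    train is a list of (feature_tuple, label) pairs. With weighted=True each
    neighbour votes with weight 1 / d^2 instead of 1.
    """
    neighbours = sorted((distance(x, query), label) for x, label in train)[:k]
    votes = defaultdict(float)
    for d, label in neighbours:
        votes[label] += 1.0 / (d * d + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)


def euclidean(a, b):
    return minkowski_distance(a, b, 2)


def cubic(a, b):
    return minkowski_distance(a, b, 3)


# Hypothetical (school position, school density) features.
train = [((1.0, 0.0), "hungry"), ((0.9, 0.1), "hungry"),
         ((0.0, 1.0), "satiated"), ((0.1, 0.9), "satiated"),
         ((0.2, 1.0), "satiated")]
query = (0.8, 0.2)
by_cosine = knn_predict(train, query, 3, cosine_distance)
by_cubic = knn_predict(train, query, 3, cubic)
by_weighted = knn_predict(train, query, 3, euclidean, weighted=True)
```

    Distance weighting matters most when a single very close neighbour is outvoted by several distant ones, which plain majority voting cannot correct.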

  5. Positive Predictive Values of International Classification of Diseases, 10th Revision Coding Algorithms to Identify Patients With Autosomal Dominant Polycystic Kidney Disease

    Directory of Open Access Journals (Sweden)

    Vinusha Kalatharan

    2016-12-01

    Full Text Available Background: International Classification of Diseases, 10th Revision (ICD-10) codes for autosomal dominant polycystic kidney disease (ADPKD) are used within several administrative health care databases. It is unknown whether these codes identify patients who meet strict clinical criteria for ADPKD. Objective: The objective of this study is (1) to determine whether different ICD-10 coding algorithms identify adult patients who meet strict clinical criteria for ADPKD as assessed through medical chart review and (2) to assess the number of patients identified with different ADPKD coding algorithms in Ontario. Design: Validation study of health care database codes, and prevalence. Setting: Ontario, Canada. Patients: For the chart review, 201 adult patients with hospital encounters between April 1, 2002, and March 31, 2014, assigned either ICD-10 code Q61.2 or Q61.3. Measurements: This study measured the positive predictive value of the ICD-10 coding algorithms and the number of Ontarians identified with different coding algorithms. Methods: We manually reviewed a random sample of medical charts in London, Ontario, Canada, and determined whether or not ADPKD was present according to strict clinical criteria. Results: The presence of either ICD-10 code Q61.2 or Q61.3 in a hospital encounter had a positive predictive value of 85% (95% confidence interval [CI], 79%-89%) and identified 2981 Ontarians (0.02% of the Ontario adult population). The presence of ICD-10 code Q61.2 in a hospital encounter had a positive predictive value of 97% (95% CI, 86%-100%) and identified 394 adults in Ontario (0.003% of the Ontario adult population). Limitations: (1) We could not calculate other measures of validity; (2) the coding algorithms do not identify patients without hospital encounters; and (3) coding practices may differ between hospitals. Conclusions: Most patients with ICD-10 code Q61.2 or Q61.3 assigned during their hospital encounters have ADPKD according to the clinical
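
    The positive predictive value reported here is the share of code-flagged patients whose chart review confirms the disease. The sketch below shows that calculation with a Wilson score interval. The abstract does not say which interval method the authors used, and the counts in the example are hypothetical, not the study's data.

```python
import math


def ppv_with_wilson_ci(confirmed, flagged, z=1.96):
    """Return (ppv, lower, upper) for confirmed true positives out of flagged.

    Uses the Wilson score interval; z=1.96 corresponds to roughly 95% coverage.
    """
    p = confirmed / flagged
    denom = 1.0 + z * z / flagged
    centre = (p + z * z / (2.0 * flagged)) / denom
    half = z * math.sqrt(p * (1.0 - p) / flagged
                         + z * z / (4.0 * flagged * flagged)) / denom
    return p, centre - half, centre + half


# Hypothetical example: 85 of 100 flagged charts confirmed on review.
ppv, lower, upper = ppv_with_wilson_ci(85, 100)
```

    A PPV alone says nothing about patients the codes miss, which is why the abstract lists the absence of other validity measures as a limitation.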

  6. Spectral matching techniques (SMTs) and automated cropland classification algorithms (ACCAs) for mapping croplands of Australia using MODIS 250-m time-series (2000–2015) data

    Science.gov (United States)

    Teluguntla, Pardhasaradhi G.; Thenkabail, Prasad S.; Xiong, Jun N.; Gumma, Murali Krishna; Congalton, Russell G.; Oliphant, Adam; Poehnelt, Justin; Yadav, Kamini; Rao, Mahesh N.; Massey, Richard

    2017-01-01

    Mapping croplands, including fallow areas, is an important measure to determine the quantity of food that is produced, where it is produced, and when it is produced (e.g. seasonality). Furthermore, croplands are known as water guzzlers, consuming anywhere between 70% and 90% of all human water use globally. Given these facts and the increase in global population to nearly 10 billion by the year 2050, the need for routine, rapid, and automated cropland mapping year-after-year and/or season-after-season is of great importance. The overarching goal of this study was to generate standard and routine cropland products, year-after-year, over very large areas through the use of two novel methods: (a) quantitative spectral matching techniques (QSMTs) applied at continental level and (b) rule-based Automated Cropland Classification Algorithm (ACCA) with the ability to hind-cast, now-cast, and future-cast. Australia was chosen for the study given its extensive croplands, rich history of agriculture, and yet nonexistent routine yearly generated cropland products using multi-temporal remote sensing. This research produced three distinct cropland products using Moderate Resolution Imaging Spectroradiometer (MODIS) 250-m normalized difference vegetation index 16-day composite time-series data for 16 years: 2000 through 2015. The products consisted of: (1) cropland extent/areas versus cropland fallow areas, (2) irrigated versus rainfed croplands, and (3) cropping intensities: single, double, and continuous cropping. An accurate reference cropland product (RCP) for the year 2014 (RCP2014) produced using QSMT was used as a knowledge base to train and develop the ACCA algorithm that was then applied to the MODIS time-series data for the years 2000–2015. A comparison between the ACCA-derived cropland products (ACPs) for the year 2014 (ACP2014) versus RCP2014 provided an overall agreement of 89.4% (kappa = 0.814) with six classes: (a) producer’s accuracies varying

  7. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available Existing material classification schemes were proposed to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers the material quality character, quality cost, and quality influence. The Analytic Hierarchy Process helps with feature selection and the classification decision. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy of various materials, the Support Vector Machine (SVM) algorithm is introduced. Finally, we compare the SVM and the BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  8. A Comparison of Machine Learning Algorithms for Chemical Toxicity Classification Using a Simulated Multi-Scale Data Model

    Science.gov (United States)

    Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in hig...

  9. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    International Nuclear Information System (INIS)

    Ahmed, M; Gu, F; Ball, A

    2011-01-01

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  10. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    Energy Technology Data Exchange (ETDEWEB)

    Ahmed, M; Gu, F; Ball, A, E-mail: M.Ahmed@hud.ac.uk [Diagnostic Engineering Research Group, University of Huddersfield, HD1 3DH (United Kingdom)

    2011-07-19

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  11. AIMES Final Technical Report

    Energy Technology Data Exchange (ETDEWEB)

    Katz, Daniel S [Univ. of Illinois, Urbana-Champaign, IL (United States). National Center for Supercomputing Applications (NCSA); Jha, Shantenu [Rutgers Univ., New Brunswick, NJ (United States); Weissman, Jon [Univ. of Minnesota, Minneapolis, MN (United States); Turilli, Matteo [Rutgers Univ., New Brunswick, NJ (United States)

    2017-01-31

    This is the final technical report for the AIMES project. Many important advances in science and engineering are due to large-scale distributed computing. Notwithstanding this reliance, we are still learning how to design and deploy large-scale production Distributed Computing Infrastructures (DCI). This is evidenced by missing design principles for DCI, and an absence of generally acceptable and usable distributed computing abstractions. The AIMES project was conceived against this backdrop, following on the heels of a comprehensive survey of scientific distributed applications. AIMES laid the foundations to address the tripartite challenge of dynamic resource management, integrating information, and portable and interoperable distributed applications. Four abstractions were defined and implemented: skeleton, resource bundle, pilot, and execution strategy. The four abstractions were implemented into software modules and then aggregated into the AIMES middleware. This middleware successfully integrates information across the application layer (skeletons) and resource layer (Bundles), derives a suitable execution strategy for the given skeleton and enacts its execution by means of pilots on one or more resources, depending on the application requirements, and resource availabilities and capabilities.

  12. Can Automatic Classification Help to Increase Accuracy in Data Collection?

    Directory of Open Access Journals (Sweden)

    Frederique Lang

    2016-09-01

    Full Text Available Purpose: The authors aim at testing the performance of a set of machine learning algorithms that could improve the process of data cleaning when building datasets. Design/methodology/approach: The paper is centered on cleaning datasets gathered from publishers and online resources by the use of specific keywords. In this case, we analyzed data from the Web of Science. The accuracy of various forms of automatic classification was tested here in comparison with manual coding in order to determine their usefulness for data collection and cleaning. We assessed the performance of seven supervised classification algorithms (Support Vector Machine (SVM), Scaled Linear Discriminant Analysis, Lasso and elastic-net regularized generalized linear models, Maximum Entropy, Regression Tree, Boosting, and Random Forest) and analyzed two properties: accuracy and recall. We assessed not only each algorithm individually, but also their combinations through a voting scheme. We also tested the performance of these algorithms with different sizes of training data. When assessing the performance of different combinations, we used an indicator of coverage to account for the agreement and disagreement on classification between algorithms. Findings: We found that the performance of the algorithms used varies with the size of the sample for training. However, for the classification exercise in this paper the best performing algorithms were SVM and Boosting. The combination of these two algorithms achieved a high agreement on coverage and was highly accurate. This combination performs well with a small training dataset (10%), which may reduce the manual work needed for classification tasks. Research limitations: The dataset gathered has significantly more records related to the topic of interest compared to unrelated topics. This may affect the performance of some algorithms, especially in their identification of unrelated papers. Practical implications: Although the

  13. AIM Data Services

    Directory of Open Access Journals (Sweden)

    Michael Scholz

    2016-05-01

    Full Text Available AIM Data Services as a virtual facility provides virtual 3D reference tracks for simulation applications in the domain of automotive and railway systems. It offers tools for management and analysis of experiment data and a platform for survey and processing of vehicle data in the public transport domain. Collected spatial data is bundled in a database cluster and published through common web mapping interfaces.

  14. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor

    Directory of Open Access Journals (Sweden)

    Chang Xu

    2018-05-01

    Full Text Available This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.

  15. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.

    Science.gov (United States)

    Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong

    2018-05-24

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
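
    SMOTE (Synthetic Minority Over-sampling Technique) balances a dataset by creating synthetic minority-class samples on the line segments between a minority sample and one of its nearest minority neighbours. The sketch below shows that core step only. The function name, the default `k`, and the toy points are assumptions for illustration and do not reproduce the paper's magnetic-signal features.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Assumes at least two minority samples. Each synthetic point is
    base + gap * (neighbour - base) with gap drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (squared Euclidean).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, 10)
```

    The synthetic points are added to the training set only; evaluating on oversampled data would overstate accuracy.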

  16. AIMES Final Technical Report

    Energy Technology Data Exchange (ETDEWEB)

    Jha, Shantenu [Rutgers Univ., New Brunswick, NJ (United States)

    2017-01-31

    Many important advances in science and engineering are due to large-scale distributed computing. Notwithstanding this reliance, we are still learning how to design and deploy large-scale production Distributed Computing Infrastructures (DCI). The AIMES project was conceived against this backdrop, following on the heels of a comprehensive survey of scientific distributed applications [1]. The survey established, arguably for the first time, the relationship between infrastructure and scientific distributed applications. It examined well known contributors to the complexity associated with infrastructure, such as inconsistent internal and external interfaces, and demonstrated the correlation with application brittleness. It discussed how infrastructure complexity reinforces the challenges inherent in developing distributed applications.

  17. Aiming for the ordinary

    DEFF Research Database (Denmark)

    Offersen, Sara Marie Hebsgaard

    that the Danes are encouraged to be alert to still earlier and vaguer bodily signs of potential cancer and seek care ‘in time’. With biomedical constructions such as ‘cancer awareness’ and ‘alarm symptoms of cancer’ and the retrospectively oriented definition of life before symptoms-based healthcare seeking...... and articulation of bodily sensations, and how decisions about healthcare seeking are established in this context. This dissertation aims to explore these matters from the perspective of the Danish middle class, mainly focusing on how sensations are ascribed meaning as symptoms and how they are evoked...... on a continuum between what is locally considered ordinary and extraordinary. Overall, the dissertation argues that inquiries into morality and potentiality provide valuable insights into healthcare seeking practices and the making and management of symptoms in everyday life. The dissertation is based on 18...

  18. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  19. Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery.

    Science.gov (United States)

    Li, Guiying; Lu, Dengsheng; Moran, Emilio; Hetrick, Scott

    2011-01-01

    This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms - maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes.
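
    Overall accuracy and the kappa coefficient quoted in this abstract are both derived from a confusion matrix. A minimal sketch of the standard calculation follows; the 2x2 matrix is invented for illustration and has nothing to do with the Landsat results.

```python
def accuracy_and_kappa(confusion):
    """Return (overall_accuracy, kappa) for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    Kappa compares observed agreement with the agreement expected by chance
    from the row and column totals.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class matrix (rows: reference, columns: classified).
accuracy, kappa = accuracy_and_kappa([[40, 10], [5, 45]])
```

    Kappa is lower than overall accuracy whenever some agreement would be expected by chance, which is why the two figures in the abstract improve by different amounts.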

  20. Enhancement of ELM by Clustering Discrimination Manifold Regularization and Multiobjective FOA for Semisupervised Classification

    OpenAIRE

    Qing Ye; Hao Pan; Changhua Liu

    2015-01-01

    A novel semisupervised extreme learning machine (ELM) with clustering discrimination manifold regularization (CDMR) framework named CDMR-ELM is proposed for semisupervised classification. By using unsupervised fuzzy clustering method, CDMR framework integrates clustering discrimination of both labeled and unlabeled data with twinning constraints regularization. Aiming at further improving the classification accuracy and efficiency, a new multiobjective fruit fly optimization algorithm (MOFOA)...

  1. [Aiming for zero blindness].

    Science.gov (United States)

    Nakazawa, Toru

    2015-03-01

    -independent factors, as well as our investigation of ways to improve the clinical evaluation of the disease. Our research was prompted by the multifactorial nature of glaucoma. There is a high degree of variability in the pattern and speed of the progression of visual field defects in individual patients, presenting a major obstacle for successful clinical trials. To overcome this, we classified the eyes of glaucoma patients into 4 types, corresponding to the 4 patterns of glaucomatous optic nerve head morphology described by Nicolela et al., and then tested the validity of this method by assessing the uniformity of clinical features in each group. We found that in normal tension glaucoma (NTG) eyes, each disc morphology group had a characteristic location in which the loss of circumpapillary retinal nerve fiber layer thickness (cpRNFLT; measured with optical coherence tomography: OCT) was most likely to occur. Furthermore, the incidence of reductions in visual acuity differed between the groups, as did the speed of visual field loss, the distribution of defective visual field test points, and the location of test points that were most susceptible to progressive damage, measured by Humphrey static perimetry. These results indicate that Nicolela's method of classifying eyes with glaucoma was able to overcome the difficulties caused by the diverse nature of the disease, at least to a certain extent. Building on these findings, we then set out to identify sectors of the visual field that correspond to the distribution of retinal nerve fibers, with the aim of detecting glaucoma progression with improved sensitivity. We first mapped the statistical correlation between visual field test points and cpRNFLT in each temporal clock-hour sector (from 6 to 12 o'clock), using OCT data from NTG patients. The resulting series of maps allowed us to identify areas containing visual field test points that were prone to be affected together as a group. We also used a similar method to identify visual

  2. Secondary structure classification of amino-acid sequences using state-space modeling

    OpenAIRE

    Brunnert, Marcus; Krahnke, Tillmann; Urfer, Wolfgang

    2001-01-01

    The secondary structure classification of amino acid sequences can be carried out by a statistical analysis of sequence and structure data using state-space models. Aiming at this classification, a modified filter algorithm programmed in S is applied to data of three proteins. The application leads to correct classifications of two proteins even when using relatively simple estimation methods for the parameters of the state-space models. Furthermore, it has been shown that the assumed initial...

  3. Classification of neuropathic pain in cancer patients: A Delphi expert survey report and EAPC/IASP proposal of an algorithm for diagnostic criteria.

    Science.gov (United States)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein; Fainsinger, Robin; Sjøgren, Per; Mercadante, Sebastiano; Løhre, Erik T; Caraceni, Augusto

    2014-12-01

    Neuropathic pain (NP) in cancer patients lacks standards for diagnosis. This study is aimed at reaching consensus on the application of the International Association for the Study of Pain (IASP) special interest group for neuropathic pain (NeuPSIG) criteria to the diagnosis of NP in cancer patients and on the relevance of patient-reported outcome (PRO) descriptors for the screening of NP in this population. An international group of 42 experts was invited to participate in a consensus process through a modified 2-round Internet-based Delphi survey. Relevant topics investigated were: peculiarities of NP in patients with cancer, IASP NeuPSIG diagnostic criteria adaptation and assessment, and standardized PRO assessment for NP screening. Median consensus scores (MED) and interquartile ranges (IQR) were calculated to measure expert consensus after both rounds. Twenty-nine experts answered, and good agreement was found on the statement "the pathophysiology of NP due to cancer can be different from non-cancer NP" (MED=9, IQR=2). Satisfactory consensus was reached for the first 3 NeuPSIG criteria (pain distribution, history, and sensory findings; MEDs⩾8, IQRs⩽3), but not for the fourth one (diagnostic test/imaging; MED=6, IQR=3). Agreement was also reached on clinical examination by soft brush or pin stimulation (MEDs⩾7 and IQRs⩽3) and on the use of PRO descriptors for NP screening (MED=8, IQR=3). Based on the study results, a clinical algorithm for NP diagnostic criteria in cancer patients with pain was proposed. Clinical research on PRO in the screening phase and on the application of the algorithm will be needed to examine their effectiveness in classifying NP in cancer patients. Copyright © 2014 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
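
    The consensus measures used in this Delphi survey, the median score (MED) and interquartile range (IQR), can be computed as sketched below. The quartile convention, the threshold parameters, and the ratings are assumptions for illustration; the abstract does not state how the quartiles were calculated or give the individual scores.

```python
from statistics import median, quantiles


def consensus(ratings, min_median, max_iqr):
    """Return (median, iqr, reached) for one statement's expert ratings.

    Quartiles use the 'inclusive' method of statistics.quantiles; other
    conventions can give slightly different IQR values.
    """
    med = median(ratings)
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    iqr = q3 - q1
    return med, iqr, med >= min_median and iqr <= max_iqr


# Hypothetical ratings from ten experts on one statement.
ratings = [7, 8, 9, 9, 9, 10, 8, 9, 6, 9]
med, iqr, reached = consensus(ratings, min_median=7, max_iqr=3)
```

    A high median with a narrow IQR indicates that experts both endorse a statement and agree with one another; a high median with a wide IQR indicates endorsement without agreement.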

  4. Mimicking human texture classification

    NARCIS (Netherlands)

    Rogowitz, B.E.; van Rikxoort, Eva M.; van den Broek, Egon; Pappas, T.N.; Schouten, Theo E.; Daly, S.J.

    2005-01-01

    In an attempt to mimic human (colorful) texture classification by a clustering algorithm three lines of research have been encountered, in which as test set 180 texture images (both their color and gray-scale equivalent) were drawn from the OuTex and VisTex databases. First, a k-means algorithm was

  5. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    Science.gov (United States)

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  6. The application of mixed recommendation algorithm with user clustering in the microblog advertisements promotion

    Science.gov (United States)

    Gong, Lina; Xu, Tao; Zhang, Wei; Li, Xuhong; Wang, Xia; Pan, Wenwen

    2017-03-01

    Traditional microblog recommendation algorithms suffer from low efficiency and modest effectiveness in the era of big data. To address these issues, this paper proposes a mixed recommendation algorithm with user clustering. It first introduces the state of the microblog marketing industry, then elaborates the user interest modeling process and the detailed advertisement recommendation methods. Finally, it compares the mixed recommendation algorithm with user clustering against a traditional classification algorithm and a mixed recommendation algorithm without user clustering. The results show that the mixed recommendation algorithm with user clustering achieves good accuracy and recall in microblog advertisement promotion.

  7. Efficient Fingercode Classification

    Science.gov (United States)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm, which is an essential component in many critical security application systems, e.g. systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by first classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation, an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research first investigates the various fast search algorithms in vector quantization (VQ) and their potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.
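
    The idea of speeding up a K-nearest neighbor search without changing its result can be illustrated with partial distance elimination, a classic fast-search technique from vector quantization. This is a minimal sketch and not the pyramid-based algorithms of the paper; the function name, the toy vectors and the use of squared Euclidean distance are assumptions made for illustration.

```python
import heapq


def knn_partial_distance(query, codebook, k):
    """Return the indices of the k nearest codebook vectors, nearest first.

    Uses squared Euclidean distance and abandons a candidate as soon as
    its running partial distance reaches the current k-th best distance.
    """
    heap = []  # max-heap via negated distances: entries are (-distance, index)
    for idx, vec in enumerate(codebook):
        # Distance a candidate must beat; unlimited until k candidates are held.
        bound = -heap[0][0] if len(heap) == k else float("inf")
        dist = 0.0
        rejected = False
        for q, v in zip(query, vec):
            dist += (q - v) ** 2
            if dist >= bound:  # cannot enter the top k, stop accumulating
                rejected = True
                break
        if rejected:
            continue
        if len(heap) == k:
            heapq.heapreplace(heap, (-dist, idx))  # evict the current worst
        else:
            heapq.heappush(heap, (-dist, idx))
    # Sort the survivors by ascending distance.
    return [idx for _, idx in sorted(heap, key=lambda e: (-e[0], e[1]))]


# Hypothetical 2-D "fingercodes"; real fingercodes are much longer vectors.
toy_codebook = [[0, 0], [1, 1], [5, 5], [0.5, 0.4], [9, 9]]
nearest = knn_partial_distance([0.0, 0.0], toy_codebook, k=2)
```

    When distances are distinct, the result matches a full search; the saving comes only from skipping the remaining coordinates of candidates that are already too far away.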

  8. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    Science.gov (United States)

    2011-09-01

    SNP Array v2. A ‘proof-of-concept’ advanced data mining algorithm for unsupervised analysis of genome-wide association study (GWAS) dataset was... Opal F AUS Yes U141 Peggs F AUS Yes U142 Taxi F AUS Yes U143 Riso MI MAL Yes U144 Szarik MI GSD Yes U145 Astor MI MAL Yes U146 Roy MC MAL Yes... mining of genetic studies in general, and especially GWAS. As a proof-of-concept, a classification analysis of the WG SNP typing dataset of a

  9. Opposite Degree Algorithm and Its Applications

    Directory of Open Access Journals (Sweden)

    Xiao-Guang Yue

    2015-12-01

    Full Text Available The Opposite Degree (OD) algorithm is an intelligent algorithm proposed by Yue Xiaoguang et al. It is mainly based on the concept of opposite degree, combined with design ideas from neural networks, genetic algorithms and clustering analysis algorithms. The OD algorithm is divided into two sub-algorithms, namely the opposite degree - numerical computation (OD-NC) algorithm and the opposite degree - classification computation (OD-CC) algorithm.

  10. Classification, disease, and diagnosis.

    Science.gov (United States)

    Jutel, Annemarie

    2011-01-01

    Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.

  11. Study on Magneto-Hydro-Dynamics Disturbance Signal Feature Classification Using Improved S-Transform Algorithm and Radial Basis Function Neural Network

    Directory of Open Access Journals (Sweden)

    Nan YU

    2014-09-01

    Full Text Available The interference signal in magneto-hydro-dynamics (MHD) may be the disturbance from the power supply, the equipment itself, or electromagnetic radiation. An interference signal mixed into the normal signal makes signal analysis and processing difficult. The recently proposed S-Transform algorithm combines the advantages of the short-time Fourier transform and the wavelet transform. It uses a Fourier kernel and a wavelet-like Gaussian window whose width is inversely proportional to the frequency. Therefore, the S-Transform algorithm not only preserves the phase information of the signals but also has variable resolution like the wavelet transform. This paper proposes a new method to establish an MHD signal classifier using the S-transform algorithm and a radial basis function neural network (RBFNN). Because RBFNN centers determined by the k-means clustering algorithm may be only locally optimal, this paper analyzes the characteristics of the k-means clustering algorithm and proposes an improved k-means clustering algorithm, called the GCW (Group-cluster-weight) k-means clustering algorithm, to improve the distribution of the centers. The experimental results show that the improvement greatly enhances the RBFNN performance.
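
    For reference, the standard continuous S-transform (Stockwell transform) can be written as below. The symbols are assumptions of notation rather than the paper's own: h(t) is the analysed signal, tau the time shift and f the frequency. The improved variant used in the paper is not specified in the abstract and is not reproduced here.

```latex
% Standard S-transform: a Fourier kernel combined with a Gaussian window
% whose standard deviation is sigma(f) = 1/|f|.
S(\tau, f) = \int_{-\infty}^{\infty} h(t)\,
             \frac{|f|}{\sqrt{2\pi}}\,
             e^{-\frac{(\tau - t)^{2} f^{2}}{2}}\,
             e^{-i 2\pi f t}\, dt
```

    The factor exp(-(tau - t)^2 f^2 / 2) is the Gaussian window: its width shrinks as |f| grows, which gives the variable resolution, while the unshifted Fourier kernel exp(-i 2 pi f t) is what preserves absolute phase information.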

  12. A search algorithm to meta-optimize the parameters for an extended Kalman filter to improve classification on hyper-temporal images

    CSIR Research Space (South Africa)

    Salmon, BP

    2012-07-01

    Full Text Available ... the spectral bands separately and introduced a meta-optimization method for the EKF that will be called the Bias Variance Equilibrium Point (BVEP) in this paper. The objective of this paper is to introduce an unsupervised search algorithm called the Bias...

  13. Classification of smooth Fano polytopes

    DEFF Research Database (Denmark)

    Øbro, Mikkel

    A simplicial lattice polytope containing the origin in the interior is called a smooth Fano polytope, if the vertices of every facet form a basis of the lattice. The study of smooth Fano polytopes is motivated by their connection to toric varieties. The thesis concerns the classification of smooth Fano polytopes up to isomorphism. A smooth Fano -polytope can have at most vertices. In case of vertices an explicit classification is known. The thesis contains the classification in case of vertices. Classifications of smooth Fano -polytopes for fixed exist only for . In the thesis an algorithm for the classification of smooth Fano -polytopes for any given is presented. The algorithm has been implemented and used to obtain the complete classification for .

  14. Quantum computing for pattern classification

    OpenAIRE

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2014-01-01

    It is well known that for certain tasks, quantum computing outperforms classical computing. A growing number of contributions try to use this advantage in order to improve or extend classical machine learning algorithms by methods of quantum information theory. This paper gives a brief introduction into quantum machine learning using the example of pattern classification. We introduce a quantum pattern classification algorithm that draws on Trugenberger's proposal for measuring the Hamming di...

  15. Joint Concept Correlation and Feature-Concept Relevance Learning for Multilabel Classification.

    Science.gov (United States)

    Zhao, Xiaowei; Ma, Zhigang; Li, Zhi; Li, Zhihui

    2018-02-01

    In recent years, multilabel classification has attracted significant attention in multimedia annotation. However, most multilabel classification methods focus only on the inherent correlations among multiple labels and concepts and ignore the relevance between features and the target concepts. To obtain more robust multilabel classification results, we propose a new multilabel classification method that aims to capture the correlations among multiple concepts by leveraging a hypergraph, which has proved beneficial for relational learning. Moreover, we consider mining feature-concept relevance, which is often overlooked by multilabel learning algorithms. To better show the feature-concept relevance, we impose a sparsity constraint on the proposed method. We compare the proposed method with several other multilabel classification methods and evaluate the classification performance by mean average precision on several data sets. The experimental results show that the proposed method outperforms the state-of-the-art methods.
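
    The evaluation measure named above, mean average precision, can be sketched as follows. The abstract does not say whether precision is averaged per concept or per sample; this sketch assumes a per-concept average, and the function names and toy scores are hypothetical.

```python
def average_precision(scores, relevant):
    """Average precision of one ranked list.

    scores:   predicted relevance score per sample
    relevant: 1/0 (or True/False) ground-truth label per sample
    """
    ranked = sorted(zip(scores, relevant), key=lambda p: p[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_relevant) in enumerate(ranked, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / hits if hits else 0.0


def mean_average_precision(score_matrix, label_matrix):
    """Mean of per-concept average precision (rows = samples, columns = concepts)."""
    n_concepts = len(score_matrix[0])
    per_concept = []
    for c in range(n_concepts):
        scores = [row[c] for row in score_matrix]
        labels = [row[c] for row in label_matrix]
        per_concept.append(average_precision(scores, labels))
    return sum(per_concept) / n_concepts


# Hypothetical scores for 4 samples and 2 concepts.
toy_scores = [[0.9, 0.1], [0.8, 0.9], [0.7, 0.2], [0.6, 0.3]]
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 0]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

    For the first concept the relevant samples sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; for the second the single relevant sample is ranked first, giving 1.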

  16. Classification of Polarimetric SAR Data Using Dictionary Learning

    DEFF Research Database (Denmark)

    Vestergaard, Jacob Schack; Nielsen, Allan Aasbjerg; Dahl, Anders Lindbjerg

    2012-01-01

    This contribution deals with classification of multilook fully polarimetric synthetic aperture radar (SAR) data by learning a dictionary of crop types present in the Foulum test site. The Foulum test site contains a large number of agricultural fields, as well as lakes, forests, natural vegetation, grasslands and urban areas, which make it ideally suited for evaluation of classification algorithms. Dictionary learning centers around building a collection of image patches typical for the classification problem at hand. This requires initial manual labeling of the classes present in the data and is thus a method for supervised classification. Sparse coding of these image patches aims to maintain a proficient number of typical patches and associated labels. Data is consecutively classified by a nearest neighbor search of the dictionary elements and labeled with probabilities of each class. Each dictionary...

  17. Testing block subdivision algorithms on block designs

    Science.gov (United States)

    Wiseman, Natalie; Patterson, Zachary

    2016-01-01

    Integrated land use-transportation models predict future transportation demand taking into account how households and firms arrange themselves partly as a function of the transportation system. Recent integrated models require parcels as inputs and produce household and employment predictions at the parcel scale. Block subdivision algorithms automatically generate parcel patterns within blocks. Evaluating block subdivision algorithms is done by way of generating parcels and comparing them to those in a parcel database. Three block subdivision algorithms are evaluated on how closely they reproduce parcels of different block types found in a parcel database from Montreal, Canada. While the authors who developed each of the algorithms have evaluated them, they have used their own metrics and block types to evaluate their own algorithms. This makes it difficult to compare their strengths and weaknesses. The contribution of this paper is in resolving this difficulty with the aim of finding a better algorithm suited to subdividing each block type. The proposed hypothesis is that given the different approaches that block subdivision algorithms take, it's likely that different algorithms are better adapted to subdividing different block types. To test this, a standardized block type classification is used that consists of mutually exclusive and comprehensive categories. A statistical method is used for finding a better algorithm and the probability it will perform well for a given block type. Results suggest the oriented bounding box algorithm performs better for warped non-uniform sites, as well as gridiron and fragmented uniform sites. It also produces more similar parcel areas and widths. The Generalized Parcel Divider 1 algorithm performs better for gridiron non-uniform sites. The Straight Skeleton algorithm performs better for loop and lollipop networks as well as fragmented non-uniform and warped uniform sites. It also produces more similar parcel shapes and patterns.

  18. Detection, identification and classification of defects using ANN and a robotic manipulator of 2 G.L. (Kohonen and MLP algorithms)

    International Nuclear Information System (INIS)

    Barrera, G.; Fabian, M. A.; Ugalde, C. A.

    2002-01-01

    The ultrasonic inspection technique has seen sustained growth since the 1980s. It has several advantages compared with the contact technique. A flexible and low-cost solution is presented, based on virtual instrumentation, for the servomechanism (manipulator) control of the ultrasound inspection transducer in the immersion technique. The developed system uses a personal computer (PC), a Windows operating system, virtual instrumentation software, DAQ cards and a GPIB card. As a solution to the detection, classification and evaluation of defects, an Artificial Neural Network technique is proposed. It consists of the characterization and interpretation of acoustic signals (echoes) acquired by the immersion ultrasonic inspection technique. Two neural networks are proposed: Kohonen and Multilayer Perceptron (MLP). With these techniques, non-linear complex processes can be modeled with great precision. The 2-degree-of-freedom manipulator control, the data acquisition and the network training have been carried out in a virtual instrument environment using LabVIEW and Data Engine. (Author) 14 refs

  19. A Novel Segment-Based Approach for Improving Classification Performance of Transport Mode Detection.

    Science.gov (United States)

    Guvensan, M Amac; Dusun, Burak; Can, Baris; Turkmen, H Irem

    2017-12-30

    Transportation planning and solutions have an enormous impact on city life. To minimize transport duration, urban planners should understand and elaborate the mobility of a city. Thus, researchers look toward monitoring people's daily activities, including transportation types and duration, by taking advantage of individuals' smartphones. This paper introduces a novel segment-based transport mode detection architecture in order to improve the results of traditional classification algorithms in the literature. The proposed post-processing algorithm, namely the Healing algorithm, aims to correct the misclassification results of machine learning-based solutions. Our real-life test results show that the Healing algorithm could achieve up to 40% improvement of the classification results. As a result, the implemented mobile application could predict eight classes, including stationary, walking, car, bus, tram, train, metro and ferry, with a success rate of 95% thanks to the proposed multi-tier architecture and Healing algorithm.
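
    The abstract does not describe how the Healing algorithm works, so the following is not that algorithm. It is a generic sketch of the broader idea of post-processing a classifier's output over a segment: a sliding-window majority vote that overwrites isolated labels. The function name, window size and label sequence are assumptions.

```python
from collections import Counter


def smooth_labels(labels, window=5):
    """Replace each label by the majority label of its surrounding window.

    A label is kept when it ties for the majority, so only labels that are
    clearly outvoted by their neighbours are changed.
    """
    half = window // 2
    smoothed = []
    for i, label in enumerate(labels):
        segment = labels[max(0, i - half): i + half + 1]
        counts = Counter(segment)
        top = max(counts.values())
        if counts[label] == top:
            smoothed.append(label)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


# Hypothetical per-window predictions with one isolated "car" inside a bus ride.
raw = ["bus", "bus", "car", "bus", "bus", "walk", "walk", "walk"]
healed = smooth_labels(raw, window=3)
```

    With a window of 3, the single "car" is outvoted by its two "bus" neighbours, while the genuine change from "bus" to "walk" is left in place.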

  20. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-09-07

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.
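
    The quantity being maximized, the mutual information between classification responses and true labels, can be illustrated with a simple plug-in estimate. This sketch assumes the responses have already been discretized (for example to predicted class labels); the paper instead models mutual information through entropy estimation and optimizes it by gradient descent, which is not reproduced here. The function name and toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(responses, labels):
    """Plug-in estimate of I(response; label) in bits from paired samples."""
    n = len(responses)
    joint = Counter(zip(responses, labels))
    response_counts = Counter(responses)
    label_counts = Counter(labels)
    mi = 0.0
    for (r, y), count in joint.items():
        p_joint = count / n
        # p(r, y) / (p(r) * p(y)) written with raw counts
        ratio = p_joint * n * n / (response_counts[r] * label_counts[y])
        mi += p_joint * math.log2(ratio)
    return mi


# Responses that determine the label carry one full bit for two balanced classes;
# responses that are independent of the label carry none.
informative = mutual_information([1, 1, -1, -1], [1, 1, -1, -1])
uninformative = mutual_information([1, -1, 1, -1], [1, 1, -1, -1])
```

    A regularizer built on this idea rewards classifiers whose responses leave as little uncertainty about the true label as possible.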

  1. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan; Wang, Yi; Zhao, Shiguang; Gao, Xin

    2014-01-01

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.

  2. The diverse aims of science.

    Science.gov (United States)

    Potochnik, Angela

    2015-10-01

    There is increasing attention to the centrality of idealization in science. One common view is that models and other idealized representations are important to science, but that they fall short in one or more ways. On this view, there must be an intermediary step between idealized representation and the traditional aims of science, including truth, explanation, and prediction. Here I develop an alternative interpretation of the relationship between idealized representation and the aims of science. I suggest that continuing, widespread idealization calls into question the idea that science aims for truth. If instead science aims to produce understanding, this would enable idealizations to directly contribute to science's epistemic success. I also use the fact of widespread idealization to motivate the idea that science's wide variety of aims, epistemic and non-epistemic, are best served by different kinds of scientific products. Finally, I show how these diverse aims, most rather distant from truth, result in the expanded influence of social values on science. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Development of Tier 1 screening tool for soil and groundwater vulnerability assessment in Korea using classification algorithm in a neural network

    Science.gov (United States)

    Shin, K. H.; Kim, K. H.; Ki, S. J.; Lee, H. G.

    2017-12-01

    The vulnerability assessment tool at a Tier 1 level, although not often used for regulatory purposes, helps establish pollution prevention and management strategies in areas of potential environmental concern such as soil and ground water. In this study, the Neural Network Pattern Recognition Tool embedded in MATLAB was used to allow the initial screening of soil and groundwater pollution based on data compiled across about 1000 previously contaminated sites in Korea. The input variables included a series of parameters closely related to the downward movement of water and contaminants through soil and ground water, whereas multiple classes were assigned to the sum of concentrations of the major pollutants detected. Results showed that, in accordance with diverse pollution indices for soil and ground water, pollution levels in both media were strongly modulated by site-specific characteristics such as intrinsic soil and other geologic properties, in addition to pollution sources and rainfall. However, classification accuracy was very sensitive to the number of classes defined as well as the types of variables incorporated, requiring careful selection of input variables and output categories. Therefore, we believe that the proposed methodology can be used not only to modify existing pollution indices so that they are more suitable for addressing local vulnerability, but also to develop a unique assessment tool to support decision making based on locally or nationally available data. This study was funded by a grant from the GAIA project (2016000560002), Korea Environmental Industry & Technology Institute, Republic of Korea.

  4. Aim For a Healthy Weight

    Science.gov (United States)

    ... out of your control, you can make positive lifestyle changes to lose weight and to maintain a healthy weight. These include a healthy eating plan and being more physically active. Take the Challenge When it comes to aiming for a healthy ...

  5. Stellar Spectral Classification with Locality Preserving Projections ...

    Indian Academy of Sciences (India)

    With the help of computer tools and algorithms, automatic stellar spectral classification has become an area of current interest. The process of stellar spectral classification mainly includes two steps: dimension reduction and classification. As a popular dimensionality reduction technique, Principal Component Analysis (PCA) ...

  6. The Classification of Romanian High-Schools

    Science.gov (United States)

    Ivan, Ion; Milodin, Daniel; Naie, Lucian

    2006-01-01

    The article tries to tackle the issue of high-schools classification from one city, district or from Romania. The classification criteria are presented. The National Database of Education is also presented and the application of criteria is illustrated. An algorithm for high-school multi-rang classification is proposed in order to build classes of…

  7. Vietnamese Document Representation and Classification

    Science.gov (United States)

    Nguyen, Giang-Son; Gao, Xiaoying; Andreae, Peter

    Vietnamese is very different from English and little research has been done on Vietnamese document classification, or indeed, on any kind of Vietnamese language processing, and only a few small corpora are available for research. We created a large Vietnamese text corpus with about 18000 documents, and manually classified them based on different criteria such as topics and styles, giving several classification tasks of different difficulty levels. This paper introduces a new syllable-based document representation at the morphological level of the language for efficient classification. We tested the representation on our corpus with different classification tasks using six classification algorithms and two feature selection techniques. Our experiments show that the new representation is effective for Vietnamese categorization, and suggest that best performance can be achieved using syllable-pair document representation, an SVM with a polynomial kernel as the learning algorithm, and using Information gain and an external dictionary for feature selection.

  8. Aims, assessments and workplace needs

    Science.gov (United States)

    Black, Paul

    1997-03-01

    This paper attempts to consider the aims that undergraduate physics degree courses actually reflect and serve in the light of the employment patterns of graduates and of the expressed needs of employers. Calling on evidence mainly from the UK, it reviews analyses of what degree examinations actually test, and goes on to quote criticisms of their courses and radical proposals to change them adopted by the senior physics professors in the UK. The discussion is then broadened by discussion of evidence, about the employment of graduates and about the priorities that some industrialists now give in the qualities that they look for when recruiting new graduates. The evidence leads to a view that radical changes are needed, both in courses and examinations, and that there is a need for university departments to work more closely with employers in re-formulating the aims and priorities in their teaching.

  9. Characterization and classification of seven citrus herbs by liquid chromatography-quadrupole time-of-flight mass spectrometry and genetic algorithm optimized support vector machines.

    Science.gov (United States)

    Duan, Li; Guo, Long; Liu, Ke; Liu, E-Hu; Li, Ping

    2014-04-25

    Citrus herbs have been widely used in traditional medicine and cuisine in China and other countries since ancient times. However, the authentication and quality control of Citrus herbs has always been a challenging task due to their similar morphological characteristics and the diversity of the components present in the complicated matrix. In the present investigation, we developed a novel strategy to characterize and classify seven Citrus herbs based on chromatographic analysis and chemometric methods. Firstly, the chemical constituents in seven Citrus herbs were globally characterized by liquid chromatography combined with quadrupole time-of-flight mass spectrometry (LC-QTOF-MS). Based on their retention time, UV spectra and MS fragmentation behavior, a total of 75 compounds were identified or tentatively characterized in these herbal medicines. Secondly, a segmental monitoring method based on LC-variable wavelength detection was developed for simultaneous quantification of ten marker compounds in these Citrus herbs. Thirdly, based on the contents of the ten analytes, genetic algorithm optimized support vector machines (GA-SVM) were employed to differentiate and classify the 64 samples covering these seven herbs. The obtained classifier showed good prediction performance and the overall prediction accuracy reached 96.88%. The proposed strategy is expected to provide new insight for authentication and quality control of traditional herbs. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Minimum Error Entropy Classification

    CERN Document Server

    Marques de Sá, Joaquim P; Santos, Jorge M F; Alexandre, Luís A

    2013-01-01

    This book explains the minimum error entropy (MEE) concept applied to data classification machines. Theoretical results on the inner workings of the MEE concept, in its application to solving a variety of classification problems, are presented in the wider realm of risk functionals. Researchers and practitioners also find in the book a detailed presentation of practical data classifiers using MEE. These include multi‐layer perceptrons, recurrent neural networks, complex-valued neural networks, modular neural networks, and decision trees. A clustering algorithm using a MEE‐like concept is also presented. Examples, tests, evaluation experiments and comparison with similar machines using classic approaches complement the descriptions.

  11. Brain source localization: A new method based on MUltiple SIgnal Classification algorithm and spatial sparsity of the field signal for electroencephalogram measurements

    Science.gov (United States)

    Vergallo, P.; Lay-Ekuakille, A.

    2013-08-01

    Brain activity can be recorded by means of EEG (electroencephalogram) electrodes placed on the scalp of the patient. The EEG reflects the activity of groups of neurons located in the head, and the fundamental problem in neurophysiology is the identification of the sources responsible for brain activity, especially when a seizure occurs, in which case identifying its source is important. Studies that formalize the relationship between the electromagnetic activity in the head and the recording of the generated external field make it possible to know the pattern of brain activity. The inverse problem, in which the underlying source configuration must be determined from the field sampled at different electrodes, is more difficult because it may not have a unique solution, or because the search for the solution is hampered by a low spatial resolution that may not allow activities involving sources close to each other to be distinguished. Thus, sources of interest may be obscured or not detected, and a known source localization method such as MUSIC (MUltiple SIgnal Classification) could fail. Many advanced source localization techniques achieve better resolution by exploiting sparsity: if the number of sources is small, the neural power vs. location is sparse. In this work a solution based on the spatial sparsity of the field signal is presented and analyzed to improve the MUSIC method. For this purpose, it is necessary to set a priori information on the sparsity of the signal. The problem is formulated and solved using a regularization method such as Tikhonov's, which calculates a solution that is the best compromise between two cost functions to be minimized, one related to the fitting of the data and the other concerning the maintenance of the sparsity of the signal. First, the method is tested on simulated EEG signals obtained by solving the forward problem. Relative to the model considered for the head and brain sources, the result obtained allows to

  12. Comment on “An algorithm for identification and classification of individuals with type 1 and type 2 diabetes mellitus in a large primary care database”, written by Sharma et al

    Directory of Open Access Journals (Sweden)

    Bocquet V

    2017-01-01

    Full Text Available Valéry Bocquet, Competence Center for Methodology and Statistics, Luxembourg Institute of Health, Luxembourg. Diabetes is a disease whose global prevalence has been rising year after year, and by 2014 more than 400 million individuals were diagnosed with diabetes.1 As a consequence, screening of patients with type 1 or type 2 diabetes has become important, both to estimate the prevalence of diabetes and to treat affected individuals. For that purpose, a two-step algorithm suggested by Sharma et al2 was recently published, whose aims were to identify type 1 or type 2 individuals from a primary care database. The first step of the algorithm was based on the diagnostic records, treatment given, and results obtained from clinical tests. The second part was based on the combination of diagnostic codes, prescribed medications, age at the time of diagnosis, and finally whether the case was prevalent or incident.

  13. A comparison of CA125, HE4, risk ovarian malignancy algorithm (ROMA), and risk malignancy index (RMI) for the classification of ovarian masses

    Directory of Open Access Journals (Sweden)

    Cristina Anton

    2012-01-01

    Full Text Available OBJECTIVE: Differentiation between benign and malignant ovarian neoplasms is essential for creating a system for patient referrals. Therefore, the contributions of the tumor markers CA125 and human epididymis protein 4 (HE4) as well as the risk ovarian malignancy algorithm (ROMA) and risk malignancy index (RMI) values were considered individually and in combination to evaluate their utility for establishing this type of patient referral system. METHODS: Patients who had been diagnosed with ovarian masses through imaging analyses (n = 128) were assessed for their expression of the tumor markers CA125 and HE4. The ROMA and RMI values were also determined. The sensitivity and specificity of each parameter were calculated using receiver operating characteristic curves according to the area under the curve (AUC) for each method. RESULTS: The sensitivities associated with the ability of CA125, HE4, ROMA, or RMI to distinguish between malignant versus benign ovarian masses were 70.4%, 79.6%, 74.1%, and 63%, respectively. Among carcinomas, the sensitivities of CA125, HE4, ROMA (pre- and post-menopausal), and RMI were 93.5%, 87.1%, 80%, 95.2%, and 87.1%, respectively. The most accurate numerical values were obtained with RMI, although the four parameters were shown to be statistically equivalent. CONCLUSION: There were no differences in accuracy between CA125, HE4, ROMA, and RMI for differentiating between types of ovarian masses. RMI had the lowest sensitivity but was the most numerically accurate method. HE4 demonstrated the best overall sensitivity for the evaluation of malignant ovarian tumors and the differential diagnosis of endometriosis. All of the parameters demonstrated increased sensitivity when tumors with low malignancy potential were considered low-risk, which may be used as an acceptable assessment method for referring patients to reference centers.
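
    The sensitivity and specificity figures reported above follow from a simple count once a cut-off is chosen for a marker. A minimal sketch is given below; the function name, the marker values and the cut-off are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(marker_values, is_malignant, cutoff):
    """Sensitivity and specificity of the rule 'marker >= cutoff means malignant'."""
    tp = fn = tn = fp = 0
    for value, malignant in zip(marker_values, is_malignant):
        predicted_positive = value >= cutoff
        if malignant and predicted_positive:
            tp += 1  # malignant mass correctly flagged
        elif malignant:
            fn += 1  # malignant mass missed
        elif predicted_positive:
            fp += 1  # benign mass wrongly flagged
        else:
            tn += 1  # benign mass correctly cleared
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical marker readings for six patients.
values = [10, 40, 50, 20, 80, 35]
malignant = [False, True, True, False, True, False]
sens, spec = sensitivity_specificity(values, malignant, cutoff=35)
```

    Sweeping the cut-off over all observed values and plotting sensitivity against 1 - specificity gives the receiver operating characteristic curve whose area (AUC) the study uses to compare the four methods.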

  14. Validation of a new classification for periprosthetic shoulder fractures.

    Science.gov (United States)

    Kirchhoff, Chlodwig; Beirer, Marc; Brunner, Ulrich; Buchholz, Arne; Biberthaler, Peter; Crönlein, Moritz

    2018-06-01

    Successful treatment of periprosthetic shoulder fractures depends on the right strategy, starting with a well-structured classification of the fracture. Unfortunately, clinically relevant factors for treatment planning are missing in the pre-existing classifications. Therefore, the aim of the present study was to describe a new specific classification system for periprosthetic shoulder fractures including a structured treatment algorithm for this important fragility fracture issue. The classification was established, focussing on five relevant items, namely the prosthesis type, the fracture localisation, the rotator cuff status, the anatomical fracture region and the stability of the implant. After considering each single item, the individual treatment concept can be assessed in one last step. To evaluate the introduced classification, a retrospective analysis of pre- and post-operative data of patients treated for periprosthetic shoulder fractures was conducted by two board certified trauma surgery consultants. The data of 19 patients (8 male, 11 female) with a mean age of 74 ± 5 years were analysed in our study. The suggested treatment algorithm was proven to be reliable, as indicated by a good clinical outcome in 15 of 16 (94%) cases where the suggested treatment was maintained. Only one case resulted in poor outcome due to post-operative wound infection and had to be revised. The newly developed six-step classification is easy to utilise and extends the pre-existing classification systems in terms of clinically-relevant information. This classification should serve as a simple tool for the surgeon to consider the optimal treatment for his patients.

  15. Supernova Photometric Lightcurve Classification

    Science.gov (United States)

    Zaidi, Tayeb; Narayan, Gautham

    2016-01-01

    This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data was primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernovae types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II). Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).

  16. Classification d'images RSO polarimétriques à haute résolution spatiale sur site urbain.

    OpenAIRE

    Soheili Majd, Maryam

    2014-01-01

    In this research, our aim is to assess the potential of a single-look, high spatial resolution polarimetric radar image for the classification of urban areas. For that purpose, we concentrate on classes corresponding to different kinds of roofs, objects and ground surfaces. At first, we propose a univariate statistical analysis of polarimetric and texture attributes that can be used in a classification algorithm. We perform a statistical analysis of descriptors and show that the Fisher di...

  17. Site Classification using Multichannel Channel Analysis of Surface Wave (MASW) method on Soft and Hard Ground

    Science.gov (United States)

    Ashraf, M. A. M.; Kumar, N. S.; Yusoh, R.; Hazreek, Z. A. M.; Aziman, M.

    2018-04-01

    Site classification commonly relies on the average shear wave velocity over the top 30 meters, Vs(30). Numerous geophysical methods have been proposed for estimating shear wave velocity, using an assortment of testing configurations, processing methods, and inversion algorithms. The Multichannel Analysis of Surface Waves (MASW) method is practised by many geotechnical engineering specialists and professionals for local site characterization and classification. This study aims to determine the site classification on soft and hard ground using the MASW method. The subsurface classification was made using the National Earthquake Hazards Reduction Program (NEHRP) and International Building Code (IBC) classifications. Two sites were chosen to acquire the shear wave velocity: one in the state of Pulau Pinang for soft soil and one in Perlis for hard rock. Results suggest that the MASW technique can be used to map the spatial distribution of shear wave velocity (Vs(30)) in soil and rock to characterize areas.
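
    As a rough illustration of how Vs(30) leads to a site class, the sketch below computes the time-averaged velocity 30 / sum(h_i / v_i) over a layered profile and maps it to the commonly quoted NEHRP velocity bands. The function names and the example profile are hypothetical, and Site Class F, which needs site-specific evaluation rather than a velocity band, is not handled.

```python
def vs30(thicknesses_m, velocities_mps):
    """Time-averaged shear wave velocity over the top 30 m of a layered profile."""
    remaining = 30.0
    travel_time = 0.0
    for h, v in zip(thicknesses_m, velocities_mps):
        if h <= 0 or v <= 0:
            raise ValueError("thickness and velocity must be positive")
        used = min(h, remaining)      # clip the layer that crosses 30 m depth
        travel_time += used / v       # vertical travel time through the layer
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("profile is shallower than 30 m")
    return 30.0 / travel_time


def nehrp_site_class(vs30_mps):
    """Map Vs(30) in m/s to NEHRP site classes A to E (velocity criterion only)."""
    if vs30_mps > 1500:
        return "A"   # hard rock
    if vs30_mps > 760:
        return "B"   # rock
    if vs30_mps > 360:
        return "C"   # very dense soil and soft rock
    if vs30_mps >= 180:
        return "D"   # stiff soil
    return "E"       # soft soil


# Hypothetical three-layer profile: 5 m, 10 m and 20 m thick layers.
v = vs30([5.0, 10.0, 20.0], [150.0, 300.0, 600.0])
print(round(v, 1), nehrp_site_class(v))
```

    The harmonic (travel-time) average is used instead of an arithmetic mean so that thin, slow layers near the surface are weighted by the time a wave spends in them.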

  18. Online co-regularized algorithms

    NARCIS (Netherlands)

    Ruijter, T. de; Tsivtsivadze, E.; Heskes, T.

    2012-01-01

    We propose an online co-regularized learning algorithm for classification and regression tasks. We demonstrate that by sequentially co-regularizing prediction functions on unlabeled data points, our algorithm provides improved performance in comparison to supervised methods on several UCI benchmarks.

  19. CCM: A Text Classification Method by Clustering

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    In this paper, a new Cluster based Classification Model (CCM) for suspicious email detection and other text classification tasks, is presented. Comparative experiments of the proposed model against traditional classification models and the boosting algorithm are also discussed. Experimental results...... show that the CCM outperforms traditional classification models as well as the boosting algorithm for the task of suspicious email detection on terrorism domain email dataset and topic categorization on the Reuters-21578 and 20 Newsgroups datasets. The overall finding is that applying a cluster based...

  20. Tweet-based Target Market Classification Using Ensemble Method

    Directory of Open Access Journals (Sweden)

    Muhammad Adi Khairul Anshary

    2016-09-01

    Full Text Available Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract features. Classification models were constructed to manipulate the training data using two ensemble methods (bagging and boosting). To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results of the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). On the other hand, for sentiment classification, the ensemble methods were not successful in increasing the accuracy of CART. The results of this study may be taken into consideration by companies who approach their customers through social media, especially Twitter.

  1. Combining two open source tools for neural computation (BioPatRec and Netlab) improves movement classification for prosthetic control.

    Science.gov (United States)

    Prahm, Cosima; Eckstein, Korbinian; Ortiz-Catalan, Max; Dorffner, Georg; Kaniusas, Eugenijus; Aszmann, Oskar C

    2016-08-31

    Controlling a myoelectric prosthesis for upper limbs is increasingly challenging for the user as more electrodes and joints become available. Motion classification based on pattern recognition with a multi-electrode array allows multiple joints to be controlled simultaneously. Previous pattern recognition studies are difficult to compare, because individual research groups use their own data sets. To resolve this shortcoming and to facilitate comparisons, open access data sets were analysed using components of BioPatRec and Netlab pattern recognition models. Performances of the artificial neural networks, linear models, and training program components were compared. Evaluation took place within the BioPatRec environment, a Matlab-based open source platform that provides feature extraction, processing and motion classification algorithms for prosthetic control. The algorithms were applied to myoelectric signals for individual and simultaneous classification of movements, with the aim of finding the best performing algorithm and network model. Evaluation criteria included classification accuracy and training time. Results in both the linear and the artificial neural network models demonstrated that Netlab's implementation using the scaled conjugate training algorithm reached significantly higher accuracies than BioPatRec. It is concluded that the best movement classification performance would be achieved through integrating Netlab training algorithms in the BioPatRec environment so that future prosthesis training can be shortened and control made more reliable. Netlab was therefore included in the newest release of BioPatRec (v4.0).

  2. Music classification with MPEG-7

    Science.gov (United States)

    Crysandt, Holger; Wellhausen, Jens

    2003-01-01

    Driven by the increasing amount of music available electronically, the need for and feasibility of automatic classification systems for music are becoming more and more important. Currently most search engines for music are based on textual descriptions like artist and/or title. This paper presents a system for automatic music description, classification and visualization for a set of songs. The system is designed to extract significant features of a piece of music in order to find songs of a similar genre or with similar sound characteristics. The description is done with the help of MPEG-7 only. The classification and visualization are done with the self-organizing map algorithm.

  3. Algorithming the Algorithm

    DEFF Research Database (Denmark)

    Mahnke, Martina; Uprichard, Emma

    2014-01-01

    Imagine sailing across the ocean. The sun is shining, vastness all around you. And suddenly [BOOM] you’ve hit an invisible wall. Welcome to the Truman Show! Ever since Eli Pariser published his thoughts on a potential filter bubble, this movie scenario seems to have become reality, just with slight...... changes: it’s not the ocean, it’s the internet we’re talking about, and it’s not a TV show producer, but algorithms that constitute a sort of invisible wall. Building on this assumption, most research is trying to ‘tame the algorithmic tiger’. While this is a valuable and often inspiring approach, we...

  4. Texture classification using autoregressive filtering

    Science.gov (United States)

    Lawton, W. M.; Lee, M.

    1984-01-01

    A general theory of image texture models is proposed and its applicability to the problem of scene segmentation using texture classification is discussed. An algorithm, based on half-plane autoregressive filtering, which optimally utilizes second order statistics to discriminate between texture classes represented by arbitrary wide sense stationary random fields is described. Empirical results of applying this algorithm to natural and synthesized scenes are presented and future research is outlined.

  5. Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches

    Directory of Open Access Journals (Sweden)

    Ufuk Çelik

    2015-01-01

    Full Text Available The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. After we took this main objective into consideration, three different neurologists were required to fill in the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms that are inspired by the biological immune system mechanism that involves significant and distinct capabilities. These algorithms simulate the specialties of the immune system such as discrimination, learning, and the memorizing process in order to be used for classification, optimization, or pattern recognition. According to the results, the accuracy level of the classifier used in this study reached a success continuum ranging from 95% to 99%, except for the inconvenient one that yielded 71% accuracy.

  6. Weakly supervised classification in high energy physics

    International Nuclear Information System (INIS)

    Dery, Lucio Mwinmaarong; Nachman, Benjamin; Rubbo, Francesco; Schwartzman, Ariel

    2017-01-01

    As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics — quark versus gluon tagging — we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.

  7. Weakly supervised classification in high energy physics

    Energy Technology Data Exchange (ETDEWEB)

    Dery, Lucio Mwinmaarong [Physics Department, Stanford University, Stanford, CA, 94305 (United States)]; Nachman, Benjamin [Physics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA, 94720 (United States)]; Rubbo, Francesco; Schwartzman, Ariel [SLAC National Accelerator Laboratory, Stanford University, 2575 Sand Hill Rd, Menlo Park, CA, 94025 (United States)]

    2017-05-29

    As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics — quark versus gluon tagging — we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.

  8. Biometric Authentication for Gender Classification Techniques: A Review

    Science.gov (United States)

    Mathivanan, P.; Poornima, K.

    2017-12-01

    One of the challenging applications of biometric authentication is gender identification and age classification, which captures gait from a distance and analyzes physical information about the subject such as gender, race and emotional state. It is found that most gender identification techniques have focused only on the frontal pose of different human subjects, the image size and the type of database used in the process. The study also classifies different feature extraction processes, such as Principal Component Analysis (PCA) and Local Directional Pattern (LDP), that are used to extract the authentication features of a person. This paper aims to analyze different gender classification techniques, which helps in evaluating the strengths and weaknesses of existing gender identification algorithms and, in turn, in developing a novel gender classification algorithm with lower computation cost and higher accuracy. In this paper, an overview and classification of different gender identification techniques are first presented, and they are compared with other existing human identification systems by means of their performance.

  9. An opto-electronic joint detection system based on DSP aiming at early cervical cancer screening

    Science.gov (United States)

    Wang, Weiya; Jia, Mengyu; Gao, Feng; Yang, Lihong; Qu, Pengpeng; Zou, Changping; Liu, Pengxi; Zhao, Huijuan

    2015-02-01

    The cervical cancer screening at a pre-cancer stage is beneficial to reduce the mortality of women. An opto-electronic joint detection system based on DSP aiming at early cervical cancer screening is introduced in this paper. In this system, three electrodes alternately discharge to the cervical tissue and three light emitting diodes in different wavelengths alternately irradiate the cervical tissue. Then the relative optical reflectance and electrical voltage attenuation curve are obtained by optical and electrical detection, respectively. The system is based on DSP to attain the portable and cheap instrument. By adopting the relative reflectance and the voltage attenuation constant, the classification algorithm based on Support Vector Machine (SVM) discriminates abnormal cervical tissue from normal. We use particle swarm optimization to optimize the two key parameters of SVM, i.e. nuclear factor and cost factor. The clinical data were collected on 313 patients to build a clinical database of tissue responses under optical and electrical stimulations with the histopathologic examination as the gold standard. The classification result shows that the opto-electronic joint detection has higher total coincidence rate than separate optical detection or separate electrical detection. The sensitivity, specificity, and total coincidence rate increase with the increasing of sample numbers in the training set. The average total coincidence rate of the system can reach 85.1% compared with the histopathologic examination.

  10. Mixing geometric and radiometric features for change classification

    Science.gov (United States)

    Fournier, Alexandre; Descombes, Xavier; Zerubia, Josiane

    2008-02-01

    Most basic change detection algorithms use a pixel-based approach. Whereas such an approach is quite well defined for monitoring important area changes (such as urban growth monitoring) in low resolution images, an object-based approach seems more relevant when the change detection is specifically aimed toward targets (such as small buildings and vehicles). In this paper, we present an approach that mixes radiometric and geometric features to qualify the changed zones. The goal is to establish bounds (appearance, disappearance, substitution ...) between the detected changes and the underlying objects. We proceed by first clustering the change map (containing each pixel bitemporal radiosity) in different classes using the entropy-kmeans algorithm. Assuming that most man-made objects have a polygonal shape, a polygonal approximation algorithm is then used in order to characterize the resulting zone shapes, which allows us to refine the primary rough classification by integrating the polygon orientations in the state space. Tests are currently conducted on Quickbird data.

  11. A New Method for Solving Supervised Data Classification Problems

    Directory of Open Access Journals (Sweden)

    Parvaneh Shabanzadeh

    2014-01-01

    Full Text Available Supervised data classification is one of the techniques used to extract nontrivial information from data. Classification is a widely used technique in various fields, including data mining, industry, medicine, science, and law. This paper considers a new algorithm for supervised data classification problems associated with the cluster analysis. The mathematical formulations for this algorithm are based on nonsmooth, nonconvex optimization. A new algorithm for solving this optimization problem is utilized. The new algorithm uses a derivative-free technique, with robustness and efficiency. To improve classification performance and efficiency in generating classification model, a new feature selection algorithm based on techniques of convex programming is suggested. Proposed methods are tested on real-world datasets. Results of numerical experiments have been presented which demonstrate the effectiveness of the proposed algorithms.

  12. Document Classification Using Distributed Machine Learning

    OpenAIRE

    Aydin, Galip; Hallac, Ibrahim Riza

    2018-01-01

    In this paper, we investigate the performance and success rates of the Naïve Bayes Classification Algorithm for automatic classification of Turkish news into predetermined categories like economy, life, health etc. We use Apache Big Data technologies such as Hadoop, HDFS, Spark and Mahout, and apply these distributed technologies to Machine Learning.

  13. Computerized Classification Testing with the Rasch Model

    Science.gov (United States)

    Eggen, Theo J. H. M.

    2011-01-01

    If classification in a limited number of categories is the purpose of testing, computerized adaptive tests (CATs) with algorithms based on sequential statistical testing perform better than estimation-based CATs (e.g., Eggen & Straetmans, 2000). In these computerized classification tests (CCTs), the Sequential Probability Ratio Test (SPRT) (Wald,…
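
    To make the sequential-testing idea concrete, the sketch below applies Wald's SPRT to Rasch-model responses around a single cut-off. It is a minimal illustration, not the procedure of the cited work: the function names, the indifference half-width `delta` and the error rates are assumptions.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct answer for ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def sprt_classify(responses, difficulties, cutoff,
                  delta=0.3, alpha=0.05, beta=0.05):
    """Test H0: theta = cutoff - delta against H1: theta = cutoff + delta.

    responses are 1 (correct) or 0 (incorrect), in administration order.
    Returns (decision, items_used); decision is "above", "below" or "undecided".
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 once the ratio reaches this
    lower = math.log(beta / (1 - alpha))   # accept H0 once the ratio drops to this
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for n, (x, b) in enumerate(zip(responses, difficulties), start=1):
        p1 = rasch_p(cutoff + delta, b)
        p0 = rasch_p(cutoff - delta, b)
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "above", n
        if llr <= lower:
            return "below", n
    return "undecided", n


# Hypothetical examinee who answers twenty items of difficulty 0 correctly.
print(sprt_classify([1] * 20, [0.0] * 20, cutoff=0.0))
```

    The test stops as soon as the accumulated evidence crosses either bound, which is why such tests can reach a classification with fewer items than a fixed-length or estimation-based procedure.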

  14. Accuracy assessment between different image classification ...

    African Journals Online (AJOL)

    What image classification does is to assign each pixel to the particular land cover and land use type that has the most similar spectral signature. However, there are possibilities that different methods or algorithms of image classification of the same data set could produce appreciable variant results in the sizes, shapes and areas of ...

  15. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) is the most successful algorithm for classification problems. SVM learns the decision boundary from two classes (for Binary Classification) of training points. However, sometimes there are some less meaningful samples amongst the training points, which are corrupted by noise or placed on the wrong side; these are called outliers. Such outliers affect the margin and the classification performance, and the machine would do better to discard them. SVM as a popular and widely used cl...

  16. Unsupervised classification of variable stars

    Science.gov (United States)

    Valenzuela, Lucas; Pichara, Karim

    2018-03-01

    During the past 10 years, a considerable amount of effort has been made to develop algorithms for automatic classification of variable stars. That has been primarily achieved by applying machine learning methods to photometric data sets where objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human effort. Every time data come from new surveys, the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.

  17. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is

  18. A COMPARISON STUDY OF DIFFERENT MARKER SELECTION METHODS FOR SPECTRAL-SPATIAL CLASSIFICATION OF HYPERSPECTRAL IMAGES

    Directory of Open Access Journals (Sweden)

    D. Akbari

    2015-12-01

    Full Text Available An effective approach based on the Minimum Spanning Forest (MSF), grown from automatically selected markers using Support Vector Machines (SVM), has been proposed for spectral-spatial classification of hyperspectral images by Tarabalka et al. This paper aims at improving this approach by using image segmentation to integrate the spatial information into the marker selection process. In this study, the markers are extracted from the classification maps, obtained by both SVM and segmentation algorithms, and are then used to build the MSF. The segmentation algorithms are the watershed, expectation maximization (EM) and hierarchical clustering. These algorithms are used in parallel and independently to segment the image. Moreover, the pixels of each class, with the largest population in the classification map, are kept for each region of the segmentation map. Lastly, the most reliable classified pixels are chosen from among the existing pixels as markers. Two benchmark urban hyperspectral datasets are used for evaluation: Washington DC Mall and Berlin. The results of our experiments indicate that, compared to the original MSF approach, the marker selection using segmentation algorithms leads to more accurate classification maps.

  19. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing the gender from acoustic data, i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics to evaluate the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The agenda is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. With this comparative model, we try to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  20. Polyp Morphology: An Interobserver Evaluation for the Paris Classification Among International Experts

    NARCIS (Netherlands)

    van Doorn, Sascha C.; Hazewinkel, Y.; East, James E.; van Leerdam, Monique E.; Rastogi, Amit; Pellisé, Maria; Sanduleanu-Dascalescu, Silvia; Bastiaansen, Barbara A. J.; Fockens, Paul; Dekker, Evelien

    2015-01-01

    OBJECTIVES: The Paris classification is an international classification system for describing polyp morphology. Thus far, the validity and reproducibility of this classification have not been assessed. We aimed to determine the interobserver agreement for the Paris classification among seven Western

  1. Sound algorithms

    OpenAIRE

    De Götzen, Amalia; Mion, Luca; Tache, Olivier

    2007-01-01

    We call sound algorithms the categories of algorithms that deal with digital sound signals. Sound algorithms appeared in the very infancy of computing. Sound algorithms present strong specificities that are the consequence of two dual considerations: the properties of the digital sound signal itself and its uses, and the properties of auditory perception.

  2. Genetic algorithms

    Science.gov (United States)

    Wang, Lui; Bayer, Steven E.

    1991-01-01

    Genetic algorithms are mathematical, highly parallel, adaptive search procedures (i.e., problem solving methods) based loosely on the processes of natural genetics and Darwinian survival of the fittest. Basic genetic algorithm concepts are introduced, genetic algorithm applications are introduced, and results are presented from a project to develop a software tool that will enable the widespread use of genetic algorithm technology.
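
    The basic loop of selection, crossover and mutation can be sketched on a toy problem. The example below maximises the number of 1-bits in a string (the "OneMax" problem); it is a generic illustration with hypothetical parameter values and has no connection to the software tool described in this record.

```python
import random


def one_max(bits):
    """Fitness: number of 1-bits in the candidate."""
    return sum(bits)


def genetic_search(n_bits=20, pop_size=30, generations=40,
                   mutation_rate=0.02, seed=0):
    """Run a small elitist genetic algorithm.

    Returns (best_individual, best_fitness, best_fitness_of_initial_population).
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    initial_best = max(one_max(ind) for ind in pop)

    def tournament():
        # Selection: the fitter of two randomly drawn individuals survives.
        a, b = rng.sample(pop, 2)
        return a if one_max(a) >= one_max(b) else b

    for _ in range(generations):
        elite = max(pop, key=one_max)
        children = [elite[:]]                     # elitism: keep the current best
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_bits)        # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]              # bit-flip mutation
            children.append(child)
        pop = children

    best = max(pop, key=one_max)
    return best, one_max(best), initial_best


best, fitness, start = genetic_search()
print(f"initial best fitness: {start}, final best fitness: {fitness}")
```

    Copying the best individual unchanged into each new generation guarantees that the best fitness never decreases, at the cost of somewhat less diversity in the population.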

  3. New Optimization Algorithms in Physics

    CERN Document Server

    Hartmann, Alexander K

    2004-01-01

    Many physicists are not aware of the fact that they can solve their problems by applying optimization algorithms. Since the number of such algorithms is steadily increasing, many new algorithms have not been presented comprehensively until now. This presentation of recently developed algorithms applied in physics, including demonstrations of how they work and related results, aims to encourage their application, and as such the algorithms selected cover concepts and methods from statistical physics to optimization problems emerging in theoretical computer science.

  4. Graph Colouring Algorithms

    DEFF Research Database (Denmark)

    Husfeldt, Thore

    2015-01-01

    This chapter presents an introduction to graph colouring algorithms. The focus is on vertex-colouring algorithms that work for general classes of graphs with worst-case performance guarantees in a sequential model of computation. The presentation aims to demonstrate the breadth of available...

  5. Online learning algorithm for ensemble of decision rules

    KAUST Repository

    Chikalov, Igor; Moshkov, Mikhail; Zielosko, Beata

    2011-01-01

    We describe an online learning algorithm that builds a system of decision rules for a classification problem. Rules are constructed according to the minimum description length principle by a greedy algorithm or using the dynamic programming approach

  6. Ichthyoplankton Classification Tool using Generative Adversarial Networks and Transfer Learning

    KAUST Repository

    Aljaafari, Nura

    2018-04-15

    The study and the analysis of marine ecosystems is a significant part of marine science research. These systems are valuable resources for fisheries and for improving water quality, and can even be used in drug production. The investigation of ichthyoplankton inhabiting these ecosystems is also an important research field. Ichthyoplankton are fish in their early stages of life. In this stage, the fish have relatively similar shapes and are small in size. The currently used way of identifying them is not optimal. Marine scientists typically study such organisms by sending a team that collects samples from the sea, which are then taken to the lab for further investigation. These samples need to be studied by an expert and usually end up needing DNA sequencing. This method is time-consuming and requires a high level of experience. The recent advances in AI have helped to solve and automate several difficult tasks, which motivated us to develop a classification tool for ichthyoplankton. We show that using machine learning techniques, such as generative adversarial networks combined with transfer learning, solves such a problem with high accuracy. We show that using traditional machine learning algorithms fails to solve it. We also give a general framework for creating a classification tool when the dataset used for training is limited. We aim to build a user-friendly tool that can be used by any user for the classification task, and we aim to give a guide to researchers that they can follow in creating a classification tool.

  7. Determining the saliency of feature measurements obtained from images of sedimentary organic matter for use in its classification

    Science.gov (United States)

    Weller, Andrew F.; Harris, Anthony J.; Ware, J. Andrew; Jarvis, Paul S.

    2006-11-01

    The classification of sedimentary organic matter (OM) images can be improved by determining the saliency of image analysis (IA) features measured from them. Knowing the saliency of IA feature measurements means that only the most significant discriminating features need be used in the classification process. This is an important consideration for classification techniques such as artificial neural networks (ANNs), where too many features can lead to the 'curse of dimensionality'. The classification scheme adopted in this work is a hybrid of morphologically and texturally descriptive features from previous manual classification schemes. Some of these descriptive features are assigned to IA features, along with several others built into the IA software (Halcon) to ensure that a valid cross-section is available. After an image is captured and segmented, a total of 194 features are measured for each particle. To reduce this number to a more manageable magnitude, the SPSS AnswerTree Exhaustive CHAID (χ² automatic interaction detector) classification tree algorithm is used to establish each measurement's saliency as a classification discriminator. In the case of continuous data as used here, the F-test is used as opposed to the published algorithm. The F-test checks various statistical hypotheses about the variance of groups of IA feature measurements obtained from the particles to be classified. The aim is to reduce the number of features required to perform the classification without reducing its accuracy. In the best-case scenario, 194 inputs are reduced to 8, with a subsequent multi-layer back-propagation ANN recognition rate of 98.65%. This paper demonstrates the ability of the algorithm to reduce noise, help overcome the curse of dimensionality, and facilitate an understanding of the saliency of IA features as discriminators for sedimentary OM classification.
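
    As a rough illustration of ranking features by an F-test, the sketch below computes a one-way ANOVA F statistic per feature and sorts features by it. This is not the Exhaustive CHAID procedure from SPSS AnswerTree; the function names and the toy data layout are assumptions made only for this example.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a single feature.

    groups: list of lists, one list of feature values per class.
    Assumes at least two classes and non-zero within-class variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(samples, labels):
    """Return feature indices sorted from most to least discriminating."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        groups = [[s[j] for s, y in zip(samples, labels) if y == c]
                  for c in classes]
        scores.append((f_statistic(groups), j))
    return [j for _, j in sorted(scores, reverse=True)]
```

    Keeping only the first few indices returned by `rank_features` mirrors the idea of reducing 194 measurements to a small salient subset before training a classifier.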

  8. Transferability of decision trees for land cover classification in a ...

    African Journals Online (AJOL)

    This paper attempts to derive classification rules from training data of four Landsat-8 scenes by using the classification and regression tree (CART) implementation of the decision tree algorithm. The transferability of the ruleset was evaluated by classifying two adjacent scenes. The classification of the four mosaicked scenes ...

  9. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  10. LDA boost classification: boosting by topics

    Science.gov (United States)

    Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li

    2012-12-01

    AdaBoost is an effective classification algorithm, especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, the traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems, which seriously affect classification performance. This article proposes a novel classification algorithm called LDABoost, based on the boosting approach, which uses Latent Dirichlet Allocation (LDA) to model the feature space. Instead of using words or phrases, LDABoost uses latent topics as the features. In this way, the feature dimension is significantly reduced. An improved Naïve Bayes (NB) is designed as the weak classifier, which keeps the efficiency advantage of the classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration is proposed for improving the accuracy by integrating weak classifiers into a strong classifier in a more rational way. Mutual Information is used as the metric for weight allocation. The voting information and the categorization decisions made by the base classifiers are fully utilized for generating the strong classifier. Experimental results reveal that LDABoost, which performs categorization in a low-dimensional space, has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than that of different versions of AdaBoost and of TC algorithms based on support vector machines and neural networks.

  11. Algorithmic cryptanalysis

    CERN Document Server

    Joux, Antoine

    2009-01-01

    Illustrating the power of algorithms, Algorithmic Cryptanalysis describes algorithmic methods with cryptographically relevant examples. Focusing on both private- and public-key cryptographic algorithms, it presents each algorithm either as a textual description, in pseudo-code, or in a C code program. Divided into three parts, the book begins with a short introduction to cryptography and a background chapter on elementary number theory and algebra. It then moves on to algorithms, with each chapter in this section dedicated to a single topic and often illustrated with simple cryptographic applic

  12. Parallel exploitation of a spatial-spectral classification approach for hyperspectral images on RVC-CAL

    Science.gov (United States)

    Lazcano, R.; Madroñal, D.; Fabelo, H.; Ortega, S.; Salvador, R.; Callicó, G. M.; Juárez, E.; Sanz, C.

    2017-10-01

    Hyperspectral Imaging (HI) assembles high resolution spectral information from hundreds of narrow bands across the electromagnetic spectrum, thus generating 3D data cubes in which each pixel gathers the spectral information of the reflectance of every spatial pixel. As a result, each image is composed of large volumes of data, which turns its processing into a challenge, as performance requirements have been continuously tightened. For instance, new HI applications demand real-time responses. Hence, parallel processing becomes a necessity to achieve this requirement, so the intrinsic parallelism of the algorithms must be exploited. In this paper, a spatial-spectral classification approach has been implemented using a dataflow language known as RVC-CAL. This language represents a system as a set of functional units, and its main advantage is that it simplifies the parallelization process by mapping the different blocks over different processing units. The spatial-spectral classification approach aims at refining the classification results previously obtained by using a K-Nearest Neighbors (KNN) filtering process, in which both the pixel spectral value and the spatial coordinates are considered. To do so, KNN needs two inputs: a one-band representation of the hyperspectral image and the classification results provided by a pixel-wise classifier. Thus, the spatial-spectral classification algorithm is divided into three different stages: a Principal Component Analysis (PCA) algorithm for computing the one-band representation of the image, a Support Vector Machine (SVM) classifier, and the KNN-based filtering algorithm. The parallelization of these algorithms shows promising results in terms of computational time, as mapping them over different cores yields a speedup of 2.69x when using 3 cores. Consequently, experimental results demonstrate that real-time processing of hyperspectral images is achievable.
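
    To put the reported speedup in context, the short sketch below computes the parallel efficiency and, under the assumption that Amdahl's law is a reasonable model for this dataflow mapping (which the abstract does not state), the implied serial fraction. Both function names are illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores

def amdahl_serial_fraction(speedup, cores):
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / n).

    Solving for f gives f = (n / S - 1) / (n - 1). This is only an
    estimate; it assumes the law applies to the measured system.
    """
    return (cores / speedup - 1) / (cores - 1)

efficiency = parallel_efficiency(2.69, 3)       # about 0.90
serial_part = amdahl_serial_fraction(2.69, 3)   # about 0.06
```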

  13. Spectral band selection for classification of soil organic matter content

    Science.gov (United States)

    Henderson, Tracey L.; Szilagyi, Andrea; Baumgardner, Marion F.; Chen, Chih-Chien Thomas; Landgrebe, David A.

    1989-01-01

    This paper describes the spectral-band-selection (SBS) algorithm of Chen and Landgrebe (1987, 1988, and 1989) and uses the algorithm to classify the organic matter content in the earth's surface soil. The effectiveness of the algorithm was evaluated comparing the results of classification of the soil organic matter using SBS bands with those obtained using Landsat MSS bands and TM bands, showing that the algorithm was successful in finding important spectral bands for classification of organic matter content. Using the calculated bands, the probabilities of correct classification for climate-stratified data were found to range from 0.910 to 0.980.

  14. A simple algorithm for the identification of clinical COPD phenotypes

    DEFF Research Database (Denmark)

    Burgel, Pierre-Régis; Paillasseur, Jean-Louis; Janssens, Wim

    2017-01-01

    This study aimed to identify simple rules for allocating chronic obstructive pulmonary disease (COPD) patients to clinical phenotypes identified by cluster analyses. Data from 2409 COPD patients of French/Belgian COPD cohorts were analysed using cluster analysis resulting in the identification...... of subgroups, for which clinical relevance was determined by comparing 3-year all-cause mortality. Classification and regression trees (CARTs) were used to develop an algorithm for allocating patients to these subgroups. This algorithm was tested in 3651 patients from the COPD Cohorts Collaborative...... International Assessment (3CIA) initiative. Cluster analysis identified five subgroups of COPD patients with different clinical characteristics (especially regarding severity of respiratory disease and the presence of cardiovascular comorbidities and diabetes). The CART-based algorithm indicated...

  15. Fault Tolerant Neural Network for ECG Signal Classification Systems

    Directory of Open Access Journals (Sweden)

    MERAH, M.

    2011-08-01

    Full Text Available The aim of this paper is to apply a new robust hardware Artificial Neural Network (ANN) for ECG classification systems. This ANN includes a penalization criterion which improves performance in terms of robustness. Specifically, in this method, the ANN weights are normalized using the auto-prune method. Simulations performed on the MIT-BIH ECG signals have shown that significant robustness improvements are obtained regarding potential hardware artificial neuron failures. Moreover, we show that the proposed design achieves better generalization performance compared to the standard back-propagation algorithm.

  16. Algorithmic mathematics

    CERN Document Server

    Hougardy, Stefan

    2016-01-01

    Algorithms play an increasingly important role in nearly all fields of mathematics. This book allows readers to develop basic mathematical abilities, in particular those concerning the design and analysis of algorithms as well as their implementation. It presents not only fundamental algorithms like the sieve of Eratosthenes, the Euclidean algorithm, sorting algorithms, algorithms on graphs, and Gaussian elimination, but also discusses elementary data structures, basic graph theory, and numerical questions. In addition, it provides an introduction to programming and demonstrates in detail how to implement algorithms in C++. This textbook is suitable for students who are new to the subject and covers a basic mathematical lecture course, complementing traditional courses on analysis and linear algebra. Both authors have given this "Algorithmic Mathematics" course at the University of Bonn several times in recent years.

  17. DECISION LEVEL FUSION OF ORTHOPHOTO AND LIDAR DATA USING CONFUSION MATRIX INFORMATION FOR LNAD COVER CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    S. Daneshtalab

    2017-09-01

    Full Text Available Automatic urban object extraction from airborne remote sensing data is essential to process and efficiently interpret the vast amount of airborne imagery and Lidar data available today. The aim of this study is to propose a new approach for the integration of high-resolution aerial imagery and Lidar data to improve classification accuracy in complex urban areas. In the proposed method, first, each data set is classified separately using the Support Vector Machine algorithm. In this case, the extracted Normalized Digital Surface Model (nDSM) and pulse intensity are used in the classification of the Lidar data, and the three visible spectral bands (Red, Green, Blue) are considered as the feature vector for the orthoimage classification. Moreover, by combining the extracted features of the image and Lidar data, another classification is performed using all the features. The outputs of these classifications are integrated in a decision level fusion system according to their confusion matrices to find the final classification result. The proposed method was evaluated using an urban area of Zeebruges, Belgium. The obtained results showed several advantages of image fusion with respect to a single shot dataset. With the capabilities of the proposed decision level fusion method, most of the object extraction difficulties and uncertainties were reduced, and the overall accuracy and the kappa values were improved by 7% and 10%, respectively.
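
    One simple way to fuse classifier outputs using their confusion matrices is to weight each classifier's vote by its per-class precision. The sketch below illustrates that idea only; the abstract does not specify the actual fusion rule, so the weighting scheme and the function names here are assumptions.

```python
def class_precisions(confusion):
    """Per-class precision from a confusion matrix.

    confusion[t][p] is the number of samples of true class t that
    were predicted as class p.
    """
    n = len(confusion)
    precisions = []
    for p in range(n):
        column_total = sum(confusion[t][p] for t in range(n))
        precisions.append(confusion[p][p] / column_total if column_total else 0.0)
    return precisions

def fuse(predictions, confusions):
    """Fuse one predicted label per classifier into a single label.

    Each classifier votes for its predicted class with a weight equal
    to its precision for that class (an assumed weighting rule).
    """
    n_classes = len(confusions[0])
    scores = [0.0] * n_classes
    for label, confusion in zip(predictions, confusions):
        scores[label] += class_precisions(confusion)[label]
    return max(range(n_classes), key=scores.__getitem__)
```

    With this rule, a classifier that is usually right when it predicts a given class can outvote a less reliable one, which a plain majority vote cannot express.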

  18. Total algorithms

    NARCIS (Netherlands)

    Tel, G.

    We define the notion of total algorithms for networks of processes. A total algorithm enforces that a "decision" is taken by a subset of the processes, and that participation of all processes is required to reach this decision. Total algorithms are an important building block in the design of

  19. Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network

    Science.gov (United States)

    Pratiwi, A. B.; Damayanti, A.; Miswanto

    2017-07-01

    Epilepsy is a condition that affects the brain and causes repeated seizures. These seizures are episodes that can vary from brief and nearly undetectable to long periods of vigorous shaking or brain contractions. Epilepsy can often be confirmed with electroencephalography (EEG). Neural networks have been used in biomedical signal analysis and have successfully classified biomedical signals such as EEG signals. In this paper, a hybrid of cuckoo search and a neural network is used to recognize EEG signals for epilepsy classification. The weights of the multilayer perceptron are optimized by the cuckoo search algorithm based on its error. The aim of this method is to make the network reach a local or global optimum faster, so that the classification process becomes more accurate. Based on comparison with the traditional multilayer perceptron, the hybrid cuckoo search and multilayer perceptron provides better performance in terms of error convergence and accuracy. The proposed method gives an MSE of 0.001 and an accuracy of 90.0%.

  20. Classifier fusion for VoIP attacks classification

    Science.gov (United States)

    Safarik, Jakub; Rezac, Filip

    2017-05-01

    SIP is one of the most successful protocols in the field of IP telephony communication. It establishes and manages VoIP calls. As the number of SIP implementations rises, we can expect a higher number of attacks on the communication system in the near future. This work aims at malicious SIP traffic classification. A number of machine learning algorithms have been developed for attack classification. The paper presents a comparison of current research and the use of a classifier fusion method leading to a potential decrease in classification error rate. The use of classifier combination makes for a more robust solution without the difficulties that may affect single algorithms. Different voting schemes, combination rules, and classifiers are discussed to improve the overall performance. All classifiers have been trained on real malicious traffic. The concept of traffic monitoring depends on a network of honeypot nodes. These honeypots run in several networks spread across different locations. Separation of the honeypots allows us to gain independent and trustworthy attack information.

  1. Fuzzy One-Class Classification Model Using Contamination Neighborhoods

    Directory of Open Access Journals (Sweden)

    Lev V. Utkin

    2012-01-01

    Full Text Available A fuzzy classification model is studied in the paper. It is based on the contaminated (robust) model which produces fuzzy expected risk measures characterizing classification errors. Optimal classification parameters of the models are derived by minimizing the fuzzy expected risk. It is shown that an algorithm for computing the classification parameters is reduced to a set of standard support vector machine tasks with weighted data points. Experimental results with synthetic data illustrate the proposed fuzzy model.

  2. Management of vascular anomalies: Review of institutional management algorithm

    Directory of Open Access Journals (Sweden)

    Lalit K Makhija

    2017-01-01

    Full Text Available Introduction: Vascular anomalies are congenital lesions broadly categorised into vascular tumours (haemangiomas) and vascular dysmorphogenesis (vascular malformations). The management of these difficult problems has lately been simplified by the biological classification and a multidisciplinary approach. To standardise the treatment protocol, an algorithm has been devised. The study aims to validate the algorithm in terms of its utility and presents our experience in managing vascular anomalies. Materials and Methods: The biological classification of Mulliken and Glowacki was followed. A detailed algorithm for the management of vascular anomalies has been devised in the department. The protocol has been practised by us for the past two decades. Data regarding the types of lesions and the treatment modality used were maintained. Results and Conclusion: This study was conducted from 2002 to 2012. A total of 784 cases of vascular anomalies were included in the study, of which 196 were haemangiomas and 588 were vascular malformations. The algorithmic approach has brought an element of much-needed objectivity to the management of vascular anomalies. This has helped us, to a certain extent, to define the management of a particular lesion considering its pathology, its extent, and the aesthetic and functional consequences of ablation.

  3. Designing Artificial Neural Networks Using Particle Swarm Optimization Algorithms.

    Science.gov (United States)

    Garro, Beatriz A; Vázquez, Roberto A

    2015-01-01

    Artificial Neural Network (ANN) design is a complex task because its performance depends on the architecture, the selected transfer function, and the learning algorithm used to train the set of synaptic weights. In this paper we present a methodology that automatically designs an ANN using particle swarm optimization algorithms such as Basic Particle Swarm Optimization (PSO), Second Generation of Particle Swarm Optimization (SGPSO), and a New Model of PSO called NMPSO. The aim of these algorithms is to evolve, at the same time, the three principal components of an ANN: the set of synaptic weights, the connections or architecture, and the transfer functions for each neuron. Eight different fitness functions were proposed to evaluate the fitness of each solution and find the best design. These functions are based on the mean square error (MSE) and the classification error (CER) and implement a strategy to avoid overtraining and to reduce the number of connections in the ANN. In addition, the ANN designed with the proposed methodology is compared with those designed manually using the well-known Back-Propagation and Levenberg-Marquardt Learning Algorithms. Finally, the accuracy of the method is tested with different nonlinear pattern classification problems.
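
    For readers unfamiliar with the optimizer, the sketch below shows a basic global-best PSO loop minimizing an arbitrary function. It is not the authors' SGPSO or NMPSO variant, and it does not encode ANN weights, architecture, or transfer functions; the parameter values and the function name `pso_minimize` are assumptions chosen for illustration.

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=200,
                 w=0.73, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Basic global-best particle swarm optimization (minimization)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position per particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

    In an ANN design setting, `f` would be a fitness function built from quantities such as the MSE and the classification error on training data, and each particle would encode a candidate network.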

  4. Arabic text classification using Polynomial Networks

    Directory of Open Access Journals (Sweden)

    Mayy M. Al-Tahrawi

    2015-10-01

    Full Text Available In this paper, an Arabic statistical learning-based text classification system has been developed using Polynomial Neural Networks. Polynomial Networks have recently been applied to English text classification, but they have never been used for Arabic text classification. In this research, we investigate the performance of Polynomial Networks in classifying Arabic texts. Experiments are conducted on a dataset widely used in Arabic text classification: the Al-Jazeera News dataset. We chose this dataset to enable direct comparisons of the performance of the Polynomial Networks classifier with that of other well-known classifiers reported on this dataset in the Arabic text classification literature. The experimental results show that the Polynomial Networks classifier is competitive with state-of-the-art algorithms in the field of Arabic text classification.

  5. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) classification method is a good classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, uses the support vector machine method to classify Chinese text, and aims to combine academic research with practical application.

  6. Seismic texture classification. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Vinther, R.

    1997-12-31

    The seismic texture classification method is a seismic attribute that can both recognize the general reflectivity styles and locate variations from these. The seismic texture classification performs a statistical analysis of the seismic section (or volume), aiming at describing the reflectivity. Based on a set of reference reflectivities, the seismic textures are classified. The result of the seismic texture classification is a display of seismic texture categories showing both the styles of reflectivity from the reference set and interpolations and extrapolations from these. The display is interpreted as statistical variations in the seismic data. The seismic texture classification is applied to seismic sections and volumes from the Danish North Sea representing both horizontal stratifications and salt diapirs. The attribute succeeded in recognizing both the general structure of successions and variations from these. Also, the seismic texture classification is not only able to display variations in prospective areas (1-7 sec. TWT) but can also be applied to deep seismic sections. The seismic texture classification is tested on a deep reflection seismic section (13-18 sec. TWT) from the Baltic Sea. Applied to this section, the seismic texture classification succeeded in locating the Moho, which could not be located using conventional interpretation tools. The seismic texture classification is a seismic attribute which can display general reflectivity styles and deviations from these and enhance variations not found by conventional interpretation tools. (LN)

  7. Document Organization Using Kohonen's Algorithm.

    Science.gov (United States)

    Guerrero Bote, Vicente P.; Moya Anegon, Felix de; Herrero Solana, Victor

    2002-01-01

    Discussion of the classification of documents from bibliographic databases focuses on a method of vectorizing reference documents from LISA (Library and Information Science Abstracts) which permits their topological organization using Kohonen's algorithm. Analyzes possibilities of this type of neural network with respect to the development of…

  8. HIV classification using coalescent theory

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Ming [Los Alamos National Laboratory; Leitner, Thomas K [Los Alamos National Laboratory; Korber, Bette T [Los Alamos National Laboratory

    2008-01-01

    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1. Hence, their quality depends heavily on this system. Due to the history of the creation of the current HIV-1 nomenclature, it contains inconsistencies such as the following. The phylogenetic distance between subtypes B and D is remarkably small compared with other pairs of subtypes; in fact, it is more like the distance between a pair of sub-subtypes (Robertson et al., 2000). Subtypes E and I no longer exist, since they were discovered to be composed of recombinants (Robertson et al., 2000). It is currently discussed whether, instead of CRF02 being a recombinant of subtypes A and G, subtype G should be designated as a circulating recombinant form (CRF) and CRF02 as a subtype (Abecasis et al., 2007). There are 8 complete and over 400 partial HIV genomes in the LANL database which belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary, like all complex classification systems that were created manually. It is therefore desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV, but applies to all fast mutating and recombining viruses. Our work addresses the simpler subproblem of scoring classifications of given input sequences of some virus species (a classification denotes a partition of the input sequences into several subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARGs) of the input sequences under restrictions determined by the given classification. These restrictions are imposed in order to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov Chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification. To our

  9. Fractographic classification in metallic materials by using 3D processing and computer vision techniques

    Directory of Open Access Journals (Sweden)

    Maria Ximena Bastidas-Rodríguez

    2016-09-01

    Full Text Available Failure analysis aims at collecting information about how and why a failure is produced. The first step in this process is a visual inspection of the flaw surface that will reveal the features, marks, and texture which characterize each type of fracture. This is generally carried out by inexperienced personnel who usually lack the knowledge to do it. This paper proposes a classification method for three kinds of fractures in crystalline materials: brittle, fatigue, and ductile. The method uses 3D vision, and it is expected to support failure analysis. The features used in this work were (i) Haralick’s features and (ii) the fractal dimension. These features were applied to 3D images obtained from a Zeiss LSM 700 confocal laser scanning microscope. For the classification, we evaluated two classifiers: Artificial Neural Networks and Support Vector Machine. The performance evaluation was made by extracting four marginal relations from the confusion matrix (accuracy, sensitivity, specificity, and precision), plus three evaluation methods: the Receiver Operating Characteristic space, the Individual Classification Success Index, and Jaccard’s coefficient. Although the classification percentage obtained by an expert is better than the one obtained with the algorithm, the algorithm achieves a classification percentage near or exceeding 60% accuracy for the analyzed failure modes. The results presented here provide a good basis for future research on texture analysis using 3D data.
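
    The confusion-matrix measures named above follow standard definitions, which can be sketched as below for a multi-class problem treated one class at a time (one-vs-rest). The helper names and the example matrix in the usage note are illustrative and not taken from the paper.

```python
def one_vs_rest(confusion, k):
    """Collapse a multi-class confusion matrix to TP, FP, FN, TN for class k.

    confusion[t][p] is the count of true class t predicted as class p.
    """
    n = len(confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(confusion[t][k] for t in range(n)) - tp
    tn = sum(map(sum, confusion)) - tp - fn - fp
    return tp, fp, fn, tn

def binary_metrics(tp, fp, fn, tn):
    """Standard measures derived from binary confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),
        "jaccard": tp / (tp + fp + fn),  # Jaccard coefficient
    }
```

    For three fracture classes (brittle, fatigue, ductile), calling `binary_metrics(*one_vs_rest(cm, k))` for k = 0, 1, 2 gives one set of measures per failure mode.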

  10. Lossless Compression of Classification-Map Data

    Science.gov (United States)

    Hua, Xie; Klimesh, Matthew

    2009-01-01

    A lossless image-data-compression algorithm intended specifically for application to classification-map data is based on prediction, context modeling, and entropy coding. The algorithm was formulated, in consideration of the differences between classification maps and ordinary images of natural scenes, so as to be capable of compressing classification-map data more effectively than do general-purpose image-data-compression algorithms. Classification maps are typically generated from remote-sensing images acquired by instruments aboard aircraft and spacecraft. A classification map is a synthetic image that summarizes information derived from one or more original remote-sensing image(s) of a scene. The value assigned to each pixel in such a map is the index of a class that represents some type of content deduced from the original image data (for example, a type of vegetation, a mineral, or a body of water) at the corresponding location in the scene. When classification maps are generated onboard the aircraft or spacecraft, it is desirable to compress the classification-map data in order to reduce the volume of data that must be transmitted to a ground station.
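
    Why context modeling helps on classification maps can be seen with a small estimate of ideal code lengths. The sketch below compares a zero-order entropy estimate with one conditioned on the left neighbour of each pixel. It is not the algorithm described in this record; the single-neighbour context and the function names are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits per symbol of an empirical distribution."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def zero_order_bits(rows):
    """Ideal code length if every pixel is coded independently."""
    counts = Counter(v for row in rows for v in row)
    return sum(counts.values()) * entropy(counts)

def context_bits(rows):
    """Ideal code length when each pixel is coded with statistics
    conditioned on its left neighbour (None at the start of a row)."""
    by_context = {}
    for row in rows:
        left = None
        for v in row:
            by_context.setdefault(left, Counter())[v] += 1
            left = v
    return sum(sum(c.values()) * entropy(c) for c in by_context.values())
```

    Because class indices form large uniform regions, the neighbour usually predicts the pixel, so the context-conditioned estimate is typically far below the zero-order one; an entropy coder driven by such context statistics can approach that length.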

  11. Evaluation of Multiple Kernel Learning Algorithms for Crop Mapping Using Satellite Image Time-Series Data

    Science.gov (United States)

    Niazmardi, S.; Safari, A.; Homayouni, S.

    2017-09-01

    Crop mapping through classification of Satellite Image Time-Series (SITS) data can provide very valuable information for several agricultural applications, such as crop monitoring, yield estimation, and crop inventory. However, SITS data classification is not straightforward, because different images of a SITS data set carry different levels of information regarding the classification problem. Moreover, SITS data are four-dimensional and cannot be classified using conventional classification algorithms. To address these issues, in this paper we present a classification strategy based on Multiple Kernel Learning (MKL) algorithms for SITS data classification. In this strategy, different kernels are first constructed from different images of the SITS data and then combined into a composite kernel using the MKL algorithms. The composite kernel, once constructed, can be used for the classification of the data using kernel-based classification algorithms. We compared the computational time and the classification performance of the proposed classification strategy using different MKL algorithms for the purpose of crop mapping. The considered MKL algorithms are MKL-Sum, SimpleMKL, LPMKL and Group-Lasso MKL. The experimental tests of the proposed strategy on two SITS data sets, acquired by SPOT satellite sensors, showed that this strategy was able to provide better performance than the standard classification algorithm. The results also showed that the optimization method of the MKL algorithm used affects both the computational time and the classification accuracy of this strategy.
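
    The composite-kernel idea can be sketched as follows: one kernel matrix is built per acquisition date and the matrices are combined as a weighted sum. The example assumes that MKL-Sum denotes an equally weighted combination and uses an RBF kernel; both choices, and the function names, are assumptions rather than details given in the abstract.

```python
import math

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix for a list of feature vectors from one image."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in X] for x in X]

def combine_kernels(kernels, weights=None):
    """Weighted sum of kernel matrices; equal weights if none are given.

    Non-negative weights keep the result a valid (positive
    semi-definite) kernel. Learned weights from another MKL
    algorithm could be passed in instead.
    """
    m = len(kernels)
    if weights is None:
        weights = [1.0 / m] * m
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]
```

    The resulting matrix can be handed to any kernel-based classifier that accepts a precomputed kernel.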

  12. Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Yidong Tang

    2016-01-01

    Full Text Available The sparse representation based classifier (SRC) and its kernel version (KSRC) have been employed for hyperspectral image (HSI) classification. However, the state-of-the-art SRC often aims at extended surface objects with linear mixture in smooth scenes and assumes that the number of classes is given. Considering small targets with complex backgrounds, a sparse representation based binary hypothesis (SRBBH) model is established in this paper. In this model, a query pixel is represented in two ways: by a background dictionary and by a union dictionary. The background dictionary is composed of samples selected from the local dual concentric window centered at the query pixel. Thus, for each pixel the classification issue becomes an adaptive multiclass classification problem, where only the number of desired classes is required. Furthermore, the kernel method is employed to improve the interclass separability. In kernel space, the coding vector is obtained by using the kernel-based orthogonal matching pursuit (KOMP) algorithm. Then the query pixel can be labeled by the characteristics of the coding vectors. Instead of directly using the reconstruction residuals, the different impacts the background dictionary and union dictionary have on reconstruction are used for validation and classification. This enhances the discrimination and hence improves the performance.

  13. Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment

    Directory of Open Access Journals (Sweden)

    Rashmi Mukherjee

    2014-01-01

    Full Text Available The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images captured by a normal digital camera were first transformed into HSI (hue, saturation, and intensity) color space, and subsequently the “S” component of the HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with the highest kappa statistic value (0.793).
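
    The saturation channel mentioned in this abstract can be illustrated with the textbook RGB-to-HSI relation S = 1 - 3·min(R, G, B)/(R + G + B). The sketch below assumes that standard formula; the abstract does not say which exact conversion the authors used, and the function name is hypothetical.

```python
def hsi_saturation(r, g, b):
    """Saturation of the HSI model for one RGB pixel (any consistent scale)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: saturation is undefined, use 0 by convention
    return 1.0 - 3.0 * min(r, g, b) / total

# A pure red pixel is fully saturated, a gray pixel is not saturated at all.
red_saturation = hsi_saturation(255, 0, 0)
gray_saturation = hsi_saturation(128, 128, 128)
```

    Applying this function to every pixel yields the single-channel "S" image on which a thresholding-based segmentation could then operate.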

  14. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. Experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed in Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of images of cytological specimen (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification usually is based on morphometric features of nuclei. Therefore, training and validation patches were selected using Support Vector Machine (SVM) so that suitable amount of cell material was depicted. Neural classifiers were tuned using GPU accelerated implementation of gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by GoogLeNet model. We observed that more misclassified patches belong to malignant cases.
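
    Two mechanical steps in this abstract, cutting a very large image into 256 × 256 patches and computing accuracy as a percentage of correctly classified validation patches, can be sketched as below. The sketch assumes non-overlapping patches and discards partial patches at the border; the abstract does not state either choice, and the function names are hypothetical.

```python
def patch_origins(width, height, patch=256):
    """Top-left corners of non-overlapping patches that fit fully in the image."""
    return [
        (x, y)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]

def accuracy_percent(true_labels, predicted_labels):
    """Percentage of validation patches whose predicted label is correct."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Toy usage with hypothetical labels ("b" = benign, "m" = malignant).
origins = patch_origins(1024, 512)
acc = accuracy_percent(["b", "m", "m", "b"], ["b", "m", "b", "b"])
```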

  15. A quick survey of text categorization algorithms

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2007-12-01

    Full Text Available This paper contains an overview of basic formulations and approaches to text classification. This paper surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision rules, on-line learning, linear classifier, Rocchio’s algorithm, k Nearest Neighbor (kNN), Support Vector Machines (SVM).

  16. Parallel Algorithms for Groebner-Basis Reduction

    Science.gov (United States)

    1987-09-25

    Parallel Algorithms for Groebner-Basis Reduction. Technical Report, Productivity Engineering in the UNIX Environment.

  17. Content Abstract Classification Using Naive Bayes

    Science.gov (United States)

    Latif, Syukriyanto; Suwardoyo, Untung; Aldrin Wihelmus Sanadi, Edwin

    2018-03-01

    This study aims to classify abstract content based on the most frequently used words in the abstracts of English-language journals. The research uses a text mining system that extracts text data to search for information in a set of documents. The data set consists of 120 abstracts downloaded from www.computer.org, grouped into three categories: DM (Data Mining), ITS (Intelligent Transport System) and MM (Multimedia). The system uses a naive Bayes algorithm to classify journal abstracts, with a feature selection process that uses term weighting to give a weight to each word. Dimension reduction is applied to remove words that rarely appear in each document, with the reduction parameter tested from 10% to 90% of 5.344 words. The performance of the classification system is tested using a confusion matrix based on comparative test data and test data. The results showed that the best classification results were obtained with 75% of the total data used for training and 25% for testing. Accuracy rates for the DM, ITS and MM categories were 100%, 100% and 86%, respectively, with a dimension reduction parameter of 30% and a learning rate between 0.1 and 0.5.
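
    A naive Bayes text classifier of the kind this abstract describes can be sketched in a few lines. This is a minimal multinomial variant with Laplace smoothing on raw word counts; it assumes whitespace tokenization and does not reproduce the term weighting or dimension reduction steps of the study. The training sentences are invented for illustration, and only the category labels (DM, ITS, MM) come from the abstract.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (label, text). Returns class counts, word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-posterior under Laplace smoothing."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set, one document per category.
model = train_nb([
    ("DM", "mining frequent patterns clustering data"),
    ("ITS", "traffic vehicle routing road sensors"),
    ("MM", "video audio image compression streaming"),
])
label = predict_nb(model, "clustering of data patterns")
```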

  18. Semantic Document Image Classification Based on Valuable Text Pattern

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2011-01-01

    Full Text Available Knowledge extraction from detected document images is a complex problem in the field of information technology. This problem becomes more intricate when we know that only a negligible percentage of the detected document images are valuable. In this paper, a segmentation-based classification algorithm is used to analyze the document image. In this algorithm, regions of the image are detected using a two-stage segmentation approach and then classified into document and non-document (pure region) regions in a hierarchical classification. A novel definition of "valuable" is proposed to classify document images into valuable or invaluable categories. The proposed algorithm is evaluated on a database consisting of document and non-document images collected from the Internet. Experimental results show the efficiency of the proposed algorithm in semantic document image classification. The proposed algorithm provides an accuracy rate of 98.8% for the valuable and invaluable document image classification problem.

  19. Classification of ASKAP Vast Radio Light Curves

    Science.gov (United States)

    Rebbapragada, Umaa; Lo, Kitty; Wagstaff, Kiri L.; Reed, Colorado; Murphy, Tara; Thompson, David R.

    2012-01-01

    The VAST survey is a wide-field survey that observes with unprecedented instrument sensitivity (0.5 mJy or lower) and repeat cadence (a goal of 5 seconds) that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Given the unprecedented observing characteristics of VAST, it is important to estimate source classification performance, and determine best practices prior to the launch of ASKAP's BETA in 2012. The goal of this study is to identify light curve characterization and classification algorithms that are best suited for archival VAST light curve classification. We perform our experiments on light curve simulations of eight source types and achieve best case performance of approximately 90% accuracy. We note that classification performance is most influenced by light curve characterization rather than classifier algorithm.

  20. Classification Using Markov Blanket for Feature Selection

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Luo, Jian

    2009-01-01

    Selecting relevant features is in demand when a large data set is of interest in a classification task. It produces a tractable number of features that are sufficient and possibly improve the classification performance. This paper studies a statistical method of Markov blanket induction algorithm...... for filtering features and then applies a classifier using the Markov blanket predictors. The Markov blanket contains a minimal subset of relevant features that yields optimal classification performance. We experimentally demonstrate the improved performance of several classifiers using a Markov blanket...... induction as a feature selection method. In addition, we point out an important assumption behind the Markov blanket induction algorithm and show its effect on the classification performance....

  1. Classification of hand eczema

    DEFF Research Database (Denmark)

    Agner, T; Aalto-Korte, K; Andersen, K E

    2015-01-01

    BACKGROUND: Classification of hand eczema (HE) is mandatory in epidemiological and clinical studies, and also important in clinical work. OBJECTIVES: The aim was to test a recently proposed classification system of HE in clinical practice in a prospective multicentre study. METHODS: Patients were...... recruited from nine different tertiary referral centres. All patients underwent examination by specialists in dermatology and were checked using relevant allergy testing. Patients were classified into one of the six diagnostic subgroups of HE: allergic contact dermatitis, irritant contact dermatitis, atopic...... system investigated in the present study was useful, being able to give an appropriate main diagnosis for 89% of HE patients, and for another 7% when using two main diagnoses. The fact that more than half of the patients had one or more additional diagnoses illustrates that HE is a multifactorial disease....

  2. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  3. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
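
    The AUC metric used in this abstract can be computed directly from its rank interpretation: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below assumes binary labels coded as 1 and 0; the function name and the toy scores are hypothetical and unrelated to the challenge data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy usage: one of the four positive/negative pairs is ranked the wrong way round.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

    A value of 1 corresponds to the perfect classification mentioned in the abstract, and 0.5 to random ranking.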

  4. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    Energy Technology Data Exchange (ETDEWEB)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K. [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); McEwen, Jason D., E-mail: dr.michelle.lochner@gmail.com [Mullard Space Science Laboratory, University College London, Surrey RH5 6NT (United Kingdom)

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  5. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark also differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are treated as classes, and a classification algorithm is used to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  6. Supervised Classification Performance of Multispectral Images

    OpenAIRE

    Perumal, K.; Bhaskaran, R.

    2010-01-01

    Nowadays government and private agencies use remote sensing imagery for a wide range of applications, from military applications to farm development. The images may be panchromatic, multispectral, hyperspectral or even ultraspectral, amounting to terabytes of data. Remote sensing image classification is one of the most significant applications of remote sensing. A few image classification algorithms have proved to give good precision in classifying remote sensing data. But, of late, due to the ...

  7. A Confidence Paradigm for Classification Systems

    Science.gov (United States)

    2008-09-01

    methodology to determine how much confidence one should have in a classifier output. This research proposes a framework to determine the level of...theoretical framework that attempts to unite the viewpoints of the classification system developer (or engineer) and the classification system user (or...operating point. An algorithm is developed that minimizes a “confidence” measure called Binned Error in the Posterior (BEP). Then, we prove that training a

  8. Take AIM and Keep Your Students Engaged

    Science.gov (United States)

    Nash, Catherine

    2014-01-01

    This paper outlines the benefits to distance education teachers of formatting a weekly online newsletter in accordance with motivational learning theory. It reflects on the delivery of weekly AIM newsletters to undergraduate economics students at the Open Polytechnic of New Zealand via Moodle. The acronym, AIM, stands for Academic content,…

  9. Discrimination and the aim of proportional representation

    DEFF Research Database (Denmark)

    Lippert-Rasmussen, Kasper

    2008-01-01

    Many organizations, companies, and so on are committed to certain representational aims as regards the composition of their workforce. One motivation for such aims is the assumption that numerical underrepresentation of groups manifests discrimination against them. In this article, I articulate r...

  10. Partitional clustering algorithms

    CERN Document Server

    2015-01-01

    This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...

  11. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary...... classification research focus on contextual information as the guide for the design and construction of classification schemes....

  12. Classification of the web

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...... that call for inquiries into the theoretical foundation of bibliographic classification theory....

  13. The Classification of Tongue Colors with Standardized Acquisition and ICC Profile Correction in Traditional Chinese Medicine.

    Science.gov (United States)

    Qi, Zhen; Tu, Li-Ping; Chen, Jing-Bo; Hu, Xiao-Juan; Xu, Jia-Tuo; Zhang, Zhi-Feng

    2016-01-01

    Background and Goal. The application of digital image processing techniques and machine learning methods in tongue image classification in Traditional Chinese Medicine (TCM) has been widely studied. However, it is difficult for the outcomes to generalize because of a lack of color reproducibility and image standardization. Our study aims at the exploration of tongue color classification with a standardized tongue image acquisition process and color correction. Methods. Three traditional Chinese medical experts are chosen to identify the selected tongue pictures taken by the TDA-1 tongue imaging device in TIFF format through ICC profile correction. Then we compare the mean L*a*b* values of different tongue colors and evaluate the effect of the tongue color classification by machine learning methods. Results. The L*a*b* values of the five tongue colors are statistically different. The random forest method performs better than SVM in classification. The SMOTE algorithm can increase classification accuracy by solving the imbalance of the varied color samples. Conclusions. On the premise of standardized tongue acquisition and color reproduction, preliminary objectification of tongue color classification in Traditional Chinese Medicine (TCM) is feasible.
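
    The oversampling idea behind SMOTE, which this abstract credits with improving accuracy on imbalanced color classes, can be sketched as follows: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified illustration, not the study's pipeline; the function name, the value of `k`, and the L*a*b* triples are hypothetical.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (squared Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical L*a*b* values for an under-represented tongue color class.
minority_class = [(52.0, 21.0, 9.0), (55.0, 24.0, 11.0), (50.0, 19.0, 8.0)]
new_samples = smote(minority_class, n_new=4)
```

    Because every synthetic point lies between two real minority samples, the new samples stay inside the region the minority class already occupies rather than duplicating existing points.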

  14. Digitisation of films and texture analysis for digital classification of pulmonary opacities

    International Nuclear Information System (INIS)

    Desaga, J.F.; Dengler, J.; Wolf, T.; Engelmann, U.; Scheppelmann, D.; Meinzer, H.P.

    1988-01-01

    The study aimed at evaluating the effect of different methods of digitisation of radiographic films on the digital classification of pulmonary opacities. Test sets from the standard of the International Labour Office (ILO) Classification of Radiographs of Pneumoconiosis were prepared by film digitisation using a scanning microdensitometer or a video digitiser based on a personal computer equipped with a real time digitiser board and a vidicon or a Charge Coupled Device (CCD) camera. Seven different algorithms were used for texture analysis, resulting in 16 texture parameters for each region. All methods used for texture analysis were independent of the mean grey value level and the size of the image analysed. Classification was performed by discriminant analysis using the classes from the ILO classification. A hit ratio of at least 85% was achieved for digitisation by the scanner or the vidicon, while the corresponding results of the CCD camera were significantly less good. Classification by texture analysis of opacities of chest X-rays of pneumoconiosis digitised by a personal computer based video digitiser and a vidicon is of equal quality compared to digitisation by a scanning microdensitometer. Correct classification of 90% was achieved via the described statistical approach. (orig.) [de]

  15. Laser Range Profiling for Active Protection System Target Classification and Aim-Point Selection

    National Research Council Canada - National Science Library

    Jones, Michael

    2004-01-01

    ...) is currently developing the Close-In Active Protection System (CIAPS). The distinguishing capability of CIAPS is its ability to provide self-protection against missiles and projectiles launched at close range...

  16. Laser Range Profiling for Active Protection System Target Classification and Aim-Point Selection

    National Research Council Canada - National Science Library

    Jones, Michael

    2004-01-01

    .... The attractiveness of smaller, faster interceptors precipitated the investigation of a laser radar sensor augmentation for CIAPS that could quickly resolve the range profile of an incoming projectile...

  17. Discrimination and the aim of proportional representation

    DEFF Research Database (Denmark)

    Lippert-Rasmussen, Kasper

    2008-01-01

    Many organizations, companies, and so on are committed to certain representational aims as regards the composition of their workforce. One motivation for such aims is the assumption that numerical underrepresentation of groups manifests discrimination against them. In this article, I articulate...... representational aims in a way that best captures this rationale. My main claim is that the achievement of such representational aims is reducible to the elimination of the effects of wrongful discrimination on individuals and that this very important concern is, in principle, compatible with the representation...... of discrimination against numerically overrepresented groups, or overlook the innocently different ambitions of some numerically underrepresented groups. In relation to the latter point, I appeal to the fact that many luck egalitarians think justice should be ambition sensitive (but endowment insensitive). Also...

  18. AIM: An Integrated Approach to Organizational Improvement

    Directory of Open Access Journals (Sweden)

    Ronald A. Styron, Jr.

    2016-02-01

    Full Text Available This concept paper is based on the new problem-solving model of Blended Leadership called Alloy Improvement Model (AIM. This model consists of an integration of change theory, leadership theory, and democratic principles and practices to form a comprehensive problem-solving strategy for organizational leaders. The utilization of AIM will assist leaders in moving from problems to solutions while engaging stakeholders in a comprehensive, efficient, inclusive, informative, integrated and transparent process.

  19. Sow-activity classification from acceleration patterns

    DEFF Research Database (Denmark)

    Escalante, Hugo Jair; Rodriguez, Sara V.; Cordero, Jorge

    2013-01-01

    sow-activity classification can be approached with standard machine learning methods for pattern classification. Individual predictions for elements of times series of arbitrary length are combined to classify it as a whole. An extensive comparison of representative learning algorithms, including......This paper describes a supervised learning approach to sow-activity classification from accelerometer measurements. In the proposed methodology, pairs of accelerometer measurements and activity types are considered as labeled instances of a usual supervised classification task. Under this scenario...... neural networks, support vector machines, and ensemble methods, is presented. Experimental results are reported using a data set for sow-activity classification collected in a real production herd. The data set, which has been widely used in related works, includes measurements from active (Feeding...

  20. Clustering and classification of email contents

    Directory of Open Access Journals (Sweden)

    Izzat Alsmadi

    2015-01-01

    Full Text Available Information users depend heavily on email systems as one of the major sources of communication. Their importance and usage are continuously growing despite the evolution of mobile applications, social networks, etc. Emails are used on both the personal and professional levels. They can be considered as official documents in communication among users. Email data mining and analysis can be conducted for several purposes, such as spam detection and classification, subject classification, etc. In this paper, a large set of personal emails is used for the purpose of folder and subject classification. Algorithms are developed to perform clustering and classification for this large text collection. Classification based on NGram is shown to be the best for such a large text collection, especially as the text is bilingual (i.e. with English and Arabic content).

  1. Classification of line features from remote sensing data

    OpenAIRE

    Kolankiewiczová, Soňa

    2009-01-01

    This work deals with object-based classification of high resolution data. The aim of the thesis is to develop an acceptable classification process for linear features (roads and railways) from high-resolution satellite images. The first part shows different approaches to linear feature classification and compares theoretical differences between an object-oriented and a pixel-based classification. Linear feature classification was created in the second part. The high-resolution...

  2. Hazard classification methodology

    International Nuclear Information System (INIS)

    Brereton, S.J.

    1996-01-01

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility.

  3. An edit script for taxonomic classifications

    Directory of Open Access Journals (Sweden)

    Valiente Gabriel

    2005-08-01

    Full Text Available Abstract Background The NCBI taxonomy provides one of the most powerful ways to navigate sequence data bases but currently users are forced to formulate queries according to a single taxonomic classification. Given that there is not universal agreement on the classification of organisms, providing a single classification places constraints on the questions biologists can ask. However, maintaining multiple classifications is burdensome in the face of a constantly growing NCBI classification. Results In this paper, we present a solution to the problem of generating modifications of the NCBI taxonomy, based on the computation of an edit script that summarises the differences between two classification trees. Our algorithms find the shortest possible edit script based on the identification of all shared subtrees, and only take time quasi linear in the size of the trees because classification trees have unique node labels. Conclusion These algorithms have been recently implemented, and the software is freely available for download from http://darwin.zoology.gla.ac.uk/~rpage/forest/.
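
    The role that unique node labels play in this abstract can be illustrated with a much simpler comparison than the shortest edit script the authors compute. The sketch below assumes each classification is given as a child-to-parent mapping with unique labels and merely lists deleted, inserted, and re-parented nodes using dictionary lookups; it makes no claim of minimality, and the function name and taxa are hypothetical.

```python
def tree_diff(old_parent, new_parent):
    """List node-level differences between two classifications with unique labels.

    Each argument maps a node label to its parent label (None for the root).
    """
    deleted = sorted(n for n in old_parent if n not in new_parent)
    inserted = sorted(n for n in new_parent if n not in old_parent)
    moved = sorted(
        n for n in old_parent
        if n in new_parent and old_parent[n] != new_parent[n]
    )
    script = [("delete", n) for n in deleted]
    script += [("insert", n, new_parent[n]) for n in inserted]
    script += [("move", n, new_parent[n]) for n in moved]
    return script

# Hypothetical toy classifications: one taxon is re-parented, one added, one removed.
old = {"root": None, "A": "root", "B": "root", "a1": "A", "b1": "B"}
new = {"root": None, "A": "root", "B": "root", "a1": "B", "c1": "A"}
script = tree_diff(old, new)
```

    Because labels are unique, every node is matched by a single dictionary lookup, which is the property that lets the comparison avoid any search over candidate node correspondences.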

  4. Combined genetic and splicing analysis of BRCA1 c.[594-2A>C; 641A>G] highlights the relevance of naturally occurring in-frame transcripts for developing disease gene variant classification algorithms

    OpenAIRE

    de la Hoya, Miguel; Soukarieh, Omar; López-Perolio, Irene; Vega, Ana; Walker, Logan C.; van Ierland, Yvette; Baralle, Diana; Santamariña, Marta; Lattimore, Vanessa; Wijnen, Juul; Whiley, Philip; Blanco, Ana; Raponi, Michela; Hauke, Jan; Wappenschmidt, Barbara

    2016-01-01

    A recent analysis using family history weighting and co-observation classification modeling indicated that BRCA1 c.594-2A > C (IVS9-2A > C), previously described to cause exon 10 skipping (a truncating alteration), displays characteristics inconsistent with those of a high risk pathogenic BRCA1 variant. We used large-scale genetic and clinical resources from the ENIGMA, CIMBA and BCAC consortia to assess pathogenicity of c.594-2A > C. The combined odds for causality considering case-control, ...

  5. An Efficient Optimization Method for Solving Unsupervised Data Classification Problems

    Directory of Open Access Journals (Sweden)

    Parvaneh Shabanzadeh

    2015-01-01

    Full Text Available Unsupervised data classification (or clustering analysis) is one of the most useful tools and a descriptive task in data mining that seeks to classify homogeneous groups of objects based on similarity and is used in many medical disciplines and various applications. In general, there is no single algorithm that is suitable for all types of data, conditions, and applications. Each algorithm has its own advantages, limitations, and deficiencies. Hence, research for novel and effective approaches for unsupervised data classification is still active. In this paper a heuristic algorithm, the Biogeography-Based Optimization (BBO) algorithm, which is inspired by the natural biogeographic distribution of different species, was adapted for data clustering problems by modifying its main operators. Similar to other population-based algorithms, the BBO algorithm starts with an initial population of candidate solutions to an optimization problem and an objective function that is calculated for them. To evaluate the performance of the proposed algorithm, an assessment was carried out on six medical and real life datasets, and the algorithm was compared with eight well known and recent unsupervised data classification algorithms. Numerical results demonstrate that the proposed evolutionary optimization algorithm is efficient for unsupervised data classification.

  6. Is Fitts' law continuous in discrete aiming?

    Directory of Open Access Journals (Sweden)

    Rita Sleimen-Malkoun

    Full Text Available The lawful continuous linear relation between movement time and task difficulty (i.e., index of difficulty; ID) in a goal-directed rapid aiming task (Fitts' law) has been recently challenged in reciprocal performance. Specifically, a discontinuity was observed at critical ID and was attributed to a transition between two distinct dynamic regimes that occurs with increasing difficulty. In the present paper, we show that such a discontinuity is also present in discrete aiming when ID is manipulated via target width (experiment 1) but not via target distance (experiment 2). Fitts' law's discontinuity appears, therefore, to be a suitable indicator of the underlying functional adaptations of the neuro-muscular-skeletal system to task properties/requirements, independently of reciprocal or discrete nature of the task. These findings open new perspectives to the study of dynamic regimes involved in discrete aiming and sensori-motor mechanisms underlying the speed-accuracy trade-off.
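
    As a rough illustration of the relation discussed in this record, the sketch below uses Fitts' original formulation, ID = log2(2D / W), and a linear movement-time model MT = a + b * ID. The abstract does not say which formulation the authors used, so the formula choice, the function names, and the coefficients a and b are assumptions made only for this example.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Index of difficulty in bits, using Fitts' original form log2(2D / W)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


def movement_time(distance: float, width: float, a: float, b: float) -> float:
    """Linear Fitts' law prediction MT = a + b * ID (a, b are fitted constants)."""
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # The same ID can be reached by changing width or distance,
    # which is the manipulation contrasted in the two experiments.
    print(index_of_difficulty(distance=16.0, width=1.0))
    print(index_of_difficulty(distance=8.0, width=0.5))
```

    Under a strictly continuous law, both manipulations above would predict the same movement time; the record reports that this holds for distance but not for width.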

  7. Formalized classification of European fen vegetation at the alliance level

    DEFF Research Database (Denmark)

    Peterka, Tomáš; Hájek, Michal; Jiroušek, Martin

    2017-01-01

    Aims Phytosociological classification of fen vegetation (Scheuchzerio palustris-Caricetea fuscae class) differs among European countries. Here we propose a unified vegetation classification of European fens at the alliance level, provide unequivocal assignment rules for individual vegetation plot...

  8. Classification of Multichannel ECG Signals Using a Cross-Distance Analysis

    National Research Council Canada - National Science Library

    Shahram, Morteza

    2001-01-01

    This paper presents a multi-stage algorithm for multi-channel ECG beat classification into normal and abnormal categories using a sequential beat clustering and a cross- distance analysis algorithm...

  9. Asynchronous data-driven classification of weapon systems

    International Nuclear Information System (INIS)

    Jin, Xin; Mukherjee, Kushal; Gupta, Shalabh; Ray, Asok; Phoha, Shashi; Damarla, Thyagaraju

    2009-01-01

    This communication addresses real-time weapon classification by analysis of asynchronous acoustic data, collected from microphones on a sensor network. The weapon classification algorithm consists of two parts: (i) feature extraction from time-series data using symbolic dynamic filtering (SDF), and (ii) pattern classification based on the extracted features using the language measure (LM) and support vector machine (SVM). The proposed algorithm has been tested on field data, generated by firing of two types of rifles. The results of analysis demonstrate high accuracy and fast execution of the pattern classification algorithm with low memory requirements. Potential applications include simultaneous shooter localization and weapon classification with soldier-wearable networked sensors. (rapid communication)

  10. GLOBAL LAND COVER CLASSIFICATION USING MODIS SURFACE REFLECTANCE PROSUCTS

    Directory of Open Access Journals (Sweden)

    K. Fukue

    2016-06-01

    Full Text Available The objective of this study is to develop a high-accuracy land cover classification algorithm at the global scale using multi-temporal MODIS land reflectance products. In this study, a time-domain co-occurrence matrix was introduced as a classification feature that provides a time-series signature of land covers. Further, a non-parametric minimum distance classifier was introduced for the time-domain co-occurrence matrix, which performs multi-dimensional pattern matching between the time-domain co-occurrence matrices of a classification target pixel and of each classification class. The global land cover classification experiments were conducted by applying the proposed classification method to 46 multi-temporal (in one year) SR (Surface Reflectance) and NBAR (Nadir BRDF-Adjusted Reflectance) products, respectively. The 17 IGBP land cover categories were used in our classification experiments. As a result, the SR and NBAR products showed a similar classification accuracy of 99%.

  11. To Conclude: India can aim big

    Indian Academy of Sciences (India)

    ... Transmission and Distribution Losses. If 100 million middle class homes deploy 1 kW on rooftops. 100 GW peak power capacity added at homes alone; 40% of current peak power installed in India today. India must aim by 2030. To have 50% of its electric power from SOLAR; To have 50% of vehicles as Electric Vehicles ...

  12. Aims and harvest of moral case deliberation.

    Science.gov (United States)

    Weidema, Froukje C; Molewijk, Bert A C; Kamsteeg, Frans; Widdershoven, Guy A M

    2013-09-01

    Deliberative ways of dealing with ethical issues in health care are expanding. Moral case deliberation is an example, providing group-wise, structured reflection on dilemmas from practice. Although moral case deliberation is well described in literature, aims and results of moral case deliberation sessions are unknown. This research shows (a) why managers introduce moral case deliberation and (b) what moral case deliberation participants experience as moral case deliberation results. A responsive evaluation was conducted, explicating moral case deliberation experiences by analysing aims (N = 78) and harvest (N = 255). A naturalistic data collection included interviews with managers and evaluation questionnaires of moral case deliberation participants (nurses). From the analysis, moral case deliberation appeals for cooperation, team bonding, critical attitude towards routines and nurses' empowerment. Differences are that managers aim to foster identity of the nursing profession, whereas nurses emphasize learning processes and understanding perspectives. We conclude that moral case deliberation influences team cooperation that cannot be controlled with traditional management tools, but requires time and dialogue. Exchanging aims and harvest between manager and team could result in co-creating (moral) practice in which improvements for daily cooperation result from bringing together perspectives of managers and team members.

  13. Pragmatics and the aims of language evolution.

    Science.gov (United States)

    Scott-Phillips, Thomas C

    2017-02-01

    Pragmatics has historically played a relatively peripheral role in language evolution research. This is a profound mistake. Here I describe how a pragmatic perspective can inform language evolution in the most fundamental way: by making clear what the natural objects of study are, and hence what the aims of the field should be.

  14. User Classification in Crowdsourcing-Based Cooperative Spectrum Sensing

    Directory of Open Access Journals (Sweden)

    Linbo Zhai

    2017-07-01

    Full Text Available This paper studies cooperative spectrum sensing based on crowdsourcing in cognitive radio networks. Since intelligent mobile users such as smartphones and tablets can sense the wireless spectrum, channel sensing tasks can be assigned to these mobile users. This is referred to as the crowdsourcing method. However, there may be some malicious mobile users that send false sensing reports deliberately, for their own purposes. False sensing reports will influence decisions about channel state. Therefore, it is necessary to classify mobile users in order to distinguish malicious users. According to the sensing reports, mobile users should not just be divided into two classes (honest and malicious). There are two reasons for this: on the one hand, honest users in different positions may have different sensing outcomes, as shadowing, multi-path fading, and other issues may influence the sensing results; on the other hand, there may be more than one type of malicious user, acting differently in the network. Therefore, it is necessary to classify mobile users into more than two classes. Due to the lack of prior information on the number of user classes, this paper casts the problem of mobile user classification as a dynamic clustering problem that is NP-hard. The paper uses the interdistance-to-intradistance ratio of clusters as the fitness function, and aims to maximize the fitness function. To solve this optimization problem, this paper proposes a distributed algorithm for user classification in order to obtain bounded close-to-optimal solutions, and analyzes the approximation ratio of the proposed algorithm. Simulations show the distributed algorithm achieves higher performance than other algorithms.
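
    To make the fitness function in this record more concrete, here is a minimal sketch of one possible interdistance-to-intradistance ratio. The abstract does not give the exact definition, so this version is an assumption: "inter" is the mean distance between cluster centroids and "intra" is the mean distance from each point to its own centroid. The function names are hypothetical and this is not the paper's distributed algorithm.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def inter_intra_ratio(clusters):
    """Assumed fitness: mean centroid-to-centroid distance divided by
    mean point-to-own-centroid distance. Higher means better separation."""
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centers = [centroid(c) for c in clusters]
    pair_dists = [math.dist(a, b) for a, b in combinations(centers, 2)]
    inter = sum(pair_dists) / len(pair_dists)
    point_dists = [
        math.dist(p, center)
        for cluster, center in zip(clusters, centers)
        for p in cluster
    ]
    intra = sum(point_dists) / len(point_dists)
    if intra == 0:
        return math.inf  # every cluster is a single repeated point
    return inter / intra


if __name__ == "__main__":
    well_separated = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
    badly_mixed = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
    print(inter_intra_ratio(well_separated), inter_intra_ratio(badly_mixed))
```

    A clustering search would compare candidate partitions by this score and keep the one with the larger ratio.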

  15. Advanced MTADS Classification for Detection and Discrimination of UXO

    National Research Council Canada - National Science Library

    Nelson, H

    2003-01-01

    ...) for the detection and classification of buried unexploded ordnance. In order to increase the discrimination ability of the system, we have developed advanced analysis algorithms for the Electromagnetic Induction (EMI) sensor data...

  16. Autonomous Non-Linear Classification of LPI Radar Signal Modulations

    National Research Council Canada - National Science Library

    Gulum, Taylan O

    2007-01-01

    ...) radar modulations is investigated. A software engineering architecture that allows a full investigation of various preprocessing algorithms and classification techniques is applied to a database of important LPI radar waveform...

  17. Decision tree approach for classification of remotely sensed satellite

    Indian Academy of Sciences (India)

    DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, open source ...

  18. Classification of Urinary Calculi using Feed-Forward Neural Networks

    African Journals Online (AJOL)

    NJD

    Genetic algorithms were used for optimization of neural networks and for selection of the ... Urinary calculi, infrared spectroscopy, classification, neural networks, variable ..... note that the best accuracy is obtained for whewellite, weddellite.

  19. Algorithms for Computerized Fetal Heart Rate Diagnosis with Direct Reporting

    Directory of Open Access Journals (Sweden)

    Kazuo Maeda

    2015-06-01

    Full Text Available Aims: Since pattern classification of fetal heart rate (FHR) was subjective and enlarged interobserver differences, objective FHR analysis was achieved with computerized FHR diagnosis. Methods: The computer algorithm was composed of an experts’ knowledge system, including FHR analysis and FHR score calculation, and also of an objective artificial neural network system with software. In addition, an FHR frequency spectrum was studied to detect ominous sinusoidal FHR and the loss of baseline variability related to fetal brain damage. The algorithms were installed in a central-computerized automatic FHR monitoring system, which gave the diagnosis rapidly and directly to the attending doctor. Results: Clinically, perinatal mortality decreased significantly and no cerebral palsy developed after introduction of the centralized system. Conclusion: The automatic multichannel FHR monitoring system improved the monitoring, increased the objectivity of FHR diagnosis and promoted clinical results.

  20. An ant colony optimization based feature selection for web page classification.

    Science.gov (United States)

    Saraç, Esra; Özel, Selma Ayşe

    2014-01-01

    The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.

  1. A Novel Approach to ECG Classification Based upon Two-Layered HMMs in Body Sensor Networks

    Directory of Open Access Journals (Sweden)

    Wei Liang

    2014-03-01

    Full Text Available This paper presents a novel approach to ECG signal filtering and classification. Unlike the traditional techniques which aim at collecting and processing the ECG signals with the patient being still, lying in bed in hospitals, our proposed algorithm is intentionally designed for monitoring and classifying the patient’s ECG signals in the free-living environment. The patients are equipped with wearable ambulatory devices the whole day, which facilitates the real-time heart attack detection. In ECG preprocessing, an integral-coefficient-band-stop (ICBS) filter is applied, which omits time-consuming floating-point computations. In addition, two-layered Hidden Markov Models (HMMs) are applied to achieve ECG feature extraction and classification. The periodic ECG waveforms are segmented into ISO intervals, P subwave, QRS complex and T subwave respectively in the first HMM layer where expert-annotation assisted Baum-Welch algorithm is utilized in HMM modeling. Then the corresponding interval features are selected and applied to categorize the ECG into normal type or abnormal type (PVC, APC) in the second HMM layer. For verifying the effectiveness of our algorithm on abnormal signal detection, we have developed an ECG body sensor network (BSN) platform, whereby real-time ECG signals are collected, transmitted, displayed and the corresponding classification outcomes are deduced and shown on the BSN screen.

  2. A Novel Approach to ECG Classification Based upon Two-Layered HMMs in Body Sensor Networks

    Science.gov (United States)

    Liang, Wei; Zhang, Yinlong; Tan, Jindong; Li, Yang

    2014-01-01

    This paper presents a novel approach to ECG signal filtering and classification. Unlike the traditional techniques which aim at collecting and processing the ECG signals with the patient being still, lying in bed in hospitals, our proposed algorithm is intentionally designed for monitoring and classifying the patient's ECG signals in the free-living environment. The patients are equipped with wearable ambulatory devices the whole day, which facilitates the real-time heart attack detection. In ECG preprocessing, an integral-coefficient-band-stop (ICBS) filter is applied, which omits time-consuming floating-point computations. In addition, two-layered Hidden Markov Models (HMMs) are applied to achieve ECG feature extraction and classification. The periodic ECG waveforms are segmented into ISO intervals, P subwave, QRS complex and T subwave respectively in the first HMM layer where expert-annotation assisted Baum-Welch algorithm is utilized in HMM modeling. Then the corresponding interval features are selected and applied to categorize the ECG into normal type or abnormal type (PVC, APC) in the second HMM layer. For verifying the effectiveness of our algorithm on abnormal signal detection, we have developed an ECG body sensor network (BSN) platform, whereby real-time ECG signals are collected, transmitted, displayed and the corresponding classification outcomes are deduced and shown on the BSN screen. PMID:24681668

  3. Ototoxicity (cochleotoxicity) classifications: A review.

    Science.gov (United States)

    Crundwell, Gemma; Gomersall, Phil; Baguley, David M

    2016-01-01

    Drug-mediated ototoxicity, specifically cochleotoxicity, is a concern for patients receiving medications for the treatment of serious illness. A number of classification schemes exist, most of which are based on pure-tone audiometry, in order to assist non-audiological/non-otological specialists in the identification and monitoring of iatrogenic hearing loss. This review identifies the primary classification systems used in cochleotoxicity monitoring. By bringing together classifications published in discipline-specific literature, the paper aims to increase awareness of their relative strengths and limitations in the assessment and monitoring of ototoxic hearing loss and to indicate how future classification systems may improve upon the status-quo. Literature review. PubMed identified 4878 articles containing the search term ototox*. A systematic search identified 13 key classification systems. Cochleotoxicity classification systems can be divided into those which focus on hearing change from a baseline audiogram and those that focus on the functional impact of the hearing loss. Common weaknesses of these grading scales included a lack of sensitivity to small adverse changes in hearing thresholds, a lack of high-frequency audiometry (>8 kHz), and lack of indication of which changes are likely to be clinically significant for communication and quality of life.

  4. METHODS OF TEXT INFORMATION CLASSIFICATION ON THE BASIS OF ARTIFICIAL NEURAL AND SEMANTIC NETWORKS

    Directory of Open Access Journals (Sweden)

    L. V. Serebryanaya

    2016-01-01

    Full Text Available The article covers the use of the perceptron, the Hopfield artificial neural network and the semantic network for classification of text information. Network training algorithms are studied. An error backpropagation algorithm for the perceptron network and a convergence algorithm for the Hopfield network are implemented. On the basis of the proposed models and algorithms, automatic text classification software is developed and its operation results are evaluated.

  5. Tutorial: Asteroseismic Stellar Modelling with AIMS

    Science.gov (United States)

    Lund, Mikkel N.; Reese, Daniel R.

    The goal of aims (Asteroseismic Inference on a Massive Scale) is to estimate stellar parameters and credible intervals/error bars in a Bayesian manner from a set of asteroseismic frequency data and so-called classical constraints. To achieve reliable parameter estimates and computational efficiency, it searches through a grid of pre-computed models using an MCMC algorithm—interpolation within the grid of models is performed by first tessellating the grid using a Delaunay triangulation and then doing a linear barycentric interpolation on matching simplexes. Inputs for the modelling consist of individual frequencies from peak-bagging, which can be complemented with classical spectroscopic constraints. aims is mostly written in Python with a modular structure to facilitate contributions from the community. Only a few computationally intensive parts have been rewritten in Fortran in order to speed up calculations.
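
    The interpolation step described in this record (linear barycentric interpolation on a simplex of a Delaunay tessellation) can be sketched for the simplest case, a single 2-D triangle. This is not code from AIMS; the function names are hypothetical, and locating the matching simplex in a real grid is assumed to have been done already.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def interpolate(p, vertices, values):
    """Linearly interpolate per-vertex values at p inside one triangle."""
    weights = barycentric_weights(p, *vertices)
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Hypothetical grid cell: vertex values sampled from f(x, y) = 2x + 3y + 1.
    triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    model_values = [1.0, 3.0, 4.0]
    print(interpolate((0.25, 0.25), triangle, model_values))
```

    Because the interpolation is linear, it reproduces any linear function of the grid coordinates exactly; all three weights lie in [0, 1] only when the point is inside the triangle.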

  6. CURRICULUM MATTERS: Aims assessments and workplace needs

    Science.gov (United States)

    Black, Paul

    1997-09-01

    This paper attempts to consider the aims that undergraduate physics degree courses actually reflect and serve in the light of the employment patterns of graduates and of the expressed needs of employers. It reviews the results of analyses of what degree examinations actually test, and goes on to quote criticisms of their courses and radical proposals to change them adopted by the UK conference of physics professors. The discussion is then broadened by discussion of evidence, about the employment of graduates and about the priorities that some industrialists now give in the qualities that they look for when recruiting new graduates. The evidence leads to a view that radical changes are needed, both in courses and examinations, and that there is a need for university departments to work more closely with employers in re-formulating the aims and priorities in their teaching.

  7. Aims and methods of nuclear materials management

    International Nuclear Information System (INIS)

    Leven, D.; Schier, H.

    1979-05-01

    Whilst international safeguarding of fissile materials against abuse has been the subject of extensive debate, little public attention has so far been devoted to the internal security of these materials. All countries using nuclear energy for peaceful purposes have laid down appropriate regulations. In the Federal Republic of Germany safeguards are required, for instance, by the Atomic Energy Act, and are therefore a prerequisite for licensing. The aims and methods of national nuclear materials management are contrasted with viewpoints on international safeguards

  8. Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy.

    Science.gov (United States)

    Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh

    2015-04-01

    With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences.
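
    The two metrics quoted in this record, sensitivity and area under the ROC curve, can be computed as in the sketch below. It uses the standard definitions (sensitivity = TP / (TP + FN); AUC as the probability that a positive case scores higher than a negative one, with ties counted as one half). The labels and scores in the usage example are made up for illustration and are not the study's data.

```python
def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN), with 1 marking the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]            # illustrative: 1 = malignant, 0 = benign
    scores = [0.9, 0.4, 0.5, 0.1]    # illustrative classifier outputs
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(roc_auc(labels, scores), sensitivity(labels, predictions))
```

    The pairwise AUC is quadratic in the number of cases, which is acceptable for a cohort the size of the one reported here (28 patients).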

  9. Real time automatic scene classification

    NARCIS (Netherlands)

    Verbrugge, R.; Israël, Menno; Taatgen, N.; van den Broek, Egon; van der Putten, Peter; Schomaker, L.; den Uyl, Marten J.

    2004-01-01

    This work has been done as part of the EU VICAR (IST) project and the EU SCOFI project (IAP). The aim of the first project was to develop a real time video indexing classification annotation and retrieval system. For our systems, we have adapted the approach of Picard and Minka [3], who categorized

  10. Ensemble Classification of Data Streams Based on Attribute Reduction and a Sliding Window

    Directory of Open Access Journals (Sweden)

    Yingchun Chen

    2018-04-01

    Full Text Available With the current increasing volume and dimensionality of data, traditional data classification algorithms are unable to satisfy the demands of practical classification applications of data streams. To deal with noise and concept drift in data streams, we propose an ensemble classification algorithm based on attribute reduction and a sliding window in this paper. Using mutual information, an approximate attribute reduction algorithm based on rough sets is used to reduce data dimensionality and increase the diversity of reduced results in the algorithm. A double-threshold concept drift detection method and a three-stage sliding window control strategy are introduced to improve the performance of the algorithm when dealing with both noise and concept drift. The classification precision is further improved by updating the base classifiers and their nonlinear weights. Experiments on synthetic datasets and actual datasets demonstrate the performance of the algorithm in terms of classification precision, memory use, and time efficiency.

  11. Urogenital tuberculosis: definition and classification.

    Science.gov (United States)

    Kulchavenya, Ekaterina

    2014-10-01

    To improve the approach to the diagnosis and management of urogenital tuberculosis (UGTB), we need clear and unique classification. UGTB remains an important problem, especially in developing countries, but it is often an overlooked disease. As with any other infection, UGTB should be cured by antibacterial therapy, but because of late diagnosis it may often require surgery. Scientific literature dedicated to this problem was critically analyzed and juxtaposed with the author's own more than 30 years' experience in tuberculosis urology. The conception, terms and definition were consolidated into one system; classification stage by stage as well as complications are presented. Classification of any disease includes dispersion on forms and stages and exact definitions for each stage. Clinical features and symptoms significantly vary between different forms and stages of UGTB. A simple diagnostic algorithm was constructed. UGTB is multivariant disease and a standard unified approach to it is impossible. Clear definition as well as unique classification are necessary for real estimation of epidemiology and the optimization of therapy. The term 'UGTB' has insufficient information in order to estimate therapy, surgery and prognosis, or to evaluate the epidemiology.

  12. Classification of solar wind with machine learning

    NARCIS (Netherlands)

    E. Camporeale (Enrico); A. Carè (Algo); J.E. Borovsky (Joseph)

    2017-01-01

    We present a four-category classification algorithm for the solar wind, based on Gaussian Process. The four categories are the ones previously adopted in Xu and Borovsky (2015): ejecta, coronal hole origin plasma, streamer belt origin plasma, and sector reversal origin plasma. The

  13. Segmentation and Classification of Burn Color Images

    National Research Council Canada - National Science Library

    Acha, Begonya

    2001-01-01

    .... In the classification part, we take advantage of color information by clustering, with a vector quantization algorithm, the color centroids of small squares, taken from the burnt segmented part of the image, in the (V1, V2) plane into two possible groups, where V1 and V2 are the two chrominance components of the CIE Lab representation.

  14. Rifkin takes aim at USDA animal research.

    Science.gov (United States)

    Fox, Jeffrey L

    1984-10-19

    Jeremy Rifkin has filed a lawsuit to block U.S. Department of Agriculture (USDA) experiments involving the transfer of human growth hormone genes into sheep and pigs, which he rejects on environmental, economic, and ethical grounds. His real target is the Department's animal breeding program; his ultimate aim is "to establish the principle that there should be no crossing of species barriers in animals." USDA officials have not yet responded to the lawsuit but they intend to continue the experiments, which they consider crucial to the progress of research, until told to stop.

  15. The aims of transfer prices formation

    Directory of Open Access Journals (Sweden)

    Tomašević Stevan

    2013-01-01

    Full Text Available More than two-thirds of today's world trade consists of transactions between related legal persons. The prices for such transactions within a legal person or a group of related legal persons are called transfer prices. The aim of this paper is to present transfer prices as well as the main objectives of transfer pricing. The paper also explains the application of transfer pricing in the Republic of Serbia and the normative rules that cover transfer prices, their determination and their application in the calculation. Overall, a great deal of attention has been paid to transfer pricing at the national and international levels.

  16. Evolutionary fuzzy ARTMAP neural networks for classification of semiconductor defects.

    Science.gov (United States)

    Tan, Shing Chiang; Watada, Junzo; Ibrahim, Zuwairie; Khalid, Marzuki

    2015-05-01

    Wafer defect detection using an intelligent system is an approach to quality improvement in semiconductor manufacturing that aims to enhance process stability, increase production capacity, and improve yields. Occasionally, only a few records that indicate defective units are available, and they form a minority group in a large database. Such a situation leads to an imbalanced data set problem, which poses a great challenge for machine-learning techniques in obtaining an effective solution. In addition, the database may comprise overlapping samples of different classes. This paper introduces two models of evolutionary fuzzy ARTMAP (FAM) neural networks to deal with the imbalanced data set problems in semiconductor manufacturing operations. In particular, both the FAM models and hybrid genetic algorithms are integrated in the proposed evolutionary artificial neural networks (EANNs) to classify an imbalanced data set. In addition, one of the proposed EANNs incorporates a facility to learn overlapping samples of different classes from the imbalanced data environment. The classification results of the proposed evolutionary FAM neural networks are presented, compared, and analyzed using several classification metrics. The outcomes positively indicate the effectiveness of the proposed networks in handling classification problems with imbalanced data sets.

  17. Cattle behaviour classification from collar, halter, and ear tag sensors

    Directory of Open Access Journals (Sweden)

    A. Rahman

    2018-03-01

    Full Text Available In this paper, we summarise the outcome of a set of experiments aimed at classifying cattle behaviour based on sensor data. Each animal carried sensors generating time series accelerometer data placed on a collar on the neck at the back of the head, on a halter positioned at the side of the head behind the mouth, or on the ear using a tag. The purpose of the study was to determine how sensor data from different placements can classify a range of typical cattle behaviours. Data were collected and animal behaviours (grazing, standing or ruminating) were observed over a common time frame. Statistical features were computed from the sensor data and machine learning algorithms were trained to classify each behaviour. Classification accuracies were computed on separate independent test sets. The analysis based on behaviour classification experiments revealed that different sensor placements can achieve good classification accuracy if the feature space (representing motion patterns) between the training and test animal is similar. The paper will discuss these analyses in detail and can act as a guide for future studies.
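
    The feature-extraction step mentioned in this record (statistical features computed from accelerometer time series) might look like the sketch below. The abstract does not list the features or the window length, so the choice of mean, standard deviation, minimum and maximum over fixed non-overlapping windows is an assumption for illustration, as are the function name and the sample values.

```python
import statistics


def window_features(samples, window_size):
    """Split a 1-D signal into non-overlapping windows and summarise each one.

    Trailing samples that do not fill a whole window are dropped.
    """
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        features.append({
            "mean": statistics.fmean(window),
            "std": statistics.pstdev(window),  # population standard deviation
            "min": min(window),
            "max": max(window),
        })
    return features


if __name__ == "__main__":
    # Made-up accelerometer magnitudes: a still window, then an active one.
    signal = [1.0, 1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 2.0, 5.0]
    for row in window_features(signal, window_size=4):
        print(row)
```

    Each dictionary would become one row of the training table, paired with the behaviour observed during that window.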

  18. New stopping rules for dendrogram classification in TWINSPAN

    Directory of Open Access Journals (Sweden)

    Omid Esmailzadeh

    2015-12-01

    Full Text Available The aim of this study is to propose a modification of the TWINSPAN algorithm by introducing new stopping rules. Modified TWINSPAN analyses the heterogeneity of the clusters prior to each division to prevent imposed divisions of homogeneous clusters, and it also removes the limitation of classical TWINSPAN whereby the number of clusters increases by powers of two. For this purpose, ecological groups of box tree stands in the Farim forests were classified using classical and modified TWINSPAN on the basis of the plant species cover percentages of 60 plots, each with a surface area of 400 m2, which were established by the relevé method (taking the indicator stand concept into account). Five different heterogeneity measures were involved: Whittaker’s beta diversity and total inertia as diversity indices, and the Sorensen, Jaccard and Orlóci dissimilarity indices as distance indices. Sample plots were also classified on the basis of topographical properties using cluster analysis with the Euclidean distance coefficient and Ward's clustering method. Results showed that the two sets of heterogeneity indices lead to different classification dendrograms: the two diversity indices, Whittaker’s beta and total inertia, gave similar results, and the three dissimilarity indices behaved similarly to one another. Finally, our results reiterated that modified TWINSPAN did not alter the logic of the TWINSPAN classification, but it increased the flexibility of the TWINSPAN dendrogram by changing the hierarchy of divisions in the final classification of ecological groups of box tree stands in the Farim forests.

  19. Application of a neural network for reflectance spectrum classification

    Science.gov (United States)

    Yang, Gefei; Gartley, Michael

    2017-05-01

    Traditional reflectance spectrum classification algorithms are based on comparing spectra across the electromagnetic spectrum anywhere from the ultra-violet to the thermal infrared regions. These methods analyze reflectance on a pixel by pixel basis. Inspired by the high performance that Convolutional Neural Networks (CNNs) have demonstrated in image classification, we applied a neural network to analyze directional reflectance pattern images. By using the bidirectional reflectance distribution function (BRDF) data, we can reformulate the 4-dimensional data into 2 dimensions, namely incident direction × reflected direction × channels. Meanwhile, RIT's micro-DIRSIG model is utilized to simulate additional training samples for improving the robustness of the neural network training. Unlike traditional classification using hand-designed feature extraction with a trainable classifier, neural networks create several layers to learn a feature hierarchy from pixels to classifier, and all layers are trained jointly. Hence, our approach of utilizing angular features is different from traditional methods utilizing spatial features. Although training typically has a large computational cost, simple classifiers work well when subsequently using neural network generated features. Currently, most popular neural networks such as VGG, GoogLeNet and AlexNet are trained on RGB spatial image data. Our approach aims to build a directional reflectance spectrum based neural network to help us to understand from another perspective. At the end of this paper, we compare the difference among several classifiers and analyze the trade-off among neural network parameters.

  20. Quantum algorithm for support matrix machines

    Science.gov (United States)

    Duan, Bojia; Yuan, Jiabin; Liu, Ying; Li, Dan

    2017-09-01

    We propose a quantum algorithm for support matrix machines (SMMs) that efficiently addresses an image classification problem by introducing a least-squares reformulation. This algorithm consists of two core subroutines: a quantum matrix inversion (Harrow-Hassidim-Lloyd, HHL) algorithm and a quantum singular value thresholding (QSVT) algorithm. The two algorithms can be implemented on a universal quantum computer with complexity O[log(npq)] and O[log(pq)], respectively, where n is the number of training data and p × q is the size of the feature space. By iterating the algorithms, we can find the parameters for the SMM classification model. Our analysis shows that both HHL and QSVT algorithms achieve an exponential increase of speed over their classical counterparts.

  1. Odor Classification using Agent Technology

    Directory of Open Access Journals (Sweden)

    Sigeru OMATU

    2014-03-01

    Full Text Available In order to measure and classify odors, a Quartz Crystal Microbalance (QCM) can be used. In the present study, seven QCM sensors and three different odors are used. The system has been developed as a virtual organization of agents using an agent platform called PANGEA (Platform for Automatic coNstruction of orGanizations of intElligent Agents). This is a platform for developing open multi-agent systems, specifically those including organizational aspects. The main reason for the use of agents is the scalability of the platform, i.e. the way in which it models the services. The system models functionalities as services inside the agents, or as Service Oriented Approach (SOA) architecture compliant services using Web Services. In this way, the odor classification systems can be adapted with new algorithms, tools and classification techniques.

  2. The Top Ten Algorithms in Data Mining

    CERN Document Server

    Wu, Xindong

    2009-01-01

    From classification and clustering to statistical learning, association analysis, and link mining, this book covers the most important topics in data mining research. It presents the ten most influential algorithms used in the data mining community today. Each chapter provides a detailed description of the algorithm, a discussion of available software implementation, advanced topics, and exercises. With a simple data set, examples illustrate how each algorithm works and highlight the overall performance of each algorithm in a real-world application. Featuring contributions from leading researc

  3. An ordinal classification approach for CTG categorization.

    Science.gov (United States)

    Georgoulas, George; Karvelis, Petros; Gavrilis, Dimitris; Stylios, Chrysostomos D; Nikolakopoulos, George

    2017-07-01

    Evaluation of the cardiotocogram (CTG) is a standard approach employed during pregnancy and delivery. However, its interpretation requires a high level of expertise to decide whether the recording is Normal, Suspicious or Pathological. Therefore, a number of attempts have been made over the past three decades to develop sophisticated automated systems. These systems are usually (multiclass) classification systems that assign a category to the respective CTG. However, most of these systems do not take into consideration the natural ordering of the categories associated with CTG recordings. In this work, an algorithm that explicitly takes into consideration the ordering of CTG categories, based on a binary decomposition method, is investigated. The results achieved, using the C4.5 decision tree as the base classifier, prove that the ordinal classification approach is marginally better than the traditional multiclass classification approach, which utilizes the standard C4.5 algorithm, for several performance criteria.
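
    One common way to realise such a binary decomposition for K ordered categories is to train K-1 binary models, each estimating P(y > k), and to recover the class probabilities by differencing. The sketch below assumes that scheme; the abstract does not confirm the exact decomposition used. A one-feature threshold stump stands in for C4.5, and the single numeric feature and all function names are hypothetical.

```python
CLASSES = ["Normal", "Suspicious", "Pathological"]  # ordered categories

def fit_stump(xs, ys):
    """Stand-in base learner (not C4.5): a one-feature threshold stump that
    returns the fraction of positive examples on each side of the split."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Errors made if each side predicts its own majority label.
        err = (min(sum(left), len(left) - sum(left))
               + min(sum(right), len(right) - sum(right)))
        if best is None or err < best[0]:
            p_left = sum(left) / len(left)
            p_right = sum(right) / len(right) if right else p_left
            best = (err, t, p_left, p_right)
    _, t, p_left, p_right = best
    return lambda x: p_left if x <= t else p_right

def fit_ordinal(xs, labels):
    """Train one binary model per boundary: model k estimates P(y > class k)."""
    ranks = [CLASSES.index(label) for label in labels]
    return [fit_stump(xs, [1 if r > k else 0 for r in ranks])
            for k in range(len(CLASSES) - 1)]

def predict_ordinal(models, x):
    """Turn the K-1 estimates of P(y > k) into K class probabilities."""
    gt = [m(x) for m in models]
    probs = [1.0 - gt[0]]
    probs += [gt[k - 1] - gt[k] for k in range(1, len(gt))]
    probs.append(gt[-1])
    return CLASSES[max(range(len(probs)), key=probs.__getitem__)]
```

    The ordering enters through the targets: every binary problem separates "at most category k" from "above category k", so a Pathological recording counts as a positive example for both boundaries.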

  4. Iris Data Classification Using Quantum Neural Networks

    International Nuclear Information System (INIS)

    Sahni, Vishal; Patvardhan, C.

    2006-01-01

    Quantum computing is a novel paradigm that promises to be the future of computing. The performance of quantum algorithms has proved to be stunning. ANNs within the context of classical computation have been used for approximation and classification tasks with some success. This paper presents an idea of quantum neural networks along with the training algorithm and its convergence property. It synergizes the unique properties of quantum bits, or qubits, with the various techniques in vogue in neural networks. An example application to Fisher's Iris data set, a benchmark classification problem, has also been presented. The results obtained amply demonstrate the classification capabilities of the quantum neuron and give an idea of their promising capabilities.

  5. Global Optimization Ensemble Model for Classification Methods

    Science.gov (United States)

    Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab

    2014-01-01

    Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382

  6. Global Optimization Ensemble Model for Classification Methods

    Directory of Open Access Journals (Sweden)

    Hina Anwar

    2014-01-01

    Full Text Available Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity.

  7. Aims and methods of education: A recapitulation

    Directory of Open Access Journals (Sweden)

    Pantić Nataša

    2007-01-01

    Full Text Available This paper gives an overview of principal distinction between the aims of the so-called "traditional" and "progressive" education and respective pedagogies associated with each. The term "traditional" education is used to denote the kind of education that prepares people for their role in society as it is, while the term "progressive" is used for education that aspires to equip mankind with capacity to shape the change of society. The paper raises some critical questions about the role of pedagogy in achieving the aims of the progressive model, arguing that the employment of "progressive" methods does not necessarily guarantee the achievement of the commonly professed purposes of progressive education. This is illustrated in the paper by the results of a study in English schools showing how despite the claim of progressive methods, teachers tend to retain traditional attitudes and on the other hand, how even traditional teaching methods can serve the progressive purpose. This is not to advocate for the traditional pedagogy, but to suggest that it might be something other than pedagogy that makes a critical difference in educating liberal-minded citizens of the future. In this sense the paper explores the role of other factors that make a difference towards progressive education, such as democratization of human relations in school ethos and respect for children's freedom.

  8. Algorithmic alternatives

    International Nuclear Information System (INIS)

    Creutz, M.

    1987-11-01

    A large variety of Monte Carlo algorithms are being used for lattice gauge simulations. For purely bosonic theories, present approaches are generally adequate; nevertheless, overrelaxation techniques promise savings by a factor of about three in computer time. For fermionic fields the situation is more difficult and less clear. Algorithms which involve an extrapolation to a vanishing step size are all quite closely related. Methods which do not require such an approximation tend to require computer time which grows as the square of the volume of the system. Recent developments combining global accept/reject stages with Langevin or microcanonical updatings promise to reduce this growth to V^(4/3).

  9. Combinatorial algorithms

    CERN Document Server

    Hu, T C

    2002-01-01

    Newly enlarged, updated second edition of a valuable, widely used text presents algorithms for shortest paths, maximum flows, dynamic programming and backtracking. Also discussed are binary trees, heuristic and near optimums, matrix multiplication, and NP-complete problems. 153 black-and-white illus. 23 tables. New to this edition: Chapter 9

  10. AIM cryocooler developments for HOT detectors

    Science.gov (United States)

    Rühlich, I.; Mai, M.; Withopf, A.; Rosenhagen, C.

    2014-06-01

    Significantly increased FPA temperatures for both Mid Wave and Long Wave IR detectors, i.e. HOT detectors, which have been developed in recent years, are now leaving the development phase and entering real application. HOT detectors allow the size, weight and power (SWaP) of Integrated Detector Cooler Assemblies (IDCAs) to be pushed to a new level. The key component driving achievable weight, volume and power consumption is the cryocooler. AIM cryocooler developments are focused on compact, lightweight linear cryocoolers driven by compact and highly efficient digital cooler drive electronics (DCE) to also achieve the highest MTTF targets. This technology uses moving magnet driving mechanisms and dual or single piston compressors. Whereas the SX030, which was presented at SPIE in 2012, consumes less than 3 W DC to operate a typical IDCA at 140 K, the next smaller cooler, the SX020, is designed to provide sufficient cooling power at detector temperatures above 160 K. The cooler weight of less than 200 g and a total compressor length of 60 mm make it an ideal solution for all applications with a limited weight and power budget, such as handheld applications. For operating a typical 640x512, 15 μm MW IR detector the power consumption will be less than 1.5 W DC. MTTF for the cooler will be in excess of 30,000 h, thus achieving low maintenance cost even in 24/7 applications. The SX020 compressor is based on a single piston design with an integrated passive balancer; the new design achieves very low exported vibration, on the order of 100 mN in the compressor axis. AIM is using a modular approach, allowing the choice between 5 different compressor types for one common Stirling expander. The 6 mm expander with a total length of 74 mm is now available in a new design that fits into standard dewar bores originally designed for rotary coolers. Also available is a 9 mm coldfinger in both versions. In development is an ultra-short expander with around 35 mm total length to achieve highest compactness. Technical

  11. Plasma health care - Aims, constraints and progress

    International Nuclear Information System (INIS)

    Morfill, G.E.; Zimmerman, J.L.

    2013-01-01

    Health Care covers three areas of interest for cold atmospheric pressure plasmas: Cosmetics, Hygiene and Medicine. These areas can be subdivided into personal and professional care. In this review we will concentrate on Hygiene and Medicine. In professional hygiene the most important plasma contribution is sterilization, decontamination and disinfection. The main aim is the prevention of diseases or their containment. Progress in the development of efficient bactericidal plasma sources has been rapid, so that it appears realistic to use plasmas to combat nosocomial infections as well as community associated infections in the not too distant future. The advantages of plasma devices are that they use air and electricity only, there are no waste products, they are inexpensive to manufacture and operate, easy to transport and install, and their bactericidal effects are fast (seconds). Plasmas can efficiently kill resistant bacteria (e.g. MRSA) and tests have shown no resistance build-up so far. With an estimated 2 million hospital induced infections each year in the US alone, and about 100,000 resulting deaths, very efficient, safe and fast hospital plasma hygiene devices would appear to be a very important weapon to help contain the spread of infectious diseases. In Medicine there are a number of ambitious ideas and aims. Plasmas can be “designed” to some extent. They can include different active species that can have an effect at the cellular level. There are ionic atoms and molecules, whose medical use needs to be evaluated; the vision is that a new area of “plasma pharmacy” could develop. First steps are currently being taken in biological studies. Also the excited atoms in cold atmospheric plasmas may make cell walls more permeable for such species. (author)

  12. SHIP CLASSIFICATION FROM MULTISPECTRAL VIDEOS

    Directory of Open Access Journals (Sweden)

    Frederique Robert-Inacio

    2012-05-01

    Full Text Available Surveillance of a seaport can be achieved by different means: radar, sonar, cameras, radio communications and so on. Such surveillance aims, on the one hand, to manage cargo and tanker traffic, and, on the other hand, to prevent terrorist attacks in sensitive areas. In this paper an application to video-surveillance of a seaport entrance is presented, and more particularly, the different steps that enable the classification of mobile shapes. This classification is based on a parameter measuring the degree of similarity between the shape under study and a set of reference shapes. The classification result describes the considered mobile in terms of shape and speed.

  13. Automated Instrumentation, Monitoring and Visualization of PVM Programs Using AIMS

    Science.gov (United States)

    Mehra, Pankaj; VanVoorst, Brian; Yan, Jerry; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We present views and analysis of the execution of several PVM (Parallel Virtual Machine) codes for Computational Fluid Dynamics on a network of Sparcstations, including: (1) NAS Parallel Benchmarks CG and MG; (2) a multi-partitioning algorithm for NAS Parallel Benchmark SP; and (3) an overset grid flowsolver. These views and analysis were obtained using our Automated Instrumentation and Monitoring System (AIMS) version 3.0, a toolkit for debugging the performance of PVM programs. We will describe the architecture, operation and application of AIMS. The AIMS toolkit contains: (1) Xinstrument, which can automatically instrument various computational and communication constructs in message-passing parallel programs; (2) Monitor, a library of runtime trace-collection routines; (3) VK (Visual Kernel), an execution-animation tool with source-code clickback; and (4) Tally, a tool for statistical analysis of execution profiles. Currently, Xinstrument can handle C and Fortran 77 programs using PVM 3.2.x; Monitor has been implemented and tested on Sun 4 systems running SunOS 4.1.2; and VK uses X11R5 and Motif 1.2. Data and views obtained using AIMS clearly illustrate several characteristic features of executing parallel programs on networked workstations: (1) the impact of long message latencies; (2) the impact of multiprogramming overheads and associated load imbalance; (3) cache and virtual-memory effects; and (4) significant skews between workstation clocks. Interestingly, AIMS can compensate for constant skew (zero drift) by calibrating the skew between a parent and its spawned children. In addition, AIMS' skew-compensation algorithm can adjust timestamps in a way that eliminates physically impossible communications (e.g., messages going backwards in time). Our current efforts are directed toward creating new views to explain the observed performance of PVM programs. Some of the features planned for the near future include: (1) ConfigView, showing the physical topology

  14. [Implementation of cytology images classification--the Bethesda 2001 System--in a group of screened women from Podlaskie region--effect evaluation].

    Science.gov (United States)

    Zbroch, Tomasz; Knapp, Paweł Grzegorz; Knapp, Piotr Andrzej

    2007-09-01

    Increasing knowledge concerning carcinogenesis within the cervical epithelium has forced us to make continual modifications to the cytology classification of cervical smears. Eventually, new descriptions of the submicroscopic cytomorphological abnormalities have enabled the implementation of the Bethesda System, which was meant to take the place of the former Papanicolaou classification, although temporarily both are sometimes used simultaneously. The aim of this study was to compare the results of these two classification systems in terms of diagnostic accuracy, verified by further tests of the diagnostic algorithm for cervical lesion evaluation. The study was conducted in a group of women selected from the general population, the criteria being the place of living and the cervical cancer age risk group, in consecutive periods of mass screening in the Podlaskie region. The diagnostic tests performed were based on the commonly used algorithm, as well as identical laboratory and methodological conditions. The assessment revealed comparable diagnostic accuracy of both analyzed classifications, verified by histological examination, although with markedly higher specificity for dysplastic lesions, with a decreased number of HSIL results and an increased diagnosis of LSILs. A higher number of performed colposcopies and biopsies was an additional consequence of the TBS classification. Results based on the Bethesda System made it possible to find the sources and reasons of abnormalities with much greater precision, which enabled treatment of the causative agent. The two evaluated cytology classification systems, although not much different, showed the higher potential of TBS and better, more effective communication between the cytology laboratory and the gynecologist, which makes it reasonable to implement The Bethesda System in daily cytology screening work.

  15. Analysis on Target Detection and Classification in LTE Based Passive Forward Scattering Radar

    Directory of Open Access Journals (Sweden)

    Raja Syamsul Azmir Raja Abdullah

    2016-09-01

    Full Text Available The passive bistatic radar (PBR) system can utilize the illuminator of opportunity to enhance radar capability. Incorporating the forward scattering technique and procedure into a specific mode of PBR can provide an improvement in target detection and classification. The system is known as passive Forward Scattering Radar (FSR). The passive FSR system can exploit the peculiar advantage of the enhancement in forward scatter radar cross section (FSRCS) for target detection. Thus, the aim of this paper is to show the feasibility of passive FSR for moving target detection and classification by experimental analysis and results. The signal source is the latest technology of 4G Long-Term Evolution (LTE) base station. A detailed explanation of the passive FSR receiver circuit, the detection scheme and the classification algorithm is given. In addition, the proposed passive FSR circuit employs the self-mixing technique at the receiver; hence the synchronization signal from the transmitter is not required. The experimental results confirm the passive FSR system’s capability for ground target detection and classification. Furthermore, this paper illustrates the first classification result in the passive FSR system. The great potential in the passive FSR system provides a new research area in passive radar that can be used for diverse remote monitoring applications.

  16. Analysis on Target Detection and Classification in LTE Based Passive Forward Scattering Radar.

    Science.gov (United States)

    Raja Abdullah, Raja Syamsul Azmir; Abdul Aziz, Noor Hafizah; Abdul Rashid, Nur Emileen; Ahmad Salah, Asem; Hashim, Fazirulhisyam

    2016-09-29

    The passive bistatic radar (PBR) system can utilize the illuminator of opportunity to enhance radar capability. Incorporating the forward scattering technique and procedure into a specific mode of PBR can provide an improvement in target detection and classification. The system is known as passive Forward Scattering Radar (FSR). The passive FSR system can exploit the peculiar advantage of the enhancement in forward scatter radar cross section (FSRCS) for target detection. Thus, the aim of this paper is to show the feasibility of passive FSR for moving target detection and classification by experimental analysis and results. The signal source is the latest technology of 4G Long-Term Evolution (LTE) base station. A detailed explanation of the passive FSR receiver circuit, the detection scheme and the classification algorithm is given. In addition, the proposed passive FSR circuit employs the self-mixing technique at the receiver; hence the synchronization signal from the transmitter is not required. The experimental results confirm the passive FSR system's capability for ground target detection and classification. Furthermore, this paper illustrates the first classification result in the passive FSR system. The great potential in the passive FSR system provides a new research area in passive radar that can be used for diverse remote monitoring applications.

  17. Secondary Vertex Finder Algorithm

    CERN Document Server

    Heer, Sebastian; The ATLAS collaboration

    2017-01-01

    If a jet originates from a b-quark, a b-hadron is formed during the fragmentation process. In its dominant decay modes, the b-hadron decays into a c-hadron via the electroweak interaction. Both b- and c-hadrons have lifetimes long enough to travel a few millimetres before decaying. Thus displaced vertices from b- and subsequent c-hadron decays provide a strong signature for a b-jet. Reconstructing these secondary vertices (SV) and their properties is the aim of this algorithm. The performance of this algorithm is studied with tt̄ events, requiring at least one lepton, simulated at 13 TeV.

  18. Classification for Inconsistent Decision Tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2016-01-01

    Decision trees have been used widely to discover patterns in consistent data sets. But if the data set is inconsistent, i.e. it contains groups of examples with equal values of conditional attributes but different labels, then discovering the essential patterns or knowledge from the data set is challenging. Three approaches (generalized, most common and many-valued decision) have been considered to handle such inconsistency. The decision tree model has been used to compare the classification results among the three approaches. The many-valued decision approach outperforms the other approaches, and the M_ws_entM greedy algorithm gives faster and more accurate predictions.
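
    The three ways of handling inconsistency named above can be illustrated on a toy table. This is a sketch of one common reading of the terms, assumed rather than taken from the paper: rows are grouped by their conditional attributes, and each group receives a composite label (generalized), its most frequent label (most common), or a set of acceptable labels (many-valued). The function names and data layout are hypothetical.

```python
from collections import Counter, defaultdict

def resolve(rows):
    """rows: iterable of (attribute_tuple, label) pairs.
    Returns, per distinct attribute tuple, the decision under each approach."""
    groups = defaultdict(list)
    for attrs, label in rows:
        groups[tuple(attrs)].append(label)
    resolved = {}
    for attrs, labels in groups.items():
        counts = Counter(labels)
        resolved[attrs] = {
            # Generalized: the whole label set acts as ONE new composite class.
            "generalized": frozenset(counts),
            # Most common: keep the most frequent label (ties broken by name).
            "most_common": min(counts, key=lambda l: (-counts[l], l)),
            # Many-valued: keep the set; predicting ANY member is acceptable.
            "many_valued": set(counts),
        }
    return resolved

def many_valued_correct(prediction, label_set):
    """Under the many-valued reading, any label from the set counts as a hit."""
    return prediction in label_set
```

    The generalized and many-valued entries hold the same labels but are scored differently: the first must be predicted as an exact composite class, the second is satisfied by a single member.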

  19. Classification for Inconsistent Decision Tables

    KAUST Repository

    Azad, Mohammad

    2016-09-28

    Decision trees have been used widely to discover patterns in consistent data sets. But if the data set is inconsistent, i.e. it contains groups of examples with equal values of conditional attributes but different labels, then discovering the essential patterns or knowledge from the data set is challenging. Three approaches (generalized, most common and many-valued decision) have been considered to handle such inconsistency. The decision tree model has been used to compare the classification results among the three approaches. The many-valued decision approach outperforms the other approaches, and the M_ws_entM greedy algorithm gives faster and more accurate predictions.

  20. Autodriver algorithm

    Directory of Open Access Journals (Sweden)

    Anna Bourmistrova

    2011-02-01

    Full Text Available The autodriver algorithm is an intelligent method to eliminate the need of steering by a driver on a well-defined road. The proposed method performs best on a four-wheel steering (4WS) vehicle, though it is also applicable to two-wheel-steering (TWS) vehicles. The algorithm is based on coinciding the actual vehicle center of rotation and road center of curvature, by adjusting the kinematic center of rotation. The road center of curvature is assumed prior information for a given road, while the dynamic center of rotation is the output of dynamic equations of motion of the vehicle using steering angle and velocity measurements as inputs. We use the kinematic condition of steering to set the steering angles in such a way that the kinematic center of rotation of the vehicle sits at a desired point. At low speeds the ideal and actual paths of the vehicle are very close. With increase of forward speed the road and tire characteristics, along with the motion dynamics of the vehicle, cause the vehicle to turn about time-varying points. By adjusting the steering angles, our algorithm controls the dynamic turning center of the vehicle so that it coincides with the road curvature center, hence keeping the vehicle on a given road autonomously. The position and orientation errors are used as feedback signals in a closed loop control to adjust the steering angles. The application of the presented autodriver algorithm demonstrates reliable performance under different driving conditions.
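
    For the simpler front-wheel-steering case, the kinematic condition of steering reduces to the well-known Ackermann condition cot(δ_o) − cot(δ_i) = w/l, where w is the track and l the wheelbase. The sketch below illustrates only that special case, assuming the center of rotation lies on the rear-axle line at distance R from the vehicle centerline; it does not cover the 4WS geometry or the dynamic feedback loop of the paper, and the function and parameter names are hypothetical.

```python
import math

def ackermann_angles(radius, wheelbase, track):
    """Inner and outer front-wheel steer angles (radians) that place the
    kinematic center of rotation on the rear-axle line, `radius` metres
    from the vehicle centerline. Front-wheel steering only (assumption)."""
    if radius <= track / 2:
        raise ValueError("turn center must lie outside the vehicle track")
    inner = math.atan(wheelbase / (radius - track / 2))
    outer = math.atan(wheelbase / (radius + track / 2))
    return inner, outer

def curvature_error(road_radius, vehicle_radius):
    """Mismatch between road curvature and actual turning curvature; a
    closed-loop controller could feed this back into the steer angles."""
    return 1.0 / road_radius - 1.0 / vehicle_radius
```

    The inner wheel always steers more sharply than the outer one, and both angles shrink toward zero as the radius grows, which matches straight-line driving.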

  1. Classification of Herbaceous Vegetation Using Airborne Hyperspectral Imagery

    Directory of Open Access Journals (Sweden)

    Péter Burai

    2015-02-01

    Full Text Available Alkali landscapes hold an extremely fine-scale mosaic of several vegetation types, so separating these classes by remote sensing is challenging. Our aim was to test the applicability of different image classification methods of hyperspectral data in this complex situation. To reach the highest classification accuracy, we tested a traditional image classifier (maximum likelihood classifier, MLC), machine learning algorithms (support vector machine, SVM; random forest, RF) and feature extraction (minimum noise fraction (MNF) transformation) on training datasets of different sizes. Digital images were acquired from an AISA EAGLE II hyperspectral sensor with 128 contiguous bands (400–1000 nm), a spectral sampling of 5 nm bandwidth and a ground pixel size of 1 m. For the classification, we established twenty vegetation classes based on the dominant species, canopy height, and total vegetation cover. Image classification was applied to the original and MNF-transformed datasets with various training sample sizes between 10 and 30 pixels. In order to select the optimal number of transformed features, we applied SVM, RF and MLC classification to 2–15 MNF-transformed bands. In the case of the original bands, SVM and RF classifiers provided high accuracy irrespective of the number of training pixels. We found that SVM and RF produced the best accuracy when using the first nine MNF-transformed bands; involving further features did not increase classification accuracy. SVM and RF provided high accuracies with the transformed bands, especially in the case of the aggregated groups. Even MLC provided high accuracy with 30 training pixels (80.78%), but the use of a smaller training dataset (10 training pixels) significantly reduced the accuracy of classification (52.56%). Our results suggest that in alkali landscapes, the application of SVM is a feasible solution, as it provided the highest accuracies compared to RF and MLC.

  2. Scalable Packet Classification with Hash Tables

    Science.gov (United States)

    Wang, Pi-Chung

    In the last decade, the technique of packet classification has been widely deployed in various network devices, including routers, firewalls and network intrusion detection systems. In this work, we improve the performance of packet classification by using multiple hash tables. The existing hash-based algorithms have superior scalability with respect to the required space; however, their search performance may not be comparable to other algorithms. To improve the search performance, we propose a tuple reordering algorithm to minimize the number of accessed hash tables with the aid of bitmaps. We also use pre-computation to ensure the accuracy of our search procedure. Performance evaluation based on both real and synthetic filter databases shows that our scheme is effective and scalable and the pre-computation cost is moderate.
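
    As background for the hash-based approach, the sketch below shows a plain tuple space search: rules are grouped by their (source prefix length, destination prefix length) tuple, and each tuple gets its own hash table. It is a baseline illustration only; the tuple reordering, bitmaps and pre-computation proposed in the paper are not reproduced, and the class and helper names are hypothetical.

```python
def ip(a, b, c, d):
    """Pack four octets into a 32-bit integer address."""
    return (a << 24) | (b << 16) | (c << 8) | d

def prefix(addr, length, width=32):
    """Keep only the top `length` bits of an address."""
    return addr >> (width - length) if length else 0

class TupleSpace:
    def __init__(self):
        # (src_len, dst_len) -> {(src_bits, dst_bits): (priority, action)}
        self.tables = {}

    def add(self, src, src_len, dst, dst_len, priority, action):
        table = self.tables.setdefault((src_len, dst_len), {})
        key = (prefix(src, src_len), prefix(dst, dst_len))
        # Lower number means higher priority; keep the best rule per key.
        if key not in table or priority < table[key][0]:
            table[key] = (priority, action)

    def classify(self, src, dst):
        """Probe every hash table once and return the best matching action."""
        best = None
        for (src_len, dst_len), table in self.tables.items():
            hit = table.get((prefix(src, src_len), prefix(dst, dst_len)))
            if hit is not None and (best is None or hit[0] < best[0]):
                best = hit
        return best[1] if best else None
```

    Each lookup here costs one probe per tuple, which is the search cost that schemes like the one in this abstract try to reduce by visiting fewer tables.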

  3. SQL based cardiovascular ultrasound image classification.

    Science.gov (United States)

    Nandagopalan, S; Suryanarayana, Adiga B; Sudarshan, T S B; Chandrashekar, Dhanalakshmi; Manjunath, C N

    2013-01-01

    This paper proposes a novel method to analyze and classify cardiovascular ultrasound echocardiographic images using a Naïve-Bayesian model via database OLAP-SQL. Efficient data mining algorithms based on a tightly-coupled model are used to extract features. Three algorithms are proposed for classification, namely Naïve-Bayesian Classifier for Discrete variables (NBCD) with SQL, NBCD with OLAP-SQL, and Naïve-Bayesian Classifier for Continuous variables (NBCC) using OLAP-SQL. The proposed model is trained with 207 patient images containing normal and abnormal categories. Out of the three proposed algorithms, a high classification accuracy of 96.59% was achieved from NBCC, which is better than the earlier methods.
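    A naïve Bayesian classifier for continuous variables can be built from per-class counts, sums and sums of squares, which are the kind of aggregates a SQL GROUP BY query would return. The Python sketch below shows that idea under stated assumptions: it uses Gaussian likelihoods, plain Python instead of SQL, and invented toy feature values. It is not the NBCC implementation from the paper, and the function names fit and predict are hypothetical.

```python
import math


def fit(rows):
    """rows: iterable of (label, [feature values]).

    Accumulates count, sum and sum of squares per class (the aggregates a
    GROUP BY would produce) and turns them into priors, means and variances.
    """
    stats = {}
    for label, x in rows:
        s = stats.setdefault(
            label, {"n": 0, "sum": [0.0] * len(x), "sq": [0.0] * len(x)}
        )
        s["n"] += 1
        for j, v in enumerate(x):
            s["sum"][j] += v
            s["sq"][j] += v * v

    total = sum(s["n"] for s in stats.values())
    model = {}
    for label, s in stats.items():
        n = s["n"]
        mean = [t / n for t in s["sum"]]
        # Population variance, floored to avoid division by zero.
        var = [max(q / n - m * m, 1e-9) for q, m in zip(s["sq"], mean)]
        model[label] = (n / total, mean, var)
    return model


def predict(model, x):
    """Return the class with the highest Gaussian log-posterior."""
    def log_posterior(prior, mean, var):
        score = math.log(prior)
        for v, m, s2 in zip(x, mean, var):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score

    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical two-feature training set (values are illustrative only).
rows = [
    ("normal", [1.0, 2.0]), ("normal", [1.2, 1.8]), ("normal", [0.8, 2.2]),
    ("abnormal", [3.0, 4.0]), ("abnormal", [3.2, 3.8]), ("abnormal", [2.8, 4.2]),
]
model = fit(rows)
```

    Because only running aggregates are needed, the training step maps naturally onto a single pass over a database table, which is presumably what makes a database-side formulation attractive.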

  4. Supervised Learning for Visual Pattern Classification

    Science.gov (United States)

    Zheng, Nanning; Xue, Jianru

    This chapter presents an overview of the topics and major ideas of supervised learning for visual pattern classification. Two prevalent algorithms, i.e., the support vector machine (SVM) and the boosting algorithm, are briefly introduced. SVMs and boosting algorithms are two hot topics of recent research in supervised learning. SVMs improve the generalization of the learning machine by implementing the rule of structural risk minimization (SRM). They exhibit good generalization even when little training data is available for machine training. The boosting algorithm can boost a weak classifier to a strong classifier by means of the so-called classifier combination. This algorithm provides a general way for producing a classifier with high generalization capability from a great number of weak classifiers.

  5. Automatic classification of defects in weld pipe

    International Nuclear Information System (INIS)

    Anuar Mikdad Muad; Mohd Ashhar Hj Khalid; Abdul Aziz Mohamad; Abu Bakar Mhd Ghazali; Abdul Razak Hamzah

    2000-01-01

    With the advancement of computer imaging technology, the image on hard radiographic film can be digitized and stored in a computer, and the manual process of defect recognition and classification may be replaced by the computer. In this paper, a computerized method for automatic detection and classification of common defects in film radiography of weld pipe is described. The detection and classification processes consist of automatic selection of the area of interest on the image and then classification of common defects using image processing and special algorithms. Analysis of the attributes of each defect, such as area, size, shape and orientation, is carried out by the feature analysis process. These attributes reveal the type of each defect. These methods of defect classification result in a high success rate. Our experience showed that sharp film images produced better results.

  6. Automatic classification of defects in weld pipe

    International Nuclear Information System (INIS)

    Anuar Mikdad Muad; Mohd Ashhar Khalid; Abdul Aziz Mohamad; Abu Bakar Mhd Ghazali; Abdul Razak Hamzah

    2001-01-01

    With the advancement of computer imaging technology, the image on hard radiographic film can be digitized and stored in a computer, and the manual process of defect recognition and classification may be replaced by the computer. In this paper, a computerized method for automatic detection and classification of common defects in film radiography of weld pipe is described. The detection and classification processes consist of automatic selection of the area of interest on the image and then classification of common defects using image processing and special algorithms. Analysis of the attributes of each defect, such as area, size, shape and orientation, is carried out by the feature analysis process. These attributes reveal the type of each defect. These methods of defect classification result in a high success rate. Our experience showed that sharp film images produced better results. (Author)

  7. Classification of positive blood cultures

    DEFF Research Database (Denmark)

    Gradel, Kim Oren; Knudsen, Jenny Dahl; Arpi, Magnus

    2012-01-01

    . For each classification, we tabulated episodes derived by the physicians' assessment and the computer algorithm and compared 30-day mortality between concordant and discrepant groups with adjustment for age, gender, and comorbidity. RESULTS: Physicians derived 9,482 reference episodes from 21,705 positive......- vs. hospital-onset, whereas there were no material differences within the other comparison groups. CONCLUSIONS: Using data from health administrative registries, we found high agreement between the computer algorithms and the physicians' assessments as regards contamination vs. bloodstream infection......ABSTRACT: BACKGROUND: Information from blood cultures is utilized for infection control, public health surveillance, and clinical outcome research. This information can be enriched by physicians' assessments of positive blood cultures, which are, however, often available from selected patient groups...

  8. On correlations in IMRT planning aims

    Science.gov (United States)

    Roy, Arkajyoti; Das, Indra J.

    2016-01-01

    The purpose was to study correlations amongst IMRT DVH evaluation points and how their relaxation impacts the overall plan. 100 head‐and‐neck cancer cases, using the Eclipse treatment planning system with the same protocol, are statistically analyzed for PTV, brainstem, and spinal cord. To measure variations amongst the plans, we use (i) interquartile range (IQR) of volume as a function of dose, (ii) interquartile range of dose as a function of volume, and (iii) dose falloff. To determine correlations for institutional and ICRU goals, conditional probabilities and medians are computed. We observe that most plans exceed the median PTV dose (average D50 = 104% prescribed dose). Furthermore, satisfying D50 reduced the probability of also satisfying D98, constituting a negative correlation of these goals. On the other hand, satisfying D50 increased the probability of satisfying D2, suggesting a positive correlation. A positive correlation is also observed between the PTV V105 and V110. Similarly, a positive correlation between the brainstem V45 and V50 is measured by an increase in the conditional median of V45, when V50 is violated. Despite the imposed institutional and international recommendations, significant variations amongst DVH points can occur. Even though DVH aims are evaluated independently, sizable correlations amongst them are possible, indicating that some goals cannot be satisfied concurrently, calling for unbiased plan criteria. PACS number(s): 87.55.dk, 87.53.Bn, 87.55.Qr, 87.55.de. PMID:27929480
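    The kind of correlation reported above can be illustrated by comparing the probability of meeting one goal with the probability of meeting it given that another goal is met. The Python sketch below uses a small invented set of plans (not data from the study); the goal names reuse the abstract's D50, D98 and D2 labels, and the helper function names are hypothetical.

```python
def marginal_probability(plans, target):
    """Fraction of plans that satisfy `target`."""
    return sum(p[target] for p in plans) / len(plans)


def conditional_probability(plans, given, target):
    """Fraction of plans satisfying `target` among those satisfying `given`."""
    subset = [p for p in plans if p[given]]
    if not subset:
        return None  # undefined when no plan satisfies `given`
    return sum(p[target] for p in subset) / len(subset)


# Hypothetical plans: True means the goal was satisfied (illustrative only).
plans = [
    {"D50": True,  "D98": False, "D2": True},
    {"D50": True,  "D98": False, "D2": True},
    {"D50": True,  "D98": True,  "D2": True},
    {"D50": False, "D98": True,  "D2": False},
    {"D50": False, "D98": True,  "D2": True},
    {"D50": False, "D98": True,  "D2": False},
]

# A conditional probability below the marginal suggests a negative
# association between the goals; above it, a positive one.
p_d98 = marginal_probability(plans, "D98")
p_d98_given_d50 = conditional_probability(plans, "D50", "D98")
p_d2 = marginal_probability(plans, "D2")
p_d2_given_d50 = conditional_probability(plans, "D50", "D2")
```

    In this toy set, satisfying D50 lowers the chance of also satisfying D98 and raises the chance of satisfying D2, mirroring the direction of the associations described in the abstract.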

  9. Rock suitability classification RSC 2012

    Energy Technology Data Exchange (ETDEWEB)

    McEwen, T. (ed.) [McEwen Consulting, Leicester (United Kingdom); Kapyaho, A. [Geological Survey of Finland, Espoo (Finland); Hella, P. [Saanio and Riekkola, Helsinki (Finland); Aro, S.; Kosunen, P.; Mattila, J.; Pere, T.

    2012-12-15

    This report presents Posiva's Rock Suitability Classification (RSC) system, developed for locating suitable rock volumes for repository design and construction. The RSC system comprises both the revised rock suitability criteria and the procedure for the suitability classification during the construction of the repository. The aim of the classification is to avoid such features of the host rock that may be detrimental to the favourable conditions within the repository, either initially or in the long term. This report also discusses the implications of applying the RSC system for the fulfilment of the regulatory requirements concerning the host rock as a natural barrier and the site's overall suitability for hosting a final repository of spent nuclear fuel.

  10. Rock suitability classification RSC 2012

    International Nuclear Information System (INIS)

    McEwen, T.; Kapyaho, A.; Hella, P.; Aro, S.; Kosunen, P.; Mattila, J.; Pere, T.

    2012-12-01

    This report presents Posiva's Rock Suitability Classification (RSC) system, developed for locating suitable rock volumes for repository design and construction. The RSC system comprises both the revised rock suitability criteria and the procedure for the suitability classification during the construction of the repository. The aim of the classification is to avoid such features of the host rock that may be detrimental to the favourable conditions within the repository, either initially or in the long term. This report also discusses the implications of applying the RSC system for the fulfilment of the regulatory requirements concerning the host rock as a natural barrier and the site's overall suitability for hosting a final repository of spent nuclear fuel

  11. Machine Learning an algorithmic perspective

    CERN Document Server

    Marsland, Stephen

    2009-01-01

    Traditional books on machine learning can be divided into two groups - those aimed at advanced undergraduates or early postgraduates with reasonable mathematical knowledge and those that are primers on how to code algorithms. The field is ready for a text that not only demonstrates how to use the algorithms that make up machine learning methods, but also provides the background needed to understand how and why these algorithms work. Machine Learning: An Algorithmic Perspective is that text. Theory Backed up by Practical Examples: The book covers neural networks, graphical models, reinforcement le

  12. Automatic Classification Using Supervised Learning in a Medical Document Filtering Application.

    Science.gov (United States)

    Mostafa, J.; Lam, W.

    2000-01-01

    Presents a multilevel model of the information filtering process that permits document classification. Evaluates a document classification approach based on a supervised learning algorithm, measures the accuracy of the algorithm in a neural network that was trained to classify medical documents on cell biology, and discusses filtering…

  13. A classification framework for drug relapse prediction | Salleh ...

    African Journals Online (AJOL)

    mining algorithms, Artificial Intelligence Neural Network (ANN) is one of the best algorithms to predict relapse among drug addicts. This may help the rehabilitation center to predict relapse individually and the prediction result is hoped to prevent drug addicts from relapse. Keywords: classification; artificial neural network; ...

  14. Cloud field classification based on textural features

    Science.gov (United States)

    Sengupta, Sailes Kumar

    1989-01-01

    An essential component in global climate research is accurate cloud cover and type determination. Of the two approaches to texture-based classification (statistical and textural), only the former is effective in the classification of natural scenes such as land, ocean, and atmosphere. In the statistical approach that was adopted, parameters characterizing the stochastic properties of the spatial distribution of grey levels in an image are estimated and then used as features for cloud classification. Two types of textural measures were used. One is based on the distribution of the grey level difference vector (GLDV), and the other on a set of textural features derived from the MaxMin cooccurrence matrix (MMCM). The GLDV method looks at the difference D of grey levels at pixels separated by a horizontal distance d and computes several statistics based on this distribution. These are then used as features in subsequent classification. The MaxMin textural features, on the other hand, are based on the MMCM, a matrix whose (I,J)th entry gives the relative frequency of occurrences of the grey level pair (I,J) that are consecutive and thresholded local extremes separated by a given pixel distance d. Textural measures are then computed based on this matrix in much the same manner as is done in texture computation using the grey level cooccurrence matrix. The database consists of 37 cloud field scenes from LANDSAT imagery using a near IR visible channel. The classification algorithm used is the well known Stepwise Discriminant Analysis. The overall accuracy was estimated by the percentage of correct classifications in each case. It turns out that both types of classifiers, at their best combination of features, and at any given spatial resolution, give approximately the same classification accuracy. A neural network based classifier with a feed forward architecture and a back propagation training algorithm is used to increase the classification accuracy, using these two classes
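    The GLDV step described in this record can be sketched in a few lines of Python. The sketch makes several assumptions that the abstract does not state: it uses the absolute grey level difference, and it computes four commonly used statistics (mean, contrast, angular second moment, entropy) that may differ from the ones used in the study. The function name gldv_features is hypothetical.

```python
import math
from collections import Counter


def gldv_features(image, d=1):
    """Texture statistics from the grey level difference distribution.

    image: list of rows, each a list of integer grey levels.
    d:     horizontal pixel separation.
    """
    diffs = Counter()
    for row in image:
        for x in range(len(row) - d):
            # Assumption: absolute difference between horizontally separated pixels.
            diffs[abs(row[x] - row[x + d])] += 1

    total = sum(diffs.values())
    if total == 0:
        raise ValueError("image is too narrow for the chosen distance d")

    # Normalised distribution of the difference D.
    p = {k: count / total for k, count in diffs.items()}
    return {
        "mean": sum(k * v for k, v in p.items()),
        "contrast": sum(k * k * v for k, v in p.items()),
        "asm": sum(v * v for v in p.values()),
        "entropy": -sum(v * math.log(v) for v in p.values()),
    }


flat = gldv_features([[5, 5, 5], [5, 5, 5]])   # uniform patch
stripes = gldv_features([[0, 10, 0, 10]])      # alternating grey levels
```

    A uniform patch gives zero contrast, while alternating grey levels give a large mean difference and contrast, so the resulting values can serve as a feature vector for a downstream classifier.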

  15. Automated Decision Tree Classification of Corneal Shape

    Science.gov (United States)

    Twa, Michael D.; Parthasarathy, Srinivasan; Roberts, Cynthia; Mahmoud, Ashraf M.; Raasch, Thomas W.; Bullimore, Mark A.

    2011-01-01

    Purpose: The volume and complexity of data produced during videokeratography examinations present a challenge of interpretation. As a consequence, results are often analyzed qualitatively by subjective pattern recognition or reduced to comparisons of summary indices. We describe the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way. We then compared this method with other known classification methods. Methods: The corneal surface was modeled with a seventh-order Zernike polynomial for 132 normal eyes of 92 subjects and 112 eyes of 71 subjects diagnosed with keratoconus. A decision tree classifier was induced using the C4.5 algorithm, and its classification performance was compared with the modified Rabinowitz–McDonnell index, Schwiegerling’s Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index using recommended classification thresholds for each method. We also evaluated the area under the receiver operator characteristic (ROC) curve for each classification method. Results: Our decision tree classifier performed equal to or better than the other classifiers tested: accuracy was 92% and the area under the ROC curve was 0.97. Our decision tree classifier reduced the information needed to distinguish between normal and keratoconus eyes using four of 36 Zernike polynomial coefficients. The four surface features selected as classification attributes by the decision tree method were inferior elevation, greater sagittal depth, oblique toricity, and trefoil. Conclusions: Automated decision tree classification of corneal shape through Zernike polynomials is an accurate quantitative method of classification that is interpretable and can be generated from any instrument platform capable of raw elevation data output. This method of pattern classification is extendable to other classification

  16. Comparative analysis of decision tree algorithms on quality of water contaminated with soil

    Directory of Open Access Journals (Sweden)

    Mara Andrea Dota

    2015-02-01

    Agriculture, roads, animal farms and other land uses may modify the water quality of rivers, dams and other surface freshwaters. In the control of the ecological process and for environmental management, it is necessary to quickly and accurately identify surface water contamination (in areas such as rivers and dams) with contaminated runoff waters coming, for example, from cultivation and urban areas. This paper presents a comparative analysis of different classification algorithms applied to the data collected from a sample of soil-contaminated water, aiming to identify whether the water quality classification proposed in this research agrees with reality. The sample was part of a laboratory experiment, which began with a sample of treated water to which increasing fractions of soil were added. The results show that the proposed classification for water quality in this scenario is coherent, because different algorithms indicated a strong statistical relationship between the classes and their instances, that is, between the classes that qualify the water sample and the values which describe each class. The proposed water classification varies from excellent to very awful (12 classes).

  17. Effect of threshold quantization in opportunistic splitting algorithm

    KAUST Repository

    Nam, Haewoon; Alouini, Mohamed-Slim

    2011-01-01

    This paper discusses algorithms to find the optimal threshold and also investigates the impact of threshold quantization on the scheduling outage performance of the opportunistic splitting scheduling algorithm. Since this algorithm aims at finding

  18. Performance Evaluation of Machine Learning Algorithms for Urban Pattern Recognition from Multi-spectral Satellite Images

    Directory of Open Access Journals (Sweden)

    Marc Wieland

    2014-03-01

    In this study, a classification and performance evaluation framework for the recognition of urban patterns in medium (Landsat ETM, TM and MSS) and very high resolution (WorldView-2, Quickbird, Ikonos) multi-spectral satellite images is presented. The study aims at exploring the potential of machine learning algorithms in the context of an object-based image analysis and to thoroughly test the algorithms' performance under varying conditions to optimize their usage for urban pattern recognition tasks. Four classification algorithms, Normal Bayes, K Nearest Neighbors, Random Trees and Support Vector Machines, which represent different concepts in machine learning (probabilistic, nearest neighbor, tree-based, function-based), have been selected and implemented on a free and open-source basis. Particular focus is given to assessing the generalization ability of machine learning algorithms and the transferability of trained learning machines between different image types and image scenes. Moreover, the influence of the number and choice of training data, the influence of the size and composition of the feature vector and the effect of image segmentation on the classification accuracy are evaluated.

  19. Universal Authenticated Item Monitoring System (AIMS) second generation equipment

    International Nuclear Information System (INIS)

    Schoeneman, J.L.; Baumann, M.J.; Fox, L.J.; Jenkins, C.D.; Perlinsk, A.W.

    1992-01-01

    Sandia National Laboratories (SNL) is in the final stages of developing a Universal Authenticated Item Monitoring System (AIMS). When completed, AIMS will provide applicable agencies in the US government, and those in the international arena, with a secure and convenient method of monitoring the physical status of selected items. The benefit derived from this development activity will be the commercial availability of an item monitoring system with the capability for "quick set-up" monitoring, as well as long-term unattended monitoring. The AIMS includes a variety of sensors, a robust and authenticated radio frequency (RF) communication link, a Receiver Processing Unit (RPU), and an inspector-friendly personal computer (PC) interface for collecting, sorting, viewing and archiving pertinent event histories. The system will provide the capability to monitor selected items in a real-time mode, a remotely interrogated mode, and a stand-alone, unattended data collection mode. The sensor suite under development includes advanced motion sensors, interior volumetric intrusion sensors, Re-usable, In-situ Verifiable Authenticated (RIVA) fiber-optic seal sensors, generic utility sensors (to accommodate contact closure inputs), and radiation and environmental sensors. A new generation authentication algorithm has recently been developed that provides a high degree of system security. The AIMS has potential safeguards applications in the areas of arms control and treaty verification, military asset control, International Atomic Energy Agency (IAEA) and Euratom safeguards verification activities, as well as domestic nuclear safeguard activities. Commercial applications could include high-value inventory control and security systems. This paper describes the second-generation AIMS along with its recently expanded sensor suite and enhanced data collection capabilities.

  20. Progressive Classification Using Support Vector Machines

    Science.gov (United States)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) a slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user