WorldWideScience

Sample records for classification algorithm aimed

  1. Development and Validation of a Spike Detection and Classification Algorithm Aimed at Implementation on Hardware Devices

    OpenAIRE

    Ferrigno, G.; Ghezzi, D; Pedrocchi, A.; Biffi, E

    2010-01-01

    Neurons cultured in vitro on MicroElectrode Array (MEA) devices connect to each other, forming a network. To study electrophysiological activity and long-term plasticity effects, long-period recordings and spike-sorting methods are needed. Therefore, on-line and real-time analysis, optimization of memory use and improvement of the data transmission rate become necessary. We developed an algorithm for amplitude-threshold spike detection, whose performance was verified with (a) statistical analysis o...

  2. Development and validation of a spike detection and classification algorithm aimed at implementation on hardware devices.

    Science.gov (United States)

    Biffi, E; Ghezzi, D; Pedrocchi, A; Ferrigno, G

    2010-01-01

    Neurons cultured in vitro on MicroElectrode Array (MEA) devices connect to each other, forming a network. To study electrophysiological activity and long-term plasticity effects, long-period recordings and spike-sorting methods are needed. Therefore, on-line and real-time analysis, optimization of memory use and improvement of the data transmission rate become necessary. We developed an algorithm for amplitude-threshold spike detection, whose performance was verified with (a) statistical analysis on both simulated and real signals and (b) Big O notation. Moreover, we developed a PCA-hierarchical classifier, evaluated on simulated and real signals. Finally, we proposed a spike detection hardware design on FPGA, whose feasibility was verified in terms of the number of CLBs, memory occupation and temporal requirements; once realized, it will be able to execute on-line detection and real-time waveform analysis, reducing data storage problems. PMID:20300592
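
    A minimal sketch of the amplitude-threshold detection step described above, assuming a single-channel MEA trace held in a NumPy array; the robust noise estimate, the multiplier k and the refractory window are illustrative choices, not the authors' exact parameters.

      import numpy as np

      def detect_spikes(signal, fs, k=5.0, refractory_ms=1.0):
          """Flag samples whose absolute amplitude exceeds k times a robust
          noise estimate, enforcing a refractory period between detections."""
          noise_sd = np.median(np.abs(signal)) / 0.6745   # robust noise estimate
          threshold = k * noise_sd
          refractory = int(refractory_ms * 1e-3 * fs)     # samples skipped after each spike
          spike_idx, i = [], 0
          while i < len(signal):
              if abs(signal[i]) > threshold:
                  spike_idx.append(i)
                  i += max(refractory, 1)
              else:
                  i += 1
          return np.array(spike_idx)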

  3. A Gene Selection Algorithm using Bayesian Classification Approach

    OpenAIRE

    Alok Sharma; Kuldip K. Paliwal

    2012-01-01

    In this study, we propose a new feature (or gene) selection algorithm using a Bayes classification approach. The algorithm can find a gene subset crucial for the cancer classification problem. Problem statement: Gene identification plays an important role in the human cancer classification problem. Several feature selection algorithms have been proposed for analyzing and understanding influential genes using gene expression profiles. Approach: The feature selection algorithms aim to explore genes that are c...

  4. An Ensemble Classification Algorithm for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    K.Kavitha

    2014-04-01

    Full Text Available Hyperspectral image analysis has been used for many purposes in environmental monitoring, remote sensing, vegetation research and also for land cover classification. A hyperspectral image consists of many layers, each of which represents a specific wavelength. The layers stack on top of one another, making a cube-like image for the entire spectrum. This work aims to classify hyperspectral images and to produce an accurate thematic map. Spatial information of the hyperspectral images is collected by applying morphological profiles and local binary patterns. The support vector machine is an efficient algorithm for classifying hyperspectral images. A genetic algorithm is used to obtain the best feature subset for classification. The selected features are classified to obtain the classes and to produce a thematic map. Experiments are carried out with the AVIRIS Indian Pines and ROSIS Pavia University datasets. The proposed method produces an accuracy of 93% for Indian Pines and 92% for Pavia University.
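
    The record above describes stacking spectral values with texture descriptors before an SVM. A rough per-pixel sketch under simplifying assumptions: a single LBP channel computed on the band-averaged image stands in for the full morphological-profile/LBP feature set, the GA feature-selection step is omitted, and `cube` and `labels` are hypothetical inputs.

      import numpy as np
      from skimage.feature import local_binary_pattern
      from sklearn.svm import SVC

      def pixel_features(cube):
          """cube: (rows, cols, bands) hyperspectral image.
          Append one LBP texture channel (computed on the mean band) to each pixel's spectrum."""
          mean_band = cube.mean(axis=2)
          lbp = local_binary_pattern(mean_band, P=8, R=1, method="uniform")
          stacked = np.dstack([cube, lbp[..., None]])
          return stacked.reshape(-1, cube.shape[2] + 1)

      # Hypothetical usage: labels is a (rows*cols,) array with 0 marking unlabelled pixels.
      # X = pixel_features(cube)
      # train = labels > 0
      # clf = SVC(kernel="rbf").fit(X[train], labels[train])
      # thematic_map = clf.predict(X).reshape(cube.shape[:2])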

  5. Projection Classification Based Iterative Algorithm

    Science.gov (United States)

    Zhang, Ruiqiu; Li, Chen; Gao, Wenhua

    2015-05-01

    The iterative algorithm performs well in 3D image reconstruction because it does not need complete projection data. It can be applied to the inspection of BGA solder joints, but its convergence is slow, and it is usually used with x-ray laminography, which yields a worse reconstruction image than the former. This paper applies a projection-classification-based method that separates the object into three parts, i.e. solute, solution and air, and assumes that the reconstruction speed decreases linearly from the solution toward the two other parts on either side. The SART and CAV algorithms are then improved under the proposed idea. Simulation experiments with incomplete projection images indicate the fast convergence of the improved iterative algorithms and the effectiveness of the proposed method. The fewer the projection images, the greater the superiority of the proposed method.

  6. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    OpenAIRE

    Dong-sheng Liu; Shu-jiang Fan

    2014-01-01

    In order to offer mobile customers better service, we should first classify mobile users. Aimed at the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as classification attributes for the mobile user, and we classify the context into public context and private context cla...

  7. Classification algorithms using adaptive partitioning

    KAUST Repository

    Binev, Peter

    2014-12-01

    © 2014 Institute of Mathematical Statistics. Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335–1353; Mach. Learn. 66 (2007) 209–242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms of the margin-condition parameter and a rate s of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness β of the regression function that governs its approximability by piecewise polynomials on adaptive partitions. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Hölder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.

  8. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    Science.gov (United States)

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, we should first classify mobile users. Aimed at the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as classification attributes for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile user data with the algorithm; we can classify mobile users into Basic service, E-service, Plus service, and Total service user classes, and we can also derive some rules about the mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and is simpler. PMID:24688389

  9. ONLINE REGULARIZED GENERALIZED GRADIENT CLASSIFICATION ALGORITHMS

    Institute of Scientific and Technical Information of China (English)

    Leilei Zhang; Baohui Sheng; Jianli Wang

    2010-01-01

    This paper considers online classification learning algorithms for regularized classification schemes with generalized gradient. A novel capacity-independent approach is presented. It verifies strong convergence and yields satisfactory convergence rates for polynomially decaying step sizes. Compared with the gradient schemes, this algorithm needs fewer additional assumptions on the loss function and derives a stronger result with respect to the choice of step sizes and the regularization parameters.

  10. Aime

    OpenAIRE

    Vermeulen, Christine

    2015-01-01

    INSEE code of the commune: 73006. Atlas link (MCC): http://atlas.patrimoines.culture.fr/atlas/trunk/index.php?ap_theme=DOM_2.01.02&ap_bbox=6.561;45.491;6.670;45.620 The archaeological evaluation carried out at Aime, at the locality Le Poëncet, rue du Prince, covers an area of 250 m2. This area lies at the western edge of the ancient forum of Aime. Two roughly built, perpendicular walls, oriented east-west and north-south, were uncovered. In the absence of any artefacts to date them, only their orien...

  11. Aime

    OpenAIRE

    Chemin, René; Feuillet, Marie-Pierre

    2013-01-01

    Archaeological operation identifier: 9728. Date of operation: 2008 (SP). The creation of a housing development at the locality Les Chaudannes prompted an evaluation led by P.-J. Rey, whose results led the regional Archaeology service to prescribe an excavation. The commune of Aime has already yielded numerous prehistoric and protohistoric remains: Neolithic occupation at the Dos de Borgaz, at Saint-Sigismond, at Le Replat (a Chamblandes-type necropolis) and, close to the present project, occupation...

  12. Behavior Classification Algorithms at Intersections

    OpenAIRE

    Aoude, Georges; Desaraju, Vishnu Rajeswar; Stephens, Lauren H.; How, Jonathan P.

    2011-01-01

    The ability to classify driver behavior lays the foundation for more advanced driver assistance systems. Improving safety at intersections has also been identified as a high priority due to the large number of intersection-related fatalities. This paper focuses on developing algorithms for estimating driver behavior at road intersections. It introduces two classes of algorithms that can classify drivers as compliant or violating. They are based on 1) Support Vector Machines (SVM) and 2) Hidden ...

  13. Mining Online Store Client Assessment Classification Rules with Genetic Algorithms

    OpenAIRE

    Galinina, A; Paršutins, S

    2011-01-01

    The paper presents the results of research into algorithms that are not originally meant to mine classification rules, yet contain all the necessary functions that allow them to be used for mining classification rules, such as the Genetic Algorithm (GA). The main task of the research is the application of the GA to classification rule mining. A classic GA was modified to match the chosen classification task and was compared with other popular classification algorithms – JRip, J48 and Nai...

  14. An Experimental Comparative Study on Three Classification Algorithms

    Institute of Scientific and Technical Information of China (English)

    蔡巍; 王永成; 李伟; 尹中航

    2003-01-01

    The classification algorithm is one of the key techniques affecting a text classification system's performance and plays an important role in automatic classification research. This paper comparatively analyzes k-NN, VSM and the hybrid classification algorithm presented by our research group. Some 2000 pieces of Internet news provided by ChinaInfoBank are used in the experiment. The results show that the performance of the hybrid algorithm presented by the group is superior to that of the other two algorithms.

  15. Automatic modulation classification principles, algorithms and applications

    CERN Document Server

    Zhu, Zhechen

    2014-01-01

    Automatic Modulation Classification (AMC) has been a key technology in many military, security, and civilian telecommunication applications for decades. In military and security applications, modulation often serves as another level of encryption; in modern civilian applications, multiple modulation types can be employed by a signal transmitter to control the data rate and link reliability. This book offers comprehensive documentation of AMC models, algorithms and implementations for successful modulation recognition. It provides an invaluable theoretical and numerical comparison of AMC algo

  16. A new classification algorithm based on RGH-tree search

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we put forward a new classification algorithm based on RGH-tree search and perform a classification analysis and comparison study. This algorithm can save computing resources and increase classification efficiency. The experiment shows that this algorithm achieves better results in dealing with three-dimensional multi-class data. We find that the algorithm has better generalization ability for a small training set and a big testing set.

  17. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  18. A non-linear learning & classification algorithm that achieves full training accuracy with stellar classification accuracy

    OpenAIRE

    Khogali, Rashid

    2014-01-01

    A fast non-linear and non-iterative learning and classification algorithm is synthesized and validated. This algorithm, named the "Reverse Ripple Effect" (R.R.E), achieves 100% learning accuracy but is computationally expensive upon classification. The R.R.E is a (deterministic) algorithm that superimposes Gaussian weighted functions on training points. In this work, the R.R.E algorithm is compared against known learning and classification techniques/algorithms such as: the Perceptron Criterio...

  19. Machine Learning Algorithms in Web Page Classification

    Directory of Open Access Journals (Sweden)

    W.A.AWAD

    2012-11-01

    Full Text Available In this paper we use machine learning algorithms like SVM, KNN and GIS to perform a behavior comparison on the web page classification problem. From the experiment we see that the SVM, with a small number of negative documents used to build the centroids, has the smallest storage requirement and the least on-line test computation cost. But almost all GIS configurations with different numbers of nearest neighbors have an even higher storage requirement and on-line test computation cost than KNN. This suggests that future work should try to reduce the storage requirement and on-line test cost of GIS.

  20. An SMP soft classification algorithm for remote sensing

    Science.gov (United States)

    Phillips, Rhonda D.; Watson, Layne T.; Easterling, David R.; Wynne, Randolph H.

    2014-07-01

    This work introduces a symmetric multiprocessing (SMP) version of the continuous iterative guided spectral class rejection (CIGSCR) algorithm, a semiautomated classification algorithm for remote sensing (multispectral) images. The algorithm uses soft data clusters to produce a soft classification containing inherently more information than a comparable hard classification, at an increased computational cost. Previous work suggests that similar algorithms achieve good parallel scalability, motivating the parallel algorithm development work here. Experimental results of applying parallel CIGSCR to an image with approximately 10^8 pixels and six bands demonstrate superlinear speedup. A soft two-class classification is generated in just over 4 min using 32 processors.

  1. Multiscale modeling for classification of SAR imagery using hybrid EM algorithm and genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    Xianbin Wen; Hua Zhang; Jianguang Zhang; Xu Jiao; Lei Wang

    2009-01-01

    A novel method that hybridizes the genetic algorithm (GA) and the expectation maximization (EM) algorithm for the classification of synthetic aperture radar (SAR) imagery is proposed, based on the finite Gaussian mixture model (GMM) and the multiscale autoregressive (MAR) model. This algorithm is capable of improving the global optimality and consistency of the classification performance. The experiments on SAR images show that the proposed algorithm outperforms the standard EM method significantly in classification accuracy.

  2. Support vector classification algorithm based on variable parameter linear programming

    Institute of Scientific and Technical Information of China (English)

    Xiao Jianhua; Lin Jian

    2007-01-01

    To solve the problems of SVM in dealing with large sample sizes and asymmetrically distributed samples, a support vector classification algorithm based on variable parameter linear programming is proposed. In the proposed algorithm, linear programming is employed to solve the classification optimization problem, which decreases the computation time and reduces the complexity compared with the original model. The adjusted punishment parameter greatly reduces the classification error resulting from asymmetrically distributed samples, and the detailed procedure of the proposed algorithm is given. An experiment is conducted to verify whether the proposed algorithm is suitable for asymmetrically distributed samples.

  3. Improved RMR Rock Mass Classification Using Artificial Intelligence Algorithms

    Science.gov (United States)

    Gholami, Raoof; Rasouli, Vamegh; Alimoradi, Andisheh

    2013-09-01

    Rock mass classification systems such as rock mass rating (RMR) are very reliable means to provide information about the quality of rocks surrounding a structure as well as to propose suitable support systems for unstable regions. Many correlations have been proposed to relate measured quantities such as wave velocity to rock mass classification systems to limit the associated time and cost of conducting the sampling and mechanical tests conventionally used to calculate RMR values. However, these empirical correlations have been found to be unreliable, as they usually overestimate or underestimate the RMR value. The aim of this paper is to compare the results of RMR classification obtained from the use of empirical correlations versus machine-learning methodologies based on artificial intelligence algorithms. The proposed methods were verified based on two case studies located in northern Iran. Relevance vector regression (RVR) and support vector regression (SVR), as two robust machine-learning methodologies, were used to predict the RMR for tunnel host rocks. RMR values already obtained by sampling and site investigation at one tunnel were taken into account as the output of the artificial networks during training and testing phases. The results reveal that use of empirical correlations overestimates the predicted RMR values. RVR and SVR, however, showed more reliable results, and are therefore suggested for use in RMR classification for design purposes of rock structures.

  4. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Science.gov (United States)

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of neural networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933
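
    A compact sketch of the idea of driving variable selection with SVM accuracy as the GA fitness; the truncation selection, one-point crossover and bit-flip mutation used here are generic textbook operators, and the population and generation sizes are arbitrary, not necessarily the paper's choices.

      import numpy as np
      from sklearn.model_selection import cross_val_score
      from sklearn.svm import SVC

      def fitness(mask, X, y):
          """Cross-validated SVM accuracy on the variables selected by the binary mask."""
          if mask.sum() == 0:
              return 0.0
          return cross_val_score(SVC(kernel="rbf"), X[:, mask.astype(bool)], y, cv=5).mean()

      def ga_select(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
          rng = np.random.default_rng(seed)
          n = X.shape[1]
          pop = rng.integers(0, 2, size=(pop_size, n))
          for _ in range(generations):
              scores = np.array([fitness(ind, X, y) for ind in pop])
              parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # truncation selection
              children = []
              for _ in range(pop_size - len(parents)):
                  a, b = parents[rng.integers(len(parents), size=2)]
                  cut = rng.integers(1, n)                              # one-point crossover
                  child = np.concatenate([a[:cut], b[cut:]])
                  flip = rng.random(n) < p_mut                          # bit-flip mutation
                  child[flip] = 1 - child[flip]
                  children.append(child)
              pop = np.vstack([parents, children])
          scores = np.array([fitness(ind, X, y) for ind in pop])
          return pop[scores.argmax()].astype(bool)                      # selected-variable mask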

  5. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Full Text Available Given the background of the use of neural networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  6. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    OpenAIRE

    C. Fernandez-Lozano; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of neural networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  7. Discovering comprehensible classification rules with a genetic algorithm

    OpenAIRE

    Fidelis, M.V.; Lopes, Heitor S.; Freitas, Alex. A.

    2000-01-01

    Presents a classification algorithm based on genetic algorithms (GAs) that discovers comprehensible IF-THEN rules, in the spirit of data mining. The proposed GA has a flexible chromosome encoding, where each chromosome corresponds to a classification rule. Although the number of genes (the genotype) is fixed, the number of rule conditions (the phenotype) is variable. The GA also has specific mutation operators for this chromosome encoding. The algorithm was evaluated on two public-domain real...

  8. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    Directory of Open Access Journals (Sweden)

    Hongxia Li

    2013-08-01

    Full Text Available With the development of computer science and information technology, libraries are developing toward information and network services. The library digitization process converts books into digital information, whose high-quality preservation and management are achieved by computer technology as well as text classification techniques, realizing knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward an ICA semantic clustering algorithm, realizing independent component analysis for complex-network text classification. Through the ICA clustering algorithm based on independent components, character-word clustering extraction for text classification is realized and the visualization of text retrieval is improved. Finally, we make a comparative analysis of the collocation algorithm and the ICA clustering algorithm through text classification and keyword search experiments, and give the clustering degree and accuracy figures for each algorithm. Through simulation analysis, we find that the ICA clustering algorithm improves the clustering degree of text classification by 1.2% and the accuracy by up to 11.1%. It improves the efficiency and accuracy of text classification retrieval and also provides a theoretical reference for the text retrieval classification of eBooks.

  9. Comparative Analysis of Serial Decision Tree Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Matthew Nwokejizie Anyanwu

    2009-09-01

    Full Text Available Classification of data objects based on predefined knowledge of the objects is a data mining and knowledge management technique used to group similar data objects together. It can be defined as a supervised learning technique, as it assigns class labels to data objects based on the relationship between the data items and a pre-defined class label. Classification algorithms have a wide range of applications such as churn prediction, fraud detection, artificial intelligence, and credit card rating. There are many classification algorithms available in the literature, but decision trees are the most commonly used because of their ease of implementation and because they are easier to understand than other classification algorithms. The decision tree classification algorithm can be implemented in a serial or parallel fashion based on the volume of data, the memory space available on the computing resource and the scalability of the algorithm. In this paper we review serial implementations of the decision tree algorithms and identify those that are commonly used. We also use experimental analysis based on sample data records (the Statlog data sets) to evaluate the performance of the commonly used serial decision tree algorithms.

  10. The software application and classification algorithms for welds radiograms analysis

    Science.gov (United States)

    Sikora, R.; Chady, T.; Baniukiewicz, P.; Grzywacz, B.; Lopato, P.; Misztal, L.; Napierała, L.; Piekarczyk, B.; Pietrusewicz, T.; Psuj, G.

    2013-01-01

    The paper presents a software implementation of an Intelligent System for Radiogram Analysis (ISAR). The system has to support radiologists in weld quality inspection. The image processing part of the software, with a graphical user interface, and a weld classification part are described, together with selected classification results. Classification was based on a few algorithms: an artificial neural network, k-means clustering, a simplified k-means and rough set theory.

  11. Comparative Analysis of Serial Decision Tree Classification Algorithms

    OpenAIRE

    Matthew Nwokejizie Anyanwu; Sajjan Shiva

    2009-01-01

    Classification of data objects based on predefined knowledge of the objects is a data mining and knowledge management technique used to group similar data objects together. It can be defined as a supervised learning technique, as it assigns class labels to data objects based on the relationship between the data items and a pre-defined class label. Classification algorithms have a wide range of applications such as churn prediction, fraud detection, artificial intelligence, and credit card ra...

  12. Comparative Evaluation of Packet Classification Algorithms, with Implementation

    Directory of Open Access Journals (Sweden)

    Hediyeh AmirJahanshahi Sistani

    2014-05-01

    Full Text Available In a realm of ever-increasing Internet connectivity, together with swelling computer security threats, security-cognizant network applications technology is gaining widespread popularity. Packet classifiers are extensively employed for numerous network applications in different types of network devices such as firewalls and routers, among others. Appreciating the tangible performance of recommended packet classifiers is a prerequisite for both algorithm creators and consumers. However, this is occasionally challenging to accomplish. Each innovative algorithm published is assessed from diverse perspectives and is founded on different suppositions. Devoid of a mutual foundation, it is virtually impossible to compare different algorithms directly; a common basis would also aid system implementers in effortlessly picking the most suitable algorithm for their actual applications. Electing an ineffectual algorithm for an application can invite major expenditures. This is particularly true for packet classification in network routers, as packet classification is fundamentally a tough problem and all current algorithms are constructed on specific heuristics and filter-set characteristics. The performance of the packet classification subsystem is vital for the aggregate success of network routers. In this study, we have piloted an advanced exploration of the existing algorithms to provide a comparative evaluation of a number of known classification algorithms that have been considered for both software and hardware implementation. We have explained our earlier suggested DimCut packet classification algorithm and related it to the BV, HiCuts and HyperCuts decision-tree-based packet classification algorithms within the comparative evaluation analysis. This comparison has been carried out on implementations based on the same principles and design choices from different sources. Performance measurements have been obtained by feeding the implemented

  13. A Study of Different Quality Evaluation Functions in the cAnt-MinerPB Classification Algorithm

    OpenAIRE

    Medland, Matthew; Otero, Fernando E. B.

    2012-01-01

    Ant colony optimization (ACO) algorithms for classification in general employ a sequential covering strategy to create a list of classification rules. A key component in this strategy is the selection of the rule quality function, since the algorithm aims at creating one rule at a time using an ACO-based procedure to search the best rule. Recently, an improved strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules inst...

  14. Comparison of Supervised and Unsupervised Learning Algorithms for Pattern Classification

    Directory of Open Access Journals (Sweden)

    R. Sathya

    2013-02-01

    Full Text Available This paper presents a comparative account of unsupervised and supervised learning models and their pattern classification evaluations as applied to the higher education scenario. Classification plays a vital role in machine-based learning algorithms, and in the present study we found that, though the error back-propagation learning algorithm provided by the supervised learning model is very efficient for a number of non-linear real-time problems, the KSOM of the unsupervised learning model offers an efficient solution and classification in the present study.

  15. An Algorithm for Classification of 3-D Spherical Spatial Points

    Institute of Scientific and Technical Information of China (English)

    ZHU Qing-xin; Mudur SP; LIU Chang; PENG Bo; WU Jia

    2003-01-01

    This paper presents a highly efficient algorithm for the classification of 3D points sampled from many spheres, using the neighboring relations of spatial points to construct a neighbor graph from a point cloud. This algorithm can be used in object recognition, computer vision, CAD model building, etc.

  16. Using Genetic Algorithms for Texts Classification Problems

    Directory of Open Access Journals (Sweden)

    A. A. Shumeyko

    2009-01-01

    Full Text Available The avalanche of information produced by mankind has led to the concept of automated knowledge extraction – Data Mining [1]. This direction is connected with a wide spectrum of problems, from recognition of fuzzy sets to the creation of search engines. An important component of Data Mining is the processing of textual information. Such problems rest on the concepts of classification and clustering [2]. Classification consists in determining the membership of some element (a text) in one of several previously created classes. Clustering means splitting a set of elements (texts) into clusters, whose number is determined by the localization of the elements of the given set in the vicinities of the natural centers of these clusters. Solving a classification problem should initially rest on given postulates, the basic one being the a priori information on the primary set of texts and a measure of affinity between elements and classes.

  17. FEATURES EXTRACTION ALGORITHM FROM SGML FOR CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Zailani Abdullah

    2007-06-01

    Full Text Available The basic phases in text categorization include preprocessing features, extracting relevant features against the features in a database, and finally categorizing a set of documents into predefined categories. Most research in text categorization focuses on the development of algorithms and computer techniques, while the algorithm for preprocessing features is treated like a "black box" and ignored. Thus, it is significant and worthwhile to develop an algorithm for preprocessing features that can be used by beginners before going into depth in the field of text categorization. This research proposes an algorithm for preprocessing features built with Microsoft .NET framework technology. The actual implementation shows that this algorithm can extract the features of interest from a standard corpus collection and upload them into a relational database.

  18. Discovering Fuzzy Censored Classification Rules (Fccrs: A Genetic Algorithm Approach

    Directory of Open Access Journals (Sweden)

    Renu Bala

    2012-07-01

    Full Text Available Classification Rules (CRs) are often discovered in the form of ‘If-Then’ Production Rules (PRs). PRs, being high-level symbolic rules, are comprehensible and easy to implement. However, they are not capable of dealing with cognitive uncertainties like vagueness and ambiguity, which are imperative to real-world decision-making situations. Fuzzy Classification Rules (FCRs) based on fuzzy logic provide a framework for flexible, human-like reasoning involving linguistic variables. Moreover, a classification system consisting of simple ‘If-Then’ rules is not competent in handling exceptional circumstances. In this paper, we propose a Genetic Algorithm approach to discover Fuzzy Censored Classification Rules (FCCRs). An FCCR is a Fuzzy Classification Rule (FCR) augmented with censors. Here, censors are exceptional conditions in which the behaviour of a rule gets modified. The proposed algorithm works in two phases. In the first phase, the Genetic Algorithm discovers Fuzzy Classification Rules. Subsequently, these Fuzzy Classification Rules are mutated to produce FCCRs in the second phase. An appropriate encoding scheme, fitness function and genetic operators are designed for the discovery of FCCRs. The proposed approach for discovering FCCRs is then illustrated on a synthetic dataset.

  19. Discovering Fuzzy Censored Classification Rules (Fccrs: A Genetic Algorithm Approach

    Directory of Open Access Journals (Sweden)

    Renu Bala

    2012-08-01

    Full Text Available Classification Rules (CRs) are often discovered in the form of ‘If-Then’ Production Rules (PRs). PRs, being high-level symbolic rules, are comprehensible and easy to implement. However, they are not capable of dealing with cognitive uncertainties like vagueness and ambiguity, which are imperative to real-world decision-making situations. Fuzzy Classification Rules (FCRs) based on fuzzy logic provide a framework for flexible, human-like reasoning involving linguistic variables. Moreover, a classification system consisting of simple ‘If-Then’ rules is not competent in handling exceptional circumstances. In this paper, we propose a Genetic Algorithm approach to discover Fuzzy Censored Classification Rules (FCCRs). An FCCR is a Fuzzy Classification Rule (FCR) augmented with censors. Here, censors are exceptional conditions in which the behaviour of a rule gets modified. The proposed algorithm works in two phases. In the first phase, the Genetic Algorithm discovers Fuzzy Classification Rules. Subsequently, these Fuzzy Classification Rules are mutated to produce FCCRs in the second phase. An appropriate encoding scheme, fitness function and genetic operators are designed for the discovery of FCCRs. The proposed approach for discovering FCCRs is then illustrated on a synthetic dataset.

  20. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    OpenAIRE

    Hongxia Li

    2013-01-01

    With the development of computer science and information technology, libraries are developing toward information and network services. The library digitization process converts books into digital information. High-quality preservation and management are achieved by computer technology as well as text classification techniques, realizing knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward an ICA semantic clustering algorithm. It...

  1. Intrusion Detection in Mobile Ad Hoc Networks Using Classification Algorithms

    CERN Document Server

    Mitrokotsa, Aikaterini; Douligeris, Christos

    2008-01-01

    In this paper we present the design and evaluation of intrusion detection models for MANETs using supervised classification algorithms. Specifically, we evaluate the performance of the MultiLayer Perceptron (MLP), the Linear classifier, the Gaussian Mixture Model (GMM), the Naive Bayes classifier and the Support Vector Machine (SVM). The performance of the classification algorithms is evaluated under different traffic conditions and mobility patterns for the Black Hole, Forging, Packet Dropping, and Flooding attacks. The results indicate that Support Vector Machines exhibit high accuracy for almost all simulated attacks and that Packet Dropping is the hardest attack to detect.

  2. Algorithms for classification of combinatorial objects

    OpenAIRE

    Kaski, Petteri

    2005-01-01

    A recurrently occurring problem in combinatorics is the need to completely characterize a finite set of finite objects implicitly defined by a set of constraints. For example, one could ask for a list of all possible ways to schedule a football tournament for twelve teams: every team is to play against every other team during an eleven-round tournament, such that every team plays exactly one game in every round. Such a characterization is called a classification for the objects of interest. C...

  3. A Hybrid Applied Optimization Algorithm for Training Multi-Layer Neural Networks in the Data Classification

    OpenAIRE

    ÖRKÇÜ, H. Hasan; Mustafa İsa DOĞAN; Örkçü, Mediha

    2015-01-01

    The backpropagation algorithm is a classical technique used in the training of artificial neural networks. Since this algorithm has many disadvantages, the training of neural networks has been implemented with various optimization methods. In this paper, a hybrid intelligent model, hybridGSA, is developed for training artificial neural networks (ANN) and undertaking data classification problems. The hybrid intelligent system aims to exploit the advantages of genetic and simulated annea...

  4. Incremental learning algorithm for spike pattern classification

    OpenAIRE

    Mohemmed, A; Kasabov, N

    2012-01-01

    In a previous work (Mohemmed et al.), the authors proposed a supervised learning algorithm to train a spiking neuron to associate input/output spike patterns. In this paper, the association learning rule is applied in training a single layer of spiking neurons to classify multiclass spike patterns whereby the neurons are trained to recognize an input spike pattern by emitting a predetermined spike train. The training is performed in incremental fashion, i.e. the synaptic weights are adjusted ...

  5. Online Network Traffic Classification Algorithm Based on RVM

    Directory of Open Access Journals (Sweden)

    Zhang Qunhui

    2013-06-01

    Full Text Available Compared with the Support Vector Machine (SVM), the Relevance Vector Machine (RVM) not only avoids the over-learning that is characteristic of the SVM, but also greatly reduces the amount of kernel-function computation and avoids the SVM's drawbacks that its sparsity is not strong, its amount of calculation is large, its kernel function must satisfy Mercer's condition and its parameters are determined empirically. We therefore propose a new online traffic classification algorithm based on the RVM. Through analysis of the basic principles of the RVM and of the modeling steps, we use the trained RVM traffic classification model to identify network traffic in real time, together with the "port number + DPI" approach. When the RVM prediction probability falls within the query interval, we jointly use the "port number" and "DPI". Finally, a detailed experimental validation shows that, compared with the SVM network traffic classification algorithm, this algorithm can achieve online network traffic classification, and the classification prediction probability is greatly improved.

  6. Optimal classification of standoff bioaerosol measurements using evolutionary algorithms

    Science.gov (United States)

    Nyhavn, Ragnhild; Moen, Hans J. F.; Farsund, Øystein; Rustad, Gunnar

    2011-05-01

    Early warning systems based on standoff detection of biological aerosols require real-time signal processing of a large quantity of high-dimensional data, challenging the system's efficiency in terms of both computational complexity and classification accuracy. Hence, optimal feature selection is essential in forming a stable and efficient classification system. This involves finding optimal signal processing parameters, characteristic spectral frequencies and other data transformations in a large-magnitude variable space, stating the need for an efficient and smart search algorithm. Evolutionary algorithms are population-based optimization methods inspired by Darwinian evolutionary theory. These methods apply selection, mutation and recombination to a population of competing solutions and optimize this set by evolving the population of solutions over generations. We have employed genetic algorithms in the search for optimal feature selection and signal processing parameters for the classification of biological agents. The experimental data were acquired with a spectrally resolved lidar based on ultraviolet laser-induced fluorescence, and included several releases of 5 common simulants. The genetic algorithm outperforms benchmark methods involving analytic, sequential and random methods like support vector machines, Fisher's linear discriminant and principal component analysis, with significantly improved classification accuracy compared to the best classical method.

  7. Classification algorithm of Web document in ionization radiation

    International Nuclear Information System (INIS)

    Resources on the Internet are numerous. How to mine the resources of a particular field or trade more efficiently is one of the research directions of Web mining (WM). The paper studies the classification of Web documents in ionization radiation (IR) based on the Bayes, Rocchio and Widrow-Hoff algorithms, and analyses the results of the trials. (authors)

  8. A Study on Sub-pixel Interpolation Filtering Algorithm and Hardware Structural Design Aiming at HEVC

    Directory of Open Access Journals (Sweden)

    Wang Gang

    2013-07-01

    Full Text Available Aiming at HEVC, the new-generation video compression standard being formulated, a sub-pixel interpolation filtering algorithm is proposed (luminance: 1/4 precision; chrominance: 1/8 precision). Based on the algorithm, a hardware design with a pipeline structure and a high degree of parallelism is put forward. The hardware overhead is reduced by multiplexing the Wiener filter and reducing the size of the register array, and an interpolation order with vertical priority is adopted to reduce the reading bandwidth of the storage. The performance analysis indicates that this interpolation structure possesses better performance and smaller hardware overhead. This design also takes full consideration of the balance between speed and area, meeting the requirements for processing standard-definition and high-definition video images.

  9. Improved Feature Weight Algorithm and Its Application to Text Classification

    Directory of Open Access Journals (Sweden)

    Songtao Shang

    2016-01-01

    Full Text Available Text preprocessing is one of the key problems in pattern recognition and plays an important role in the process of text classification. Text preprocessing has two pivotal steps: feature selection and feature weighting. The preprocessing results can directly affect the classifiers’ accuracy and performance. Therefore, choosing the appropriate algorithm for feature selection and feature weighting to preprocess the document can greatly improve the performance of classifiers. According to the Gini Index theory, this paper proposes an Improved Gini Index algorithm. This algorithm constructs a new feature selection and feature weighting function. The experimental results show that this algorithm can improve the classifiers’ performance effectively. At the same time, this algorithm is applied to a sensitive information identification system and has achieved a good result. The algorithm’s precision and recall are higher than those of traditional ones. It can identify sensitive information on the Internet effectively.
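
    A small illustration of Gini-style feature scoring for text, in the generic sum-of-squared class-posterior form; this is not necessarily the exact Improved Gini Index formula of the paper.

      import numpy as np

      def gini_scores(X, y):
          """Score each term by the class purity of the documents containing it.
          X: (docs, terms) count matrix (dense); y: class labels.
          Higher score = the term concentrates in fewer classes."""
          X = (np.asarray(X) > 0).astype(float)
          y = np.asarray(y)
          df = X.sum(axis=0) + 1e-12                # document frequency of each term
          scores = np.zeros(X.shape[1])
          for c in np.unique(y):
              df_c = X[y == c].sum(axis=0)          # documents of class c containing the term
              scores += (df_c / df) ** 2            # sum over classes of P(c | term)^2
          return scores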

  10. Novel classification method for remote sensing images based on information entropy discretization algorithm and vector space model

    Science.gov (United States)

    Xie, Li; Li, Guangyao; Xiao, Mang; Peng, Lei

    2016-04-01

    Various kinds of remote sensing image classification algorithms have been developed to adapt to the rapid growth of remote sensing data. Conventional methods typically have restrictions in either classification accuracy or computational efficiency. Aiming to overcome the difficulties, a new solution for remote sensing image classification is presented in this study. A discretization algorithm based on information entropy is applied to extract features from the data set and a vector space model (VSM) method is employed as the feature representation algorithm. Because of the simple structure of the feature space, the training rate is accelerated. The performance of the proposed method is compared with two other algorithms: back propagation neural networks (BPNN) method and ant colony optimization (ACO) method. Experimental results confirm that the proposed method is superior to the other algorithms in terms of classification accuracy and computational efficiency.

  11. Pyroelectric sensors and classification algorithms for border / perimeter security

    Science.gov (United States)

    Jacobs, Eddie L.; Chari, Srikant; Halford, Carl; McClellan, Harry

    2009-09-01

    It has been shown that useful classifications can be made with a sensor that detects the shape of moving objects. This type of sensor has been referred to as a profiling sensor. In this research, two configurations of pyroelectric detectors are considered for use in a profiling sensor, a linear array and a circular array. The linear array produces crude images representing the shape of objects moving through the field of view. The circular array produces a temporal motion vector. A simulation of the output of each detector configuration is created and used to generate simulated profiles. The simulation is performed by convolving the pyroelectric detector response with images derived from calibrated thermal infrared video sequences. Profiles derived from these simulations are then used to train and test classification algorithms. Classification algorithms examined in this study include a naive Bayesian (NB) classifier and Linear discriminant analysis (LDA). Each classification algorithm assumes a three class problem where profiles are classified as either human, animal, or vehicle. Simulation results indicate that these systems can reliably classify outputs from these types of sensors. These types of sensors can be used in applications involving border or perimeter security.

  12. A multi-feature based morphological algorithm for ST shape classification.

    Science.gov (United States)

    Fan, Shuqiong; Miao, Fen; Ma, Ruiqing; Li, Ye; Huang, Xuhui

    2015-08-01

    An abnormal ST segment is an important parameter for the diagnosis of myocardial ischemia and other heart diseases. As most abnormal ST segments last for only a few seconds, it is impractical for doctors to detect and classify abnormal ones manually in time. Even though many ST segment classification algorithms have been proposed to meet the rising demand for automatic myocardial ischemia diagnosis, they often have low recognition rates. The aim of this study is to detect abnormal ST segments precisely and classify them into more categories, and thus provide more detailed category information to help clinicians make decisions. This study sums up ten common abnormal ST segment shapes according to clinical ECG records and proposes a morphological classification algorithm for the ST segment based on multiple features. The algorithm consists of two parts: feature point extraction and ST segment classification. In the first part, the R wave is detected using the 2B-spline wavelet transform, and a mode-filtering method and morphological characteristics are used to extract the other feature points. In the ST segment classification process, the ST segment level, variance, slope value, number of convex/concave points and other feature parameters are employed to classify the ST segment. The algorithm can classify abnormal ST segments into the ten categories above. We evaluated the performance of the proposed algorithm on ECG data from the European ST-T database. A global recognition rate of 92.7% and a best accuracy of 97% demonstrate the effectiveness of the proposed solution. PMID:26737618

  13. Assessing the Accuracy of Prediction Algorithms for Classification

    DEFF Research Database (Denmark)

    Baldi, P.; Brunak, Søren; Chauvin, Y.; Nielsen, Henrik

    2000-01-01

    We provide a unified overview of methods that currently are widely used to assess the accuracy of prediction algorithms, from raw percentages, quadratic error measures and other distances, and correlation coefficients, to information-theoretic measures such as relative entropy and mutual information. We briefly discuss the advantages and disadvantages of each approach. For classification tasks, we derive new learning algorithms for the design of prediction systems by directly optimising the correlation coefficient. We observe and prove several results relating sensitivity and specificity of optimal systems. While the principles are general, we illustrate their applicability on specific problems such as protein secondary structure and signal peptide prediction.
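
    The correlation coefficient referred to above is, for binary classification, commonly the Matthews correlation coefficient; a small sketch of its computation from confusion-matrix counts (the example numbers are made up):

      import numpy as np

      def matthews_cc(tp, fp, tn, fn):
          """Matthews correlation coefficient from binary confusion-matrix counts.
          Returns 0 when any marginal total is zero (the usual convention)."""
          denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
          return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

      # Example: 90 true positives, 10 false positives, 85 true negatives, 15 false negatives
      print(matthews_cc(90, 10, 85, 15))   # about 0.75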

  14. Classification of Different Species Families using Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    D.Chandravathi

    2010-08-01

    Full Text Available The division of similar objects into groups is known as clustering. The main objective of this implementation is the classification of DNA sequences related to different species and their families using a clustering algorithm, the Leader-Subleader algorithm. Clustering is done with the help of a threshold value on the scoring matrix. It is another simple and efficient technique that may help to find the family, superfamily and sub-family by generating sub-clusters. From this analysis, there may be a chance that members of a sub-cluster are affected if one of the leader clusters gets affected.
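
    A minimal sketch of the one-pass leader step that underlies Leader-Subleader clustering, assuming a generic pairwise similarity function (e.g. an alignment score) and a fixed threshold; the sub-leader refinement inside each cluster is not shown.

      def leader_cluster(items, similarity, threshold):
          """Each item joins the first existing leader whose similarity meets the
          threshold; otherwise it becomes the leader of a new cluster."""
          clusters = []                      # list of (leader, members) pairs
          for item in items:
              for leader, members in clusters:
                  if similarity(leader, item) >= threshold:
                      members.append(item)
                      break
              else:                          # no leader was close enough
                  clusters.append((item, [item]))
          return clusters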

  15. Optimized features selection for gender classification using optimization algorithms

    OpenAIRE

    KHAN, Sajid Ali; Nazir, Muhammad; RIAZ, Naveed

    2013-01-01

    Optimized feature selection is an important task in gender classification. The optimized features not only reduce the dimensions, but also reduce the error rate. In this paper, we have proposed a technique for the extraction of facial features using both appearance-based and geometric-based feature extraction methods. The extracted features are then optimized using particle swarm optimization (PSO) and the bee algorithm. The geometric-based features are optimized by PSO with ensem...

  16. Protein Sequence Classification with Improved Extreme Learning Machine Algorithms

    Directory of Open Access Journals (Sweden)

    Jiuwen Cao

    2014-01-01

    Full Text Available Precisely classifying a protein sequence within a large database of biological protein sequences plays an important role in developing competitive pharmacological products. Conventional methods, which compare the unseen sequence with all the identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single-hidden-layer feedforward networks (SLFNs). The recent, efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimally pruned ELM is first employed for protein sequence classification in this paper. To further enhance performance, an ensemble-based SLFN structure is constructed, where multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensemble members. For each ensemble member, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms.
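
    A bare-bones sketch of a basic ELM for multi-class classification, assuming integer class labels 0..K-1: random input weights, a sigmoid hidden layer, and output weights obtained by least squares via the pseudo-inverse. The OP-ELM pruning and the majority-voting ensemble described above are not reproduced here.

      import numpy as np

      class ELM:
          """Single-hidden-layer feedforward network trained in the ELM fashion."""
          def __init__(self, n_hidden=200, seed=0):
              self.n_hidden = n_hidden
              self.rng = np.random.default_rng(seed)

          def _hidden(self, X):
              return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid activations

          def fit(self, X, y):
              n_classes = int(y.max()) + 1
              T = np.eye(n_classes)[y]                               # one-hot targets
              self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
              self.b = self.rng.standard_normal(self.n_hidden)
              self.beta = np.linalg.pinv(self._hidden(X)) @ T        # least-squares output weights
              return self

          def predict(self, X):
              return (self._hidden(X) @ self.beta).argmax(axis=1)

    An ensemble in the spirit of the record above would train several such networks with different seeds and take a majority vote over their predictions.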

  17. The Optimization of Trained and Untrained Image Classification Algorithms for Use on Large Spatial Datasets

    Science.gov (United States)

    Kocurek, Michael J.

    2005-01-01

    The HARVIST project seeks to automatically provide an accurate, interactive interface to predict crop yield over the entire United States. In order to accomplish this goal, large images must be quickly and automatically classified by crop type. Current trained and untrained classification algorithms, while accurate, are highly inefficient when operating on large datasets. This project sought to develop new variants of two standard trained and untrained classification algorithms that are optimized to take advantage of the spatial nature of image data. The first algorithm, harvist-cluster, utilizes divide-and-conquer techniques to precluster an image in the hopes of increasing overall clustering speed. The second algorithm, harvistSVM, utilizes support vector machines (SVMs), a type of trained classifier. It seeks to increase classification speed by applying a "meta-SVM" to a quick (but inaccurate) SVM to approximate a slower, yet more accurate, SVM. Speedups were achieved by tuning the algorithm to quickly identify when the quick SVM was incorrect, and then reclassifying low-confidence pixels as necessary. Comparing the classification speeds of both algorithms to known baselines showed a slight speedup for large values of k (the number of clusters) for harvist-cluster, and a significant speedup for harvistSVM. Future work aims to automate the parameter tuning process required for harvistSVM, and further improve classification accuracy and speed. Additionally, this research will move documents created in Canvas into ArcGIS. The launch of the Mars Reconnaissance Orbiter (MRO) will provide a wealth of image data such as global maps of Martian weather and high resolution global images of Mars. The ability to store this new data in a georeferenced format will support future Mars missions by providing data for landing site selection and the search for water on Mars.

  18. A Hybrid Algorithm for Classification of Compressed ECG

    Directory of Open Access Journals (Sweden)

    Shubhada S.Ardhapurkar

    2012-03-01

    Full Text Available Efficient compression reduces memory requirements in long-term recording and reduces power and time requirements in transmission. A new compression algorithm combining Linear Predictive Coding (LPC) and the Discrete Wavelet Transform (DWT) is proposed in this study. Our coding algorithm offers a compression ratio above 85% for records of the MIT-BIH compression database. The performance of the algorithm is quantified by computing distortion measures such as the percentage root mean square difference (PRD), the wavelet-based weighted PRD (WWPRD) and the wavelet energy based diagnostic distortion (WEDD). The PRD is found to be below 6%, and the values of WWPRD and WEDD are less than 0.03. Classification of the decompressed signals, by employing the fuzzy c-means method, is achieved with an accuracy of 97%.
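
    The distortion measure named above can be made concrete with a short sketch; PRD definitions appear in the literature both with and without mean removal, so the exact form used by the authors is an assumption here, and the helper names are hypothetical.

```python
import numpy as np

def prd(original, reconstructed, remove_mean=False):
    """Percentage root mean square difference between an ECG record and its
    decompressed reconstruction (with or without mean removal)."""
    x = np.asarray(original, dtype=float)
    r = np.asarray(reconstructed, dtype=float)
    ref = x - x.mean() if remove_mean else x
    return 100.0 * np.sqrt(np.sum((x - r) ** 2) / np.sum(ref ** 2))

def size_reduction(original_bits, compressed_bits):
    """Compression expressed as percentage size reduction (cf. 'above 85%')."""
    return 100.0 * (1.0 - compressed_bits / original_bits)
```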

  19. An Imbalanced Data Classification Algorithm of De-noising Auto-Encoder Neural Network Based on SMOTE

    Directory of Open Access Journals (Sweden)

    Zhang Chenggang

    2016-01-01

    Full Text Available Imbalanced data classification has always been one of the hot issues in the field of machine learning. The synthetic minority over-sampling technique (SMOTE) is a classical approach to balancing datasets, but it may introduce problems such as noise. A Stacked De-noising Auto-Encoder neural network (SDAE) can effectively reduce data redundancy and noise through unsupervised layer-wise greedy learning. Aiming at the shortcomings of the SMOTE algorithm when synthesizing new minority class samples, the paper proposes a Stacked De-noising Auto-Encoder neural network algorithm based on SMOTE, SMOTE-SDAE, aimed at imbalanced data classification. The proposed algorithm is not only able to synthesize new minority class samples, but it can also de-noise and classify the sampled data. Experimental results show that, compared with traditional algorithms, SMOTE-SDAE significantly improves the minority class classification accuracy on imbalanced datasets.
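
    For readers unfamiliar with the SMOTE step the record builds on, a minimal sketch of minority-sample synthesis follows; the neighbour count and random interpolation are the textbook SMOTE idea, not the paper's SMOTE-SDAE code.

```python
import numpy as np

def smote(X_minority, n_synthetic, k=5, seed=None):
    """Basic SMOTE: each synthetic sample is a random interpolation between a
    minority sample and one of its k nearest minority-class neighbours.
    Assumes the minority class has more than k samples."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)                  # exclude the sample itself
    neighbours = np.argsort(dist, axis=1)[:, :k]
    synthetic = np.empty((n_synthetic, X.shape[1]))
    for s in range(n_synthetic):
        i = rng.integers(len(X))
        j = neighbours[i, rng.integers(k)]
        synthetic[s] = X[i] + rng.random() * (X[j] - X[i])
    return synthetic
```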

  20. Target Image Classification through Encryption Algorithm Based on the Biological Features

    OpenAIRE

    Zhiwu Chen; Qing E. Wu; Weidong Yang

    2014-01-01

    In order to effectively perform biological image classification and identification, this paper studies the characteristics of biological features, gives an encryption algorithm, and presents a biological classification algorithm based on the encryption process. Through studying the composition characteristics of the palm, this paper uses the biological classification algorithm to carry out the classification or recognition of palms, improves the accuracy and efficiency of the existing biological classifica...

  1. Unsupervised classification of multivariate geostatistical data: Two algorithms

    Science.gov (United States)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in the mining and oil industries, spatial datasets are becoming increasingly large, include a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous sub-domains with respect to the values taken by the variables at hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model-free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinate space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
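
    The proximity-graph constraint described above can be illustrated with off-the-shelf tools: scikit-learn's agglomerative clustering accepts a connectivity graph, so building that graph in coordinate space while clustering on the variable values mimics the spatial-coherence idea. This is a sketch of the concept, not the authors' algorithm; the data sizes, neighbour count and Ward linkage are assumptions.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(500, 2))   # irregular sample locations (x, y)
values = rng.normal(size=(500, 3))            # multivariate measurements per location

# Proximity graph built in coordinate space: two clusters may only merge if
# they are connected along this graph, which enforces spatial coherence.
connectivity = kneighbors_graph(coords, n_neighbors=8, include_self=False)

labels = AgglomerativeClustering(
    n_clusters=4, connectivity=connectivity, linkage="ward"
).fit_predict(values)                         # clustering driven by the variables
```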

  2. Implementation of several mathematical algorithms to breast tissue density classification

    Science.gov (United States)

    Quintana, C.; Redondo, M.; Tirao, G.

    2014-02-01

    The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, where dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images according to the American College of Radiology classifications. These mathematical techniques are based on intrinsic property calculations and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and index Q) as categorization parameters. The algorithm evaluation was performed on 100 cases of the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with the expert medical diagnoses, showing good performance. The implemented algorithms revealed a high potential for classifying breasts into tissue density categories.
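
    Two of the categorization parameters mentioned, joint entropy and mutual information against a reference image, can be estimated from a joint grey-level histogram; the sketch below assumes equally sized images and a bin count chosen only for illustration.

```python
import numpy as np

def joint_entropy_and_mi(img_a, img_b, bins=64):
    """Joint entropy and mutual information of two equally sized images,
    estimated from their joint grey-level histogram."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = hist / hist.sum()
    p_a, p_b = p_ab.sum(axis=1), p_ab.sum(axis=0)
    nz = p_ab > 0
    h_ab = -np.sum(p_ab[nz] * np.log2(p_ab[nz]))
    h_a = -np.sum(p_a[p_a > 0] * np.log2(p_a[p_a > 0]))
    h_b = -np.sum(p_b[p_b > 0] * np.log2(p_b[p_b > 0]))
    return h_ab, h_a + h_b - h_ab     # joint entropy, mutual information
```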

  3. Classification of adaptive memetic algorithms: a comparative study.

    Science.gov (United States)

    Ong, Yew-Soon; Lim, Meng-Hiot; Zhu, Ning; Wong, Kok-Wai

    2006-02-01

    Adaptation of parameters and operators represents one of the most important and promising recent areas of research in evolutionary computation; it is a form of designing self-configuring algorithms that acclimatize to suit the problem at hand. Here, our interest is in a recent breed of hybrid evolutionary algorithms typically known as adaptive memetic algorithms (MAs). One unique feature of adaptive MAs is the choice of local search methods or memes, and recent studies have shown that this choice significantly affects the performance of problem searches. In this paper, we present a classification of meme adaptation in adaptive MAs on the basis of the mechanism used and the level of historical knowledge on the memes employed. Then the asymptotic convergence properties of the adaptive MAs considered are analyzed according to the classification. Subsequently, empirical studies on representatives of adaptive MAs for different type-level meme adaptations using continuous benchmark problems indicate that global-level adaptive MAs exhibit better search performance. Finally we conclude with some promising research directions in the area. PMID:16468573

  4. Automatic classification of schizophrenia using resting-state functional language network via an adaptive learning algorithm

    Science.gov (United States)

    Zhu, Maohu; Jie, Nanfeng; Jiang, Tianzi

    2014-03-01

    A reliable and precise classification of schizophrenia is significant for its diagnosis and treatment. Functional magnetic resonance imaging (fMRI) is a novel tool increasingly used in schizophrenia research. Recent advances in statistical learning theory have led to applying pattern classification algorithms to assess the diagnostic value of functional brain networks discovered from resting-state fMRI data. The aim of this study was to propose an adaptive learning algorithm to distinguish schizophrenia patients from normal controls using the resting-state functional language network. Furthermore, here the classification of schizophrenia was regarded as a sample selection problem where a sparse subset of samples was chosen from the labeled training set. Using these selected samples, which we call informative vectors, a classifier for the clinical diagnosis of schizophrenia was established. We experimentally demonstrated that the proposed algorithm incorporating the resting-state functional language network achieved 83.6% leave-one-out accuracy on resting-state fMRI data of 27 schizophrenia patients and 28 normal controls. In contrast with K-Nearest-Neighbor (KNN), Support Vector Machine (SVM) and l1-norm methods, our method yielded better classification performance. Moreover, our results suggested that a dysfunction of the resting-state functional language network plays an important role in the clinical diagnosis of schizophrenia.

  5. Review of WiMAX Scheduling Algorithms and Their Classification

    Science.gov (United States)

    Yadav, A. L.; Vyavahare, P. D.; Bansod, P. P.

    2014-07-01

    Providing quality of service (QoS) in wireless communication networks has become an important consideration for supporting a variety of applications. IEEE 802.16 based WiMAX is the most promising technology for broadband wireless access, with the best QoS features for triple play (voice, video and data) service users. Unlike wired networks, QoS support is difficult in wireless networks due to the variable and unpredictable nature of wireless channels. In the transmission of voice and video, the main issue involves allocation of the available resources among the users to meet QoS criteria such as delay, jitter and throughput requirements, to maximize goodput and to minimize power consumption, while keeping feasible algorithm flexibility and ensuring system scalability. WiMAX assures guaranteed QoS by including several mechanisms at the MAC layer, such as admission control and scheduling. Packet scheduling is the process of resolving contention for bandwidth, which determines the allocation of bandwidth among users and their transmission order. Various approaches for the classification of scheduling algorithms in WiMAX have appeared in the literature, such as homogeneous, hybrid and opportunistic scheduling algorithms. The paper consolidates the parameters and performance metrics that need to be considered in developing a scheduler. The paper surveys recently proposed scheduling algorithms and the shortcomings, assumptions, suitability and improvement issues associated with these uplink scheduling algorithms.

  6. A fast version of the k-means classification algorithm for astronomical applications

    CERN Document Server

    Ordovás-Pascual, I

    2014-01-01

    Context. K-means is a clustering algorithm that has been used to classify large datasets in astronomical databases. It is an unsupervised method, able to cope with very different types of problems. Aims. We check whether a variant of the algorithm called single-pass k-means can be used as a fast alternative to the traditional k-means. Methods. The execution times of the two algorithms are compared when classifying subsets drawn from the SDSS-DR7 catalog of galaxy spectra. Results. Single-pass k-means turns out to be between 20% and 40% faster than k-means and provides statistically equivalent classifications. This conclusion can be scaled up to other larger databases because the execution time of both algorithms increases linearly with the number of objects. Conclusions. Single-pass k-means can be safely used as a fast alternative to k-means.

  7. Comparison of Unsupervised Vegetation Classification Methods from Vhr Images after Shadows Removal by Innovative Algorithms

    Science.gov (United States)

    Movia, A.; Beinat, A.; Crosilla, F.

    2015-04-01

    The recognition of vegetation by the analysis of very high resolution (VHR) aerial images provides meaningful information about environmental features; nevertheless, VHR images frequently contain shadows that generate significant problems for the classification of the image components and for the extraction of the needed information. The aim of this research is to classify, from VHR aerial images, vegetation involved in the balance process of the environmental biochemical cycle, and to discriminate it with respect to urban and agricultural features. Three classification algorithms have been tested in order to better recognize vegetation, and compared to the NDVI index; unfortunately all these methods are affected by the presence of shadows on the images. The literature presents several algorithms to detect and remove shadows in the scene: most of them are based on RGB to HSI transformations. In this work some of them have been implemented and compared with one based on RGB bands. Subsequently, in order to remove shadows and restore brightness in the images, some innovative algorithms based on Procrustes theory have been implemented and applied. Among these, we evaluate the capability of the so-called "not-centered oblique Procrustes" and "anisotropic Procrustes" methods to efficiently restore brightness with respect to a linear correlation correction based on the Cholesky decomposition. Some experimental results obtained by different classification methods after shadow removal carried out with the innovative algorithms are presented and discussed.

  8. Improved Algorithms for the Classification of Rough Rice Using a Bionic Electronic Nose Based on PCA and the Wilks Distribution

    Directory of Open Access Journals (Sweden)

    Sai Xu

    2014-03-01

    Full Text Available Principal Component Analysis (PCA) is one of the main methods used for electronic nose pattern recognition. However, poor classification performance is common in classification and recognition when using regular PCA. This paper aims to improve the classification performance of regular PCA based on the existing Wilks Λ-statistic (i.e., combining PCA with the Wilks distribution). The improved algorithms, which combine regular PCA with the Wilks Λ-statistic, were developed after analysing the functionality and defects of PCA. Verification tests were conducted using a PEN3 electronic nose. The collected samples consisted of the volatiles of six varieties of rough rice (Zhongxiang1, Xiangwan13, Yaopingxiang, WufengyouT025, Pin 36, and Youyou122), grown in the same area and season. The first two principal components used as analysis vectors cannot perform the rough rice varieties classification task based on regular PCA. Using the improved algorithms, which combine regular PCA with the Wilks Λ-statistic, many different principal components were selected as analysis vectors. The set of data points of the Mahalanobis distance between each of the varieties of rough rice was selected to estimate the performance of the classification. The result illustrates that the rough rice varieties classification task is achieved well using the improved algorithm. A Probabilistic Neural Network (PNN) was also established to test the effectiveness of the improved algorithms. The first two principal components (namely PC1 and PC2) and the first and fifth principal components (namely PC1 and PC5) were selected as the inputs of the PNN for the classification of the six rough rice varieties. The results indicate that the classification accuracy based on the improved algorithm was improved by 6.67% compared to the results of the regular method. These results prove the effectiveness of using the Wilks Λ-statistic to improve the classification accuracy of the regular PCA approach.
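
    The idea of scoring principal components with a Wilks Λ-like statistic can be sketched as a one-way within/total sum-of-squares ratio per component; the exact statistic and selection rule used by the authors may differ, so treat this as an illustrative assumption rather than their procedure.

```python
import numpy as np

def wilks_lambda_per_component(scores, labels):
    """One-way Wilks lambda (within-group / total sum of squares) for each
    principal-component score; smaller values indicate components that
    separate the classes better."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    total_ss = ((scores - scores.mean(axis=0)) ** 2).sum(axis=0)
    within_ss = np.zeros(scores.shape[1])
    for c in np.unique(labels):
        grp = scores[labels == c]
        within_ss += ((grp - grp.mean(axis=0)) ** 2).sum(axis=0)
    return within_ss / total_ss

# e.g. keep the two most discriminative components as analysis vectors:
# lam = wilks_lambda_per_component(pca_scores, variety_labels)
# best_two = np.argsort(lam)[:2]
```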

  9. A novel hybrid classification model of genetic algorithms, modified k-Nearest Neighbor and developed backpropagation neural network.

    Directory of Open Access Journals (Sweden)

    Nader Salari

    Full Text Available Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that
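
    The Fisher's discriminant ratio ranking used in the first stage can be written in a few lines; the sketch below assumes a two-class problem with labels 0 and 1 and covers only the ranking step, not the full GA/kNN/backpropagation pipeline.

```python
import numpy as np

def fisher_ratio(X, y):
    """Two-class Fisher discriminant ratio per feature:
    (m0 - m1)^2 / (s0^2 + s1^2); larger values rank higher."""
    X = np.asarray(X, dtype=float)
    c0, c1 = X[y == 0], X[y == 1]
    num = (c0.mean(axis=0) - c1.mean(axis=0)) ** 2
    den = c0.var(axis=0) + c1.var(axis=0) + 1e-12   # guard against zero variance
    return num / den

# ranking = np.argsort(fisher_ratio(X, y))[::-1]    # top-ranked features seed the GA
```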

  10. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    Science.gov (United States)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  11. CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH

    Directory of Open Access Journals (Sweden)

    V. A. Ayma

    2015-03-01

    Full Text Available For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  12. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for dealing with incomplete information systems because of the missing values they contain. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by the neighborhood hypergraph. Third, under the guidance of rough sets, we replace the poor hyperedges. After that, we can obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, Naive Bayes, and KNN. The experimental results show that the proposed algorithm has better performance in terms of Precision, Recall, AUC, and F-measure.

  13. Land-cover classification with an expert classification algorithm using digital aerial photographs

    Directory of Open Access Journals (Sweden)

    José L. de la Cruz

    2010-05-01

    Full Text Available The purpose of this study was to evaluate the usefulness of the spectral information of digital aerial sensors in determining land-cover classification using new digital techniques. The land covers that have been evaluated are the following: (1) bare soil, (2) cereals, including maize (Zea mays L.), oats (Avena sativa L.), rye (Secale cereale L.), wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), (3) high-protein crops, such as peas (Pisum sativum L.) and beans (Vicia faba L.), (4) alfalfa (Medicago sativa L.), (5) woodlands and scrublands, including holly oak (Quercus ilex L.) and common retama (Retama sphaerocarpa L.), (6) urban soil, (7) olive groves (Olea europaea L.) and (8) burnt crop stubble. The best result was obtained using an expert classification algorithm, achieving a reliability rate of 95%. This result showed that the images of digital airborne sensors hold considerable promise for the future in the field of digital classification because these images contain valuable information that takes advantage of the geometric viewpoint. Moreover, new classification techniques reduce the problems encountered when using high-resolution images, while achieving reliabilities better than those achieved with traditional methods.

  14. A New Approach Using Data Envelopment Analysis for Ranking Classification Algorithms

    OpenAIRE

    A. Bazleh; P. Gholami; Soleymani, F.

    2011-01-01

    Problem statement: A variety of methods and algorithms for classification problems have been developed recently. But the main question is how to select an appropriate and effective classification algorithm. This has always been an important and difficult issue. Approach: Since the classification algorithm selection task needs to examine more than one criterion, such as accuracy and computational time, it can be modeled and also ranked by the Data Envelopment Analysis (DEA) technique. Results:...

  15. Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

    Directory of Open Access Journals (Sweden)

    Ghazi Raho

    2015-02-01

    Full Text Available Feature selection is necessary for effective text classification. Dataset preprocessing is essential for sound results and effective performance. This paper investigates the effectiveness of using feature selection. In this paper we compare the performance of different classifiers in different situations, using feature selection with stemming and without stemming. Evaluation used a BBC Arabic dataset; different classification algorithms such as decision tree (DT), K-nearest neighbors (KNN), the Naïve Bayesian (NB) method and the Naïve Bayes Multinomial (NBM) classifier were used. The experimental results are presented in terms of precision, recall, F-measures, accuracy and time to build the model.

  16. Swarm Negative Selection Algorithm for Electroencephalogram Signals Classification

    Directory of Open Access Journals (Sweden)

    Nasser O.S.B. Karait

    2009-01-01

    Full Text Available Problem statement: The process of epilepsy diagnosis from EEG signals by a human scorer is a very time-consuming and costly task, considering the large number of epileptic patients admitted to hospitals and the large amount of data that needs to be scored. Therefore, there is a strong need to automate this process. Such automated systems must rely on robust and effective algorithms for detection and prediction. Approach: The proposed detection system for epileptic seizures in EEG signals is based on the Discrete Wavelet Transform (DWT) and the Swarm Negative Selection (SNS) algorithm. DWT is used to analyze EEG signals at different frequency bands, and statistics over the set of wavelet coefficients are calculated to form the feature vector for the SNS classifier. The SNS classification model uses negative selection and PSO algorithms to form a set of memory Artificial Lymphocytes (ALCs) that have the ability to distinguish between normal and epileptic EEG patterns. Thus, adapted negative selection is employed to create a set of self-tolerant ALCs, whereas PSO is used to evolve these ALCs away from self patterns towards non-self space and to maintain diversity and generality among the ALCs. Results: The experimental results proved that the proposed method shows very promising performance in classifying EEG signals. A comparison with many previous studies showed that the presented algorithm has better results, outperforming those reported by earlier methods. Conclusion: The technique proved to be robust and effective in detecting and localizing epileptic seizures in EEG recordings. Hence, the proposed system can be very helpful for making faster and more accurate diagnosis decisions.
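
    The feature-extraction step, DWT sub-bands summarised by statistics, can be sketched with PyWavelets; the wavelet family, decomposition level and the particular statistics (mean, spread, peak, energy) are assumptions for illustration rather than the authors' exact choices.

```python
import numpy as np
import pywt

def eeg_features(signal, wavelet="db4", level=4):
    """Decompose one EEG epoch into approximation/detail sub-bands with the
    discrete wavelet transform and summarise each band with simple statistics;
    the concatenated statistics form the feature vector for the classifier."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)   # [cA4, cD4, cD3, cD2, cD1]
    feats = []
    for band in coeffs:
        feats.extend([band.mean(), band.std(),
                      np.abs(band).max(), (band ** 2).sum()])
    return np.array(feats)
```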

  17. FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification

    Directory of Open Access Journals (Sweden)

    Wei-Hao Lee

    2012-05-01

    Full Text Available This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit to reduce the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented on a Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high-speed performance and low area costs.
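
    The Generalized Hebbian Algorithm that the hardware implements has a compact software form (Sanger's rule); the sketch below shows the weight update the architecture computes, with the learning rate, data and dimensions chosen only for illustration.

```python
import numpy as np

def gha_step(W, x, lr=0.005):
    """One Generalized Hebbian Algorithm (Sanger's rule) update:
    dW = lr * (y x^T - LT[y y^T] W), with LT the lower-triangular part.
    Rows of W converge towards the leading principal directions."""
    y = W @ x
    W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W

# toy usage on zero-mean data; sizes are illustrative
rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 8))
data -= data.mean(axis=0)
W = rng.normal(scale=0.1, size=(3, 8))   # extract 3 principal directions
for epoch in range(20):
    for x in data:
        W = gha_step(W, x)
```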

  18. An Evolutionary Algorithm for Enhanced Magnetic Resonance Imaging Classification

    Directory of Open Access Journals (Sweden)

    T.S. Murunya

    2014-11-01

    Full Text Available This study presents an image classification method for the retrieval of images from a multi-varied MRI database. With the development of sophisticated medical imaging technology to help doctors in diagnosis, medical image databases contain a huge number of digital images. Magnetic Resonance Imaging (MRI) is a widely used imaging technique which picks up signals from the body's magnetic particles spinning to a magnetic tune and, through a computer, converts the scanned data into pictures of internal organs. Image processing techniques are required to analyze medical images and retrieve them from databases. The proposed framework extracts features using Moment Invariants (MI) and the Wavelet Packet Tree (WPT). Extracted features are reduced using Correlation-based Feature Selection (CFS), and a CFS with a cuckoo search algorithm is proposed. Naïve Bayes and K-Nearest Neighbor (KNN) classifiers classify the selected features. The National Biomedical Imaging Archive (NBIA) dataset, including colon, brain and chest images, is used to evaluate the framework.

  19. Improving the cAnt-MinerPB Classification Algorithm

    OpenAIRE

    Medland, Matthew; Otero, Fernando E. B.; Freitas, Alex A

    2012-01-01

    Ant Colony Optimisation (ACO) has been successfully applied to the classification task of data mining in the form of Ant-Miner. A new extension of Ant-Miner, called cAnt-MinerPB, uses the ACO procedure in a different fashion. The main difference is that the search in cAnt-MinerPB is optimised to find the best list of rules, whereas in Ant-Miner the search is optimised to find the best individual rule at each step of the sequential covering, producing a list of best rules. We aim to improve cA...

  20. Predicting disease risk using bootstrap ranking and classification algorithms.

    Directory of Open Access Journals (Sweden)

    Ohad Manor

    Full Text Available Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could either be used as a "black box" in order to promote changes in lifestyle and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs) by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping in order to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC) data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minor allele frequency (MAF), suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment.
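
    The bootstrapping-of-rankings idea can be sketched as follows; the association score (a simple squared correlation) and the rank-aggregation rule are illustrative stand-ins, not the published BootRank procedure.

```python
import numpy as np

def bootstrap_rank(X, y, n_boot=50, seed=None):
    """Aggregate feature rankings over bootstrap resamples of the individuals:
    features that rank well across most resamples get a small average rank."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    rank_sum = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                    # bootstrap sample
        # simple association score: squared correlation with the phenotype
        scores = np.corrcoef(X[idx].T, y[idx])[-1, :-1] ** 2
        rank_sum += np.argsort(np.argsort(-scores))         # rank 0 = best
    return np.argsort(rank_sum)                             # robust feature ordering
```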

  1. CLASSIFICATION OF DEFECTS IN SOFTWARE USING DECISION TREE ALGORITHM

    Directory of Open Access Journals (Sweden)

    M. SURENDRA NAIDU

    2013-06-01

    Full Text Available Software defects due to coding errors continue to plague the industry with disastrous impact, especially in the enterprise application software category. Identifying how many of these defects are specifically due to coding errors is a challenging problem. Defect prevention is the most vivid but usually neglected aspect of software quality assurance in any project. If practised at all stages of software development, it can reduce the time, overheads and resources required to engineer a high-quality product. In order to reduce the time and cost, we focus on finding the total number of defects that have occurred in the software development process when test cases show that the software is not executing properly. The proposed system classifies various defects using a decision-tree-based defect classification technique, which is used to group the defects after identification. The classification can be done by employing algorithms such as ID3 or C4.5. After the classification, the defect patterns are measured by employing a pattern mining technique. Finally, the quality is assured by using various quality metrics, such as defect density. The proposed system will be implemented in JAVA.

  2. Multi-classification algorithm and its realization based on least square support vector machine algorithm

    Institute of Scientific and Technical Information of China (English)

    Fan Youping; Chen Yunping; Sun Wansheng; Li Yu

    2005-01-01

    As a new type of learning machine developed on the basis of statistical learning theory, the support vector machine (SVM) plays an important role in knowledge discovery and knowledge updating by constructing a non-linear optimal classifier. However, realizing an SVM requires solving a quadratic programming problem under inequality constraints, which becomes computationally difficult as the number of training samples grows. Besides, the standard SVM is incapable of tackling multi-classification. To overcome these bottlenecks, a training algorithm is presented in which the quadratic programming problem is converted into that of solving a linear system of equations with equality constraints, by adopting the least squares SVM (LS-SVM) and introducing a modifying variable that changes the inequality constraints into equality constraints, which simplifies the calculation. With regard to multi-classification, an LS-SVM applicable to multi-classification is deduced. Finally, the efficiency of the algorithm is checked by using the universal Circle-in-Square and Two-Spirals problems to measure the performance of the classifier.
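
    The conversion from a QP to a linear system that the abstract describes is the core of the standard LS-SVM classifier; a compact numpy sketch of that system and its solution follows, with an RBF kernel and hyper-parameters assumed for illustration (the multi-class extension of the record is not reproduced here).

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Binary LS-SVM classifier (labels in {-1, +1}): instead of a QP, solve
        [0    y^T          ] [b    ]   [0]
        [y    Om + I/gamma ] [alpha] = [1]
    with Om_ij = y_i y_j K(x_i, x_j)."""
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]            # bias b and multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_test, sigma=1.0):
    return np.sign(rbf_kernel(X_test, X_train, sigma) @ (alpha * y_train) + b)
```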

  3. A Novel Training Algorithm of Genetic Neural Networks and Its Application to Classification

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    First of all, this paper discusses the drawbacks of the multilayer perceptron (MLP), which is trained by the traditional back propagation (BP) algorithm and used in a specific classification problem. A new training algorithm for neural networks based on the genetic algorithm and the BP algorithm is developed. The difference between the new training algorithm and the BP algorithm in nonlinear approximation ability is demonstrated through an example, and the application prospects are illustrated by an example.

  4. Contributions to "k"-Means Clustering and Regression via Classification Algorithms

    Science.gov (United States)

    Salman, Raied

    2012-01-01

    The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold; first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using…

  5. Study and Implementation of Web Mining Classification Algorithm Based on Building Tree of Detection Class Threshold

    Institute of Scientific and Technical Information of China (English)

    CHEN Jun-jie; SONG Han-tao; LU Yu-chang

    2005-01-01

    A new classification algorithm for web mining is proposed on the basis of a general classification algorithm for data mining, in order to implement personalized information services. The tree-building method of detecting class thresholds is used for the construction of the decision tree according to the concept of user expectation, so as to find classification rules in different layers. Compared with the traditional C4.5 algorithm, the disadvantage of excessive adaptation in C4.5 has been improved, so that the classification results not only have much higher accuracy but also statistical meaning.

  6. Integrating genetic algorithm method with neural network for land use classification using SZ-3 CMODIS data

    Institute of Scientific and Technical Information of China (English)

    WANG Changyao; LUO Chengfeng; LIU Zhengjun

    2005-01-01

    This paper presents a methodology for land use mapping using CMODIS (Chinese Moderate Resolution Imaging Spectroradiometer) data on-board the SZ-3 (Shenzhou 3) spacecraft. The integrated method is composed of a genetic algorithm (GA) for feature extraction and a neural network classifier for land use classification. In the data preprocessing, a moment matching method was adopted. To generate a land use map, the three-layer back-propagation neural network classifier is used for training the samples and classification. Compared with the Maximum Likelihood classification algorithm, the results show that the accuracy of land use classification is obviously improved by using our proposed method, the number of bands selected in the classification process is reduced, and the computational performance for training and classification is improved. The result also shows that the CMODIS data can be effectively used for land use/land cover classification and change monitoring at regional and global scales.

  7. Analysis of Classification and Clustering Algorithms using Weka For Banking Data

    OpenAIRE

    G.Roch Libia Rani; K. Vanitha

    2010-01-01

    In this paper, we investigate the performance of different classification and clustering algorithms using the Weka software. The J48, Naive Bayes and Simple CART classification algorithms are evaluated based on accuracy, time efficiency and error rates. The K-means, DBScan and EM clustering algorithms are evaluated based on the accuracy of clustering. We run these algorithms on large and small data sets to evaluate how well they work.

  8. Text Classification using Association Rule with a Hybrid Concept of Naive Bayes Classifier and Genetic Algorithm

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the automated assignment of natural language texts to predefined categories based on their content. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way, such as producing summaries, answering questions or extracting data. Nowadays the demand for text classification is increasing tremendously. Keeping this demand in consideration, new and updated techniques are being developed for the purpose of automated text classification. This paper presents a new algorithm for text classification. Instead of using words, word relations, i.e. association rules, are used to derive the feature set from pre-classified text documents. The concept of the Naive Bayes Classifier is then used on the derived features, and finally a Genetic Algorithm has been added for the final classification. A system based on the proposed algorithm has been implemented and tested. The experimental ...

  9. A Simple Algorithm for Maximum Margin Classification, Revisited

    OpenAIRE

    Har-Peled, Sariel

    2015-01-01

    In this note, we revisit the algorithm of Har-Peled et al. [HRZ07] for computing a linear maximum margin classifier. Our presentation is self-contained, and the algorithm itself is slightly simpler than the original algorithm. The algorithm itself is a simple Perceptron-like iterative algorithm. For more details and background, the reader is referred to the original paper.

  10. Unsupervised classification algorithm based on EM method for polarimetric SAR images

    Science.gov (United States)

    Fernández-Michelli, J. I.; Hurtado, M.; Areta, J. A.; Muravchik, C. H.

    2016-07-01

    In this work we develop an iterative classification algorithm using complex Gaussian mixture models for polarimetric complex SAR data. It is an unsupervised algorithm which does not require training data or an initial set of classes. Additionally, it determines the model order from the data, which allows representing the data structure with minimum complexity. The algorithm consists of four steps: initialization, model selection, refinement and smoothing. After a simple initialization stage, the EM algorithm is iteratively applied in the model selection step to compute the model order and an initial classification for the refinement step. The refinement step uses Classification EM (CEM) to reach the final classification, and the smoothing stage improves the results by means of non-linear filtering. The algorithm is applied to both simulated and real Single Look Complex data of the EMISAR mission and compared with the Wishart classification method. We use the confusion matrix and kappa statistic to make the comparison for simulated data whose ground truth is known. We apply the Davies-Bouldin index to compare both classifications for real data. The results obtained for both types of data validate our algorithm and show that its performance is comparable to Wishart's in terms of classification quality.
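
    The model-order-selection-plus-EM workflow can be illustrated with scikit-learn's Gaussian mixtures and the BIC criterion; note that the paper works with complex PolSAR statistics, so this real-valued sketch only mirrors the workflow, not the authors' model or their refinement and smoothing steps.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_classify(features, max_classes=8, random_state=0):
    """Select the model order by BIC, then label each pixel with its most
    likely mixture component (EM-based unsupervised classification on
    real-valued features)."""
    best_model, best_bic = None, np.inf
    for k in range(2, max_classes + 1):
        gm = GaussianMixture(n_components=k, covariance_type="full",
                             random_state=random_state).fit(features)
        bic = gm.bic(features)
        if bic < best_bic:
            best_model, best_bic = gm, bic
    return best_model.predict(features), best_model
```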

  11. A Novel Algorithm of Network Trade Customer Classification Based on Fourier Basis Functions

    Directory of Open Access Journals (Sweden)

    Li Xinwu

    2013-11-01

    Full Text Available The learning algorithm of a neural network has always been an important research topic in neural network theory and applications; the learning algorithm for feed-forward neural networks in particular has no satisfactory solution because of its defects in calculation speed. This paper presents a new Fourier basis function neural network algorithm and applies it to classify network trade customers. First, 21 customer classification indicators are designed based on an analysis of the characteristics and behaviors of network trade customers, including customer characteristic variables and customer behavior variables. Second, Fourier basis functions are used to improve the calculation flow and algorithm structure of the original BP neural network algorithm to speed up its convergence, and a new Fourier basis neural network model is constructed. Finally, the experimental results show that the problem of convergence speed can be solved, and the accuracy of the customer classification is ensured when the new algorithm is used in network trade customer classification in practice.

  12. Study on An Absolute Non-Collision Hash and Jumping Table IP Classification Algorithms

    Institute of Scientific and Technical Information of China (English)

    SHANG Feng-jun; PAN Ying-jun

    2004-01-01

    In order to classify packets, we propose a novel IP classification algorithm based on the non-collision hash and jumping table Trie-tree (NHJTTT), which builds on the non-collision hash Trie-tree and the 2-dimensional classification algorithm proposed by Lakshman and Stiliadis (LS algorithm). The core of the algorithm consists of two parts: constructing the non-collision hash function, which is built mainly on the destination/source port and protocol type fields so that the hash function can avoid the space explosion problem; and introducing a jumping table Trie-tree based on the LS algorithm in order to reduce the time complexity. The test results show that the classification rate of the NHJTTT algorithm is up to 1 million packets per second and the maximum memory consumed is 9 MB for 10 000 rules.

  13. A Non-Collision Hash Trie-Tree Based FastIP Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    徐恪; 吴建平; 喻中超; 徐明伟

    2002-01-01

    With the development of network applications, routers must support such functions as firewalls, provision of QoS, traffic billing, etc. All these functions need the classification of IP packets, which determines how the packets are processed subsequently. In this article, a novel IP classification algorithm is proposed based on the Grid of Tries algorithm. The new algorithm not only eliminates the original limitations in the case of multiple fields but also shows better performance in regard to both time and space. It has better overall performance than many other algorithms.

  14. Classification of hyperspectral remote sensing images based on simulated annealing genetic algorithm and multiple instance learning

    Institute of Scientific and Technical Information of China (English)

    高红民; 周惠; 徐立中; 石爱业

    2014-01-01

    A hybrid feature selection and classification strategy was proposed based on the simulated annealing genetic algorithm and multiple instance learning (MIL). A band selection method based on subspace decomposition was proposed, which combines the simulated annealing algorithm with the genetic algorithm by choosing different cross-over and mutation probabilities, as well as mutation individuals. Then MIL was combined with image segmentation, clustering and support vector machine algorithms to classify hyperspectral images. The experimental results show that the proposed method can achieve a high classification accuracy of 93.13% with small training samples, and the weaknesses of the conventional methods are overcome.

  15. Data Mining Algorithms for Classification of Complex Biomedical Data

    Science.gov (United States)

    Lan, Liang

    2012-01-01

    In my dissertation, I will present my research which contributes to solve the following three open problems from biomedical informatics: (1) Multi-task approaches for microarray classification; (2) Multi-label classification of gene and protein prediction from multi-source biological data; (3) Spatial scan for movement data. In microarray…

  16. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    OpenAIRE

    Huibin Lu; Zhengping Hu; Hongxiao Gao

    2015-01-01

    In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the...

  17. Development of a Fingerprint Gender Classification Algorithm Using Fingerprint Global Features

    OpenAIRE

    S. F. Abdullah; A.F.N.A. Rahman; Z.A.Abas; W.H.M Saad

    2016-01-01

    In the forensic world, the process of identifying and calculating fingerprint features is complex and takes time when it is done manually using a fingerprint laboratory magnifying glass. This study is meant to enhance the manual forensic method by proposing a new algorithm for fingerprint global feature extraction for gender classification. The result shows that the new algorithm gives higher acceptable readings, with a classification rate above 70%, when compared to the manual method...

  18. An Analytic Hierarchy Model for Classification Algorithms Selection in Credit Risk Analysis

    OpenAIRE

    Gang Kou; Wenshuai Wu

    2014-01-01

    This paper proposes an analytic hierarchy model (AHM) to evaluate classification algorithms for credit risk analysis. The proposed AHM consists of three stages: data mining stage, multicriteria decision making stage, and secondary mining stage. For verification, 2 public-domain credit datasets, 10 classification algorithms, and 10 performance criteria are used to test the proposed AHM in the experimental study. The results demonstrate that the proposed AHM is an efficient tool to select class...

  19. A method for classification of network traffic based on C5.0 Machine Learning Algorithm

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Riaz, M. Tahir; Pedersen, Jens Myrup

    2012-01-01

    and classification, an algorithm for recognizing flow direction and the C5.0 itself. Classified applications include Skype, FTP, torrent, web browser traffic, web radio, interactive gaming and SSH. We performed subsequent tries using different sets of parameters and both training and classification options...

  20. An Imbalanced Data Classification Algorithm of De-noising Auto-Encoder Neural Network Based on SMOTE

    OpenAIRE

    Zhang Chenggang; Song Jiazhi; Pei Zhili; Jiang Jingqing

    2016-01-01

    Imbalanced data classification has always been one of the hot issues in the field of machine learning. The synthetic minority over-sampling technique (SMOTE) is a classical approach to balancing datasets, but it may give rise to problems such as noise. A Stacked De-noising Auto-Encoder neural network (SDAE) can effectively reduce data redundancy and noise through unsupervised layer-wise greedy learning. Aiming at the shortcomings of the SMOTE algorithm when synthesizing new minority class samples...

  1. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms prove effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to microarray gene expression profiles in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. PMID:25880524

  2. Polarimetric synthetic aperture radar image classification using fuzzy logic in the H/α-Wishart algorithm

    Science.gov (United States)

    Zhu, Teng; Yu, Jie; Li, Xiaojuan; Yang, Jie

    2015-01-01

    To solve the problem that the H/α-Wishart unsupervised classification algorithm can generate only inflexible clusters due to arbitrarily fixed zone boundaries in the clustering process, a refined fuzzy-logic-based classification scheme called the H/α-Wishart fuzzy clustering algorithm is proposed in this paper. A fuzzy membership function was developed for the degree to which pixels belong to each class, instead of an arbitrary boundary. To devise a unified fuzzy function, a normalized Wishart distance is proposed for the clustering step in the new algorithm. Then the degree of membership is computed to implement fuzzy clustering. After an iterative procedure, the algorithm yields a classification result. The new classification scheme is applied to two L-band polarimetric synthetic aperture radar (PolSAR) images and an X-band high-resolution PolSAR image of a field in LingShui, Hainan Province, China. Experimental results show that the classification precision of the refined algorithm is greater than that of the H/α-Wishart algorithm and that the refined algorithm performs well in differentiating shadows and water areas.
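
    The fuzzy membership function that replaces the hard zone boundaries can be sketched in the usual fuzzy c-means form; the fuzzifier value and the way the class distances are supplied are assumptions here, and the normalized Wishart distance itself is not reproduced.

```python
import numpy as np

def fuzzy_memberships(distances, m=2.0):
    """FCM-style membership degrees from an (n_pixels, n_classes) array of
    class distances: u_ic = 1 / sum_j (d_ic / d_ij)^(2/(m-1)); each row sums
    to one, replacing the hard zone boundaries of the crisp algorithm."""
    d = np.asarray(distances, dtype=float) + 1e-12     # avoid division by zero
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)
```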

  3. Algorithms for the Automatic Classification and Sorting of Conifers in the Garden Nursery Industry

    DEFF Research Database (Denmark)

    Petri, Stig

    throughout development to keep this bias to a minimum. The specific goals with regard to classification performance was determined in cooperation with Peter Schjøtt of the Danish Garden Nursery Owner Association, and set to an average error rate of less than 2% for all categories of defects, and a goal of a...... classification models, and evaluating classification performance. A total of six feature extraction algorithms are reported in this work. These include algorithms that record the image data directly, describe the border of the plant object, describe the color characteristics of the plant, or attempts to extract...

  4. Differential characteristic set algorithm for the complete symmetry classification of partial differential equations

    Institute of Scientific and Technical Information of China (English)

    Chaolu Temuer; Yu-shan BAI

    2009-01-01

    In this paper, we present a differential polynomial characteristic set algorithm for the complete symmetry classification of partial differential equations (PDEs) with some parameters. It can make the solution to the complete symmetry classification problem for PDEs direct and systematic. As an illustrative example, the complete potential symmetry classifications of nonlinear and linear wave equations with an arbitrary function parameter are presented. This is a new application of the differential form characteristic set algorithm, i.e., Wu's method, in differential equations.

  5. IMPROVEMENT OF TCAM-BASED PACKET CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Xu Zhen; Zhang Jun; Rui Liyang; Sun Jun

    2008-01-01

    The feature of Ternary Content Addressable Memories (TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However, the limitations of TCAMs impede their utilization. In this paper, the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.

  6. Analysis of Distributed and Adaptive Genetic Algorithm for Mining Interesting Classification Rules

    Institute of Scientific and Technical Information of China (English)

    YI Yunfei; LIN Fang; QIN Jun

    2008-01-01

    A distributed genetic algorithm can be combined with an adaptive genetic algorithm for mining interesting and comprehensible classification rules. The paper gives the encoding method for the rules and the fitness function; the selection, crossover, mutation and migration operators for the DAGA are designed at the same time.

  7. A NEW UNSUPERVISED CLASSIFICATION ALGORITHM FOR POLARIMETRIC SAR IMAGES BASED ON FUZZY SET THEORY

    Institute of Scientific and Technical Information of China (English)

    Fu Yusheng; Xie Yan; Pi Yiming; Hou Yinming

    2006-01-01

    In this letter, a new method is proposed for unsupervised classification of terrain types and man-made objects using POLarimetric Synthetic Aperture Radar (POLSAR) data. This technique is a combination of the usage of polarimetric information of SAR images and the unsupervised classification method based on fuzzy set theory. Image quantization and image enhancement are used to preprocess the POLSAR data. Then the polarimetric information and Fuzzy C-Means (FCM) clustering algorithm are used to classify the preprocessed images. The advantages of this algorithm are the automated classification, its high classification accuracy, fast convergence and high stability. The effectiveness of this algorithm is demonstrated by experiments using SIR-C/X-SAR (Spaceborne Imaging Radar-C/X-band Synthetic Aperture Radar) data.

  8. Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data

    Directory of Open Access Journals (Sweden)

    Viswanath Satish

    2012-02-01

    Full Text Available Abstract Background Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of noise in the data. In this paper, we present a novel DR technique known as consensus embedding that aims to overcome these problems by generating and combining multiple low-dimensional embeddings, hence exploiting the variance among them in a manner similar to ensemble classifier schemes such as Bagging. We demonstrate theoretical properties of consensus embedding which show that it will result in a single stable embedding solution that preserves information more accurately as compared to any individual embedding (generated via DR schemes such as Principal Component Analysis, Graph Embedding, or Locally Linear Embedding). Intelligent sub-sampling (via mean-shift) and code parallelization are utilized to provide for an efficient implementation of the scheme. Results Applications of consensus embedding are shown in the context of classification and clustering as applied to: (1) image partitioning of white matter and gray matter on 10 different synthetic brain MRI images corrupted with 18 different combinations of noise and bias field inhomogeneity, (2) classification of 4 high-dimensional gene-expression datasets, (3) cancer detection (at a pixel-level) on 16 image slices obtained from 2 different high-resolution prostate MRI datasets. In over 200 different experiments concerning classification and segmentation of biomedical data, consensus embedding was found to consistently outperform both linear and non-linear DR methods within all applications considered. Conclusions We have presented a novel framework termed consensus embedding which leverages ensemble classification theory within dimensionality reduction, allowing for application to a wide range

  9. [MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique].

    Science.gov (United States)

    Chen, Zhiru; Hong, Wenxue

    2016-02-01

    Considering the low prediction accuracy on positive samples and the poor overall classification caused by the unbalanced sample data of MicroRNA (miRNA) targets, we propose in this paper a support vector machine-integration of under-sampling and weight (SVM-IUSM) algorithm, an under-sampling method based on ensemble learning. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates abnormal negative samples with a robust sample-weight smoothing mechanism so as to avoid over-learning. Finally, the prediction of the miRNA target integrated classifier is achieved by combining multiple weak classifiers through a voting mechanism. The experiments revealed that SVM-IUSM, compared with other algorithms on unbalanced dataset collections, could not only improve the accuracy on positive targets and the overall classification effect, but also enhance the generalization ability of the miRNA target classifier. PMID:27382743
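
    The exact SVM-IUSM weighting scheme is not reproduced here, but the following hedged sketch (using scikit-learn, with hypothetical X and y arrays where the rare positive class is labeled 1) illustrates the two ideas the abstract combines: clustering-based under-sampling of the majority class and voting over an ensemble of SVMs.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def cluster_undersample(X_neg, n_keep, seed=0):
    """Cluster the majority class and keep one representative sample per cluster."""
    km = KMeans(n_clusters=n_keep, n_init=10, random_state=seed).fit(X_neg)
    reps = []
    for k in range(n_keep):
        members = np.where(km.labels_ == k)[0]
        d = np.linalg.norm(X_neg[members] - km.cluster_centers_[k], axis=1)
        reps.append(members[np.argmin(d)])
    return X_neg[reps]

def fit_balanced_svm_ensemble(X, y, n_models=5, seed=0):
    """Train several SVMs, each on all positives plus an undersampled negative set.

    Assumes the positive (minority) class is labeled 1 and is smaller than the
    negative class; different KMeans seeds give diversity among ensemble members.
    """
    X_pos, X_neg = X[y == 1], X[y == 0]
    models = []
    for i in range(n_models):
        X_neg_s = cluster_undersample(X_neg, n_keep=len(X_pos), seed=seed + i)
        Xb = np.vstack([X_pos, X_neg_s])
        yb = np.hstack([np.ones(len(X_pos)), np.zeros(len(X_neg_s))])
        models.append(SVC(kernel="rbf", gamma="scale").fit(Xb, yb))
    return models

def predict_vote(models, X):
    """Simple majority vote over the ensemble members."""
    votes = np.mean([m.predict(X) for m in models], axis=0)
    return (votes >= 0.5).astype(int)
```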

  10. A Critical Study of Selected Classification Algorithms for Liver Disease Diagnosis

    Directory of Open Access Journals (Sweden)

    Bendi Venkata Ramana

    2011-03-01

    Full Text Available The number of patients with liver disease has been continuously increasing because of excessive consumption of alcohol, inhalation of harmful gases, and intake of contaminated food, pickles and drugs. Automatic classification tools may reduce the burden on doctors. This paper evaluates selected classification algorithms for the classification of some liver patient datasets. The classification algorithms considered here are the Naïve Bayes classifier, C4.5, the back-propagation neural network algorithm, and support vector machines. These algorithms are evaluated based on four criteria: Accuracy, Precision, Sensitivity and Specificity.
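
    For readers unfamiliar with the four criteria, a minimal sketch of how they are computed from a binary confusion matrix (assuming the liver-patient class is coded as 1) is given below; it is a generic definition, not code from the paper.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall) and specificity for a binary task."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))   # liver patients correctly flagged
    tn = np.sum((y_true == 0) & (y_pred == 0))   # healthy subjects correctly cleared
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "precision":   tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Toy usage with hypothetical labels.
print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```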

  11. Review of Image Classification Techniques Based on LDA, PCA and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Mukul Yadav

    2014-02-01

    Full Text Available Image classification plays an important role in security surveillance, given the huge image databases encountered today. Rapid changes in the feature content of images are a major issue in classification. Image classification has been improved by various authors using different classifier models, and the efficiency of a classifier model depends on the feature extraction process applied to the traffic images. For feature extraction, various authors have used different techniques such as Gabor feature extraction, histograms, and many other methods. We apply FLDA-GA to improve the classification rate of content-based image classification. The improved method uses the genetic algorithm as a heuristic: GA serves as a feature optimizer for the FLDA classifier. Standard FLDA suffers from core and outlier problems. The two-sided kernel technique improves the classification process of the support vector machine. FLDA performs better classification in comparison with other binary and multi-class classifiers. A directed acyclic graph applies a graph-partition technique for mapping the feature data; mapping the feature space correctly automatically improves the voting process of classification.

  12. Performance Analysis of Gender Clustering and Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Dr.K.Meena

    2012-03-01

    Full Text Available In speech processing, gender clustering and classification play a major role. In both gender clustering and classification, selecting the feature is an important process, and the most often utilized feature for gender clustering and classification in speech processing is pitch. The pitch value of male speech normally differs considerably from that of female speech. However, in some cases the frequency of a male voice is almost equal to that of a female voice, or vice versa, and in such situations it is difficult to identify the gender correctly. Considering this drawback, three features, namely energy entropy, zero crossing rate and short-time energy, are used here for identifying the gender. Gender clustering and classification of the speech signal are estimated using the aforementioned three features. Gender clustering is computed using the Euclidean distance, Mahalanobis distance, Manhattan distance and Bhattacharyya distance methods, and gender classification is computed using combined fuzzy logic and neural network, neuro-fuzzy, and support vector machine methods, and their performance is analyzed.
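
    As a generic illustration (not the authors' code), the sketch below computes the four distances named in the abstract; the feature vectors and covariances are hypothetical stand-ins for the energy entropy, zero crossing rate and short-time energy features, and the Bhattacharyya distance is given in its Gaussian form.

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    return float(np.abs(a - b).sum())

def mahalanobis(a, b, cov):
    d = a - b
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between Gaussians fitted to the two gender groups."""
    cov = (cov1 + cov2) / 2.0
    d = mu1 - mu2
    term1 = 0.125 * d @ np.linalg.inv(cov) @ d
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return float(term1 + term2)

# Toy 3-feature vectors (energy entropy, zero crossing rate, short-time energy).
male = np.array([0.42, 0.11, 0.65])
female = np.array([0.55, 0.19, 0.48])
cov = np.diag([0.02, 0.01, 0.03])   # hypothetical shared covariance
print(euclidean(male, female), manhattan(male, female), mahalanobis(male, female, cov))
print(bhattacharyya_gaussian(male, cov, female, cov))
```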

  13. Application of ant colony algorithm in plant leaves classification based on infrared spectroscopy

    Science.gov (United States)

    Guo, Tiantai; Hong, Bo; Kong, Ming; Zhao, Jun

    2014-04-01

    This paper proposes to use an ant colony algorithm in the analysis of spectral data of plant leaves to achieve the best classification of different plants within a short time. Intelligent classification is realized according to the different components of featured information included in the near-infrared spectrum data of plants. The near-infrared diffuse emission spectrum curves of the leaves of Cinnamomum camphora and Acer saccharum Marsh are acquired, with 75 leaves per species, divided into two groups. The acquired data are then processed using the ant colony algorithm, and leaves of the same kind are grouped into one class by ant colony clustering, so the two groups of data are classified into two classes. Experimental results show that the algorithm can distinguish the different species with 100% accuracy. The classification of plant leaves has important application value in agricultural development, research on species invasion, floriculture, etc.

  14. Research of Plant-Leaves Classification Algorithm Based on Supervised LLE

    Directory of Open Access Journals (Sweden)

    Yan Qing

    2013-06-01

    Full Text Available A new supervised LLE method based on the Fisher projection is proposed in this paper and combined with a new classification algorithm based on manifold learning to realize the recognition of plant leaves. Firstly, the method utilizes the Fisher projection distance to replace the sample's geodesic distance, yielding a new supervised LLE algorithm. Then, a classification algorithm which uses the manifold reconstruction error to determine the sample's class directly is adopted. This algorithm can make better use of category information and effectively improve the recognition rate, while also having the advantage of easy parameter estimation. Experimental results on real-world plant leaf databases show an average recognition accuracy of up to 95.17%.

  15. A MapReduce based distributed SVM algorithm for binary classification

    OpenAIRE

    Çatak, Ferhat Özgür; Balaban, Mehmet Erdal

    2013-01-01

    Although the Support Vector Machine (SVM) algorithm has a high generalization ability to classify unseen examples after the training phase and a small loss value, the algorithm is not suitable for real-life classification and regression problems, since SVMs cannot handle training datasets with hundreds of thousands of examples. In previous studies on distributed machine learning algorithms, SVMs were trained over costly and preconfigured computer environments. In this research, we present a MapReduce base...

  16. Packet Classification by Multilevel Cutting of the Classification Space: An Algorithmic-Architectural Solution for IP Packet Classification in Next Generation Networks

    Directory of Open Access Journals (Sweden)

    Motasem Aldiab

    2008-01-01

    Full Text Available Traditionally, the Internet provides only a “best-effort” service, treating all packets going to the same destination equally. However, providing differentiated services for different users based on their quality requirements is increasingly becoming a demanding issue. For this, routers need to have the capability to distinguish and isolate traffic belonging to different flows. This ability to determine the flow each packet belongs to is called packet classification. Technology vendors are reluctant to support algorithmic solutions for classification due to their nondeterministic performance. Although content addressable memories (CAMs) are favoured by technology vendors due to their deterministic high lookup rates, they suffer from the problems of high power consumption and high silicon cost. This paper provides a new algorithmic-architectural solution for packet classification that mixes CAMs with algorithms based on multilevel cutting of the classification space into smaller spaces. The provided solution utilizes the geometrical distribution of rules in the classification space. It provides the deterministic performance of CAMs, support for dynamic updates, and added flexibility for system designers.

  17. Woven fabric defects detection based on texture classification algorithm

    International Nuclear Information System (INIS)

    In this paper we compare two well-known texture classification methods for the problem of recognizing and classifying defects occurring in textile manufacture: the local binary patterns (LBP) method and the co-occurrence matrix. The classifier used is the support vector machine (SVM). The system has been tested using the TILDA database. The results obtained are interesting and show that LBP is a good method for defect recognition and classification; it also gives a good running time, which is especially important for real-time applications.
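
    A minimal sketch of the LBP-plus-SVM branch of such a system is shown below, using scikit-image and scikit-learn; the LBP parameters and label coding are assumptions, and the TILDA-specific preprocessing is omitted.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(image, P=8, R=1.0):
    """Uniform LBP histogram of a grayscale fabric patch (2-D array)."""
    lbp = local_binary_pattern(image, P, R, method="uniform")
    n_bins = P + 2  # P+1 uniform patterns plus one bin for non-uniform codes
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def train_defect_classifier(patches, labels):
    """patches: list of 2-D grayscale arrays; labels: 1 = defect, 0 = defect-free."""
    X = np.array([lbp_histogram(p) for p in patches])
    return SVC(kernel="rbf", gamma="scale").fit(X, labels)

# Usage: clf = train_defect_classifier(train_patches, train_labels)
#        pred = clf.predict([lbp_histogram(test_patch)])
```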

  18. Video Analytics Algorithm for Automatic Vehicle Classification (Intelligent Transport System)

    OpenAIRE

    Arta Iftikhar; Ali Javed

    2013-01-01

    Automated vehicle detection and classification is an important component of intelligent transport systems. Due to its significant importance in various fields such as traffic accident avoidance, toll collection, congestion avoidance, terrorist activity monitoring, and security and surveillance systems, the intelligent transport system has become an important field of study. Various technologies have been used for detecting and classifying vehicles automatically. Automated vehicle detection is broadly divi...

  19. Analysis and Evaluation of IKONOS Image Fusion Algorithm Based on Land Cover Classification

    Institute of Scientific and Technical Information of China (English)

    Xia JING; Yan BAO

    2015-01-01

    Different fusion algorithms have their own advantages and limitations, so it is very difficult to simply judge the good and bad points of a fusion algorithm. Whether an algorithm is selected to fuse the object images also depends upon the sensor types and the specific research purpose. Firstly, five fusion methods, i.e. IHS, Brovey, PCA, SFIM and Gram-Schmidt, are briefly described in the paper. Then visual judgment and quantitative statistical parameters are used to assess the five algorithms. Finally, in order to determine which one is the most suitable fusion method for land cover classification of IKONOS imagery, maximum likelihood classification (MLC) was applied to the above five fusion images. The results showed that the fusion effect of the SFIM and Gram-Schmidt transforms was better than that of the other three image fusion methods in spatial detail improvement and spectral information fidelity, and the Gram-Schmidt technique was superior to the SFIM transform in expressing image details. The classification accuracy of the images fused using the Gram-Schmidt and SFIM algorithms was higher than that of the other three image fusion methods, and the overall accuracy was greater than 98%. The IHS-fused image classification accuracy was the lowest, with an overall accuracy and kappa coefficient of 83.14% and 0.76, respectively. Thus the IKONOS fusion images obtained by Gram-Schmidt and SFIM were better for improving land cover classification accuracy.

  20. New enumeration algorithm for protein structure comparison and classification

    OpenAIRE

    2013-01-01

    Background Protein structure comparison and classification is an effective method for exploring protein structure-function relations. This problem is computationally challenging. Many different computational approaches for protein structure comparison apply the secondary structure elements (SSEs) representation of protein structures. Results We study the complexity of the protein structure comparison problem based on a mixed-graph model with respect to different computational frameworks. We d...

  1. A Comparative Study of Classification Algorithms for Spam Email Data Analysis

    Directory of Open Access Journals (Sweden)

    Aman Sharma,

    2011-05-01

    Full Text Available In recent years email has become one of the fastest and most economical means of communication. However, the increase of email users has resulted in a dramatic increase of spam emails during the past few years. Data mining classification algorithms are used to categorize email as spam or non-spam. In this paper, we conducted experiments in the WEKA environment using four algorithms, namely ID3, J48, Simple CART and Alternating Decision Tree, on the spam email dataset, and the four algorithms were then compared in terms of classification accuracy. According to our simulation results, the J48 classifier outperforms ID3, CART and ADTree in terms of classification accuracy.
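
    The study itself is run in WEKA; as a rough, hedged equivalent of the evaluation workflow only, the sketch below compares two scikit-learn decision trees (entropy- and Gini-based, loose stand-ins for J48 and CART; scikit-learn does not ship ID3 or ADTree) with 10-fold cross-validation on the public OpenML "spambase" set, which is used here purely as a stand-in corpus.

```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Public OpenML spam dataset, used only as a stand-in for the paper's corpus.
X, y = fetch_openml("spambase", version=1, return_X_y=True, as_frame=False)

candidates = {
    "entropy tree (J48-like)": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "gini tree (CART-like)":   DecisionTreeClassifier(criterion="gini", random_state=0),
}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=10)   # 10-fold cross-validated accuracy
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```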

  2. A hierarchical classification ant colony algorithm for predicting gene ontology terms

    OpenAIRE

    Otero, Fernando E. B.; Freitas, Alex. A.; Johnson, Colin G.

    2009-01-01

    This paper proposes a novel Ant Colony Optimisation algorithm for the hierarchical problem of predicting protein functions using the Gene Ontology (GO). The GO structure represents a challenging case of hierarchical classification, since its terms are organised in a directed acyclic graph fashion where a term can have more than one parent, in contrast to only one parent in tree structures. The proposed method discovers an ordered list of classification rules which is able to predict all GO terms...

  3. A Supervised Classification Algorithm for Note Onset Detection

    Directory of Open Access Journals (Sweden)

    Douglas Eck

    2007-01-01

    Full Text Available This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
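
    The classifier itself is a neural network trained on spectrogram frames; the simpler peak-picking stage can be sketched as below (an assumption-laden illustration, with the window length and threshold chosen arbitrarily rather than taken from the paper).

```python
import numpy as np

def pick_onsets(onset_prob, window=7, delta=0.1):
    """Peak-pick a per-frame onset probability curve (the classifier's output).

    A frame is reported as an onset if its probability is a local maximum and
    exceeds the moving average over `window` frames by at least `delta`.
    """
    onset_prob = np.asarray(onset_prob, dtype=float)
    kernel = np.ones(window) / window
    moving_avg = np.convolve(onset_prob, kernel, mode="same")
    onsets = []
    for t in range(1, len(onset_prob) - 1):
        is_peak = onset_prob[t] >= onset_prob[t - 1] and onset_prob[t] >= onset_prob[t + 1]
        if is_peak and onset_prob[t] > moving_avg[t] + delta:
            onsets.append(t)
    return onsets

# Toy usage: frames 3 and 9 stand out against the local average.
print(pick_onsets([0.1, 0.1, 0.2, 0.9, 0.2, 0.1, 0.1, 0.1, 0.3, 0.8, 0.2]))
```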

  4. A novel sparse coding algorithm for classification of tumors based on gene expression data.

    Science.gov (United States)

    Kolali Khormuji, Morteza; Bazrafkan, Mehrnoosh

    2016-06-01

    High-dimensional genomic and proteomic data play an important role in many applications in medicine such as prognosis of diseases, diagnosis, prevention and molecular biology, to name a few. Classifying such data is a challenging task due to the various issues such as curse of dimensionality, noise and redundancy. Recently, some researchers have used the sparse representation (SR) techniques to analyze high-dimensional biological data in various applications in classification of cancer patients based on gene expression datasets. A common problem with all SR-based biological data classification methods is that they cannot utilize the topological (geometrical) structure of data. More precisely, these methods transfer the data into sparse feature space without preserving the local structure of data points. In this paper, we proposed a novel SR-based cancer classification algorithm based on gene expression data that takes into account the geometrical information of all data. Precisely speaking, we incorporate the local linear embedding algorithm into the sparse coding framework, by which we can preserve the geometrical structure of all data. For performance comparison, we applied our algorithm on six tumor gene expression datasets, by which we demonstrate that the proposed method achieves higher classification accuracy than state-of-the-art SR-based tumor classification algorithms. PMID:26337064

  5. Pap-smear Classification Using Efficient Second Order Neural Network Training Algorithms

    DEFF Research Database (Denmark)

    Ampazis, Nikolaos; Dounias, George; Jantzen, Jan

    In this paper we make use of two highly efficient second order neural network training algorithms, namely the LMAM (Levenberg-Marquardt with Adaptive Momentum) and OLMAM (Optimized Levenberg-Marquardt with Adaptive Momentum), for the construction of an efficient pap-smear test classifier. The...... problem. The classification results obtained from the application of the algorithms on a standard benchmark pap-smear data set reveal the power of the two methods to obtain excellent solutions in difficult classification problems whereas other standard computational intelligence techniques achieve...

  6. Pap-smear Classification Using Efficient Second Order Neural Network Training Algorithms

    DEFF Research Database (Denmark)

    Ampazis, Nikolaos; Dounias, George; Jantzen, Jan

    2004-01-01

    In this paper we make use of two highly efficient second order neural network training algorithms, namely the LMAM (Levenberg-Marquardt with Adaptive Momentum) and OLMAM (Optimized Levenberg-Marquardt with Adaptive Momentum), for the construction of an efficient pap-smear test classifier. The...... problem. The classification results obtained from the application of the algorithms on a standard benchmark pap-smear data set reveal the power of the two methods to obtain excellent solutions in difficult classification problems whereas other standard computational intelligence techniques achieve...

  7. EVALUATION OF SOUND CLASSIFICATION USING MODIFIED CLASSIFIER AND SPEECH ENHANCEMENT USING ICA ALGORITHM FOR HEARING AID APPLICATION

    OpenAIRE

    N. Shanmugapriya; E. Chandra

    2016-01-01

    Hearing aid users are exposed to diverse vocal scenarios, so sound classification algorithms become a vital factor in delivering a good listening experience. In this work, an approach is proposed to improve speech quality in hearing aids based on the Independent Component Analysis (ICA) algorithm with modified speech signal classification methods. The proposed algorithm gives better results on speech intelligibility than other existing algorithms, and this result has been proved ...

  8. Research of information classification and strategy intelligence extract algorithm based on military strategy hall

    Science.gov (United States)

    Chen, Lei; Li, Dehua; Yang, Jie

    2007-12-01

    Constructing a virtual international strategy environment requires many kinds of information, such as economic, political, military, diplomatic, cultural and scientific information. It is therefore very important to build a highly efficient system for automatic information extraction, classification, recombination and analysis as the foundation and a component of the military strategy hall. This paper first uses an improved Boost algorithm to classify the obtained initial information, and then uses a strategy intelligence extraction algorithm to extract strategy intelligence from the initial information, helping strategists to analyze it.

  9. A Constructive Data Classification Version of the Particle Swarm Optimization Algorithm

    OpenAIRE

    Alexandre Szabo; Leandro Nunes de Castro

    2013-01-01

    The particle swarm optimization algorithm was originally introduced to solve continuous parameter optimization problems. It was soon modified to solve other types of optimization tasks and also to be applied to data analysis. In the latter case, however, there are few works in the literature that deal with the problem of dynamically building the architecture of the system. This paper introduces new particle swarm algorithms specifically designed to solve classification problems. The first pro...

  10. A Comparison of Two Open Source LiDAR Surface Classification Algorithms

    OpenAIRE

    Danny G Marks; Nancy F. Glenn; Timothy E. Link; Hudak, Andrew T.; Rupesh Shrestha; Michael J. Falkowski; Alistair M. S. Smith; Hongyu Huang; Wade T. Tinkham

    2011-01-01

    With the progression of LiDAR (Light Detection and Ranging) towards a mainstream resource management tool, it has become necessary to understand how best to process and analyze the data. While most ground surface identification algorithms remain proprietary and have high purchase costs; a few are openly available, free to use, and are supported by published results. Two of the latter are the multiscale curvature classification and the Boise Center Aerospace Laboratory LiDAR (BCAL) algorithms....

  11. Automated detection and classification of cryptographic algorithms in binary programs through machine learning

    OpenAIRE

    Hosfelt, Diane Duros

    2015-01-01

    Threats from the internet, particularly malicious software (i.e., malware) often use cryptographic algorithms to disguise their actions and even to take control of a victim's system (as in the case of ransomware). Malware and other threats proliferate too quickly for the time-consuming traditional methods of binary analysis to be effective. By automating detection and classification of cryptographic algorithms, we can speed program analysis and more efficiently combat malware. This thesis wil...

  12. Scene/object classification using multispectral data fusion algorithms

    Science.gov (United States)

    Kuzma, Thomas J.; Lazofson, Laurence E.; Choe, Howard C.; Chovan, John D.

    1994-06-01

    Near-simultaneous, multispectral, coregistered imagery of ground target and background signatures were collected over a full diurnal cycle in visible, infrared, and ultraviolet spectrally filtered wavebands using Battelle's portable sensor suite. The imagery data were processed using classical statistical algorithms, artificial neural networks and data clustering techniques to classify objects in the imaged scenes. Imagery collected at different times throughout the day were employed to verify algorithm robustness with respect to temporal variations of spectral signatures. In addition, several multispectral sensor fusion medical imaging applications were explored including imaging of subcutaneous vasculature, retinal angiography, and endoscopic cholecystectomy. Work is also being performed to advance the state of the art using differential absorption lidar as an active remote sensing technique for spectrally detecting, identifying, and tracking hazardous emissions. These investigations support a wide variety of multispectral signature discrimination applications including the concepts of automated target search, landing zone detection, enhanced medical imaging, and chemical/biological agent tracking.

  13. Proposed Algorithm for Network Traffic Classification Based On DB Scan.

    OpenAIRE

    Shevali Agarwal; Anurag Punde; Shubhi Kesharwani

    2013-01-01

    The trend of using the internet is increasing rapidly. Everyone wants to share their information urgently but in a secure manner. Security issues pose giant problems within organizations. Many researchers have given their solutions for intrusion detection systems, but they are unable to distinguish labeled and unlabeled data. Due to this, the efficiency of the IDS decreases and it generates false alarms. In this paper we have given an algorithm called DBSCAN that...

  14. Web spam classification using supervised artificial neural network algorithms

    OpenAIRE

    Chandra, Ashish; Suaib, Mohammad; Beg, Dr. Rizwan

    2015-01-01

    Due to the rapid growth in technology employed by the spammers, there is a need of classifiers that are more efficient, generic and highly adaptive. Neural Network based technologies have high ability of adaption as well as generalization. As per our knowledge, very little work has been done in this field using neural network. We present this paper to fill this gap. This paper evaluates performance of three supervised learning algorithms of artificial neural network by creating classifiers fo...

  15. Model classification rate control algorithm for video coding

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    A model classification rate control method for video coding is proposed. The macroblocks are classified according to their prediction errors, and different parameters are used in the rate-quantization and distortion-quantization models. The different model parameters are calculated from the previous frame of the same type during coding. These models are used to estimate the relations among rate, distortion and quantization of the current frame. Further steps, such as R-D optimization based quantization adjustment and smoothing of the quantization of adjacent macroblocks, are used to improve the quality. Experimental results prove that the technique is effective and can be realized easily. The method presented in the paper is well suited to MPEG and H.264 rate control.

  16. Data classification using metaheuristic Cuckoo Search technique for Levenberg Marquardt back propagation (CSLM) algorithm

    Science.gov (United States)

    Nawi, Nazri Mohd.; Khan, Abdullah; Rehman, M. Z.

    2015-05-01

    Nature-inspired metaheuristic techniques provide derivative-free solutions to complex problems. One of the latest additions to this group of optimization procedures is the Cuckoo Search (CS) algorithm. Artificial Neural Network (ANN) training is an optimization task, since the aim is to find an optimal weight set for the network during training. Traditional training algorithms have limitations such as getting trapped in local minima and slow convergence. This study proposes a new technique, CSLM, which combines Cuckoo Search with the best features of two known algorithms, back-propagation (BP) and the Levenberg-Marquardt (LM) algorithm, to improve the convergence speed of ANN training and avoid the local minima problem. Selected benchmark classification datasets are used for simulation. The experimental results show that the proposed Cuckoo Search with Levenberg-Marquardt algorithm performs better than the other algorithms used in this study.

  17. Improved Fault Classification in Series Compensated Transmission Line: Comparative Evaluation of Chebyshev Neural Network Training Algorithms.

    Science.gov (United States)

    Vyas, Bhargav Y; Das, Biswarup; Maheshwari, Rudra Prakash

    2016-08-01

    This paper presents the Chebyshev neural network (ChNN) as an improved artificial intelligence technique for power system protection studies and examines the performances of two ChNN learning algorithms for fault classification of series compensated transmission line. The training algorithms are least-square Levenberg-Marquardt (LSLM) and recursive least-square algorithm with forgetting factor (RLSFF). The performances of these algorithms are assessed based on their generalization capability in relating the fault current parameters with an event of fault in the transmission line. The proposed algorithm is fast in response as it utilizes postfault samples of three phase currents measured at the relaying end corresponding to half-cycle duration only. After being trained with only a small part of the generated fault data, the algorithms have been tested over a large number of fault cases with wide variation of system and fault parameters. Based on the studies carried out in this paper, it has been found that although the RLSFF algorithm is faster for training the ChNN in the fault classification application for series compensated transmission lines, the LSLM algorithm has the best accuracy in testing. The results prove that the proposed ChNN-based method is accurate, fast, easy to design, and immune to the level of compensations. Thus, it is suitable for digital relaying applications. PMID:25314714

  18. Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm

    Directory of Open Access Journals (Sweden)

    E. Parvinnia

    2014-01-01

    Full Text Available Electroencephalogram (EEG) signals are often used to diagnose diseases such as seizures, Alzheimer's disease, and schizophrenia. One main problem with the recorded EEG samples is that they are not equally reliable, due to artifacts at the time of recording. EEG signal classification algorithms should have a mechanism to handle this issue, and adaptive classifiers seem useful for biological signals such as EEG. In this paper, a general adaptive method named weighted distance nearest neighbor (WDNN) is applied to EEG signal classification to tackle this problem. This classification algorithm assigns a weight to each training sample to control its influence in classifying test samples; the weights of the training samples are used to find the nearest neighbor of an input query pattern. To assess the performance of this scheme, EEG signals of thirteen schizophrenic patients and eighteen normal subjects are analyzed for the classification of these two groups. Several features, including fractal dimension, band power and autoregressive (AR) model coefficients, are extracted from the EEG signals. The classification results are evaluated using leave-one-subject-out cross validation for reliable estimation. The results indicate that the combination of WDNN and the selected features can significantly outperform the basic nearest-neighbor classifier and the other methods proposed in the past for the classification of these two groups. Therefore, this method can be a complementary tool for specialists to distinguish schizophrenia disorder.
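
    A minimal sketch of the prediction rule only (not the weight-learning procedure, which is the core of WDNN and is omitted here) might look as follows; the convention that a larger weight pushes a training sample away from queries is an assumption made purely for illustration.

```python
import numpy as np

def wdnn_predict(X_train, y_train, weights, X_test):
    """Classify each query by the training sample with the smallest weighted distance.

    The weighted distance here is weights[i] * ||x - X_train[i]||, so a large
    weight reduces a training sample's influence; the actual WDNN learns these
    weights from the training data, which this sketch does not reproduce.
    """
    preds = []
    for x in X_test:
        d = weights * np.linalg.norm(X_train - x, axis=1)
        preds.append(y_train[np.argmin(d)])
    return np.array(preds)

# Toy usage with two EEG-like feature vectors per class.
X_train = np.array([[0.1, 0.2], [0.15, 0.25], [0.8, 0.9], [0.85, 0.95]])
y_train = np.array([0, 0, 1, 1])
weights = np.ones(len(X_train))   # uniform weights reduce WDNN to plain 1-NN
print(wdnn_predict(X_train, y_train, weights, np.array([[0.2, 0.3]])))
```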

  19. Proposed Algorithm for Network Traffic Classification Based On DB Scan.

    Directory of Open Access Journals (Sweden)

    Shevali Agarwal

    2013-11-01

    Full Text Available The trend of using the internet is increasing rapidly. Everyone wants to share their information urgently but in a secure manner. Security issues pose giant problems within organizations. Many researchers have given their solutions for intrusion detection systems, but they are unable to distinguish labeled and unlabeled data. Due to this, the efficiency of the IDS decreases and it generates false alarms. In this paper we have given an algorithm called DBSCAN that increases the efficiency of the IDS system.
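
    A hedged sketch of how DBSCAN can be applied to flow records with scikit-learn is given below; the flow features and the eps/min_samples values are hypothetical, and treating noise points as candidates for IDS alerts is only a suggestion, not a detail taken from the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Hypothetical per-flow features: [duration (s), bytes, packets, mean inter-arrival (s)].
flows = np.array([
    [0.20,   1200,  10, 0.02],
    [0.30,   1500,  12, 0.03],
    [0.25,   1300,  11, 0.02],
    [9.00, 900000, 800, 0.01],   # an unusual flow
])

X = StandardScaler().fit_transform(flows)
labels = DBSCAN(eps=0.8, min_samples=2).fit_predict(X)
# Label -1 marks noise points, i.e. flows that belong to no dense cluster and
# could be flagged for further inspection by the IDS.
print(labels)
```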

  20. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    Directory of Open Access Journals (Sweden)

    Huibin Lu

    2015-01-01

    Full Text Available In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the source domain to the target domain; the second step is to construct L1-Graph on the basis of sparse representation, so as to search for the nearest neighbor data with self-adaptation and preserve the samples and the geometric structure; lastly, we integrate two complementary objective functions into the unified optimization issue and use the iterative algorithm to cope with it, and then the estimation of the testing sample classification is completed. Comparative experiments are conducted in USPS-Binary digital database, Three-Domain Object Benchmark database, and ALOI database; the experimental results verify the effectiveness of the proposed algorithm, which improves the recognition accuracy and ensures the robustness of algorithm.

  1. A Comparative Study of Classification and Regression Algorithms for Modelling Students' Academic Performance

    Science.gov (United States)

    Strecht, Pedro; Cruz, Luís; Soares, Carlos; Mendes-Moreira, João; Abreu, Rui

    2015-01-01

    Predicting the success or failure of a student in a course or program is a problem that has recently been addressed using data mining techniques. In this paper we evaluate some of the most popular classification and regression algorithms on this problem. We address two problems: prediction of approval/failure and prediction of grade. The former is…

  2. Classification of EEG signals using a greedy algorithm for constructing a committee of weak classifiers

    International Nuclear Information System (INIS)

    A greedy algorithm has been proposed for the construction of a committee of weak EEG classifiers, which work in the simplest one-dimensional feature spaces. It has been shown that the accuracy of classification by the committee is several times higher than the accuracy of the best weak classifier

  3. A Novel Algorithm for Imbalance Data Classification Based on Neighborhood Hypergraph

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2014-01-01

    Full Text Available The classification problem for imbalanced data has received increasing attention. Many significant methods have been proposed and applied in many fields, but more efficient methods are still needed. A hypergraph is an efficient tool for knowledge discovery, but it may not be powerful enough to deal with data in the boundary region. In this paper, the neighborhood hypergraph is presented, combining rough set theory and hypergraphs. A novel classification algorithm for imbalanced data based on the neighborhood hypergraph is then developed, composed of three steps: initialization of hyperedges, classification of the training data set, and substitution of hyperedges. In a 10-fold cross-validation experiment on 18 data sets, the proposed algorithm achieved higher average accuracy than the others.

  4. Synthesis of supervised classification algorithm using intelligent and statistical tools

    Directory of Open Access Journals (Sweden)

    Ali Douik

    2009-09-01

    Full Text Available A fundamental task in detecting foreground objects in both static and dynamic scenes is to make the best choice of color-space representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football match. Per-pixel segmentation concerns many applications, and the method proved robust in detecting objects even in the presence of strong shadows and highlights. On the other hand, to refine their playing strategy in sports such as football, handball, volleyball and rugby, coaches need as much technical-tactical information as possible about the on-going game and the players. We propose in this paper a range of algorithms allowing the resolution of many problems appearing in the automated process of team identification, where each player is assigned to his corresponding team relying on visual data. The developed system was tested on a match of the Tunisian national competition. This work is relevant for many subsequent computer vision studies, as detailed in this study.

  5. Synthesis of supervised classification algorithm using intelligent and statistical tools

    CERN Document Server

    Douik, Ali

    2009-01-01

    A fundamental task in detecting foreground objects in both static and dynamic scenes is to make the best choice of color-space representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football match. Per-pixel segmentation concerns many applications, and the method proved robust in detecting objects even in the presence of strong shadows and highlights. On the other hand, to refine their playing strategy in sports such as football, handball, volleyball, rugby..., coaches need as much technical-tactical information as possible about the on-going game and the players. We propose in this paper a range of algorithms allowing the resolution of many problems appearing in the automated process of team identification, where each player is assigned to his corresponding team relying on visual data. The developed system was tested on a match of the Tunisian national c...

  6. Land use mapping from CBERS-2 images with open source tools by applying different classification algorithms

    Science.gov (United States)

    Sanhouse-García, Antonio J.; Rangel-Peraza, Jesús Gabriel; Bustos-Terrones, Yaneth; García-Ferrer, Alfonso; Mesas-Carrascosa, Francisco J.

    2016-02-01

    Land cover classification is often based on clear differences between classes but great homogeneity within each of them. This cover information is obtained through field work or by means of processing satellite images. Field work involves high costs; therefore, digital image processing techniques have become an important alternative for performing this task. However, in some developing countries, and particularly in Casacoima municipality in Venezuela, there is a lack of geographic information systems due to a lack of updated information and the high cost of software licenses. This research proposes a low-cost methodology to develop thematic mapping of local land use and cover types in areas with scarce resources. Thematic mapping was developed from CBERS-2 images and spatial information available on the network using open source tools. Supervised per-pixel and per-region classification methods were applied using different classification algorithms, which were compared among themselves. Per-pixel classification was based on the Maxver (maximum likelihood) and Euclidean distance (minimum distance) algorithms, while per-region classification was based on the Bhattacharya algorithm. Satisfactory results were obtained from the per-region classification, with an overall reliability of 83.93% and a kappa index of 0.81. The Maxver algorithm showed a reliability value of 73.36% and a kappa index of 0.69, while Euclidean distance obtained values of 67.17% and 0.61 for reliability and kappa index, respectively. It was demonstrated that the proposed methodology is very useful for cartographic processing and updating, which in turn serves as support for developing management plans and land management. Hence, open source tools proved to be an economically viable alternative not only for forestry organizations but also for the general public, allowing them to develop projects in economically depressed and/or environmentally threatened areas.
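
    For reference, the two per-pixel rules named above (minimum distance, and Maxver, i.e. Gaussian maximum likelihood) can be sketched in a few lines of NumPy; this is a generic textbook formulation with equal class priors assumed, not the code of the open source tools used in the study.

```python
import numpy as np

def train_stats(X, y):
    """Per-class mean and covariance estimated from training pixels (bands as columns)."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c], rowvar=False))
            for c in np.unique(y)}

def minimum_distance(X, stats):
    """Euclidean minimum-distance rule: assign each pixel to the nearest class mean."""
    classes = list(stats)
    d = np.stack([np.linalg.norm(X - stats[c][0], axis=1) for c in classes], axis=1)
    return np.array(classes)[np.argmin(d, axis=1)]

def maximum_likelihood(X, stats):
    """Gaussian maximum-likelihood (Maxver-style) rule with equal priors assumed."""
    classes, scores = list(stats), []
    for c in classes:
        mu, cov = stats[c]
        inv, logdet = np.linalg.inv(cov), np.linalg.slogdet(cov)[1]
        diff = X - mu
        # Log-likelihood up to a constant: -0.5 * (Mahalanobis^2 + log|cov|)
        scores.append(-0.5 * (np.einsum("ij,jk,ik->i", diff, inv, diff) + logdet))
    return np.array(classes)[np.argmax(np.stack(scores, axis=1), axis=1)]
```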

  7. Image analysis, classification, and change detection in remote sensing with algorithms for ENVI/IDL

    CERN Document Server

    Canty, Morton J

    2011-01-01

    Demonstrating the breadth and depth of growth in the field since the publication of the popular first edition, Image Analysis, Classification and Change Detection in Remote Sensing, with Algorithms for ENVI/IDL, Second Edition has been updated and expanded to keep pace with the latest versions of the ENVI software environment. Effectively interweaving theory, algorithms, and computer codes, the text supplies an accessible introduction to the techniques used in the processing of remotely sensed imagery. This significantly expanded edition presents numerous image analysis examples and algorithms

  8. Hybrid neural network and statistical classification algorithms in computer-assisted diagnosis

    Science.gov (United States)

    Stotzka, Rainer

    2000-06-01

    The development of computer-assisted diagnosis systems for image patterns is still in its early stages compared to the powerful image and object recognition capabilities of the human eye and visual cortex. Rules have to be defined and features have to be found manually in digital images to arrive at an automatic classification. The extraction of discriminating features is, especially in medical applications, a very time-consuming process, and the quality of the defined features directly influences classification success. Artificial neural networks are in principle able to solve complex recognition and classification tasks, but their computational expense restricts their use to small images. A new, improved image object classification scheme consists of neural networks as feature extractors combined with common statistical discrimination algorithms. Applied to the recognition of different types of tumor nuclei images, this system is able to find differences which are barely discernible by human eyes.

  9. An arrhythmia classification algorithm using a dedicated wavelet adapted to different subjects

    Directory of Open Access Journals (Sweden)

    Min Se Dong

    2011-06-01

    Full Text Available Abstract Background Numerous studies have been conducted on heartbeat classification algorithms over the past several decades. Many algorithms have also been studied to achieve robust performance, as biosignals vary greatly among individuals. Various methods have been proposed to reduce the differences coming from personal characteristics, but these expand the differences caused by arrhythmia. Methods In this paper, an arrhythmia classification algorithm using a dedicated wavelet adapted to individual subjects is proposed. We reduced the performance variation using dedicated wavelets matched to the ECG morphologies of the subjects. The proposed algorithm utilizes morphological filtering and a continuous wavelet transform with a dedicated wavelet. Principal component analysis and linear discriminant analysis were utilized to compress the morphological data transformed by the dedicated wavelets. An extreme learning machine was used as the classifier in the proposed algorithm. Results A performance evaluation was conducted with the MIT-BIH arrhythmia database. The results showed a high sensitivity of 97.51%, specificity of 85.07%, accuracy of 97.94%, and a positive predictive value of 97.26%. Conclusions The proposed algorithm achieves better accuracy than other state-of-the-art algorithms with no intrasubject data shared between the training and evaluation datasets, and it significantly reduces the amount of intervention needed by physicians.

  10. Image processing and classification algorithm for yeast cell morphology in a microfluidic chip

    Science.gov (United States)

    Yang Yu, Bo; Elbuken, Caglar; Ren, Carolyn L.; Huissoon, Jan P.

    2011-06-01

    The study of yeast cell morphology requires consistent identification of cell cycle phases based on cell bud size. A computer-based image processing algorithm is designed to automatically classify microscopic images of yeast cells in a microfluidic channel environment. The images are enhanced to reduce background noise, and a robust segmentation algorithm is developed to extract geometrical features including compactness, axis ratio, and bud size. The features are then used for classification, and the accuracy of various machine-learning classifiers is compared. The linear support vector machine, distance-based classification, and the k-nearest-neighbor algorithm were the classifiers used in this experiment. The performance of the system under various illumination and focusing conditions was also tested. The results suggest it is possible to automatically classify yeast cells based on their morphological characteristics from noisy and low-contrast images.

  11. Nesting one-against-one algorithm based on SVMs for pattern classification.

    Science.gov (United States)

    Liu, Bo; Hao, Zhifeng; Tsang, Eric C C

    2008-12-01

    Support vector machines (SVMs), which were originally designed for binary classifications, are an excellent tool for machine learning. For the multiclass classifications, they are usually converted into binary ones before they can be used to classify the examples. In the one-against-one algorithm with SVMs, there exists an unclassifiable region where the data samples cannot be classified by its decision function. This paper extends the one-against-one algorithm to handle this problem. We also give the convergence and computational complexity analysis of the proposed method. Finally, one-against-one, fuzzy decision function (FDF), and decision-directed acyclic graph (DDAG) algorithms and our proposed method are compared using five University of California at Irvine (UCI) data sets. The results report that the proposed method can handle the unclassifiable region better than others. PMID:19054729
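
    A minimal sketch of plain one-against-one voting with scikit-learn SVMs is given below; it only flags the tied samples that fall in the unclassifiable region, whereas the paper's contribution is a nesting scheme that resolves them (not reproduced here).

```python
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

def fit_one_vs_one(X, y):
    """Train one binary SVM per pair of classes."""
    models = {}
    for a, b in combinations(np.unique(y), 2):
        idx = (y == a) | (y == b)
        models[(a, b)] = SVC(kernel="rbf", gamma="scale").fit(X[idx], y[idx])
    return models

def predict_one_vs_one(models, X, classes):
    """Majority vote over pairwise SVMs; ties correspond to the unclassifiable region."""
    votes = np.zeros((len(X), len(classes)), dtype=int)
    cls_index = {c: i for i, c in enumerate(classes)}
    for pair, model in models.items():
        for i, p in enumerate(model.predict(X)):
            votes[i, cls_index[p]] += 1
    top = votes.max(axis=1, keepdims=True)
    tie = (votes == top).sum(axis=1) > 1   # samples left ambiguous by plain voting
    return np.array(classes)[votes.argmax(axis=1)], tie

# Usage: models = fit_one_vs_one(X_train, y_train)
#        pred, ambiguous = predict_one_vs_one(models, X_test, np.unique(y_train))
```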

  12. Robust algorithm for arrhythmia classification in ECG using extreme learning machine

    Directory of Open Access Journals (Sweden)

    Shin Kwangsoo

    2009-10-01

    Full Text Available Abstract Background Recently, extensive studies have been carried out on arrhythmia classification algorithms using artificial intelligence pattern recognition methods such as neural networks. To improve practicality, many studies have focused on the learning speed and accuracy of neural networks. However, algorithms based on neural networks still have some problems concerning practical application, such as slow learning speeds and unstable performance caused by local minima. Methods In this paper we propose a novel arrhythmia classification algorithm which has a fast learning speed and high accuracy, and uses morphological filtering, principal component analysis and the Extreme Learning Machine (ELM). The proposed algorithm can classify six beat types: normal beat, left bundle branch block, right bundle branch block, premature ventricular contraction, atrial premature beat, and paced beat. Results The experimental results on the entire MIT-BIH arrhythmia database demonstrate that the performance of the proposed algorithm is 98.00% in terms of average sensitivity, 97.95% in terms of average specificity, and 98.72% in terms of average accuracy. These accuracy levels are higher than or comparable with those of existing methods. We make a comparative study of the algorithm using an ELM, a back-propagation neural network (BPNN), a radial basis function network (RBFN), and a support vector machine (SVM). Concerning learning time, the proposed algorithm using an ELM is about 290, 70, and 3 times faster than algorithms using a BPNN, RBFN, and SVM, respectively. Conclusion The proposed algorithm shows effective accuracy performance with a short learning time. In addition, we ascertained the robustness of the proposed algorithm by evaluating it on the entire MIT-BIH arrhythmia database.
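
    The speed advantage quoted above comes from the closed-form training of the ELM output layer. A minimal NumPy sketch of a generic ELM classifier is shown below (the hidden-layer size and activation are arbitrary choices, and the morphology filtering and PCA front end of the paper are omitted).

```python
import numpy as np

class ELM:
    """Minimal single-hidden-layer Extreme Learning Machine for classification.

    Hidden weights are random and fixed; only the output weights are solved in
    closed form with a pseudo-inverse, which is why training is very fast.
    """
    def __init__(self, n_hidden=100, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        y = np.asarray(y, dtype=int)
        T = np.eye(int(y.max()) + 1)[y]                  # one-hot targets
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        self.beta = np.linalg.pinv(self._hidden(X)) @ T  # closed-form output weights
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```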

  13. Seasonal cultivated and fallow cropland mapping using MODIS-based automated cropland classification algorithm

    Science.gov (United States)

    Wu, Zhuoting; Thenkabail, Prasad S.; Mueller, Rick; Zakzeski, Audra; Melton, Forrest; Johnson, Lee; Rosevelt, Carolyn; Dwyer, John; Jones, Jeanine; Verdin, James P.

    2014-01-01

    Increasing drought occurrences and growing populations demand accurate, routine, and consistent cultivated and fallow cropland products to enable water and food security analysis. The overarching goal of this research was to develop and test an automated cropland classification algorithm (ACCA) that provides accurate, consistent, and repeatable information on seasonal cultivated as well as seasonal fallow cropland extents and areas based on Moderate Resolution Imaging Spectroradiometer remote sensing data. The seasonal ACCA development process involves writing a series of iterative decision tree codes to separate cultivated and fallow croplands from noncroplands, aiming to accurately mirror reliable reference data sources. A pixel-by-pixel accuracy assessment when compared with the U.S. Department of Agriculture (USDA) cropland data showed, on average, a producer's accuracy of 93% and a user's accuracy of 85% across all months. Further, ACCA-derived cropland maps agreed well with the USDA Farm Service Agency crop acreage-reported data for both cultivated and fallow croplands, with R-square values over 0.7, and with field surveys with an accuracy of ≥95% for cultivated croplands and ≥76% for fallow croplands. Our results demonstrated the ability of ACCA to generate cropland products, such as cultivated and fallow cropland extents and areas, accurately, automatically, and repeatedly throughout the growing season.

  14. Seasonal cultivated and fallow cropland mapping using MODIS-based automated cropland classification algorithm

    Science.gov (United States)

    Wu, Zhuoting; Thenkabail, Prasad S.; Mueller, Rick; Zakzeski, Audra; Melton, Forrest; Johnson, Lee; Rosevelt, Carolyn; Dwyer, John; Jones, Jeanine; Verdin, James P.

    2013-01-01

    Increasing drought occurrences and growing populations demand accurate, routine, and consistent cultivated and fallow cropland products to enable water and food security analysis. The overarching goal of this research was to develop and test an automated cropland classification algorithm (ACCA) that provides accurate, consistent, and repeatable information on seasonal cultivated as well as seasonal fallow cropland extents and areas based on Moderate Resolution Imaging Spectroradiometer remote sensing data. The seasonal ACCA development process involves writing a series of iterative decision tree codes to separate cultivated and fallow croplands from noncroplands, aiming to accurately mirror reliable reference data sources. A pixel-by-pixel accuracy assessment when compared with the U.S. Department of Agriculture (USDA) cropland data showed, on average, a producer's accuracy of 93% and a user's accuracy of 85% across all months. Further, ACCA-derived cropland maps agreed well with the USDA Farm Service Agency crop acreage-reported data for both cultivated and fallow croplands, with R-square values over 0.7, and with field surveys with an accuracy of ≥95% for cultivated croplands and ≥76% for fallow croplands. Our results demonstrated the ability of ACCA to generate cropland products, such as cultivated and fallow cropland extents and areas, accurately, automatically, and repeatedly throughout the growing season.

  15. Classification of Aerosol Retrievals from Spaceborne Polarimetry Using a Multiparameter Algorithm

    Science.gov (United States)

    Russell, Philip B.; Kacenelenbogen, Meloe; Livingston, John M.; Hasekamp, Otto P.; Burton, Sharon P.; Schuster, Gregory L.; Johnson, Matthew S.; Knobelspiesse, Kirk D.; Redemann, Jens; Ramachandran, S.; Holben, Brent

    2013-01-01

    In this presentation, we demonstrate application of a new aerosol classification algorithm to retrievals from the POLDER-3 polarimeter on the PARASOL spacecraft. Motivation and method: Since the development of global aerosol measurements by satellites and AERONET, classification of observed aerosols into several types (e.g., urban-industrial, biomass burning, mineral dust, maritime, and various subtypes or mixtures of these) has proven useful for understanding aerosol sources, transformations, effects, and feedback mechanisms; improving the accuracy of satellite retrievals; and quantifying assessments of aerosol radiative impacts on climate.

  16. Comparison of some classification algorithms based on deterministic and nondeterministic decision rules

    KAUST Repository

    Delimata, Paweł

    2010-01-01

    We discuss two, in a sense extreme, kinds of nondeterministic rules in decision tables. The first kind of rules, called inhibitory rules, block only one decision value (i.e., they have all but one of the possible decisions on their right-hand sides). Contrary to this, any rule of the second kind, called a bounded nondeterministic rule, can have only a few decisions on its right-hand side. We show that both kinds of rules can be used for improving the quality of classification. In the paper, two lazy classification algorithms of polynomial time complexity are considered. These algorithms are based on deterministic and inhibitory decision rules, but the direct generation of rules is not required. Instead, for any new object the considered algorithms efficiently extract from a given decision table some information about the set of rules. Next, this information is used by a decision-making procedure. The reported results of experiments show that the algorithms based on inhibitory decision rules are often better than those based on deterministic decision rules. We also present an application of bounded nondeterministic rules in the construction of rule-based classifiers. We include the results of experiments showing that by combining rule-based classifiers based on minimal decision rules with bounded nondeterministic rules having confidence close to 1 and sufficiently large support, it is possible to improve the classification quality. © 2010 Springer-Verlag.

  17. A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

    Directory of Open Access Journals (Sweden)

    Lavner Yizhar

    2009-01-01

    Full Text Available We present an efficient algorithm for segmentation of audio signals into speech or music. The central motivation to our study is consumer audio applications, where various real-time enhancements are often applied. The algorithm consists of a learning phase and a classification phase. In the learning phase, predefined training data is used for computing various time-domain and frequency-domain features, for speech and music signals separately, and estimating the optimal speech/music thresholds, based on the probability density functions of the features. An automatic procedure is employed to select the best features for separation. In the test phase, initial classification is performed for each segment of the audio signal, using a three-stage sieve-like approach, applying both Bayesian and rule-based methods. To avoid erroneous rapid alternations in the classification, a smoothing technique is applied, averaging the decision on each segment with past segment decisions. Extensive evaluation of the algorithm, on a database of more than 12 hours of speech and more than 22 hours of music showed correct identification rates of 99.4% and 97.8%, respectively, and quick adjustment to alternating speech/music sections. In addition to its accuracy and robustness, the algorithm can be easily adapted to different audio types, and is suitable for real-time operation.

  18. Human Talent Prediction in HRM using C4.5 Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Hamidah Jantan,

    2010-11-01

    Full Text Available In HRM, among the challenges for HR professionals is to manage an organization’s talents, especially to ensure the right person for the right job at the right time. Human talent prediction is an alternative way to handle this issue. For that reason, classification and prediction in data mining, which is commonly used in many areas, can also be applied to human talent. There are many classification techniques in data mining, such as Decision Tree, Neural Network, Rough Set Theory, Bayesian theory and Fuzzy logic. The decision tree is among the popular classification techniques, as it can produce interpretable rules or logic statements. The generated rules from the selected technique can be used for future prediction. In this article, we present a study on how potential human talent can be predicted using a decision tree classifier. By using this technique, the pattern of talent performance can be identified through the classification process. In that case, the hidden and valuable knowledge discovered in the related databases is summarized in the decision tree structure. In this study, we use the decision tree C4.5 classification algorithm to generate the classification rules for human talent performance records. Finally, the generated rules are evaluated using unseen data in order to estimate the accuracy of the prediction result.
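
    C4.5 itself is not shipped with common Python libraries, but an entropy-based decision tree is a close stand-in for the workflow sketched above (train on labelled records, read off the rules, evaluate on unseen data). The HR feature names and labels below are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical HR records: [years_of_service, appraisal_score, trainings_attended]
rng = np.random.default_rng(1)
X = rng.integers(0, 10, size=(200, 3)).astype(float)
y = (X[:, 1] + 0.5 * X[:, 2] > 7).astype(int)        # 1 = "high-potential talent"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3).fit(X_tr, y_tr)

# The tree can be printed as human-readable if-then rules, then scored on unseen data
print(export_text(tree, feature_names=["service", "appraisal", "trainings"]))
print("accuracy on unseen data:", tree.score(X_te, y_te))
```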

  19. Linear Programming Support Vector Machines for Pattern Classification and Regression Estimation: and The SR Algorithm: Improving Speed and Tightness of VC Bounds in SV Algorithms

    OpenAIRE

    Friel, Thilo-Thomas; Harrison, R.

    1998-01-01

    Three novel algorithms are presented; the linear programming (LP) machine for pattern classification, the LP machine for regression estimation and the set-reduction (SR) algorithm. The LP machine is a learning machine which achieves solutions as good as the SV machine by only maximising a linear cost function (SV machines are based on quadratic programming). The set-reduction algorithm improves the speed and accuracy of LP machines, SV machines and other related algorithms. An LP machine's de...

  20. Recent processing string and fusion algorithm improvements for automated sea mine classification in shallow water

    Science.gov (United States)

    Aridgides, Tom; Fernandez, Manuel F.; Dobeck, Gerald J.

    2003-09-01

    A novel sea mine computer-aided-detection / computer-aided-classification (CAD/CAC) processing string has been developed. The overall CAD/CAC processing string consists of pre-processing, adaptive clutter filtering (ACF), normalization, detection, feature extraction, feature orthogonalization, optimal subset feature selection, classification and fusion processing blocks. The range-dimension ACF is matched both to average highlight and shadow information, while also adaptively suppressing background clutter. For each detected object, features are extracted and processed through an orthogonalization transformation, enabling an efficient application of the optimal log-likelihood-ratio-test (LLRT) classification rule, in the orthogonal feature space domain. The classified objects of 4 distinct processing strings are fused using the classification confidence values as features and logic-based, "M-out-of-N", or LLRT-based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new shallow water high-resolution sonar imagery data. The processing string detection and classification parameters were tuned and the string classification performance was optimized, by appropriately selecting a subset of the original feature set. A significant improvement was made to the CAD/CAC processing string by utilizing a repeated application of the subset feature selection / LLRT classification blocks. It was shown that LLRT-based fusion algorithms outperform the logic based and the "M-out-of-N" ones. The LLRT-based fusion of the CAD/CAC processing strings resulted in up to a nine-fold false alarm rate reduction, compared to the best single CAD/CAC processing string results, while maintaining a constant correct mine classification probability.
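
    The two fusion strategies mentioned above can be contrasted with a small sketch: a logic-based "M-out-of-N" vote versus a log-likelihood-ratio-test (LLRT) fusion of per-string confidence values. The thresholds and the class-conditional densities below are illustrative assumptions, not the tuned values from the study.

```python
import numpy as np

def m_out_of_n_fusion(confidences, threshold=0.5, m=2):
    """Declare 'mine' when at least m of the N processing strings exceed a threshold."""
    return (confidences > threshold).sum(axis=1) >= m

def llrt_fusion(confidences, pdf_mine, pdf_clutter, llr_threshold=0.0):
    """Sum per-string log-likelihood ratios of the confidence values
    (the class-conditional densities here are simple illustrative callables)."""
    llr = np.log(pdf_mine(confidences)) - np.log(pdf_clutter(confidences))
    return llr.sum(axis=1) > llr_threshold

conf = np.array([[0.9, 0.4, 0.8, 0.7],     # object 1: confidences of 4 strings
                 [0.2, 0.3, 0.6, 0.1]])    # object 2
print(m_out_of_n_fusion(conf, m=3))
print(llrt_fusion(conf,
                  pdf_mine=lambda c: 2.0 * c,            # density rising with confidence
                  pdf_clutter=lambda c: 2.0 * (1 - c)))  # density falling with confidence
```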

  1. Survey on Parameters of Fingerprint Classification Methods Based On Algorithmic Flow

    Directory of Open Access Journals (Sweden)

    Dimple Parekh

    2011-09-01

    Full Text Available Classification refers to assigning a given fingerprint to one of the existing classes already recognized in the literature. A search over all the records in the database takes a long time, so the goal is to reduce the size of the search space by choosing an appropriate subset of the database for search. Classifying a fingerprint image is a very difficult pattern recognition problem, due to the minimal interclass variability and maximal intraclass variability. This paper presents a sequence flow diagram which will help in developing clarity on designing algorithms for classification based on various parameters extracted from the fingerprint image. It discusses in brief the ways in which the parameters are extracted from the image. Existing fingerprint classification approaches are based on these parameters as input for classifying the image. Parameters like orientation map, singular points, spurious singular points, ridge flow, transforms and hybrid features are discussed in the paper.

  2. A Low-Latency Glitch Classification Algorithm Based on Waveform Morphology

    Science.gov (United States)

    Gabbard, Hunter; Mukherjee, Soma; Stone, Robert

    2016-03-01

    We present a novel and efficient algorithm for classification of signals that arise in gravitational wave channels of the Laser Interferometer Gravitational Wave Observatory (LIGO). Using data from LIGO's sixth science run (S6), we developed a new glitch classification algorithm based mainly on the morphology of the waveform as well as several other parameters (signal-to-noise ratio (SNR), duration, bandwidth, etc.). This is done using two novel methods, Kohonen Self Organizing Feature Maps (SOMs), and discrete wavelet transform coefficients. This study shows the feasibility of utilizing unsupervised machine learning techniques (SOMs) in order to display a multidimensional trigger set in a low-latency two dimensional format. UTRGV NSF REU Program.

  3. An Index-Inspired Algorithm for Anytime Classification on Evolving Data Streams

    DEFF Research Database (Denmark)

    Kranen, Phillip; Assent, Ira; Seidl, Thomas

    2012-01-01

    Due to the ever-growing presence of data streams there has been a considerable amount of research on stream data mining over the past years. Anytime algorithms are particularly well suited for stream mining, since they flexibly use all available time on streams of varying data rates, and are also shown to outperform traditional budget approaches on constant streams. In this article we present an index-inspired algorithm for Bayesian anytime classification on evolving data streams and show its performance on benchmark data sets.

  4. A Comparative Study of Classification Algorithms for Spam Email Data Analysis

    OpenAIRE

    Aman Sharma; Suruchi Sahni

    2011-01-01

    In recent years email has become one of the fastest and most economical means of communication. However increase of email users has resulted in the dramatic increase of spam emails during the past few years. Data mining -classification algorithms are used to categorize the email as spam or non-spam. In this paper, we conducted experiment in the WEKA environment by using four algorithms namely ID3, J48, Simple CART and Alternating Decision Tree on the spam email dataset and later the four algo...

  5. A simulation of remote sensor systems and data processing algorithms for spectral feature classification

    Science.gov (United States)

    Arduini, R. F.; Aherron, R. M.; Samms, R. W.

    1984-01-01

    A computational model of the deterministic and stochastic processes involved in multispectral remote sensing was designed to evaluate the performance of sensor systems and data processing algorithms for spectral feature classification. Accuracy in distinguishing between categories of surfaces or between specific types is developed as a means to compare sensor systems and data processing algorithms. The model allows studies to be made of the effects of variability of the atmosphere and of surface reflectance, as well as the effects of channel selection and sensor noise. Examples of these effects are shown.

  6. FREQUENCY ESTIMATION OF DISTORTED POWER SYSTEM SIGNALS USING MULTIPLE SIGNAL CLASSIFICATION ALGORITHM

    OpenAIRE

    UZUNOĞLU, Cengiz Polat

    2013-01-01

    In this paper a sub-space based fundamental frequency estimator for distorted power system signals, using the multiple signal classification (MUSIC) algorithm, is proposed. Noise and signal sub-spaces are obtained from an eigenvalue decomposition and the desired frequency is estimated. The distortion level of power system signals observed from measurements may affect the accuracy of frequency measurement in a power system. Thus, the proposed method is employed to decompose the noise sub-space for...
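
    A compact sketch of the MUSIC idea for fundamental frequency estimation is shown below (sample correlation matrix, eigen-decomposition, noise-subspace pseudospectrum). The model order, subspace size, sampling rate and search grid are illustrative choices, not the settings of the proposed method.

```python
import numpy as np

def music_frequency(x, fs, order=32, n_sources=2, freqs=None):
    """Estimate the dominant frequency of a real signal with the MUSIC algorithm.

    order     : dimension of the sample correlation matrix
    n_sources : signal-subspace dimension (2 for one real sinusoid)
    """
    if freqs is None:
        freqs = np.linspace(45.0, 55.0, 2001)          # search grid around 50 Hz
    # Snapshot matrix and sample correlation matrix
    N = len(x) - order + 1
    X = np.column_stack([x[i:i + order] for i in range(N)])
    R = X @ X.T / N
    # Noise subspace = eigenvectors of the smallest eigenvalues
    w, V = np.linalg.eigh(R)                           # eigenvalues in ascending order
    En = V[:, : order - n_sources]
    # MUSIC pseudospectrum over the frequency grid; peak = estimated frequency
    n = np.arange(order)
    spectrum = []
    for f in freqs:
        a = np.exp(-2j * np.pi * f / fs * n)           # steering vector
        spectrum.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return freqs[int(np.argmax(spectrum))]

# Distorted 49.8 Hz signal: fundamental + 5th harmonic + noise
fs = 3200.0
t = np.arange(0, 0.2, 1 / fs)
sig = (np.sin(2 * np.pi * 49.8 * t) + 0.2 * np.sin(2 * np.pi * 249.0 * t)
       + 0.1 * np.random.default_rng(0).standard_normal(len(t)))
print(music_frequency(sig, fs))                        # expected to be close to 49.8 Hz
```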

  7. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by Combining Landsat, MODIS, and Secondary Data

    OpenAIRE

    Prasad S. Thenkabail; Zhuoting Wu

    2012-01-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan u...

  8. Mahalanobis Distance Metric Learning Algorithm for Instance-based Data Stream Classification

    OpenAIRE

    Perez, Jorge Luis Rivero; Ribeiro, Bernardete; Perez, Carlos Morell

    2016-01-01

    With the massive data challenges nowadays and the rapid growth of technology, stream mining has recently received considerable attention. To address the large number of scenarios in which this phenomenon manifests itself, suitable tools are required in various research fields. Instance-based data stream algorithms generally employ the Euclidean distance for the classification task underlying this problem. A novel way to look into this issue is to take advantage of a more flexible metric due t...

  9. Determination of Optimum Classification System for Hyperspectral Imagery and LIDAR Data Based on Bees Algorithm

    Science.gov (United States)

    Samadzadega, F.; Hasani, H.

    2015-12-01

    Hyperspectral imagery is a rich source of spectral information and plays a very important role in the discrimination of similar land-cover classes. In the past, several efforts have investigated the improvement of hyperspectral imagery classification. Recently, interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR provides structural information about the scene while hyperspectral imagery provides spectral and spatial information. The complementary information of LiDAR and hyperspectral data may greatly improve the classification performance, especially in complex urban areas. In this paper, feature-level fusion of hyperspectral and LiDAR data is proposed, where spectral and structural features are extracted from both datasets and a hybrid feature space is generated by feature stacking. A Support Vector Machine (SVM) classifier is applied to the hybrid feature space to classify the urban area. In order to optimize the classification performance, two issues should be considered: determination of the SVM parameter values and feature subset selection. The Bees Algorithm (BA) is a powerful meta-heuristic optimization algorithm, applied here to determine the optimum SVM parameters and select the optimum feature subset simultaneously. The obtained results show that the proposed method can improve the classification accuracy in addition to significantly reducing the dimensionality of the feature space.

  10. Genetic Algorithm Optimized Back Propagation Neural Network for Knee Osteoarthritis Classification

    Directory of Open Access Journals (Sweden)

    Jian WeiKoh

    2014-10-01

    Full Text Available Osteoarthritis (OA) is the most common form of arthritis and is caused by degeneration of articular cartilage, which functions as a shock-absorbing cushion in the joint. The joints most commonly affected by osteoarthritis are the hand, hip, spine and knee. Knee osteoarthritis is the focus of this study. These days, the Magnetic Resonance Imaging (MRI) technique is widely applied in diagnosing the progression of osteoarthritis due to its ability to display the contrast between bone and cartilage. Traditionally, interpretation of MR images is done manually by physicians, which is inconsistent and time consuming. Hence, an automated classifier is needed to minimize the processing time of classification. In this study, a genetic-algorithm-optimized neural network technique is used for knee osteoarthritis classification. The classifier consists of 4 stages: feature extraction by Discrete Wavelet Transform (DWT), the training stage of the neural network, the testing stage of the neural network and the optimization stage by Genetic Algorithm (GA). This technique obtained 98.5% classification accuracy in the training stage and 94.67% in the testing stage. Besides, classification time is reduced by 17.24% after optimization of the neural network.

  11. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis

    Science.gov (United States)

    Lo, Benjamin W. Y.; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R. Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A. H.

    2016-01-01

    Background: Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). Methods: The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Results: Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56–2.45, P decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH. PMID:27512607

  12. Classification of Medical Datasets Using SVMs with Hybrid Evolutionary Algorithms Based on Endocrine-Based Particle Swarm Optimization and Artificial Bee Colony Algorithms.

    Science.gov (United States)

    Lin, Kuan-Cheng; Hsieh, Yi-Hsiu

    2015-10-01

    The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a problem of feature subset selection, such as combination optimization problems. Evolutionary algorithms using random search methods have proven highly effective in obtaining solutions to problems of optimization in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms in conjunction with a support vector machine (SVM) for the selection of optimal feature subsets for the classification of datasets. The results of experiments using specific UCI medical datasets demonstrate that the accuracy of the proposed hybrid evolutionary algorithm is superior to that of basic PSO, EPSO and ABC algorithms, with regard to classification accuracy using subsets with a reduced number of features. PMID:26289628

  13. Evolving Neural Network Using Variable String Genetic Algorithm for Color Infrared Aerial Image Classification

    Institute of Scientific and Technical Information of China (English)

    FU Xiaoyang; P E R Dale; ZHANG Shuqing

    2008-01-01

    Coastal wetlands are characterized by complex patterns both in their geomorphic and ecological features. Besides field observations, it is necessary to analyze the land cover of wetlands through color infrared (CIR) aerial photography or remote sensing images. In this paper, we designed an evolving neural network classifier using a variable string genetic algorithm (VGA) for the land cover classification of CIR aerial images. With the VGA, the classifier we designed is able to automatically evolve the appropriate number of hidden nodes for modeling the neural network topology optimally and to find a near-optimal set of connection weights globally. Then, with the backpropagation algorithm (BP), it can find the best connection weights. The VGA-BP classifier, derived from the hybrid algorithms mentioned above, is demonstrated to be effective on CIR image classification. Compared with standard classifiers, such as the Bayes maximum-likelihood classifier, the VGA classifier and the BP-MLP (multi-layer perceptron) classifier, it has been shown that the VGA-BP classifier can achieve better performance on high-resolution land cover classification.

  14. A Hybrid Multiobjective Differential Evolution Algorithm and Its Application to the Optimization of Grinding and Classification

    Directory of Open Access Journals (Sweden)

    Yalin Wang

    2013-01-01

    Full Text Available Grinding-classification is the prerequisite process for full recovery of nonrenewable minerals, with both production quality and quantity objectives concerned. Its natural formulation is a constrained multiobjective optimization problem of complex expression, since the process is composed of one grinding machine and two classification machines. In this paper, a hybrid differential evolution (DE) algorithm with multiple populations is proposed. Some infeasible solutions with better performance are allowed to be saved, and they participate randomly in the evolution. In order to exploit the meaningful infeasible solutions, a functionally partitioned multi-population mechanism is designed to find an optimal solution from all possible directions. Meanwhile, a simplex method for local search is inserted into the evolution process to enhance the search strategy. Simulation results from tests on some benchmark problems indicate that the proposed algorithm tends to converge quickly and effectively to the Pareto frontier with a better distribution. Finally, the proposed algorithm is applied to solve a multiobjective optimization model of a grinding and classification process. Based on the technique for order preference by similarity to ideal solution (TOPSIS), the satisfactory solution is obtained by using a decision-making method for multiple attributes.

  15. EVALUATION OF SOUND CLASSIFICATION USING MODIFIED CLASSIFIER AND SPEECH ENHANCEMENT USING ICA ALGORITHM FOR HEARING AID APPLICATION

    Directory of Open Access Journals (Sweden)

    N. Shanmugapriya

    2016-03-01

    Full Text Available Hearing aid users are exposed to diversified vocal scenarios. The necessity for sound classification algorithms becomes a vital factor in yielding a good listening experience. In this work, an approach is proposed to improve speech quality in hearing aids based on the Independent Component Analysis (ICA) algorithm with modified speech signal classification methods. The proposed algorithm achieves better speech intelligibility than other existing algorithms, and this result has been confirmed by intelligibility experiments. The ICA algorithm, together with a modified Bayesian classifier and an Adaptive Neuro-Fuzzy Inference System (ANFIS), improves the effectiveness of the speech quality strategies; this classification increases the noise resistance of the new speech processing algorithm proposed in this work. The proposed work indicates that the new modified classifier can be feasible in hearing aid applications.
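
    The ICA step can be illustrated with scikit-learn's FastICA on a toy two-microphone mixture; the synthetic sources and mixing matrix below are assumptions for demonstration and say nothing about the paper's hearing-aid pipeline or its modified classifier.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two synthetic sources: a "speech-like" tone burst and broadband noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
speech = np.sin(2 * np.pi * 220 * t) * (t > 0.3)
noise = rng.standard_normal(len(t))
S = np.column_stack([speech, noise])

# Simulated two-microphone mixtures (the mixing is unknown in a real hearing aid)
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T

# Blind separation with FastICA; the recovered components are the candidate
# "speech" and "noise" channels that a classifier stage could then label
ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(X)
print(recovered.shape)   # (8000, 2): separated sources, up to scale and order
```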

  16. A BENCHMARK TO SELECT DATA MINING BASED CLASSIFICATION ALGORITHMS FOR BUSINESS INTELLIGENCE AND DECISION SUPPORT SYSTEMS

    Directory of Open Access Journals (Sweden)

    Pardeep Kumar

    2012-09-01

    Full Text Available In today’s business scenario, we perceive major changes in how managers use computerized support in making decisions. As more decision-makers use computerized support in decision making, decision support systems (DSS) are developing from their beginnings as a personal support tool and are becoming a common resource in an organization. DSS serve the management, operations, and planning levels of an organization and help to make decisions which may be rapidly changing and not easily specified in advance. Data mining has a vital role in extracting important information to help the decision making of a decision support system. It has been an active field of research in the last two to three decades. Integration of data mining and decision support systems (DSS) can lead to improved performance and can enable the tackling of new types of problems. Artificial Intelligence methods are improving the quality of decision support, and have become embedded in many applications, ranging from anti-lock automobile brakes to interactive search engines. It provides various machine learning techniques to support data mining. Classification is one of the main and valuable tasks of data mining. Several types of classification algorithms have been suggested, tested and compared to determine future trends based on unseen data. No single algorithm has been found to be superior over all others for all data sets. Various issues such as predictive accuracy, training time to build the model, robustness and scalability must be considered and can involve tradeoffs, further complicating the quest for an overall superior method. The objective of this paper is to compare various classification algorithms that have been frequently used in data mining for decision support systems. Three decision-tree-based algorithms, one artificial neural network, one statistical algorithm, one support vector machine with and without AdaBoost and one clustering algorithm are tested and compared on
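
    A minimal version of such a benchmark, using scikit-learn and cross-validation on a stand-in public dataset, might look like the sketch below; the chosen classifiers and dataset are illustrative, not the exact ones compared in the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)   # stand-in for a business dataset

candidates = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes (statistical)": GaussianNB(),
    "neural network": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "AdaBoost (boosted trees)": AdaBoostClassifier(random_state=0),
}

# Same data, same folds: compare predictive accuracy across algorithm families
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name:>25s}: {scores.mean():.3f} +/- {scores.std():.3f}")
```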

  18. Classification decision tree algorithm assisting in diagnosing solitary pulmonary nodule by SPECT/CT fusion imaging

    Institute of Scientific and Technical Information of China (English)

    Qiang Yongqian; Guo Youmin; Jin Chenwang; Liu Min; Yang Aimin; Wang Qiuping; Niu Gang

    2008-01-01

    Objective: To develop a classification tree algorithm to improve the diagnostic performance of 99mTc-MIBI SPECT/CT fusion imaging in differentiating solitary pulmonary nodules (SPNs). Methods: Forty-four SPNs, including 30 malignant cases and 14 benign ones that were eventually pathologically identified, were included in this prospective study. All patients received 99mTc-MIBI SPECT/CT scanning at an early stage and a delayed stage before operation. Thirty predictor variables, including 11 clinical variables, 4 variables of emission and 15 variables of transmission information from SPECT/CT scanning, were analyzed independently by the classification tree algorithm and by radiological residents. Diagnostic rules were demonstrated in tree topology, and diagnostic performances were compared using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Results: A classification decision tree with a lowest relative cost of 0.340 was developed for 99mTc-MIBI SPECT/CT scanning, in which the Target/Normal region uptake ratio of 99mTc-MIBI in the delayed stage and in the early stage, age, cough and specula sign were the five most important contributors. The sensitivity and specificity were 93.33% and 78.57%, respectively, a little higher than those of the expert. The sensitivity and specificity of Grade-one residents were 76.67% and 28.57%, respectively. The AUC of CART and of the expert was 0.886±0.055 and 0.829±0.062, respectively, and the corresponding AUC of the residents was 0.566±0.092. Comparisons of AUCs suggest that the performance of CART was similar to that of the expert (P=0.204), but greater than that of the residents (P<0.001). Conclusion: Our data mining technique using a classification decision tree has a much higher accuracy than residents. It suggests that the application of this algorithm will significantly improve the diagnostic performance of residents.

  19. Library Event Matching event classification algorithm for electron neutrino interactions in the NOνA detectors

    Science.gov (United States)

    Backhouse, C.; Patterson, R. B.

    2015-04-01

    We describe the Library Event Matching classification algorithm implemented for use in the NOνA νμ →νe oscillation measurement. Library Event Matching, developed in a different form by the earlier MINOS experiment, is a powerful approach in which input trial events are compared to a large library of simulated events to find those that best match the input event. A key feature of the algorithm is that the comparisons are based on all the information available in the event, as opposed to higher-level derived quantities. The final event classifier is formed by examining the details of the best-matched library events. We discuss the concept, definition, optimization, and broader applications of the algorithm as implemented here. Library Event Matching is well-suited to the monolithic, segmented detectors of NOνA and thus provides a powerful technique for event discrimination.

  20. A Wavelet-Based Algorithm for Delineation and Classification of Wave Patterns in Continuous Holter ECG Recordings

    OpenAIRE

    Johannesen, L; Grove, USL; Sørensen, JS; Schmidt, ML; Couderc, J-P; Graff, C

    2010-01-01

    Quantitative analysis of the electrocardiogram (ECG) requires delineation and classification of the individual ECG wave patterns. We propose a wavelet-based waveform classifier that uses the fiducial points identified by a delineation algorithm. For validation of the algorithm, manually annotated ECG records from the QT database (Physionet) were used. ECG waveform classification accuracies were: 85.6% (P-wave), 89.7% (QRS complex), 92.8% (T-wave) and 76.9% (U-wave). The proposed classificatio...

  1. SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES

    Directory of Open Access Journals (Sweden)

    Hugo Leonardo Pereira Rufino

    2016-04-01

    Full Text Available Most classification tools assume that the data distribution is balanced or that misclassification costs are similar. Nevertheless, in practical terms, databases with unbalanced classes are commonplace, such as in the diagnosis of diseases, in which confirmed cases are usually rare when compared with the healthy population. Other examples are the detection of fraudulent calls and the detection of system intruders. In these cases, the improper classification of a minority class (for instance, diagnosing a person with cancer as healthy) may result in more serious consequences than incorrectly classifying a majority class. Therefore, it is important to treat databases where unbalanced classes occur. This paper presents the SMOTE_Easy algorithm, which can classify data even if there is a high level of unbalance between the different classes. In order to prove its efficiency, a comparison with the main algorithms for treating classification of unbalanced data was made. This process was successful in nearly all tested databases.
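
    For readers unfamiliar with SMOTE-style oversampling (the family SMOTE_Easy builds on), the sketch below generates synthetic minority samples by interpolating between minority points and their nearest minority neighbours; the data, class ratio and parameters are invented for illustration and this is not the SMOTE_Easy algorithm itself.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Minimal SMOTE-style oversampling of a minority class: each synthetic
    sample lies between a minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)              # idx[:, 0] is the point itself
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(len(X_min))           # pick a minority sample
        nb = X_min[rng.choice(idx[j, 1:])]     # one of its neighbours
        gap = rng.random()
        synthetic[i] = X_min[j] + gap * (nb - X_min[j])
    return synthetic

# Illustrative 95:5 imbalance (e.g. healthy population vs confirmed cases)
rng = np.random.default_rng(1)
X_major = rng.normal(0.0, 1.0, size=(950, 2))
X_minor = rng.normal(3.0, 0.5, size=(50, 2))
X_new = smote_oversample(X_minor, n_new=900)
print(X_new.shape)    # (900, 2) synthetic minority samples to rebalance training
```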

  2. Integrity Classification Algorithm of Images obtained from Impact Damaged Composite Structures

    Directory of Open Access Journals (Sweden)

    Mahmoud Z. Iskandarani

    2010-01-01

    Full Text Available Problem statement: Many NDT systems used for damage detection in composites are difficult to apply to complex geometric structures; they are also time-consuming. As a solution to the problems associated with NDT applications, an intelligent analysis system that supports a portable testing environment, allows various types of inputs and provides sufficient data regarding the level of damage in a tested structure was designed and tested. The developed technique is a novel approach that allows locating defects with good accuracy. Approach: This research presented a novel approach to fast NDT using intelligent image analysis through a specifically developed algorithm that checks the integrity of composite structures. Such a novel approach allows not only determining the level of damage, but also correlating damage detected by one imaging technique, using available instruments and methods, to the results that would be obtained using other instruments and techniques. Results: Using the developed ICA algorithm, accurate classification was achieved using C-Scan and Low Temperature Thermal imaging (LTT). Both techniques agreed on damage classification and structural integrity. Conclusion: This very successful approach to damage detection and classification is further supported by its ability to correlate different NDT technologies and predict others.

  3. Defining and evaluating classification algorithm for high-dimensional data based on latent topics.

    Directory of Open Access Journals (Sweden)

    Le Luo

    Full Text Available Automatic text categorization is one of the key techniques in information retrieval and the data mining field. The classification is usually time-consuming when the training dataset is large and high-dimensional. Many methods have been proposed to solve this problem, but few can achieve satisfactory efficiency. In this paper, we present a method which combines the Latent Dirichlet Allocation (LDA) algorithm and the Support Vector Machine (SVM). LDA is first used to generate a reduced-dimensional representation of topics as features in the vector space model (VSM). It is able to reduce the number of features dramatically while keeping the necessary semantic information. The SVM is then employed to classify the data based on the generated features. We evaluate the algorithm on the 20 Newsgroups and Reuters-21578 datasets, respectively. The experimental results show that classification based on our proposed LDA+SVM model achieves high performance in terms of precision, recall and F1 measure. Further, it can achieve this within a much shorter time-frame. Our process improves greatly upon the previous work in this field and displays strong potential to achieve a streamlined classification process for a wide range of applications.
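
    A bare-bones LDA-then-SVM pipeline in scikit-learn, evaluated on a small slice of 20 Newsgroups, is sketched below; the vocabulary size, number of topics and chosen categories are illustrative assumptions rather than the settings used in the paper.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Small 4-category slice of 20 Newsgroups as a stand-in corpus
data = fetch_20newsgroups(subset="train",
                          categories=["sci.space", "rec.autos",
                                      "comp.graphics", "talk.politics.misc"],
                          remove=("headers", "footers", "quotes"))
X_tr, X_te, y_tr, y_te = train_test_split(data.data, data.target,
                                          test_size=0.3, random_state=0)

# Documents -> term counts -> low-dimensional topic features -> linear SVM
model = make_pipeline(
    CountVectorizer(max_features=5000, stop_words="english"),
    LatentDirichletAllocation(n_components=50, random_state=0),
    LinearSVC(),
)
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te), digits=3))
```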

  4. Classification of Ultrasonic NDE Signals Using the Expectation Maximization (EM) and Least Mean Square (LMS) Algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Dae Won [Dankook University, Cheonan (Korea, Republic of)

    2005-02-15

    Ultrasonic inspection methods are widely used for detecting flaws in materials. The signal analysis step plays a crucial part in the data interpretation process. A number of signal processing methods have been proposed to classify ultrasonic flaw signals. One of the more popular methods involves the extraction of an appropriate set of features followed by the use of a neural network for the classification of the signals in the feature space. This paper describes an alternative approach which uses the least mean square (LMS) method and the expectation maximization (EM) algorithm with model-based deconvolution, employed for classifying nondestructive evaluation (NDE) signals from steam generator tubes in a nuclear power plant. The signals due to cracks and deposits are not significantly different, yet these signals must be discriminated to prevent a disaster such as contamination of water or an explosion. A model-based deconvolution has been described to facilitate comparison of classification results. The method uses the space alternating generalized expectation maximization (SAGE) algorithm in conjunction with the Newton-Raphson method, which uses the Hessian parameter, resulting in fast convergence to estimate the time of flight and the distance between the tube wall and the ultrasonic sensor. Results using these schemes for the classification of ultrasonic signals from cracks and deposits within steam generator tubes are presented and show reasonable performance.

  5. Implementation of the Associative Classification Algorithm and Format of Dataset in Context of Data Mining

    Directory of Open Access Journals (Sweden)

    Gajraj Singh

    2013-06-01

    Full Text Available This work addresses the construction of classification models based on association rules. Although association rules have been predominantly used for data exploration and description, interest in using them for prediction has rapidly increased in the data mining community. In order to mine only rules that can be used for classification, we modified the well-known association rule mining algorithm Apriori to handle user-defined input constraints. We considered constraints that require the presence/absence of particular items, or that limit the number of items in the antecedents and/or the consequents of the rules. We developed a characterization of those item sets that will potentially form rules satisfying the given constraints. This characterization allows us to prune during item set construction, which improves the time performance of item set construction. Using this characterization, we implemented a classification system based on association rules. Furthermore, we enhanced the algorithm by relying on the typical support/confidence framework, mining for the best possible rules above a user-defined minimum confidence and within a desired range for the number of rules [9]. This avoids long mining times that might produce large collections of rules with low predictive power.

  6. Bands selection and classification of hyperspectral images based on hybrid kernels SVM by evolutionary algorithm

    Science.gov (United States)

    Hu, Yan-Yan; Li, Dong-Sheng

    2016-01-01

    Hyperspectral images (HSI) consist of many closely spaced bands carrying most of the object information. However, due to their high dimensionality and high data volume, it is hard to obtain satisfactory classification performance. In order to reduce the HSI data dimensionality in preparation for high classification accuracy, it is proposed to combine a band selection method based on artificial immune systems (AIS) with a hybrid-kernel support vector machine (SVM-HK) algorithm. After comparing different kernels for hyperspectral analysis, the approach mixes the radial basis function kernel (RBF-K) with the sigmoid kernel (Sig-K) and applies the optimized hybrid kernels in SVM classifiers. The SVM-HK algorithm is then used to guide the band selection of an improved version of the AIS. The AIS is composed of clonal selection and elite antibody mutation, including an evaluation process with an optional index factor (OIF). Experimental classification was performed on a hyperspectral dataset of the San Diego Naval Base acquired by AVIRIS; the results show that the method is able to efficiently remove band redundancy while outperforming the traditional SVM classifier.
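
    The hybrid-kernel idea (a weighted mixture of RBF and sigmoid kernels fed to an SVM) can be sketched as below with a synthetic stand-in dataset; the mixing weight and kernel parameters are arbitrary illustrative values, and the AIS-based band selection is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def hybrid_kernel(X, Y, gamma=0.5, alpha=0.7, a=0.01, c=1.0):
    """Weighted mixture of an RBF kernel and a sigmoid kernel.
    alpha and the kernel parameters here are illustrative assumptions."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    rbf = np.exp(-gamma * sq)
    sig = np.tanh(a * (X @ Y.T) + c)
    return alpha * rbf + (1 - alpha) * sig

# Stand-in for a band-selected hyperspectral pixel dataset
X, y = make_classification(n_samples=400, n_features=30, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# scikit-learn SVC accepts a callable kernel that returns the Gram matrix
clf = SVC(kernel=hybrid_kernel)
print(cross_val_score(clf, X, y, cv=5).mean())
```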

  7. Feasibility of Genetic Algorithm for Textile Defect Classification Using Neural Network

    Directory of Open Access Journals (Sweden)

    Md. Tarek Habib

    2012-07-01

    Full Text Available The global market for textile industry is highly competitive nowadays. Quality control in production process in textile industry has been a key factor for retaining existence in such competitive market. Automated textile inspection systems are very useful in this respect, because manual inspection is time consuming and not accurate enough. Hence, automated textile inspection systems have been drawing plenty of attention of the researchers of different countries in order to replace manual inspection. Defect detection and defect classification are the two major problems that are posed by the research of automated textile inspection systems. In this paper, we perform an extensive investigation on the applicability of genetic algorithm (GA in the context of textile defect classification using neural network (NN. We observe the effect of tuning different network parameters and explain the reasons. We empirically find a suitable NN model in the context of textile defect classification. We compare the performance of this model with that of the classification models implemented by others.

  8. Hybrid Ant Bee Algorithm for Fuzzy Expert System Based Sample Classification.

    Science.gov (United States)

    GaneshKumar, Pugalendhi; Rani, Chellasamy; Devaraj, Durairaj; Victoire, T Aruldoss Albert

    2014-01-01

    Accuracy maximization and complexity minimization are the two main goals of fuzzy-expert-system-based microarray data classification. Our previous Genetic Swarm Algorithm (GSA) approach improved the classification accuracy of the fuzzy expert system at the cost of interpretability. The if-then rules produced by the GSA are lengthy and complex, which is difficult for the physician to understand. To address this interpretability-accuracy tradeoff, the rule set is represented using integer numbers and the task of rule generation is treated as a combinatorial optimization task. Ant colony optimization (ACO) with local and global pheromone updates is applied to find the fuzzy partition based on the gene expression values for generating a simpler rule set. In order to address the formless and continuous expression values of a gene, this paper employs the artificial bee colony (ABC) algorithm to evolve the points of the membership function. Mutual information is used for identification of informative genes. The performance of the proposed hybrid Ant Bee Algorithm (ABA) is evaluated using six gene expression data sets. From the simulation study, it is found that the proposed approach generated an accurate fuzzy system with highly interpretable and compact rules for all the data sets when compared with other approaches. PMID:26355782

  9. Unraveling cognitive traits using the Morris water maze unbiased strategy classification (MUST-C) algorithm.

    Science.gov (United States)

    Illouz, Tomer; Madar, Ravit; Louzon, Yoram; Griffioen, Kathleen J; Okun, Eitan

    2016-02-01

    The assessment of spatial cognitive learning in rodents is a central approach in neuroscience, as it enables one to assess and quantify the effects of treatments and genetic manipulations from a broad perspective. Although the Morris water maze (MWM) is a well-validated paradigm for testing spatial learning abilities, manual categorization of performance in the MWM into behavioral strategies is subject to individual interpretation, and thus to biases. Here we offer a support vector machine (SVM) - based, automated, MWM unbiased strategy classification (MUST-C) algorithm, as well as a cognitive score scale. This model was examined and validated by analyzing data obtained from five MWM experiments with changing platform sizes, revealing a limitation in the spatial capacity of the hippocampus. We have further employed this algorithm to extract novel mechanistic insights on the impact of members of the Toll-like receptor pathway on cognitive spatial learning and memory. The MUST-C algorithm can greatly benefit MWM users as it provides a standardized method of strategy classification as well as a cognitive scoring scale, which cannot be derived from typical analysis of MWM data. PMID:26522398

  10. Hybrid Medical Image Classification Using Association Rule Mining with Decision Tree Algorithm

    CERN Document Server

    Rajendran, P

    2010-01-01

    The main focus of image mining in the proposed method is the classification of brain tumors in CT scan brain images. The major steps involved in the system are: pre-processing, feature extraction, association rule mining and a hybrid classifier. The pre-processing step has been done using a median filtering process, and edge features have been extracted using the Canny edge detection technique. Two image mining approaches are combined in a hybrid manner in this paper. The frequent patterns from the CT scan images are generated by the frequent pattern tree (FP-Tree) algorithm, which mines the association rules. The decision tree method has been used to classify the medical images for diagnosis. This system enhances the classification process to be more accurate. The hybrid method improves the efficiency of the proposed method over traditional image mining methods. The experimental result on a prediagnosed database of brain images showed 97% sensitivity and 95% accuracy, respectively. The ph...

  11. Classification and authentication of unknown water samples using machine learning algorithms.

    Science.gov (United States)

    Kundu, Palash K; Panchariya, P C; Kundu, Madhusree

    2011-07-01

    This paper proposes the development of water sample classification and authentication, for real-life use, based on machine learning algorithms. The proposed techniques used experimental measurements from a pulse voltammetry method based on an electronic tongue (E-tongue) instrumentation system with silver and platinum electrodes. E-tongues include arrays of solid-state ion sensors, transducers (even of different types), data collectors and data analysis tools, all oriented to the classification of liquid samples and the authentication of unknown liquid samples. The time series signal and the corresponding raw data represent the measurement from a multi-sensor system. The E-tongue system, implemented in a laboratory environment for six different ISI (Bureau of Indian Standards) certified water samples (Aquafina, Bisleri, Kingfisher, Oasis, Dolphin, and McDowell), was the data source for developing two types of machine learning algorithms, for classification and regression. A water data set consisting of six sample classes with 4402 features was considered. A PCA (principal component analysis) based classification and authentication tool was developed in this study as the machine learning component of the E-tongue system. A partial least squares (PLS) based classifier, dedicated to authenticating a specific category of water sample, evolved as an integral part of the E-tongue instrumentation system. The developed PCA and PLS based E-tongue system delivered encouraging overall authentication accuracy, with excellent performance for the aforesaid categories of water samples. PMID:21507400

  12. A fast version of the k-means classification algorithm for astronomical applications

    OpenAIRE

    Ordovás-Pascual, I.; Sánchez Almeida, J.

    2014-01-01

    [Context]: K-means is a clustering algorithm that has been used to classify large datasets in astronomical databases. It is an unsupervised method, able to cope with very different types of problems. [Aims]: We check whether a variant of the algorithm called single-pass k-means can be used as a fast alternative to the traditional k-means. [Methods]: The execution times of the two algorithms are compared when classifying subsets drawn from the SDSS-DR7 catalog of galaxy spectra. [Results]: Single-pa...
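
    The single-pass variant can be illustrated as an online k-means that visits each sample exactly once and updates only the nearest centroid with a running mean; the toy two-dimensional data below stand in for reduced galaxy spectra and are not the SDSS-DR7 subsets used in the paper.

```python
import numpy as np

def single_pass_kmeans(X, k, seed=0):
    """One sweep of online k-means: each sample updates only its nearest
    centroid via an incremental mean, so the data are visited exactly once."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    counts = np.ones(k)
    labels = np.empty(len(X), dtype=int)
    for i, x in enumerate(X):
        j = np.argmin(np.sum((centroids - x) ** 2, axis=1))
        counts[j] += 1
        centroids[j] += (x - centroids[j]) / counts[j]   # incremental mean update
        labels[i] = j
    return centroids, labels

# Toy stand-in for a set of spectra reduced to two features
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (500, 2)),
               rng.normal(2, 0.3, (500, 2)),
               rng.normal((0, 3), 0.3, (500, 2))])
centroids, labels = single_pass_kmeans(X, k=3)
print(np.round(centroids, 2))
```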

  13. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which system normalization, PCA, and multilevel grid search methods are comprehensively considered for data preprocessing and parameter optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, specificity, precision, the ROC curve, and so forth are adopted to appraise the performance of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkably higher efficiency compared with traditional SVM algorithms.
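
    One way to realize a normalization + PCA + multilevel (coarse-to-fine) grid search around an SVM, sketched with scikit-learn on a stand-in dataset, is shown below; the parameter ranges, PCA dimension and dataset are illustrative assumptions, not the PMSVM configuration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
pipe = make_pipeline(StandardScaler(), PCA(n_components=10), SVC())

# Level 1: coarse grid over wide parameter ranges
coarse = {"svc__C": 10.0 ** np.arange(-2, 5),
          "svc__gamma": 10.0 ** np.arange(-5, 2)}
search = GridSearchCV(pipe, coarse, cv=5).fit(X, y)
C0, g0 = search.best_params_["svc__C"], search.best_params_["svc__gamma"]

# Level 2: finer grid centred on the coarse optimum
fine = {"svc__C": C0 * 2.0 ** np.arange(-2, 3),
        "svc__gamma": g0 * 2.0 ** np.arange(-2, 3)}
search = GridSearchCV(pipe, fine, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```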

  14. E-mail Spam Classification With Artificial Neural Network and Negative Selection Algorithm

    OpenAIRE

    Ismaila Idris

    2011-01-01

    This paper applies a neural network and a spam model based on the negative selection algorithm to solve complex problems in spam detection. This is achieved by distinguishing spam from non-spam (self from non-self). We propose an optimized technique for e-mail classification; the e-mails are classified as self and non-self, and redundancy was removed from the detector set in the previous research to generate a self and non-self detector memory. A vector with an array of two element self and non-self...
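
    The negative selection idea (detectors are generated at random and kept only if they do not match any 'self', i.e. non-spam, sample) can be sketched as follows; the two-dimensional feature space, radius and detector count are invented for illustration, and the neural network stage is omitted.

```python
import numpy as np

def train_detectors(self_set, n_detectors, radius, seed=0):
    """Negative selection: keep only random detectors that do NOT lie within
    `radius` of any 'self' (non-spam) training sample."""
    rng = np.random.default_rng(seed)
    detectors = []
    while len(detectors) < n_detectors:
        d = rng.random(self_set.shape[1])
        if np.min(np.linalg.norm(self_set - d, axis=1)) > radius:
            detectors.append(d)
    return np.array(detectors)

def is_spam(x, detectors, radius):
    """An e-mail is flagged as spam (non-self) if any detector covers it."""
    return bool(np.any(np.linalg.norm(detectors - x, axis=1) <= radius))

# Toy feature vectors in [0, 1]^2 (e.g. two normalised spam-indicator scores)
rng = np.random.default_rng(1)
non_spam = rng.uniform(0.0, 0.4, size=(200, 2))           # the "self" region
detectors = train_detectors(non_spam, n_detectors=50, radius=0.2)
print(is_spam(rng.uniform(0.0, 0.4, 2), detectors, 0.2))  # likely False (self-like)
print(is_spam(rng.uniform(0.7, 1.0, 2), detectors, 0.2))  # likely True (non-self)
```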

  15. Generation of a Supervised Classification Algorithm for Time-Series Variable Stars with an Application to the LINEAR Dataset

    CERN Document Server

    Johnston, Kyle B

    2016-01-01

    With the advent of digital astronomy, new benefits and new problems have been presented to the modern day astronomer. While data can be captured in a more efficient and accurate manner using digital means, the efficiency of data retrieval has led to an overload of scientific data for processing and storage. This paper will focus on the construction and application of a supervised pattern classification algorithm for the identification of variable stars. Given the reduction of a survey of stars into a standard feature space, the problem of using prior patterns to identify new observed patterns can be reduced to time-tested classification methodologies and algorithms. Such supervised methods, so called because the user trains the algorithms prior to application using patterns with known classes or labels, provide a means to probabilistically determine the estimated class type of new observations. This paper will demonstrate the construction and application of a supervised classification algorithm on variable sta...

  16. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  18. A Template Matching Approach to Classification of QAM Modulation using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Negar ahmadi

    2009-11-01

    Full Text Available The automatic recognition of the modulation format of a detected signal, the intermediate step between signal detection and demodulation, is a major task of an intelligent receiver, with various civilian and military applications. Obviously, with no knowledge of the transmitted data and many unknown parameters at the receiver, such as the signal power, carrier frequency and phase offsets, timing information, etc., blind identification of the modulation is a difficult task. This becomes even more challenging in real-world conditions. In this paper, modulation classification for QAM is performed by a genetic algorithm followed by template matching, considering the constellation of the received signal. In addition, this classification finds the decision boundary of the signal, which is critical information for bit detection. We have proposed and implemented a technique that casts modulation recognition into shape recognition. The constellation diagram is a traditional and powerful tool for the design and evaluation of digital modulations. The simulation results show the capability of this method for modulation classification with high accuracy and appropriate convergence in the presence of noise.
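
    Stripped of the genetic-algorithm search, the template-matching core (compare the received constellation against ideal QAM templates and pick the closest) can be sketched as below; the candidate orders, noise level and normalisation are illustrative assumptions.

```python
import numpy as np

def qam_constellation(order):
    """Ideal square-QAM constellation points, normalised to unit average power."""
    m = int(np.sqrt(order))
    levels = np.arange(-(m - 1), m, 2)
    points = np.array([complex(i, q) for i in levels for q in levels])
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

def classify_qam(symbols, candidate_orders=(4, 16, 64)):
    """Template matching: pick the constellation whose nearest-point
    distances to the received symbols are smallest on average."""
    symbols = symbols / np.sqrt(np.mean(np.abs(symbols) ** 2))   # power normalise
    costs = {}
    for order in candidate_orders:
        ref = qam_constellation(order)
        d = np.abs(symbols[:, None] - ref[None, :]).min(axis=1)
        costs[order] = d.mean()
    return min(costs, key=costs.get), costs

# Simulate noisy 16-QAM symbols and recover the modulation order
rng = np.random.default_rng(0)
ref16 = qam_constellation(16)
tx = ref16[rng.integers(0, 16, size=2000)]
rx = tx + 0.05 * (rng.standard_normal(2000) + 1j * rng.standard_normal(2000))
order, costs = classify_qam(rx)
print(order, {k: round(v, 3) for k, v in costs.items()})
```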

  19. Fast Algorithm for Vectorcardiogram and Interbeat Intervals Analysis: Application for Premature Ventricular Contractions Classification

    Directory of Open Access Journals (Sweden)

    Irena Jekova

    2005-12-01

    Full Text Available In this study we investigated the adequacy of two non-orthogonal ECG leads from Holter recordings to provide reliable vectorcardiogram (VCG parameters. The VCG loop was constructed using the QRS samples in a fixed-size window around the fiducial point. We developed an algorithm for fast approximation of the VCG loop, estimation of its area and calculation of relative VCG characteristics, which are expected to be minimally dependent on the patient individuality and the ECG recording conditions. Moreover, in order to obtain independent from the heart rate temporal QRS characteristics, we introduced a parameter for estimation of the differences of the interbeat RR intervals. The statistical assessment of the proposed VCG and RR interval parameters showed distinguishing distributions for N and PVC beats. The reliability for PVC detection of the extracted parameter set was estimated independently with two classification methods - a stepwise discriminant analysis and a decision-tree-like classification algorithm, using the publicly available MIT-BIH arrhythmia database. The accuracy achieved with the stepwise discriminant analysis presented sensitivity of 91% and specificity of 95.6%, while the decision-tree-like technique assured sensitivity of 93.3% and specificity of 94.6%. We suggested possibilities for accuracy improvement with adequate electrodes placement of the Holter leads, supplementary analysis of the type of the predominant beats in the reference VCG matrix and smaller step for VCG loop approximation.

  20. A HYBRID CLASSIFICATION ALGORITHM TO CLASSIFY ENGINEERING STUDENTS’ PROBLEMS AND PERKS

    Directory of Open Access Journals (Sweden)

    Mitali Desai

    2016-03-01

    Full Text Available Social networking sites have brought a new horizon for expressing the views and opinions of individuals. Moreover, they provide a medium for students to share their sentiments, including struggles and joy, during the learning process. Such informal information provides a great venue for decision making. The large and growing scale of this information needs automatic classification techniques. Sentiment analysis is one of the automated techniques to classify large data. The existing predictive sentiment analysis techniques are widely used to classify reviews on e-commerce sites to provide business intelligence. However, they are not very useful for drawing decisions in the education system, since they classify the sentiments into merely three pre-set categories: positive, negative and neutral. Moreover, classifying the students’ sentiments into positive or negative categories does not provide deeper insight into their problems and perks. In this paper, we propose a novel Hybrid Classification Algorithm to classify engineering students’ sentiments. Unlike traditional predictive sentiment analysis techniques, the proposed algorithm makes the sentiment analysis process descriptive. Moreover, it classifies engineering students’ perks in addition to problems into several categories to help future students and the education system in decision making.

  1. Discrimination of Rice Varieties using LS-SVM Classification Algorithms and Hyperspectral Data

    Directory of Open Access Journals (Sweden)

    Jin Xiaming

    2015-03-01

    Full Text Available Fast discrimination of rice varieties plays a key role in the rice processing industry and benefits the management of rice in the supermarket. In order to discriminate rice varieties in a fast and nondestructive way, hyperspectral technology and several classification algorithms were used in this study. The hyperspectral data of 250 rice samples of 5 varieties were obtained using a FieldSpec®3 spectrometer. Multiplicative Scatter Correction (MSC) was used to preprocess the raw spectra. Principal Component Analysis (PCA) was used to reduce the dimension of the raw spectra. To investigate the influence of different linear and non-linear classification algorithms on the discrimination results, K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Least Square Support Vector Machine (LS-SVM) were used to develop the discrimination models respectively. Then the performances of these three multivariate classification methods were compared according to the discrimination accuracy. The number of Principal Components (PCs), the K parameter of KNN, and the kernel function of SVM or LS-SVM were optimized by cross-validation in the corresponding models. One hundred and twenty-five rice samples (25 of each variety) were chosen as the calibration set and the remaining 125 rice samples formed the prediction set. The experiment results showed that the optimal number of PCs was 8 and the cross-validation accuracies of KNN (K = 2), SVM and LS-SVM were 94.4, 96.8 and 100%, respectively, while the prediction accuracies of KNN (K = 2), SVM and LS-SVM were 89.6, 93.6 and 100%, respectively. The results indicated that LS-SVM performed the best in the discrimination of rice varieties.
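
    A sketch of the kind of pipeline the abstract outlines (MSC preprocessing, PCA to 8 components, then classifiers compared by cross-validation), using synthetic data as a stand-in for the 250 rice spectra; LS-SVM has no standard scikit-learn implementation, so only KNN and an RBF SVM are shown here:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    def msc(spectra):
        """Multiplicative scatter correction: regress each spectrum on the mean
        spectrum and remove the fitted slope and offset."""
        ref = spectra.mean(axis=0)
        corrected = np.empty_like(spectra)
        for i, s in enumerate(spectra):
            slope, intercept = np.polyfit(ref, s, 1)
            corrected[i] = (s - intercept) / slope
        return corrected

    # hypothetical stand-in for 250 spectra of 5 rice varieties
    X, y = make_classification(n_samples=250, n_features=200, n_informative=20,
                               n_classes=5, n_clusters_per_class=1, random_state=0)
    Xc = msc(X)
    for name, clf in [("KNN (K=2)", KNeighborsClassifier(n_neighbors=2)),
                      ("SVM (RBF)", SVC(kernel="rbf", gamma="scale"))]:
        model = make_pipeline(PCA(n_components=8), clf)
        print(name, cross_val_score(model, Xc, y, cv=5).mean())
    ```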

  2. Development of an algorithm for heartbeats detection and classification in Holter records based on temporal and morphological features

    Science.gov (United States)

    García, A.; Romano, H.; Laciar, E.; Correa, R.

    2011-12-01

    In this work a detection and classification algorithm for heartbeat analysis in Holter records was developed. First, a QRS complex detector was implemented and the temporal and morphological characteristics of the complexes were extracted. A vector was built with these features; this vector is the input of the classification module, based on discriminant analysis. The beats were classified into three groups: Premature Ventricular Contraction beat (PVC), Atrial Premature Contraction beat (APC) and Normal Beat (NB). These beat categories represent the most important groups for commercial Holter systems. The developed algorithms were evaluated on 76 ECG records from two validated open-access databases, the "arrhythmias MIT BIH database" and the "MIT BIH supraventricular arrhythmias database". A total of 166343 beats were detected and analyzed, where the QRS detection algorithm provides a sensitivity of 99.69 % and a positive predictive value of 99.84 %. The classification stage gives sensitivities of 97.17% for NB, 97.67% for PVC and 92.78% for APC.
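
    A brief sketch of the classification stage described above, training a discriminant-analysis model on one feature vector per beat; synthetic features stand in for the temporal and morphological QRS descriptors, and the three classes mimic NB, PVC and APC:

    ```python
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    # hypothetical stand-in for per-beat temporal/morphological features (3 classes: NB, PVC, APC)
    X, y = make_classification(n_samples=900, n_features=12, n_informative=6,
                               n_classes=3, n_clusters_per_class=1, random_state=0)
    lda = LinearDiscriminantAnalysis()
    print("cross-validated accuracy:", cross_val_score(lda, X, y, cv=5).mean())
    ```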

  3. Development of an algorithm for heartbeats detection and classification in Holter records based on temporal and morphological features

    International Nuclear Information System (INIS)

    In this work a detection and classification algorithm for heartbeat analysis in Holter records was developed. First, a QRS complex detector was implemented and the temporal and morphological characteristics of the complexes were extracted. A vector was built with these features; this vector is the input of the classification module, based on discriminant analysis. The beats were classified into three groups: Premature Ventricular Contraction beat (PVC), Atrial Premature Contraction beat (APC) and Normal Beat (NB). These beat categories represent the most important groups for commercial Holter systems. The developed algorithms were evaluated on 76 ECG records from two validated open-access databases, the 'arrhythmias MIT BIH database' and the 'MIT BIH supraventricular arrhythmias database'. A total of 166343 beats were detected and analyzed, where the QRS detection algorithm provides a sensitivity of 99.69 % and a positive predictive value of 99.84 %. The classification stage gives sensitivities of 97.17% for NB, 97.67% for PVC and 92.78% for APC.

  4. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine

    Science.gov (United States)

    Gao, Fei; Mei, Jingyuan; Sun, Jinping; Wang, Jun; Yang, Erfu; Hussain, Amir

    2015-01-01

    For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes cannot adequately address this problem due to the lack of a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on an incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computational complexity is reduced effectively. In addition, a detailed analysis is also carried out for the possible appearance of new labeled samples in the learning process. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of the unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment. PMID:26275294
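
    A generic self-training loop in the spirit of the approach sketched above: start from a small labeled pool ("soft start") and, at each round, add only the unlabeled samples the current SVM predicts with high confidence. The confidence threshold and synthetic data are assumptions; the authors' actual data selection and cleaning mechanisms are more elaborate:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y_true = make_classification(n_samples=300, n_features=10, random_state=0)
    y = y_true.copy()
    labeled = np.zeros(len(y), dtype=bool)
    labeled[:30] = True                              # small initial labeled pool ("soft start")

    clf = SVC(kernel="rbf", gamma="scale", probability=True, random_state=0)
    for _ in range(5):
        clf.fit(X[labeled], y[labeled])
        if labeled.all():
            break
        proba = clf.predict_proba(X[~labeled])
        confident = proba.max(axis=1) > 0.95         # data selection: keep only confident predictions
        idx = np.flatnonzero(~labeled)[confident]
        if idx.size == 0:
            break
        y[idx] = clf.predict(X[idx])                 # adopt semi-labels (cleaning/verification omitted)
        labeled[idx] = True

    print("labeled pool size after self-training:", int(labeled.sum()))
    ```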

  5. MODIS Collection 6 shortwave-derived cloud phase classification algorithm and comparisons with CALIOP

    Science.gov (United States)

    Marchant, Benjamin; Platnick, Steven; Meyer, Kerry; Arnold, G. Thomas; Riedi, Jérôme

    2016-04-01

    Cloud thermodynamic phase (ice, liquid, undetermined) classification is an important first step for cloud retrievals from passive sensors such as MODIS (Moderate Resolution Imaging Spectroradiometer). Because ice and liquid phase clouds have very different scattering and absorbing properties, an incorrect cloud phase decision can lead to substantial errors in the cloud optical and microphysical property products such as cloud optical thickness or effective particle radius. Furthermore, it is well established that ice and liquid clouds have different impacts on the Earth's energy budget and hydrological cycle; thus, accurately monitoring the spatial and temporal distribution of these clouds is of continued importance. For MODIS Collection 6 (C6), the shortwave-derived cloud thermodynamic phase algorithm used by the optical and microphysical property retrievals has been completely rewritten to improve the phase discrimination skill for a variety of cloudy scenes (e.g., thin/thick clouds, over ocean/land/desert/snow/ice surfaces, etc.). To evaluate the performance of the C6 cloud phase algorithm, extensive granule-level and global comparisons have been conducted against the heritage C5 algorithm and CALIOP. A wholesale improvement is seen for C6 compared to C5.

  6. Classification of Atrial Septal Defect and Ventricular Septal Defect with Documented Hemodynamic Parameters via Cardiac Catheterization by Genetic Algorithms and Multi-Layered Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Mustafa Yıldız

    2012-08-01

    Full Text Available Introduction: We aimed to develop a classification method to discriminate ventricular septal defect from atrial septal defect by using several hemodynamic parameters. Patients and Methods: Forty-three patients (30 atrial septal defect, 13 ventricular septal defect; 26 female, 17 male) with documented hemodynamic parameters via cardiac catheterization were included in the study. Parameters such as blood pressure values of different areas, gender, age and Qp/Qs ratios were used for classification. The parameters used in classification were determined by the divergence analysis method. Those parameters are: (i) pulmonary artery diastolic pressure, (ii) Qp/Qs ratio, (iii) right atrium pressure, (iv) age, (v) pulmonary artery systolic pressure, (vi) left ventricular systolic pressure, (vii) aorta mean pressure, (viii) left ventricular diastolic pressure, (ix) aorta diastolic pressure, (x) aorta systolic pressure. The parameters measured from our study population were fed into a multi-layered artificial neural network, and the network was trained by a genetic algorithm. Results: The training cluster consists of 14 cases (7 atrial septal defect and 7 ventricular septal defect). The overall success ratio is 79.2%, and with proper training of the artificial neural network this ratio increases up to 89%. Conclusion: Parameters of the artificial neural network, which in classical methods need to be determined by the investigator, can easily be determined with the help of genetic algorithms. During the training of the artificial neural network by genetic algorithms, both the topology of the network and its factors can be determined. During the test stage, elements not included in the training cluster are assigned to the test cluster, and as a result of this study we observed that a multi-layered artificial neural network can be trained properly and that a neural network is a successful method for the aimed classification.

  7. HOS network-based classification of power quality events via regression algorithms

    Science.gov (United States)

    Palomares Salas, José Carlos; González de la Rosa, Juan José; Sierra Fernández, José María; Pérez, Agustín Agüera

    2015-12-01

    This work compares seven regression algorithms implemented in artificial neural networks (ANNs), supported by 14 power-quality features based on higher-order statistics. Time- and frequency-domain estimators are combined to deal with non-stationary measurement sequences; the final goal of the system is implementation in the future smart grid to guarantee compatibility between all connected equipment. The principal results are based on spectral kurtosis measurements, which easily adapt to the impulsive nature of power quality events. These results verify that the proposed technique is capable of offering interesting results for power quality (PQ) disturbance classification. The best results are obtained using radial basis networks, generalized regression, and multilayer perceptrons, mainly due to the non-linear nature of the data.

  8. Support vector machines and evolutionary algorithms for classification single or together?

    CERN Document Server

    Stoean, Catalin

    2014-01-01

    When discussing classification, support vector machines are known to be a capable and efficient technique to learn and predict with high accuracy within a quick time frame. Yet their black-box way of doing so makes practical users quite circumspect about relying on them, without much understanding of the how and why of the predictions. The question raised in this book is how this ‘masked hero’ can be made more comprehensible and friendly to the public: provide a surrogate model for its hidden optimization engine, replace the method completely, or appoint a friendlier approach to tag along and offer the much desired explanations? Evolutionary algorithms can do all of these, and this book presents such possibilities of achieving high accuracy, comprehensibility, reasonable runtime as well as unconstrained performance.

  9. E-mail Spam Classification With Artificial Neural Network and Negative Selection Algorithm

    Directory of Open Access Journals (Sweden)

    Ismaila Idris

    2011-12-01

    Full Text Available This paper applies a neural network and a spam model based on the negative selection algorithm to solve complex problems in spam detection. This is achieved by distinguishing spam from non-spam (self from non-self). We propose an optimized technique for e-mail classification: the e-mails are classified as self and non-self, with redundancy removed from the detector set in our previous research to generate a self and non-self detector memory. A vector with two elements, the self and non-self concentrations, is generated as a feature vector and used as input to a neural network classifier to classify the self and non-self feature vectors. The hybridization of the neural network and our previous model further enhances our spam detector by improving the false detection rate and also gives the two different detectors a uniform platform for an effective performance rate.

  10. Content-based and Algorithmic Classifications of Journals: Perspectives on the Dynamics of Scientific Communication and Indexer Effects

    CERN Document Server

    Rafols, Ismael

    2008-01-01

    The aggregated journal-journal citation matrix, based on the Journal Citation Reports (JCR) of the Science Citation Index, can be decomposed by indexers and/or algorithmically. In this study, we test the results of two recently available algorithms for the decomposition of large matrices against two content-based classifications of journals: the ISI Subject Categories and the field/subfield classification of Glaenzel & Schubert (2003). The content-based schemes allow for the attribution of more than a single category to a journal, whereas the algorithms maximize the ratio of within-category citations over between-category citations in the aggregated category-category citation matrix. By adding categories, indexers generate between-category citations, which may enrich the database, for example, in the case of inter-disciplinary developments. The consequent indexer effects are significant in sparse areas of the matrix more than in denser ones. Algorithmic decompositions, on the other hand, are more heavily ...

  11. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty is that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions. PMID:26484222

  12. Development and validation of a computerized algorithm for International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI)

    DEFF Research Database (Denmark)

    Walden, K; Bélanger, L M; Biering-Sørensen, F;

    2016-01-01

    STUDY DESIGN: Validation study. OBJECTIVES: To describe the development and validation of a computerized application of the international standards for neurological classification of spinal cord injury (ISNCSCI). SETTING: Data from acute and rehabilitation care. METHODS: The Rick Hansen Institute......-ISNCSCI Algorithm (RHI-ISNCSCI Algorithm) was developed based on the 2011 version of the ISNCSCI and the 2013 version of the worksheet. International experts developed the design and logic with a focus on usability and features to standardize the correct classification of challenging cases. A five-phased process...... was used to develop and validate the algorithm. Discrepancies between the clinician-derived and algorithm-calculated results were reconciled. RESULTS: Phase one of the validation used 48 cases to develop the logic. Phase three used these and 15 additional cases for further logic development to...

  13. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty is that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  14. Aims & Scope

    Institute of Scientific and Technical Information of China (English)

    2014-01-01

    Asian Pacific Journal of Tropical Biomedicine(APJTB)aims to set up and provide an international academic communication platform for physicians,medical scientists,allied health scientists and public health workers,especially those in the Asian Pacific region and worldwide on tropical biomedicine,infectious diseases and public health,and to meet the growing challenges

  15. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  16. Aims & Scope

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    Asian Pacific Journal of Tropical Biomedicine(APJTB)aims to set up and provide an international academic communication platform for physicians,medical scientists,allied health scientists and public health workers,especially those in the Asian Pacific region and worldwide on tropical biomedicine,infectious diseases and public health,and to meet the growing challenges of understanding,preventing and controlling the dramatic global emergence and reemergence of infectious diseases in the Asian Pacific region.

  17. Aims & Scope

    Institute of Scientific and Technical Information of China (English)

    2015-01-01

    Asian Pacific Journal of Tropical Biomedicine(APJTB)aims to set up and provide an international academic communication platform for physicians,medical scientists,allied health scientists and public health workers,especially those in the Asian Pacific region and worldwide on tropical biomedicine,infectious diseases and public health,and to meet the growing challenges of understanding,preventing and controlling the dramatic global emergence and reemergence of infectious diseases in the Asian Pacific region.

  18. The CR‐Ω+ Classification Algorithm for Spatio‐Temporal Prediction of Criminal Activity

    Directory of Open Access Journals (Sweden)

    S. Godoy‐Calderón

    2010-04-01

    Full Text Available We present a spatio-temporal prediction model that allows forecasting of criminal activity behavior in a particular region by using supervised classification. The degree of membership of each pattern is interpreted as the forecasted increase or decrease in the criminal activity for the specified time and location. The proposed forecasting model (CR-Ω+) is based on the family of Kora-Ω logical-combinatorial algorithms operating on large data volumes from several heterogeneous sources using an inductive learning process. We propose several modifications to the original algorithms by Bongard and by Baskakova and Zhuravlëv which improve the prediction performance on the studied dataset of criminal activity. We perform two analyses, punctual prediction and tendency analysis, which show that it is possible to punctually predict one of four crimes to be perpetrated (crime family) in a specific space and time, with 66% effectiveness in the prediction of the place of crime, despite the noise of the dataset. The tendency analysis yielded an STRMSE (spatio-temporal RMSE) of less than 1.0.

  19. Binary classification SVM-based algorithms with interval-valued training data using triangular and Epanechnikov kernels.

    Science.gov (United States)

    Utkin, Lev V; Chekh, Anatoly I; Zhuk, Yulia A

    2016-08-01

    Classification algorithms based on different forms of support vector machines (SVMs) for dealing with interval-valued training data are proposed in the paper. L2-norm and L∞-norm SVMs are used for constructing the algorithms. The main idea allowing us to represent the complex optimization problems as a set of simple linear or quadratic programming problems is to approximate the Gaussian kernel by the well-known triangular and Epanechnikov kernels. The minimax strategy is used to choose an optimal probability distribution from the set and to construct optimal separating functions. Numerical experiments illustrate the algorithms. PMID:27179616
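
    A small sketch of the kernel-substitution idea mentioned above, replacing the Gaussian kernel with distance-based triangular and Epanechnikov kernels through scikit-learn's precomputed-kernel interface. The kernel definitions and the bandwidth h are simplified assumptions, and the interval-valued/minimax machinery of the paper is not reproduced:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def triangular_gram(A, B, h=3.0):
        # K(x, y) = max(0, 1 - ||x - y|| / h)
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
        return np.maximum(0.0, 1.0 - d / h)

    def epanechnikov_gram(A, B, h=3.0):
        # K(x, y) = max(0, 1 - (||x - y|| / h) ** 2)
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
        return np.maximum(0.0, 1.0 - (d / h) ** 2)

    X, y = make_classification(n_samples=200, n_features=5, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
    for name, gram in [("triangular", triangular_gram), ("epanechnikov", epanechnikov_gram)]:
        clf = SVC(kernel="precomputed").fit(gram(Xtr, Xtr), ytr)
        print(name, "test accuracy:", clf.score(gram(Xte, Xtr), yte))
    ```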

  20. Land Use Classification using Support Vector Machine and Maximum Likelihood Algorithms by Landsat 5 TM Images

    Directory of Open Access Journals (Sweden)

    Abbas TAATI

    2015-08-01

    Full Text Available Nowadays, remote sensing images have been identified and exploited as the latest source of information for studying land cover and land use. These digital images are of significant importance, since they can present timely information and are capable of providing land use maps. The aim of this study is to create a land use classification using a support vector machine (SVM) and a maximum likelihood classifier (MLC) in Qazvin, Iran, from TM images of the Landsat 5 satellite. In the pre-processing stage, the necessary corrections were applied to the images. In order to evaluate the accuracy of the two algorithms, the overall accuracy and kappa coefficient were used. The evaluation results verified that the SVM algorithm, with an overall accuracy of 86.67 % and a kappa coefficient of 0.82, has a higher accuracy than the MLC algorithm in land use mapping. Therefore, this algorithm has been suggested as an optimal classifier for the extraction of land use maps due to its higher accuracy and better consistency within the study area.
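
    The accuracy measures named above are straightforward to reproduce; a short sketch computing the confusion matrix, overall accuracy and kappa coefficient for hypothetical reference and classified labels of validation pixels:

    ```python
    import numpy as np
    from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

    # hypothetical reference (ground truth) and classified labels for a few validation pixels
    y_ref = np.array([0, 0, 1, 1, 2, 2, 2, 1, 0, 2])
    y_map = np.array([0, 1, 1, 1, 2, 2, 0, 1, 0, 2])

    print("confusion matrix:\n", confusion_matrix(y_ref, y_map))
    print("overall accuracy:", accuracy_score(y_ref, y_map))
    print("kappa coefficient:", cohen_kappa_score(y_ref, y_map))
    ```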

  1. A Novel User Classification Method for Femtocell Network by Using Affinity Propagation Algorithm and Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Afaz Uddin Ahmed

    2014-01-01

    Full Text Available An artificial neural network (ANN) and affinity propagation (AP) algorithm based user categorization technique is presented. The proposed algorithm is designed for closed access femtocell networks. The ANN is used for the user classification process and the AP algorithm is used to optimize the ANN training process. AP selects the best possible training samples for a faster ANN training cycle. The users are distinguished by using the difference of received signal strength in a multielement femtocell device. A previously developed directive microstrip antenna is used to configure the femtocell device. Simulation results show that, for a particular house pattern, the categorization technique without the AP algorithm takes 5 indoor users and 10 outdoor users to attain error-free operation. When integrating the AP algorithm with the ANN, the system takes 60% fewer training samples, reducing the training time by up to 50%. This procedure makes the femtocell more effective for closed access operation.

  2. The Self-Directed Violence Classification System and the Columbia Classification Algorithm for Suicide Assessment: A Crosswalk

    Science.gov (United States)

    Matarazzo, Bridget B.; Clemans, Tracy A.; Silverman, Morton M.; Brenner, Lisa A.

    2013-01-01

    The lack of a standardized nomenclature for suicide-related thoughts and behaviors prompted the Centers for Disease Control and Prevention, with the Veterans Integrated Service Network 19 Mental Illness Research Education and Clinical Center, to create the Self-Directed Violence Classification System (SDVCS). SDVCS has been adopted by the…

  3. Verdict Accuracy of Quick Reduct Algorithm using Clustering and Classification Techniques for Gene Expression Data

    Directory of Open Access Journals (Sweden)

    T.Chandrasekhar

    2012-01-01

    Full Text Available In most gene expression data, the number of training samples is very small compared to the large number of genes involved in the experiments. However, among the large amount of genes, only a small fraction is effective for performing a certain task. Furthermore, a small subset of genes is desirable in developing gene expression based diagnostic tools for delivering reliable and understandable results. With the gene selection results, the cost of biological experiment and decision can be greatly reduced by analyzing only the marker genes. An important application of gene expression data in functional genomics is to classify samples according to their gene expression profiles. Feature selection (FS) is a process which attempts to select more informative features. It is one of the important steps in knowledge discovery. Conventional supervised FS methods evaluate various feature subsets using an evaluation function or metric to select only those features which are related to the decision classes of the data under consideration. This paper studies a feature selection method based on rough set theory. Further, the K-Means and Fuzzy C-Means (FCM) algorithms have been implemented for the reduced feature set without considering class labels. Then the obtained results are compared with the original class labels. A Back Propagation Network (BPN) has also been used for classification. Then the performance of K-Means, FCM, and BPN is analyzed through the confusion matrix. It is found that the BPN performs comparatively well.

  4. Business Analysis and Decision Making Through Unsupervised Classification of Mixed Data Type of Attributes Through Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Rohit Rastogi

    2014-01-01

    Full Text Available Grouping, or unsupervised classification, has a variety of demands, the major one being the capability of the chosen clustering approach to deal with scalability and to handle data sets with a mixed variety of attribute types. Data sets contain a variety of attribute types such as categorical/nominal, ordinal, binary (symmetric or asymmetric), ratio and interval scaled variables. In the present scenario, the latest approaches to unsupervised classification are swarm optimization based, customer segmentation based, soft computing methods such as fuzzy based and GA based, entropy based methods, and hierarchical approaches. These approaches have two serious bottlenecks: either they are hybrid mathematical techniques or they demand heavy computation, which increases their complexity and hence compromises accuracy. It is easy to verify by comparison and analysis that unsupervised classification by a genetic algorithm is feasible, suitable and efficient for high-dimensional data sets with mixed data values obtained from real-life results, events and happenings.

  5. A Novel Algorithm for Fault Classification on Transmission Lines using a Combined Adaptive Network-based Fuzzy Inference System

    Energy Technology Data Exchange (ETDEWEB)

    Yeo, S.M.; Kim, C.H. [Sungkyunkwan University (Korea); Chai, Y.M. [Chungju National University (Korea); Choi, J.D. [Daelim College (Korea)

    2001-07-01

    Accurate detection and classification of faults on transmission lines is vitally important. High impedance faults (HIFs) in particular pose difficulties for the commonly employed conventional overcurrent and distance relays and, if not detected, can cause damage to expensive equipment, threaten life and cause fire hazards. Although HIFs are far less common than low impedance faults (LIFs), it is imperative that any protection device should be able to deal satisfactorily with both HIFs and LIFs. This paper proposes an algorithm for fault detection and classification for both LIFs and HIFs using an Adaptive Network-based Fuzzy Inference System (ANFIS). The performance of the proposed algorithm is tested on a typical 154[kV] Korean transmission line system under various fault conditions. Test results show that the ANFIS can detect and classify faults (both LIFs and HIFs) accurately within half a cycle. (author). 11 refs., 7 figs., 3 tabs.

  6. Classification of Aerosol Retrievals from Spaceborne Polarimetry Using a Multi-Parameter Algorithm

    Science.gov (United States)

    Russell, P. B.; Kacenelenbogen, M. S.; Livingston, J. M.; Hasekamp, O.; Burton, S. P.; Schuster, G. L.; Redemann, J.; Ramachandran, S.; Holben, B. N.

    2013-12-01

    In this presentation we demonstrate application of a new aerosol classification algorithm to retrievals from the POLDER-3 polarimeter on the PARASOL spacecraft. Motivation and method: Since the development of global aerosol measurements by satellites and AERONET, classification of observed aerosols into several types (e.g., urban-industrial, biomass burning, mineral dust, maritime, and various subtypes or mixtures of these) has proven useful to: understanding aerosol sources, transformations, effects, and feedback mechanisms; improving accuracy of satellite retrievals; and quantifying assessments of aerosol radiative impacts on climate. With ongoing improvements in satellite measurement capability, the number of aerosol parameters retrieved from spaceborne sensors has been growing, from the initial aerosol optical depth at one or a few wavelengths to a list that now includes complex refractive index, single scattering albedo (SSA), and depolarization of backscatter, each at several wavelengths; wavelength dependences of extinction, scattering, absorption, SSA, and backscatter; and several particle size and shape parameters. Making optimal use of these varied data products requires objective, multi-dimensional analysis methods. We describe such a method, which uses a modified Mahalanobis distance to quantify how far a data point described by N aerosol parameters is from each of several prespecified classes. The method makes explicit use of uncertainties in input parameters, treating a point and its N-dimensional uncertainty as an extended data point or pseudo-cluster E. It then uses a modified Mahalanobis distance, DEC, to assign that observation to the class (cluster) C that has minimum DEC from the point (equivalently, the class to which the point has maximum probability of belonging). The method also uses Wilks' overall lambda to indicate how well the input data lend themselves to separation into classes and Wilks' partial lambda to indicate the relative
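
    A bare-bones sketch of the assignment step described above: compute a Mahalanobis distance from an N-parameter observation to each prespecified class and pick the nearest class. The class means and covariances below are placeholders, and the paper's modified distance additionally folds the observation's own uncertainty into the covariance, which is only hinted at by the optional x_cov argument:

    ```python
    import numpy as np

    def mahalanobis_assign(x, class_means, class_covs, x_cov=None):
        """Return the index of the class with minimum Mahalanobis distance from x.
        A modified distance can account for the observation's uncertainty by adding
        its covariance x_cov to each class covariance before inverting."""
        best_class, best_dist = None, np.inf
        for c, (mu, cov) in enumerate(zip(class_means, class_covs)):
            total_cov = cov if x_cov is None else cov + x_cov
            diff = np.asarray(x) - np.asarray(mu)
            dist = float(np.sqrt(diff @ np.linalg.inv(total_cov) @ diff))
            if dist < best_dist:
                best_class, best_dist = c, dist
        return best_class, best_dist

    # toy example with two classes in a 2-parameter space
    means = [np.array([0.1, 1.5]), np.array([1.0, 0.2])]
    covs = [np.diag([0.05, 0.30]), np.diag([0.20, 0.05])]
    print(mahalanobis_assign([0.3, 1.2], means, covs, x_cov=np.diag([0.01, 0.01])))
    ```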

  7. The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists

    OpenAIRE

    Huang, Da Wei; Sherman, Brad T; Tan, Qina; Collins, Jack R; Alvord, W. Gregory; Roayaei, Jean; Stephens, Robert; Baseler, Michael W; Lane, H. Clifford; Lempicki, Richard A.

    2007-01-01

    The DAVID Gene Functional Classification Tool uses a novel agglomeration algorithm to condense a list of genes or associated biological terms into organized classes of related genes or biology, called biological modules. This organization is accomplished by mining the complex biological co-occurrences found in multiple sources of functional annotation. It is a powerful method to group functionally related genes and terms into a manageable number of biological modules for efficient interpretat...

  8. A zero-training algorithm for EEG single-trial classification applied to a face recognition ERP experiment.

    Science.gov (United States)

    Lage-Castellanos, Agustin; Nieto, Juan I; Quiñones, Ileana; Martinez-Montes, Eduardo

    2010-01-01

    This paper proposes a machine learning based approach to discriminate between EEG single trials of two experimental conditions in a face recognition experiment. The algorithm works using a single-trial EEG database of multiple subjects and thus does not require subject-specific training data. This approach supports the idea that zero-training classification and on-line detection Brain Computer Interface (BCI) systems are areas with a significant amount of potential. PMID:21096895

  9. A global aerosol classification algorithm incorporating multiple satellite data sets of aerosol and trace gas abundances

    Science.gov (United States)

    Penning de Vries, M. J. M.; Beirle, S.; Hörmann, C.; Kaiser, J. W.; Stammes, P.; Tilstra, L. G.; Tuinder, O. N. E.; Wagner, T.

    2015-09-01

    Detecting the optical properties of aerosols using passive satellite-borne measurements alone is a difficult task due to the broadband effect of aerosols on the measured spectra and the influences of surface and cloud reflection. We present another approach to determine aerosol type, namely by studying the relationship of aerosol optical depth (AOD) with trace gas abundance, aerosol absorption, and mean aerosol size. Our new Global Aerosol Classification Algorithm, GACA, examines relationships between aerosol properties (AOD and extinction Ångström exponent from the Moderate Resolution Imaging Spectroradiometer (MODIS), UV Aerosol Index from the second Global Ozone Monitoring Experiment, GOME-2) and trace gas column densities (NO2, HCHO, SO2 from GOME-2, and CO from MOPITT, the Measurements of Pollution in the Troposphere instrument) on a monthly mean basis. First, aerosol types are separated based on size (Ångström exponent) and absorption (UV Aerosol Index), then the dominating sources are identified based on mean trace gas columns and their correlation with AOD. In this way, global maps of dominant aerosol type and main source type are constructed for each season and compared with maps of aerosol composition from the global MACC (Monitoring Atmospheric Composition and Climate) model. Although GACA cannot correctly characterize transported or mixed aerosols, GACA and MACC show good agreement regarding the global seasonal cycle, particularly for urban/industrial aerosols. The seasonal cycles of both aerosol type and source are also studied in more detail for selected 5° × 5° regions. Again, good agreement between GACA and MACC is found for all regions, but some systematic differences become apparent: the variability of aerosol composition (yearly and/or seasonal) is often not well captured by MACC, the amount of mineral dust outside of the dust belt appears to be overestimated, and the abundance of secondary organic aerosols is underestimated in comparison

  10. Novel round-robin tabu search algorithm for prostate cancer classification and diagnosis using multispectral imagery.

    Science.gov (United States)

    Tahir, Muhammad Atif; Bouridane, Ahmed

    2006-10-01

    Quantitative cell imagery in cancer pathology has progressed greatly in the last 25 years. The application areas are mainly those in which the diagnosis is still critically reliant upon the analysis of biopsy samples, which remains the only conclusive method for making an accurate diagnosis of the disease. Biopsies are usually analyzed by a trained pathologist who, by analyzing the biopsies under a microscope, assesses the normality or malignancy of the samples submitted. Different grades of malignancy correspond to different structural patterns as well as to apparent textures. In the case of prostate cancer, four major groups have to be recognized: stroma, benign prostatic hyperplasia, prostatic intraepithelial neoplasia, and prostatic carcinoma. Recently, multispectral imagery has been used to solve this multiclass problem. Unlike conventional RGB color space, multispectral images allow the acquisition of a large number of spectral bands within the visible spectrum, resulting in a large feature vector size. For such a high dimensionality, pattern recognition techniques suffer from the well-known "curse-of-dimensionality" problem. This paper proposes a novel round-robin tabu search (RR-TS) algorithm to address the curse-of-dimensionality for this multiclass problem. The experiments have been carried out on a number of prostate cancer textured multispectral images, and the results obtained have been assessed and compared with previously reported works. The system achieved 98%-100% classification accuracy when testing on two datasets. It outperformed principal component/linear discriminant classifier (PCA-LDA), tabu search/nearest neighbor classifier (TS-1NN), and bagging/boosting with decision tree (C4.5) classifier. PMID:17044412

  11. Classification-based summation of cerebral digital subtraction angiography series for image post-processing algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Schuldhaus, D; Spiegel, M; Polyanskaya, M; Hornegger, J [Pattern Recognition Lab, University Erlangen-Nuremberg (Germany); Redel, T [Siemens AG Healthcare Sector, Forchheim (Germany); Struffert, T; Doerfler, A, E-mail: martin.spiegel@informatik.uni-erlangen.de [Department of Neuroradiology, University Erlangen-Nuremberg (Germany)

    2011-03-21

    X-ray-based 2D digital subtraction angiography (DSA) plays a major role in the diagnosis, treatment planning and assessment of cerebrovascular disease, i.e. aneurysms, arteriovenous malformations and intracranial stenosis. DSA information is increasingly used for secondary image post-processing such as vessel segmentation, registration and comparison to hemodynamic calculation using computational fluid dynamics. Depending on the amount of injected contrast agent and the duration of injection, these DSA series may not exhibit one single DSA image showing the entire vessel tree. The interesting information for these algorithms, however, is usually depicted within a few images. If these images were combined into one image, the complexity of segmentation or registration methods using DSA series would decrease drastically. In this paper, we propose a novel method automatically splitting a DSA series into three parts, i.e. mask, arterial and parenchymal phase, to provide one final image showing all important vessels with less noise and fewer motion artifacts. This final image covers all arterial phase images, either by image summation or by taking the minimum intensities. The phase classification is done by a two-step approach. The mask/arterial phase border is determined by a Perceptron-based method trained from a set of DSA series. The arterial/parenchymal phase border is specified by a threshold-based method. The evaluation of the proposed method is two-sided: (1) comparison between automatic and medical expert-based phase selection and (2) the quality of the final image is measured by gradient magnitudes inside the vessels and the signal-to-noise ratio (SNR) outside. Experimental results show a match between expert and automatic phase separation of 93%/50% and an average SNR increase of up to 182% compared to summing up the entire series.
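
    A tiny sketch of the final image-combination step described above, assuming the arterial-phase frame indices have already been found by the two-step phase classification; the frame data and indices here are hypothetical:

    ```python
    import numpy as np

    def combine_arterial_phase(series, start, end, mode="min"):
        """Combine the arterial-phase frames of a subtracted DSA series into one image.
        series: (n_frames, H, W) array; in DSA, contrast-filled vessels appear dark."""
        phase = series[start:end]
        if mode == "min":
            # minimum-intensity projection keeps the darkest (most opacified) value per pixel
            return phase.min(axis=0)
        # summation of the arterial-phase frames reduces relative noise
        return phase.sum(axis=0)

    series = np.random.default_rng(0).normal(100.0, 5.0, size=(20, 64, 64))
    final_image = combine_arterial_phase(series, start=5, end=12, mode="min")
    print(final_image.shape)
    ```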

  12. Confident Predictability: Identifying reliable gene expression patterns for individualized tumor classification using a local minimax kernel algorithm

    Directory of Open Access Journals (Sweden)

    Berry Damon

    2011-01-01

    Full Text Available Abstract Background Molecular classification of tumors can be achieved by global gene expression profiling. Most machine learning classification algorithms furnish global error rates for the entire population. A few algorithms provide an estimate of probability of malignancy for each queried patient but the degree of accuracy of these estimates is unknown. On the other hand local minimax learning provides such probability estimates with best finite sample bounds on expected mean squared error on an individual basis for each queried patient. This allows a significant percentage of the patients to be identified as confidently predictable, a condition that ensures that the machine learning algorithm possesses an error rate below the tolerable level when applied to the confidently predictable patients. Results We devise a new learning method that implements: (i) feature selection using the k-TSP algorithm and (ii) classifier construction by local minimax kernel learning. We test our method on three publicly available gene expression datasets and achieve significantly lower error rate for a substantial identifiable subset of patients. Our final classifiers are simple to interpret and they can make prediction on an individual basis with an individualized confidence level. Conclusions Patients that were predicted confidently by the classifiers as cancer can receive immediate and appropriate treatment whilst patients that were predicted confidently as healthy will be spared from unnecessary treatment. We believe that our method can be a useful tool to translate the gene expression signatures into clinical practice for personalized medicine.

  13. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays relies on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on the Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  14. A genetic algorithm based wrapper feature selection method for classification of hyperspectral images using support vector machine

    Science.gov (United States)

    Zhuo, Li; Zheng, Jing; Li, Xia; Wang, Fang; Ai, Bin; Qian, Junping

    2008-10-01

    The high-dimensional feature vectors of hyperspectral data often impose a high computational cost as well as the risk of "overfitting" when classification is performed. Therefore it is necessary to reduce the dimensionality through ways like feature selection. Currently, there are two kinds of feature selection methods: filter methods and wrapper methods. The former kind requires no feedback from classifiers and estimates the classification performance indirectly. The latter kind evaluates the "goodness" of the selected feature subset directly based on the classification accuracy. Many experimental results have proved that the wrapper methods can yield better performance, although they have the disadvantage of high computational cost. In this paper, we present a Genetic Algorithm (GA) based wrapper method for classification of hyperspectral data using a Support Vector Machine (SVM), a state-of-the-art classifier that has found success in a variety of areas. The genetic algorithm (GA), which seeks to solve optimization problems using the methods of evolution, specifically survival of the fittest, was used to optimize both the feature subset, i.e. band subset, of the hyperspectral data and the SVM kernel parameters simultaneously. A special strategy was adopted to reduce the computation cost caused by the high-dimensional feature vectors of hyperspectral data when the feature subset part of the chromosome was designed. The GA-SVM method was realized using the ENVI/IDL language, and was then tested by applying it to a HYPERION hyperspectral image. Comparison of the optimized results and the un-optimized results showed that the GA-SVM method could significantly reduce the computation cost while improving the classification accuracy. The number of bands used for classification was reduced from 198 to 13, while the classification accuracy increased from 88.81% to 92.51%. The optimized values of the two SVM kernel parameters were 95.0297 and 0.2021, respectively, which were different from the
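
    A compact sketch of a GA wrapper in the spirit of the method summarized above: each chromosome is a binary band mask and the fitness is the cross-validated accuracy of an SVM trained on the selected bands. The data are synthetic, the GA operators are deliberately simple, and unlike the paper the SVM kernel parameters are kept fixed rather than co-evolved:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=120, n_features=40, n_informative=8, random_state=0)

    def fitness(mask):
        """Wrapper criterion: 3-fold CV accuracy of an RBF SVM on the selected bands."""
        if mask.sum() == 0:
            return 0.0
        return cross_val_score(SVC(kernel="rbf", C=10.0, gamma="scale"),
                               X[:, mask.astype(bool)], y, cv=3).mean()

    pop = rng.integers(0, 2, size=(20, X.shape[1]))          # binary chromosomes: one bit per band
    for generation in range(10):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][:10]]          # truncation selection
        children = []
        for _ in range(10):
            a, b = parents[rng.integers(10)], parents[rng.integers(10)]
            cut = rng.integers(1, X.shape[1])                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child = np.where(rng.random(X.shape[1]) < 0.02, 1 - child, child)  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])

    best = max(pop, key=fitness)
    print("selected bands:", np.flatnonzero(best), "fitness:", round(fitness(best), 3))
    ```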

  15. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

    Directory of Open Access Journals (Sweden)

    Arran Schlosberg

    2014-05-01

    Full Text Available Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV.
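
    The compression idea above is easy to illustrate: DEFLATE (via Python's zlib) compresses a highly conserved alignment column much better than a variable one, so the compressed size can serve as a proxy for expected cross-species variation. This toy sketch shows only that step, not the full Grantham adjustment or clustering of the CompressGV method:

    ```python
    import zlib

    def deflate_size(column_residues):
        """Compressed size (bytes) of one MSA column, using DEFLATE at maximum level."""
        return len(zlib.compress("".join(column_residues).encode("ascii"), 9))

    # hypothetical alignment columns across ~40 species
    conserved_column = ["R"] * 40
    variable_column = list("RKQHEDNSTAGVLIPFMWYC") * 2

    print("conserved:", deflate_size(conserved_column), "bytes")
    print("variable: ", deflate_size(variable_column), "bytes")
    ```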

  16. An Automated Cropland Classification Algorithm (ACCA for Tajikistan by Combining Landsat, MODIS, and Secondary Data

    Directory of Open Access Journals (Sweden)

    Prasad S. Thenkabail

    2012-09-01

    Full Text Available The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan using mega file data cubes (MFDCs) involving data from Landsat Global Land Survey (GLS), Landsat Enhanced Thematic Mapper Plus (ETM+) 30 m, Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m time-series, a suite of secondary data (e.g., elevation, slope, precipitation, temperature), and in situ data. First, the process involved producing an accurate reference (or truth) cropland layer (TCL), consisting of cropland extent, areas, and irrigated vs. rainfed cropland areas, for the entire country of Tajikistan based on MFDC of year 2005 (MFDC2005). The methods involved in producing TCL included using ISOCLASS clustering, Tasseled Cap bi-spectral plots, spectro-temporal characteristics from MODIS 250 m monthly normalized difference vegetation index (NDVI) maximum value composites (MVC) time-series, and textural characteristics of higher resolution imagery. The TCL statistics accurately matched with the national statistics of Tajikistan for irrigated and rainfed croplands, where about 70% of croplands were irrigated and the rest rainfed. Second, a rule-based ACCA was developed to replicate the TCL accurately (~80% producer’s and user’s accuracies or within 20% quantity disagreement involving about 10 million Landsat 30 m sized cropland pixels of Tajikistan). Development of ACCA was an iterative process involving series of rules that are coded, refined, tweaked, and re-coded till ACCA derived croplands (ACLs) match accurately with TCLs. Third, the ACCA derived

  17. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by combining Landsat, MODIS, and secondary data

    Science.gov (United States)

    Thenkabail, Prasad S.; Wu, Zhuoting

    2012-01-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan using mega file data cubes (MFDCs) involving data from Landsat Global Land Survey (GLS), Landsat Enhanced Thematic Mapper Plus (ETM+) 30 m, Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m time-series, a suite of secondary data (e.g., elevation, slope, precipitation, temperature), and in situ data. First, the process involved producing an accurate reference (or truth) cropland layer (TCL), consisting of cropland extent, areas, and irrigated vs. rainfed cropland areas, for the entire country of Tajikistan based on MFDC of year 2005 (MFDC2005). The methods involved in producing TCL included using ISOCLASS clustering, Tasseled Cap bi-spectral plots, spectro-temporal characteristics from MODIS 250 m monthly normalized difference vegetation index (NDVI) maximum value composites (MVC) time-series, and textural characteristics of higher resolution imagery. The TCL statistics accurately matched with the national statistics of Tajikistan for irrigated and rainfed croplands, where about 70% of croplands were irrigated and the rest rainfed. Second, a rule-based ACCA was developed to replicate the TCL accurately (~80% producer’s and user’s accuracies or within 20% quantity disagreement involving about 10 million Landsat 30 m sized cropland pixels of Tajikistan). Development of ACCA was an iterative process involving series of rules that are coded, refined, tweaked, and re-coded till ACCA derived croplands (ACLs) match accurately with TCLs. Third, the ACCA derived cropland

  18. Feasibility of Genetic Algorithm for Textile Defect Classification Using Neural Network

    Directory of Open Access Journals (Sweden)

    Md. Tarek Habib

    2012-08-01

    Full Text Available The global market for the textile industry is highly competitive nowadays. Quality control in the production process of the textile industry has been a key factor for survival in such a competitive market. Automated textile inspection systems are very useful in this respect, because manual inspection is time-consuming and not accurate enough. Hence, automated textile inspection systems have been drawing plenty of attention from researchers of different countries seeking to replace manual inspection. Defect detection and defect classification are the two major problems posed by the research on automated textile inspection systems. In this paper, we perform an extensive investigation of the applicability of the genetic algorithm (GA) in the context of textile defect classification using a neural network (NN). We observe the effect of tuning different network parameters and explain the reasons. We empirically find a suitable NN model in the context of textile defect classification. We compare the performance of this model with that of the classification models implemented by others.

  19. Development and comparative assessment of Raman spectroscopic classification algorithms for lesion discrimination in stereotactic breast biopsies with microcalcifications.

    Science.gov (United States)

    Dingari, Narahara Chari; Barman, Ishan; Saha, Anushree; McGee, Sasha; Galindo, Luis H; Liu, Wendy; Plecha, Donna; Klein, Nina; Dasari, Ramachandra Rao; Fitzmaurice, Maryann

    2013-04-01

    Microcalcifications are an early mammographic sign of breast cancer and a target for stereotactic breast needle biopsy. Here, we develop and compare different approaches for developing Raman classification algorithms to diagnose invasive and in situ breast cancer, fibrocystic change and fibroadenoma that can be associated with microcalcifications. In this study, Raman spectra were acquired from tissue cores obtained from fresh breast biopsies and analyzed using a constituent-based breast model. Diagnostic algorithms based on the breast model fit coefficients were devised using logistic regression, C4.5 decision tree classification, k-nearest neighbor (k-NN) and support vector machine (SVM) analysis, and subjected to leave-one-out cross validation. The best performing algorithm was based on SVM analysis (with radial basis function), which yielded a positive predictive value of 100% and negative predictive value of 96% for cancer diagnosis. Importantly, these results demonstrate that Raman spectroscopy provides adequate diagnostic information for lesion discrimination even in the presence of microcalcifications, which to the best of our knowledge has not been previously reported. PMID:22815240
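
    A short sketch of the evaluation protocol named above: an RBF SVM assessed with leave-one-out cross-validation and summarized by positive and negative predictive values. Synthetic features stand in for the breast-model fit coefficients, and a binary cancer/non-cancer labeling is assumed for simplicity:

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import LeaveOneOut, cross_val_predict
    from sklearn.svm import SVC

    # hypothetical stand-in for per-specimen model fit coefficients (1 = cancer, 0 = benign)
    X, y = make_classification(n_samples=60, n_features=9, random_state=1)

    pred = cross_val_predict(SVC(kernel="rbf", gamma="scale"), X, y, cv=LeaveOneOut())
    tp = np.sum((pred == 1) & (y == 1)); fp = np.sum((pred == 1) & (y == 0))
    tn = np.sum((pred == 0) & (y == 0)); fn = np.sum((pred == 0) & (y == 1))
    print("PPV:", tp / (tp + fp), "NPV:", tn / (tn + fn))
    ```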

  20. Human Talent Prediction in HRM using C4.5 Classification Algorithm

    OpenAIRE

    Hamidah Jantan; Abdul Razak Hamdan; Zulaiha Ali Othman

    2010-01-01

    In HRM, one of the challenges for HR professionals is to manage an organization’s talents, especially to ensure the right person is in the right job at the right time. Human talent prediction is an alternative way to handle this issue. For that reason, classification and prediction in data mining, which are commonly used in many areas, can also be applied to human talent. There are many classification techniques in data mining, such as Decision Tree, Neural Network, Rough Set Theory, B...

  1. DOA Estimation of Low Altitude Target Based on Adaptive Step Glowworm Swarm Optimization-multiple Signal Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Zhou Hao

    2015-06-01

    Full Text Available The traditional MUltiple SIgnal Classification (MUSIC) algorithm requires significant computational effort and cannot be employed for the Direction Of Arrival (DOA) estimation of targets in a low-altitude multipath environment. As such, a novel MUSIC approach is proposed on the basis of the Adaptive Step Glowworm Swarm Optimization (ASGSO) algorithm. The virtual spatial smoothing of the matrix formed by each snapshot is used to realize the decorrelation of the multipath signal and the establishment of a full-order correlation matrix. ASGSO optimizes the function and estimates the elevation of the target. The simulation results suggest that the proposed method can overcome the low-altitude multipath effect and estimate the DOA of the target readily and precisely without radar effective aperture loss.
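
    For reference, a compact sketch of the standard MUSIC pseudospectrum for a uniform linear array, which is the kind of objective the paper optimizes (with a glowworm swarm instead of the usual grid search) after its virtual spatial smoothing step; the array geometry and snapshot data here are assumptions:

    ```python
    import numpy as np

    def music_spectrum(X, n_sources, spacing=0.5, angles_deg=np.linspace(-90, 90, 361)):
        """MUSIC pseudospectrum for a uniform linear array.
        X: (n_sensors, n_snapshots) complex snapshots; spacing in wavelengths."""
        n_sensors = X.shape[0]
        R = X @ X.conj().T / X.shape[1]                  # sample covariance matrix
        _, eigvecs = np.linalg.eigh(R)                   # eigenvalues in ascending order
        En = eigvecs[:, : n_sensors - n_sources]         # noise-subspace eigenvectors
        spectrum = np.empty(angles_deg.size)
        for i, theta in enumerate(np.deg2rad(angles_deg)):
            a = np.exp(-2j * np.pi * spacing * np.arange(n_sensors) * np.sin(theta))
            spectrum[i] = 1.0 / np.real(a.conj() @ En @ En.conj().T @ a)
        return angles_deg, spectrum

    # toy usage: one source at 12 degrees, 8-element array, 200 snapshots
    rng = np.random.default_rng(0)
    steer = np.exp(-2j * np.pi * 0.5 * np.arange(8) * np.sin(np.deg2rad(12.0)))
    X = np.outer(steer, rng.normal(size=200)) \
        + 0.1 * (rng.normal(size=(8, 200)) + 1j * rng.normal(size=(8, 200)))
    angles, p = music_spectrum(X, n_sources=1)
    print("estimated DOA:", angles[np.argmax(p)], "degrees")
    ```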

  2. Pharmacogenomics of drug efficacy in the interferon treatment of chronic hepatitis C using classification algorithms

    Directory of Open Access Journals (Sweden)

    Wan-Sheng Ke

    2010-06-01

    Full Text Available Wan-Sheng Ke (Department of Internal Medicine, Kuang Tien General Hospital, Taichung County, Taiwan); Yuchi Hwang, Eugene Lin (Vita Genomics, Inc., Jung-Shing Road, Wugu Shiang, Taipei, Taiwan). Abstract: Chronic hepatitis C (CHC) patients often stop pursuing interferon-alfa and ribavirin (IFN-alfa/RBV) treatment because of the high cost and associated adverse effects. It is highly desirable, both clinically and economically, to establish tools to distinguish responders from nonresponders and to predict possible outcomes of the IFN-alfa/RBV treatments. Single nucleotide polymorphisms (SNPs) can be used to understand the relationship between genetic inheritance and IFN-alfa/RBV therapeutic response. The aim of this study was to establish a predictive model based on a pharmacogenomic approach. Our study population comprised Taiwanese patients with CHC who were recruited from multiple sites in Taiwan. The genotyping data were generated in the high-throughput genomics lab of Vita Genomics, Inc. With the wrapper-based feature selection approach, we employed a multilayer feedforward neural network (MFNN) and logistic regression as a basis for comparison. Our data revealed that the MFNN models were superior to the logistic regression model. The MFNN approach provides an efficient way to develop a tool for distinguishing responders from nonresponders prior to treatment. Our preliminary results demonstrate that the MFNN algorithm is effective for deriving models for pharmacogenomics studies and for providing the link from clinical factors such as SNPs to the responsiveness of IFN-alfa/RBV in clinical association studies in pharmacogenomics. Keywords: chronic hepatitis C, artificial neural networks, interferon, pharmacogenomics, ribavirin, single nucleotide polymorphisms
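
    The wrapper-based feature selection mentioned above can be sketched as a sequential forward search around a classifier; the sketch below uses logistic regression from scikit-learn rather than the authors' MFNN, and the SNP genotype matrix and response labels are simulated placeholders.

        # Minimal sketch: wrapper-based (sequential forward) feature selection with a classifier
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        X = rng.integers(0, 3, size=(120, 20)).astype(float)  # SNP genotypes coded 0/1/2 (simulated)
        y = rng.integers(0, 2, size=120)                      # responder / nonresponder (simulated)

        selected, remaining, best_score = [], list(range(X.shape[1])), 0.0
        while remaining:
            scores = {f: cross_val_score(LogisticRegression(max_iter=1000),
                                         X[:, selected + [f]], y, cv=5).mean()
                      for f in remaining}
            f_best, s_best = max(scores.items(), key=lambda kv: kv[1])
            if s_best <= best_score:          # stop when no feature improves CV accuracy
                break
            selected.append(f_best)
            remaining.remove(f_best)
            best_score = s_best
        print("selected features:", selected, "CV accuracy:", round(best_score, 3))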

  3. Greedy heuristic algorithm for solving series of EEE components classification problems

    Science.gov (United States)

    Kazakovtsev, A. L.; Antamoshkin, A. N.; Fedosov, V. V.

    2016-04-01

    Algorithms based on agglomerative greedy heuristics demonstrate precise and stable results for clustering problems based on k-means and p-median models. Such algorithms are successfully implemented in the production of specialized EEE components for use in space systems, which includes testing each EEE device and detecting homogeneous production batches of EEE components from the test results using p-median models. In this paper, the authors propose a new version of the genetic algorithm with the greedy agglomerative heuristic which allows solving series of problems. Such an algorithm is useful for solving the k-means and p-median clustering problems when the number of clusters is unknown. Computational experiments on real data show that the preciseness of the result decreases insignificantly in comparison with the initial genetic algorithm for solving a single problem.

  4. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data.

    Science.gov (United States)

    Yang, Sheng; Guo, Li; Shao, Fang; Zhao, Yang; Chen, Feng

    2015-01-01

    Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genome Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes. PMID:26508990

  5. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data

    Directory of Open Access Journals (Sweden)

    Sheng Yang

    2015-01-01

    Full Text Available Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genome Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes.

  6. An enhanced algorithm for knee joint sound classification using feature extraction based on time-frequency analysis.

    Science.gov (United States)

    Kim, Keo Sik; Seo, Jeong Hwan; Kang, Jin U; Song, Chul Gyu

    2009-05-01

    Vibroarthrographic (VAG) signals, generated by human knee movement, are non-stationary and multi-component in nature and their time-frequency distribution (TFD) provides a powerful means to analyze such signals. The objective of this paper is to improve the classification accuracy of the features, obtained from the TFD of normal and abnormal VAG signals, using segmentation by the dynamic time warping (DTW) and denoising algorithm by the singular value decomposition (SVD). VAG and knee angle signals, recorded simultaneously during one flexion and one extension of the knee, were segmented and normalized at 0.5 Hz by the DTW method. Also, the noise within the TFD of the segmented VAG signals was reduced by the SVD algorithm, and a back-propagation neural network (BPNN) was used to classify the normal and abnormal VAG signals. The characteristic parameters of VAG signals consist of the energy, energy spread, frequency and frequency spread parameter extracted by the TFD. A total of 1408 segments (normal 1031, abnormal 377) were used for training and evaluating the BPNN. As a result, the average classification accuracy was 91.4% (standard deviation ±1.7%). The proposed method showed good potential for the non-invasive diagnosis and monitoring of joint disorders such as osteoarthritis. PMID:19217685
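
    The SVD-based denoising step can be illustrated with a rank-truncation sketch: keep only the leading singular components of the time-frequency matrix and discard the rest. The TFD matrix below is a random placeholder and the retained rank is an assumed value, not the setting used in the paper.

        # Minimal sketch: rank-truncation (SVD) denoising of a time-frequency matrix
        import numpy as np

        tfd = np.random.rand(128, 256)                 # time x frequency magnitude matrix (synthetic)
        U, s, Vt = np.linalg.svd(tfd, full_matrices=False)

        k = 10                                         # keep the k largest singular values (assumed)
        tfd_denoised = (U[:, :k] * s[:k]) @ Vt[:k, :]

        residual = np.linalg.norm(tfd - tfd_denoised) / np.linalg.norm(tfd)
        print(f"rank-{k} approximation, relative residual = {residual:.3f}")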

  7. A novel algorithm for fault classification in transmission lines using a combined adaptive network and fuzzy inference system

    Energy Technology Data Exchange (ETDEWEB)

    Yeo, S.M.; Kim, C.H.; Hong, K.S. [Sungkyunkwan Univ., Suwon (Korea). School of Information and Computer Engineering; Lim, Y.B. [LG Electronics CDMA Handsets Lab., Seoul (Korea); Aggarwal, R.K.; Johns, A.T. [University of Bath (United Kingdom). Dept. of Electronic and Electrical Engineering; Choi, M.S. [Myongji Univ., Yongin (Korea). Division of Electrical and Information Control Engineering

    2003-11-01

    Accurate detection and classification of faults on transmission lines is vitally important. In this respect, many different types of faults occur, inter alia low impedance faults (LIF) and high impedance faults (HIF). The latter in particular pose difficulties for the commonly employed conventional overcurrent and distance relays, and if not detected, can cause damage to expensive equipment, threaten life and cause fire hazards. Although HIFs are far less common than LIFs, it is imperative that any protection device should be able to satisfactorily deal with both HIFs and LIFs. Because of the randomness and asymmetric characteristics of HIFs, the modelling of HIF is difficult and many papers relating to various HIF models have been published. In this paper, the model of HIFs in transmission lines is accomplished using the characteristics of a ZnO arrester, which is then implemented within the overall transmission system model based on the electromagnetic transients programme. This paper proposes an algorithm for fault detection and classification for both LIFs and HIFs using an Adaptive Network-based Fuzzy Inference System (ANFIS). The inputs into ANFIS are current signals only, based on Root-Mean-Square values of the three-phase currents and zero sequence current. The performance of the proposed algorithm is tested on a typical 154 kV Korean transmission line system under various fault conditions. Test results show that the ANFIS can detect and classify faults (including LIFs and HIFs) accurately within half a cycle. (author)
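
    The input features described above (RMS of the three phase currents plus the zero-sequence current over a short window) can be computed as in the sketch below; the sampling rate, window length and simulated fault are assumptions for illustration only, not the paper's settings.

        # Minimal sketch: RMS features of three-phase and zero-sequence currents over a half cycle
        import numpy as np

        fs, f0 = 3840, 60                       # sampling rate and system frequency (assumed)
        t = np.arange(0, 0.1, 1 / fs)
        ia = np.sin(2 * np.pi * f0 * t)
        ib = np.sin(2 * np.pi * f0 * t - 2 * np.pi / 3)
        ic = 3.0 * np.sin(2 * np.pi * f0 * t + 2 * np.pi / 3)   # exaggerated c-phase fault (synthetic)

        def rms(x, n):                          # RMS over the latest n samples
            return np.sqrt(np.mean(x[-n:] ** 2))

        window = fs // (2 * f0)                 # half a cycle
        i0 = (ia + ib + ic) / 3.0               # zero-sequence current
        features = [rms(x, window) for x in (ia, ib, ic, i0)]
        print("RMS features [Ia, Ib, Ic, I0]:", np.round(features, 3))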

  8. A Leaf Recognition Algorithm for Plant Classification Using Probabilistic Neural Network

    CERN Document Server

    Wu, Stephen Gang; Xu, Eric You; Wang, Yu-Xuan; Chang, Yi-Fan; Xiang, Qiao-Liang

    2007-01-01

    In this paper, we employ a Probabilistic Neural Network (PNN) with image and data processing techniques to implement a general purpose automated leaf recognition algorithm. 12 leaf features are extracted and orthogonalized into 5 principal variables which constitute the input vector of the PNN. The PNN is trained by 1800 leaves to classify 32 kinds of plants with an accuracy greater than 90%. Compared with other approaches, our algorithm is an accurate artificial intelligence approach which is fast in execution and easy in implementation.
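
    The orthogonalization step described above corresponds to a PCA reduction from 12 features to 5 principal variables; a minimal sketch is shown below with a random stand-in for the leaf feature matrix.

        # Minimal sketch: reduce 12 leaf features to 5 principal components before classification
        import numpy as np
        from sklearn.decomposition import PCA

        leaf_features = np.random.rand(1800, 12)     # 1800 training leaves x 12 features (placeholder)
        pca = PCA(n_components=5)
        principal_vars = pca.fit_transform(leaf_features)

        print(principal_vars.shape)                  # (1800, 5) -> input vectors for the PNN
        print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))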

  9. A Comparison Study on Performance Analysis of Data Mining Algorithms in Classification of Local Area News Dataset using WEKA Tool.

    Directory of Open Access Journals (Sweden)

    G.Kesavaraj

    2013-10-01

    Full Text Available Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD [1]), a field at the intersection of computer science and statistics, is the process that attempts to discover patterns in large data sets. It utilizes methods from artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. It is commonly used in marketing, surveillance, fraud detection and scientific discovery, and is now gaining ground in social networking. Anything and everything on the Internet is fair game for extreme data mining practices. Social media covers all aspects of the social side of the Internet that allow us to make contact and share information with others as well as interact with any number of people in any place in the world. This paper uses the dataset “Local News Survey” from the Pew Research Center. The focus of the research is to explore the impact of the Internet on local news activities using data mining techniques. The original dataset contains 102 attributes, which is very large, and hence the essential attributes required for the analysis are selected by a feature reduction method. The selected attributes were applied to data mining classification algorithms such as RndTree, ID3, K-NN, C4.5 and CS-MC4. The error rates of the various classification algorithms were compared to bring out the best and most effective algorithm for this dataset.

  10. Active Learning Algorithms for the Classification of Hyperspectral Sea Ice Images

    OpenAIRE

    Yanling Han; Jing Ren; Zhonghua Hong; Yun Zhang; Long Zhang; Wanting Meng; Qiming Gu

    2015-01-01

    Sea ice is one of the most serious marine hazards, especially in polar and high-latitude regions. Hyperspectral imagery is well suited to monitoring sea ice, since it contains continuous spectral information and offers better target recognition capability. The principal bottleneck for the classification of hyperspectral imagery is the large number of labeled training samples required. However, the collection of labeled samples is time consuming and costly. In order to solve this problem, we appl...

  11. Combining and Comparing Multiple Algorithms for Better Learning and Classification: A Case Study of MARF

    OpenAIRE

    Mokhov, Serguei

    2010-01-01

    We presented an overview of MARF, a modular and extensible pattern recognition framework for a reasonably diverse spectrum of learning and recognition tasks. We outlined the pipeline and the data structures used in this open-source project in a practical manner. We provided some typical results one can obtain by running MARF’s implementations for various learning and classification problems. Advantages and disadvantages of the approach: the framework approach is both an advantage and...

  12. DETERMINATION OF OPTIMUM CLASSIFICATION SYSTEM FOR HYPERSPECTRAL IMAGERY AND LIDAR DATA BASED ON BEES ALGORITHM

    OpenAIRE

    Samadzadega, F.; H. Hasani

    2015-01-01

    Hyperspectral imagery is a rich source of spectral information and plays a very important role in discriminating similar land-cover classes. In the past, several efforts have been made to improve hyperspectral imagery classification. Recently, interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR provides structural information about the scene while hyperspectral imagery provides spectral and spatial information. The comp...

  13. Field-normalized citation impact indicators using algorithmically constructed classification systems of science

    OpenAIRE

    Waltman, Ludo; Ruiz-Castillo, Javier

    2014-01-01

    We study the problem of normalizing citation impact indicators for differences in citation practices across scientific fields. Normalization of citation impact indicators is usually done based on a field classification system. In practice, the Web of Science journal subject categories are often used for this purpose. However, many of these subject categories have a quite broad scope and are not sufficiently homogeneous in terms of citation practices. As an alternative, we propose ...

  14. Combinatorial Feature Optimization using Multi-objective Evolutionary Algorithms applied to a Biological Warfare Classification Problem.

    OpenAIRE

    2010-01-01

    Biological warfare is the aggressive use of organisms or toxins, also known as biological warfare agents. These weapons are invisible, odorless and tasteless and can be spread without a sound, making it difficult to detect an attack. Early warning systems based on environmental standoff detection of biological warfare agents using lidar technology require real-time signal processing, challenging the systems' efficiency in terms of both computational complexity and classification accuracy. Hence, ...

  15. A Fast Logdet Divergence Based Metric Learning Algorithm for Large Data Sets Classification

    OpenAIRE

    Jiangyuan Mei; Jian Hou; Jicheng Chen; Hamid Reza Karimi

    2014-01-01

    Large data sets classification is widely used in many industrial applications. It is a challenging task to classify large data sets efficiently, accurately, and robustly, as large data sets always contain numerous instances with high-dimensional feature spaces. In order to deal with this problem, in this paper we present an online Logdet divergence based metric learning (LDML) model by making use of the power of metric learning. We firstly generate a Mahalanobis matrix via l...

  16. REAL TIME CLASSIFICATION AND CLUSTERING OF IDS ALERTS USING MACHINE LEARNING ALGORITHMS

    Directory of Open Access Journals (Sweden)

    T. Subbulakshmi

    2010-01-01

    Full Text Available Intrusion Detection Systems (IDS) monitor a secured network for the evidence of malicious activities originating either inside or outside. Upon identifying suspicious traffic, IDS generates and logs an alert. Unfortunately, most of the alerts generated are either false positive, i.e. benign traffic that has been classified as intrusions, or irrelevant, i.e. attacks that are not successful. The abundance of false positive alerts makes it difficult for the security analyst to find successful attacks and take remedial action. This paper describes a two phase automatic alert classification system to assist the human analyst in identifying the false positives. In the first phase, the alerts collected from one or more sensors are normalized and similar alerts are grouped to form a meta-alert. These meta-alerts are passively verified with an asset database to find out irrelevant alerts. In addition, an optional alert generalization is also performed for root cause analysis and thereby reduces false positives with human interaction. In the second phase, the reduced alerts are labeled and passed to an alert classifier which uses machine learning techniques for building the classification rules. This helps the analyst in automatic classification of the alerts. The system is tested in real environments and found to be effective in reducing the number of alerts as well as false positives dramatically, and thereby reducing the workload of human analyst.

  17. Scalable Algorithms for Unsupervised Classification and Anomaly Detection in Large Geospatiotemporal Data Sets

    Science.gov (United States)

    Mills, R. T.; Hoffman, F. M.; Kumar, J.

    2015-12-01

    The increasing availability of high-resolution geospatiotemporal datasets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery and mining of ecological data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe some unsupervised knowledge discovery and anomaly detection approaches based on highly scalable parallel algorithms for k-means clustering and singular value decomposition, consider a few practical applications thereof to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.

  18. Direction of Arrival estimation using Multiple Signal Classification, Estimation of Signal Parameters via Rotational Invariance Techniques and Maximum-likelihood Algorithms for Antenna Arrays

    Directory of Open Access Journals (Sweden)

    Yuvaraja.T

    2015-12-01

    Full Text Available In this paper, a comparison of the performance of three well-known Direction of Arrival (DOA) algorithms is extensively studied: two eigenstructure-based methods, the Multiple Signal Classification (MUSIC) and the Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT), and a non-subspace method, maximum-likelihood estimation (MLE). The performance of these DOA estimation algorithms is evaluated on a Uniform Linear Array (ULA). We estimated various DOAs using MATLAB; the results show that the MUSIC algorithm is more accurate and stable compared to the ESPRIT and MLE algorithms.

  19. A Novel Detection and Classification Algorithm for Power Quality Disturbances using Wavelets

    Directory of Open Access Journals (Sweden)

    C. Sharmeela

    2006-01-01

    Full Text Available This study presents a novel method to detect and classify power quality disturbances using wavelets. The proposed algorithm uses different wavelets each for a particular class of disturbance. The method uses wavelet filter banks in an effective way and does multiple filtering to detect the disturbances. A qualitative comparison of results shows the advantages and drawbacks of each wavelet when applied to the detection of the disturbances. This method is tested for a large class of test conditions simulated in MATLAB. Power quality monitoring together with the ability of the proposed algorithm to classify the disturbances will be a powerful tool for the power system engineers.
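
    A minimal sketch of the idea of wavelet-based disturbance detection is shown below: decompose the signal with a discrete wavelet and flag windows whose finest-scale detail energy exceeds a simple threshold. The wavelet family, decomposition level, threshold rule and synthetic sag are assumptions for illustration, not the specific filter banks used in the paper (PyWavelets is assumed to be installed).

        # Minimal sketch: flag a power quality disturbance from wavelet detail coefficients
        import numpy as np
        import pywt

        fs, f0 = 6400, 50
        t = np.arange(0, 0.2, 1 / fs)
        v = np.sin(2 * np.pi * f0 * t)
        v[(t > 0.08) & (t < 0.12)] *= 0.5           # synthetic voltage sag

        coeffs = pywt.wavedec(v, "db4", level=4)
        detail1 = coeffs[-1]                        # finest-scale detail coefficients
        energy = detail1 ** 2
        threshold = 5 * np.median(energy)           # simple robust threshold (assumed)
        print("disturbance detected" if np.any(energy > threshold) else "no disturbance")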

  20. Remote Sensing Classification based on Improved Ant Colony Rules Mining Algorithm

    Directory of Open Access Journals (Sweden)

    Shuying Liu

    2014-09-01

    Full Text Available Data mining can uncover previously undetected relationships among data items using automated data analysis techniques. In data mining, association rule mining is a prevalent and well-researched method for discovering useful relations between variables in large databases. This paper examines the principle of traditional rule mining, which produces many non-essential candidate sets when reading data into candidate items. In particular, when dealing with massive data, if the minimum support and minimum confidence are relatively small, a combinatorial explosion of frequent item sets will occur and the computing power and storage space required are likely to exceed the limits of the machine. A new ant colony algorithm based on the conventional Ant-Miner algorithm is proposed and used in rule mining. The formula measuring the effectiveness of the rules is improved and a pheromone concentration update strategy is also introduced. The experimental results show that the proposed algorithm has lower execution time than the traditional algorithm and better accuracy.

  1. A comparison of two open source LiDAR surface classification algorithms

    Science.gov (United States)

    With the progression of LiDAR (Light Detection and Ranging) towards a mainstream resource management tool, it has become necessary to understand how best to process and analyze the data. While most ground surface identification algorithms remain proprietary and have high purchase costs, a few are op...

  2. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    importance in the projection (VIP) information of the DPLS method. The power of the gene selection method and the proposed supervised hierarchical clustering method is illustrated on a three microarray data sets of leukemia, breast, and colon cancer. Supervised machine learning algorithms thus enable the...... an important tool for clinical doctors....

  3. Mucinous Adenocarcinoma Involving the Ovary: Comparative Evaluation of the Classification Algorithms using Tumor Size and Laterality

    Science.gov (United States)

    Jung, Eun Sun; Bae, Jeong Hoon; Choi, Yeong Jin; Park, Jong-Sup; Lee, Kyo-Young

    2010-01-01

    For intraoperative consultation of mucinous adenocarcinoma involving the ovary, it would be useful to have additional approaches, beyond the traditional limited microscopic findings, for determining the nature of the tumors. Mucinous adenocarcinomas involving the ovaries were evaluated in 91 cases of metastatic mucinous adenocarcinomas and 19 cases of primary mucinous adenocarcinomas using both an original algorithm (unilateral ≥10 cm tumors were considered primary and unilateral <10 cm tumors or bilateral tumors were considered metastatic) and a modified cut-off size algorithm. With 10 cm, 13 cm, and 15 cm size cut-offs, the algorithm correctly classified primary and metastatic tumors in 82.7%, 87.3%, and 89.1% of cases and in 80.6%, 84.9%, and 87.1% of signet ring cell carcinoma (SRC) excluded cases. In total cases and SRC excluded cases, 98.0% and 97.2% of bilateral tumors were metastatic and 100% and 100% of unilateral tumors <10 cm were metastatic, respectively. In total cases and SRC excluded cases, 68.4% and 68.4% of unilateral tumors ≥15 cm were primary, respectively. The diagnostic algorithm using size and laterality, in addition to clinical history, preoperative image findings, and operative findings, is a useful adjunct tool for differentiation of metastatic mucinous adenocarcinomas from primary mucinous adenocarcinomas of the ovary. PMID:20119573
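
    The size-and-laterality rule stated in parentheses above is simple enough to express directly; the following sketch encodes the original 10 cm rule with the cut-off as a parameter so that the modified cut-offs can also be tried. The function name and interface are illustrative only.

        # Minimal sketch: the size-and-laterality classification rule described in the record
        def classify_ovarian_mucinous(bilateral: bool, size_cm: float, cutoff_cm: float = 10.0) -> str:
            """Unilateral tumors >= cutoff are called primary; bilateral tumors or
            smaller unilateral tumors are called metastatic."""
            if not bilateral and size_cm >= cutoff_cm:
                return "primary"
            return "metastatic"

        print(classify_ovarian_mucinous(bilateral=False, size_cm=14.0))                  # primary
        print(classify_ovarian_mucinous(bilateral=True, size_cm=14.0))                   # metastatic
        print(classify_ovarian_mucinous(bilateral=False, size_cm=6.0, cutoff_cm=13.0))   # metastatic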

  4. An application of the Self Organizing Map Algorithm to computer aided classification of ASTER multispectral data

    Directory of Open Access Journals (Sweden)

    Ferdinando Giacco

    2008-01-01

    Full Text Available In this paper we employ Kohonen's Self Organizing Map (SOM) as a strategy for an unsupervised analysis of ASTER multispectral (MS) images. In order to obtain an accurate clusterization, we introduce as input to the network, in addition to spectral data, some texture measures extracted from IKONOS images, which contributes to the classification of man-made structures. After clustering the SOM outcomes, we associated each cluster with a major land cover and compared them with prior knowledge of the scene analyzed.

  5. Evaluation of an algorithm based on single-condition decision rules for binary classification of 12-lead ambulatory ECG recording quality

    International Nuclear Information System (INIS)

    A new algorithm for classifying ECG recording quality based on the detection of commonly observed ECG contaminants which often render the ECG unusable for diagnostic purposes was evaluated. Contaminants (baseline drift, flat line, QRS-artefact, spurious spikes, amplitude stepwise changes, noise) were detected on individual leads from joint time-frequency analysis and QRS amplitude. Classification was based on cascaded single-condition decision rules (SCDR) that tested levels of contaminants against classification thresholds. A supervised learning classifier (SLC) was implemented for comparison. The SCDR and SLC algorithms were trained on an annotated database (Set A, PhysioNet Challenge 2011) of ‘acceptable’ versus ‘unacceptable’ quality recordings using the ‘leave M out’ approach with repeated random partitioning and cross-validation. Two training approaches were considered: (i) balanced, in which training records had equal numbers of ‘acceptable’ and ‘unacceptable’ recordings, (ii) unbalanced, in which the ratio of ‘acceptable’ to ‘unacceptable’ recordings from Set A was preserved. For each training approach, thresholds were calculated, and classification accuracy of the algorithm compared to other rule based algorithms and the SLC using a database for which classifications were unknown (Set B PhysioNet Challenge 2011). The SCDR algorithm achieved the highest accuracy (91.40%) compared to the SLC (90.40%) in spite of its simple logic. It also offers the advantage that it facilitates reporting of meaningful causes of poor signal quality to users. (paper)
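
    The cascaded single-condition decision rules (SCDR) described above amount to testing each contaminant level against its own threshold in turn, with any violation marking the recording as unacceptable. The sketch below illustrates this logic; the contaminant names and threshold values are hypothetical, not the trained thresholds from the paper.

        # Minimal sketch: cascaded single-condition decision rules for recording quality
        def classify_ecg_quality(levels: dict, thresholds: dict) -> str:
            for contaminant, threshold in thresholds.items():
                if levels.get(contaminant, 0.0) > threshold:   # first rule that fires decides
                    return "unacceptable"
            return "acceptable"

        thresholds = {"flat_line": 0.1, "baseline_drift": 0.5,
                      "spurious_spikes": 0.2, "noise": 0.4}     # hypothetical values
        levels = {"flat_line": 0.0, "baseline_drift": 0.7,
                  "spurious_spikes": 0.05, "noise": 0.1}
        print(classify_ecg_quality(levels, thresholds))          # -> unacceptable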

  6. A Novel Dispersion Degree and EBFNN-based Fingerprint Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    罗菁; 林树忠; 倪建云; 宋丽梅

    2009-01-01

    Aiming at shift and rotation in fingerprint images, a novel dispersion degree and Ellipsoidal Basis Function Neural Network (EBFNN)-based fingerprint classification algorithm was proposed in this paper. Firstly, the feature space was obtained through a wavelet transform of the fingerprint image. Then, the optimal feature combinations of different dimensions were acquired by searching the feature space, and the feature vector was determined by studying how the divergence degree of those optimal feature combinations changes with the dimension. Finally, the EBFNN was trained with the feature vector and fingerprint classification was accomplished. The experimental results on FVC2000 and FVC2002-DB1 show that the average classification accuracy is 91.45% when the number of hidden neurons is 11. Moreover, the proposed algorithm is robust to shift and rotation in fingerprint images, and thus has practical value.

  7. Shape classification of wear particles by image boundary analysis using machine learning algorithms

    Science.gov (United States)

    Yuan, Wei; Chin, K. S.; Hua, Meng; Dong, Guangneng; Wang, Chunhui

    2016-05-01

    The shape features of wear particles generated from a wear track usually contain plenty of information about the wear state of a machine's operational condition. Techniques to quickly identify types of wear particles, in order to respond to the machine operation and prolong the machine's life, appear to be lacking and are yet to be established. To bridge rapid off-line feature recognition with on-line wear mode identification, this paper presents a new radial concave deviation (RCD) method that mainly involves the use of the particle boundary signal to analyze wear particle features. Signal output from the RCDs subsequently facilitates the determination of several other feature parameters, typically relevant to the shape and size of the wear particle. Debris feature and type are identified through the use of various classification methods, such as linear discriminant analysis, quadratic discriminant analysis, the naïve Bayesian method, and the classification and regression tree (CART) method. The average errors of training and testing via ten-fold cross validation suggest that CART is a highly suitable approach for classifying and analyzing particle features. Furthermore, the results of the wear debris analysis enable the maintenance team to diagnose faults appropriately.

  8. Power quality events classification and recognition using a novel support vector algorithm

    International Nuclear Information System (INIS)

    This paper presents a method for power quality classification using support vector machines (SVMs). In SVM training, the kernel parameters and feature selection play very important roles in SVM classification accuracy. Therefore, the most appropriate kernel types, kernel parameters and features should be used for SVM training. In this paper, a two-stage feature selection is used to obtain optimal features for the classifier. In the first stage, mutual information feature selection (MIFS) and, in the second stage, correlation feature selection (CFS) techniques are used for feature extraction from the signals to build distinguishing patterns for the classifiers. MIFS can reduce the dimensionality of the inputs, speed up the training of the network and achieve better performance, while CFS yields optimal features. In order to create training and testing vectors, different disturbance classes were simulated using parametric equations, i.e., pure sinusoid, sag, swell, harmonics, outage, sag with harmonics and swell with harmonics. Finally, the investigation results of this novel approach are presented. The test results show that the classifier has excellent performance in terms of training speed, reliability and accuracy.
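
    A two-stage feature selection in the spirit described above can be sketched as follows: rank features by mutual information with the class label, then prune features that are highly correlated with an already kept feature. The data, class count, feature count and thresholds are placeholders, and this is not the paper's exact MIFS/CFS formulation.

        # Minimal sketch: mutual-information ranking followed by correlation-based pruning
        import numpy as np
        from sklearn.feature_selection import mutual_info_classif

        rng = np.random.default_rng(2)
        X = rng.normal(size=(500, 30))            # disturbance features (placeholder)
        y = rng.integers(0, 7, size=500)          # 7 simulated disturbance classes

        mi = mutual_info_classif(X, y, random_state=0)
        ranked = np.argsort(mi)[::-1][:15]        # stage 1: keep the 15 most informative features

        kept = []
        for f in ranked:                          # stage 2: drop strongly correlated features
            if all(abs(np.corrcoef(X[:, f], X[:, g])[0, 1]) < 0.9 for g in kept):
                kept.append(f)
        print("selected feature indices:", kept)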

  9. Decision making in double-pedicled DIEP and SIEA abdominal free flap breast reconstructions: An algorithmic approach and comprehensive classification.

    Directory of Open Access Journals (Sweden)

    Charles M Malata

    2015-10-01

    Full Text Available Introduction: The deep inferior epigastric artery perforator (DIEP) free flap is the gold standard for autologous breast reconstruction. However, using a single vascular pedicle may not yield sufficient tissue in patients with midline scars or insufficient lower abdominal pannus. Double-pedicled free flaps overcome this problem using different vascular arrangements to harvest the entire lower abdominal flap. The literature is, however, sparse regarding technique selection. We therefore reviewed our experience in order to formulate an algorithm and comprehensive classification for this purpose. Methods: All patients undergoing unilateral double-pedicled abdominal perforator free flap breast reconstruction (AFFBR) by a single surgeon (CMM) over 40 months were reviewed from a prospectively collected database. Results: Of the 112 consecutive breast free flaps performed, 25 (22%) utilised two vascular pedicles. The mean patient age was 45 years (range=27-54). All flaps but one (which used the thoracodorsal system) were anastomosed to the internal mammary vessels using the rib-preservation technique. The surgical duration was 656 minutes (range=468-690 min). The median flap weight was 618g (range=432-1275g) and the mastectomy weight was 445g (range=220-896g). All flaps were successful and only three patients requested minor liposuction to reduce and reshape their reconstructed breasts. Conclusion: Bipedicled free abdominal perforator flaps, employed in a fifth of all our AFFBRs, are a reliable and safe option for unilateral breast reconstruction. They do, however, necessitate clear indications to justify the additional technical complexity and surgical duration. Our algorithm and comprehensive classification facilitate technique selection for the anastomotic permutations and the successful execution of these operations.

  10. Assessment on the classification of landslide risk level using Genetic Algorithm of Operation Tree in central Taiwan

    Science.gov (United States)

    Wei, Chiang; Yeh, Hui-Chung; Chen, Yen-Chang

    2015-04-01

    This study assessed the classification of landslide areas by a Genetic Algorithm of Operation Tree (GAOT) in the Chen-Yu-Lan River upstream watershed of the National Taiwan University Experimental Forest (NTUEF) after Typhoon Morakot in 2009, using remote sensing and geological data. Landslides covering 624.5 ha, accounting for 1.9% of the total area, were delineated with thresholds on slope (22°) and area size (1 hectare); 48 landslide sites were located in the upstream Chen-Yu-Lan watershed using FORMOSAT-II satellite imagery, aerial photos and GIS-related coverages. The five risk levels of these landslide areas were classified by area, elevation, slope order, aspect, erosion order and geological factor order using the Simplicity Method suggested in the Technical Regulations for Soil and Water Conservation of Taiwan. If all the landslide sites are considered, the classification accuracy using GAOT is 97.9%, superior to the K-means, Ward method, Shared Nearest Neighbor method, Maximum Likelihood Classifier and Bayesian Classifier; if 36 sites are used as training samples and the remaining 12 sites are tested, the accuracy still reaches 81.3%. More geological data, anthropogenic influences and hydrological factors may be necessary for clarifying the landslide areas, and the results benefit future remediation and management by the authorities.

  11. Analysis of Speed Sign Classification Algorithms Using Shape Based Segmentation of Binary Images

    OpenAIRE

    Muhammad, Azam Sheikh; Lavesson, Niklas; Davidsson, Paul; Nilsson, Mikael

    2009-01-01

    Traffic Sign Recognition is a widely studied problem and its dynamic nature calls for the application of a broad range of preprocessing, segmentation, and recognition techniques but few databases are available for evaluation. We have produced a database consisting of 1,300 images captured by a video camera. On this database we have conducted a systematic experimental study. We used four different preprocessing techniques and designed a generic speed sign segmentation algorithm. Then we select...

  12. An Evaluation of Clustering and Classification Algorithms in Life-Logging Devices

    OpenAIRE

    Amlinger, Anton

    2015-01-01

    Using life-logging devices and wearables is a growing trend in today’s society. These yield vast amounts of information, data that is not directly overseeable or graspable at a glance due to its size. Gathering a qualitative, comprehensible overview over this quantitative information is essential for life-logging services to serve its purpose. This thesis provides an overview comparison of CLARANS, DBSCAN and SLINK, representing different branches of clustering algorithm types, as tools for a...

  13. Mucinous Adenocarcinoma Involving the Ovary: Comparative Evaluation of the Classification Algorithms using Tumor Size and Laterality

    OpenAIRE

    Jung, Eun Sun; Bae, Jeong Hoon; Lee, Ahwon; Choi, Yeong Jin; Park, Jong-Sup; Lee, Kyo-Young

    2010-01-01

    For intraoperative consultation of mucinous adenocarcinoma involving the ovary, it would be useful to have approaching methods in addition to the traditional limited microscopic findings in order to determine the nature of the tumors. Mucinous adenocarcinomas involving the ovaries were evaluated in 91 cases of metastatic mucinous adenocarcinomas and 19 cases of primary mucinous adenocarcinomas using both an original algorithm (unilateral ≥10 cm tumors were considered primary and unilateral

  14. Development, Implementation and Evaluation of Segmentation Algorithms for the Automatic Classification of Cervical Cells

    Science.gov (United States)

    Macaulay, Calum Eric

    Cancer of the uterine cervix is one of the most common cancers in women. An effective screening program for pre-cancerous and cancerous lesions can dramatically reduce the mortality rate for this disease. In British Columbia where such a screening program has been in place for some time, 2500 to 3000 slides of cervical smears need to be examined daily. More than 35 years ago, it was recognized that an automated pre-screening system could greatly assist people in this task. Such a system would need to find and recognize stained cells, segment the images of these cells into nucleus and cytoplasm, numerically describe the characteristics of the cells, and use these features to discriminate between normal and abnormal cells. The thrust of this work was (1) to research and develop new segmentation methods and compare their performance to those in the literature, (2) to determine dependence of the numerical cell descriptors on the segmentation method used, (3) to determine the dependence of cell classification accuracy on the segmentation used, and (4) to test the hypothesis that using numerical cell descriptors one can correctly classify the cells. The segmentation accuracies of 32 different segmentation procedures were examined. It was found that the best nuclear segmentation procedure was able to correctly segment 98% of the nuclei of a 1000 and a 3680 image database. Similarly the best cytoplasmic segmentation procedure was found to correctly segment 98.5% of the cytoplasm of the same 1000 image database. Sixty-seven different numerical cell descriptors (features) were calculated for every segmented cell. On a database of 800 classified cervical cells these features when used in a linear discriminant function analysis could correctly classify 98.7% of the normal cells and 97.0% of the abnormal cells. While some features were found to vary a great deal between segmentation procedures, the classification accuracy of groups of features was found to be independent of the

  15. Analysis and Classification of Stride Patterns Associated with Children Development Using Gait Signal Dynamics Parameters and Ensemble Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Meihong Wu

    2016-01-01

    Full Text Available Measuring stride variability and dynamics in children is useful for the quantitative study of gait maturation and neuromotor development in childhood and adolescence. In this paper, we computed the sample entropy (SampEn) and average stride interval (ASI) parameters to quantify the stride series of 50 gender-matched child participants in three age groups. We also normalized the SampEn and ASI values by leg length and body mass for each participant, respectively. Results show that the original and normalized SampEn values consistently decrease (Mann-Whitney U test, p<0.01) in children of 3–14 years old, which indicates that stride irregularity is significantly ameliorated with body growth. The original and normalized ASI values also change significantly when comparing any two of the young (aged 3–5 years), middle (aged 6–8 years), and elder (aged 10–14 years) groups. Such results suggest that healthy children may better modulate their gait cadence rhythm with the development of their musculoskeletal and neurological systems. In addition, the AdaBoost.M2 and Bagging algorithms were used to effectively distinguish the children’s gait patterns. These ensemble learning algorithms both provided excellent gait classification results in terms of overall accuracy (≥90%), recall (≥0.8), and precision (≥0.8077).
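
    Sample entropy, the main dynamics parameter used above, can be sketched as follows; the embedding dimension m and tolerance r use common defaults and the stride series is simulated, so this is an illustration of the quantity rather than the paper's exact implementation.

        # Minimal sketch: sample entropy (SampEn) of a stride-interval series
        import numpy as np

        def sample_entropy(x, m=2, r=0.2):
            x = np.asarray(x, dtype=float)
            r *= np.std(x)                                   # tolerance as a fraction of the SD
            def count_matches(mm):
                templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
                count = 0
                for i in range(len(templates)):
                    dist = np.max(np.abs(templates - templates[i]), axis=1)
                    count += np.sum(dist <= r) - 1           # exclude the self-match
                return count
            B, A = count_matches(m), count_matches(m + 1)
            return -np.log(A / B) if A > 0 and B > 0 else np.inf

        strides = 1.0 + 0.05 * np.random.randn(300)          # simulated stride intervals (s)
        print("SampEn =", round(sample_entropy(strides), 3))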

  16. An Improved Polarimetric Radar Rainfall Algorithm With Hydrometeor Classification Optimized For Rainfall Estimation

    Science.gov (United States)

    Cifelli, R.; Wang, Y.; Lim, S.; Kennedy, P.; Chandrasekar, V.; Rutledge, S. A.

    2009-05-01

    The efficacy of dual polarimetric radar for quantitative precipitation estimation (QPE) is firmly established. Specifically, rainfall retrievals using combinations of reflectivity (ZH), differential reflectivity (ZDR), and specific differential phase (KDP) have advantages over traditional Z-R methods because more information about the drop size distribution and hydrometeor type are available. In addition, dual-polarization radar measurements are generally less susceptible to error and biases due to the presence of ice in the sampling volume. A number of methods have been developed to estimate rainfall from dual-polarization radar measurements. However, the robustness of these techniques in different precipitation regimes is unknown. Because the National Weather Service (NWS) will soon upgrade the WSR 88-D radar network to dual-polarization capability, it is important to test retrieval algorithms in different meteorological environments in order to better understand the limitations of the different methodologies. An important issue in dual-polarimetric rainfall estimation is determining which method to employ for a given set of polarimetric observables. For example, under what circumstances does differential phase information provide superior rain estimates relative to methods using reflectivity and differential reflectivity? At Colorado State University (CSU), a "blended" algorithm has been developed and used for a number of years to estimate rainfall based on ZH, ZDR, and KDP (Cifelli et al. 2002). The rainfall estimators for each sampling volume are chosen on the basis of fixed thresholds, which maximize the measurement capability of each polarimetric variable and combinations of variables. Tests have shown, however, that the retrieval is sensitive to the calculation of ice fraction in the radar volume via the difference reflectivity (ZDP - Golestani et al. 1989) methodology such that an inappropriate estimator can be selected in situations where radar echo is

  17. Operational algorithm for ice/water classification on dual-polarized RADARSAT-2 images

    OpenAIRE

    Zakhvatkina, Natalia; Korosov, Anton; Muckenhuber, Stefan; Sandven, Stein; Babiker, Mohamed

    2016-01-01

    Synthetic aperture radar (SAR) data from RADARSAT-2 (RS2) taken in dual-polarization mode provide additional information for discriminating sea ice and open water compared to single-polarization data. We have developed a fully automatic algorithm to distinguish between open water (rough/calm) and sea ice based on dual-polarized RS2 SAR images. Several technical problems inherent in RS2 data were solved at the pre-processing stage, including thermal noise reduction in the HV-polarization channel an...

  18. Application of a kernel-based online learning algorithm to the classification of nodule candidates in computer-aided detection of CT lung nodules

    International Nuclear Information System (INIS)

    Classification of the nodule candidates in computer-aided detection (CAD) of lung nodules in CT images was addressed by constructing a nonlinear discriminant function using a kernel-based learning algorithm called the kernel recursive least-squares (KRLS) algorithm. Using the nodule candidates derived from the processing by a CAD scheme of 100 CT datasets containing 253 non-calcified nodules of 3 mm or larger, as determined by the consensus of two thoracic radiologists, the following trial was carried out 100 times: by randomly selecting 50 datasets for training, a nonlinear discriminant function was obtained using the nodule candidates in the training datasets and tested with the remaining candidates; for comparison, a rule-based classification was tested in a similar manner. At about 5 false positives per case, the nonlinear classification method showed an improved sensitivity of 80% (mean over the 100 trials) compared with 74% for the rule-based method. (orig.)

  19. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC) model is used to improve classification accuracy, speed of convergence to maximum accuracy, and maximum bitrate in a brain computer interface (BCI) system based on extracting EEG-P300 signals. First, the EEG signal is filtered in order to eliminate high-frequency noise. Then, the parameters of the filtered EEG signal are extracted using the LPC model. Finally, the samples are reconstructed from the LPC coefficients and two classifiers, (a) Bayesian linear discriminant analysis (BLDA) and (b) the ν-support vector machine (ν-SVM), are applied for classification. The performance of the proposed algorithm is compared with Fisher linear discriminant analysis (FLDA). Results show that its efficiency in improving classification accuracy and speed of convergence to maximum accuracy is much better. For example, with the proposed algorithms (BLDA with the LPC model and ν-SVM with the LPC model) and an 8-electrode configuration, the total classification accuracy for subject S1 is improved by 9.4% and 1.7%, respectively. Moreover, for subject 7 the BLDA and ν-SVM algorithms with the LPC model (LPC+BLDA and LPC+ν-SVM) converged to maximum accuracy after the 11th block, whereas the FLDA algorithm did not converge to maximum accuracy with the same configuration. Thus, the method can be used as a promising tool in designing BCI systems.

  20. Effectiveness of Partition and Graph Theoretic Clustering Algorithms for Multiple Source Partial Discharge Pattern Classification Using Probabilistic Neural Network and Its Adaptive Version: A Critique Based on Experimental Studies

    Directory of Open Access Journals (Sweden)

    S. Venkatesh

    2012-01-01

    Full Text Available Partial discharge (PD) is a major cause of failure of power apparatus and hence its measurement and analysis have emerged as a vital field in assessing the condition of the insulation system. Several efforts have been undertaken by researchers to classify PD pulses utilizing artificial intelligence techniques. Recently, the focus has shifted to the identification of multiple sources of PD since it is often encountered in real-time measurements. Studies have indicated that classification of multi-source PD becomes difficult with the degree of overlap and that several techniques such as mixed Weibull functions, neural networks, and wavelet transformation have been attempted with limited success. Since digital PD acquisition systems record data for a substantial period, the database becomes large, posing considerable difficulties during classification. This research work aims firstly at analyzing aspects concerning classification capability during the discrimination of multisource PD patterns. Secondly, it attempts at extending the previous work of the authors in utilizing the novel approach of probabilistic neural network versions for classifying moderate sets of PD sources to that of large sets. The third focus is on comparing the ability of partition-based algorithms, namely, the labelled (learning vector quantization) and unlabelled (K-means) versions, with that of a novel hypergraph-based clustering method in providing parsimonious sets of centers during classification.

  1. Assessment of maximum likelihood (ML) and artificial neural network (ANN) algorithms for classification of remote sensing data

    Science.gov (United States)

    Gupta, R. K.; Prasad, T. S.; Vijayan, D.; Balamanikavelu, P. M.

    Due to the mix-up of contributions from varied features on the ground surface, recovering individual features in remote sensing data using pattern recognition techniques is an ill-defined inverse problem. By placing a maximum likelihood (ML) constraint, the available operational software classifies the image. Without placing any parametric constraint, the image can also be classified using artificial neural networks (ANN). As a GIS overlay, developed professionally by forest officials, was available for the Antilova reserve forest in Andhra Pradesh, India (17° 50' to 17° 56' N, 81° 45' to 81° 54' E), the IRS-1C LISS-III image of February 11, 1999 was used for assessing the limits of classification accuracy attainable from the ML and ANN classifiers. In the ML classifier, the full GIS overlay was used to give training sets over the whole image (approach 'a'), and in approach 'b' the a priori probability (normally taken equal for all classes in operational software) was assigned, in addition to the full spectral signature, based on the fractional areas under each class in the GIS overlay. Under such an ideal input situation, the achieved accuracies, i.e. Kappa coefficients, were 0.709 and 0.735 for approaches 'a' and 'b', respectively (called iteration '0'). Using the fractional area under each class in the classified output to assign the a priori probability for the next iteration, convergence (within 2% variation) was achieved at the 2nd and 3rd iterations with Kappa coefficient values of 0.773 and 0.797 for approaches 'a' and 'b', respectively. The failure to attain 100% classification accuracy under ideal input conditions could be due to the assumption of a Gaussian distribution of the spectral signatures. In the back-propagation-based ANN classifier, spectral signatures for training were identified from the GIS overlay. The number of learning iterations was 20,000, with momentum and learning rate of 0.7 and 0.25, respectively. With one hidden layer the Kappa coefficient for the ANN classifier was 0
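
    The ML classification with class-specific a priori probabilities (approach 'b') can be sketched as fitting a Gaussian per class and assigning each pixel to the class maximising log-likelihood plus log prior. The band count, class names, priors and synthetic training pixels below are illustrative assumptions, not the study's data.

        # Minimal sketch: Gaussian maximum likelihood classification with class priors
        import numpy as np
        from scipy.stats import multivariate_normal

        rng = np.random.default_rng(3)
        n_bands, classes = 4, ["forest", "scrub", "bare soil"]
        train = {c: rng.normal(loc=i, size=(200, n_bands)) for i, c in enumerate(classes)}
        priors = {"forest": 0.6, "scrub": 0.3, "bare soil": 0.1}   # e.g. fractional areas from a GIS overlay

        models = {c: multivariate_normal(mean=Xc.mean(0), cov=np.cov(Xc.T))
                  for c, Xc in train.items()}

        def classify(pixel):
            scores = {c: models[c].logpdf(pixel) + np.log(priors[c]) for c in classes}
            return max(scores, key=scores.get)

        print(classify(rng.normal(loc=1.0, size=n_bands)))          # likely 'scrub'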

  2. Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study

    Directory of Open Access Journals (Sweden)

    Kanthida Kusonmano

    2012-01-01

    Full Text Available A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers, support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying levels of pool sizes and considering effects from feature selection. Our results show that feature selection significantly improves classifier performance for non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mainly outperforms other investigated algorithms, while accuracy levels are comparable among all the remaining ones. Guidelines are derived to identify an optimal pooling scheme for obtaining adequate predictive power and, hence, to motivate a study design that meets best experimental objectives and budgetary conditions, including time constraints.

  3. A New Nearest Neighbor Classification Algorithm Based on Local Probability Centers

    Directory of Open Access Journals (Sweden)

    I-Jing Li

    2014-01-01

    Full Text Available The nearest neighbor is one of the most popular classifiers, and it has been successfully used in pattern recognition and machine learning. One drawback of kNN is that it performs poorly when class distributions are overlapping. Recently, the local probability center (LPC) algorithm was proposed to solve this problem; its main idea is to weight samples according to their posterior probability. However, LPC performs poorly when the value of k is very small and higher-dimensional datasets are used. To deal with this problem, this paper suggests that the gradient of the posterior probability function can be estimated under a sufficient assumption. This theoretical property makes it possible to faithfully calculate the inner product of two vectors. To increase performance on high-dimensional datasets, the multidimensional Parzen window and the Euler-Richardson method are utilized, and a new classifier based on local probability centers is developed in this paper. Experimental results show that the proposed method yields stable performance over a wide range of k, robust performance on the overlapping issue, and good performance with respect to dimensionality. The proposed theorem can be applied to mathematical problems and other applications. Furthermore, the proposed method is an attractive classifier because of its simplicity.

  4. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Full Text Available Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼10 minutes). Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.
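
    The core ELM recipe with random receptive fields can be sketched as follows: random input weights masked to random image patches, a fixed nonlinearity, and output weights solved in closed form by regularized least squares. The image size, unit counts, patch sizes and random data are placeholders; this is not the authors' tuned implementation.

        # Minimal sketch: ELM-style single-hidden-layer classifier with random receptive fields
        import numpy as np

        rng = np.random.default_rng(4)
        n, side, n_hidden, n_classes = 2000, 28, 800, 10
        X = rng.random((n, side * side))                 # flattened images (placeholder for MNIST)
        y = rng.integers(0, n_classes, size=n)
        T = np.eye(n_classes)[y]                         # one-hot targets

        W = rng.standard_normal((side * side, n_hidden))
        for j in range(n_hidden):                        # mask each unit to a random image patch
            mask = np.zeros((side, side))
            r, c = rng.integers(0, side - 8, size=2)
            h, w = rng.integers(6, 9, size=2)
            mask[r:r + h, c:c + w] = 1.0
            W[:, j] *= mask.ravel()                      # sparse input weights

        H = np.tanh(X @ W)                               # hidden-layer activations
        beta = np.linalg.solve(H.T @ H + 1e-3 * np.eye(n_hidden), H.T @ T)  # ridge solution
        pred = np.argmax(H @ beta, axis=1)
        print("training accuracy:", np.mean(pred == y))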

  5. Performance portability study of an automatic target detection and classification algorithm for hyperspectral image analysis using OpenCL

    Science.gov (United States)

    Bernabe, Sergio; Igual, Francisco D.; Botella, Guillermo; Garcia, Carlos; Prieto-Matias, Manuel; Plaza, Antonio

    2015-10-01

    Recent advances in heterogeneous high performance computing (HPC) have opened new avenues for demanding remote sensing applications. Perhaps one of the most popular algorithms in target detection and identification is the automatic target detection and classification algorithm (ATDCA) widely used in the hyperspectral image analysis community. Previous research has already investigated the mapping of ATDCA on graphics processing units (GPUs) and field programmable gate arrays (FPGAs), showing impressive speedup factors that allow its exploitation in time-critical scenarios. Based on these studies, our work explores the performance portability of a tuned OpenCL implementation across a range of processing devices including multicore processors, GPUs and other accelerators. This approach differs from previous papers, which focused on achieving the optimal performance on each platform. Here, we are more interested in the following issues: (1) evaluating whether a single code written in OpenCL allows us to achieve acceptable performance across all of them, and (2) assessing the gap between our portable OpenCL code and those hand-tuned versions previously investigated. Our study includes the analysis of different tuning techniques that expose data parallelism as well as enable an efficient exploitation of the complex memory hierarchies found in these new heterogeneous devices. Experiments have been conducted using hyperspectral data sets collected by NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensors. To the best of our knowledge, this kind of analysis has not been previously conducted in the hyperspectral image processing literature, and in our opinion it is very important in order to really calibrate the possibility of using heterogeneous platforms for efficient hyperspectral image processing in real remote sensing missions.

  6. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    Science.gov (United States)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam depend strongly on the decisions made by the reservoir operators, rather than on a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With decision makers' awareness of a changing climate, reservoir management requires adaptable means of incorporating more information into decision making, such as water delivery requirements, environmental constraints, and dry/wet conditions. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree, CART) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, the original CART, and random forest are compared with observations. The statistical measurements show that the enhanced CART and random forest generally outperform the CART control run, and that the enhanced CART algorithm gives better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation at Oroville Lake is dominated by the SWP allocation amount, and that reservoirs with low elevation are more sensitive to inflow amount than others.
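    The combination of a regression tree with a shuffled cross-validation scheme can be illustrated with a short, hedged sketch; the predictor set and tree depth below are placeholders, not the study's configuration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import KFold, cross_val_score

# placeholder data: e.g. inflow, storage, allocation and month as drivers of daily outflow
X = np.random.rand(500, 4)
y = np.random.rand(500)

cv = KFold(n_splits=5, shuffle=True, random_state=0)        # the "shuffled" validation scheme
cart = DecisionTreeRegressor(max_depth=6)
scores = cross_val_score(cart, X, y, cv=cv, scoring="r2")    # skill across shuffled folds
print(scores.mean())
```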

  7. Emotion Recognition of Weblog Sentences Based on an Ensemble Algorithm of Multi-label Classification and Word Emotions

    Science.gov (United States)

    Li, Ji; Ren, Fuji

    Weblogs have greatly changed the way people communicate. Affective analysis of blog posts is valuable for many applications, such as text-to-speech synthesis or computer-assisted recommendation. Traditional emotion recognition in text based on single-label classification cannot satisfy the higher requirements of affective computing. In this paper, the automatic identification of sentence emotion in weblogs is modeled as a multi-label text categorization task. Experiments are carried out on 12273 blog sentences from the Chinese emotion corpus Ren_CECps with 8-dimensional emotion annotation. The ensemble algorithm RAKEL is used to recognize dominant emotions from the writer's perspective. Our emotion feature, which uses a detailed intensity representation for word emotions, outperforms the other main features such as the word frequency feature and the traditional lexicon-based feature. In order to deal with relatively complex sentences, we integrate grammatical characteristics of punctuation, disjunctive connectives, modification relations and negation into the features. This achieves increases of 13.51% and 12.49% in Micro-averaged F1 and Macro-averaged F1, respectively, compared with the traditional lexicon-based feature. Results show that a multi-dimensional emotion representation with grammatical features can efficiently classify sentence emotion in a multi-label problem.

  8. Multi-layer Attribute Selection and Classification Algorithm for the Diagnosis of Cardiac Autonomic Neuropathy Based on HRV Attributes

    Directory of Open Access Journals (Sweden)

    Herbert F. Jelinek

    2015-12-01

    Full Text Available Cardiac autonomic neuropathy (CAN) poses an important clinical problem, which often remains undetected due to the difficulty of conducting the current tests and their lack of sensitivity. CAN has been associated with an increased risk of unexpected death in cardiac patients with diabetes mellitus. Heart rate variability (HRV) attributes have been actively investigated, since they are important for diagnostics in diabetes, Parkinson's disease, and cardiac and renal disease. Because of the adverse effects of CAN, it is important to obtain a robust and highly accurate diagnostic tool for the identification of early CAN, when treatment has the best outcome. Using HRV attributes to enhance the effectiveness of diagnosing CAN progression may provide such a tool. In the present paper we propose a new machine learning algorithm, Multi-Layer Attribute Selection and Classification (MLASC), for the diagnosis of CAN progression based on HRV attributes. It incorporates our new automated attribute selection procedure, the Double Wrapper Subset Evaluator with Particle Swarm Optimization (DWSE-PSO). We present the results of experiments comparing MLASC with simpler versions and counterpart methods, using our large and well-known diabetes complications database. The results demonstrate that MLASC significantly outperforms the other, simpler techniques.

  9. Bearings-Only Algorithm for Distinguishing the Stability of a Target's Speed and Course

    Institute of Scientific and Technical Information of China (English)

    徐功慧; 郝阳

    2016-01-01

    A bearings-only mathematical model is proposed for determining whether a target is maintaining constant speed and course, intended to support the bearings-only fire-control calculations used for covert submarine attack. The algorithm requires the submarine to run at constant speed and course for a prescribed time divided into at least three equal intervals, during which its sonar measures four relative bearing angles to the target. A model built from the logical relationships among these four angles is evaluated and the results are compared in order to confirm whether the target is holding a steady speed and course. Simulation shows that the algorithm is logically consistent and that this bearings-only model for judging the stability of a target's speed and course meets practical requirements.

  10. Mapping the distributions of C3 and C4 grasses in the mixed-grass prairies of southwest Oklahoma using the Random Forest classification algorithm

    Science.gov (United States)

    Yan, Dong; de Beurs, Kirsten M.

    2016-05-01

    The objective of this paper is to demonstrate a new method to map the distributions of C3 and C4 grasses at 30 m resolution and over a 25-year period (1988-2013) by combining the Random Forest (RF) classification algorithm and patch stable areas identified using the spatial pattern analysis software FRAGSTATS. Predictor variables for the RF classifications consisted of ten spectral variables, four soil edaphic variables and three topographic variables. We provided a confidence score for obtaining pure land cover at each pixel location by retrieving the classification tree votes. Classification accuracy assessments and predictor variable importance evaluations were conducted based on a repeated stratified sampling approach. Results show that patch stable areas obtained from larger patches are more appropriate as sample data pools to train and validate RF classifiers for historical land cover mapping, and that it is more reasonable to use patch stable areas as sample pools to map land cover in years closer to the present than in years further back in time. The percentage of high confidence prediction pixels across the study area ranges from 71.18% in 1988 to 73.48% in 2013. The repeated stratified sampling approach is necessary to reduce the positive bias in the estimated classification accuracy caused by possible selection of training and validation pixels from the same patch stable areas. The RF classification algorithm was able to identify the important environmental factors affecting the distributions of C3 and C4 grasses in our study area, such as elevation, soil pH, soil organic matter and soil texture.
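    The per-pixel confidence idea (taking the winning class's share of tree votes) can be sketched as follows; array shapes, class labels and the 0.8 threshold are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# placeholder training data: 10 spectral + 4 soil + 3 topographic predictors
X_train = np.random.rand(300, 17)
y_train = np.random.choice(["C3", "C4", "other"], 300)
X_scene = np.random.rand(1000, 17)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)
proba = rf.predict_proba(X_scene)             # fraction of trees voting for each class, per pixel
labels = rf.classes_[proba.argmax(axis=1)]
confidence = proba.max(axis=1)                # e.g. flag pixels with confidence >= 0.8 as high confidence
importances = rf.feature_importances_         # ranks predictors such as elevation, soil pH, texture
```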

  11. Webpage Classification Based on Deep Learning Algorithm

    Institute of Scientific and Technical Information of China (English)

    陈芊希; 范磊

    2016-01-01

    Webpage classification can be used to select relevant webpages for users and thus improve the accuracy of information retrieval. Deep learning is a relatively new area of machine learning: a multi-layer neural network learning algorithm that achieves very high accuracy through layer-by-layer initialization, and it has been applied to image recognition, speech recognition and text classification. This paper applies a deep learning algorithm to webpage classification. Experiments show that deep learning offers clear advantages for webpage classification and can effectively improve its accuracy.

  12. Dynamic classification of a document stream: a preliminary static evaluation of the GERMEN algorithm

    CERN Document Server

    Lelu, Alain; Johansson, Joel

    2008-01-01

    Data-stream clustering is an ever-expanding subdomain of knowledge extraction. Most of the past and present research effort aims at efficiently scaling up to huge data repositories. Our approach focuses on qualitative improvement, mainly for the detection of "weak signals" and precise tracking of topical evolutions in the framework of information watch - though scalability is intrinsically guaranteed in a possibly distributed implementation. Our GERMEN algorithm exhaustively picks up the whole set of density peaks of the data at time t, by identifying the local perturbations induced by the current document vector, such as changing cluster borders or new/vanishing clusters. Optimality follows from the uniqueness 1) of the density landscape for any value of our zoom parameter, and 2) of the cluster allocation operated by our border propagation rule. This results in rigorous independence from the data presentation order or any initialization parameter. We present here as a first step the only assessment of a static ...

  13. A non-contact method based on multiple signal classification algorithm to reduce the measurement time for accurately heart rate detection

    Science.gov (United States)

    Bechet, P.; Mitran, R.; Munteanu, M.

    2013-08-01

    Non-contact methods for the assessment of vital signs are of great interest for specialists due to the benefits obtained in both medical and special applications, such as those for surveillance, monitoring, and search and rescue. This paper investigates the possibility of implementing a digital processing algorithm based on MUSIC (Multiple Signal Classification) parametric spectral estimation in order to reduce the observation time needed to accurately measure the heart rate. It demonstrates that, by properly dimensioning the signal subspace, the MUSIC algorithm can be optimized to accurately assess the heart rate over an 8-28 s time interval. The performance of the processing algorithm was validated by minimizing the mean error of the heart rate in simultaneous comparative measurements on several subjects. In order to calculate the error, the reference heart rate was measured using a classic contact-based measurement system.
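    A minimal MUSIC pseudospectrum, with assumed parameters (signal-subspace dimension, snapshot length, sampling rate), shows how the dominant heart-rate line can be read off a short recording; it is a sketch, not the authors' optimized implementation.

```python
import numpy as np

def music_spectrum(x, p=2, m=40, fs=100.0, freqs=np.linspace(0.5, 3.0, 500)):
    """Return (freqs, pseudospectrum) for a 1-D signal x sampled at fs Hz."""
    snaps = np.array([x[i:i + m] for i in range(len(x) - m)])   # lagged snapshots
    R = snaps.T @ snaps / snaps.shape[0]                        # autocorrelation matrix
    _, V = np.linalg.eigh(R)
    En = V[:, :-p]                                              # noise subspace (smallest eigenvalues)
    n = np.arange(m)
    spec = [1.0 / np.linalg.norm(En.conj().T @ np.exp(2j * np.pi * f * n / fs)) ** 2
            for f in freqs]
    return freqs, np.array(spec)

# toy usage: 1.2 Hz "heartbeat" (72 bpm) buried in noise, 20 s observation
fs = 100.0
t = np.arange(0, 20, 1 / fs)
x = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.random.randn(t.size)
f, s = music_spectrum(x, fs=fs)
print(f[s.argmax()] * 60, "bpm estimate")
```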

  14. A Complete Solution Classification and Unified Algorithmic Treatment for the One- and Two-Step Asymmetric S-Transverse Mass (MT2) Event Scale Statistic

    CERN Document Server

    Walker, Joel W

    2014-01-01

    The MT2, or "s-transverse mass", statistic was developed to cope with the difficulty of associating a parent mass scale with a missing transverse energy signature, given that models of new physics generally predict production of escaping particles in pairs, while collider experiments are sensitive to just a single vector sum over all sources of missing transverse momentum. This document focuses on the generalized extension of that statistic to asymmetric one- and two-step decay chains, with arbitrary child particle masses and upstream missing transverse momentum. It provides a unified theoretical formulation, complete solution classification, taxonomy of critical points, and technical algorithmic prescription for treatment of the MT2 event scale. An implementation of the described algorithm is available for download, and is also a deployable component of the author's fully-featured selection cut software package AEACuS (Algorithmic Event Arbiter and Cut Selector).

  15. The "Life Potential": a new complex algorithm to assess "Heart Rate Variability" from Holter records for cognitive and diagnostic aims. Preliminary experimental results showing its dependence on age, gender and health conditions

    CERN Document Server

    Barra, Orazio A

    2013-01-01

    Although HRV (Heart Rate Variability) analyses have been carried out for several decades, several limiting factors still make these analyses useless from a clinical point of view. The present paper aims at overcoming some of these limits by introducing the "Life Potential" (BMP), a new mathematical algorithm which seems to exhibit surprising cognitive and predictive capabilities. BMP is defined as a linear combination of five HRV non-linear variables, in turn derived from the thermodynamic formalism of chaotic dynamic systems. The paper presents experimental measurements of BMP (average values and standard deviations) derived from 1048 Holter tests, matched in age and gender, including a control group of 356 healthy subjects. The main results are: (a) BMP always decreases as age increases, and its dependence on age and gender is well established; (b) the shape of the age dependence within "healthy people" is different from that found in the general group: this behavior provides evidence of possible illn...

  16. A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    Directory of Open Access Journals (Sweden)

    Li Zhen

    2008-05-01

    Full Text Available Abstract Background Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-toxicology data sets and use this model to evaluate the relative performance of different machine learning (ML) methods. Results The classification performance of Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Support Vector Machines (SVM) in the presence and absence of filter-based feature selection was analyzed using K-way cross-validation testing and independent validation on simulated in vitro assay data sets with varying levels of model complexity, number of irrelevant features and measurement noise. While the prediction accuracy of all ML methods decreased as non-causal (irrelevant) features were added, some ML methods performed better than others. In the limit of using a large number of features, ANN and SVM were always in the top performing set of methods while RPART and KNN (k = 5) were always in the poorest performing set. The addition of measurement noise and irrelevant features decreased the classification accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA performance is especially sensitive to the use of feature selection. Filter-based feature selection generally improved performance, most strikingly for LDA. Conclusion We have developed a novel simulation model to evaluate machine learning methods for the

  17. Research on SVM KNN Classification Algorithm Based on Hadoop Platform

    Institute of Scientific and Technical Information of China (English)

    李正杰; 黄刚

    2016-01-01

    The data revolution has brought unprecedented development: monitoring, analyzing, collecting, storing and applying rich and complex structured, semi-structured and unstructured data has become the mainstream of the information age. Classifying and processing the information contained in massive data requires better solutions, and traditional data mining classification methods can no longer meet the demand. Facing these problems, this paper analyzes and improves classification algorithms in data mining and, by combining them, proposes an improved SVM KNN classification algorithm. On this basis, the Hadoop cloud computing platform is used to parallelize the new classification algorithm in the MapReduce model, so that the improved algorithm can be applied to large-scale data processing. Finally, the algorithm is experimentally verified on a data set. Compared with the traditional SVM classification algorithm, the results show that the improved algorithm is efficient, fast, accurate and low-cost, and can effectively carry out big-data classification.
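    One common way of combining SVM and kNN, offered here purely as a hedged illustration (the record does not spell out the authors' exact combination rule), is to trust the SVM away from its decision boundary and fall back to kNN over the support vectors near the boundary. The margin threshold and data below are placeholders.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = np.random.rand(500, 10), np.random.randint(0, 2, 500)    # placeholder data
svm = SVC(kernel="rbf").fit(X, y)
knn = KNeighborsClassifier(n_neighbors=5).fit(svm.support_vectors_, y[svm.support_])

def predict(x, margin=0.5):
    score = svm.decision_function(x.reshape(1, -1))[0]
    if abs(score) >= margin:                    # confident region: keep the SVM decision
        return svm.predict(x.reshape(1, -1))[0]
    return knn.predict(x.reshape(1, -1))[0]     # boundary region: kNN over support vectors

print(predict(X[0]))
```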

  18. Precision disablement aiming system

    Energy Technology Data Exchange (ETDEWEB)

    Monda, Mark J.; Hobart, Clinton G.; Gladwell, Thomas Scott

    2016-02-16

    A disrupter may be precisely aimed at a target by positioning a radiation source to direct radiation towards the target and a detector to record the radiation that passes through it. An aiming device is placed between the radiation source and the target so that a mechanical feature of the aiming device is superimposed on the target in a captured radiographic image. The location of the aiming device in the radiographic image is then used to aim the disrupter towards the target.

  19. Laser Raman detection of platelets for early and differential diagnosis of Alzheimer’s disease based on an adaptive Gaussian process classification algorithm

    International Nuclear Information System (INIS)

    Early and differential diagnosis of Alzheimer’s disease (AD) has puzzled many clinicians. In this work, laser Raman spectroscopy (LRS) was developed to diagnose AD from platelet samples from AD transgenic mice and non-transgenic controls of different ages. An adaptive Gaussian process (GP) classification algorithm was used to re-establish the classification models of early AD, advanced AD and the control group with just two features and the capacity for noise reduction. Compared with the previous multilayer perceptron network method, the GP showed much better classification performance with the same feature set. Besides, spectra of platelets isolated from AD and Parkinson’s disease (PD) mice were also discriminated. Spectral data from 4 month AD (n = 39) and 12 month AD (n = 104) platelets, as well as control data (n = 135), were collected. Prospective application of the algorithm to the data set resulted in a sensitivity of 80%, a specificity of about 100% and a Matthews correlation coefficient of 0.81. Samples from PD (n = 120) platelets were also collected for differentiation from 12 month AD. The results suggest that platelet LRS detection analysis with the GP appears to be an easier and more accurate method than current ones for early and differential diagnosis of AD. (paper)

  20. Binary classification of chalcone derivatives with LDA or KNN based on their antileishmanial activity and molecular descriptors selected using the Successive Projections Algorithm feature-selection technique.

    Science.gov (United States)

    Goodarzi, Mohammad; Saeys, Wouter; de Araujo, Mario Cesar Ugulino; Galvão, Roberto Kawakami Harrop; Vander Heyden, Yvan

    2014-01-23

    Chalcones are naturally occurring aromatic ketones, which consist of an α,β-unsaturated carbonyl system joining two aryl rings. These compounds are reported to exhibit several pharmacological activities, including antiparasitic, antibacterial, antifungal, anticancer, immunomodulatory, nitric oxide inhibition and anti-inflammatory effects. In the present work, a Quantitative Structure-Activity Relationship (QSAR) study is carried out to classify chalcone derivatives with respect to their antileishmanial activity (active/inactive) on the basis of molecular descriptors. For this purpose, two techniques to select descriptors are employed, the Successive Projections Algorithm (SPA) and the Genetic Algorithm (GA). The selected descriptors are initially employed to build Linear Discriminant Analysis (LDA) models. An additional investigation is then carried out to determine whether the results can be improved by using a non-parametric classification technique (One Nearest Neighbour, 1NN). In a case study involving 100 chalcone derivatives, the 1NN models were found to provide better rates of correct classification than LDA, both in the training and test sets. The best result was achieved by a SPA-1NN model with six molecular descriptors, which provided correct classification rates of 97% and 84% for the training and test sets, respectively. PMID:24090733

  1. Knowledge discovery from patients' behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services.

    Science.gov (United States)

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) creates competitive advantages in the health care industry. Nowadays, many hospitals try to build a successful customer relationship management (CRM) system to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information, and data mining techniques can be used to analyze these data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (with an added Length parameter), based on health care services for a public-sector hospital in Iran, recognizing that patient loyalty differs from customer loyalty, in order to estimate the customer lifetime value (CLV) of each patient. We used the Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and find target, potential and loyal customers in order to strengthen CRM. Two approaches are used for classification: first, the clustering result is used as the decision attribute in the classification process; second, the segmentation based on the patients' CLV values (estimated by RFML) is used as the decision attribute. Finally, the results of the CHAID algorithm reveal significant hidden rules and identify existing patterns among hospital consumers. PMID:27610177
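    The two-stage pipeline (cluster on weighted RFML values, then explain the segments with a tree) can be sketched as below; the attribute weights, cluster count and demographic columns are placeholder assumptions, and a CART-style tree stands in for CHAID.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rfml = np.random.rand(300, 4)                        # Recency, Frequency, Monetary, Length (placeholder)
weights = np.array([0.2, 0.3, 0.3, 0.2])             # assumed attribute weights
segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(rfml * weights)

demographics = np.random.rand(300, 5)                # age, visit count, insurance type, ... (placeholder)
rules = DecisionTreeClassifier(max_depth=3).fit(demographics, segments)   # segments as decision attribute
```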

  3. Content-based and algorithmic classifications of journals: perspectives on the dynamics of scientific communication and indexer effects

    NARCIS (Netherlands)

    I. Rafols; L. Leydesdorff

    2009-01-01

    The aggregated journal-journal citation matrix—based on the Journal Citation Reports (JCR) of the Science Citation Index—can be decomposed by indexers and/or algorithmically. In this study, we test the results of two recently available algorithms for the decomposition of large matrices against two c

  4. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available Existing material classification schemes are intended to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers material quality characteristics, quality cost and quality influence. The Analytic Hierarchy Process supports feature selection and classification decisions, and an improved Kraljic Portfolio Matrix is used to establish the three-dimensional classification model. Aircraft materials can be divided into eight types, including general, key, risk and leveraged types. To improve the classification accuracy for the various materials, the Support Vector Machine (SVM) algorithm is introduced. Finally, we compare the SVM with a BP neural network in this application. The results show that the SVM algorithm is more efficient and accurate, and that the quality-oriented material classification is valuable.

  5. Efficient and robust phase unwrapping algorithm based on unscented Kalman filter, the strategy of quantizing paths-guided map, and pixel classification strategy.

    Science.gov (United States)

    Xie, Xian Ming; Zeng, Qing Ning

    2015-11-01

    This paper presents an efficient and robust phase unwrapping algorithm which combines an unscented Kalman filter (UKF) with a strategy of quantizing a paths-guided map and a pixel classification strategy based on phase quality information. The advantages of the proposed method rest on the following contributions: (1) quantizing the paths-guided map accelerates the search for unwrapping paths and greatly reduces the time spent on the unwrapping procedure; (2) the proposed pixel classification strategy reduces error propagation by decreasing the number of pixels with equal quantized paths-guided values during unwrapping; and (3) the unscented Kalman filter enables simultaneous filtering and unwrapping without the information loss caused by linearizing a nonlinear model. In addition, a new paths-guided map derived from a phase quality map is inserted into the quantization strategy to provide a more robust unwrapping path and thus better unwrapping results. Results obtained from synthetic and real data show that the proposed method efficiently obtains better solutions than some of the most widely used algorithms. PMID:26560585

  6. Paper 5: Surveillance of multiple congenital anomalies: implementation of a computer algorithm in European registers for classification of cases

    DEFF Research Database (Denmark)

    Garne, Ester; Dolk, Helen; Loane, Maria; Wellesley, Diana; Barisic, Ingeborg; Calzolari, Elisa; Densem, James

    Surveillance of multiple congenital anomalies is considered to be more sensitive for the detection of new teratogens than surveillance of all or isolated congenital anomalies. Current literature proposes the manual review of all cases for classification into isolated or multiple congenital anomal...

  7. Supervised Semi-definite Embedding algorithms for classification

    Institute of Scientific and Technical Information of China (English)

    董文明; 孔德庸

    2014-01-01

    Based on the spectral-analysis manifold learning algorithm Semi-definite Embedding (SDE), we put forward two supervised SSDE algorithms: weighted SSDE and optimal-distance SSDE. Numerical experiments verify the effectiveness of the algorithms.

  8. A hidden Markov model based algorithm for data stream classification

    Institute of Scientific and Technical Information of China (English)

    潘怡; 何可可; 李国徽

    2014-01-01

    To improve classification accuracy under periodic concept drift in data streams, a hidden Markov model based stream data classification (HMM-SDC) algorithm is presented. The algorithm builds a transition probability matrix for the sequence of drifting concept states from the observable output sequence, and predicts the state transitions from the probability distribution of the observations. When the mean prediction error exceeds a user-defined threshold, the algorithm updates the transition matrix parameters, so that concept drift can be predicted effectively without re-learning historical concepts. In addition, a semi-supervised K-means method is used to label training samples, which reduces the cost of manual labelling and avoids the under-learning problem that arises when the hidden Markov model has too few labelled examples. Experimental results show that the new algorithm achieves better accuracy and timeliness than traditional ensemble classification algorithms on periodically drifting data streams.

  9. Combination of Genetic Algorithm and Dempster-Shafer Theory of Evidence for Land Cover Classification Using Integration of SAR and Optical Satellite Imagery

    Science.gov (United States)

    Chu, H. T.; Ge, L.

    2012-07-01

    The integration of different kinds of remotely sensed data, in particular Synthetic Aperture Radar (SAR) and optical satellite imagery, is considered a promising approach for land cover classification because of the complementary properties of each data source. However, the challenges are how to fully exploit the capabilities of these multiple data sources, which combined datasets should be used, and which data processing and classification techniques are most appropriate to achieve the best results. In this paper, an approach in which feature selection (FS) with a Genetic Algorithm (GA) is used synergistically with a combination of multiple classifiers based on the Dempster-Shafer Theory of Evidence is proposed and evaluated for classifying land cover features in New South Wales, Australia. Multi-date SAR data, including ALOS/PALSAR, ENVISAT/ASAR and optical (Landsat 5 TM+) images, were used for this study. Textural information was also derived and integrated with the original images, and various combined datasets were generated for classification. Three classifiers, namely an Artificial Neural Network (ANN), Support Vector Machines (SVMs) and a Self-Organizing Map (SOM), were employed. First, feature selection using the GA was applied to each classifier and dataset to determine the optimal input features and parameters. Then the results of the three classifiers on particular datasets were combined using the Dempster-Shafer Theory of Evidence. Results of this study demonstrate the advantages of the proposed method for land cover mapping using complex datasets. It is revealed that the use of GA in conjunction with the Dempster-Shafer Theory of Evidence can significantly improve the classification accuracy. Furthermore, integration of SAR and optical data often outperforms single-type datasets.
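    The fusion step can be illustrated with Dempster's rule of combination applied to singleton (probability-like) mass assignments from the three classifiers; the mass values below are invented for illustration, and the sketch ignores compound hypotheses.

```python
import numpy as np

def dempster_combine(m1, m2):
    """Combine two singleton mass assignments over the same ordered class set."""
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    joint = np.outer(m1, m2)
    conflict = joint.sum() - np.trace(joint)       # mass assigned to contradictory class pairs
    return np.diag(joint) / (1.0 - conflict)       # normalised combined masses

masses_ann = [0.6, 0.3, 0.1]                        # e.g. (forest, crop, water) from the ANN
masses_svm = [0.5, 0.4, 0.1]
masses_som = [0.7, 0.2, 0.1]
combined = dempster_combine(dempster_combine(masses_ann, masses_svm), masses_som)
print(combined.argmax(), combined)
```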

  10. Nanotechnology, Aims, and Values

    OpenAIRE

    Lorusso, Ludovica

    2013-01-01

    This paper seeks to understand the importance of adopting an ethical framework based on values in the socio-ethical discussion on nanotechnology and, more generally, on emerging technologies. In particular, within this framework a distinction is introduced between two ideal types of science, defined on the basis of their different aims. This distinction is considered a useful guide in the ethical debate on the technological development of our society, because it may help to understand what ...

  11. Dimensionality Reduction and Classification feature using Mutual Information applied to Hyperspectral Images : A Filter strategy based algorithm

    OpenAIRE

    Sarhrouni, ELkebir; Hammouch, Ahmed; Aboutajdine, Driss

    2012-01-01

    Hyperspectral image (HIS) classification is a highly technical remote sensing tool. The goal is to produce a thematic map that can be compared with a reference ground truth map (GT) constructed from expert knowledge of the region. A hyperspectral image contains more than a hundred bidirectional measures, called bands (or simply images), of the same region, taken at juxtaposed frequencies. Unfortunately, some bands contain redundant information, others are affected by noise, and the high dimensionality ...

  12. Visual-Based Clothing Attribute Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘聪; 丁贵广

    2016-01-01

    We propose an algorithm for classifying clothing image attributes. To handle the noise in clothing images, key parts of the clothing are located by a well-trained human part detector and redundant information is eliminated, which improves the accuracy of clothing attribute classification. Additionally, a novel feature descriptor based on the human skeleton and skin is proposed; it describes clothing features with fewer dimensions, which significantly speeds up the classifiers of related attributes. To deal with the complex semantics and diverse requirements of clothing attributes, different SVM decision tree models are built for different attributes, which improves classification efficiency and serves both coarse-grained and fine-grained classification needs. Experiments demonstrate the effectiveness of the proposed algorithm on multiple clothing attribute classification tasks.

  13. Dual-energy cone-beam CT with a flat-panel detector: Effect of reconstruction algorithm on material classification

    Energy Technology Data Exchange (ETDEWEB)

    Zbijewski, W., E-mail: wzbijewski@jhu.edu; Gang, G. J.; Xu, J.; Wang, A. S.; Stayman, J. W. [Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205 (United States); Taguchi, K.; Carrino, J. A. [Russell H. Morgan Department of Radiology, Johns Hopkins University, Baltimore, Maryland 21205 (United States); Siewerdsen, J. H. [Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205 and Russell H. Morgan Department of Radiology, Johns Hopkins University, Baltimore, Maryland 21205 (United States)

    2014-02-15

    Purpose: Cone-beam CT (CBCT) with a flat-panel detector (FPD) is finding application in areas such as breast and musculoskeletal imaging, where dual-energy (DE) capabilities offer potential benefit. The authors investigate the accuracy of material classification in DE CBCT using filtered backprojection (FBP) and penalized likelihood (PL) reconstruction and optimize contrast-enhanced DE CBCT of the joints as a function of dose, material concentration, and detail size. Methods: Phantoms consisting of a 15 cm diameter water cylinder with solid calcium inserts (50–200 mg/ml, 3–28.4 mm diameter) and solid iodine inserts (2–10 mg/ml, 3–28.4 mm diameter), as well as a cadaveric knee with intra-articular injection of iodine were imaged on a CBCT bench with a Varian 4343 FPD. The low energy (LE) beam was 70 kVp (+0.2 mm Cu), and the high energy (HE) beam was 120 kVp (+0.2 mm Cu, +0.5 mm Ag). Total dose (LE+HE) was varied from 3.1 to 15.6 mGy with equal dose allocation. Image-based DE classification involved a nearest distance classifier in the space of LE versus HE attenuation values. Recognizing the differences in noise between LE and HE beams, the LE and HE data were differentially filtered (in FBP) or regularized (in PL). Both a quadratic (PLQ) and a total-variation penalty (PLTV) were investigated for PL. The performance of DE CBCT material discrimination was quantified in terms of voxelwise specificity, sensitivity, and accuracy. Results: Noise in the HE image was primarily responsible for classification errors within the contrast inserts, whereas noise in the LE image mainly influenced classification in the surrounding water. For inserts of diameter 28.4 mm, DE CBCT reconstructions were optimized to maximize the total combined accuracy across the range of calcium and iodine concentrations, yielding values of ∼88% for FBP and PLQ, and ∼95% for PLTV at 3.1 mGy total dose, increasing to ∼95% for FBP and PLQ, and ∼98% for PLTV at 15.6 mGy total dose. For a

  14. Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

    DEFF Research Database (Denmark)

    Marinakis, Yannis; Dounias, Georgios; Jantzen, Jan

    2009-01-01

    The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify t...... other previously applied intelligent approaches....

  15. Impact of Reducing Polarimetric SAR Input on the Uncertainty of Crop Classifications Based on the Random Forests Algorithm

    DEFF Research Database (Denmark)

    Loosvelt, Lien; Peters, Jan; Skriver, Henning;

    2012-01-01

    features in multidate SAR data sets: an accuracy-oriented reduction and an efficiency-oriented reduction. For both strategies, the effect of feature reduction on the quality of the land cover map is assessed. The analyzed data set consists of 20 polarimetric features derived from L-band (1.25 GHz) SAR...... general and specific features for crop classification. Based on the importance ranking, features are gradually removed from the single-date data sets in order to construct several multidate data sets with decreasing dimensionality. In the accuracy-oriented and efficiency-oriented reduction, the input is...

  16. Comparison of Classification Algorithms in a Coal Data Analysis System

    Institute of Scientific and Technical Information of China (English)

    莫洪武; 万荣泽

    2013-01-01

    During coal mining, the collected exploration data need to be analyzed and studied in order to mine more valuable information from them. In the data mining field there are several kinds of data classification algorithms, which coal systems can apply to real work according to the data type. Focusing on data classification algorithms, this paper studies and analyzes their role in coal exploration data analysis. By studying and comparing the performance of multiple classification algorithms in data analysis work, we identify the classification algorithms that process exploration data most effectively.

  17. Integrated binary-class classification algorithm based on Logistic and SVM

    Institute of Scientific and Technical Information of China (English)

    谢玲; 刘琼荪

    2011-01-01

    The probability outputs of logistic regression are divided into four continuous intervals, and the frequency of correct classification within each interval is calculated on the training samples. Based on logistic regression and the Support Vector Machine (SVM), an integrated binary classification algorithm built on this combined decision rule is proposed. Numerical results on several UCI datasets illustrate that the integrated algorithm achieves good classification performance.
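    A minimal sketch of this kind of combination rule (thresholds and data are assumptions, not the authors' exact settings): bin the logistic probability into four intervals, measure the training accuracy in each interval, and defer to the SVM whenever the prediction falls into a low-accuracy interval.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = np.random.rand(400, 6), np.random.randint(0, 2, 400)      # placeholder data
lr = LogisticRegression(max_iter=1000).fit(X, y)
svm = SVC().fit(X, y)

bins = [0.0, 0.25, 0.5, 0.75, 1.01]                               # four probability intervals
p = lr.predict_proba(X)[:, 1]
interval = np.digitize(p, bins) - 1
acc = [np.mean((p[interval == i] >= 0.5) == y[interval == i]) if np.any(interval == i) else 0.0
       for i in range(4)]                                         # per-interval training accuracy

def predict(x):
    pi = lr.predict_proba(x.reshape(1, -1))[0, 1]
    i = np.digitize(pi, bins) - 1
    return int(pi >= 0.5) if acc[i] >= 0.9 else int(svm.predict(x.reshape(1, -1))[0])

print(predict(X[0]))
```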

  18. A segmentation and classification algorithm for online detection of internal disorders in citrus using X-ray radiographs

    OpenAIRE

    Dael, van, P.; Lebotsa, S.; Herremans, E.; Verboven, P.; Sijbers, J.; Opara, U.L.; Cronje, P.J.; Nicolaï, B.M.

    2016-01-01

    Abstract: Oranges and lemons can be affected by the physiological disorders granulation and endoxerosis respectively, decreasing their commercial value. X-ray radiographs provide images of the internal structure of citrus on which the disorders can be discerned. An image processing algorithm is proposed to detect these disorders on X-ray projection images and classify samples as being affected or not. The method automatically segments healthy and affected tissue, calculates a set of image fea...

  19. [Aiming for zero blindness].

    Science.gov (United States)

    Nakazawa, Toru

    2015-03-01

    -independent factors, as well as our investigation of ways to improve the clinical evaluation of the disease. Our research was prompted by the multifactorial nature of glaucoma. There is a high degree of variability in the pattern and speed of the progression of visual field defects in individual patients, presenting a major obstacle for successful clinical trials. To overcome this, we classified the eyes of glaucoma patients into 4 types, corresponding to the 4 patterns of glaucomatous optic nerve head morphology described by Nicolela et al., and then tested the validity of this method by assessing the uniformity of clinical features in each group. We found that in normal tension glaucoma (NTG) eyes, each disc morphology group had a characteristic location in which the loss of circumpapillary retinal nerve fiber layer thickness (cpRNFLT; measured with optical coherence tomography, OCT) was most likely to occur. Furthermore, the incidence of reductions in visual acuity differed between the groups, as did the speed of visual field loss, the distribution of defective visual field test points, and the location of test points that were most susceptible to progressive damage, measured by Humphrey static perimetry. These results indicate that Nicolela's method of classifying eyes with glaucoma was able to overcome the difficulties caused by the diverse nature of the disease, at least to a certain extent. Building on these findings, we then set out to identify sectors of the visual field that correspond to the distribution of retinal nerve fibers, with the aim of detecting glaucoma progression with improved sensitivity. We first mapped the statistical correlation between visual field test points and cpRNFLT in each temporal clock-hour sector (from 6 to 12 o'clock), using OCT data from NTG patients. The resulting series of maps allowed us to identify areas containing visual field test points that were prone to be affected together as a group. We also used a similar method to identify visual

  20. Development of visible/infrared/microwave agriculture classification and biomass estimation algorithms, volume 2. [Oklahoma and Texas

    Science.gov (United States)

    Rosenthal, W. D.; Mcfarland, M. J.; Theis, S. W.; Jones, C. L. (Principal Investigator)

    1982-01-01

    Agricultural crop classification models using two or more spectral regions (visible through microwave) were developed and tested and biomass was estimated by including microwave with visible and infrared data. The study was conducted at Guymon, Oklahoma and Dalhart, Texas utilizing aircraft multispectral data and ground truth soil moisture and biomass information. Results indicate that inclusion of C, L, and P band active microwave data from look angles greater than 35 deg from nadir with visible and infrared data improved crop discrimination and biomass estimates compared to results using only visible and infrared data. The active microwave frequencies were sensitive to different biomass levels. In addition, two indices, one using only active microwave data and the other using data from the middle and near infrared bands, were well correlated to total biomass.

  1. Development of visible/infrared/microwave agriculture classification and biomass estimation algorithms. [Guymon, Oklahoma and Dalhart, Texas

    Science.gov (United States)

    Rosenthal, W. D.; Mcfarland, M. J.; Theis, S. W.; Jones, C. L. (Principal Investigator)

    1982-01-01

    Agricultural crop classification models using two or more spectral regions (visible through microwave) are considered in an effort to estimate biomass at Guymon, Oklahoma, and Dalhart, Texas. Both ground truth and aerial data were used. Results indicate that inclusion of C, L, and P band active microwave data, from look angles greater than 35 deg from nadir, with visible and infrared data improves crop discrimination and biomass estimates compared to results using only visible and infrared data. The microwave frequencies were sensitive to different biomass levels. The K and C bands were sensitive to differences at low biomass levels, while the P band was sensitive to differences at high biomass levels. Two indices, one using only active microwave data and the other using data from the middle and near infrared bands, were well correlated to total biomass. It is implied that inclusion of active microwave sensors with visible and infrared sensors on future satellites could aid in crop discrimination and biomass estimation.

  2. Algorithm for Chinese short-text classification using concept description

    Institute of Scientific and Technical Information of China (English)

    杨天平; 朱征宇

    2012-01-01

    In order to solve the problem that traditional classification performs poorly on short texts because they contain few text features, an algorithm using concept description is presented. First, a global semantic concept word list is built. Then both the test set and the training set are conceptualized with this word list, so that each test short text can be combined with training short texts sharing the same concept description into a longer test text; at the same time, short texts within the training set are combined with each other to form longer training texts. Finally, the resulting long texts are classified with a traditional classification algorithm. Experiments show that the proposed method can effectively mine the implicit semantic information in short texts, adequately expand them semantically, and improve the accuracy of short-text classification.

  3. Automated classification of seismic sources in large database using random forest algorithm: First results at Piton de la Fournaise volcano (La Réunion).

    Science.gov (United States)

    Hibert, Clément; Provost, Floriane; Malet, Jean-Philippe; Stumpf, André; Maggi, Alessia; Ferrazzini, Valérie

    2016-04-01

    In the past decades, the increasing quality of seismic sensors and the capability to transfer large quantities of data remotely have led to a fast densification of local, regional and global seismic networks for near real-time monitoring. This technological advance permits the use of seismology to document geological and natural/anthropogenic processes (volcanoes, ice-calving, landslides, snow and rock avalanches, geothermal fields), but has also led to an ever-growing quantity of seismic data. This wealth of seismic data makes the construction of complete seismicity catalogs, which include not only earthquakes but also other sources of seismic waves, more challenging and very time-consuming, as this critical pre-processing stage is classically done by human operators. To overcome this issue, the development of automatic methods for processing continuous seismic data appears to be a necessity. The classification algorithm should be robust, precise and versatile enough to be deployed for monitoring seismicity in very different contexts. We propose a multi-class detection method based on the random forests algorithm to automatically classify the source of seismic signals. Random forests is a supervised machine learning technique that is based on the computation of a large number of decision trees. The multiple decision trees are constructed from training sets including each of the target classes. In the case of seismic signals, these attributes may encompass spectral features but also waveform characteristics, multi-station observations and other relevant information. The Random Forests classifier is used because it provides state-of-the-art performance when compared with other machine learning techniques (e.g. SVM, Neural Networks) and requires no fine tuning. Furthermore, it is relatively fast, robust, easy to parallelize, and inherently suitable for multi-class problems. In this work, we present the first results of the classification method applied

  4. Automated classifications of topography from DEMs by an unsupervised nested-means algorithm and a three-part geometric signature

    Science.gov (United States)

    Iwahashi, J.; Pike, R.J.

    2007-01-01

    An iterative procedure that implements the classification of continuous topography as a problem in digital image-processing automatically divides an area into categories of surface form; three taxonomic criteria (slope gradient, local convexity, and surface texture) are calculated from a square-grid digital elevation model (DEM). The sequence of programmed operations combines twofold-partitioned maps of the three variables converted to greyscale images, using the mean of each variable as the dividing threshold. To subdivide increasingly subtle topography, grid cells sloping at less than mean gradient of the input DEM are classified by designating mean values of successively lower-sloping subsets of the study area (nested means) as taxonomic thresholds, thereby increasing the number of output categories from the minimum 8 to 12 or 16. Program output is exemplified by 16 topographic types for the world at 1-km spatial resolution (SRTM30 data), the Japanese Islands at 270 m, and part of Hokkaido at 55 m. Because the procedure is unsupervised and reflects frequency distributions of the input variables rather than pre-set criteria, the resulting classes are undefined and must be calibrated empirically by subsequent analysis. Maps of the example classifications reflect physiographic regions, geological structure, and landform as well as slope materials and processes; fine-textured terrain categories tend to correlate with erosional topography or older surfaces, coarse-textured classes with areas of little dissection. In Japan the resulting classes approximate landform types mapped from airphoto analysis, while in the Americas they create map patterns resembling Hammond's terrain types or surface-form classes; SRTM30 output for the United States compares favorably with Fenneman's physical divisions. Experiments are suggested for further developing the method; the Arc/Info AML and the map of terrain classes for the world are available as online downloads. © 2006 Elsevier
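    One level of the nested-means idea can be written down compactly; the sketch below (assumed variable names, two-level nesting giving up to 12 classes) splits cells at the global mean slope, re-splits the gentler subset at its own mean, and combines the result with binary convexity and texture splits.

```python
import numpy as np

def classify_terrain(slope, convexity, texture):
    low = slope < slope.mean()                        # first threshold: global mean slope
    low2 = low & (slope < slope[low].mean())          # nested threshold within gentle terrain
    level = np.where(low2, 2, np.where(low, 1, 0))    # 0 = steep, 1 = moderate, 2 = gentle
    return level * 4 + (convexity >= convexity.mean()) * 2 + (texture >= texture.mean())

rng = np.random.default_rng(0)
slope, convexity, texture = rng.random((3, 100, 100))
print(np.unique(classify_terrain(slope, convexity, texture)))     # up to 12 class codes
```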

  5. Automatic Detection and Classification of Pole-Like Objects in Urban Point Cloud Data Using an Anomaly Detection Algorithm

    Directory of Open Access Journals (Sweden)

    Borja Rodríguez-Cuenca

    2015-09-01

    Full Text Available Detecting and modeling urban furniture are of particular interest for urban management and the development of autonomous driving systems. This paper presents a novel method for detecting and classifying vertical urban objects and trees from unstructured three-dimensional mobile laser scanner (MLS) or terrestrial laser scanner (TLS) point cloud data. The method includes an automatic initial segmentation to remove the parts of the original cloud that are not of interest for detecting vertical objects, by means of a geometric index based on features of the point cloud. Vertical object detection is carried out through the Reed and Xiaoli (RX) anomaly detection algorithm applied to a pillar structure in which the point cloud was previously organized. A clustering algorithm is then used to classify the detected vertical elements as man-made poles or trees. The effectiveness of the proposed method was tested in two point clouds from heterogeneous street scenarios and measured by two different sensors. The results for the two test sites achieved detection rates higher than 96%; the classification accuracy was around 95%, and the completion quality of both procedures was 90%. Non-detected poles come from occlusions in the point cloud and low-height traffic signs; most misclassifications occurred in man-made poles adjacent to trees.
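    The RX detection step amounts to a Mahalanobis distance from the global background statistics; a minimal sketch (feature layout assumed) is:

```python
import numpy as np

def rx_scores(features):
    """features: (n_pillars, n_attributes). Returns one anomaly score per pillar."""
    mu = features.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(features, rowvar=False))
    diff = features - mu
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)          # squared Mahalanobis distance

feats = np.random.rand(500, 6)                                    # placeholder pillar features
scores = rx_scores(feats)
candidates = np.where(scores > np.percentile(scores, 99))[0]      # most anomalous pillars
```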

  6. An Efficient, Scalable Time-Frequency Method for Tracking Energy Usage of Domestic Appliances Using a Two-Step Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Paula Meehan

    2014-10-01

    Full Text Available Load monitoring is the practice of measuring electrical signals in a domestic environment in order to identify which electrical appliances are consuming power. One reason for developing a load monitoring system is to reduce power consumption by increasing consumers' awareness of which appliances consume the most energy. Another example of an application of load monitoring is activity sensing in the home for the provision of healthcare services. This paper outlines the development of a load disaggregation method that measures the aggregate electrical signals of a domestic environment and extracts features to identify each power-consuming appliance. A single sensor is deployed at the main incoming power point to sample the aggregate current signal. The method senses when an appliance switches ON or OFF and uses a two-step classification algorithm to identify which appliance has caused the event. Parameters from the current in the temporal and frequency domains are used as features to define each appliance. These parameters are the steady-state current harmonics and the rate of change of the transient signal. Each appliance's electrical characteristics are distinguishable using these parameters. There are three types of load that an appliance can fall into: linear non-reactive, linear reactive, or nonlinear reactive. It has been found that by identifying the load type first and then using a second classifier to identify individual appliances within these types, the overall accuracy of the identification algorithm is improved.
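    The two-step scheme (predict the load type first, then the appliance within that type) can be sketched as below; the feature layout, class counts and the choice of random forests are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(600, 8)                      # steady-state harmonics + transient rate (placeholder)
load_type = np.random.randint(0, 3, 600)        # linear non-reactive / linear reactive / nonlinear reactive
appliance = np.random.randint(0, 10, 600)       # appliance identity within the household

step1 = RandomForestClassifier(random_state=0).fit(X, load_type)
step2 = {t: RandomForestClassifier(random_state=0).fit(X[load_type == t], appliance[load_type == t])
         for t in np.unique(load_type)}

def identify(event_features):
    t = step1.predict(event_features.reshape(1, -1))[0]            # step 1: load type
    return step2[t].predict(event_features.reshape(1, -1))[0]      # step 2: appliance within that type

print(identify(X[0]))
```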

  7. Efficient segmentation by sparse pixel classification

    DEFF Research Database (Denmark)

    Dam, Erik B; Loog, Marco

    2008-01-01

    Segmentation methods based on pixel classification are powerful but often slow. We introduce two general algorithms, based on sparse classification, for optimizing the computation while still obtaining accurate segmentations. The computational costs of the algorithms are derived, and they are...

  8. A Fuzzy-Evidential k Nearest Neighbor Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    吕锋; 杜妮; 文成林

    2012-01-01

    Existing classification algorithms built on the k Nearest Neighbor (kNN) rule, such as Fuzzy kNN (FkNN) and Evidential kNN (EkNN), have two problems: they cannot recognize differences between sample features, and they ignore the effect of the varying distances between neighbors and the class centers. To overcome these limitations, a fuzzy-evidential kNN (FEkNN) algorithm is proposed. First, the feature weights are determined from the features' fuzzy entropy values, and k neighbors are selected according to the weighted Euclidean distance. Then samples are classified by first fuzzifying the memberships of the neighbors and then fusing the information, combining the advantage of FkNN in information representation with that of EkNN in decision-making; neighbors are additionally distinguished by their information entropy values. The method is tested on UCI benchmark datasets, and the results show that it outperforms the other kNN-based classification algorithms.
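    A rough sketch of the two ingredients (entropy-derived feature weights and weighted-Euclidean kNN voting) is given below; the surrogate entropy weighting and the distance-weighted vote are simplifications and do not reproduce the paper's exact fuzzy-membership and evidence-fusion formulas.

```python
import numpy as np

def feature_weights(X):
    # Toy surrogate: features whose value distribution has lower entropy get higher weight.
    entropies = []
    for j in range(X.shape[1]):
        hist, _ = np.histogram(X[:, j], bins=10)
        p = hist / hist.sum()
        p = p[p > 0]
        entropies.append(-(p * np.log(p)).sum())
    e = np.array(entropies)
    w = e.max() - e + 1e-9
    return w / w.sum()

def weighted_knn_predict(X_train, y_train, x, k, w):
    d = np.sqrt((((X_train - x) ** 2) * w).sum(axis=1))   # weighted Euclidean distance
    votes = {}
    for i in np.argsort(d)[:k]:
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (1.0 + d[i])
    return max(votes, key=votes.get)
```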

  9. Application of the Honeybee Mating Optimization Algorithm to Patent Document Classification in Combination with the Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Chui-Yu Chiu

    2013-08-01

    Patent rights have the property of exclusiveness. Inventors can protect their rights within the legal range and hold a monopoly over their published inventions. People are not allowed to use an invention before the inventors permit them to do so, and companies try to avoid research and development investment in inventions that have already been protected by patent. Patent retrieval and categorization technologies are used to uncover patent information and reduce the cost of torts. In this research, we propose a novel method which integrates the Honey-Bee Mating Optimization algorithm with Support Vector Machines for patent categorization. First, the CKIP method is utilized to extract phrases from the patent summary and title. Then we calculate the probability that a specific key phrase contains a certain concept based on Term Frequency - Inverse Document Frequency (TF-IDF) methods. By combining frequencies and the probabilities of key phrases generated using the Honey-Bee Mating Optimization algorithm, the proposed method is expected to obtain better representative input values for the SVM model. Finally, this research uses patents from Chemical Mechanical Polishing (CMP) as case examples to illustrate and demonstrate the superior results produced by the proposed methodology.

  10. A Comparison of Machine Learning Algorithms for Chemical Toxicity Classification Using a Simulated Multi-Scale Data Model

    Science.gov (United States)

    Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in hig...

  11. Audio Classification from Time-Frequency Texture

    OpenAIRE

    Yu, Guoshen; Slotine, Jean-Jacques

    2008-01-01

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While solely based on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.

  12. Improvement of the Dijkstra algorithm for energy-conservation optimization of the geometrical shape of twist drills

    Institute of Scientific and Technical Information of China (English)

    熊良山

    2015-01-01

    The Dijkstra algorithm was introduced into the energy-conservation optimization design of complicated cutting-tool structures, and a design procedure for the geometrical shape and dimensions of the cutting edge of twist drills based on the Dijkstra algorithm was proposed. When the Dijkstra algorithm is used to determine the main cutting-edge curve with minimal drilling power, the calculation efficiency is low, the results are not sufficiently precise, and the smoothness and machinability of the cutting-edge curve cannot be guaranteed. Because the main cutting edge is a curve with no inflexion along the radius direction, a combination of two measures was proposed to improve the Dijkstra algorithm, reduce its time complexity and raise its calculation efficiency and precision: dividing the discretized grid on the rake face into several parts along the radial direction, and gradually narrowing the search scope to refine the mesh along the circumference. Calculation shows that the improved Dijkstra algorithm increases calculation efficiency by over 1000 times and yields smooth cutting-edge and cutting-angle distribution curves, so that the machinability of the main cutting-edge curve is guaranteed.
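    For readers unfamiliar with the baseline, a plain Dijkstra shortest-path routine over a weighted graph is sketched below; the radial partitioning of the rake-face grid and the progressive narrowing of the circumferential search range are the paper's refinements and are not reproduced here.

```python
import heapq

def dijkstra(neighbours, source):
    """neighbours: dict mapping node -> list of (node, edge_cost).
    Returns the minimal accumulated cost (e.g. a drilling-power measure) per node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                         # stale queue entry
        for v, w in neighbours.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```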

  13. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are no...... software packages such as SPM, FSL, and FreeSurfer....

  14. A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    OpenAIRE

    Li Zhen; Setzer R Woodrow; Elloumi Fathi; Judson Richard; Shah Imran

    2008-01-01

    Abstract Background Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in com...

  15. Towards automatic classification of all WISE sources

    CERN Document Server

    Kurcz, Agnieszka; Solarz, Aleksandra; Krupa, Magdalena; Pollo, Agnieszka; Małek, Katarzyna

    2016-01-01

    The WISE satellite has detected hundreds of millions sources over the entire sky. Classifying them reliably is however a challenging task due to degeneracies in WISE multicolour space and low levels of detection in its two longest-wavelength bandpasses. Here we aim at obtaining comprehensive and reliable star, galaxy and quasar catalogues based on automatic source classification in full-sky WISE data. This means that the final classification will employ only parameters available from WISE itself, in particular those reliably measured for a majority of sources. For the automatic classification we applied the support vector machines (SVM) algorithm, which requires a training sample with relevant classes already identified, and we chose to use the SDSS spectroscopic dataset for that purpose. By calibrating the classifier on the test data drawn from SDSS, we first established that a polynomial kernel is preferred over a radial one for this particular dataset. Next, using three classification parameters (W1 magnit...
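    The classification step described above could be sketched as follows, assuming three WISE-derived parameters per source and SDSS-derived training labels; the column contents and hyperparameters are illustrative, not the values tuned in the paper.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

clf = make_pipeline(StandardScaler(), SVC(kernel='poly', degree=3, C=1.0))
# X_train: (n_sources, 3) array of WISE parameters; y_train: star/galaxy/quasar labels from SDSS
# clf.fit(X_train, y_train)
# predictions = clf.predict(X_all_sky)
```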

  16. An automated sleep-state classification algorithm for quantifying sleep timing and sleep-dependent dynamics of electroencephalographic and cerebral metabolic parameters

    Directory of Open Access Journals (Sweden)

    Rempe MJ

    2015-09-01

    Introduction: Rodent sleep research uses electroencephalography (EEG) and electromyography (EMG) to determine the sleep state of an animal at any given time. EEG and EMG signals, typically sampled at >100 Hz, are segmented arbitrarily into epochs of equal duration (usually 2–10 seconds), and each epoch is scored as wake, slow-wave sleep (SWS), or rapid-eye-movement sleep (REMS) on the basis of visual inspection. Automated state scoring can minimize the burden associated with state scoring and thereby facilitate the use of shorter epoch durations. Methods: We developed a semiautomated state-scoring procedure that uses a combination of principal component analysis and naïve Bayes classification, with the EEG and EMG as inputs. We validated this algorithm against human-scored sleep-state scoring of data from C57BL/6J and BALB/CJ mice. We then applied a general homeostatic model to characterize the state-dependent dynamics of sleep slow-wave activity and cerebral glycolytic flux, measured as lactate concentration. Results: More than 89% of epochs scored as wake or SWS by the human were scored as the same state by the machine, whether scoring in 2-second or 10-second epochs. The majority of epochs scored as REMS by the human were also scored as REMS by the machine. However, of epochs scored as REMS by the human, more than 10% were scored as SWS by the machine and 18% (10-second epochs) to 28% (2-second epochs) were scored as wake. These biases were not strain-specific, as strain differences in sleep-state timing relative to the light/dark cycle, EEG power spectral profiles, and the homeostatic dynamics of both slow waves and lactate were detected equally effectively with the automated method or the manual scoring
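    A compact sketch of the scoring pipeline (PCA on epoch features followed by naïve Bayes classification) is given below; epoch-level feature extraction from the EEG/EMG traces is assumed to have been done already, and the number of components is an arbitrary choice.

```python
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

scorer = make_pipeline(PCA(n_components=5), GaussianNB())
# X_epochs: (n_epochs, n_features) features per 2-s or 10-s epoch
# y_epochs: human-assigned labels ('wake', 'SWS', 'REMS') for the training subset
# scorer.fit(X_epochs, y_epochs)
# machine_labels = scorer.predict(X_new_epochs)
```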

  17. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    Energy Technology Data Exchange (ETDEWEB)

    Ahmed, M; Gu, F; Ball, A, E-mail: M.Ahmed@hud.ac.uk [Diagnostic Engineering Research Group, University of Huddersfield, HD1 3DH (United Kingdom)

    2011-07-19

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  18. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    Science.gov (United States)

    Ahmed, M.; Gu, F.; Ball, A.

    2011-07-01

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  19. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    International Nuclear Information System (INIS)

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  20. A One-Class Classification-Based Control Chart Using the K-Means Data Description Algorithm

    Directory of Open Access Journals (Sweden)

    Walid Gani

    2014-01-01

    This paper considers control charts based on one-class classification, referred to as OC-charts, and extends their applications. We propose a new OC-chart using the K-means data description (KMDD) algorithm, referred to as KM-chart. The proposed KM-chart gives the minimum closed spherical boundary around the in-control process data. It measures the distance between the center of the KMDD-based sphere and each new incoming sample to be monitored. Any sample whose distance is greater than the radius of the KMDD-based sphere is considered out of control. Phase I and II analysis of KM-chart was evaluated through a real industrial application. In a comparative study based on the average run length (ARL) criterion, KM-chart was compared with the kernel-distance-based control chart, referred to as K-chart, and the k-nearest neighbor data description-based control chart, referred to as KNN-chart. Results revealed that, in terms of ARL, KM-chart performed better than KNN-chart in detecting small shifts in the mean vector. Furthermore, the paper provides the MATLAB code for KM-chart, developed by the authors.
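    The monitoring logic can be sketched in a few lines: fit a spherical boundary around the in-control data with k-means and flag samples whose distance from the centre exceeds the control limit. The single-centre case and the quantile-based radius below are simplifying assumptions, and the paper itself provides MATLAB rather than Python code.

```python
import numpy as np
from sklearn.cluster import KMeans

class KMChart:
    def fit(self, X_in_control, quantile=0.99):
        self.km = KMeans(n_clusters=1, n_init=10).fit(X_in_control)
        d = np.linalg.norm(X_in_control - self.km.cluster_centers_[0], axis=1)
        self.radius = np.quantile(d, quantile)      # control limit (sphere radius)
        return self

    def is_out_of_control(self, x):
        return np.linalg.norm(x - self.km.cluster_centers_[0]) > self.radius
```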

  1. Applying a Multidimensional Packet Classification Algorithm in a Firewall

    Institute of Scientific and Technical Information of China (English)

    夏淑华

    2011-01-01

    Along with the globalisation of Internet applications, the accompanying problems of network information security have affected users' trust in the safety and reliability of Internet services and their willingness to use them. Firewall technology is currently an important means of dealing with network security problems. On the basis of an introduction to the classification of firewall technologies, this paper studies the main idea of the AFBV algorithm. To address the problem that the algorithm's time performance may degrade severely when the multidimensional rule set is large, an optimisation and improvement are made, which effectively overcome the deficiency of the AFBV algorithm in complex network environments. Comparative experimental results obtained through simulation are also given.

  2. Classifier in Age classification

    OpenAIRE

    B. Santhi; R.Seethalakshmi

    2012-01-01

    The face is an important feature of human beings, and various properties of a person can be derived by analyzing it. The objective of this study is to design a classifier for age using facial images. Age classification is essential in many applications like crime detection, employment and face detection. The proposed algorithm contains four phases: preprocessing, feature extraction, feature selection and classification. The classification employs two class labels, namely child and old. This st...

  3. Development of a clinical decision support system using genetic algorithms and Bayesian classification for improving the personalised management of women attending a colposcopy room.

    Science.gov (United States)

    Bountris, Panagiotis; Topaka, Elena; Pouliakis, Abraham; Haritou, Maria; Karakitsos, Petros; Koutsouris, Dimitrios

    2016-06-01

    Cervical cancer (CxCa) is often the result of underestimated abnormalities in the Papanicolaou test (Pap test). The recent advances in the study of human papillomavirus (HPV) infection (the necessary cause of CxCa development) have guided clinical practice to add HPV-related tests alongside the Pap test. In this way, today, HPV DNA testing is well accepted as an ancillary test and is used for the triage of women with abnormal findings in cytology. However, these tests are either highly sensitive or highly specific, and therefore none of them provides an optimal solution. In this Letter, a clinical decision support system based on a hybrid genetic algorithm - Bayesian classification framework is presented, which combines the results of the Pap test with those of the HPV DNA test in order to exploit the benefits of each method and produce more accurate outcomes. Compared with the medical tests and their combinations (co-testing), the proposed system produced the best receiver operating characteristic curve and the most balanced combination of sensitivity and specificity in detecting high-grade cervical intraepithelial neoplasia and CxCa (CIN2+). This system may support decision-making for the improved management of women who attend a colposcopy room following a positive test result. PMID:27382484

  4. ARTIFICIAL BEE COLONY ALGORITHM INTEGRATED WITH FUZZY C-MEAN OPERATOR FOR DATA CLUSTERING

    OpenAIRE

    M. Krishnamoorthi; A.M.Natarajan

    2013-01-01

    The clustering task aims at the unsupervised classification of patterns into different groups. To enhance the quality of results, the emerging swarm-based algorithms have nowadays become an alternative to conventional clustering methods. In this study, an optimization method based on a swarm intelligence algorithm is proposed for the purpose of clustering. The significance of the proposed algorithm is that it uses a Fuzzy C-Means (FCM) operator in the Artificial Bee Colony (ABC) algorithm. The ...

  5. Efficient-cutting packet classification algorithm based on a statistical decision tree

    Institute of Scientific and Technical Information of China (English)

    陈立南; 刘阳; 马严; 黄小红; 赵庆聪; 魏伟

    2014-01-01

    Packet classification algorithms based on decision trees are easy to implement and widely employed in high-speed packet classification. The primary objective when constructing a decision tree is minimal storage and search time complexity. An improved decision-tree algorithm, HyperEC, is proposed based on statistics and evaluation of filter sets. HyperEC is a multidimensional packet classification algorithm that allows a trade-off between storage and throughput while constructing the decision tree. Because it is not sensitive to IP address length, it is suitable for IPv6 packet classification as well as IPv4. The algorithm applies a natural and performance-guided decision-making process: the storage budget is preset and then the best throughput is achieved. The results show that HyperEC outperforms the HiCuts and HyperCuts algorithms, improving storage and throughput performance, and that it scales to large filter sets.

  6. A comparison of selected classification algorithms for mapping bamboo patches in lower Gangetic plains using very high resolution WorldView 2 imagery

    Science.gov (United States)

    Ghosh, Aniruddha; Joshi, P. K.

    2014-02-01

    Bamboo is used by different communities in India to develop indigenous products, maintain livelihoods and sustain life. The Indian National Bamboo Mission focuses on the evaluation, monitoring and development of bamboo as an important plant resource, so knowledge of the spatial distribution of bamboo becomes necessary in this context. The present study attempts to map bamboo patches using very high resolution (VHR) WorldView 2 (WV 2) imagery in parts of South 24 Parganas, West Bengal, India using both pixel- and object-based approaches. A combined layer of pan-sharpened multi-spectral (MS) bands, the first 3 principal components (PC) of these bands and seven second-order texture measures based on Gray Level Co-occurrence Matrices (GLCM) of the first three PCs was used as the set of input variables. For pixel-based image analysis (PBIA), recursive feature elimination (RFE) based feature selection was carried out to identify the most important input variables. Results of the feature selection indicate that the 10 most important variables include PC 1, PC 2 and their GLCM means along with 6 MS bands. Three different sets of predictor variables (the 5 and 10 most important variables and all 32 variables) were classified with Support Vector Machine (SVM) and Random Forest (RF) algorithms. The producer accuracy of bamboo was found to be highest when the 10 most important variables selected by RFE were classified with SVM (82%). However, object-based image analysis (OBIA) achieved higher classification accuracy than PBIA using the same 32 variables, but with fewer training samples. Using the object-based SVM classifier, the producer accuracy of bamboo reached 94%. The significance of this study is that the present framework is capable of accurately identifying bamboo patches as well as detecting other tree species in a tropical region with heterogeneous land use land cover (LULC), which could further aid the mandate of the National Bamboo Mission and related programs.
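    The pixel-based workflow (recursive feature elimination over the stacked variables, then an SVM on the retained subset) might look roughly as follows; the linear kernel is chosen here only so that RFE can rank features, whereas the study's SVM configuration may differ.

```python
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

svm = SVC(kernel='linear')
selector = RFE(estimator=svm, n_features_to_select=10)
# X: (n_pixels, 32) stack of pan-sharpened bands, PCs and GLCM textures; y: LULC class labels
# selector.fit(X, y)
# X_reduced = selector.transform(X)
# svm.fit(X_reduced, y)        # final classifier on the 10 retained variables
```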

  7. Cloud classification algorithm for CloudSat satellite data based on a fuzzy logic method

    Institute of Scientific and Technical Information of China (English)

    任建奇; 严卫; 杨汉乐; 施健康

    2011-01-01

    Following a role-based classification approach, a cloud classification algorithm based on fuzzy logic was established to improve the accuracy of cloud classification for spaceborne millimeter-wave radar detection. Characteristic parameters were extracted from the CloudSat 2B-GEOPROF-LIDAR cloud geometrical profile product, derived from CloudSat CPR (cloud profiling radar) and CALIPSO lidar data and other related data, processed with fuzzy logic, and used to classify the clouds. The classification results are consistent with the cloud classification product released by the CloudSat Data Processing Center (DPC) and with CALIPSO lidar observations.

  8. Efficient Fingercode Classification

    Science.gov (United States)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems, e.g. systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by firstly classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation, which is an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research firstly investigates the various fast search algorithms in vector quantization (VQ) and their potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.

  9. Product Classification in Supply Chain

    OpenAIRE

    Xing, Lihong; Xu, Yaoxuan

    2010-01-01

    Oriflame is a famous international direct-sale cosmetics company with a complicated supply chain operation, but it lacks a product classification system. It is vital to design a product classification method in order to support Oriflame's global supply planning and improve supply chain performance. This article aims to investigate and design multi-criteria product classification, propose the classification model, suggest application areas for the product classification results and intro...

  10. Performance Comparison of Musical Instrument Family Classification Using Soft Set

    Directory of Open Access Journals (Sweden)

    Saima Anwar Lashari

    2012-08-01

    Nowadays it appears essential to design automatic and efficacious classification algorithms for musical instruments. Automatic classification of musical instruments is performed by extracting relevant features from the audio samples; afterwards a classification algorithm is used on these extracted features to identify into which of a set of classes the sound sample is most likely to fit. The aim of this paper is to demonstrate the viability of soft sets for audio signal classification. A dataset of 104 pieces (single monophonic notes) of traditional Pakistani musical instruments was designed. Feature extraction is done using two feature sets, namely perception-based features and mel-frequency cepstral coefficients (MFCCs). Then, two different classification techniques are applied, which are soft set (comparison table) and fuzzy soft set (similarity measurement). Experimental results show that both classifiers can perform well on numerical data; however, soft set achieved accuracy of up to 94.26% with the best generated dataset. Consequently, these promising results provide new possibilities for soft sets in classifying musical instrument sounds. Based on the analysis of the results, this study offers a new view on automatic instrument classification

  11. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    Science.gov (United States)

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  12. COMBINATION OF GENETIC ALGORITHM AND DEMPSTER-SHAFER THEORY OF EVIDENCE FOR LAND COVER CLASSIFICATION USING INTEGRATION OF SAR AND OPTICAL SATELLITE IMAGERY

    OpenAIRE

    Chu, H T; Ge, L

    2012-01-01

    The integration of different kinds of remotely sensed data, in particular Synthetic Aperture Radar (SAR) and optical satellite imagery, is considered a promising approach for land cover classification because of the complimentary properties of each data source. However, the challenges are: how to fully exploit the capabilities of these multiple data sources, which combined datasets should be used and which data processing and classification techniques are most appropriate in order to achieve ...

  13. 基于粗集分类和遗传算法的知识库集成方法%The Methods of Knowledge Database Integration Based on the Rough Set Classification and Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    郭平; 程代杰

    2003-01-01

    As the basis of intelligent systems, it is very important to guarantee the consistency and non-redundancy of knowledge in a knowledge database. Because of the variety of knowledge sources, it is necessary to handle knowledge with redundancy, inclusion and even contradiction during the integration of knowledge databases. This paper researches an integration method for multiple knowledge databases. Firstly, it finds the inconsistent knowledge sets between the knowledge databases by rough set classification and presents a method for eliminating the inconsistency using test data. Then, it regards the consistent knowledge sets as the initial population of a genetic calculation and constructs a genetic fitness function based on the accuracy, practicability and spreadability of the knowledge representation to carry out the genetic calculation. Lastly, classifying the results of the genetic calculation reduces the knowledge redundancy of the knowledge database. This paper also presents a framework for knowledge database integration based on rough set classification and a genetic algorithm.

  14. Supernova Photometric Classification Challenge

    CERN Document Server

    Kessler, Richard; Jha, Saurabh; Kuhlmann, Stephen

    2010-01-01

    We have publicly released a blinded mix of simulated SNe, with types (Ia, Ib, Ic, II) selected in proportion to their expected rate. The simulation is realized in the griz filters of the Dark Energy Survey (DES) with realistic observing conditions (sky noise, point spread function and atmospheric transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia type SNe are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated from the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan Digital Sky Survey-II (SDSS-II). We challenge scientists to run their classification algorithms and report a type for each SN. A spectroscopically confirmed subset is provided for training. The goals of this challenge are to (1) learn the relative strengths and weaknesses of the different classification algorithms, (2) use the results to improve classification algorithms, and (3) understand what spectroscopically confirmed sub-...

  15. Significance of Classification Techniques in Prediction of Learning Disabilities

    Directory of Open Access Journals (Sweden)

    Julie M. David

    2010-10-01

    The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in the prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD-affected child. In this paper, the J48 algorithm is used for constructing the decision tree and the K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.

  16. Significance of Classification Techniques in Prediction of Learning Disabilities

    CERN Document Server

    Balakrishnan, Julie M David And Kannan

    2010-01-01

    The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.

  17. The Effect of Adaptive Gain and Adaptive Momentum in Improving Training Time of Gradient Descent Back Propagation Algorithm on Classification Problems

    OpenAIRE

    Norhamreeza Abdul Hamid; Nazri Mohd Nawi; Rozaida Ghazali

    2011-01-01

    The back propagation algorithm has been successfully applied to a wide range of practical problems. Since this algorithm uses a gradient descent method, it has some limitations, which are slow learning convergence velocity and easy convergence to local minima. The convergence behaviour of the back propagation algorithm depends on the choice of initial weights and biases, network topology, learning rate, momentum, activation function and value for the gain in the activation function. Previous res...

  18. Improving Classification Performance with Single-category Concept Match

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    In contrast to increasingly complicated algorithms, this paper presents a new classification algorithm based on single-category concept match. It also introduces the method for finding such concepts, which is important to the algorithm. Experimental results show that it can improve classification precision and accelerate classification speed to some extent.

  19. Nominal classification

    OpenAIRE

    Senft, G.

    2007-01-01

    This handbook chapter summarizes some of the problems of nominal classification in language, presents and illustrates the various systems or techniques of nominal classification, and points out why nominal classification is one of the most interesting topics in Cognitive Linguistics.

  20. Future aims of biophysical models

    International Nuclear Information System (INIS)

    The present workshop has demonstrated that it is easy to produce models, but frequently difficult to define their purposes and aims. A reliable prediction of the future aims of biophysical modelling may be nearly impossible; it is less difficult to outline those uses of modelling that are relevant to pragmatic applications in radiation therapy and in radiation protection. The applications will also determine the general direction of development of the less empirical models that may facilitate the understanding of the mechanisms of radiation action and that may ultimately lead back to applications in radiation therapy and radiation protection. This paper addresses likely aims for modelling in the three areas of radiation therapy, radiation protection and cellular radiation effects. (author)

  1. The Effect of Adaptive Gain and Adaptive Momentum in Improving Training Time of Gradient Descent Back Propagation Algorithm on Classification Problems

    Directory of Open Access Journals (Sweden)

    Norhamreeza Abdul Hamid

    2011-01-01

    The back propagation algorithm has been successfully applied to a wide range of practical problems. Since this algorithm uses a gradient descent method, it has some limitations: slow learning convergence and easy convergence to local minima. The convergence behaviour of the back propagation algorithm depends on the choice of initial weights and biases, network topology, learning rate, momentum, activation function and the value of the gain in the activation function. Previous researchers demonstrated that in the 'feed forward' algorithm, the slope of the activation function is directly influenced by a parameter referred to as 'gain'. This research proposes an algorithm for improving the performance of the current working back propagation algorithm, the Gradient Descent Method with Adaptive Gain, by changing the momentum coefficient adaptively for each node. The influence of the adaptive momentum together with the adaptive gain on the learning ability of a neural network is analysed. Multilayer feed-forward neural networks have been assessed, and a physical interpretation of the relationship between the momentum value, the learning rate and the weight values is given. The efficiency of the proposed algorithm was compared with the conventional Gradient Descent Method and the current Gradient Descent Method with Adaptive Gain by means of simulation on three benchmark problems. In learning the patterns, the simulation results demonstrate that the proposed algorithm converged faster: on the Wisconsin breast cancer data set with an improvement ratio of nearly 1.8, 6.6 on the Mushroom problem, and 36% better on the Soybean data set. The results clearly show that the proposed algorithm significantly improves the learning speed of the current gradient descent back-propagation algorithm.
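    A toy sketch of the update rule being discussed (a sigmoid with a per-node gain and a momentum coefficient adapted per node) is shown below; the adaptation heuristic is illustrative and is not the paper's exact rule.

```python
import numpy as np

def sigmoid(x, gain):
    return 1.0 / (1.0 + np.exp(-gain * x))          # gain scales the slope of the activation

def momentum_update(w, prev_dw, grad, lr, momentum):
    dw = -lr * grad + momentum * prev_dw            # gradient descent with a momentum term
    return w + dw, dw

def adapt_momentum(momentum, grad, prev_grad, step=0.01, lo=0.0, hi=0.95):
    # Grow momentum where successive gradients agree in sign, shrink it otherwise.
    agree = np.sign(grad) == np.sign(prev_grad)
    return np.clip(momentum + np.where(agree, step, -step), lo, hi)
```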

  2. Automatic classification of Deep Web sources based on the KNN algorithm

    Institute of Scientific and Technical Information of China (English)

    张智; 顾韵华

    2011-01-01

    To meet the needs of Deep Web querying, an algorithm for the classification of Deep Web sources based on KNN is put forward. The algorithm extracts form features from Web pages and normalizes the form feature vectors, then classifies Deep Web pages by computing distances. The experimental results show that the algorithm improves precision and recall.

  3. Classification of smooth Fano polytopes

    DEFF Research Database (Denmark)

    Øbro, Mikkel

    A simplicial lattice polytope containing the origin in the interior is called a smooth Fano polytope if the vertices of every facet form a basis of the lattice. The study of smooth Fano polytopes is motivated by their connection to toric varieties. The thesis concerns the classification of smooth Fano polytopes up to isomorphism. A smooth Fano d-polytope can have at most 3d vertices; in the case of 3d vertices an explicit classification is known. The thesis contains the classification in the case of 3d-1 vertices. Classifications of smooth Fano d-polytopes for fixed d previously existed only for small d. In the thesis an algorithm for the classification of smooth Fano d-polytopes for any given d is presented. The algorithm has been implemented and used to obtain the complete classification for d at most 8.

  4. Comparison of Different Classification Techniques Using WEKA for Hematological Data

    Directory of Open Access Journals (Sweden)

    Md. Nurul Amin

    2015-03-01

    Medical professionals need a reliable prediction methodology to diagnose hematological data comments. There are large quantities of information about patients and their medical conditions. Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information, and data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Weka is a data mining tool that contains many machine learning algorithms and provides the facility to classify data with various algorithms. Classification is an important data mining technique with broad applications: it classifies data of various kinds, is used in every field of our life, and assigns each item in a set of data to one of a predefined set of classes or groups. In this paper we study various classification algorithms. The main aim is to compare different classification algorithms using the Waikato Environment for Knowledge Analysis (in short, WEKA) and find out which algorithm is most suitable for users working on hematological data. Using the proposed model, new doctors or patients can predict hematological data comments; a mobile app was also developed that can easily diagnose hematological data comments. The best algorithm on the hematological data is the J48 classifier, with an accuracy of 97.16% and a total model-building time of 0.03 seconds; the Naïve Bayes classifier has the lowest average error, at 29.71%, compared to the others.

  5. Opposite Degree Algorithm and Its Applications

    Directory of Open Access Journals (Sweden)

    Xiao-Guang Yue

    2015-12-01

    The opposite degree (OD) algorithm is an intelligent algorithm proposed by Yue Xiaoguang et al. The opposite degree algorithm is mainly based on the concept of opposite degree, combined with ideas from neural network design, genetic algorithms and clustering analysis. The OD algorithm is divided into two sub-algorithms, namely the opposite degree - numerical computation (OD-NC) algorithm and the opposite degree - classification computation (OD-CC) algorithm.

  6. Opposite Degree Algorithm and Its Applications

    OpenAIRE

    Xiao-Guang Yue

    2015-01-01

    The opposite (Opposite Degree, referred to as OD) algorithm is an intelligent algorithm proposed by Yue Xiaoguang et al. Opposite degree algorithm is mainly based on the concept of opposite degree, combined with the idea of design of neural network and genetic algorithm and clustering analysis algorithm. The OD algorithm is divided into two sub algorithms, namely: opposite degree - numerical computation (OD-NC) algorithm and opposite degree - Classification computation (OD-CC) algorithm.

  7. Improved ant colony algorithm for determining reactive compensation classification capacity

    Institute of Scientific and Technical Information of China (English)

    董张卓; 李哲; 赵元鹏

    2013-01-01

    Traditional reactive compensation classification methods cannot make effective use of reactive load history information, so over-compensation and under-compensation occur easily. An optimization model that makes effective use of historical reactive load data to determine the classification capacity is proposed, and the model is solved using an improved ant colony algorithm. The pheromone is corrected in time by setting a pheromone threshold, and by searching in both vertical and horizontal directions the efficiency of the ants' search is improved. The algorithm is better protected from local optima, and its efficiency is increased several times.

  8. Random Forests for Poverty Classification

    OpenAIRE

    Ruben Thoplan

    2014-01-01

    This paper applies a relatively novel method in data mining to address the issue of poverty classification in Mauritius. The random forests algorithm is applied to the census data with a view to improving classification accuracy for poverty status. The analysis shows that the number of hours worked, age, education and sex are the most important variables in the classification of the poverty status of an individual. In addition, a clear poverty-gender gap is identified as women have higher chance...

  9. Quantum computing for pattern classification

    OpenAIRE

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2014-01-01

    It is well known that for certain tasks, quantum computing outperforms classical computing. A growing number of contributions try to use this advantage in order to improve or extend classical machine learning algorithms by methods of quantum information theory. This paper gives a brief introduction into quantum machine learning using the example of pattern classification. We introduce a quantum pattern classification algorithm that draws on Trugenberger's proposal for measuring the Hamming di...

  10. China Aims to Promote Import

    Institute of Scientific and Technical Information of China (English)

    Wang Ting

    2010-01-01

    With the theme of "An Opening Market and Global Trade", and aimed at promoting communication and exchange among governments, industries and businesses to achieve mutual benefit and a win-win situation, nearly 300 representatives from the relevant departments of the Chinese government, foreign embassies in China, industrial associations and major enterprises, as well as well-known Chinese and foreign experts and scholars, were invited to take part in the forum and share their views on the Chinese market and foreign trade policies.

  11. A Fuzzy Logic Based Sentiment Classification

    Directory of Open Access Journals (Sweden)

    J.I.Sheeba

    2014-07-01

    Sentiment classification aims to detect information such as opinions and explicit or implicit feelings expressed in text. Most existing approaches are able to detect either explicit or implicit expressions of sentiment in the text, but not both. The proposed framework detects both implicit and explicit expressions available in meeting transcripts, classifies positive, negative and neutral words, and also identifies the topic of the particular meeting transcript using fuzzy logic. This paper aims to add some additional features to improve the classification method. The quality of the sentiment classification is improved using the proposed fuzzy logic framework, which includes features such as fuzzy rules and the Fuzzy C-means algorithm. The quality of the output is evaluated using parameters such as precision, recall and f-measure, and the Fuzzy C-means clustering is measured in terms of purity and entropy. The dataset was validated using the 10-fold cross-validation method, and a 95% confidence interval between the accuracy values was observed. Finally, the proposed fuzzy logic method produced more than 85% accurate results, and the error rate is very low compared to existing sentiment classification techniques.

  12. China's educational aim and theory

    Science.gov (United States)

    Guang-Wei, Zou

    1985-12-01

    The aim and theory of Chinese socialist education is to provide scientific and technological knowledge so as to develop the productive forces and to meet the demands of the socialist cause. Since education is the main vehicle towards modernizing science and technology, any investment in education is viewed as being productive as it feeds directly into economics. Faced with the demands of industrial and agricultural production, training a technical as well as a labour force becomes crucial. This is made possible by the provision of two labour systems for workers both from rural as well as urban areas and by two kinds of educational systems for both urban and rural students. Chinese educational theory is seen as a fusion of principles from its own educational legacy with those of Marxist-Leninist principles.

  13. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-09-07

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real-world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.
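    Under the notation sketched here (which is assumed, not copied from the paper), an objective of this kind combines a classification loss, a complexity penalty and a mutual-information reward:

```latex
% loss + complexity penalty - mutual-information reward (illustrative form)
\min_{w}\; \sum_{i=1}^{n} \ell\bigl(y_i, f_w(x_i)\bigr) + \lambda \lVert w \rVert^2
          - \gamma\, I\bigl(f_w(X);\, Y\bigr),
\qquad I\bigl(f_w(X); Y\bigr) = H(Y) - H\bigl(Y \mid f_w(X)\bigr)
```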

  14. Absolute calibration of the colour index and O4 absorption derived from Multi-AXis (MAX-) DOAS measurements and their application to a standardised cloud classification algorithm

    OpenAIRE

    Wagner, Thomas; Beirle, Steffen; Remmers, Julia; Shaiganfar, Reza; Wang, Yang

    2016-01-01

    A method is developed for the calibration of the colour index (CI) and the O4 absorption derived from Differential Optical Absorption Spectroscopy (DOAS) measurements of scattered sunlight. The method is based on the comparison of measurements and radiative transfer simulations for well-defined atmospheric conditions and viewing geometries. Calibrated measurements of the CI and the O4 absorption are important for the detection and classification of clouds from MAX-DOAS observations. Such info...

  15. Band Selection and Classification of Hyperspectral Images using Mutual Information: An algorithm based on minimizing the error probability using the inequality of Fano

    OpenAIRE

    Sarhrouni, ELkebir; Hammouch, Ahmed; Aboutajdine, Driss

    2012-01-01

    A hyperspectral image is a collection of more than a hundred images, called bands, of the same region, taken at juxtaposed frequencies. The reference image of the region is called the Ground Truth map (GT). The problem is how to find the good bands with which to classify the pixels of the region, because the bands can be not only redundant but also a source of confusion, thereby decreasing the accuracy of classification. Some methods use Mutual Information (MI) and a threshold to select relevant bands. ...

  16. Research on an efficient classification mining algorithm for big data features of cloud computing equipment

    Institute of Scientific and Technical Information of China (English)

    王昌辉

    2015-01-01

    Big data classification mining in cloud computing equipment is the basis of practical pattern recognition and intelligent control. Traditional approaches use a topological-structure grid-partition mining algorithm, which cannot effectively extract the detailed characteristics of big data, so its classification accuracy is poor. An efficient classification mining algorithm for the big data features of cloud computing equipment is proposed, based on fractional Fourier transform feature matching and K-L classification. The big data storage mechanism of cloud computing equipment is analysed; the fractional Fourier transform is used for big data feature extraction and feature matching; and, on the basis of the K-L transform, the optimal path is chosen to guide the categorical space. A K-L big-data feature classifier is then constructed to realize classification mining in cloud computing equipment. The simulation results show that this algorithm achieves high-accuracy feature classification mining with less energy consumption and higher efficiency, and can realize efficient classification mining for the big data features of cloud computing equipment.

  17. Action Potential Classification Based on PCA and an Improved K-means Algorithm

    Institute of Scientific and Technical Information of China (English)

    师黎; 杨振兴; 王治忠; 王岩

    2011-01-01

    Neural signals recorded by a microelectrode array are often a mixture of the action potentials of several neurons near the electrode and a large amount of background noise. Research on the information-processing mechanisms of the nervous system and on neural coding and decoding requires knowledge of every related neuron's action potentials, so each neuron's action potentials must be separated from the recorded signal. This paper proposes a method based on Principal Component Analysis (PCA) combined with an improved K-means algorithm for action potential classification: the action potential features are extracted by PCA and the classification is implemented by the improved K-means algorithm. Experimental results show that the method reduces the feature dimensionality of the action potentials and the dependence of the K-means algorithm on the initial cluster centers, and increases the accuracy and stability of the classification results. In particular, when processing signals with a low Signal to Noise Ratio (SNR), the classification accuracy still reaches the desired level.
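    A minimal sketch of the pipeline described above (PCA feature extraction of aligned spike waveforms followed by k-means clustering) is given below; the specific improvement to the K-means initialisation proposed in the paper is not reproduced.

```python
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def sort_spikes(waveforms, n_units, n_components=3):
    """waveforms: (n_spikes, n_samples) array of detected action potentials."""
    features = PCA(n_components=n_components).fit_transform(waveforms)
    labels = KMeans(n_clusters=n_units, n_init=10).fit_predict(features)
    return labels    # one putative neuron identity per spike
```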

  18. CCM: A Text Classification Method by Clustering

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock;

    2011-01-01

    In this paper, a new Cluster based Classification Model (CCM) for suspicious email detection and other text classification tasks is presented. Comparative experiments of the proposed model against traditional classification models and the boosting algorithm are also discussed. Experimental results show that the CCM outperforms traditional classification models as well as the boosting algorithm for the task of suspicious email detection on a terrorism domain email dataset and topic categorization on the Reuters-21578 and 20 Newsgroups datasets. The overall finding is that applying a cluster based approach to text classification tasks simplifies the model and at the same time increases the accuracy.
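
    The record does not give implementation details, but a cluster-based classifier in the spirit of CCM can be sketched as follows: cluster the training documents of each class and assign a test document to the class of its nearest cluster centroid. All names, parameters and the toy data below are illustrative assumptions, not the authors' code.

```python
# Hypothetical cluster-based text classifier: per-class K-means centroids,
# nearest-centroid assignment at prediction time (a sketch, not CCM itself).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

class ClusterBasedClassifier:
    def __init__(self, clusters_per_class: int = 2):
        self.clusters_per_class = clusters_per_class
        self.vectorizer = TfidfVectorizer()
        self.centroids, self.centroid_labels = None, None

    def fit(self, texts, labels):
        X = self.vectorizer.fit_transform(texts).toarray()
        cents, cent_labels = [], []
        for label in sorted(set(labels)):
            Xc = X[np.array(labels) == label]
            k = min(self.clusters_per_class, len(Xc))
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xc)
            cents.append(km.cluster_centers_)
            cent_labels.extend([label] * k)
        self.centroids = np.vstack(cents)
        self.centroid_labels = np.array(cent_labels)
        return self

    def predict(self, texts):
        X = self.vectorizer.transform(texts).toarray()
        # Assign each document to the class of its nearest cluster centroid.
        dists = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        return self.centroid_labels[dists.argmin(axis=1)]

clf = ClusterBasedClassifier().fit(
    ["meeting at noon", "attack the target", "lunch with team", "plan the strike"],
    ["normal", "suspicious", "normal", "suspicious"])
print(clf.predict(["strike the target at noon"]))
```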

  19. Text Classification using Artificial Intelligence

    CERN Document Server

    Kamruzzaman, S M

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using an artificial intelligence technique that requires fewer documents for training. Instead of using words, word relations, i.e. association rules derived from these words, are used to build the feature set from pre-classified text documents. The concept of a naïve Bayes classifier is then applied to the derived features, and finally a single concept from genetic algorithms is added for the final classification. A syste...

  20. Text Classification using Data Mining

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for automatically classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of using words, word relations, i.e. association rules derived from these words, are used to build the feature set from pre-classified text documents. The concept of a Naive Bayes classifier is then applied to the derived features, and finally a single concept from Genetic Algorithms is added for the final classification. A system based on the...

  1. Vietnamese Document Representation and Classification

    Science.gov (United States)

    Nguyen, Giang-Son; Gao, Xiaoying; Andreae, Peter

    Vietnamese is very different from English and little research has been done on Vietnamese document classification, or indeed, on any kind of Vietnamese language processing, and only a few small corpora are available for research. We created a large Vietnamese text corpus with about 18000 documents, and manually classified them based on different criteria such as topics and styles, giving several classification tasks of different difficulty levels. This paper introduces a new syllable-based document representation at the morphological level of the language for efficient classification. We tested the representation on our corpus with different classification tasks using six classification algorithms and two feature selection techniques. Our experiments show that the new representation is effective for Vietnamese categorization, and suggest that best performance can be achieved using syllable-pair document representation, an SVM with a polynomial kernel as the learning algorithm, and using Information gain and an external dictionary for feature selection.
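
    As a rough illustration of the reported best configuration (syllable-pair features, information-gain-style feature selection, and an SVM with a polynomial kernel), the scikit-learn pipeline below is a sketch under those assumptions; the toy corpus, the percentile of selected features, and the use of word bigrams as a stand-in for syllable pairs are all illustrative, not taken from the paper.

```python
# Hypothetical sketch of the best-performing setup reported above:
# syllable-pair-like features -> feature selection -> polynomial-kernel SVM.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectPercentile, mutual_info_classif
from sklearn.svm import SVC

pipeline = Pipeline([
    # Word bigrams stand in for syllable-pair features here (assumption).
    ("features", CountVectorizer(ngram_range=(2, 2), analyzer="word")),
    # Mutual information approximates the information-gain selection step.
    ("select", SelectPercentile(mutual_info_classif, percentile=50)),
    ("svm", SVC(kernel="poly", degree=2, C=1.0)),
])

# Usage (toy data in place of the 18000-document Vietnamese corpus):
docs = ["hoc sinh di hoc", "bong da viet nam", "truong hoc moi", "tran dau bong da"]
labels = ["education", "sports", "education", "sports"]
pipeline.fit(docs, labels)
print(pipeline.predict(["doi bong da truong hoc"]))
```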

  2. 基于随机主元分析算法的BBS情感分类研究%Study on BBS Sentiment Classification Based on Random Principal Component Analysis Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘林; 刘三(女牙); 刘智; 铁璐

    2014-01-01

    For Bulletin Board System (BBS) sentiment classification, an improved Random Subspace Method (RSM) is proposed. The method tries to make full use of the discriminative information in the high-dimensional feature space. In the process of generating subspaces, on the one hand, a weighting function is used to evaluate the classification ability of the features, and better ones are chosen with higher probability to ensure classification accuracy; on the other hand, the size of each subspace is enlarged and principal component analysis is used to reduce its dimension, which ensures both efficiency and diversity. Experimental results show that the proposed algorithm obtains an accuracy of 91.3%, higher than the conventional Random Subspace Method.
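
    A compact sketch of a random-subspace ensemble with per-subspace PCA, in the spirit of the improved RSM described above; the base classifier, feature weighting, ensemble size and toy data are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical random-subspace ensemble with per-subspace PCA (a sketch).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

class RandomSubspacePCA:
    def __init__(self, n_estimators=10, subspace_size=0.5, n_components=5, seed=0):
        self.n_estimators, self.subspace_size = n_estimators, subspace_size
        self.n_components, self.rng = n_components, np.random.default_rng(seed)
        self.members = []  # (feature_indices, pca, classifier)

    def fit(self, X, y, feature_weights=None):
        n_features = X.shape[1]
        k = max(1, int(self.subspace_size * n_features))
        # Weighted sampling plays the role of the paper's weighting function:
        # features judged more discriminative are picked with higher probability.
        p = None if feature_weights is None else feature_weights / feature_weights.sum()
        for _ in range(self.n_estimators):
            idx = self.rng.choice(n_features, size=k, replace=False, p=p)
            pca = PCA(n_components=min(self.n_components, k, X.shape[0]))
            Z = pca.fit_transform(X[:, idx])
            clf = LogisticRegression(max_iter=1000).fit(Z, y)
            self.members.append((idx, pca, clf))
        return self

    def predict(self, X):
        votes = np.array([clf.predict(pca.transform(X[:, idx]))
                          for idx, pca, clf in self.members])
        # Majority vote over ensemble members.
        return np.array([np.bincount(col).argmax() for col in votes.T])

# Toy usage with random data and variance-based feature weights (assumption).
rng = np.random.default_rng(1)
X, y = rng.normal(size=(60, 40)), rng.integers(0, 2, size=60)
model = RandomSubspacePCA().fit(X, y, feature_weights=X.var(axis=0))
print(model.predict(X[:5]))
```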

  4. The Research on Fatigue Driving Detection Algorithm

    Directory of Open Access Journals (Sweden)

    Zhui Lin

    2013-09-01

    Full Text Available Research on driver fatigue detection systems, which aim to ensure operational safety and to reduce traffic accidents caused by human factors, has been a major subject in transportation safety. Obtaining the driver's image with a camera offers considerable advantages. We propose an efficient tracking and detection algorithm with an appearance model based on Haar-like features, addressing the accuracy and robustness of eye-movement tracking and the trade-off between real-time tracking and the accuracy of fatigue detection. First, the PERCLOS algorithm is adopted to analyze and determine whether a person is fatigued. Second, the AdaBoost algorithm is applied for fast detection and is implemented on an FPGA. Third, we propose a compressed-sample tracking algorithm, which compresses image samples using a sparse measurement matrix and trains the classifier online. The algorithm runs in real time and is implemented on an ARM plus FPGA platform. Experimental results show that the algorithm has high recognition accuracy and robust performance in a real train driving environment, under nonlinear eye tracking, illumination change, multi-scale variations, driver head movement and pose variation.
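
    PERCLOS is commonly defined as the percentage of time within a window during which the eyes are at least 80% closed; the sketch below computes it from a per-frame eye-closure signal and applies an illustrative fatigue threshold. The window length and threshold values are assumptions, not taken from the paper.

```python
# Hypothetical PERCLOS computation over a sliding window of per-frame
# eye-openness measurements (1.0 = fully open, 0.0 = fully closed).
from collections import deque

class PerclosMonitor:
    def __init__(self, window_frames=900, closure_threshold=0.2, fatigue_threshold=0.4):
        # 900 frames ~ 30 s at 30 fps; values are illustrative assumptions.
        self.frames = deque(maxlen=window_frames)
        self.closure_threshold = closure_threshold   # openness below this counts as "closed"
        self.fatigue_threshold = fatigue_threshold   # PERCLOS above this flags fatigue

    def update(self, eye_openness: float) -> bool:
        """Add one frame's eye-openness value and return True if fatigue is flagged."""
        self.frames.append(eye_openness)
        closed = sum(1 for v in self.frames if v < self.closure_threshold)
        perclos = closed / len(self.frames)
        return perclos > self.fatigue_threshold

# Usage: feed per-frame eye-openness estimates from the eye tracker.
monitor = PerclosMonitor(window_frames=10)
readings = [0.9, 0.8, 0.1, 0.05, 0.1, 0.1, 0.9, 0.1, 0.05, 0.1]
print([monitor.update(r) for r in readings])
```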

  5. Automatic Arabic Text Classification

    OpenAIRE

    Al-harbi, S; Almuhareb, A.; Al-Thubaity , A; Khorsheed, M. S.; Al-Rajeh, A.

    2008-01-01

    Automated document classification is an important text mining task, especially with the rapid growth of the number of online documents in the Arabic language. Text classification aims to automatically assign a text to a predefined category based on linguistic features. Such a process has many useful applications including, but not restricted to, e-mail spam detection, web page content filtering, and automatic message routing. This paper presents the results of experiments on documen...

  6. Decoding the Encoding of Functional Brain Networks: an fMRI Classification Comparison of Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and Sparse Coding Algorithms

    OpenAIRE

    Xie, Jianwen; Douglas, Pamela K.; Wu, Ying Nian; Brody, Arthur L.; Anderson, Ariana E.

    2016-01-01

    Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet mathematical constraints such as sparse coding and positivity both provide alternate biologically-plausible frameworks for generating brain networks. Non-negative Matrix Factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking,...

  7. Feature selection using a genetic algorithm-based hybrid approach

    Directory of Open Access Journals (Sweden)

    Luis Felipe Giraldo

    2010-04-01

    Full Text Available The present work proposes a hybrid feature selection model aimed at reducing training time whilst maintaining classification accuracy. The model includes adjusting a decision tree for producing feature subsets. Such subsets' statistical relevance was evaluated from their resulting classification error. Evaluation involved using the k-nearest neighbors' rule. Dimension reduction techniques usually assume an element of error; however, the hybrid selection model was tuned by means of genetic algorithms in this work. They simultaneously minimise the number of features and the training error. Contrasting with conventional methods, this model also led to quantifying the relevance of each training set's features. The model was tested on speech signals (hypernasality classification) and ECG identification (ischemic cardiopathy). In the case of speech signals, the database consisted of 90 children (45 recordings per sample); the ECG database had 100 electrocardiograph records (50 recordings per sample). Results showed average reduction rates of up to 88%, classification error being less than 6%.
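
    A compact sketch of genetic-algorithm feature selection with a k-nearest-neighbour fitness function, in the spirit of the hybrid model described above; the fitness weighting between error and feature count, population size, and dataset are illustrative assumptions, not the authors' setup.

```python
# Hypothetical GA feature selection: binary chromosomes select feature subsets,
# fitness combines k-NN cross-validated accuracy and subset size (a sketch).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_features = X.shape[1]

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(5), X[:, mask.astype(bool)], y, cv=3).mean()
    # Penalize large subsets so the GA minimizes features and error together.
    return acc - 0.1 * mask.sum() / n_features

population = rng.integers(0, 2, size=(20, n_features))
for generation in range(15):
    scores = np.array([fitness(ind) for ind in population])
    parents = population[np.argsort(scores)[-10:]]            # selection
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10)], parents[rng.integers(10)]
        cut = rng.integers(1, n_features)                     # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(n_features) < 0.02                  # mutation
        children.append(np.where(flip, 1 - child, child))
    population = np.vstack([parents, children])

best = population[np.argmax([fitness(ind) for ind in population])]
print("selected features:", int(best.sum()), "of", n_features)
```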

  8. ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURISTIC ALGORITHMS

    Directory of Open Access Journals (Sweden)

    Roghayeh Saneifar

    2015-11-01

    Full Text Available With the increasing use of data mining techniques to improve the operation of educational systems, Educational Data Mining has been introduced as a new and fast-growing research area. Educational Data Mining aims to analyze data in educational environments in order to solve educational research problems. In this paper a new associative classification technique is proposed to predict students' final performance. Unlike several machine learning approaches such as ANNs and SVMs, associative classifiers maintain interpretability along with high accuracy. In this research work, we have employed Honeybee Colony Optimization and Particle Swarm Optimization to extract association rules for student performance prediction as a multi-objective classification problem. Results indicate that the proposed swarm-based algorithm outperforms well-known classification techniques on the student performance prediction problem.

  9. Detection, identification and classification of defects using ANN and a robotic manipulator of 2 G.L. (Kohonen and MLP algorithms)

    International Nuclear Information System (INIS)

    The ultrasonic inspection technique has seen sustained growth since the 1980s. It has several advantages compared with the contact technique. A flexible and low-cost solution based on virtual instrumentation is presented for the servomechanism (manipulator) control of the ultrasound inspection transducer in the immersion technique. The developed system uses a personal computer (PC), a Windows operating system, virtual instrumentation software, DAQ cards and a GPIB card. As a solution to the detection, classification and evaluation of defects, an Artificial Neural Network technique is proposed. It consists of the characterization and interpretation of acoustic signals (echoes) acquired by the immersion ultrasonic inspection technique. Two neural networks are proposed: Kohonen and Multilayer Perceptron (MLP). With these techniques, complex non-linear processes can be modeled with great precision. The 2-degree-of-freedom manipulator control, the data acquisition and the network training have been carried out in a virtual instrument environment using LabVIEW and Data Engine. (Author) 14 refs

  10. Supernova Photometric Lightcurve Classification

    Science.gov (United States)

    Zaidi, Tayeb; Narayan, Gautham

    2016-01-01

    This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data was primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernovae types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II). Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).
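
    A rough sketch of the pipeline described in the abstract: reduce a fixed-length light-curve representation with PCA, then train a Random Forest to separate Type I from Type II. The synthetic data and all parameter values are illustrative assumptions, not the poster's configuration.

```python
# Hypothetical sketch: PCA-compressed light-curve features fed to a
# Random Forest for binary supernova-type classification (Type I vs II).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(42)
# Stand-in for resampled light curves: 400 objects x 100 flux samples.
light_curves = rng.normal(size=(400, 100))
types = rng.integers(0, 2, size=400)  # 0 = Type I, 1 = Type II (synthetic labels)

X_train, X_test, y_train, y_test = train_test_split(
    light_curves, types, test_size=0.25, random_state=0)

model = make_pipeline(
    PCA(n_components=20),                       # dimensionality reduction step
    RandomForestClassifier(n_estimators=200, random_state=0))
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```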

  11. Towards automatic classification of all WISE sources

    Science.gov (United States)

    Kurcz, A.; Bilicki, M.; Solarz, A.; Krupa, M.; Pollo, A.; Małek, K.

    2016-07-01

    Context. The Wide-field Infrared Survey Explorer (WISE) has detected hundreds of millions of sources over the entire sky. Classifying them reliably is, however, a challenging task owing to degeneracies in WISE multicolour space and low levels of detection in its two longest-wavelength bandpasses. Simple colour cuts are often not sufficient; for satisfactory levels of completeness and purity, more sophisticated classification methods are needed. Aims: Here we aim to obtain comprehensive and reliable star, galaxy, and quasar catalogues based on automatic source classification in full-sky WISE data. This means that the final classification will employ only parameters available from WISE itself, in particular those which are reliably measured for the majority of sources. Methods: For the automatic classification we applied a supervised machine learning algorithm, support vector machines (SVM). It requires a training sample with relevant classes already identified, and we chose to use the SDSS spectroscopic dataset (DR10) for that purpose. We tested the performance of two kernels used by the classifier, and determined the minimum number of sources in the training set required to achieve stable classification, as well as the minimum dimension of the parameter space. We also tested SVM classification accuracy as a function of extinction and apparent magnitude. Thus, the calibrated classifier was finally applied to all-sky WISE data, flux-limited to 16 mag (Vega) in the 3.4 μm channel. Results: By calibrating on the test data drawn from SDSS, we first established that a polynomial kernel is preferred over a radial one for this particular dataset. Next, using three classification parameters (W1 magnitude, W1-W2 colour, and a differential aperture magnitude) we obtained very good classification efficiency in all the tests. At the bright end, the completeness for stars and galaxies reaches ~95%, deteriorating to ~80% at W1 = 16 mag, while for quasars it stays at a level of

  12. From Local Patterns to Classification Models

    Science.gov (United States)

    Bringmann, Björn; Nijssen, Siegfried; Zimmermann, Albrecht

    Using pattern mining techniques for building a predictive model is currently a popular topic of research. The aim of these techniques is to obtain classifiers of better predictive performance as compared to greedily constructed models, as well as to allow the construction of predictive models for data not represented in attribute-value vectors. In this chapter we provide an overview of recent techniques we developed for integrating pattern mining and classification tasks. The techniques span the entire range from approaches that select relevant patterns from a previously mined set for propositionalization of the data, over inducing pattern-based rule sets, to algorithms that integrate pattern mining and model construction. We provide an overview of the algorithms which are most closely related to our approaches in order to put our techniques in context.

  13. Significance of Classification Techniques in Prediction of Learning Disabilities

    Directory of Open Access Journals (Sweden)

    Julie M. David

    2010-10-01

    Full Text Available The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in the prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD-affected child. In this paper, the J48 algorithm is used for constructing the decision tree and the K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.

  14. An Improved K-means Clustering Algorithm

    OpenAIRE

    Xiuchang Huang; Wei Su

    2014-01-01

    An improved k-means clustering algorithm based on the K-MEANS algorithm is proposed. This paper improves the traditional algorithm by analyzing the statistical data. After a comparison between the actual data and the simulation data, the paper shows that the improved algorithm significantly reduces classification error on the simulation data set and that its quality is much better than that of the K-MEANS algorithm. Such comparative results confirm that the improved algorithm...

  15. Snow event classification with a 2D video disdrometer - A decision tree approach

    Science.gov (United States)

    Bernauer, F.; Hürkamp, K.; Rühm, W.; Tschiersch, J.

    2016-05-01

    Snowfall classification according to crystal type or degree of riming of the snowflakes is important for many atmospheric processes, e.g. wet deposition of aerosol particles. 2D video disdrometers (2DVD) have recently proved their capability to measure microphysical parameters of snowfall. The present work has the aim of classifying snowfall according to microphysical properties of single hydrometeors (e.g. shape and fall velocity) measured by means of a 2DVD. The constraints for the shape and velocity parameters which are used in a decision tree for classification of the 2DVD measurements are derived from detailed on-site observations, combining automatic 2DVD classification with visual inspection. The developed decision tree algorithm subdivides the detected events into three classes of dominating crystal type (single crystals, complex crystals and pellets) and three classes of dominating degree of riming (weak, moderate and strong). The classification results for the crystal type were validated with an independent data set proving the unambiguousness of the classification. In addition, for three long-term events, good agreement of the classification results with independently measured maximum dimension of snowflakes, snowflake bulk density and surrounding temperature was found. The developed classification algorithm is applicable for wind speeds below 5.0 m s^-1 and has the advantage of being easily implemented by other users.

  16. A simple and robust classification tree for differentiation between benign and malignant lesions in MR-mammography

    Energy Technology Data Exchange (ETDEWEB)

    Baltzer, Pascal A.T. [Medical University Vienna, Department of Radiology, Vienna (Austria); Dietzel, Matthias [University hospital Erlangen, Department of Neuroradiology, Erlangen (Germany); Kaiser, Werner A. [University Hospital Jena, Institute of Diagnostic and Interventional Radiology 1, Jena (Germany)

    2013-08-15

    In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. (orig.)

  17. A real-valued genetic algorithm to optimize the parameters of support vector machine for classification of multiple faults in NPP

    International Nuclear Information System (INIS)

    Two parameters must be carefully predetermined when establishing an efficient support vector machine (SVM) model: the regularization parameter c, which determines the trade-off between minimizing the training error and minimizing the complexity of the model, and the kernel parameter sigma, which defines the non-linear mapping from the input space to a high-dimensional feature space and thus constructs a non-linear decision hypersurface in the input space. The purpose of this study is therefore to develop a genetic-algorithm-based SVM (GASVM) model that can automatically determine the optimal SVM parameters, c and sigma, with the highest predictive accuracy and generalization ability simultaneously. The GASVM scheme is applied to monitored data of a pressurized water reactor nuclear power plant (PWRNPP) to classify its associated faults. Compared to the standard SVM model, simulation of GASVM indicates its superiority when applied to a dataset with unbalanced classes. The GASVM scheme achieves more accurate classification with a faster learning speed. (authors)
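
    A small sketch of the general idea of evolving SVM hyperparameters with a genetic algorithm, using cross-validated accuracy as the fitness function; the dataset, population size, mutation scale, and the use of scikit-learn's RBF parameterization (gamma in place of sigma) are illustrative assumptions, not the GASVM implementation.

```python
# Hypothetical GA search over SVM hyperparameters (C, gamma), with
# cross-validated accuracy as the fitness function (a sketch, not GASVM).
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)

def fitness(log_c, log_gamma):
    clf = SVC(C=10.0 ** log_c, gamma=10.0 ** log_gamma)
    return cross_val_score(clf, X, y, cv=3).mean()

# Individuals are (log10 C, log10 gamma) pairs searched in a bounded range.
population = rng.uniform([-2, -6], [4, 0], size=(12, 2))
for generation in range(10):
    scores = np.array([fitness(c, g) for c, g in population])
    parents = population[np.argsort(scores)[-6:]]                       # selection
    children = parents[rng.integers(6, size=6)] + rng.normal(scale=0.3, size=(6, 2))  # mutation
    population = np.vstack([parents, np.clip(children, [-2, -6], [4, 0])])

best_c, best_gamma = max(population, key=lambda p: fitness(*p))
print("best C = %.3g, best gamma = %.3g" % (10.0 ** best_c, 10.0 ** best_gamma))
```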

  18. Mechanical equipment classification research of AP1000 nuclear units

    International Nuclear Information System (INIS)

    Based on the design features of AP1000, the AP1000 classification definition and seismic classification are described and analyzed. The characteristics of the AP1000 mechanical equipment classification list are summarized for safety, seismic and manufacturing classification. By comparing the AP1000 classification with the M310 classification, the issues likely to be encountered during mechanical equipment classification in the future design and construction of AP1000 nuclear power plants in China are identified. Finally, solutions are proposed for these issues. (authors)

  19. Biogeography based Satellite Image Classification

    CERN Document Server

    Panchal, V K; Kaur, Navdeep; Kundra, Harish

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization is a burgeoning nature-inspired technique for finding the optimal solution to a problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in the past using various techniques, researchers are always seeking alternative strategies for satellite image classification so that they may be prepared to select the most appropriate technique for the feature extraction task at hand. This paper is focused on classification of the satellite image of a particular land cover using the theory of Biogeography based Optimization. The original BBO algorithm does not have the inbuilt property of clustering which is required during image classification. Hence modifications have been proposed to the original algorithm and...

  20. Sentiment Analysis of Movie Reviews using Hybrid Method of Naive Bayes and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    M.Govindarajan

    2013-12-01

    Full Text Available The area of sentiment mining (also called sentiment extraction, opinion mining, opinion extraction, sentiment analysis, etc.) has seen a large increase in academic interest in the last few years. Researchers in the areas of natural language processing, data mining, machine learning, and others have tested a variety of methods of automating the sentiment analysis process. In this research work, a new hybrid classification method is proposed, based on coupling classification methods using an arcing classifier, and their performances are analyzed in terms of accuracy. A classifier ensemble was designed using Naive Bayes (NB) and a Genetic Algorithm (GA). In the proposed work, a comparative study of the effectiveness of the ensemble technique is made for sentiment classification. The ensemble framework is applied to sentiment classification tasks, with the aim of efficiently integrating different feature sets and classification algorithms to synthesize a more accurate classification procedure. The feasibility and the benefits of the proposed approaches are demonstrated by means of movie reviews, which are widely used in the field of sentiment classification. A wide range of comparative experiments are conducted and finally, some in-depth discussion is presented and conclusions are drawn about the effectiveness of the ensemble technique for sentiment classification.

  1. Classification of Medical Brain Images

    Institute of Scientific and Technical Information of China (English)

    Pan Haiwei(潘海为); Li Jianzhong; Zhang Wei

    2003-01-01

    Since brain tumors endanger people's quality of life and even their lives, the accuracy of classification becomes all the more important. Conventional classification techniques deal with datasets of characters and numbers. It is difficult, however, to apply them to datasets that include brain images and medical history (alphanumeric data), especially while guaranteeing accuracy. For such datasets, this paper combines knowledge from the medical field and improves the traditional decision tree. The new classification algorithm, guided by medical knowledge, not only adds interaction with the doctors but also enhances the quality of classification. The algorithm has been used on real brain CT images, and a valuable rule has been obtained from the experiments. This paper shows that the algorithm works well for real CT data.

  2. 基于蚁群聚集信息素的半监督文本分类算法%Semi-supervised Text Classification Algorithm Based on Ant Colony Aggregation Pheromone

    Institute of Scientific and Technical Information of China (English)

    杜芳华; 冀俊忠; 吴晨生; 吴金源

    2014-01-01

    There are many algorithms based on data distribution that effectively solve semi-supervised text categorization. However, they may perform badly when the labeled data distribution differs from that of the unlabeled data. This paper presents a semi-supervised text classification algorithm based on aggregation pheromone, which is used for species aggregation in real ants and other insects. The proposed method, which makes no assumption about the data distribution, can be applied to any kind of data distribution. In light of the aggregation pheromone, the colonies that unlabeled ants may belong to are selected with a Top-k strategy. Then the confidence of the unlabeled ants is determined by a judgment rule. Unlabeled ants with higher confidence are added to the most attractive training colony by a random selection strategy. Experiments on benchmark datasets show that this algorithm performs better than Naïve Bayes and the EM algorithm in terms of precision, recall and Macro F1.

  3. 基于类别空间多示例学习的色情图像过滤算法%Pornography Filtering Algorithm Based on Classification Space Multi-instance Learning

    Institute of Scientific and Technical Information of China (English)

    李博; 曹鹏; 栗伟; 赵大哲

    2013-01-01

    To address the problem that traditional pornography filtering algorithms are hard to apply in the complex Internet environment, a novel filtering algorithm based on multi-instance learning with a constructed classification space is presented. First, the Hessian matrix is used in YCgCr space to detect image feature points, which serve as instances of the image, and the LBP operator is extended to YCgCr space. Second, a YCgCr-LBP operator is constructed to describe the image instances. Finally, a classification space model based on frequency statistics is proposed, and cosine similarity is used to complete image recognition. Comparisons on different data sets show that the proposed method improves accuracy on images with large skin or skin-like regions compared with the conventional skin-proportion method, and that it describes images better than general multi-instance learning methods, indicating its practical value.

  4. 基于L1-Graph表示的标记传播多观测样本分类算法%Label Propagation Classification Algorithm of Multiple Observation Sets Based on L1-Graph Representation

    Institute of Scientific and Technical Information of China (English)

    胡正平; 王玲丽

    2011-01-01

    The samples of a given class can be assumed to lie on the same low-dimensional manifold of the high-dimensional observation space. To exploit this manifold structure for the effective classification of multiple observation sets, a label propagation classification algorithm for multiple observation sets based on L1-Graph representation is proposed in this paper. First, an L1-Graph is constructed via sparse representation to obtain a similarity matrix between samples. Then, under the constraint that all observation samples belong to the same class, a label matrix with a special structure is obtained on the basis of the semi-supervised label propagation algorithm. Finally, finding the optimal label matrix is transformed into an optimization problem over a discrete objective function, which yields the class of the test samples. Experiments on the USPS handwritten digit database, the ETH-80 object recognition database and the Cropped Yale face recognition database show that the proposed method is valid and efficient.

  5. Brain source localization: A new method based on MUltiple SIgnal Classification algorithm and spatial sparsity of the field signal for electroencephalogram measurements

    Science.gov (United States)

    Vergallo, P.; Lay-Ekuakille, A.

    2013-08-01

    Brain activity can be recorded by means of EEG (electroencephalogram) electrodes placed on the scalp of the patient. The EEG reflects the activity of groups of neurons located in the head, and the fundamental problem in neurophysiology is the identification of the sources responsible for brain activity, which is especially important when a seizure occurs. The studies conducted to formalize the relationship between the electromagnetic activity in the head and the recording of the generated external field make it possible to characterize patterns of brain activity. The inverse problem, i.e. determining the underlying sources given the field sampled at different electrodes, is harder because it may not have a unique solution, or the search for the solution may be hampered by a low spatial resolution that does not allow activities involving sources close to each other to be distinguished. Thus, sources of interest may be obscured or not detected, and known source localization methods such as MUSIC (MUltiple SIgnal Classification) can fail. Many advanced source localization techniques achieve better resolution by exploiting sparsity: if the number of sources is small, the neural power as a function of location is sparse. In this work a solution based on the spatial sparsity of the field signal is presented and analyzed to improve the MUSIC method. For this purpose, a priori information about the sparsity of the signal must be set. The problem is formulated and solved using a regularization method such as Tikhonov regularization, which computes a solution that is the best compromise between two cost functions to be minimized, one related to the fit to the data and the other to maintaining the sparsity of the signal. First, the method is tested on simulated EEG signals obtained by solving the forward problem. Relatively to the model considered for the head and brain sources, the result obtained allows to
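
    The regularized inverse problem mentioned above can be illustrated with a minimal Tikhonov solver: given a lead-field matrix A and measurements b, it trades off data fit against a penalty on the solution. The closed-form ridge solution below uses an L2 penalty (a sparsity-promoting L1 penalty, as in the paper, would require an iterative solver), and all dimensions and values are illustrative assumptions.

```python
# Hypothetical Tikhonov-regularized solution of a linear inverse problem
# x = argmin ||A x - b||^2 + lam ||x||^2, solved in closed form.
import numpy as np

def tikhonov_solve(A: np.ndarray, b: np.ndarray, lam: float) -> np.ndarray:
    """Return the regularized estimate (A^T A + lam I)^{-1} A^T b."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

# Toy forward model: 32 electrodes, 200 candidate source locations,
# only 3 active sources (sparse ground truth).
rng = np.random.default_rng(0)
A = rng.normal(size=(32, 200))          # stand-in for the lead-field matrix
x_true = np.zeros(200)
x_true[[10, 50, 120]] = [1.0, -0.5, 0.8]
b = A @ x_true + 0.01 * rng.normal(size=32)

x_hat = tikhonov_solve(A, b, lam=1.0)
print("largest reconstructed components:", np.argsort(np.abs(x_hat))[-3:])
```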

  7. Agriculture classification using POLSAR data

    DEFF Research Database (Denmark)

    Skriver, Henning; Dall, Jørgen; Ferro-Famil, Laurent;

    2005-01-01

    data, and a very important class of algorithms is the knowledge-based approaches. Here, generic characteristics of different cover types are derived by combining physical reasoning with the available empirical evidence. These are then used to define classification rules. Because of their emphasis on the physical content of the SAR data they attempt to generate robust, widely applicable methods, which are nonetheless capable of taking local conditions into account. In this paper a classification approach is presented that uses a knowledge-based approach, where the crops are first classified into ... crops. This part of the classification process is not as well established as the first part, and both a supervised approach and a knowledge-based approach have been evaluated. Both POLSAR and PolInSAR data may be included in the classification scheme. The classification approach has been evaluated using...

  8. Multicore Processing for Clustering Algorithms

    Directory of Open Access Journals (Sweden)

    RekhanshRao

    2012-03-01

    Full Text Available Data mining algorithms such as classification and clustering are the future of computation, though multidimensional data processing is required. People are using multicore processors with GPUs. Most programming languages do not provide multiprocessing facilities, and processing resources are therefore wasted. Clustering and classification algorithms are highly resource consuming. In this paper we show strategies to overcome such deficiencies using the multicore processing platform OpenCL.

  9. Multicore Processing for Clustering Algorithms

    OpenAIRE

    RekhanshRao; Kapil Kumar Nagwanshi; SipiDubey

    2012-01-01

    Data mining algorithms such as classification and clustering are the future of computation, though multidimensional data processing is required. People are using multicore processors with GPUs. Most programming languages do not provide multiprocessing facilities, and processing resources are therefore wasted. Clustering and classification algorithms are highly resource consuming. In this paper we show strategies to overcome such deficiencies using the multicore processing platform OpenCL....

  10. A comparison of CA125, HE4, risk ovarian malignancy algorithm (ROMA), and risk malignancy index (RMI) for the classification of ovarian masses

    Directory of Open Access Journals (Sweden)

    Cristina Anton

    2012-01-01

    Full Text Available OBJECTIVE: Differentiation between benign and malignant ovarian neoplasms is essential for creating a system for patient referrals. Therefore, the contributions of the tumor markers CA125 and human epididymis protein 4 (HE4) as well as the risk ovarian malignancy algorithm (ROMA) and risk malignancy index (RMI) values were considered individually and in combination to evaluate their utility for establishing this type of patient referral system. METHODS: Patients who had been diagnosed with ovarian masses through imaging analyses (n = 128) were assessed for their expression of the tumor markers CA125 and HE4. The ROMA and RMI values were also determined. The sensitivity and specificity of each parameter were calculated using receiver operating characteristic curves according to the area under the curve (AUC) for each method. RESULTS: The sensitivities associated with the ability of CA125, HE4, ROMA, or RMI to distinguish between malignant versus benign ovarian masses were 70.4%, 79.6%, 74.1%, and 63%, respectively. Among carcinomas, the sensitivities of CA125, HE4, ROMA (pre- and post-menopausal), and RMI were 93.5%, 87.1%, 80%, 95.2%, and 87.1%, respectively. The most accurate numerical values were obtained with RMI, although the four parameters were shown to be statistically equivalent. CONCLUSION: There were no differences in accuracy between CA125, HE4, ROMA, and RMI for differentiating between types of ovarian masses. RMI had the lowest sensitivity but was the most numerically accurate method. HE4 demonstrated the best overall sensitivity for the evaluation of malignant ovarian tumors and the differential diagnosis of endometriosis. All of the parameters demonstrated increased sensitivity when tumors with low malignancy potential were considered low-risk, which may be used as an acceptable assessment method for referring patients to reference centers.

  11. Improvement and Validation of the BOAT Algorithm

    Directory of Open Access Journals (Sweden)

    Yingchun Liu

    2014-04-01

    Full Text Available The main objective of this paper is to improve the BOAT classification algorithm and apply it to credit card big data analysis. The decision tree algorithm is a data analysis method for classification that can be used to describe and extract important data class models or predict future data trends. The BOAT algorithm reduces data reading and writing operations, which improves operating efficiency on large data sets and suits popular big data analysis. The improvements presented in this paper further raise the performance of the algorithm, including its performance on distributed data sources. Credit card data from a large banking sector is used as the test data set. The improved algorithm, the original BOAT algorithm, and other classical classification algorithms are compared and analyzed in terms of performance.

  12. Texture classification of lung computed tomography images

    Science.gov (United States)

    Pheng, Hang See; Shamsuddin, Siti M.

    2013-03-01

    The development of algorithms for computer-aided diagnosis (CAD) schemes is growing rapidly to assist the radiologist in medical image interpretation. Texture analysis of computed tomography (CT) scans is one of the important preliminary stages in computerized detection and classification systems for lung cancer. Among the different types of image feature analysis, Haralick texture with a variety of statistical measures has been widely used for image texture description. The extraction of texture feature values is essential for a CAD system, especially in the classification of normal and abnormal tissue on cross-sectional CT images. This paper compares experimental results using texture extraction and different machine learning methods for classifying normal and abnormal tissues in lung CT images. The machine learning methods involved in this assessment are the Artificial Immune Recognition System (AIRS), Naive Bayes, Decision Tree (J48) and a Backpropagation Neural Network. AIRS is found to provide high accuracy (99.2%) and sensitivity (98.0%) in the assessment. For experiment and testing purposes, publicly available datasets in the Reference Image Database to Evaluate Therapy Response (RIDER) are used as study cases.
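
    The Haralick-style texture step can be sketched with grey-level co-occurrence features followed by a standard classifier; scikit-image and a synthetic patch set are used here purely as illustrative assumptions (older scikit-image releases spell the functions greycomatrix/greycoprops), and the Gaussian Naive Bayes classifier is a stand-in for the AIRS/J48/Backpropagation comparison in the paper.

```python
# Hypothetical GLCM (Haralick-style) texture features for CT patches,
# fed to a simple classifier; data and labels are synthetic placeholders.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.naive_bayes import GaussianNB

PROPS = ("contrast", "dissimilarity", "homogeneity", "energy", "correlation")

def glcm_features(patch: np.ndarray) -> np.ndarray:
    """Compute co-occurrence statistics for one 8-bit grayscale patch."""
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return np.array([graycoprops(glcm, p).mean() for p in PROPS])

# Synthetic stand-ins for normal/abnormal tissue patches (64x64, 8-bit).
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(40, 64, 64), dtype=np.uint8)
labels = rng.integers(0, 2, size=40)   # 0 = normal, 1 = abnormal (placeholder)

X = np.vstack([glcm_features(p) for p in patches])
clf = GaussianNB().fit(X, labels)
print("training accuracy on the toy data:", clf.score(X, labels))
```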

  13. Emergency Logistics Service Facilities Center Location Algorithm Based on Relative Classification of Disaster Degree%基于灾度相对分类的应急物流中心选址

    Institute of Scientific and Technical Information of China (English)

    骆达荣

    2013-01-01

    Usually the primary objective of general logistics is cost efficiency, while the goal of emergency logistics is to maximize time efficiency and minimize disaster loss, so methods for locating ordinary logistics service centers do not meet the requirements of emergency logistics service center location. Furthermore, the degree of damage and the differing requirements for emergency supplies have not been considered in existing location models. Therefore, an emergency logistics service facility center location algorithm based on relative classification of disaster degree is put forward in this paper. In the algorithm, the points damaged most seriously are selected as service facility points using a clustering method, and the emergency service center location is then set using the center-of-gravity method. Finally, a simulation shows the algorithm is effective and feasible.
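
    A minimal sketch of the two-stage idea described above: cluster the affected sites, keep the cluster whose members are most severely damaged, and place the service center at the demand-weighted center of gravity. The coordinates, damage scores and number of clusters are illustrative assumptions.

```python
# Hypothetical facility-location sketch: K-means over affected sites, then a
# damage-weighted center-of-gravity location for the most damaged cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(30, 2))       # (x, y) of affected sites
damage = rng.uniform(0, 1, size=30)              # relative disaster degree per site

# Stage 1: cluster the sites and pick the cluster with the highest mean damage.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(coords)
worst_cluster = max(range(3), key=lambda c: damage[labels == c].mean())
mask = labels == worst_cluster

# Stage 2: center-of-gravity location weighted by damage (a proxy for demand).
weights = damage[mask]
center = (coords[mask] * weights[:, None]).sum(axis=0) / weights.sum()
print("service center for the most damaged cluster:", np.round(center, 1))
```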

  14. 一种基于匹配学习的人脸图像超分辨率算法%A super-resolution algorithm of face image based on pre-classification and match

    Institute of Scientific and Technical Information of China (English)

    窦翔; 陶青川

    2015-01-01

    Existing example-based super-resolution algorithms for face images adopt a global search, which causes non-local mismatches and poor visual quality in the restored image. A new matching- and learning-based face image super-resolution restoration algorithm is proposed. A pre-classification of the input image is applied to obtain a sub-sample library from the image library, and the corresponding feature images are created. In the matching process, two new search strategies for different face images are used, which consider the similarity and consistency between image patches and make the recovered image look more coherent and natural. Experimental results show that the proposed algorithm synthesizes high-resolution faces with better visual quality and obtains higher average Peak Signal-to-Noise Ratio (PSNR) values than other methods.

  15. Supervised Classification Performance of Multispectral Images

    CERN Document Server

    Perumal, K

    2010-01-01

    Nowadays government and private agencies use remote sensing imagery for a wide range of applications, from military applications to farm development. The images may be panchromatic, multispectral, hyperspectral or even ultraspectral, amounting to terabytes of data. Remote sensing image classification is one of the most significant applications of remote sensing. A number of image classification algorithms have shown good precision in classifying remote sensing data. Of late, however, the increasing spatiotemporal dimensions of remote sensing data have exposed weaknesses in traditional classification algorithms, necessitating further research in the field of remote sensing image classification. An efficient classifier is therefore needed to classify remote sensing images and extract information. We experiment with both supervised and unsupervised classification. Here we compare the different classification methods and their performances. It is found that the Mahalanobis classifier performed the best in our...

  16. Unsupervised Classification of Images: A Review

    OpenAIRE

    Abass Olaode; Golshah Naghdy; Catherine Todd

    2014-01-01

    Unsupervised image classification is the process by which each image in a dataset is identified to be a member of one of the inherent categories present in the image collection without the use of labelled training samples. Unsupervised categorisation of images relies on unsupervised machine learning algorithms for its implementation. This paper identifies clustering algorithms and dimension reduction algorithms as the two main classes of unsupervised machine learning algorithms needed in unsu...

  17. Classification des rongeurs

    OpenAIRE

    Mignon, Jacques; Hardouin, Jacques

    2003-01-01

    Readers of the BEDIM Bulletin sometimes seem to have difficulty with the scientific classification of the animals known as "rodents" in everyday language. Given the disputes that still surround this classification today, this is hardly surprising. The brief synthesis that follows concerns the animals that are, or could become, part of mini-livestock farming. The note aims at providing the main characteristics of the principal families of rodents relevan...

  18. Graph Colouring Algorithms

    DEFF Research Database (Denmark)

    Husfeldt, Thore

    2015-01-01

    This chapter presents an introduction to graph colouring algorithms. The focus is on vertex-colouring algorithms that work for general classes of graphs with worst-case performance guarantees in a sequential model of computation. The presentation aims to demonstrate the breadth of available techniques and is organized by algorithmic paradigm.

  19. Strategic Classification

    OpenAIRE

    Hardt, Moritz; Megiddo, Nimrod; Papadimitriou, Christos; Wootters, Mary

    2015-01-01

    Machine learning relies on the assumption that unseen test instances of a classification problem follow the same distribution as observed training data. However, this principle can break down when machine learning is used to make important decisions about the welfare (employment, education, health) of strategic individuals. Knowing information about the classifier, such individuals may manipulate their attributes in order to obtain a better classification outcome. As a result of this behavior...

  20. Extreme Learning Machine for land cover classification

    OpenAIRE

    Pal, Mahesh

    2008-01-01

    This paper explores the potential of the extreme learning machine based supervised classification algorithm for land cover classification. In comparison to a backpropagation neural network, which requires the setting of several user-defined parameters and may converge to local minima, the extreme learning machine requires the setting of one parameter and produces a unique solution. An ETM+ multispectral data set (England) was used to judge the suitability of the extreme learning machine for remote sensing classifications...
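
    An extreme learning machine can be written in a few lines: random hidden-layer weights are fixed, and only the output weights are solved by least squares. The sketch below is a generic ELM classifier under assumed dimensions and toy data, not the configuration used in the paper.

```python
# Hypothetical minimal extreme learning machine (ELM) classifier:
# random fixed hidden layer + least-squares output weights.
import numpy as np

class ELMClassifier:
    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)   # random nonlinear projection

    def fit(self, X, y):
        self.classes_, y_idx = np.unique(y, return_inverse=True)
        T = np.eye(len(self.classes_))[y_idx]            # one-hot targets
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Only the output weights are learned, via the least-squares solution.
        self.beta, *_ = np.linalg.lstsq(self._hidden(X), T, rcond=None)
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.beta, axis=1)]

# Toy usage with random "pixels" and two land-cover classes.
rng = np.random.default_rng(1)
X, y = rng.normal(size=(100, 6)), rng.integers(0, 2, size=100)
elm = ELMClassifier(n_hidden=30).fit(X, y)
print("training accuracy:", (elm.predict(X) == y).mean())
```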

  1. Texture Classification based on Gabor Wavelet

    OpenAIRE

    Amandeep Kaur; Savita Gupta

    2012-01-01

    This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture of an image can be classified using texture descriptors. In this paper we have used the Homogeneous Texture Descriptor, which is based on the Gabor wavelet concept. For texture classification, we have used an online texture database, the Brodatz database, and three advanced, well-known classifiers: Support Vec...

  2. Deep neural networks for spam classification

    OpenAIRE

    Kasmani, Mohamed Khizer

    2013-01-01

    This project elucidates the development of a spam filtering method using deep neural networks. A classification model employing algorithms such as Error Back Propagation (EBP) and Restricted Boltzmann Machines (RBM) is used to identify spam and non-spam emails. Moreover, a spam classification system employing deep neural network algorithms is developed, which has been tested on Enron email dataset in order to help users manage large volumes of email and, furthermore, their email folders. The ...

  3. Incrementally Maintaining Classification using an RDBMS

    OpenAIRE

    Koc, Mehmet Levent; Ré, Christopher

    2011-01-01

    The proliferation of imprecise data has motivated both researchers and the database industry to push statistical techniques into relational database management systems (RDBMSs). We study algorithms to maintain model-based views for a popular statistical technique, classification, inside an RDBMS in the presence of updates to the training examples. We make three technical contributions: (1) An algorithm that incrementally maintains classification inside an RDBMS. (2) An analysis of the above a...

  4. Computerized Classification Testing with the Rasch Model

    Science.gov (United States)

    Eggen, Theo J. H. M.

    2011-01-01

    If classification in a limited number of categories is the purpose of testing, computerized adaptive tests (CATs) with algorithms based on sequential statistical testing perform better than estimation-based CATs (e.g., Eggen & Straetmans, 2000). In these computerized classification tests (CCTs), the Sequential Probability Ratio Test (SPRT) (Wald,…
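
    The SPRT decision rule mentioned here can be illustrated with a short sketch under the Rasch model: after each item, the log-likelihood ratio of the responses under two ability values bracketing the cut-off score is compared with thresholds derived from the tolerated error rates. The ability values, error rates and item difficulties below are illustrative assumptions, not values from the article.

```python
# Hypothetical SPRT for a pass/fail classification test under the Rasch model.
import math

def rasch_p(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def sprt_classify(responses, difficulties, theta_low=-0.5, theta_high=0.5,
                  alpha=0.05, beta=0.05):
    """Return 'pass', 'fail', or 'continue' after the observed responses."""
    upper = math.log((1 - beta) / alpha)      # accept theta_high (pass)
    lower = math.log(beta / (1 - alpha))      # accept theta_low (fail)
    llr = 0.0
    for x, b in zip(responses, difficulties):
        p1, p0 = rasch_p(theta_high, b), rasch_p(theta_low, b)
        llr += math.log(p1 / p0) if x == 1 else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "pass"
        if llr <= lower:
            return "fail"
    return "continue"

# Usage: ten item responses (1 = correct) with assumed difficulties of 0.
print(sprt_classify([1, 1, 0, 1, 1, 1, 1, 0, 1, 1], [0.0] * 10))
```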

  5. Network planning tool based on network classification and load prediction

    OpenAIRE

    Hammami, Seif eddine; Afifi, Hossam; Marot, Michel; Gauthier, Vincent

    2016-01-01

    Real Call Detail Records (CDR) are analyzed and classified based on the Support Vector Machine (SVM) algorithm. The daily classification results in three traffic classes. We use two different algorithms, K-means and SVM, to check the classification efficiency. A second support vector regression (SVR) based algorithm is built to make an online prediction of traffic load using the history of CDRs. Then, these algorithms will be integrated into a network planning tool which will help cellular operators...

  6. Application of Data Mining in Protein Sequence Classification

    Directory of Open Access Journals (Sweden)

    Suprativ Saha

    2012-11-01

    Full Text Available Protein sequence classification involves feature selection for accurate classification. Popular protein sequence classification techniques involve the extraction of specific features from the sequences. Researchers apply well-known classification techniques such as neural networks, genetic algorithms, Fuzzy ARTMAP and Rough Set classifiers for accurate classification. This paper presents a review of three classification models: the neural network model, the fuzzy ARTMAP model and the Rough Set classifier model. This is followed by a new technique for classifying protein sequences. The proposed model is implemented with a purpose-built tool and tries to reduce the computational overheads encountered by earlier approaches and increase the accuracy of classification.

  7. Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches

    Science.gov (United States)

    Çelik, Ufuk; Yurtay, Nilüfer; Koç, Emine Rabia; Tepe, Nermin; Güllüoğlu, Halil; Ertaş, Mustafa

    2015-01-01

    The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. After we took this main objective into consideration, three different neurologists were required to fill in the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms that are inspired by the biological immune system mechanism that involves significant and distinct capabilities. These algorithms simulate the specialties of the immune system such as discrimination, learning, and the memorizing process in order to be used for classification, optimization, or pattern recognition. According to the results, the accuracy level of the classifier used in this study reached a success continuum ranging from 95% to 99%, except for the inconvenient one that yielded 71% accuracy. PMID:26075014

  8. Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches

    Directory of Open Access Journals (Sweden)

    Ufuk Çelik

    2015-01-01

    Full Text Available The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. After we took this main objective into consideration, three different neurologists were required to fill in the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms that are inspired by the biological immune system mechanism that involves significant and distinct capabilities. These algorithms simulate the specialties of the immune system such as discrimination, learning, and the memorizing process in order to be used for classification, optimization, or pattern recognition. According to the results, the accuracy level of the classifier used in this study reached a success continuum ranging from 95% to 99%, except for the inconvenient one that yielded 71% accuracy.

  9. Agriculture classification using POLSAR data

    DEFF Research Database (Denmark)

    Skriver, Henning; Dall, Jørgen; Ferro-Famil, Laurent; Le Toan, Thuy; Lumsdon, Parivash; Moshammer, Rolf; Pottier, Eric; Quegan, Shaun

    2005-01-01

    data, and a very important class of algorithms is the knowledge-based approaches. Here, generic characteristics of different cover types are derived by combining physical reasoning with the available empirical evidence. These are then used to define classification rules. Because of their emphasis on...

  10. Development of a reject classification method applied to the diagnosis of a nuclear reactor core: processing of thermal signals provided by out-of-reactor simulation

    International Nuclear Information System (INIS)

    Development of an evolution detection algorithm whose aim is to extend the application field of pattern recognition analysis to the diagnosis and follow-up of a complex system: study of the data from the out-of-reactor test loop with forced convection in sodium, and study and description of a reject classification algorithm developed from the general standpoint of evolution detection. This method is tested with theoretical data and with experimental data provided by the second test loop, ISIS.

  11. Soil Classification Using GATree

    CERN Document Server

    Bhargavi, P

    2010-01-01

    This paper details the application of a genetic programming framework to decision-tree classification of soil data for classifying soil texture. The database contains measurements of soil profile data. We have applied GATree for generating the classification decision tree. GATree is a decision tree builder that is based on Genetic Algorithms (GAs). The idea behind it is rather simple but powerful: instead of using statistical metrics that are biased towards specific trees, we use a more flexible, global metric of tree quality that tries to optimize both accuracy and size. GATree offers some unique features not found in other tree inducers while at the same time producing better results for many difficult problems. Experimental results are presented which illustrate the performance of generating the best decision tree for classifying soil texture on the soil data set.

  12. ARTIFICIAL BEE COLONY ALGORITHM INTEGRATED WITH FUZZY C-MEAN OPERATOR FOR DATA CLUSTERING

    Directory of Open Access Journals (Sweden)

    M. Krishnamoorthi

    2013-01-01

    Full Text Available The clustering task aims at the unsupervised classification of patterns into different groups. To enhance the quality of results, emerging swarm-based algorithms have nowadays become an alternative to conventional clustering methods. In this study, an optimization method based on a swarm intelligence algorithm is proposed for the purpose of clustering. The significance of the proposed algorithm is that it uses a Fuzzy C-Means (FCM) operator in the Artificial Bee Colony (ABC) algorithm. The FCM operator acts at the scout bee phase of the ABC algorithm, as the scout bees are introduced by the FCM operator. The experimental results have shown that the proposed approach provides significant results in terms of solution quality. A comparative study of the proposed approach with existing algorithms in the literature, using datasets from the UCI Machine Learning repository, gives satisfactory results.
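
    The hybrid step described above, reseeding scout bees with a Fuzzy C-Means update, can be sketched as follows; the membership exponent, the synthetic data and the way a scout is rebuilt from the fuzzy centers are assumptions for illustration, not the authors' exact formulation:

      # Sketch: one Fuzzy C-Means update used to reinitialize scout "food sources" in ABC.
      import numpy as np

      def fcm_step(X, U, m=2.0):
          """One FCM iteration: recompute centers from memberships U, then update U."""
          Um = U ** m
          centers = (Um.T @ X) / Um.sum(axis=0)[:, None]          # (c, d) cluster centers
          d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
          U_new = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
          return centers, U_new

      rng = np.random.default_rng(1)
      X = rng.normal(size=(200, 4))                               # data to cluster (stand-in)
      U = rng.random((200, 3)); U /= U.sum(axis=1, keepdims=True) # random memberships, 3 clusters
      centers, U = fcm_step(X, U)
      scouts = centers.copy()                                     # abandoned scouts restart from FCM centers
      print(scouts.shape)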

  13. A Novel Approach to ECG Classification Based upon Two-Layered HMMs in Body Sensor Networks

    OpenAIRE

    Wei Liang; Yinlong Zhang; Jindong Tan; Yang Li

    2014-01-01

    This paper presents a novel approach to ECG signal filtering and classification. Unlike the traditional techniques which aim at collecting and processing the ECG signals with the patient being still, lying in bed in hospitals, our proposed algorithm is intentionally designed for monitoring and classifying the patient’s ECG signals in the free-living environment. The patients are equipped with wearable ambulatory devices the whole day, which facilitates the real-time heart attack detection. ...

  14. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is constr

  15. Medical images data mining using classification algorithm based on association rule

    Institute of Scientific and Technical Information of China (English)

    邓薇薇; 卢延鑫

    2012-01-01

    Objective: In order to assist clinicians in the diagnosis and treatment of brain disease, a classifier for medical images containing tumors was constructed based on association rule data mining techniques. Methods: After a pre-processing phase, the related features were extracted from the medical images and discretized as the input of the association rules; the medical image classifier was then constructed with an improved Apriori algorithm. Results: The medical image classifier was constructed. Medical images of known type were used to train the classifier so as to mine the association rules that satisfy the constraint conditions; the brain tumor in a medical image of unknown type was then classified by the constructed classifier. Conclusion: Classification algorithms based on association rules can be effectively used to mine image features and to construct an image classifier that identifies benign or malignant tumors.
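
    A minimal, self-contained sketch of the associative-classification idea is given below: discretized image features become transaction items, rules of the form itemset -> class are mined with support/confidence thresholds, and an unseen feature set is classified by its best matching rule. The feature names, thresholds and toy transactions are invented for illustration and do not reproduce the improved Apriori algorithm of the paper:

      # Sketch: tiny associative classifier over discretized image features.
      # Transactions pair discretized feature items with a class label; names are illustrative.
      from itertools import combinations
      from collections import Counter

      transactions = [
          ({"intensity=high", "texture=coarse"}, "malignant"),
          ({"intensity=high", "texture=smooth"}, "malignant"),
          ({"intensity=low",  "texture=smooth"}, "benign"),
          ({"intensity=low",  "texture=coarse"}, "benign"),
          ({"intensity=high", "texture=coarse"}, "malignant"),
      ]

      def mine_rules(transactions, min_support=0.3, min_confidence=0.8):
          """Return rules (itemset -> label) whose support and confidence pass the thresholds."""
          n = len(transactions)
          itemset_counts, labelled_counts, rules = Counter(), Counter(), []
          for items, label in transactions:
              for r in (1, 2):
                  for subset in combinations(sorted(items), r):
                      itemset_counts[subset] += 1
                      labelled_counts[(subset, label)] += 1
          for (subset, label), count in labelled_counts.items():
              support, confidence = count / n, count / itemset_counts[subset]
              if support >= min_support and confidence >= min_confidence:
                  rules.append((set(subset), label, confidence))
          return rules

      def classify(items, rules, default="benign"):
          """Apply the highest-confidence matching rule to an unseen feature set."""
          matches = [(conf, label) for body, label, conf in rules if body <= items]
          return max(matches)[1] if matches else default

      rules = mine_rules(transactions)
      print(classify({"intensity=high", "texture=coarse"}, rules))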

  16. Video Detection System Design for Traffic Incidents Based on Speed Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    熊昕; 徐建闽

    2013-01-01

    A real-time video traffic incident detection method is proposed based on a speed classification algorithm. In addition, traffic volume detection, vehicle lane-crossing processing, speed detection, traffic condition detection and the identification of traffic incidents are also discussed. Based on vehicle detection and tracking, events such as stopped vehicles, lane changes, slow traffic and congestion can be identified and detected automatically to derive traffic flow, occupancy ratio, queue length, vehicle type, average speed and other transportation parameters. In comparison with traditional traffic incident detection systems, the system is intuitive, convenient and low-cost, and has good market demand and practical value.

  17. Innovating Web Page Classification Through Reducing Noise

    Institute of Scientific and Technical Information of China (English)

    LI Xiaoli (李晓黎); SHI Zhongzhi(史忠植)

    2002-01-01

    This paper presents a new method that eliminates noise in Web page classification. It first describes the representation of a Web page based on HTML tags. Then, through a novel distance formula, it eliminates noise in the similarity measure. After carefully analyzing Web pages, we design an algorithm that can distinguish related hyperlinks from noisy ones. We can utilize non-noisy hyperlinks to improve the performance of Web page classification (the CAWN algorithm). Any page can be classified through the text and category of the neighbor pages related to it. The experimental results show that our approach improves classification accuracy.

  18. Fault Tolerant Neural Network for ECG Signal Classification Systems

    Directory of Open Access Journals (Sweden)

    MERAH, M.

    2011-08-01

    Full Text Available The aim of this paper is to apply a new robust hardware Artificial Neural Network (ANN) to ECG classification systems. This ANN includes a penalization criterion which improves its performance in terms of robustness. Specifically, in this method, the ANN weights are normalized using the auto-prune method. Simulations performed on the MIT-BIH ECG signals have shown that significant robustness improvements are obtained with respect to potential hardware artificial neuron failures. Moreover, we show that the proposed design achieves better generalization performance compared to the standard back-propagation algorithm.

  19. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  20. Iterative Formulation of Control Aims in Fully Probabilistic Design

    Czech Academy of Sciences Publication Activity Database

    Jirsa, Ladislav; Kárný, Miroslav; Tesař, Ludvík

    Praha : ÚTIA AV ČR, v.v.i, 2009 - (Janžura, M.; Ivánek, J.), s. 31-31 [5th International Workshop on Data-Algorithms-Decision Making. Plzeň (CZ), 29.11.2009-01.12.2009] R&D Projects: GA MŠk(CZ) 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords : fully probabilistic control design * aim elicitation * windsurfer Subject RIV: BB - Applied Statistics, Operational Research http://library.utia.cas.cz/separaty/2009/AS/jirsa-iterative formulation of control aims in fully probabilistic design.pdf

  1. Pet fur color and texture classification

    Science.gov (United States)

    Yen, Jonathan; Mukherjee, Debarghar; Lim, SukHwan; Tretter, Daniel

    2007-01-01

    Object segmentation is important in image analysis for imaging tasks such as image rendering and image retrieval. Pet owners have been known to be quite vocal about how important it is to render their pets perfectly. We present here an algorithm for pet (mammal) fur color classification and an algorithm for pet (animal) fur texture classification. Pet fur color classification can be applied as a necessary condition for identifying the regions in an image that may contain pets, much like skin tone classification for human flesh detection. As a result of evolution, fur coloration of all mammals is produced by a natural organic pigment called melanin, which has only a very limited color range. We have conducted a statistical analysis and concluded that mammal fur colors can only be in levels of gray or in two colors after proper color quantization. This pet fur color classification algorithm has been applied to pet-eye detection. We also present here an algorithm for animal fur texture classification using the recently developed multi-resolution directional sub-band Contourlet transform. The experimental results are very promising, as these transforms can identify regions of an image that may contain the fur of mammals, the scales of reptiles, the feathers of birds, etc. Combining the color and texture classification, one can obtain a set of strong classifiers for identifying possible animals in an image.
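
    The color test described in the abstract, that quantized mammal fur colors collapse to gray levels or at most two colors, can be sketched as below; the quantization level, tolerance and hue rule are illustrative guesses rather than the authors' exact thresholds:

      # Sketch of the fur-color test: quantize a candidate region's colors and check whether
      # they collapse to gray levels or at most two hues. Parameters are illustrative.
      import numpy as np

      def could_be_mammal_fur(region_rgb, levels=8, gray_tol=12, max_hues=2):
          """region_rgb: (n_pixels, 3) uint8 array from a segmented region."""
          q = (region_rgb // (256 // levels)) * (256 // levels)        # coarse color quantization
          grayish = np.abs(q.max(axis=1).astype(int) - q.min(axis=1).astype(int)) <= gray_tol
          if grayish.all():
              return True                                              # all pixels are levels of gray
          distinct = {tuple(c) for c in q[~grayish]}
          return len(distinct) <= max_hues                             # otherwise at most two colors

      pixels = np.random.default_rng(0).integers(90, 130, size=(500, 3), dtype=np.uint8)
      print(could_be_mammal_fur(pixels))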

  2. DIAGNOSIS OF DIABETES USING CLASSIFICATION MINING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Aiswarya Iyer

    2015-01-01

    Full Text Available Diabetes has affected over 246 million people worldwide with a majority of them being women. According to the WHO report, by 2025 this number is expected to rise to over 380 million. The disease has been named the fifth deadliest disease in the United States with no imminent cure in sight. With the rise of information technology and its continued advent into the medical and healthcare sector, the cases of diabetes as well as their symptoms are well documented. This paper aims at finding solutions to diagnose the disease by analyzing the patterns found in the data through classification analysis by employing Decision Tree and Naïve Bayes algorithms. The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients
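
    A minimal sketch of the comparison described above, assuming a patient table with a binary diagnosis label; the synthetic columns below stand in for the documented cases and are not taken from the paper:

      # Sketch: compare a decision tree and naive Bayes on diabetes-style patient records.
      import numpy as np
      import pandas as pd
      from sklearn.model_selection import cross_val_score
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.naive_bayes import GaussianNB

      rng = np.random.default_rng(0)
      df = pd.DataFrame({
          "glucose": rng.normal(120, 30, 500),     # stand-in measurements
          "bmi": rng.normal(30, 6, 500),
          "age": rng.integers(20, 80, 500),
      })
      df["diabetic"] = ((df["glucose"] > 140) | (df["bmi"] > 35)).astype(int)  # stand-in labels

      X, y = df.drop(columns=["diabetic"]), df["diabetic"]
      for name, model in [("decision tree", DecisionTreeClassifier(max_depth=5)),
                          ("naive Bayes", GaussianNB())]:
          print(name, cross_val_score(model, X, y, cv=10).mean())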

  3. Soil Data Analysis Using Classification Techniques and Soil Attribute Prediction

    Directory of Open Access Journals (Sweden)

    Jay Gholap

    2012-05-01

    Full Text Available Agricultural research has profited from technical advances such as automation and data mining. Today, data mining is used in a vast range of areas, and many off-the-shelf data mining products and domain-specific data mining applications are available, but data mining on agricultural soil datasets is a relatively young research field. The large amounts of data that are nowadays virtually harvested along with the crops have to be analyzed and should be used to their full extent. This research aims at the analysis of a soil dataset using data mining techniques. It focuses on the classification of soil using the various algorithms available. Another important purpose is to predict untested attributes using regression techniques, and to implement automated soil sample classification.

  4. Evaluation for Uncertain Image Classification and Segmentation

    CERN Document Server

    Martin, Arnaud; Arnold-Bos, Andreas

    2008-01-01

    Each year, numerous segmentation and classification algorithms are invented or reused to solve problems where machine vision is needed. Generally, the efficiency of these algorithms is compared against the results given by one or many human experts. However, in many situations, the location of the real boundaries of the objects as well as their classes are not known with certainty by the human experts. Furthermore, only one aspect of the segmentation and classification problem is generally evaluated. In this paper we present a new evaluation method for the classification and segmentation of images, where we take into account both the classification and segmentation results as well as the level of certainty given by the experts. As a concrete example of our method, we evaluate an automatic seabed characterization algorithm based on sonar images.

  5. Classifying Classification

    Science.gov (United States)

    Novakowski, Janice

    2009-01-01

    This article describes the experience of a group of first-grade teachers as they tackled the science process of classification, a targeted learning objective for the first grade. While the two-year process was not easy and required teachers to teach in a new, more investigation-oriented way, the benefits were great. The project helped teachers and…

  6. Cluster Classification of Partial Discharges in Oil-impregnated Paper Insulation

    Directory of Open Access Journals (Sweden)

    SURESH, S. D. R.

    2010-02-01

    Full Text Available Recognition of multiple partial discharge (PD) sources in high voltage equipment has been a challenging task until now. The work reported here aims to recognize multiple PD sources in oil-impregnated paper using Cluster Analysis (CA) and Fuzzy Logic (FL). The typical sources of PD in transformers are identified and the corresponding single-source PD defect laboratory models are fabricated. From the measured PD signals, the necessary statistical parameters are extracted by applying CA for classification. A fuzzy-based algorithm has been developed to recognize single-source PDs. The developed algorithm has also been applied to recognize multiple PD sources.

  7. Landslide hazards mapping using uncertain Naïve Bayesian classification method

    Institute of Scientific and Technical Information of China (English)

    毛伊敏; 张茂省; 王根龙; 孙萍萍

    2015-01-01

    Landslide hazard mapping is a fundamental tool for disaster management activities in loess terrains. To address a major issue with landslide hazard assessment methods based on the Naïve Bayesian classification technique, namely the difficulty of quantifying uncertain triggering factors, the main purpose of this work is to evaluate the predictive power of landslide spatial models based on an uncertain Naïve Bayesian classification method in the Baota district of Yan'an city, Shaanxi province, China. Firstly, thematic maps representing various factors that are related to landslide activity were generated. Secondly, by using field data and GIS techniques, a landslide hazard map was produced. To improve the accuracy of the resulting landslide hazard map, strategies were designed that quantify the uncertain triggering factors in landslide spatial models based on the uncertain Naïve Bayesian classification method, named the NBU algorithm. The accuracies in terms of the area under the relative operating characteristic curve (AUC) for the NBU and Naïve Bayesian algorithms are 87.29% and 82.47%, respectively. Thus, the NBU algorithm can be used efficiently for landslide hazard analysis and might be widely used for the prediction of various spatial events based on uncertain classification techniques.

  8. Spectral band selection for classification of soil organic matter content

    Science.gov (United States)

    Henderson, Tracey L.; Szilagyi, Andrea; Baumgardner, Marion F.; Chen, Chih-Chien Thomas; Landgrebe, David A.

    1989-01-01

    This paper describes the spectral-band-selection (SBS) algorithm of Chen and Landgrebe (1987, 1988, and 1989) and uses the algorithm to classify the organic matter content in the earth's surface soil. The effectiveness of the algorithm was evaluated by comparing the results of classification of the soil organic matter using SBS bands with those obtained using Landsat MSS bands and TM bands, showing that the algorithm was successful in finding important spectral bands for classification of organic matter content. Using the calculated bands, the probabilities of correct classification for climate-stratified data were found to range from 0.910 to 0.980.

  9. Seismic texture classification. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Vinther, R.

    1997-12-31

    The seismic texture classification method is a seismic attribute that can both recognize the general reflectivity styles and locate variations from these. The seismic texture classification performs a statistical analysis of the seismic section (or volume) aiming at describing the reflectivity. Based on a set of reference reflectivities, the seismic textures are classified. The result of the seismic texture classification is a display of seismic texture categories showing both the styles of reflectivity from the reference set and interpolations and extrapolations from these. The display is interpreted as statistical variations in the seismic data. The seismic texture classification is applied to seismic sections and volumes from the Danish North Sea representing both horizontal stratifications and salt diapirs. The attribute succeeded in recognizing both the general structure of successions and variations from these. Also, the seismic texture classification is not only able to display variations in prospective areas (1-7 sec. TWT) but can also be applied to deep seismic sections. The seismic texture classification is tested on a deep reflection seismic section (13-18 sec. TWT) from the Baltic Sea. Applied to this section, the seismic texture classification succeeded in locating the Moho, which could not be located using conventional interpretation tools. The seismic texture classification is a seismic attribute which can display general reflectivity styles and deviations from these and enhance variations not found by conventional interpretation tools. (LN)

  10. P300 Detection Algorithm Based on Fisher Distance

    Directory of Open Access Journals (Sweden)

    Pan WANG

    2010-12-01

    Full Text Available With the aim of improving the separability of the features extracted by the wavelet transform in P300 detection, we studied the P300 frequency domain of event-related potentials and the influence of mother wavelet selection on the separability of the extracted features, and then proposed a novel P300 feature extraction method based on the wavelet transform and the Fisher distance. This method can select features dynamically for a particular subject and thereby overcome the lack of a systematic feature selection method in traditional wavelet-based P300 feature extraction. In this paper, both the BCI Competition 2003 and the BCI Competition 2005 P300 data sets were used for validation. The experimental results showed that the proposed method can increase the separability of the features extracted by the wavelet transform by 121.8%, and the classification results showed that it can increase the classification accuracy by 1.2% while reducing the classification time by 73.5%. In addition, an integrated multi-domain algorithm is proposed based on this research into EEG feature extraction, which can be utilized in EEG preprocessing, feature extraction and even classification.
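
    The feature-extraction idea, wavelet decomposition followed by Fisher-distance ranking of the coefficients, can be sketched as below; the wavelet family, decomposition level and simulated epochs are assumptions, not the settings used in the study:

      # Sketch: wavelet decomposition of single-trial EEG epochs followed by Fisher-score
      # ranking of the coefficients. Data, wavelet choice and level are stand-ins.
      import numpy as np
      import pywt

      def wavelet_features(epoch, wavelet="db4", level=4):
          coeffs = pywt.wavedec(epoch, wavelet, level=level)
          return np.concatenate(coeffs)

      def fisher_scores(F_target, F_nontarget):
          """Per-feature Fisher distance between target (P300) and non-target trials."""
          m1, m0 = F_target.mean(axis=0), F_nontarget.mean(axis=0)
          v1, v0 = F_target.var(axis=0), F_nontarget.var(axis=0)
          return (m1 - m0) ** 2 / (v1 + v0 + 1e-12)

      rng = np.random.default_rng(0)
      target = np.array([wavelet_features(rng.normal(size=240) + 1.0) for _ in range(50)])
      nontarget = np.array([wavelet_features(rng.normal(size=240)) for _ in range(50)])
      scores = fisher_scores(target, nontarget)
      selected = np.argsort(scores)[::-1][:30]          # keep the 30 most separable coefficients
      print(selected[:10])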

  11. Multisource remote sensing data fusion using fuzzy self-organization mapping network and modified Dempster-Shafer evidential reasoning method of classification

    Science.gov (United States)

    Liu, Chunping; Kong, Ling; Shen, Peihua; Xia, Deshen

    2001-09-01

    By integrating the Fuzzy Kohonen Clustering Network (FKCN) with Fuzzy Dempster-Shafer Evidential Reasoning Theory (FDSERT), a new multi-source remote sensing data fusion algorithm is proposed in this paper. The new algorithm can be applied to the classification of remote sensing images through FKCN learning and FDSERT fusion. Experimental results show that the multi-source data fusion classification algorithm is superior to the FKCN algorithm and can markedly improve classification accuracy. At the same time, the algorithm makes the best use of expert knowledge. Therefore, it is an effective classification algorithm for remote sensing images.

  12. Fuzzy One-Class Classification Model Using Contamination Neighborhoods

    Directory of Open Access Journals (Sweden)

    Lev V. Utkin

    2012-01-01

    Full Text Available A fuzzy classification model is studied in the paper. It is based on the contaminated (robust) model which produces fuzzy expected risk measures characterizing classification errors. Optimal classification parameters of the models are derived by minimizing the fuzzy expected risk. It is shown that an algorithm for computing the classification parameters is reduced to a set of standard support vector machine tasks with weighted data points. Experimental results with synthetic data illustrate the proposed fuzzy model.

  13. A MapReduce based Parallel SVM for Email Classification

    OpenAIRE

    Ke Xu; Cui Wen; Qiong Yuan; Xiangzhu He; Jun Tie

    2014-01-01

    Support Vector Machine (SVM) is a powerful classification and regression tool. Varying approaches including SVM based techniques are proposed for email classification. Automated email classification according to messages or user-specific folders and information extraction from chronologically ordered email streams have become interesting areas in text machine learning research. This paper presents a parallel SVM based on MapReduce (PSMR) algorithm for email classification. We discuss the chal...
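
    A local stand-in for the map/reduce split can illustrate the idea: each map task trains an SVM on one data partition and emits its support vectors, and the reduce step pools them and retrains. This mimics the cascade flavour of such approaches only; it is not the paper's Hadoop-based PSMR implementation:

      # Sketch: local stand-in for a MapReduce-style parallel SVM. Each "map" task trains an
      # SVM on one partition; the "reduce" step pools the support vectors and retrains.
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.datasets import make_classification

      X, y = make_classification(n_samples=6000, n_features=40, random_state=0)

      def map_train(X_part, y_part):
          svm = SVC(kernel="linear").fit(X_part, y_part)
          return svm.support_                         # indices of support vectors within the partition

      parts = np.array_split(np.arange(len(X)), 4)    # 4 partitions ~ 4 mappers
      kept = np.concatenate([parts[i][map_train(X[parts[i]], y[parts[i]])] for i in range(4)])

      final = SVC(kernel="linear").fit(X[kept], y[kept])   # reduce: retrain on pooled support vectors
      print("support vectors kept:", len(kept), "final accuracy:", final.score(X, y))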

  14. Machine Learning for Biological Trajectory Classification Applications

    Science.gov (United States)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms are compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.

  15. Data classification by Fuzzy Ant-Miner

    OpenAIRE

    Mohamed Hamlich; Mohammed Ramdani

    2012-01-01

    In this paper we propose an extension of classification algorithm based on ant colony algorithms to handle continuous valued attributes using the concepts of fuzzy logic. The ant colony algorithms transform continuous attributes into nominal attributes by creating clenched discrete intervals. This may lead to false predictions of the target attribute, especially if the attribute value history is close to the borders of discretization. Continuous attributes are discretized on the fly into fuzz...

  16. On the Aims and Responsibilities of Science

    Directory of Open Access Journals (Sweden)

    Hugh Lacey

    2007-06-01

    Full Text Available I offer a view of the aims and responsibilities of science, and use it to analyze critically van Fraassen’s view that ‘objectifying inquiry’ is fundamental to the nature of science.

  17. Clustering and Classification in Text Collections Using Graph Modularity

    OpenAIRE

    Pivovarov, Grigory; Trunov, Sergei

    2011-01-01

    A new fast algorithm for clustering and classification of large collections of text documents is introduced. The new algorithm employs the bipartite graph that realizes the word-document matrix of the collection. Namely, the modularity of the bipartite graph is used as the optimization functional. Experiments performed with the new algorithm on a number of text collections showed competitive clustering (classification) quality and record-breaking speed.
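
    The idea can be sketched with an off-the-shelf community detector on the word-document graph; note that NetworkX's greedy_modularity_communities optimizes ordinary Newman modularity rather than the bipartite modularity used in the paper, so this is only an approximation of the method:

      # Sketch: cluster documents by community detection on the word-document bipartite graph.
      import networkx as nx
      from networkx.algorithms.community import greedy_modularity_communities

      docs = {
          "d1": ["galaxy", "star", "telescope"],
          "d2": ["star", "supernova", "telescope"],
          "d3": ["protein", "sequence", "gene"],
          "d4": ["gene", "protein", "cell"],
      }

      G = nx.Graph()
      for doc, words in docs.items():
          for w in words:
              G.add_edge(doc, "w:" + w)               # prefix word nodes to keep the graph bipartite

      communities = greedy_modularity_communities(G)
      for i, community in enumerate(communities):
          print(i, sorted(n for n in community if n in docs))   # document members of each community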

  18. Presentation of Database on Aims and Visions

    DEFF Research Database (Denmark)

    Lundgaard, Jacob

    2005-01-01

    This presentation presents a database on aims and visions regarding regional development and transport and infrastructure in the corridor from Oslo-Göteborg-Copenhagen-Berlin. The process, the developed methodology as well as the result of the database is presented.

  19. HIV classification using coalescent theory

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Ming [Los Alamos National Laboratory; Letiner, Thomas K [Los Alamos National Laboratory; Korber, Bette T [Los Alamos National Laboratory

    2008-01-01

    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1. Hence, their quality depends highly on this system. Due to the history of its creation, the current HIV-1 nomenclature contains inconsistencies such as: the phylogenetic distance between subtypes B and D is remarkably small compared with other pairs of subtypes; in fact, it is more like the distance of a pair of sub-subtypes (Robertson et al., 2000). Subtypes E and I no longer exist, since they were discovered to be composed of recombinants (Robertson et al., 2000). It is currently being discussed whether -- instead of CRF02 being a recombinant of subtypes A and G -- subtype G should be designated as a circulating recombinant form (CRF) and CRF02 as a subtype (Abecasis et al., 2007). There are 8 complete and over 400 partial HIV genomes in the LANL database which belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary, like all complex classification systems that were created manually. To this end, it is desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV, but applies to all fast mutating and recombining viruses. Our work addresses the simpler subproblem of scoring classifications of given input sequences of some virus species (a classification denotes a partition of the input sequences into several subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARGs) of the input sequences under restrictions determined by the given classification. These restrictions are imposed in order to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov Chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification. To our

  20. Malware Detection, Supportive Software Agents and Its Classification Schemes

    Directory of Open Access Journals (Sweden)

    Adebayo, Olawale Surajudeen

    2012-12-01

    Full Text Available Over time, the task of curbing the emergence of malware and its dastardly activities has been framed in terms of the analysis, detection and containment of malware. Malware is a general term used to describe the category of malicious software that is part of the security threats to computer and internet systems. It is a malignant program designed to hamper the effectiveness of a computer and internet system. This paper identifies malware as one of the most dreaded threats to emerging computer and communication technology. The paper identifies the categories of malware, malware classification algorithms, malware activities, and ways of preventing and removing malware if it eventually infects a system. The research also describes tools that classify malware datasets using a rule-based classification scheme and machine learning algorithms to distinguish malicious programs from normal programs through pattern recognition.

  1. Digitisation of films and texture analysis for digital classification of pulmonary opacities

    International Nuclear Information System (INIS)

    The study aimed at evaluating the effect of different methods of digitisation of radiographic films on the digital classification of pulmonary opacities. Test sets from the standard of the International Labour Office (ILO) Classification of Radiographs of Pneumoconiosis were prepared by film digitisation using a scanning microdensitometer or a video digitiser based on a personal computer equipped with a real-time digitiser board and a vidicon or a Charge Coupled Device (CCD) camera. Seven different algorithms were used for texture analysis, resulting in 16 texture parameters for each region. All methods used for texture analysis were independent of the mean grey value level and the size of the image analysed. Classification was performed by discriminant analysis using the classes from the ILO classification. A hit ratio of at least 85% was achieved for digitisation by the scanning microdensitometer or the vidicon, while the corresponding results for the CCD camera were significantly worse. Classification by texture analysis of opacities in chest X-rays of pneumoconiosis digitised by a personal computer based video digitiser and a vidicon is of equal quality compared to digitisation by a scanning microdensitometer. Correct classification of 90% was achieved via the described statistical approach. (orig.)

  2. Is Fitts' law continuous in discrete aiming?

    Directory of Open Access Journals (Sweden)

    Rita Sleimen-Malkoun

    Full Text Available The lawful continuous linear relation between movement time and task difficulty (i.e., index of difficulty; ID in a goal-directed rapid aiming task (Fitts' law has been recently challenged in reciprocal performance. Specifically, a discontinuity was observed at critical ID and was attributed to a transition between two distinct dynamic regimes that occurs with increasing difficulty. In the present paper, we show that such a discontinuity is also present in discrete aiming when ID is manipulated via target width (experiment 1 but not via target distance (experiment 2. Fitts' law's discontinuity appears, therefore, to be a suitable indicator of the underlying functional adaptations of the neuro-muscular-skeletal system to task properties/requirements, independently of reciprocal or discrete nature of the task. These findings open new perspectives to the study of dynamic regimes involved in discrete aiming and sensori-motor mechanisms underlying the speed-accuracy trade-off.

  3. Performance Evaluation of Machine Learning Algorithms for Urban Pattern Recognition from Multi-spectral Satellite Images

    OpenAIRE

    Marc Wieland; Massimiliano Pittore

    2014-01-01

    In this study, a classification and performance evaluation framework for the recognition of urban patterns in medium (Landsat ETM, TM and MSS) and very high resolution (WorldView-2, Quickbird, Ikonos) multi-spectral satellite images is presented. The study aims at exploring the potential of machine learning algorithms in the context of an object-based image analysis and to thoroughly test the algorithm’s performance under varying conditions to optimize their usage for urban pattern recognitio...

  4. AIM: Ames Imaging Module Spacecraft Camera

    Science.gov (United States)

    Thompson, Sarah

    2015-01-01

    The AIM camera is a small, lightweight, low power, low cost imaging system developed at NASA Ames. Though it has imaging capabilities similar to those of $1M plus spacecraft cameras, it does so on a fraction of the mass, power and cost budget.

  5. Secure Multi-Party Computation Based Privacy Preserving Extreme Learning Machine Algorithm Over Vertically Distributed Data

    OpenAIRE

    Çatak, Ferhat Özgür

    2016-01-01

    Especially in the Big Data era, the usage of different classification methods is increasing day by day. The success of these classification methods depends on the effectiveness of learning methods. Extreme learning machine (ELM) classification algorithm is a relatively new learning method built on feed-forward neural-network. ELM classification algorithm is a simple and fast method that can create a model from high-dimensional data sets. Traditional ELM learning algorithm implicitly assumes c...

  6. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  7. Neuronal Classification of Atria Fibrillation

    OpenAIRE

    Mohamed BEN MESSAOUD

    2008-01-01

    Motivation. In the medical field, particularly cardiology, diagnosis systems constitute an essential domain of research. In some applications, traditional classification methods present some limitations, and the neural network technique is considered one of the promising algorithms to resolve such problems. Method. In this paper, two approaches of the Artificial Neuronal Network (ANN) technique are investigated to classify heart beats: the Multi Layer Perceptron (MLP) and Radial B...

  8. Classification of Agri-Tourism / Rural Tourism SMEs in Poland (on the Example of the Wielkopolska Region)

    OpenAIRE

    Przezborska, Lucyna

    2005-01-01

    The paper is based on data from a questionnaire survey (interviews) conducted in the western part of Poland on 183 rural tourism and agri-tourism small and medium enterprises. The classification of enterprises was based on the methodology proposed by Wysocki (1996) and included the k-means clustering algorithm. As a result of the research, three types of SMEs were identified, including the top resilient enterprises aimed mainly at tourism activity and usually connected with horse recreation, ...

  9. Evolutionary algorithms

    OpenAIRE

    Szöllösi, Tomáš

    2012-01-01

    The first part of this work deals with optimization and evolutionary algorithms, which are used as a tool to solve complex optimization problems. The discussed algorithms are Differential Evolution, the Genetic Algorithm, Simulated Annealing and the deterministic non-evolutionary algorithm Taboo Search. The discussion then turns to testing the optimization algorithms on a gallery of test functions and comparing the solutions of all algorithms on the Travelling salesman p...

  10. Criteria for comparison of synchronization algorithms spaced measures time and frequency

    OpenAIRE

    Koval, Yuriy; Kostyrya, Alexander; Pryimak, Viacheslav; Al-Tvezhri, Basim

    2012-01-01

    The paper describes the role of, and gives a classification of, synchronization algorithms for spatially separated measures of time and frequency. Criteria for comparing the algorithms are introduced and illustrated with the example of one of the algorithms.

  11. Classification Using Markov Blanket for Feature Selection

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Luo, Jian

    Selecting relevant features is in demand when a large data set is of interest in a classification task. It produces a tractable number of features that are sufficient and possibly improve the classification performance. This paper studies a statistical method of Markov blanket induction for filtering features and then applies a classifier using the Markov blanket predictors. The Markov blanket contains a minimal subset of relevant features that yields optimal classification performance. We experimentally demonstrate the improved performance of several classifiers using Markov blanket induction as a feature selection method. In addition, we point out an important assumption behind the Markov blanket induction algorithm and show its effect on the classification performance.

  12. Classification of ASKAP Vast Radio Light Curves

    Science.gov (United States)

    Rebbapragada, Umaa; Lo, Kitty; Wagstaff, Kiri L.; Reed, Colorado; Murphy, Tara; Thompson, David R.

    2012-01-01

    The VAST survey is a wide-field survey that observes with unprecedented instrument sensitivity (0.5 mJy or lower) and repeat cadence (a goal of 5 seconds) that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Given the unprecedented observing characteristics of VAST, it is important to estimate source classification performance, and determine best practices prior to the launch of ASKAP's BETA in 2012. The goal of this study is to identify light curve characterization and classification algorithms that are best suited for archival VAST light curve classification. We perform our experiments on light curve simulations of eight source types and achieve best case performance of approximately 90% accuracy. We note that classification performance is most influenced by light curve characterization rather than classifier algorithm.

  13. Towards noise classification of road pavements

    OpenAIRE

    Freitas, Elisabete F.; Paulo, Joel; Coelho, J. L. Bento; Pereira, Paulo A. A.

    2008-01-01

    Noise classification of road surfaces has been addressed in many European countries. This paper presents the first approach towards noise classification of Portuguese road pavements. In this early stage, it aims at establishing guidelines for decision makers to support their noise reduction policies and the development of a classification system adapted to the European recommendations. A ranking to provide guidance on tire-road noise emission levels for immediate use by decisio...

  14. Advanced Industrial Materials (AIM) fellowship program

    Energy Technology Data Exchange (ETDEWEB)

    McCleary, D.D. [Oak Ridge Institute for Science and Education, TN (United States)

    1997-04-01

    The Advanced Industrial Materials (AIM) Program administers a Graduate Fellowship Program focused toward helping students who are currently under represented in the nation's pool of scientists and engineers, enter and complete advanced degree programs. The objectives of the program are to: (1) establish and maintain cooperative linkages between DOE and professors at universities with graduate programs leading toward degrees or with degree options in Materials Science, Materials Engineering, Metallurgical Engineering, and Ceramic Engineering, the disciplines most closely related to the AIM Program at Oak Ridge National Laboratory (ORNL); (2) strengthen the capabilities and increase the level of participation of currently under represented groups in master's degree programs, and (3) offer graduate students an opportunity for practical research experience related to their thesis topic through the three-month research assignment or practicum at ORNL. The program is administered by the Oak Ridge Institute for Science and Education (ORISE).

  15. CHILDREN AIMED INTERFACES FOR ANDROID RUNNING DEVICES

    OpenAIRE

    Tîrziu Georgiana Cristina

    2011-01-01

    The paper focuses on the development of mobile interfaces for children. The Android operating system is presented from its appearance onwards, with its features, hardware support and its advantages over other operating systems. Mobile software development requirements on different platforms for mobile devices are identified and described. A graphical interface aimed at children is designed and its features are presented. The interface includes an application for managing school-related tasks and time.

  16. Aims and methods of education: A recapitulation

    OpenAIRE

    Pantić Nataša

    2007-01-01

    This paper gives an overview of principal distinction between the aims of the so-called "traditional" and "progressive" education and respective pedagogies associated with each. The term "traditional" education is used to denote the kind of education that prepares people for their role in society as it is, while the term "progressive" is used for education that aspires to equip mankind with capacity to shape the change of society. The paper raises some critical questions about the role of ped...

  17. Aims and methods of nuclear materials management

    International Nuclear Information System (INIS)

    Whilst international safeguarding of fissile materials against abuse has been the subject of extensive debate, little public attention has so far been devoted to the internal security of these materials. All countries using nuclear energy for peaceful purposes have laid down appropriate regulations. In the Federal Republic of Germany safeguards are required, for instance, by the Atomic Energy Act, and are therefore a prerequisite for licensing. The aims and methods of national nuclear materials management are contrasted with viewpoints on international safeguards

  18. Classification in Australia.

    Science.gov (United States)

    McKinlay, John

    Despite some inroads by the Library of Congress Classification and short-lived experimentation with Universal Decimal Classification and Bliss Classification, Dewey Decimal Classification, with its ability in recent editions to be hospitable to local needs, remains the most widely used classification system in Australia. Although supplemented at…

  19. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper surveys the classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary classification research focus on contextual information as the guide for the design and construction of classification schemes.

  20. Multi-borders classification

    OpenAIRE

    Mills, Peter

    2014-01-01

    The number of possible methods of generalizing binary classification to multi-class classification increases exponentially with the number of class labels. Often, the best method of doing so will be highly problem dependent. Here we present classification software in which the partitioning of multi-class classification problems into binary classification problems is specified using a recursive control language.

  1. Learning Interpretable SVMs for Biological Sequence Classification

    OpenAIRE

    Sonnenburg Sören; Rätsch Gunnar; Schäfer Christin

    2006-01-01

    Abstract Background Support Vector Machines (SVMs) – using a variety of string kernels – have been successfully applied to biological sequence classification problems. While SVMs achieve high classification accuracy they lack interpretability. In many applications, it does not suffice that an algorithm just detects a biological signal in the sequence, but it should also provide means to interpret its solution in order to gain biological insight. Results We propose novel and efficient algorith...

  2. Multi-engine packet classification hardware accelerator

    OpenAIRE

    Kennedy, Alan; Liu, Zhen; Wang, Xiaojun; Liu, Bin

    2009-01-01

    As line rates increase, the task of designing high performance architectures with reduced power consumption for the processing of router traffic remains important. In this paper, we present a multi-engine packet classification hardware accelerator, which gives increased performance and reduced power consumption. It follows the basic idea of decision-tree based packet classification algorithms, such as HiCuts and HyperCuts, in which the hyperspace represented by the ruleset is recursively divi...

  3. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are treated as classes, and we use a classification algorithm to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  4. Photometric Supernova Classification with Machine Learning

    Science.gov (United States)

    Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
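
    The second stage of such a pipeline, a boosted-decision-tree classifier scored by the area under the ROC curve, can be sketched as below; the feature matrix is simulated and merely stands in for SALT2 or wavelet coefficients extracted from the light curves:

      # Sketch: boosted decision trees on (simulated) light-curve features, scored with AUC.
      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(0)
      X = rng.normal(size=(2000, 20))                       # stand-in light-curve features
      y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)  # Ia vs non-Ia

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
      bdt = GradientBoostingClassifier(n_estimators=200, max_depth=3).fit(X_tr, y_tr)
      print("AUC:", roc_auc_score(y_te, bdt.predict_proba(X_te)[:, 1]))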

  5. Bazhenov Fm Classification Based on Wireline Logs

    Science.gov (United States)

    Simonov, D. A.; Baranov, V.; Bukhanov, N.

    2016-03-01

    This paper considers the main aspects of Bazhenov Formation interpretation and the application of machine learning algorithms to the Kolpashev type section of the Bazhenov Formation, namely the application of automatic classification algorithms that would change the scale of research from small to large. Machine learning algorithms help interpret the Bazhenov Formation in a reference well and in other wells. During this study, unsupervised and supervised machine learning algorithms were applied to interpret lithology and reservoir properties. This greatly simplifies the routine task of manual interpretation and has an economic effect on the cost of laboratory analysis.
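
    Since the record does not name the learning algorithm or the log curves used, the sketch below illustrates the general idea with a random forest on hypothetical wireline mnemonics (GR, RHOB, NPHI, DT) and synthetic per-depth samples labelled on a reference well:

      # Sketch: supervised lithology labelling from wireline logs with a random forest.
      import numpy as np
      import pandas as pd
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      n = 1000
      logs = pd.DataFrame({
          "GR":   rng.normal(90, 25, n),      # gamma ray
          "RHOB": rng.normal(2.45, 0.1, n),   # bulk density
          "NPHI": rng.normal(0.15, 0.05, n),  # neutron porosity
          "DT":   rng.normal(80, 10, n),      # sonic slowness
      })
      lithology = np.where(logs["GR"] > 100, "shale",
                   np.where(logs["RHOB"] < 2.4, "reservoir", "tight"))   # stand-in labels

      clf = RandomForestClassifier(n_estimators=300, random_state=0)
      print("cross-validated accuracy:", cross_val_score(clf, logs, lithology, cv=5).mean())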

  6. Data classification by Fuzzy Ant-Miner

    Directory of Open Access Journals (Sweden)

    Mohamed Hamlich

    2012-03-01

    Full Text Available In this paper we propose an extension of classification algorithm based on ant colony algorithms to handle continuous valued attributes using the concepts of fuzzy logic. The ant colony algorithms transform continuous attributes into nominal attributes by creating clenched discrete intervals. This may lead to false predictions of the target attribute, especially if the attribute value history is close to the borders of discretization. Continuous attributes are discretized on the fly into fuzzy partitions that will be used to develop an algorithm called Fuzzy Ant-Miner. Fuzzy rules are generated by using the concept of fuzzy entropy and fuzzy fitness of a rule.

  7. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    Science.gov (United States)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    The ionosphere is an anisotropic, inhomogeneous, time-varying and spatio-temporally dispersive medium whose parameters can almost always be estimated only by using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify due to the extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and the W-index provide information about variability on a global scale, the identification and classification of regional disturbances poses a challenge. The main aim of this study is to classify the regional effects of global geomagnetic storms and to classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, will be used to model the regional and local variability that differs from global activity, along with solar and geomagnetic indices. In this work, for the automated classification of the regional disturbances, a classification technique based on a robust machine learning method that has found widespread use, the Support Vector Machine (SVM), is proposed. SVM is a supervised learning model used for classification, with an associated learning algorithm that analyzes the data and recognizes patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. The performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. As a result of implementing the developed classification

  8. Classification-based reasoning

    Science.gov (United States)

    Gomez, Fernando; Segami, Carlos

    1991-01-01

    A representation formalism for N-ary relations, quantification, and definition of concepts is described. Three types of conditions are associated with the concepts: (1) necessary and sufficient properties, (2) contingent properties, and (3) necessary properties. Also explained is how complex chains of inferences can be accomplished by representing existentially quantified sentences, and concepts denoted by restrictive relative clauses as classification hierarchies. The representation structures that make possible the inferences are explained first, followed by the reasoning algorithms that draw the inferences from the knowledge structures. All the ideas explained have been implemented and are part of the information retrieval component of a program called Snowy. An appendix contains a brief session with the program.

  9. Predictive Classification Trees

    Science.gov (United States)

    Dlugosz, Stephan; Müller-Funk, Ulrich

    CART (Breiman et al., Classification and Regression Trees, Chapman and Hall, New York, 1984) and (exhaustive) CHAID (Kass, Appl Stat 29:119-127, 1980) figure prominently among the procedures actually used in data based management, etc. CART is a well-established procedure that produces binary trees. CHAID, in contrast, admits multiple splittings, a feature that allows to exploit the splitting variable more extensively. On the other hand, that procedure depends on premises that are questionable in practical applications. This can be put down to the fact that CHAID relies on simultaneous Chi-Square- resp. F-tests. The null-distribution of the second test statistic, for instance, relies on the normality assumption that is not plausible in a data mining context. Moreover, none of these procedures - as implemented in SPSS, for instance - take ordinal dependent variables into account. In the paper we suggest an alternative tree-algorithm that: Requires explanatory categorical variables

  10. Contextual classification on PASM. [multimicroprocessor system for image processing and pattern recognition

    Science.gov (United States)

    Siegel, H. J.; Swain, P. H.

    1981-01-01

    The use of N microprocessors in the SIMD mode of parallel processing to do classifications almost N times faster than a single microprocessor is discussed. Examples of contextual classifiers are given, uniprocessor algorithms for performing contextual classifications are presented, and their computational complexity is analyzed. The SIMD mode of parallel processing is defined and PASM is overviewed. The presented uniprocessor algorithms are used as a basis for developing parallel algorithms for performing computationally intensive contextual classifications.

  11. Sow-activity classification from acceleration patterns

    DEFF Research Database (Denmark)

    Escalante, Hugo Jair; Rodriguez, Sara V.; Cordero, Jorge; Kristensen, Anders Ringgaard; Cornou, Cecile

    2013-01-01

    This paper describes a supervised learning approach to sow-activity classification from accelerometer measurements. In the proposed methodology, pairs of accelerometer measurements and activity types are considered as labeled instances of a usual supervised classification task. Under this scenario, sow-activity classification can be approached with standard machine learning methods for pattern classification. Individual predictions for elements of time series of arbitrary length are combined to classify it as a whole. An extensive comparison of representative learning algorithms, including neural networks, support vector machines, and ensemble methods, is presented. Experimental results are reported using a data set for sow-activity classification collected in a real production herd. The data set, which has been widely used in related works, includes measurements from active (Feeding...

  12. Texture Classification Based on Texton Features

    Directory of Open Access Journals (Sweden)

    U Ravi Babu

    2012-08-01

    Full Text Available Texture analysis plays an important role in the interpretation, understanding, and recognition of terrain, biomedical, and microscopic images. Every texture analysis method depends on how well the selected texture features characterize the image: whenever a new texture feature is derived, it must be tested for whether it classifies textures precisely. Not only the texture features themselves but also the way in which they are applied is significant for precise and accurate texture classification and analysis. To achieve high classification accuracy, the present paper proposes a new texton-based method for efficient, rotationally invariant texture classification. The proposed Texton Features (TF) evaluate the relationship between the values of neighboring pixels. The proposed classification algorithm applies histogram-based techniques to the TF for precise classification. Experimental results on various stone textures indicate the efficacy of the proposed method when compared to other methods.
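
    As a hedged stand-in for the paper's Texton Features (the 2x2 pattern coding and quantization levels below are invented, not the authors' definition), a neighbouring-pixel pattern histogram can be computed like this:

        import numpy as np

        def texton_like_histogram(img, levels=4):
            # Assumes an 8-bit grayscale image. Quantize to a few gray levels, encode each
            # 2x2 block by which pixels share the top-left pixel's level, histogram the codes.
            q = np.floor(img.astype(float) / 256.0 * levels).astype(int)
            h, w = q.shape
            codes = []
            for i in range(h - 1):
                for j in range(w - 1):
                    block = q[i:i + 2, j:j + 2].ravel()
                    code = sum(int(block[k] == block[0]) << (k - 1) for k in range(1, 4))
                    codes.append(code)
            hist, _ = np.histogram(codes, bins=np.arange(9))
            return hist / max(len(codes), 1)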

  13. Extreme Learning Machine for land cover classification

    CERN Document Server

    Pal, Mahesh

    2008-01-01

    This paper explores the potential of an extreme learning machine based supervised classification algorithm for land cover classification. In comparison to a backpropagation neural network, which requires setting several user-defined parameters and may converge to local minima, an extreme learning machine requires setting only one parameter and produces a unique solution. An ETM+ multispectral data set (England) was used to judge the suitability of the extreme learning machine for remote sensing classification. A backpropagation neural network was used to compare its performance in terms of classification accuracy and computational cost. Results suggest that the extreme learning machine performs as well as the backpropagation neural network in terms of classification accuracy on this data set, while its computational cost is very small in comparison to the backpropagation neural network.
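
    The single-pass training that makes the extreme learning machine cheap can be sketched in a few lines (a minimal textbook-style sketch; the hidden-layer size, sigmoid activation, and random seed are illustrative choices, not taken from the paper):

        import numpy as np

        class ELMClassifier:
            def __init__(self, n_hidden=100, seed=0):
                self.n_hidden = n_hidden
                self.rng = np.random.default_rng(seed)

            def fit(self, X, y):
                # y: integer class labels 0..K-1. Random input weights, then output
                # weights from a single least-squares solve (Moore-Penrose pseudoinverse).
                T = np.eye(int(y.max()) + 1)[y]                     # one-hot targets
                self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
                self.b = self.rng.normal(size=self.n_hidden)
                H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))    # hidden-layer outputs
                self.beta = np.linalg.pinv(H) @ T                   # analytic output weights
                return self

            def predict(self, X):
                H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))
                return np.argmax(H @ self.beta, axis=1)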

  14. BRAIN TUMOR CLASSIFICATION BASED ON CLUSTERED DISCRETE COSINE TRANSFORM IN COMPRESSED DOMAIN

    Directory of Open Access Journals (Sweden)

    V. Anitha

    2014-01-01

    Full Text Available This study presents a novel method to classify brain tumors by means of efficient, integrated methods so as to increase classification accuracy. In conventional systems the task is the same: extract feature sets from the database and classify tumors on the basis of those feature sets. The main idea in a plethora of earlier classification research is to increase classification accuracy. The actual need is to achieve better classification accuracy by extracting more relevant feature sets after dimensionality reduction; there is a trade-off between accuracy and the number of feature sets. Hence the focus of this study is to apply the Discrete Cosine Transform (DCT) to the brain tumor images of the various classes. Used by itself, the DCT already offers a fair dimensionality reduction of the feature sets. Subsequently, the K-means algorithm is applied to the DCT coefficients to cluster the feature sets; this cluster information is treated as the refined feature set and classified using a Support Vector Machine (SVM), as proposed in this study. Using the DCT in this way allows the classification performance to be adjusted by varying the number of DCT coefficients taken into account. There is a strong demand for automatic classification of brain tumors, which greatly helps in the process of diagnosis. In this work, an average of 97% and a maximum of 100% classification accuracy have been achieved. This research opens a new way of classification in the compressed domain and may therefore be highly suitable for diagnosis under mobile computing and internet-based medical diagnosis.
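
    One plausible reading of the DCT -> K-means -> SVM pipeline (a sketch only; block sizes, cluster counts, and the distance-to-centroid refinement are assumptions, not the paper's exact recipe):

        import numpy as np
        from scipy.fft import dctn
        from sklearn.cluster import KMeans
        from sklearn.svm import SVC

        def dct_features(img, n_coeffs=64):
            # Keep a small low-frequency block of 2-D DCT coefficients as the feature vector.
            c = dctn(img.astype(float), norm="ortho")
            k = int(np.sqrt(n_coeffs))
            return c[:k, :k].ravel()

        def cluster_refine(features, n_clusters=8, seed=0):
            # Refine raw DCT features into distances to K-means centroids.
            km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(features)
            return km.transform(features)

        # images: list of 2-D arrays, labels: class ids per image (both hypothetical)
        # raw = np.array([dct_features(im) for im in images])
        # clf = SVC(kernel="rbf").fit(cluster_refine(raw), labels)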

  15. Aims and methods of education: A recapitulation

    Directory of Open Access Journals (Sweden)

    Pantić Nataša

    2007-01-01

    Full Text Available This paper gives an overview of the principal distinction between the aims of so-called "traditional" and "progressive" education and the respective pedagogies associated with each. The term "traditional" education is used to denote the kind of education that prepares people for their role in society as it is, while the term "progressive" is used for education that aspires to equip mankind with the capacity to shape the change of society. The paper raises some critical questions about the role of pedagogy in achieving the aims of the progressive model, arguing that the employment of "progressive" methods does not necessarily guarantee the achievement of the commonly professed purposes of progressive education. This is illustrated by the results of a study in English schools showing how, despite the claim of progressive methods, teachers tend to retain traditional attitudes and, on the other hand, how even traditional teaching methods can serve progressive purposes. This is not to advocate for traditional pedagogy, but to suggest that it might be something other than pedagogy that makes a critical difference in educating liberal-minded citizens of the future. In this sense the paper explores the role of other factors that make a difference towards progressive education, such as the democratization of human relations in the school ethos and respect for children's freedom.

  16. Classification and knowledge

    Science.gov (United States)

    Kurtz, Michael J.

    1989-01-01

    Automated procedures to classify objects are discussed. The classification problem is reviewed, and the relation of epistemology and classification is considered. The classification of stellar spectra and of resolved images of galaxies is addressed.

  17. Hazard classification methodology

    International Nuclear Information System (INIS)

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility.

  18. Remote Sensing Information Classification

    Science.gov (United States)

    Rickman, Douglas L.

    2008-01-01

    This viewgraph presentation reviews the classification of Remote Sensing data in relation to epidemiology. Classification is a way to reduce the dimensionality and precision to something a human can understand. Classification changes SCALAR data into NOMINAL data.

  19. HEATR project: ATR algorithm parallelization

    Science.gov (United States)

    Deardorf, Catherine E.

    1998-09-01

    High Performance Computing (HPC) Embedded Application for Target Recognition (HEATR) is a project funded by the High Performance Computing Modernization Office through the Common HPC Software Support Initiative (CHSSI). The goal of CHSSI is to produce portable, parallel, multi-purpose, freely distributable support software that exploits emerging parallel computing technologies and enables the application of scalable HPCs to various critical DoD applications. Specifically, the CHSSI goal for HEATR is to provide portable, parallel versions of several existing ATR detection and classification algorithms to the ATR-user community to achieve near real-time capability. The HEATR project will create parallel versions of existing automatic target recognition (ATR) detection and classification algorithms and generate reusable code that will support the porting and software development process for ATR HPC software. The HEATR team has selected detection/classification algorithms from both the model-based and the training-based (template-based) arenas in order to consider the parallelization requirements for detection/classification algorithms across ATR technology. This allows the team to assess the impact that parallelization has on detection/classification performance across ATR technology. A field demo is included in the project. Finally, any parallel tools produced to support the project will be refined and returned to the ATR user community along with the parallel ATR algorithms. This paper reviews: (1) the HPCMP structure as it relates to HEATR, (2) the overall structure of the HEATR project, (3) preliminary results for the first algorithm Alpha Test, (4) CHSSI requirements for HEATR, and (5) project management issues and lessons learned.

  20. Discriminative Structured Dictionary Learning for Image Classification

    Institute of Scientific and Technical Information of China (English)

    王萍; 兰俊花; 臧玉卫; 宋占杰

    2016-01-01

    In this paper, a discriminative structured dictionary learning algorithm is presented. To enhance the dictionary’s discriminative power, the reconstruction error, the classification error, and the inhomogeneous representation error are integrated into the objective function. The proposed approach learns a single structured dictionary and a linear classifier jointly. The learned dictionary encourages samples from the same class to have similar sparse codes and samples from different classes to have dissimilar sparse codes. The objective function is solved by employing a feature-sign search algorithm and the Lagrange dual method. Experimental results on three public databases demonstrate that the proposed approach outperforms several recently proposed dictionary learning techniques for classification.
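
    A generic form of such a composite objective (notation and weighting terms are illustrative, not taken from the paper) is

        \min_{D,\,W,\,X}\ \|Y - DX\|_F^2 \;+\; \alpha\,\|H - WX\|_F^2 \;+\; \beta \sum_{c} \|\bar{D}_c X_c\|_F^2 \;+\; \lambda\,\|X\|_1

    where Y holds the training samples, D is the structured dictionary with class-specific blocks, X the sparse codes, H the label matrix, W the linear classifier, and \bar{D}_c the sub-dictionary blocks not associated with class c, so that the third term penalizes inhomogeneous (cross-class) representation.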