WorldWideScience

Sample records for classification algorithm aimed

  1. An Ensemble Classification Algorithm for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    K.Kavitha

    2014-04-01

    Full Text Available Hyperspectral image analysis has been used for many purposes in environmental monitoring, remote sensing, vegetation research and land cover classification. A hyperspectral image consists of many layers, each representing a specific wavelength; the layers stack on top of one another, forming a cube-like image covering the entire spectrum. This work aims to classify hyperspectral images and to produce an accurate thematic map. Spatial information is collected by applying morphological profiles and local binary patterns. The support vector machine is an efficient algorithm for classifying hyperspectral images, and a genetic algorithm is used to obtain the best feature subset for classification. The selected features are then classified to obtain the classes and produce the thematic map. Experiments are carried out on the AVIRIS Indian Pines and ROSIS Pavia University datasets; the proposed method achieves 93% accuracy for Indian Pines and 92% for Pavia University.
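
As a rough illustration of the feature-selection step described above, the sketch below evolves a band mask with a genetic algorithm and scores it with a simple nearest-centroid classifier standing in for the paper's SVM. The synthetic pixels, band count, and GA settings are all invented for illustration.

```python
import random

random.seed(0)
N_BANDS = 8

# Synthetic "pixels": class 0 and class 1 differ only in bands 2 and 5.
def make_pixel(label):
    x = [random.gauss(0.0, 1.0) for _ in range(N_BANDS)]
    if label == 1:
        x[2] += 3.0
        x[5] += 3.0
    return x

train = [(make_pixel(c % 2), c % 2) for c in range(60)]

def fitness(mask):
    """Training accuracy of a nearest-centroid classifier on selected bands."""
    bands = [i for i in range(N_BANDS) if mask[i]]
    if not bands:
        return 0.0
    cents = {}
    for lab in (0, 1):
        pts = [x for x, y in train if y == lab]
        cents[lab] = [sum(p[i] for p in pts) / len(pts) for i in bands]
    correct = 0
    for x, y in train:
        d = {lab: sum((x[i] - c[j]) ** 2 for j, i in enumerate(bands))
             for lab, c in cents.items()}
        correct += (min(d, key=d.get) == y)
    return correct / len(train)

# Standard GA loop: elitism, tournament-style selection from the top half,
# one-point crossover, bit-flip mutation.
pop = [[random.randint(0, 1) for _ in range(N_BANDS)] for _ in range(20)]
for gen in range(15):
    scored = sorted(pop, key=fitness, reverse=True)
    nxt = scored[:2]                      # keep the two best masks
    while len(nxt) < len(pop):
        a, b = random.sample(scored[:10], 2)
        cut = random.randrange(1, N_BANDS)
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:
            i = random.randrange(N_BANDS)
            child[i] ^= 1
        nxt.append(child)
    pop = nxt

best = max(pop, key=fitness)
print(best, fitness(best))
```

The fitness function here is only a stand-in; in the paper's pipeline the selected features would be fed to an SVM instead.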

  2. Classification algorithms using adaptive partitioning

    KAUST Repository

    Binev, Peter

    2014-12-01

    © 2014 Institute of Mathematical Statistics. Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335–1353; Mach. Learn. 66 (2007) 209–242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms of the parameter α of margin conditions and a rate s of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness β of the regression function that governs its approximability by piecewise polynomials on adaptive partitions. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Hölder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.
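
The piecewise-constant tree partitioning that these methods build on can be sketched in one dimension: split a cell while its labels are mixed, and label each leaf by majority vote. The toy data and depth cap below are illustrative assumptions, not the paper's construction.

```python
def grow(points, lo, hi, depth, max_depth=6):
    """points: list of (x, label) in [lo, hi). Returns a nested dict tree."""
    labels = {y for _, y in points}
    if depth == max_depth or len(labels) <= 1:
        ones = sum(y for _, y in points)
        return {"leaf": 1 if 2 * ones >= len(points) else 0}
    mid = (lo + hi) / 2.0
    left = [(x, y) for x, y in points if x < mid]
    right = [(x, y) for x, y in points if x >= mid]
    return {"mid": mid,
            "L": grow(left, lo, mid, depth + 1, max_depth),
            "R": grow(right, mid, hi, depth + 1, max_depth)}

def classify(tree, x):
    while "leaf" not in tree:
        tree = tree["L"] if x < tree["mid"] else tree["R"]
    return tree["leaf"]

# Toy Bayes set: [0.3, 0.8] -- label 1 inside, 0 outside (noise-free).
data = [(i / 200.0, 1 if 0.3 <= i / 200.0 <= 0.8 else 0) for i in range(200)]
tree = grow(data, 0.0, 1.0, 0)
print(classify(tree, 0.5), classify(tree, 0.1))
```

The decorated trees of the paper replace the constant leaf label with a higher-order local fit; this sketch shows only the baseline set-estimator idea.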

  3. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    Science.gov (United States)

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users must first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as classification attributes for the mobile user, classifying the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile user data with the algorithm: the mobile users can be classified into Basic service, E-service, Plus service, and Total service user classes, and some rules about the mobile users can also be derived. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and is simpler. PMID:24688389
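
The decision-tree half of such a method rests on standard impurity-based split selection. Below is a minimal sketch of C4.5-style information gain on an invented "mobile user" table; the attribute names and service-class labels are illustrative, not the paper's data.

```python
import math

def entropy(labels):
    """Shannon entropy (bits) of a label list."""
    total = len(labels)
    ent = 0.0
    for c in set(labels):
        p = labels.count(c) / total
        ent -= p * math.log2(p)
    return ent

def info_gain(rows, labels, attr):
    """Reduction in label entropy from splitting on attribute `attr`."""
    base = entropy(labels)
    rem = 0.0
    for v in {r[attr] for r in rows}:
        sub = [labels[i] for i, r in enumerate(rows) if r[attr] == v]
        rem += len(sub) / len(rows) * entropy(sub)
    return base - rem

rows = [
    {"context": "public",  "usage": "high"},
    {"context": "public",  "usage": "low"},
    {"context": "private", "usage": "high"},
    {"context": "private", "usage": "low"},
]
labels = ["Plus", "Basic", "Plus", "E-service"]

# Pick the attribute with the highest information gain as the split.
best = max(rows[0], key=lambda a: info_gain(rows, labels, a))
print(best)   # → usage
```

In the paper's hybrid, a genetic algorithm would then refine the tree produced by such splits; the GA stage is omitted here.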

  4. ONLINE REGULARIZED GENERALIZED GRADIENT CLASSIFICATION ALGORITHMS

    Institute of Scientific and Technical Information of China (English)

    Leilei Zhang; Baohui Sheng; Jianli Wang

    2010-01-01

    This paper considers online classification learning algorithms for regularized classification schemes with generalized gradient. A novel capacity-independent approach is presented. It verifies the strong convergence of the algorithm and yields satisfactory convergence rates for polynomially decaying step sizes. Compared with the gradient schemes, this algorithm needs fewer additional assumptions on the loss function and derives a stronger result with respect to the choice of step sizes and the regularization parameters.
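
A generic sketch of an online regularized classification scheme of this flavor: subgradient steps on the hinge loss with a ridge regularizer and polynomially decaying step sizes eta_t = eta0 / t**theta. The data stream and all constants are illustrative choices, not taken from the paper.

```python
import random

random.seed(1)

w, b = [0.0, 0.0], 0.0
lam, eta0, theta = 0.01, 0.5, 0.6   # regularization weight, step-size schedule

def sample():
    """One labeled example from a synthetic two-class stream."""
    y = random.choice([-1, 1])
    x = [random.gauss(y * 1.5, 1.0), random.gauss(-y * 1.0, 1.0)]
    return x, y

for t in range(1, 2001):
    x, y = sample()
    eta = eta0 / t ** theta
    margin = y * (w[0] * x[0] + w[1] * x[1] + b)
    # Subgradient of hinge loss max(0, 1 - margin) plus the ridge term.
    if margin < 1:
        w = [wi - eta * (lam * wi - y * xi) for wi, xi in zip(w, x)]
        b += eta * y
    else:
        w = [wi - eta * lam * wi for wi in w]

# The learned separator should point along the class-mean difference:
# positive in coordinate 0, negative in coordinate 1.
print(w, b)
```

The hinge loss stands in for the paper's general loss with generalized gradient; any convex loss with a computable subgradient fits the same loop.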

  5. Behavior Classification Algorithms at Intersections

    OpenAIRE

    Aoude, Georges; Desaraju, Vishnu Rajeswar; Stephens, Lauren H.; How, Jonathan P.

    2011-01-01

    The ability to classify driver behavior lays the foundation for more advanced driver assistance systems. Improving safety at intersections has also been identified as a high priority due to the large number of intersection-related fatalities. This paper focuses on developing algorithms for estimating driver behavior at road intersections. It introduces two classes of algorithms that can classify drivers as compliant or violating. They are based on 1) Support Vector Machines (SVM) and 2) Hidden ...

  6. Intelligent Contextual Algorithm For Harmonics Classification

    Directory of Open Access Journals (Sweden)

    M.K. ELANGO

    2010-06-01

    Full Text Available This paper presents methods for the classification of harmonics present in an electrical signal using the Fast Fourier Transform (FFT), Contextual Clustering (CC) and the Back Propagation Algorithm (BPA). A power quality meter has been used to collect electrical signal data from a 40 W fluorescent lamp (FL). Various electrical disturbances are introduced into the captured data through Matlab code. The FFT has been used to extract features from the acquired electrical signal. The FFT, CC, BPA and BPACC algorithms have been implemented in Matlab, and a comparison of the harmonic classification performance of CC, BPA and BPACC is presented.
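
The FFT feature-extraction step can be illustrated with a plain DFT on a synthetic distorted 50 Hz waveform: because the window spans exactly one fundamental period, harmonic k lands in DFT bin k. The waveform and its harmonic amplitudes are invented for illustration.

```python
import math

N, fs = 200, 10000.0               # 200 samples at 10 kHz = one 50 Hz cycle
t = [n / fs for n in range(N)]

# Fundamental plus 3rd and 5th harmonics (amplitudes 1.0, 0.3, 0.1).
x = [math.sin(2 * math.pi * 50 * ti)
     + 0.3 * math.sin(2 * math.pi * 150 * ti)
     + 0.1 * math.sin(2 * math.pi * 250 * ti) for ti in t]

def mag(k):
    """Amplitude estimate at DFT bin k (scaled so a pure sine reads its amplitude)."""
    re = sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
    im = -sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
    return 2.0 * math.hypot(re, im) / N

harmonics = {k: round(mag(k), 3) for k in (1, 3, 5)}
print(harmonics)   # → {1: 1.0, 3: 0.3, 5: 0.1}
```

These per-harmonic magnitudes are the kind of feature vector that a clustering or back-propagation stage would then classify.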

  7. Classification Algorithms for Determining Handwritten Digit

    Directory of Open Access Journals (Sweden)

    Hayder Naser Khraibet AL-Behadili

    2016-06-01

    Full Text Available Data-intensive science is a critical science paradigm that interacts with all other sciences. Data mining (DM) is a powerful and useful technology with a wide range of potential users; it focuses on important, meaningful patterns and discovers new knowledge from a collected dataset. Any predictive task in DM uses some attributes to classify an unknown class. Classification algorithms are a class of prominent mathematical techniques in DM, and constructing a model is the core aspect of such algorithms. However, their performance depends highly on how the algorithm handles the data. Focusing on binarization as a preprocessing approach, this paper analyses and evaluates different classification algorithms when constructing a model, based on accuracy in the classification task. The Modified National Institute of Standards and Technology (MNIST) handwritten digits dataset provided by Yann LeCun was used in the evaluation. The paper focuses on machine learning approaches for handwritten digit detection; machine learning provides classification methods such as K-Nearest Neighbor (KNN), Decision Tree (DT), and Neural Networks (NN). Results showed that the knowledge-based method, i.e. the NN algorithm, is more accurate in determining the digits as it reduces the error rate. The implication of this evaluation is to provide essential insights for computer scientists and practitioners in choosing the suitable DM technique that fits their data.
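
A minimal K-Nearest-Neighbor classifier of the kind compared in the paper, run on a tiny invented feature table rather than the actual MNIST images (the features and labels below are purely illustrative):

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of (features, label); returns the majority label of the k nearest."""
    nearest = sorted(train, key=lambda fx: math.dist(fx[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy features: (mean ink density, number of loops); labels are digits.
train = [((0.10, 0), 1), ((0.12, 0), 1), ((0.35, 1), 0),
         ((0.33, 1), 0), ((0.50, 2), 8), ((0.52, 2), 8)]

print(knn_predict(train, (0.11, 0)))   # → 1
print(knn_predict(train, (0.51, 2)))   # → 8
```

On real MNIST the features would be the 784 raw (or binarized) pixel values, but the prediction rule is identical.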

  8. An Experimental Comparative Study on Three Classification Algorithms

    Institute of Scientific and Technical Information of China (English)

    蔡巍; 王永成; 李伟; 尹中航

    2003-01-01

    The classification algorithm is one of the key techniques affecting an automatic text classification system's performance and plays an important role in automatic classification research. This paper comparatively analyzes k-NN, VSM and a hybrid classification algorithm presented by our research group. Some 2000 pieces of Internet news provided by ChinaInfoBank are used in the experiment. The result shows that the performance of the hybrid algorithm presented by the group is superior to that of the other two algorithms.

  9. Automatic modulation classification principles, algorithms and applications

    CERN Document Server

    Zhu, Zhechen

    2014-01-01

    Automatic Modulation Classification (AMC) has been a key technology in many military, security, and civilian telecommunication applications for decades. In military and security applications, modulation often serves as another level of encryption; in modern civilian applications, multiple modulation types can be employed by a signal transmitter to control the data rate and link reliability. This book offers comprehensive documentation of AMC models, algorithms and implementations for successful modulation recognition. It provides an invaluable theoretical and numerical comparison of AMC algo

  10. Fast deterministic algorithm for EEE components classification

    Science.gov (United States)

    Kazakovtsev, L. A.; Antamoshkin, A. N.; Masich, I. S.

    2015-10-01

    The authors consider the problem of automatic classification of electronic, electrical and electromechanical (EEE) components based on the results of test control. Electronic components of the same type used in a high-quality unit must be produced as a single production batch from a single batch of raw materials. Data from the test control are used for splitting a shipped lot of components into several classes representing the production batches. Methods such as k-means++ clustering or evolutionary algorithms combine local search and random search heuristics. The proposed fast algorithm returns a unique result for each data set, and the result is comparatively precise. If the data processing is performed by the customer of the EEE components, this feature of the algorithm allows easy checking of the results by a producer or supplier.

  11. A new classification algorithm based on RGH-tree search

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we put forward a new classification algorithm based on RGH-tree search and perform a classification analysis and comparison study. This algorithm can save computing resources and increase classification efficiency. The experiment shows that the algorithm achieves better results in dealing with three-dimensional multi-class data, and we find that it has better generalization ability for small training sets and large test sets.

  12. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  13. Machine Learning Algorithms in Web Page Classification

    Directory of Open Access Journals (Sweden)

    W.A.AWAD

    2012-11-01

    Full Text Available In this paper we use machine learning algorithms like SVM, KNN and GIS to perform a behavior comparison on the web page classification problem. From the experiment we see that the SVM with a small number of negative documents to build the centroids has the smallest storage requirement and the least on-line test computation cost. But almost all GIS configurations, with different numbers of nearest neighbors, have an even higher storage requirement and on-line test computation cost than KNN. This suggests that future work should try to reduce the storage requirement and on-line test cost of GIS.

  14. An SMP soft classification algorithm for remote sensing

    Science.gov (United States)

    Phillips, Rhonda D.; Watson, Layne T.; Easterling, David R.; Wynne, Randolph H.

    2014-07-01

    This work introduces a symmetric multiprocessing (SMP) version of the continuous iterative guided spectral class rejection (CIGSCR) algorithm, a semiautomated classification algorithm for remote sensing (multispectral) images. The algorithm uses soft data clusters to produce a soft classification containing inherently more information than a comparable hard classification, at an increased computational cost. Previous work suggests that similar algorithms achieve good parallel scalability, motivating the parallel algorithm development work here. Experimental results of applying parallel CIGSCR to an image with approximately 10^8 pixels and six bands demonstrate superlinear speedup. A soft two-class classification is generated in just over 4 min using 32 processors.
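
The "soft cluster" idea can be sketched as follows: each pixel receives a normalized membership weight in every cluster instead of a single hard label. The Gaussian-style weighting and toy cluster centers below are illustrative, not CIGSCR's actual update rule.

```python
import math

centers = [(0.2, 0.4), (0.7, 0.1)]   # two cluster means in two "bands"

def memberships(pixel, beta=10.0):
    """Soft weights: exp(-beta * squared distance) to each center, normalized to sum to 1."""
    w = [math.exp(-beta * sum((p - c) ** 2 for p, c in zip(pixel, ctr)))
         for ctr in centers]
    s = sum(w)
    return [wi / s for wi in w]

m = memberships((0.25, 0.35))        # a pixel near the first center
print([round(v, 3) for v in m])
```

A hard classification would keep only argmax(m); the soft version retains the full membership vector per pixel, which is where the extra information (and extra cost) comes from.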

  15. Multiscale modeling for classification of SAR imagery using hybrid EM algorithm and genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    Xianbin Wen; Hua Zhang; Jianguang Zhang; Xu Jiao; Lei Wang

    2009-01-01

    A novel method that hybridizes the genetic algorithm (GA) and the expectation maximization (EM) algorithm for the classification of synthetic aperture radar (SAR) imagery is proposed, based on the finite Gaussian mixture model (GMM) and the multiscale autoregressive (MAR) model. This algorithm is capable of improving the global optimality and consistency of the classification performance. Experiments on SAR images show that the proposed algorithm significantly outperforms the standard EM method in classification accuracy.
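
The EM half of the hybrid can be sketched on a two-component one-dimensional Gaussian mixture; the GA global-search stage and the multiscale MAR structure are omitted, and all data here are synthetic.

```python
import math, random

random.seed(2)
# Synthetic "pixel intensities" from two Gaussian classes.
data = [random.gauss(0.0, 1.0) for _ in range(300)] + \
       [random.gauss(5.0, 1.0) for _ in range(300)]

def pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

mu = [min(data), max(data)]   # crude initialization (GA would do this globally)
var = [1.0, 1.0]
pi = [0.5, 0.5]

for _ in range(30):
    # E-step: responsibility of each component for each point.
    resp = []
    for x in data:
        w = [pi[k] * pdf(x, mu[k], var[k]) for k in range(2)]
        s = sum(w)
        resp.append([wk / s for wk in w])
    # M-step: re-estimate mixing weights, means, and variances.
    for k in range(2):
        nk = sum(r[k] for r in resp)
        pi[k] = nk / len(data)
        mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk

print([round(m, 2) for m in sorted(mu)])   # means should recover ≈ 0 and ≈ 5
```

In the paper's hybrid, the GA supplies diverse starting points so EM is less likely to stall in a poor local optimum; this sketch uses a single deterministic initialization instead.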

  16. Support vector classification algorithm based on variable parameter linear programming

    Institute of Scientific and Technical Information of China (English)

    Xiao Jianhua; Lin Jian

    2007-01-01

    To solve the problems of SVM in dealing with large sample sizes and asymmetrically distributed samples, a support vector classification algorithm based on variable-parameter linear programming is proposed. In the proposed algorithm, linear programming is employed to solve the optimization problem of classification, decreasing the computation time and reducing the complexity compared with the original model. The adjusted punishment parameter greatly reduces the classification error resulting from asymmetrically distributed samples, and the detailed procedure of the proposed algorithm is given. An experiment is conducted to verify whether the proposed algorithm is suitable for asymmetrically distributed samples.

  17. Improved RMR Rock Mass Classification Using Artificial Intelligence Algorithms

    Science.gov (United States)

    Gholami, Raoof; Rasouli, Vamegh; Alimoradi, Andisheh

    2013-09-01

    Rock mass classification systems such as rock mass rating (RMR) are very reliable means to provide information about the quality of rocks surrounding a structure as well as to propose suitable support systems for unstable regions. Many correlations have been proposed to relate measured quantities such as wave velocity to rock mass classification systems to limit the associated time and cost of conducting the sampling and mechanical tests conventionally used to calculate RMR values. However, these empirical correlations have been found to be unreliable, as they usually overestimate or underestimate the RMR value. The aim of this paper is to compare the results of RMR classification obtained from the use of empirical correlations versus machine-learning methodologies based on artificial intelligence algorithms. The proposed methods were verified based on two case studies located in northern Iran. Relevance vector regression (RVR) and support vector regression (SVR), as two robust machine-learning methodologies, were used to predict the RMR for tunnel host rocks. RMR values already obtained by sampling and site investigation at one tunnel were taken into account as the output of the artificial networks during training and testing phases. The results reveal that use of empirical correlations overestimates the predicted RMR values. RVR and SVR, however, showed more reliable results, and are therefore suggested for use in RMR classification for design purposes of rock structures.

  18. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Science.gov (United States)

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of neural networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933

  19. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    OpenAIRE

    C. Fernandez-Lozano; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of neural networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  20. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Full Text Available Given the background of the use of neural networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  1. MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION ALGORITHM

    Directory of Open Access Journals (Sweden)

    Htet Thazin Tike Thein

    2014-12-01

    Full Text Available Constructing a classification model is important in machine learning for a particular task. A classification process involves assigning objects into predefined groups or classes based on a number of observed attributes related to those objects. The artificial neural network is one of the classification algorithms that can be used in many application areas. This paper investigates the potential of applying the feed-forward neural network architecture to the classification of medical datasets. The migration-based differential evolution algorithm (MBDE) is chosen and applied to the feed-forward neural network to enhance the learning process, and the network learning is validated in terms of convergence rate and classification accuracy. In this paper, the MBDE algorithm with various migration policies is proposed for classification problems using medical diagnosis.
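
The differential-evolution step that drives the network training can be sketched as follows: mutate with a scaled difference vector, cross over with the parent, and keep whichever scores better. Here plain DE (no migration policies) fits the two weights of a toy linear neuron; the dataset and all DE constants are invented for illustration.

```python
import random

random.seed(3)
# Toy diagnosis-style dataset: two features, binary label.
data = [([1.0, 0.2], 1), ([0.9, 0.1], 1), ([0.1, 0.9], 0), ([0.2, 1.0], 0)]

def loss(w):
    """Number of misclassified examples for a hard-threshold linear neuron."""
    err = 0.0
    for x, y in data:
        out = 1.0 if w[0] * x[0] + w[1] * x[1] > 0 else 0.0
        err += (out - y) ** 2
    return err

F, CR, NP, D = 0.5, 0.9, 12, 2     # scale factor, crossover rate, pop size, dims
pop = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(NP)]

for _ in range(40):
    for i in range(NP):
        # DE/rand/1: trial vector from three distinct other individuals.
        a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
        trial = [a[d] + F * (b[d] - c[d]) if random.random() < CR else pop[i][d]
                 for d in range(D)]
        # Greedy selection: child replaces parent only if no worse.
        if loss(trial) <= loss(pop[i]):
            pop[i] = trial

best = min(pop, key=loss)
print(best, loss(best))
```

A full MBDE trainer would evolve all the weights of a multilayer feed-forward network and exchange individuals between subpopulations under a migration policy; only the core DE operators are shown here.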

  2. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    Directory of Open Access Journals (Sweden)

    Hongxia Li

    2013-08-01

    Full Text Available With the development of computer science and information technology, libraries are moving toward digitization and networking. The library digitization process converts books into digital information; high-quality preservation and management are achieved through computer technology together with text classification techniques, realizing the appreciation of knowledge. This paper introduces complex network theory into the text classification process and puts forward an ICA semantic clustering algorithm, realizing independent component analysis for complex-network text classification. Through the ICA clustering of independent components, it extracts clusters of characteristic words for text classification and improves the visualization of text retrieval. Finally, we make a comparative analysis of the collocation algorithm and the ICA clustering algorithm through text classification and keyword search experiments, and give figures for the clustering degree and accuracy of each algorithm. Through simulation analysis, we find that the ICA clustering algorithm improves the clustering degree by 1.2% and accuracy by up to 11.1%, improving the efficiency and accuracy of text classification retrieval and providing a theoretical reference for the text retrieval classification of eBooks.

  3. Intelligent Hybrid Cluster Based Classification Algorithm for Social Network Analysis

    Directory of Open Access Journals (Sweden)

    S. Muthurajkumar

    2014-05-01

    Full Text Available In this paper, we propose a hybrid clustering-based classification algorithm, based on a mean approach, to effectively classify and mine the ordered sequences (paths) from weblog data in order to perform social network analysis. In the system proposed in this work for social pattern analysis, the sequences of human activities are typically analyzed by switching behaviors, which are likely to produce overlapping clusters. A robust modified boosting algorithm is proposed for the hybrid clustering-based classification to cluster the data. This work is useful to provide a connection between the aggregated features from the network data and the traditional indices used in social network analysis. Experimental results show that the proposed algorithm improves the decision results from data clustering when combined with the proposed classification algorithm, and that it provides better classification accuracy when tested with a weblog dataset. In addition, the algorithm improves the predictive performance, especially for multiclass datasets, for which it increases the accuracy.

  4. A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms.

    Science.gov (United States)

    Şen, Baha; Peker, Musa; Çavuşoğlu, Abdullah; Çelebi, Fatih V

    2014-03-01

    Sleep scoring is one of the most important diagnostic methods in psychiatry and neurology. Sleep staging is a time-consuming and difficult task undertaken by sleep experts. This study aims to identify a method which would classify sleep stages automatically, with a high degree of accuracy, and in this manner assist sleep experts. This study consists of three stages: feature extraction, feature selection from EEG signals, and classification of these signals. In the feature extraction stage, 20 attribute algorithms in four categories were used, and 41 feature parameters were obtained from these algorithms. Feature selection is important for eliminating irrelevant and redundant features; in this manner prediction accuracy is improved and the computational overhead of classification is reduced. Effective feature selection algorithms such as minimum redundancy maximum relevance (mRMR), fast correlation-based feature selection (FCBF), ReliefF, t-test, and Fisher score are preferred at the feature selection stage for selecting a set of features which best represent the EEG signals. The features obtained are used as input parameters for the classification algorithms. At the classification stage, five different algorithms (random forest (RF), feed-forward neural network (FFNN), decision tree (DT), support vector machine (SVM), and radial basis function neural network (RBF)) classify the problem. The results obtained from the different classification algorithms are provided so that a comparison can be made between computation times and accuracy rates. Finally, 97.03% classification accuracy was obtained using the proposed method. The results show that the proposed method demonstrates the ability to design a new intelligent assistive sleep scoring system.
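
One of the listed filters, the Fisher score, is easy to sketch: a feature scores highly when its class means are far apart relative to the within-class variance. The toy "EEG feature" values below are synthetic, not from the study's data.

```python
def fisher_score(values, labels):
    """Fisher score of one feature across samples; higher = more discriminative."""
    classes = sorted(set(labels))
    overall = sum(values) / len(values)
    num = den = 0.0
    for c in classes:
        vc = [v for v, l in zip(values, labels) if l == c]
        mc = sum(vc) / len(vc)
        var = sum((v - mc) ** 2 for v in vc) / len(vc)
        num += len(vc) * (mc - overall) ** 2   # between-class scatter
        den += len(vc) * var                   # within-class scatter
    return num / den if den else float("inf")

labels = [0, 0, 0, 1, 1, 1]
feat_a = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]   # well separated between classes
feat_b = [1.0, 4.0, 2.5, 2.4, 1.1, 3.9]   # heavily overlapping

print(fisher_score(feat_a, labels) > fisher_score(feat_b, labels))   # → True
```

In the study's pipeline, the top-scoring features under such a filter become the inputs to the RF, FFNN, DT, SVM, and RBF classifiers.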

  5. Comparative Analysis of Serial Decision Tree Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Matthew Nwokejizie Anyanwu

    2009-09-01

    Full Text Available Classification of data objects based on predefined knowledge of the objects is a data mining and knowledge management technique used to group similar data objects together. It can be defined as a supervised learning technique, as it assigns class labels to data objects based on the relationship between the data items and a predefined class label. Classification algorithms have a wide range of applications, such as churn prediction, fraud detection, artificial intelligence, and credit card rating. Although many classification algorithms are available in the literature, decision trees are the most commonly used because of their ease of implementation and ease of understanding compared to other classification algorithms. A decision tree classification algorithm can be implemented in a serial or parallel fashion based on the volume of data, the memory space available on the computer resource, and the scalability of the algorithm. In this paper we review the serial implementations of decision tree algorithms and identify those that are commonly used. We also use experimental analysis based on sample data records (the Statlog data sets) to evaluate the performance of the commonly used serial decision tree algorithms.

  6. Comparative Analysis of Serial Decision Tree Classification Algorithms

    OpenAIRE

    Matthew Nwokejizie Anyanwu; Sajjan Shiva

    2009-01-01

    Classification of data objects based on predefined knowledge of the objects is a data mining and knowledge management technique used to group similar data objects together. It can be defined as a supervised learning technique, as it assigns class labels to data objects based on the relationship between the data items and a predefined class label. Classification algorithms have a wide range of applications, such as churn prediction, fraud detection, artificial intelligence, and credit card ra...

  7. Comparative Evaluation of Packet Classification Algorithms, with Implementation

    Directory of Open Access Journals (Sweden)

    Hediyeh AmirJahanshahi Sistani

    2014-05-01

    Full Text Available In a realm of ever-increasing Internet connectivity, together with swelling computer security threats, security-cognizant network applications technology is gaining widespread popularity. Packet classifiers are extensively employed for numerous network applications in different types of network devices, such as firewalls and routers. Appreciating the tangible performance of recommended packet classifiers is a prerequisite for both algorithm creators and consumers; however, this is occasionally challenging to accomplish. Each innovative algorithm published is assessed from diverse perspectives and is founded on different suppositions. Devoid of a mutual foundation, it is virtually impossible to compare different algorithms directly. Such a foundation also aids system implementers in effortlessly picking the most suitable algorithm for their actual applications. Electing an ineffectual algorithm for an application can invite major expenditures. This is particularly true for packet classification in network routers, as packet classification is fundamentally a tough problem and all current algorithms are constructed on specific heuristics and filter set characteristics. The performance of the packet classification subsystem is vital for the aggregate success of network routers. In this study, we have piloted an advanced exploration of the existing algorithms to provide a comparative evaluation of a number of known classification algorithms that have been considered for both software and hardware implementation. We have explained our earlier suggested DimCut packet classification algorithm and related it with the BV, HiCuts and HyperCuts decision tree-based packet classification algorithms within the comparative evaluation analysis. This comparison has been carried out on implementations based on the same principles and design choices from different sources. Performance measurements have been obtained by feeding the implemented

  8. A Study of Different Quality Evaluation Functions in the cAnt-MinerPB Classification Algorithm

    OpenAIRE

    Medland, Matthew; Otero, Fernando E. B.

    2012-01-01

    Ant colony optimization (ACO) algorithms for classification in general employ a sequential covering strategy to create a list of classification rules. A key component in this strategy is the selection of the rule quality function, since the algorithm aims at creating one rule at a time using an ACO-based procedure to search the best rule. Recently, an improved strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules inst...

  9. Comparison of Supervised and Unsupervised Learning Algorithms for Pattern Classification

    Directory of Open Access Journals (Sweden)

    R. Sathya

    2013-02-01

    Full Text Available This paper presents a comparative account of unsupervised and supervised learning models and their pattern classification evaluations as applied to the higher education scenario. Classification plays a vital role in machine-learning-based algorithms and, in the present study, we found that, though the error back-propagation learning algorithm provided by the supervised learning model is very efficient for a number of non-linear real-time problems, the KSOM of the unsupervised learning model offers an efficient solution and classification.

  10. An Algorithm for Classification of 3-D Spherical Spatial Points

    Institute of Scientific and Technical Information of China (English)

    ZHU Qing-xin; Mudur SP; LIU Chang; PENG Bo; WU Jia

    2003-01-01

    This paper presents a highly efficient algorithm for the classification of 3D points sampled from many spheres, using the neighboring relations of spatial points to construct a neighbor graph from the point cloud. This algorithm can be used in object recognition, computer vision, CAD model building, etc.

  11. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm

    OpenAIRE

    Wuying Liu; Lin Wang; Mianzhu Yi

    2014-01-01

    Multiclass text classification (MTC) is a challenging problem, and the corresponding MTC algorithms can be used in many applications. The space-time overhead of such algorithms must be a concern in the era of big data. Through an investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm. Supported by a token-level memory to store labeled documents, the SRSMTC al...

  12. Comparison of Classification Algorithms and Training Sample Sizes in Urban Land Classification with Landsat Thematic Mapper Imagery

    Directory of Open Access Journals (Sweden)

    Congcong Li

    2014-01-01

    Full Text Available Although a large number of new image classification algorithms have been developed, they are rarely tested on the same classification task. In this research, with the same Landsat Thematic Mapper (TM) data set and the same classification scheme over Guangzhou City, China, we tested two unsupervised and 13 supervised classification algorithms, including a number of machine learning algorithms that became popular in remote sensing during the past 20 years. Our analysis focused primarily on the spectral information provided by the TM data. We assessed all algorithms in a per-pixel classification experiment and all supervised algorithms in a segment-based experiment. We found that when sufficiently representative training samples were used, most algorithms performed reasonably well. A lack of training samples led to greater discrepancies in classification accuracy than the choice of classification algorithm itself. Some algorithms were more tolerant of insufficient (less representative) training samples than others. Many algorithms improved the overall accuracy marginally with per-segment decision making.

  13. A New Clustering Algorithm for Face Classification

    Directory of Open Access Journals (Sweden)

    Shaker K. Ali

    2016-06-01

    Full Text Available In this paper we propose a new clustering algorithm that builds on ideas from other clustering algorithms. The algorithm first computes a distance matrix; points are then excluded from the matrix as they are clustered, by saving the location (row, column) of each point, determining the minimum distance that assigns it to a group (class), and keeping the points that are not yet clustered. The proposed algorithm is applied to an image database of human faces under different conditions (direction, angles, etc.). The data were collected from different sources: the ORL database and real images gathered from a random sample of the population of Thi-Qar city in Iraq. The algorithm has been implemented with three distance measures (Euclidean, correlation, and Minkowski) for computing the minimum distance between points. The efficiency of the proposed algorithm varies with the database and the threshold, and exceeds 96%. Matlab (2014) was used in this work.
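The distance-with-threshold idea described in this record can be sketched as a greedy pass over the points; this is an illustrative variant under assumptions (leader-based assignment, function names), not the authors' exact procedure:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def threshold_cluster(points, threshold, dist=euclidean):
    """Greedy distance-based clustering: each point joins the first
    cluster whose representative (leader) is within `threshold`,
    otherwise it starts a new cluster of its own."""
    leaders, clusters = [], []
    for p in points:
        for i, leader in enumerate(leaders):
            if dist(p, leader) <= threshold:
                clusters[i].append(p)
                break
        else:
            leaders.append(p)
            clusters.append([p])
    return clusters
```

With face images, `points` would be feature vectors, and swapping `dist` for a correlation or Minkowski distance reproduces the three variants the abstract mentions.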

  14. Discovering Fuzzy Censored Classification Rules (FCCRs): A Genetic Algorithm Approach

    Directory of Open Access Journals (Sweden)

    Renu Bala

    2012-08-01

    Full Text Available Classification Rules (CRs) are often discovered in the form of ‘If-Then’ Production Rules (PRs). PRs, being high-level symbolic rules, are comprehensible and easy to implement. However, they are not capable of dealing with the cognitive uncertainties, such as vagueness and ambiguity, inherent in real-world decision-making situations. Fuzzy Classification Rules (FCRs) based on fuzzy logic provide a framework for flexible, human-like reasoning involving linguistic variables. Moreover, a classification system consisting of simple ‘If-Then’ rules is not competent in handling exceptional circumstances. In this paper, we propose a Genetic Algorithm approach to discover Fuzzy Censored Classification Rules (FCCRs). An FCCR is a Fuzzy Classification Rule (FCR) augmented with censors, where censors are exceptional conditions under which the behaviour of a rule is modified. The proposed algorithm works in two phases. In the first phase, the Genetic Algorithm discovers Fuzzy Classification Rules; these rules are then mutated to produce FCCRs in the second phase. An appropriate encoding scheme, fitness function, and genetic operators are designed for the discovery of FCCRs. The proposed approach for discovering FCCRs is then illustrated on a synthetic dataset.

  15. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Wuying Liu

    2014-01-01

    Full Text Available Multiclass text classification (MTC) is a challenging problem, and the corresponding MTC algorithms can be used in many applications. The space-time overhead of such algorithms must be a concern in the era of big data. Through an investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm. Supported by a token-level memory to store labeled documents, the SRSMTC algorithm uses a text retrieval approach to solve text classification problems. Experimental results on the TanCorp data set show that the SRSMTC algorithm can achieve state-of-the-art performance at greatly reduced space-time cost.
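A minimal sketch of the sampling-plus-retrieval idea: sample tokens from each labeled document into a token-level memory, then classify by retrieval-style scoring. The memory layout, scoring rule, and parameter names are assumptions for illustration, not the paper's specification:

```python
import random
from collections import defaultdict

def train(docs, sample_rate=0.5, seed=0):
    """Build a token-level memory: for a simple random sample of each
    labeled document's tokens, count token/label co-occurrences."""
    rng = random.Random(seed)
    memory = defaultdict(lambda: defaultdict(int))
    for tokens, label in docs:
        k = max(1, int(len(tokens) * sample_rate))
        for tok in rng.sample(tokens, k):
            memory[tok][label] += 1
    return memory

def classify(memory, tokens):
    """Retrieval-style classification: sum per-label counts of the
    query document's tokens and return the highest-scoring label."""
    scores = defaultdict(int)
    for tok in tokens:
        for label, count in memory[tok].items():
            scores[label] += count
    return max(scores, key=scores.get) if scores else None
```

Lowering `sample_rate` trades accuracy for the reduced space-time cost the abstract emphasizes.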

  16. A Syntactic Classification based Web Page Ranking Algorithm

    CERN Document Server

    Mukhopadhyay, Debajyoti; Kim, Young-Chon

    2011-01-01

    Existing search engines sometimes give unsatisfactory search results for lack of any categorization of the results. If there were some means of knowing the user's preference regarding the search results and ranking pages according to that preference, the results would be more useful and accurate for the user. In this paper a web page ranking algorithm is proposed based on the syntactic classification of web pages. Syntactic classification does not concern itself with the meaning of a page's content. The proposed approach consists of three main steps: select some properties of web pages based on the user's demand, measure them, and assign a different weight to each property during ranking for different types of pages. The existence of syntactic classes is supported by running the fuzzy c-means algorithm and a neural network classifier on a set of web pages. The change in ranking for different types of pages given the same query string is also demonstrated.

  17. AN ENHANCEMENT OF ASSOCIATION CLASSIFICATION ALGORITHM FOR IDENTIFYING PHISHING WEBSITES

    Directory of Open Access Journals (Sweden)

    G. Parthasarathy

    2016-08-01

    Full Text Available Phishing is a fraudulent activity in which an attacker creates a replica of an existing web page in order to obtain sensitive information, such as credit card details and passwords, from users. This paper presents an enhancement of an existing associative classification algorithm to detect phishing websites. Applying association rules to classification can enhance accuracy to a great extent; in addition, valuable information and rules can be obtained that cannot be captured by other classification approaches. However, the rule generation procedure is very time-consuming on large data sets. The proposed algorithm uses the Apriori algorithm to identify frequent itemsets and then derives a decision tree based on the features of the URL.

  18. Intrusion Detection in Mobile Ad Hoc Networks Using Classification Algorithms

    CERN Document Server

    Mitrokotsa, Aikaterini; Douligeris, Christos

    2008-01-01

    In this paper we present the design and evaluation of intrusion detection models for MANETs using supervised classification algorithms. Specifically, we evaluate the performance of the MultiLayer Perceptron (MLP), the Linear classifier, the Gaussian Mixture Model (GMM), the Naive Bayes classifier and the Support Vector Machine (SVM). The performance of the classification algorithms is evaluated under different traffic conditions and mobility patterns for the Black Hole, Forging, Packet Dropping, and Flooding attacks. The results indicate that Support Vector Machines exhibit high accuracy for almost all simulated attacks and that Packet Dropping is the hardest attack to detect.

  19. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    OpenAIRE

    Saadia Zahid; Fawad Hussain; Muhammad Rashid; Muhammad Haroon Yousaf; Hafiz Adnan Habib

    2015-01-01

    Audio segmentation is a basis for multimedia content analysis which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using large amount o...

  20. Incremental learning algorithm for spike pattern classification

    OpenAIRE

    Mohemmed, A; Kasabov, N

    2012-01-01

    In a previous work (Mohemmed et al.), the authors proposed a supervised learning algorithm to train a spiking neuron to associate input/output spike patterns. In this paper, the association learning rule is applied in training a single layer of spiking neurons to classify multiclass spike patterns whereby the neurons are trained to recognize an input spike pattern by emitting a predetermined spike train. The training is performed in incremental fashion, i.e. the synaptic weights are adjusted ...

  1. Backpropagation Learning Algorithms for Email Classification.

    Directory of Open Access Journals (Sweden)

    David Ndumiyana and Tarirayi Mukabeta

    2016-07-01

    Full Text Available Today email has become one of the fastest and most effective forms of communication. The popularity of this mode of transmitting goods, information, and services has motivated spammers to perfect their technical skills to fool spam filters. This development has worsened the problems faced by Internet users, who have to deal with email congestion, email overload, and unprioritised email messages. The result has been an exponential increase in the number of email classification management tools over the past few decades. In this paper we propose a new spam classifier using the learning process of a multilayer neural network implementing the back-propagation technique. Our contribution to the body of knowledge is the use of an improved empirical analysis to choose an optimal, novel collection of attributes of a user's email contents that allows quick detection of the most important words in emails. We also demonstrate the effectiveness of two equal sets of email training and testing data.

  2. Online Network Traffic Classification Algorithm Based on RVM

    Directory of Open Access Journals (Sweden)

    Zhang Qunhui

    2013-06-01

    Full Text Available Compared with the support vector machine (SVM), the relevance vector machine (RVM) not only avoids over-learning, which is characteristic of the SVM, but also greatly reduces the amount of kernel computation; it further avoids the SVM's drawbacks of limited sparsity, heavy computation, the requirement that the kernel function satisfy Mercer's condition, and parameters that must be determined empirically. We therefore propose a new online traffic classification algorithm based on the RVM. Building on the basic principles of the RVM and the steps of its modeling, we train an RVM traffic classification model and use it, together with “port number + DPI”, to identify network traffic in real time. When the probability predicted by the RVM falls in the query interval, we jointly use the port number and DPI. Finally, a detailed experimental validation shows that, compared with an SVM-based network traffic classification algorithm, the proposed algorithm achieves online network traffic classification with a greatly improved classification prediction probability.
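The decision logic described here, falling back to "port number + DPI" when the statistical prediction is unreliable, can be sketched as follows. The interval bounds, port map, and payload signatures are illustrative assumptions, not the paper's values:

```python
def classify_flow(prob_by_class, port, payload, query_interval=(0.4, 0.6),
                  port_map=None, signatures=None):
    """If the top predicted probability falls inside the query interval,
    the statistical prediction is treated as unreliable and a payload
    signature (DPI) or port-number lookup decides instead."""
    port_map = port_map or {80: "http", 443: "https", 53: "dns"}
    signatures = signatures or {b"GET ": "http", b"\x16\x03": "https"}
    label = max(prob_by_class, key=prob_by_class.get)
    p = prob_by_class[label]
    lo, hi = query_interval
    if lo <= p <= hi:
        for sig, proto in signatures.items():
            if payload.startswith(sig):  # DPI: match payload prefix
                return proto
        return port_map.get(port, label)  # fall back to port number
    return label  # confident prediction: trust the RVM
```

A real DPI engine would use far richer signatures; the point is only the interval-gated fallback.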

  3. Algorithms for classification of astronomical object spectra

    Science.gov (United States)

    Wasiewicz, P.; Szuppe, J.; Hryniewicz, K.

    2015-09-01

    Obtaining interesting celestial objects from tens of thousands or even millions of recorded optical-ultraviolet spectra depends not only on the data quality but also on the accuracy of spectra decomposition. In addition, rapidly growing data volumes demand higher computing power and/or more efficient algorithm implementations. In this paper we speed up the process of subtracting iron transitions and fitting Gaussian functions to emission peaks utilising C++ and OpenCL methods together with a NoSQL database. We also implemented typical astronomical methods of peak detection for comparison with our previous hybrid methods implemented in CUDA.

  4. Benchmarking protein classification algorithms via supervised cross-validation

    NARCIS (Netherlands)

    Kertész-Farkas, A.; Dhir, S.; Sonego, P.; Pacurar, M.; Netoteia, S.; Nijveen, H.; Kuzniar, A.; Leunissen, J.A.M.; Kocsor, A.; Pongor, S.

    2008-01-01

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, in similarity within a group, etc. Datasets based on traditional cross-validation (k-fold, leave-o

  5. An ellipse detection algorithm based on edge classification

    Science.gov (United States)

    Yu, Liu; Chen, Feng; Huang, Jianming; Wei, Xiangquan

    2015-12-01

    In order to enhance the speed and accuracy of ellipse detection, an ellipse detection algorithm based on edge classification is proposed. Many redundant edge points are removed by serializing the edges into points and applying a distance constraint between edge points. Effective classification is achieved using the angle between edge points as the criterion, which greatly increases the probability that randomly selected edge points fall on the same ellipse. Ellipse-fitting accuracy is significantly improved by optimizing the RED algorithm, using the Euclidean distance to measure the distance from an edge point to the elliptical boundary. Experimental results show that the method detects ellipses well when edges are noisy or occlude each other, and that it has higher detection precision and lower time consumption than the RED algorithm.

  6. Novel classification method for remote sensing images based on information entropy discretization algorithm and vector space model

    Science.gov (United States)

    Xie, Li; Li, Guangyao; Xiao, Mang; Peng, Lei

    2016-04-01

    Various kinds of remote sensing image classification algorithms have been developed to adapt to the rapid growth of remote sensing data. Conventional methods typically have restrictions in either classification accuracy or computational efficiency. Aiming to overcome these difficulties, a new solution for remote sensing image classification is presented in this study. A discretization algorithm based on information entropy is applied to extract features from the data set, and a vector space model (VSM) is employed as the feature representation method. Because of the simple structure of the feature space, the training rate is accelerated. The performance of the proposed method is compared with two other algorithms: the back-propagation neural network (BPNN) method and the ant colony optimization (ACO) method. Experimental results confirm that the proposed method is superior to the other algorithms in terms of classification accuracy and computational efficiency.
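Entropy-based discretization of the kind mentioned here picks cut points that minimize class entropy. A single-feature, single-cut sketch (the paper's multi-feature procedure is more involved; function names are assumptions):

```python
import math

def entropy(labels):
    """Shannon entropy (bits) of a label sequence."""
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_cut(values, labels):
    """Pick the boundary between adjacent sorted values that minimizes
    the weighted class entropy of the two induced partitions."""
    pairs = sorted(zip(values, labels))
    best, best_e = None, float("inf")
    for i in range(1, len(pairs)):
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if e < best_e:
            best_e, best = e, (pairs[i - 1][0] + pairs[i][0]) / 2
    return best
```

Applied per spectral band, such cut points turn continuous reflectances into the discrete features a VSM can index.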

  7. Pyroelectric sensors and classification algorithms for border / perimeter security

    Science.gov (United States)

    Jacobs, Eddie L.; Chari, Srikant; Halford, Carl; McClellan, Harry

    2009-09-01

    It has been shown that useful classifications can be made with a sensor that detects the shape of moving objects. This type of sensor has been referred to as a profiling sensor. In this research, two configurations of pyroelectric detectors are considered for use in a profiling sensor: a linear array and a circular array. The linear array produces crude images representing the shape of objects moving through the field of view; the circular array produces a temporal motion vector. A simulation of the output of each detector configuration is created and used to generate simulated profiles. The simulation is performed by convolving the pyroelectric detector response with images derived from calibrated thermal infrared video sequences. Profiles derived from these simulations are then used to train and test classification algorithms. The classification algorithms examined in this study include a naive Bayesian (NB) classifier and linear discriminant analysis (LDA). Each classification algorithm assumes a three-class problem, where profiles are classified as human, animal, or vehicle. Simulation results indicate that such systems can reliably classify the outputs of these sensors, which can be used in applications involving border or perimeter security.

  8. A multi-feature based morphological algorithm for ST shape classification.

    Science.gov (United States)

    Fan, Shuqiong; Miao, Fen; Ma, Ruiqing; Li, Ye; Huang, Xuhui

    2015-08-01

    Abnormal ST segments are an important parameter for the diagnosis of myocardial ischemia and other heart diseases. As most abnormal ST segments last for only a few seconds, it is impractical for doctors to detect and classify abnormal ones manually in time. Even though many ST segment classification algorithms have been proposed to meet the rising demand for automatic myocardial ischemia diagnosis, they often have low recognition rates. The aim of this study is to detect abnormal ST segments precisely and classify them into more categories, and thus provide more detailed category information to help clinicians make decisions. This study sums up ten common abnormal ST segments according to clinical ECG records and proposes a multi-feature morphological classification algorithm for the ST segment. The algorithm consists of two parts: feature point extraction and ST segment classification. In the first part, the R wave is detected using the 2B-spline wavelet transform, and mode filtering and morphological characteristics are used to extract the other feature points. In the classification part, the ST segment level, variance, slope, number of convex/concave points, and other feature parameters are employed to classify the ST segment into the ten categories above. We evaluated the performance of the proposed algorithm on ECG data from the European ST-T database. A global recognition rate of 92.7% and a best accuracy of 97% demonstrate the effectiveness of the proposed solution. PMID:26737618
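A toy rule-based classifier over a few of the features the abstract lists (level, slope, convex/concave point counts) illustrates how such features map to morphological categories. The thresholds and category names are assumptions for illustration, not the paper's ten categories:

```python
def classify_st(level, slope, n_convex, n_concave,
                level_thr=0.1, slope_thr=0.05):
    """Toy ST-shape classifier: thresholds on segment level and slope
    separate normal from elevated/depressed segments, and convexity
    counts refine the shape label (all values hypothetical)."""
    if abs(level) <= level_thr and abs(slope) <= slope_thr:
        return "normal"
    if level > level_thr:  # ST elevation
        return "convex-elevation" if n_convex > n_concave else "arch-elevation"
    if level < -level_thr:  # ST depression
        return "downsloping-depression" if slope < -slope_thr else "horizontal-depression"
    return "sloping"  # level near baseline but slope abnormal
```

The real algorithm combines more features (variance, additional feature points) and clinically validated thresholds.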

  9. Classification of ETM+ Remote Sensing Image Based on Hybrid Algorithm of Genetic Algorithm and Back Propagation Neural Network

    Directory of Open Access Journals (Sweden)

    Haisheng Song

    2013-01-01

    Full Text Available The back-propagation neural network (BPNN) algorithm can be used for supervised classification in remote sensing image processing, but its defects are obvious: it easily falls into local minima, converges slowly, and the number of hidden-layer nodes is difficult to determine. The genetic algorithm (GA) has the advantages of global optimization and resistance to local minima, but it has poor local search capability. This paper uses a GA to generate the initial structure of the BPNN; a stable, efficient, and fast BP classification network is then obtained by fine-tuning with an improved BP algorithm. Finally, we apply the hybrid algorithm to remote sensing image classification and compare it with the improved BP algorithm and the traditional maximum likelihood classification (MLC) algorithm. Experimental results show that the hybrid algorithm outperforms both the improved BP algorithm and the MLC algorithm.
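The GA-seeds-BP idea can be illustrated with a toy genetic loop that evolves a real-valued weight vector; the fittest individual would then serve as the BPNN's initial weights before BP fine-tuning. Population size, operators, and the fitness callback are all assumptions for illustration:

```python
import random

def evolve_initial_weights(fitness, n_weights, pop=20, gens=30, seed=0):
    """Tiny GA sketch: truncation selection keeps the top half (elitism),
    one-point crossover and Gaussian mutation produce children, and the
    fittest weight vector is returned as the BP starting point."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop // 2]
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights) if n_weights > 1 else 1
            child = a[:cut] + b[cut:]          # one-point crossover
            j = rng.randrange(n_weights)
            child[j] += rng.gauss(0, 0.1)      # Gaussian mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

In the hybrid scheme, `fitness` would score a candidate weight vector by the BPNN's classification error on training pixels.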

  10. Classification of Different Species Families using Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    D. Chandravathi

    2010-08-01

    Full Text Available The division of similar objects into groups is known as clustering. The main objective of this implementation is the classification of DNA sequences of different species and their families using a clustering algorithm, the leader-subleader algorithm. Clustering is done with the help of a threshold value on the scoring matrix. It is a simple and efficient technique that may help to find the family, superfamily, and subfamily by generating subclusters. This analysis suggests that members of a subcluster may be affected if one of the leader clusters is affected.

  11. DTL: a language to assist cardiologists in improving classification algorithms.

    Science.gov (United States)

    Kors, J A; Kamp, D M; Henkemans, D P; van Bemmel, J H

    1991-06-01

    Heuristic classifiers, e.g., for the diagnostic classification of the electrocardiogram, can be very complex. The development and refinement of such classifiers is cumbersome and time-consuming: generally, it requires a computer expert to translate the cardiologist's diagnostic reasoning into a computer language. The average cardiologist, however, is not able to verify whether his intentions have been properly realized and perform as he hoped. Even for the initiated, it often remains obscure how a particular result was reached by a complex classification program. An environment is presented which solves these problems. It consists of a language, DTL (Decision Tree Language), that allows cardiologists to express their classification algorithms in a way that is familiar to them, together with an interpreter and a translator for that language. The considerations in the design of DTL are described, and the structure and capabilities of the interpreter and translator are discussed.

  12. Optimized features selection for gender classification using optimization algorithms

    OpenAIRE

    KHAN, Sajid Ali; Nazir, Muhammad; RIAZ, Naveed

    2013-01-01

    Optimized feature selection is an important task in gender classification. The optimized features not only reduce the dimensionality, but also reduce the error rate. In this paper, we have proposed a technique for the extraction of facial features using both appearance-based and geometric-based feature extraction methods. The extracted features are then optimized using particle swarm optimization (PSO) and the bee algorithm. The geometric-based features are optimized by PSO with ensem...

  13. Protein Sequence Classification with Improved Extreme Learning Machine Algorithms

    Directory of Open Access Journals (Sweden)

    Jiuwen Cao

    2014-01-01

    Full Text Available Precisely classifying a protein sequence from a large biological protein sequence database plays an important role in developing competitive pharmacological products. Conventional methods, which compare an unseen sequence with all identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single-hidden-layer feedforward networks (SLFNs). The recent efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimally pruned ELM (OP-ELM) is first employed for protein sequence classification in this paper. To further enhance the performance, an ensemble-based SLFN structure is constructed, in which multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensembles; for each ensemble, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms.
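The majority-voting step that combines the ensemble members can be sketched directly; each `clf` below stands in for a trained network (the training itself is out of scope here):

```python
from collections import Counter

def ensemble_predict(classifiers, sample):
    """Majority voting across ensemble members: each trained model
    votes a label for the sample and the most common label wins
    (ties break toward the first-encountered vote)."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]
```

In the paper's setup, `classifiers` would be SLFNs trained independently (basic ELM or OP-ELM) with identical architecture.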

  14. The Optimization of Trained and Untrained Image Classification Algorithms for Use on Large Spatial Datasets

    Science.gov (United States)

    Kocurek, Michael J.

    2005-01-01

    The HARVIST project seeks to automatically provide an accurate, interactive interface to predict crop yield over the entire United States. In order to accomplish this goal, large images must be quickly and automatically classified by crop type. Current trained and untrained classification algorithms, while accurate, are highly inefficient when operating on large datasets. This project sought to develop new variants of two standard trained and untrained classification algorithms that are optimized to take advantage of the spatial nature of image data. The first algorithm, harvist-cluster, utilizes divide-and-conquer techniques to precluster an image in the hopes of increasing overall clustering speed. The second algorithm, harvistSVM, utilizes support vector machines (SVMs), a type of trained classifier. It seeks to increase classification speed by applying a "meta-SVM" to a quick (but inaccurate) SVM to approximate a slower, yet more accurate, SVM. Speedups were achieved by tuning the algorithm to quickly identify when the quick SVM was incorrect, and then reclassifying low-confidence pixels as necessary. Comparing the classification speeds of both algorithms to known baselines showed a slight speedup for large values of k (the number of clusters) for harvist-cluster, and a significant speedup for harvistSVM. Future work aims to automate the parameter tuning process required for harvistSVM, and further improve classification accuracy and speed. Additionally, this research will move documents created in Canvas into ArcGIS. The launch of the Mars Reconnaissance Orbiter (MRO) will provide a wealth of image data such as global maps of Martian weather and high resolution global images of Mars. The ability to store this new data in a georeferenced format will support future Mars missions by providing data for landing site selection and the search for water on Mars.

  15. A Hybrid Algorithm for Classification of Compressed ECG

    Directory of Open Access Journals (Sweden)

    Shubhada S. Ardhapurkar

    2012-03-01

    Full Text Available Efficient compression reduces the memory requirement in long-term recording and reduces the power and time requirements in transmission. A new compression algorithm combining linear predictive coding (LPC) and the discrete wavelet transform is proposed in this study. Our coding algorithm offers a compression ratio above 85% for records of the MIT-BIH compression database. The performance of the algorithm is quantified by computing distortion measures such as the percentage root-mean-square difference (PRD), the wavelet-based weighted PRD (WWPRD), and the wavelet-energy-based diagnostic distortion (WEDD). The PRD is found to be below 6%, and the values of WWPRD and WEDD are less than 0.03. Classification of the decompressed signals, employing the fuzzy c-means method, achieves an accuracy of 97%.
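The PRD distortion measure quoted here has a standard form; this is one common definition (some variants subtract the signal mean in the denominator, and the abstract does not specify which form is used):

```python
import math

def prd(original, reconstructed):
    """Percentage root-mean-square difference between the original
    signal and its decompressed reconstruction."""
    num = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    den = sum(o ** 2 for o in original)
    return 100.0 * math.sqrt(num / den)
```

A perfect reconstruction gives 0%, and the paper reports values below 6% for its LPC + wavelet coder.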

  16. An Imbalanced Data Classification Algorithm of De-noising Auto-Encoder Neural Network Based on SMOTE

    Directory of Open Access Journals (Sweden)

    Zhang Chenggang

    2016-01-01

    Full Text Available The imbalanced data classification problem has always been one of the hot issues in the field of machine learning. The synthetic minority over-sampling technique (SMOTE) is a classical approach to balancing datasets, but it may introduce noise. A stacked de-noising auto-encoder neural network (SDAE) can effectively reduce data redundancy and noise through unsupervised layer-wise greedy learning. Aiming at the shortcomings of the SMOTE algorithm when synthesizing new minority-class samples, this paper proposes a stacked de-noising auto-encoder neural network algorithm based on SMOTE, SMOTE-SDAE, aimed at imbalanced data classification. The proposed algorithm is not only able to synthesize new minority-class samples, but can also de-noise and classify the sampled data. Experimental results show that, compared with traditional algorithms, SMOTE-SDAE significantly improves the minority-class classification accuracy on imbalanced datasets.
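The SMOTE step itself, interpolating synthetic minority samples between nearest neighbours, can be sketched minimally (parameter names and the brute-force neighbour search are simplifications; the de-noising SDAE stage is not shown):

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Minimal SMOTE sketch: each synthetic sample is a random
    interpolation between a minority point and one of its k nearest
    minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        neighbours = sorted((q for q in minority if q is not p),
                            key=lambda q: math.dist(p, q))[:k]
        q = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(pi + t * (qi - pi) for pi, qi in zip(p, q)))
    return synthetic
```

Because interpolation can land near class boundaries, it can amplify noise, which is exactly the weakness the SDAE stage is meant to correct.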

  17. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper, including an Application-Specific Instruction-set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.
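The group-based idea, bucketing the rule set so a lookup scans only one group rather than every rule, can be sketched in a simplified form. Grouping by protocol and matching on a destination-port range are illustrative choices, not GBSA's actual grouping criteria:

```python
def build_groups(rules):
    """Preprocessing: bucket rules by protocol so a lookup only
    scans the bucket matching the packet (a stand-in for GBSA's
    group construction)."""
    groups = {}
    for rule in rules:
        groups.setdefault(rule["proto"], []).append(rule)
    return groups

def classify_packet(groups, pkt):
    """Lookup: scan only the packet's group, first match wins."""
    for rule in groups.get(pkt["proto"], []):
        lo, hi = rule["dst_port_range"]
        if lo <= pkt["dst_port"] <= hi:
            return rule["action"]
    return "default"
```

The win over linear search is that preprocessing cost is paid once while every per-packet lookup touches only one bucket.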

  18. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Full Text Available Audio segmentation is a basis for multimedia content analysis, which is one of the most important and widely used applications nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure speech, music, environment sound, and silence. The proposed algorithm preserves important audio content and reduces the misclassification rate without using a large amount of training data; it handles noise and is suitable for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used: bagged support vector machines (SVMs) with artificial neural networks (ANNs). The audio stream is first classified into speech and non-speech segments using bagged SVMs; non-speech segments are further classified into music and environment sound using ANNs; and finally, speech segments are classified into silence and pure speech using a rule-based classifier. Minimal data is used for training the classifiers; ensemble methods are used to minimize the misclassification rate, and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.

  19. Implementation of several mathematical algorithms to breast tissue density classification

    Science.gov (United States)

    Quintana, C.; Redondo, M.; Tirao, G.

    2014-02-01

    The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, as dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and performance of different mathematical algorithms designed to standardize the categorization of mammographic images according to the American College of Radiology classifications. These mathematical techniques are based on calculations of intrinsic properties and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and the index Q) as categorization parameters. The algorithms were evaluated on 100 cases from the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with expert medical diagnoses, showing good performance. The implemented algorithms revealed high potential for classifying breasts into tissue density categories.

  20. Automatic classification of schizophrenia using resting-state functional language network via an adaptive learning algorithm

    Science.gov (United States)

    Zhu, Maohu; Jie, Nanfeng; Jiang, Tianzi

    2014-03-01

    A reliable and precise classification of schizophrenia is significant for its diagnosis and treatment. Functional magnetic resonance imaging (fMRI) is a novel tool increasingly used in schizophrenia research. Recent advances in statistical learning theory have led to applying pattern classification algorithms to assess the diagnostic value of functional brain networks discovered from resting-state fMRI data. The aim of this study was to propose an adaptive learning algorithm to distinguish schizophrenia patients from normal controls using the resting-state functional language network. Furthermore, the classification of schizophrenia was here regarded as a sample selection problem, where a sparse subset of samples is chosen from the labeled training set. Using these selected samples, which we call informative vectors, a classifier for the clinical diagnosis of schizophrenia was established. We experimentally demonstrated that the proposed algorithm, incorporating the resting-state functional language network, achieved 83.6% leave-one-out accuracy on resting-state fMRI data of 27 schizophrenia patients and 28 normal controls. Compared with K-Nearest-Neighbor (KNN), Support Vector Machine (SVM) and l1-norm methods, our method yielded better classification performance. Moreover, our results suggest that a dysfunction of the resting-state functional language network plays an important role in the clinical diagnosis of schizophrenia.
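Leave-one-out evaluation, as used for the 83.6% figure above, can be sketched generically: each sample is held out in turn, the model is trained on the rest, and accuracy is the fraction of held-out samples predicted correctly. The 1-NN stand-in and the toy scalar features below are assumptions for illustration; the study's actual classifier is its informative-vector method.

```python
# Hedged sketch of leave-one-out cross-validation with a pluggable classifier.

def leave_one_out_accuracy(samples, labels, train_and_predict):
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]   # hold out sample i
        train_y = labels[:i] + labels[i + 1:]
        correct += train_and_predict(train_x, train_y, samples[i]) == labels[i]
    return correct / len(samples)

# Toy stand-in classifier: 1-nearest-neighbour on scalar "features".
def one_nn(train_x, train_y, query):
    return min(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[1]

acc = leave_one_out_accuracy([0.1, 0.2, 0.9, 1.0], ["a", "a", "b", "b"], one_nn)
print(acc)  # → 1.0
```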

  1. Review of WiMAX Scheduling Algorithms and Their Classification

    Science.gov (United States)

    Yadav, A. L.; Vyavahare, P. D.; Bansod, P. P.

    2014-07-01

    Providing quality of service (QoS) in wireless communication networks has become an important consideration for supporting a variety of applications. IEEE 802.16 based WiMAX is the most promising technology for broadband wireless access, with the best QoS features for triple play (voice, video and data) service users. Unlike wired networks, QoS support is difficult in wireless networks due to the variable and unpredictable nature of wireless channels. In the transmission of voice and video, the main issue is the allocation of available resources among users to meet QoS criteria such as delay, jitter and throughput requirements, to maximize goodput and minimize power consumption, while keeping feasible algorithm flexibility and ensuring system scalability. WiMAX assures guaranteed QoS by including several mechanisms at the MAC layer, such as admission control and scheduling. Packet scheduling is the process of resolving contention for bandwidth, which determines the allocation of bandwidth among users and their transmission order. Various approaches for classifying scheduling algorithms in WiMAX have appeared in the literature: homogeneous, hybrid and opportunistic scheduling algorithms. The paper consolidates the parameters and performance metrics that need to be considered in developing a scheduler. It surveys recently proposed scheduling algorithms and the shortcomings, assumptions, suitability and improvement issues associated with these uplink scheduling algorithms.

  2. A Computational Algorithm for Metrical Classification of Verse

    Directory of Open Access Journals (Sweden)

    Rama N.

    2010-03-01

    Full Text Available The science of versification and analysis of verse in Sanskrit is governed by rules of metre, or chandas. Metre-wise classification of verses has numerous uses for scholars and researchers alike, such as in the study of poets and their style in Sanskrit poetical works. This paper presents a comprehensive computational scheme and set of algorithms to identify the metre of verses given as Sanskrit (Unicode) or English E-text (Latin Unicode). The paper also demonstrates the use of euphonic conjunction rules to correct verses in which these conjunctions, which are compulsory in verse, have erroneously not been applied.

  3. A novel hybrid classification model of genetic algorithms, modified k-Nearest Neighbor and developed backpropagation neural network.

    Directory of Open Access Journals (Sweden)

    Nader Salari

    Full Text Available Among numerous artificial intelligence approaches, the k-Nearest Neighbor algorithm, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strengths of each algorithm and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on the optimum arrays of features selected by the genetic algorithm. The performance of the proposed model was compared with thirteen well-known classification models on seven datasets. Furthermore, statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance of the proposed model was benchmarked against the best results reported for state-of-the-art classifiers in terms of classification accuracy on the same data sets.
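The first stage of the hybrid model, ranking features by Fisher's discriminant ratio (mean separation over pooled variance) to seed the genetic algorithm's population, can be sketched as follows. The two-class toy data and three candidate features are invented for illustration.

```python
# Hedged sketch of Fisher's-discriminant-ratio feature ranking on toy data.

def fisher_ratio(values_a, values_b):
    """Squared mean difference over pooled variance for one feature."""
    mean = lambda v: sum(v) / len(v)
    var = lambda v: sum((x - mean(v)) ** 2 for x in v) / len(v)
    ma, mb = mean(values_a), mean(values_b)
    return (ma - mb) ** 2 / (var(values_a) + var(values_b) + 1e-12)

# Two classes, three candidate features; feature 0 separates the classes best.
class_a = [(5.0, 1.0, 0.2), (5.2, 1.1, 0.9)]
class_b = [(1.0, 1.0, 0.5), (1.1, 0.9, 0.1)]

scores = [fisher_ratio([s[j] for s in class_a], [s[j] for s in class_b])
          for j in range(3)]
ranking = sorted(range(3), key=lambda j: -scores[j])
print(ranking)  # → [0, 1, 2]
```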

  4. Improved Algorithms for the Classification of Rough Rice Using a Bionic Electronic Nose Based on PCA and the Wilks Distribution

    Directory of Open Access Journals (Sweden)

    Sai Xu

    2014-03-01

    Full Text Available Principal Component Analysis (PCA) is one of the main methods used for electronic nose pattern recognition. However, poor classification performance is common when using regular PCA. This paper aims to improve the classification performance of regular PCA based on the existing Wilks Λ-statistic (i.e., combined PCA with the Wilks distribution). The improved algorithms, which combine regular PCA with the Wilks Λ-statistic, were developed after analysing the functionality and defects of PCA. Verification tests were conducted using a PEN3 electronic nose. The collected samples consisted of the volatiles of six varieties of rough rice (Zhongxiang1, Xiangwan13, Yaopingxiang, WufengyouT025, Pin 36, and Youyou122), grown in the same area and season. The first two principal components used as analysis vectors cannot perform the rough rice variety classification task under regular PCA. Using the improved algorithms, which combine regular PCA with the Wilks Λ-statistic, different principal components were selected as analysis vectors. The set of data points of the Mahalanobis distance between each of the varieties of rough rice was used to estimate the performance of the classification. The results illustrate that the rough rice variety classification task is achieved well using the improved algorithm. A Probabilistic Neural Network (PNN) was also established to test the effectiveness of the improved algorithms. The first two principal components (PC1 and PC2) and the first and fifth principal components (PC1 and PC5) were selected as the inputs of the PNN for the classification of the six rough rice varieties. The results indicate that the classification accuracy based on the improved algorithm was 6.67% higher than that of the regular method. These results prove the effectiveness of using the Wilks Λ-statistic to improve the classification accuracy of the regular PCA approach.
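The selection idea above, scoring each principal component by a Wilks-style statistic and preferring components whose within-group scatter is small relative to total scatter, can be sketched in the univariate case. The PC scores below are toy numbers, not electronic-nose measurements.

```python
# Hedged sketch: univariate Wilks' lambda (within-group over total sum of
# squares) as a per-component class-separation score. Smaller is better.

def wilks_lambda(groups):
    """groups: list of lists of one PC's scores, one list per class."""
    all_scores = [x for g in groups for x in g]
    grand = sum(all_scores) / len(all_scores)
    sst = sum((x - grand) ** 2 for x in all_scores)          # total scatter
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)     # within-group
              for g in groups)
    return ssw / sst

pc1 = [[1.0, 1.2], [5.0, 5.1]]   # separates the two varieties well
pc2 = [[2.0, 6.0], [3.0, 5.0]]   # barely separates them
print(wilks_lambda(pc1) < wilks_lambda(pc2))  # → True
```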

  5. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    Science.gov (United States)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of the InterIMAGE Cloud Platform (ICP), an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  6. CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH

    Directory of Open Access Journals (Sweden)

    V. A. Ayma

    2015-03-01

    Full Text Available For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of the InterIMAGE Cloud Platform (ICP), an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.
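The MapReduce pattern such a tool relies on can be illustrated with a toy example: each mapper computes partial class/feature counts for its data split (here, for a Naive Bayes model), and the reducer merges the partial counts into one model. `functools.reduce` stands in for Hadoop's shuffle-and-reduce phase, and the documents are invented.

```python
# Hedged sketch of distributed Naive Bayes training in the MapReduce style.
from collections import Counter
from functools import reduce

def mapper(split):
    """split: list of (label, tokens). Emits partial count tables."""
    counts = Counter()
    for label, tokens in split:
        counts[("class", label)] += 1        # class prior counts
        for t in tokens:
            counts[(label, t)] += 1          # per-class token counts
    return counts

def reducer(acc, part):
    acc.update(part)  # Counter.update adds counts key-wise
    return acc

splits = [[("spam", ["buy", "now"])],
          [("ham", ["hi"]), ("spam", ["buy"])]]
model = reduce(reducer, map(mapper, splits))
print(model[("spam", "buy")])  # → 2
```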

  7. Algorithm for Soybean Classification Using Medium Resolution Satellite Images

    Directory of Open Access Journals (Sweden)

    Anibal Gusso

    2012-10-01

    Full Text Available An accurate estimation of soybean crop areas while the plants are still in the field is highly necessary for reliable calculation of real crop parameters such as yield and production, and for other data important to decision-making policies related to government planning. An algorithm for soybean classification over Rio Grande do Sul State, Brazil, was developed as an objective, automated tool. It is based on reflectance from medium spatial resolution images. The classification method, called the RCDA (Reflectance-based Crop Detection Algorithm), operates through a mathematical combination of multi-temporal optical reflectance data obtained from Landsat-5 TM images. A set of 39 municipalities was analyzed for eight crop years between 1996/1997 and 2009/2010. RCDA estimates were compared to the official estimates of the Brazilian Institute of Geography and Statistics (IBGE) for soybean area at the municipal level. R2 coefficients were between 0.81 and 0.98, indicating good agreement of the estimates. The RCDA was also compared to a soybean crop map derived from Landsat images for the 2000/2001 crop year; the overall map accuracy was 91.91% and the Kappa Index of Agreement was 0.76. Due to its calculation chain and pre-defined parameters, RCDA is a timesaving procedure and is less dependent on analyst skills for image interpretation. Thus, the RCDA is considered advantageous for providing thematic soybean maps at local and regional scales.
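The Kappa Index of Agreement used in the validation above can be computed from paired reference and predicted labels: it is the observed agreement corrected for the agreement expected by chance. The six toy pixels below are invented for illustration.

```python
# Hedged sketch of Cohen's kappa for map-accuracy assessment on toy labels.

def kappa(reference, predicted):
    n = len(reference)
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    classes = set(reference) | set(predicted)
    # Chance agreement: product of marginal class proportions, summed.
    expected = sum((reference.count(c) / n) * (predicted.count(c) / n)
                   for c in classes)
    return (observed - expected) / (1 - expected)

ref  = ["soy", "soy", "soy", "other", "other", "other"]
pred = ["soy", "soy", "other", "other", "other", "other"]
print(round(kappa(ref, pred), 2))  # → 0.67
```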

  8. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with a traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of their missing values. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by a neighborhood hypergraph. Third, under the guidance of rough sets, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, Naïve Bayes, and KNN. The experimental results show that the proposed algorithm has better performance in terms of Precision, Recall, AUC, and F-measure.

  9. Land-cover classification with an expert classification algorithm using digital aerial photographs

    Directory of Open Access Journals (Sweden)

    José L. de la Cruz

    2010-05-01

    Full Text Available The purpose of this study was to evaluate the usefulness of the spectral information of digital aerial sensors in determining land-cover classification using new digital techniques. The land covers that have been evaluated are the following: (1) bare soil; (2) cereals, including maize (Zea mays L.), oats (Avena sativa L.), rye (Secale cereale L.), wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.); (3) high protein crops, such as peas (Pisum sativum L.) and beans (Vicia faba L.); (4) alfalfa (Medicago sativa L.); (5) woodlands and scrublands, including holly oak (Quercus ilex L.) and common retama (Retama sphaerocarpa L.); (6) urban soil; (7) olive groves (Olea europaea L.); and (8) burnt crop stubble. The best result was obtained using an expert classification algorithm, achieving a reliability rate of 95%. This result shows that the images of digital airborne sensors hold considerable promise for the future in the field of digital classification, because these images contain valuable information that takes advantage of the geometric viewpoint. Moreover, new classification techniques reduce the problems encountered using high-resolution images, while reliabilities are achieved that are better than those achieved with traditional methods.

  10. Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

    Directory of Open Access Journals (Sweden)

    Ghazi Raho

    2015-02-01

    Full Text Available Feature selection is necessary for effective text classification, and dataset preprocessing is essential for sound results and effective performance. This paper investigates the effectiveness of using feature selection. We compared the performance of different classifiers in different situations, using feature selection with and without stemming. Evaluation used a BBC Arabic dataset; different classification algorithms such as decision tree (DT), K-nearest neighbors (KNN), Naïve Bayes (NB) and the Naïve Bayes Multinomial (NBM) classifier were used. The experimental results are presented in terms of precision, recall, F-measure, accuracy and time to build the model.

  11. An Evolutionary Algorithm for Enhanced Magnetic Resonance Imaging Classification

    Directory of Open Access Journals (Sweden)

    T.S. Murunya

    2014-11-01

    Full Text Available This study presents an image classification method for retrieval of images from a multi-varied MRI database. With the development of sophisticated medical imaging technology that helps doctors in diagnosis, medical image databases contain a huge number of digital images. Magnetic Resonance Imaging (MRI) is a widely used imaging technique that picks up signals from the magnetic spin of particles in the body and, through a computer, converts scanned data into pictures of internal organs. Image processing techniques are required to analyze medical images and retrieve them from a database. The proposed framework extracts features using Moment Invariants (MI) and the Wavelet Packet Tree (WPT). Extracted features are reduced using Correlation-based Feature Selection (CFS), and a CFS with a cuckoo search algorithm is proposed. Naïve Bayes and K-Nearest Neighbor (KNN) classify the selected features. The National Biomedical Imaging Archive (NBIA) dataset, including colon, brain and chest images, is used to evaluate the framework.

  12. FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification

    Directory of Open Access Journals (Sweden)

    Wei-Hao Lee

    2012-05-01

    Full Text Available This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit to reduce area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented on a Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design attaining both high speed performance and low area costs.
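For a single principal component, the Generalized Hebbian Algorithm reduces to Oja's rule, which pairs exactly the two computations the hardware separates: the principal computation (the projection y) and the weight update. A minimal software sketch on toy 2-D data, with an assumed learning rate and invented samples:

```python
# Hedged sketch of the GHA/Oja update for one principal component.

def gha_step(w, x, lr=0.1):
    y = sum(wi * xi for wi, xi in zip(w, x))          # principal computation
    return [wi + lr * y * (xi - y * wi)               # Hebbian weight update
            for wi, xi in zip(w, x)]

# Zero-mean data varying mostly along the direction (1, 1):
data = [(1.0, 0.9), (-1.0, -1.1), (0.9, 1.0), (-1.1, -0.9)] * 50
w = [1.0, 0.0]
for x in data:
    w = gha_step(w, x)
norm = sum(wi * wi for wi in w) ** 0.5
print([round(wi / norm, 1) for wi in w])  # ≈ [0.7, 0.7], the first PC
```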

  13. Improving the cAnt-MinerPB Classification Algorithm

    OpenAIRE

    Medland, Matthew; Otero, Fernando E. B.; Freitas, Alex A

    2012-01-01

    Ant Colony Optimisation (ACO) has been successfully applied to the classification task of data mining in the form of Ant-Miner. A new extension of Ant-Miner, called cAnt-MinerPB, uses the ACO procedure in a different fashion. The main difference is that the search in cAnt-MinerPB is optimised to find the best list of rules, whereas in Ant-Miner the search is optimised to find the best individual rule at each step of the sequential covering, producing a list of best rules. We aim to improve cA...

  14. CLASSIFICATION OF DEFECTS IN SOFTWARE USING DECISION TREE ALGORITHM

    Directory of Open Access Journals (Sweden)

    M. SURENDRA NAIDU

    2013-06-01

    Full Text Available Software defects due to coding errors continue to plague the industry with disastrous impact, especially in the enterprise application software category. Identifying how many of these defects are specifically due to coding errors is a challenging problem. Defect prevention is the most vivid but usually neglected aspect of software quality assurance in any project. If functional at all stages of software development, it can condense the time, overheads and wherewithal entailed to engineer a high quality product. In order to reduce time and cost, we focus on finding the total number of defects that have occurred in the software development process when test cases show that the software is not executing properly. The proposed system classifies the various defects using a decision-tree-based defect classification technique, which is used to group defects after identification. The classification can be done by employing algorithms such as ID3 or C4.5. After the classification, defect patterns are measured by employing a pattern mining technique. Finally, quality is assured using various quality metrics, such as defect density. The proposed system will be implemented in JAVA.

  15. Multi-classification algorithm and its realization based on least square support vector machine algorithm

    Institute of Scientific and Technical Information of China (English)

    Fan Youping; Chen Yunping; Sun Wansheng; Li Yu

    2005-01-01

    As a new type of learning machine developed on the basis of statistical learning theory, the support vector machine (SVM) plays an important role in knowledge discovery and knowledge updating by constructing a non-linear optimal classifier. However, realizing an SVM requires solving a quadratic program under inequality constraints, which causes calculation difficulty as the set of learning samples gets larger. Besides, the standard SVM is incapable of tackling multi-classification. To overcome these bottlenecks, a training algorithm is presented that converts the quadratic programming problem into the solution of a linear system of equations composed of a group of equality constraints, by adopting the least square SVM (LS-SVM) and introducing a modifying variable that changes inequality constraints into equality constraints, which simplifies the calculation. With regard to multi-classification, an LS-SVM applicable to multi-classification is deduced. Finally, the efficiency of the algorithm is checked using the Circle-in-Square and Two-Spirals benchmarks to measure the performance of the classifier.
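The core trick, replacing the SVM's inequality-constrained quadratic program with one linear system, can be sketched on a toy 1-D, linear-kernel problem. The system solved below is the standard LS-SVM formulation [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]; gamma and the two training points are arbitrary illustration values.

```python
# Hedged sketch: LS-SVM training as plain Gaussian elimination on a toy case.

def solve(A, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))   # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

x, y, gamma = [-1.0, 1.0], [-1.0, 1.0], 10.0
K = [[xi * xj for xj in x] for xi in x]                    # linear kernel
A = [[0.0] + [1.0] * 2] + \
    [[1.0] + [K[i][j] + (1 / gamma if i == j else 0.0) for j in range(2)]
     for i in range(2)]
sol = solve(A, [0.0] + y)
b, alpha = sol[0], sol[1:]
f = lambda q: sum(a * xi * q for a, xi in zip(alpha, x)) + b
print(f(2.0) > 0, f(-2.0) < 0)  # → True True
```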

  16. A Novel Training Algorithm of Genetic Neural Networks and Its Application to Classification

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    First of all, this paper discusses the drawbacks of the multilayer perceptron (MLP), which is trained by the traditional back propagation (BP) algorithm and used in a special classification problem. A new training algorithm for neural networks, based on the genetic algorithm and the BP algorithm, is developed. The difference between the new training algorithm and the BP algorithm in their ability for nonlinear approximation is shown through an example, and the prospects for application are illustrated by another example.

  17. Contributions to "k"-Means Clustering and Regression via Classification Algorithms

    Science.gov (United States)

    Salman, Raied

    2012-01-01

    The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold: first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using…

  18. Study and Implementation of Web Mining Classification Algorithm Based on Building Tree of Detection Class Threshold

    Institute of Scientific and Technical Information of China (English)

    CHEN Jun-jie; SONG Han-tao; LU Yu-chang

    2005-01-01

    A new classification algorithm for web mining is proposed on the basis of a general classification algorithm for data mining, in order to implement personalized information services. The method of building a tree by detecting class thresholds is used to construct a decision tree according to the concept of user expectation, so as to find classification rules in different layers. Compared with the traditional C4.5 algorithm, the over-fitting drawback of C4.5 has been improved, so that classification results not only have much higher accuracy but also statistical meaning.

  19. Integrating genetic algorithm method with neural network for land use classification using SZ-3 CMODIS data

    Institute of Scientific and Technical Information of China (English)

    WANG Changyao; LUO Chengfeng; LIU Zhengjun

    2005-01-01

    This paper presents a methodology for land use mapping using CMODIS (Chinese Moderate Resolution Imaging Spectroradiometer) data on-board the SZ-3 (Shenzhou 3) spacecraft. The integrated method is composed of a genetic algorithm (GA) for feature extraction and a neural network classifier for land use classification. In the data preprocessing, a moment matching method was adopted. To generate a land use map, a three-layer back propagation neural network classifier is used for training the samples and classification. Compared with the Maximum Likelihood classification algorithm, the results show that the accuracy of land use classification is obviously improved by the proposed method, the number of bands selected in the classification process is reduced, and the computational performance for training and classification is improved. The results also show that CMODIS data can be effectively used for land use/land cover classification and change monitoring at regional and global scales.

  20. Detection of malicious attacks by Meta classification algorithms

    Directory of Open Access Journals (Sweden)

    G.Michael

    2015-03-01

    Full Text Available We address the problem of malicious node detection in a network based on characteristics of the network's behavior. This issue has produced a challenging set of recent research papers, contributing a critical component to securing the network, and the solution strategies in this line of work continue to evolve. In this work, we carefully propose learning models with cautious selection of attributes, parameter thresholds and number of iterations, and present an appropriate approach to evaluating the performance of a set of meta classifier algorithms (AdaBoost, Attribute Selected Classifier, Bagging, Classification via Regression, Filtered Classifier, LogitBoost, Multiclass Classifier). The ratio between training and testing data is chosen such that the data patterns in both sets are compatible. Hence, a set of supervised machine learning schemes with meta classifiers was applied on the selected dataset to predict the attack risk of the network environment. The trained models can then be used for predicting the risk of attacks in a web server environment by any network administrator or security expert. The prediction accuracy of the classifiers was evaluated using 10-fold cross validation, and the results were compared to obtain the accuracy.

  1. Text Classification using Association Rule with a Hybrid Concept of Naive Bayes Classifier and Genetic Algorithm

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the automated assignment of natural language texts to predefined categories based on their content. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way, such as by producing summaries, answering questions or extracting data. Nowadays the demand for text classification is increasing tremendously. Keeping this demand in consideration, new and updated techniques are being developed for the purpose of automated text classification. This paper presents a new algorithm for text classification. Instead of using words, word relations, i.e. association rules, are used to derive the feature set from pre-classified text documents. The concept of the Naive Bayes Classifier is then used on the derived features, and finally a Genetic Algorithm is added for the final classification. A system based on the proposed algorithm has been implemented and tested. The experimental ...

  2. Unsupervised classification algorithm based on EM method for polarimetric SAR images

    Science.gov (United States)

    Fernández-Michelli, J. I.; Hurtado, M.; Areta, J. A.; Muravchik, C. H.

    2016-07-01

    In this work we develop an iterative classification algorithm using complex Gaussian mixture models for polarimetric complex SAR data. It is an unsupervised algorithm that does not require training data or an initial set of classes. Additionally, it determines the model order from the data, which allows representing the data structure with minimum complexity. The algorithm consists of four steps: initialization, model selection, refinement and smoothing. After a simple initialization stage, the EM algorithm is iteratively applied in the model selection step to compute the model order and an initial classification for the refinement step. The refinement step uses Classification EM (CEM) to reach the final classification, and the smoothing stage improves the results by means of non-linear filtering. The algorithm is applied to both simulated and real Single Look Complex data from the EMISAR mission and compared with the Wishart classification method. We use the confusion matrix and kappa statistic to make the comparison for simulated data whose ground truth is known, and apply the Davies-Bouldin index to compare both classifications for real data. The results obtained for both types of data validate our algorithm and show that its performance is comparable to Wishart's in terms of classification quality.
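The EM step at the heart of the model-selection stage can be illustrated with a deliberately simplified 1-D, two-component Gaussian mixture; the real algorithm works on complex polarimetric data and also infers the number of components. The data values and the extremes-based initialization below are toy choices.

```python
# Hedged sketch of EM for a two-component 1-D Gaussian mixture.
import math

def em_gmm2(data, iters=100):
    # Deterministic init: place the two means at the data extremes.
    mu = [min(data), max(data)]; var = [1.0, 1.0]; pi = [0.5, 0.5]
    for _ in range(iters):
        resp = []                                    # E-step: responsibilities
        for x in data:
            w = [pi[j] / math.sqrt(var[j])
                 * math.exp(-(x - mu[j]) ** 2 / (2 * var[j])) for j in (0, 1)]
            s = w[0] + w[1]
            resp.append((w[0] / s, w[1] / s))
        for j in (0, 1):                             # M-step: update parameters
            nj = sum(r[j] for r in resp)
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var[j] = sum(r[j] * (x - mu[j]) ** 2
                         for r, x in zip(resp, data)) / nj + 1e-6
            pi[j] = nj / len(data)
    return mu

data = [0.2, -0.2, 0.1, -0.1, 0.0, 7.8, 8.2, 7.9, 8.1, 8.0]
print([round(m, 1) for m in em_gmm2(data)])  # → [0.0, 8.0]
```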

  3. Improved algorithms for the classification of rough rice using a bionic electronic nose based on PCA and the Wilks distribution.

    Science.gov (United States)

    Xu, Sai; Zhou, Zhiyan; Lu, Huazhong; Luo, Xiwen; Lan, Yubin

    2014-03-19

    Principal Component Analysis (PCA) is one of the main methods used for electronic nose pattern recognition. However, poor classification performance is common when using regular PCA. This paper aims to improve the classification performance of regular PCA based on the existing Wilks Λ-statistic (i.e., combined PCA with the Wilks distribution). The improved algorithms, which combine regular PCA with the Wilks Λ-statistic, were developed after analysing the functionality and defects of PCA. Verification tests were conducted using a PEN3 electronic nose. The collected samples consisted of the volatiles of six varieties of rough rice (Zhongxiang1, Xiangwan13, Yaopingxiang, WufengyouT025, Pin 36, and Youyou122), grown in the same area and season. The first two principal components used as analysis vectors cannot perform the rough rice variety classification task under regular PCA. Using the improved algorithms, which combine regular PCA with the Wilks Λ-statistic, different principal components were selected as analysis vectors. The set of data points of the Mahalanobis distance between each of the varieties of rough rice was used to estimate the performance of the classification. The results illustrate that the rough rice variety classification task is achieved well using the improved algorithm. A Probabilistic Neural Network (PNN) was also established to test the effectiveness of the improved algorithms. The first two principal components (PC1 and PC2) and the first and fifth principal components (PC1 and PC5) were selected as the inputs of the PNN for the classification of the six rough rice varieties. The results indicate that the classification accuracy based on the improved algorithm was 6.67% higher than that of the regular method. These results prove the effectiveness of using the Wilks Λ-statistic to improve the classification accuracy of the regular PCA approach.

  4. Study on An Absolute Non-Collision Hash and Jumping Table IP Classification Algorithms

    Institute of Scientific and Technical Information of China (English)

    SHANG Feng-jun; PAN Ying-jun

    2004-01-01

    In order to classify packets, we propose a novel IP classification algorithm based on a non-collision hash and jumping table Trie-tree (NHJTTT), which builds on a non-collision hash Trie-tree and the 2-dimensional classification algorithm proposed by Lakshman and Stiliadis (the LS algorithm). The core of the algorithm consists of two parts: constructing the non-collision hash function, which is built mainly from the destination/source port and protocol type fields so that the hash function avoids the space explosion problem; and introducing a jumping table Trie-tree based on the LS algorithm in order to reduce time complexity. The test results show that the classification rate of the NHJTTT algorithm is up to 1 million packets per second and the maximum memory consumed is 9 MB for 10 000 rules.
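
    The role of the hash stage can be illustrated with a minimal sketch: keying an exact-match table on (protocol, destination port) gives one-probe lookups over a small key space. The rule table and field names here are hypothetical, and the jumping-table Trie-tree stage is not reproduced:

```python
# Hypothetical rule table: (protocol, destination port) -> action.
# Keying the hash on protocol + port keeps the key space small, which is
# the idea behind avoiding the "space explosion" mentioned in the abstract.
rules = {
    ("tcp", 80): "allow",
    ("tcp", 22): "rate-limit",
    ("udp", 53): "allow",
}

def classify(packet, default="deny"):
    """Single dict lookup -- O(1), with no collisions among rule keys."""
    return rules.get((packet["proto"], packet["dport"]), default)

decisions = [classify({"proto": "tcp", "dport": 80}),
             classify({"proto": "udp", "dport": 9999})]
```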

  5. A Novel Algorithm of Network Trade Customer Classification Based on Fourier Basis Functions

    Directory of Open Access Journals (Sweden)

    Li Xinwu

    2013-11-01

    Full Text Available The learning algorithm of a neural network has always been an important research topic in neural network theory and applications; in particular, learning algorithms for feed-forward neural networks have no satisfactory solution because of their slow calculation speed. This paper presents a new Fourier basis function neural network algorithm and applies it to the classification of network trade customers. First, 21 customer classification indicators are designed based on an analysis of the characteristics and behaviors of network trade customers, including customer characteristic variables and customer behavior variables. Second, Fourier basis functions are used to improve the calculation flow and algorithm structure of the original BP neural network algorithm to speed up its convergence, and a new Fourier basis neural network model is constructed. Finally, the experimental results show that the convergence speed problem can be solved and the accuracy of customer classification is ensured when the new algorithm is used in practice for network trade customer classification.
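
    A minimal sketch of the Fourier-basis idea: expand the input in a fixed sin/cos basis and fit the output weights in closed form, avoiding slow iterative BP training. The 1-D toy data and basis order are assumptions for illustration, not the paper's 21-indicator customer data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 1-D "customer score" with two classes that are not linearly separable.
x = rng.uniform(-np.pi, np.pi, 200)
y = (np.sin(3 * x) > 0).astype(int)

def fourier_features(x, order=5):
    # [1, sin(x), cos(x), ..., sin(kx), cos(kx)] -- a fixed Fourier basis
    cols = [np.ones_like(x)]
    for k in range(1, order + 1):
        cols += [np.sin(k * x), np.cos(k * x)]
    return np.column_stack(cols)

Phi = fourier_features(x)
# Output weights by least squares: one linear solve instead of slow
# gradient-descent BP training, which is the speed-up the abstract targets.
w, *_ = np.linalg.lstsq(Phi, 2 * y - 1, rcond=None)
pred = (Phi @ w > 0).astype(int)
train_acc = (pred == y).mean()
```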

  6. Image-classification-based global dimming algorithm for LED backlights in LCDs

    Science.gov (United States)

    Qibin, Feng; Huijie, He; Dong, Han; Lei, Zhang; Guoqiang, Lv

    2015-07-01

    Backlight dimming can help LCDs reduce power consumption and improve contrast ratio (CR). With fixed parameters, a dimming algorithm cannot achieve satisfactory results for all kinds of images. This paper introduces an image-classification-based global dimming algorithm. The proposed classification method, designed specifically for backlight dimming, is based on the luminance and CR of input images. The parameters for the backlight dimming level and pixel compensation adapt to the image classification. The simulation results show that the classification-based dimming algorithm achieves an 86.13% improvement in power reduction compared with dimming without classification, with almost the same display quality. A prototype was developed; there are no perceived distortions when playing videos. The practical average power reduction of the prototype TV is 18.72%, compared with a common TV without dimming.
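
    The adaptive-parameter idea can be sketched as follows: classify the frame by mean luminance, pick a backlight level per class, and compensate pixel values to preserve apparent brightness. The thresholds and levels here are made-up placeholders, not the paper's tuned values:

```python
import numpy as np

def global_dimming(img, dark_level=0.6, bright_level=0.9, thresh=0.35):
    """Pick one backlight level for the whole frame from mean luminance,
    then compensate pixels so perceived brightness is preserved."""
    mean_lum = img.mean()
    level = dark_level if mean_lum < thresh else bright_level  # crude 2-class split
    compensated = np.clip(img / level, 0.0, 1.0)               # pixel compensation
    return level, compensated

dark_frame = np.full((4, 4), 0.1)    # mostly dark image -> dim harder
bright_frame = np.full((4, 4), 0.8)  # bright image -> keep backlight high
lvl_dark, _ = global_dimming(dark_frame)
lvl_bright, comp = global_dimming(bright_frame)
```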

  7. A Non-Collision Hash Trie-Tree Based Fast IP Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    徐恪; 吴建平; 喻中超; 徐明伟

    2002-01-01

    With the development of network applications, routers must support functions such as firewalls, provision of QoS, traffic billing, etc. All these functions require the classification of IP packets, which determines how each packet is subsequently processed. In this article, a novel IP classification algorithm is proposed based on the Grid of Tries algorithm. The new algorithm not only eliminates the original limitations in the case of multiple fields but also shows better performance in regard to both time and space. It has better overall performance than many other algorithms.

  8. Classification of hyperspectral remote sensing images based on simulated annealing genetic algorithm and multiple instance learning

    Institute of Scientific and Technical Information of China (English)

    高红民; 周惠; 徐立中; 石爱业

    2014-01-01

    A hybrid feature selection and classification strategy was proposed based on the simulated annealing genetic algorithm and multiple instance learning (MIL). The band selection method was derived from subspace decomposition, combining the simulated annealing algorithm with the genetic algorithm by choosing different crossover and mutation probabilities, as well as mutation individuals. MIL was then combined with image segmentation, clustering and support vector machine algorithms to classify hyperspectral images. The experimental results show that the proposed method achieves a high classification accuracy of 93.13% with small training samples, overcoming the weaknesses of conventional methods.

  9. A TCAM-based Two-dimensional Prefix Packet Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    王志恒; 刘刚; 白英彩

    2004-01-01

    Packet classification (PC) has become the main method to support the quality of service and security of network applications, and two-dimensional prefix packet classification (PPC) is a popular variant. This paper analyzes the problem of rule conflict and then presents a TCAM-based two-dimensional PPC algorithm. This algorithm makes use of the parallelism of TCAM to look up the longest prefix in one instruction cycle. It then uses a memory image and associated data structures to eliminate the conflicts between rules, and performs fast two-dimensional PPC. Compared with other algorithms, this algorithm has the lowest time complexity and lower space complexity.

  10. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    OpenAIRE

    Huibin Lu; Zhengping Hu; Hongxiao Gao

    2015-01-01

    In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the...

  11. Development of a Fingerprint Gender Classification Algorithm Using Fingerprint Global Features

    OpenAIRE

    S. F. Abdullah; A.F.N.A. Rahman; Z.A.Abas; W.H.M Saad

    2016-01-01

    In the forensic world, the process of identifying and calculating fingerprint features is complex and time-consuming when done manually using a fingerprint laboratory magnifying glass. This study is meant to enhance the manual forensic method by proposing a new algorithm for fingerprint global feature extraction for gender classification. The results show that the new algorithm gives acceptable readings, with a classification rate above 70%, when compared to the manual method...

  12. An Analytic Hierarchy Model for Classification Algorithms Selection in Credit Risk Analysis

    OpenAIRE

    Gang Kou; Wenshuai Wu

    2014-01-01

    This paper proposes an analytic hierarchy model (AHM) to evaluate classification algorithms for credit risk analysis. The proposed AHM consists of three stages: data mining stage, multicriteria decision making stage, and secondary mining stage. For verification, 2 public-domain credit datasets, 10 classification algorithms, and 10 performance criteria are used to test the proposed AHM in the experimental study. The results demonstrate that the proposed AHM is an efficient tool to select class...

  13. Random forest algorithm for classification of multiwavelength data

    Institute of Scientific and Technical Information of China (English)

    Dan Gao; Yan-Xia Zhang; Yong-Heng Zhao

    2009-01-01

    We introduce a decision tree method called Random Forests for multiwavelength data classification. The data were adopted from different databases, including the Sloan Digital Sky Survey (SDSS) Data Release 5, USNO, FIRST and ROSAT. We then studied the discrimination of quasars from stars and the classification of quasars, stars and galaxies with samples from the optical and radio bands and from the optical and X-ray bands. Moreover, feature selection and feature weighting based on Random Forests were investigated. The performances based on different input patterns were compared. The experimental results show that the random forest method is an effective method for astronomical object classification and can be applied to other classification problems faced in astronomy. In addition, Random Forests shows its superiority due to its own merits, e.g. classification, feature selection, feature weighting as well as outlier detection.

  14. An Imbalanced Data Classification Algorithm of De-noising Auto-Encoder Neural Network Based on SMOTE

    OpenAIRE

    Zhang Chenggang; Song Jiazhi; Pei Zhili; Jiang Jingqing

    2016-01-01

    Imbalanced data classification has always been one of the hot issues in the field of machine learning. The synthetic minority over-sampling technique (SMOTE) is a classical approach to balancing datasets, but it may give rise to problems such as noise. A Stacked De-noising Auto-Encoder neural network (SDAE) can effectively reduce data redundancy and noise through unsupervised layer-wise greedy learning. Aiming at the shortcomings of the SMOTE algorithm when synthesizing new minority class samples...

  15. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm, with the goal of integrating the advantages of both. The proposed algorithm is applied to microarray gene expression profiles in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique, mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and with Particle Swarm Optimization (mRMR-PSO). In addition, we compared the GBC algorithm with other related algorithms that have recently been published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. PMID:25880524
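
    The GA half of such a hybrid can be sketched on toy data: bit-mask chromosomes encode gene subsets, and selection, crossover and mutation evolve toward a small, discriminative subset. The fitness function and all parameters below are illustrative assumptions; the ABC phase of the paper's hybrid is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy "expression" data: 20 genes, only genes 0 and 1 carry class signal.
n = 60
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 20))
X[:, 0] += 3 * y
X[:, 1] -= 3 * y

def fitness(mask):
    if mask.sum() == 0:
        return -1e9
    Xs = X[:, mask.astype(bool)]
    # Separation of class means, penalised by the number of selected genes.
    sep = np.linalg.norm(Xs[y == 0].mean(0) - Xs[y == 1].mean(0))
    return sep - 0.2 * mask.sum()

pop = (rng.random((30, 20)) < 0.3).astype(int)    # random initial gene masks
for _ in range(40):
    scores = np.array([fitness(m) for m in pop])
    order = np.argsort(scores)[::-1]
    parents = pop[order[:10]]                      # elitist selection
    children = []
    for _ in range(20):
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(1, 20)
        child = np.concatenate([a[:cut], b[cut:]])  # one-point crossover
        flip = rng.random(20) < 0.05                # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
```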

  16. ASTErIsM - Application of topometric clustering algorithms in automatic galaxy detection and classification

    CERN Document Server

    Tramacere, A; Dubath, P; Kneib, J -P; Courbin, F

    2016-01-01

    We present a study on galaxy detection and shape classification using topometric clustering algorithms. We first use the DBSCAN algorithm to extract, from CCD frames, groups of adjacent pixels with significant fluxes, and we then apply the DENCLUE algorithm to separate the contributions of overlapping sources. The DENCLUE separation is based on the localization of patterns of local maxima, through an iterative algorithm that associates each pixel with the closest local maximum. Our main classification goal is to separate elliptical from spiral galaxies. We introduce new sets of features derived from the computation of geometrical invariant moments of the pixel-group shape and from the statistics of the spatial distribution of the DENCLUE local-maxima patterns. Ellipticals are characterized by a single group of local maxima, related to the galaxy core, while spiral galaxies have additional ones related to segments of spiral arms. We use two different supervised ensemble classification algorithms, Random Forest,...

  17. Decision tree algorithm based on classification matrix

    Institute of Scientific and Technical Information of China (English)

    陶道强; 马良荔; 彭超

    2012-01-01

    To improve the speed and accuracy of decision tree classification, a new scheme based on a classification matrix is proposed. First, the theoretical basis of the ID3 algorithm is introduced and a classification matrix is defined. The value bias of the ID3 algorithm is then pointed out and proved using the classification matrix. On this basis, a weighting factor is introduced to suppress the value bias of the original algorithm, with a corresponding proof given via the classification matrix. According to the characteristics of the gain based on the classification matrix, a new decision tree classification scheme is proposed, aiming to optimize computing speed. Finally, the scheme is compared with the ID3 algorithm through experiments. Analysis of the experimental results shows that the optimized scheme performs noticeably better than the original one.
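
    ID3's bias toward many-valued attributes, and its suppression by a weighting factor, can be demonstrated concretely. The weight used here (dividing the gain by log2 of the number of attribute values, in the spirit of C4.5's gain ratio) is an illustrative stand-in for the paper's classification-matrix-based weight:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Plain ID3 information gain; also returns the number of attribute values."""
    base, n, split = entropy(labels), len(labels), 0.0
    values = set(r[attr] for r in rows)
    for v in values:
        sub = [l for r, l in zip(rows, labels) if r[attr] == v]
        split += len(sub) / n * entropy(sub)
    return base - split, len(values)

def weighted_gain(rows, labels, attr):
    # Hypothetical weighting factor penalising many-valued attributes.
    gain, k = info_gain(rows, labels, attr)
    return gain / max(math.log2(k), 1.0)

rows = [{"id": i, "outlook": o} for i, o in
        enumerate(["sun", "sun", "rain", "rain", "sun", "rain"])]
labels = ["no", "no", "yes", "yes", "no", "yes"]

# Plain gain cannot tell the useless many-valued "id" from the informative
# "outlook" (both score 1.0); the weighted gain demotes "id".
plain = {a: info_gain(rows, labels, a)[0] for a in ("id", "outlook")}
weighted = {a: weighted_gain(rows, labels, a) for a in ("id", "outlook")}
```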

  18. Polarimetric synthetic aperture radar image classification using fuzzy logic in the H/α-Wishart algorithm

    Science.gov (United States)

    Zhu, Teng; Yu, Jie; Li, Xiaojuan; Yang, Jie

    2015-01-01

    To solve the problem that the H/α-Wishart unsupervised classification algorithm can generate only inflexible clusters due to arbitrarily fixed zone boundaries in the clustering process, a refined fuzzy-logic-based classification scheme called the H/α-Wishart fuzzy clustering algorithm is proposed in this paper. A fuzzy membership function was developed for the degree to which pixels belong to each class, instead of an arbitrary boundary. To devise a unified fuzzy function, a normalized Wishart distance is proposed for the clustering step of the new algorithm. The degree of membership is then computed to implement fuzzy clustering. After an iterative procedure, the algorithm yields a classification result. The new classification scheme is applied to two L-band polarimetric synthetic aperture radar (PolSAR) images and an X-band high-resolution PolSAR image of a field in LingShui, Hainan Province, China. Experimental results show that the classification precision of the refined algorithm is greater than that of the H/α-Wishart algorithm and that the refined algorithm performs well in differentiating shadows and water areas.
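
    The fuzzy-membership step can be sketched in isolation: given each pixel's distance to every class center, compute soft membership degrees instead of a hard assignment. The distance values below are hypothetical, and the exact normalized Wishart distance is not reproduced:

```python
import numpy as np

def fuzzy_memberships(dists, m=2.0):
    """Fuzzy-c-means style memberships from an (n_pixels, n_classes)
    distance matrix: soft degrees replace hard zone boundaries."""
    d = np.maximum(dists, 1e-12)              # guard against zero distance
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Hypothetical normalized "Wishart distances" of 2 pixels to 3 class centers.
d = np.array([[0.2, 1.0, 1.5],    # clearly class 0
              [0.9, 1.0, 1.1]])   # ambiguous pixel near a zone boundary
u = fuzzy_memberships(d)
```

The ambiguous second pixel keeps comparable membership in all three classes rather than being forced across a fixed boundary, which is the flexibility the refined algorithm exploits.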

  19. Differential characteristic set algorithm for the complete symmetry classification of partial differential equations

    Institute of Scientific and Technical Information of China (English)

    Chaolu Temuer; Yu-shan BAI

    2009-01-01

    In this paper, we present a differential polynomial characteristic set algorithm for the complete symmetry classification of partial differential equations (PDEs) with some parameters. It makes the solution to the complete symmetry classification problem for PDEs direct and systematic. As an illustrative example, the complete potential symmetry classifications of nonlinear and linear wave equations with an arbitrary function parameter are presented. This is a new application of the differential form characteristic set algorithm, i.e., Wu's method, to differential equations.

  20. A Weighted Block Dictionary Learning Algorithm for Classification

    OpenAIRE

    Zhongrong Shi

    2016-01-01

    Discriminative dictionary learning, playing a critical role in sparse representation based classification, has led to state-of-the-art classification results. Among the existing discriminative dictionary learning methods, two different approaches, shared dictionary and class-specific dictionary, which associate each dictionary atom to all classes or a single class, have been studied. The shared dictionary is a compact method but with lack of discriminative information; the class-specific dict...
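
    The class-specific dictionary idea can be sketched as follows: keep one dictionary per class and label a sample by which class's atoms reconstruct it with the smallest residual. Least squares stands in for sparse coding here, and the data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy data: two classes lying near different 1-D subspaces of R^8.
basis0 = rng.normal(size=8)
basis1 = rng.normal(size=8)
# Two noisy "atoms" per class form each class-specific dictionary.
train = {0: np.outer(rng.normal(size=2), basis0) + 0.05 * rng.normal(size=(2, 8)),
         1: np.outer(rng.normal(size=2), basis1) + 0.05 * rng.normal(size=(2, 8))}

def classify(x):
    """Code x over each class's atoms and pick the class whose dictionary
    gives the smallest reconstruction residual."""
    residuals = {}
    for c, D in train.items():
        coef, *_ = np.linalg.lstsq(D.T, x, rcond=None)   # coding step
        residuals[c] = np.linalg.norm(x - D.T @ coef)    # reconstruction error
    return min(residuals, key=residuals.get)

test0 = 2.0 * basis0 + 0.05 * rng.normal(size=8)
test1 = -1.5 * basis1 + 0.05 * rng.normal(size=8)
preds = [classify(test0), classify(test1)]
```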

  1. IMPROVEMENT OF TCAM-BASED PACKET CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Xu Zhen; Zhang Jun; Rui Liyang; Sun Jun

    2008-01-01

    The feature of Ternary Content Addressable Memories (TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However, the limitations of TCAMs impede their utilization. In this paper, the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.

  2. Analysis of Distributed and Adaptive Genetic Algorithm for Mining Interesting Classification Rules

    Institute of Scientific and Technical Information of China (English)

    YI Yunfei; LIN Fang; QIN Jun

    2008-01-01

    A distributed genetic algorithm can be combined with an adaptive genetic algorithm for mining interesting and comprehensible classification rules. The paper gives the method of encoding the rules and the fitness function; the selection, crossover, mutation and migration operators for the DAGA are designed at the same time.

  3. A NEW UNSUPERVISED CLASSIFICATION ALGORITHM FOR POLARIMETRIC SAR IMAGES BASED ON FUZZY SET THEORY

    Institute of Scientific and Technical Information of China (English)

    Fu Yusheng; Xie Yan; Pi Yiming; Hou Yinming

    2006-01-01

    In this letter, a new method is proposed for unsupervised classification of terrain types and man-made objects using POLarimetric Synthetic Aperture Radar (POLSAR) data. This technique combines the use of the polarimetric information of SAR images with an unsupervised classification method based on fuzzy set theory. Image quantization and image enhancement are used to preprocess the POLSAR data. Then the polarimetric information and the Fuzzy C-Means (FCM) clustering algorithm are used to classify the preprocessed images. The advantages of this algorithm are automated classification, high classification accuracy, fast convergence and high stability. The effectiveness of this algorithm is demonstrated by experiments using SIR-C/X-SAR (Spaceborne Imaging Radar-C/X-band Synthetic Aperture Radar) data.

  4. [MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique].

    Science.gov (United States)

    Chen, Zhiru; Hong, Wenxue

    2016-02-01

    Considering the low prediction accuracy for positive samples and the poor overall classification caused by the unbalanced sample data of MicroRNA (miRNA) targets, we propose a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm in this paper, an under-sampling approach based on ensemble learning. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates abnormal negative samples with a robust sample-weight smoothing mechanism so as to avoid over-learning. Finally, the prediction of the miRNA target integrated classifier is achieved by combining multiple weak classifiers through a voting mechanism. The experiments revealed that SVM-IUSM, compared with other algorithms on unbalanced dataset collections, could not only improve the accuracy on positive targets and the overall classification effect, but also enhance the generalization ability of the miRNA target classifier. PMID:27382743
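
    The clustering-based under-sampling step can be sketched on its own: replace the majority class by the centroids of a small k-means run so the two classes become balanced before training. The data and k-means details are illustrative; the AdaBoost/SVM integration is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(5)
majority = rng.normal(size=(100, 2))          # negative (majority) class
minority = rng.normal(loc=3.0, size=(10, 2))  # positive (minority) class

def kmeans_centroids(X, k, iters=20, seed=0):
    """Tiny k-means; its k centroids replace the majority class, which is
    the clustering-based under-sampling idea in the abstract."""
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    return centers

balanced_neg = kmeans_centroids(majority, k=len(minority))
X_bal = np.vstack([balanced_neg, minority])   # balanced training set
y_bal = np.array([0] * len(balanced_neg) + [1] * len(minority))
```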

  5. Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data

    Directory of Open Access Journals (Sweden)

    Viswanath Satish

    2012-02-01

    Full Text Available Abstract Background Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of noise in the data. In this paper, we present a novel DR technique known as consensus embedding that aims to overcome these problems by generating and combining multiple low-dimensional embeddings, hence exploiting the variance among them in a manner similar to ensemble classifier schemes such as Bagging. We demonstrate theoretical properties of consensus embedding which show that it will result in a single stable embedding solution that preserves information more accurately as compared to any individual embedding (generated via DR schemes such as Principal Component Analysis, Graph Embedding, or Locally Linear Embedding). Intelligent sub-sampling (via mean-shift) and code parallelization are utilized to provide for an efficient implementation of the scheme. Results Applications of consensus embedding are shown in the context of classification and clustering as applied to: (1) image partitioning of white matter and gray matter on 10 different synthetic brain MRI images corrupted with 18 different combinations of noise and bias field inhomogeneity, (2) classification of 4 high-dimensional gene-expression datasets, (3) cancer detection (at a pixel level) on 16 image slices obtained from 2 different high-resolution prostate MRI datasets. In over 200 different experiments concerning classification and segmentation of biomedical data, consensus embedding was found to consistently outperform both linear and non-linear DR methods within all applications considered.
Conclusions We have presented a novel framework termed consensus embedding which leverages ensemble classification theory within dimensionality reduction, allowing for application to a wide range

  6. Algorithms for the Automatic Classification and Sorting of Conifers in the Garden Nursery Industry

    DEFF Research Database (Denmark)

    Petri, Stig

    with the classification and sorting of plants using machine vision have been discussed as an introduction to the work reported here. The use of Nordmann firs as a basis for evaluating the developed algorithms naturally introduces a bias towards this species in the algorithms, but steps have been taken throughout...... was used as the basis for evaluating the constructed feature extraction algorithms. Through an analysis of the construction of a machine vision system suitable for classifying and sorting plants, the needs with regard to physical frame, lighting system, camera and software algorithms have been uncovered...... classification performance. A total of six feature extraction algorithms are reported in this work. These include algorithms that record the image data directly, describe the border of the plant object, describe the color characteristics of the plant, or attempts to extract and describe the trunk structure...

  7. A Critical Study of Selected Classification Algorithms for Liver Disease Diagnosis

    Directory of Open Access Journals (Sweden)

    Bendi Venkata Ramana

    2011-03-01

    Full Text Available Patients with liver disease have been continuously increasing because of excessive consumption of alcohol, inhalation of harmful gases, and intake of contaminated food, pickles and drugs. Automatic classification tools may reduce the burden on doctors. This paper evaluates selected classification algorithms for the classification of some liver patient datasets. The classification algorithms considered here are the Naïve Bayes classifier, C4.5, the back-propagation neural network algorithm, and support vector machines. These algorithms are evaluated based on four criteria: accuracy, precision, sensitivity and specificity.

  8. Packet Classification by Multilevel Cutting of the Classification Space: An Algorithmic-Architectural Solution for IP Packet Classification in Next Generation Networks

    Directory of Open Access Journals (Sweden)

    Motasem Aldiab

    2008-01-01

    Full Text Available Traditionally, the Internet provides only a “best-effort” service, treating all packets going to the same destination equally. However, providing differentiated services for different users based on their quality requirements is increasingly becoming a demanding issue. For this, routers need to have the capability to distinguish and isolate traffic belonging to different flows. This ability to determine the flow each packet belongs to is called packet classification. Technology vendors are reluctant to support algorithmic solutions for classification due to their nondeterministic performance. Although content addressable memories (CAMs) are favoured by technology vendors due to their deterministic high lookup rates, they suffer from the problems of high power consumption and high silicon cost. This paper provides a new algorithmic-architectural solution for packet classification that mixes CAMs with algorithms based on multilevel cutting of the classification space into smaller spaces. The provided solution utilizes the geometrical distribution of rules in the classification space. It provides the deterministic performance of CAMs, support for dynamic updates, and added flexibility for system designers.

  9. Performance Analysis of Gender Clustering and Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Dr.K.Meena

    2012-03-01

    Full Text Available In speech processing, gender clustering and classification play a major role. In both gender clustering and classification, selecting the feature is an important process, and the most often utilized feature for gender clustering and classification in speech processing is pitch. The pitch value of a male speech differs considerably from that of a female speech, and normally there is a substantial frequency difference between male and female speech. In some cases, however, the frequency of a male voice is almost equal to that of a female voice, or vice versa, and it is then difficult to identify the gender exactly. Considering this drawback, three features are used here for identifying the gender: energy entropy, zero crossing rate and short time energy. Gender clustering and classification of the speech signal are estimated using these three features. The gender clustering is computed using the Euclidean distance, Mahalanobis distance, Manhattan distance and Bhattacharyya distance methods, and the gender classification is computed using combined fuzzy logic and neural network, neuro-fuzzy, and support vector machine methods; their performance is then analyzed.
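
    The distance measures named above differ mainly in how they treat feature scale. A small sketch with hypothetical three-feature vectors (energy entropy, zero crossing rate, short time energy) and an assumed per-feature covariance:

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    return float(np.abs(a - b).sum())

def mahalanobis(a, b, cov):
    # Accounts for feature scale/correlation, unlike the Euclidean distance.
    d = a - b
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

# Hypothetical cluster-center vectors: [energy entropy, ZCR, short time energy]
male = np.array([0.4, 0.3, 0.6])
female = np.array([0.5, 0.5, 0.2])
cov = np.diag([0.01, 0.04, 0.16])    # assumed per-feature variances
dists = (euclidean(male, female), manhattan(male, female),
         mahalanobis(male, female, cov))
```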

  10. Application of ant colony algorithm in plant leaves classification based on infrared spectroscopy

    Science.gov (United States)

    Guo, Tiantai; Hong, Bo; Kong, Ming; Zhao, Jun

    2014-04-01

    This paper proposes the use of an ant colony algorithm in the analysis of spectral data of plant leaves to achieve the best classification of different plants within a short time. Intelligent classification is realized according to the different components of the featured information included in the near infrared spectrum data of plants. The near infrared diffuse emission spectrum curves of the leaves of Cinnamomum camphora and Acer saccharum Marsh were acquired, with 75 leaves of each species, divided into two groups. The acquired data are then processed using the ant colony algorithm, and leaves of the same species can be grouped into one class by ant colony clustering. Finally, the two groups of data are classified into two classes. Experimental results show that the algorithm can distinguish the different species with 100% accuracy. The classification of plant leaves has important application value in agricultural development, research on species invasion, floriculture, etc.

  11. Research of Plant-Leaves Classification Algorithm Based on Supervised LLE

    Directory of Open Access Journals (Sweden)

    Yan Qing

    2013-06-01

    Full Text Available A new supervised LLE method based on the Fisher projection is proposed in this paper and combined with a new classification algorithm based on manifold learning to realize the recognition of plant leaves. First, the method uses the Fisher projection distance to replace the sample's geodesic distance, yielding a new supervised LLE algorithm. Then, a classification algorithm that uses the manifold reconstruction error to determine the sample's class directly is adopted. This algorithm makes better use of the category information and improves the recognition rate effectively, while also having the advantage of easy parameter estimation. The experimental results on real-world plant leaf databases show an average recognition accuracy of up to 95.17%.
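
    The classification-by-reconstruction-error idea can be sketched directly: reconstruct a test sample as an affine (LLE-style) combination of its k nearest neighbours within each class and pick the class with the smallest residual. The data below are synthetic clusters, not leaf features:

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy "leaf feature" data: two classes clustered around different centers.
A = rng.normal(loc=0.0, scale=0.3, size=(20, 4))
B = rng.normal(loc=2.0, scale=0.3, size=(20, 4))

def recon_error(x, Xc, k=3):
    """LLE-style reconstruction: express x as an affine combination of its
    k nearest neighbours within one class and return the residual."""
    idx = np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]
    N = Xc[idx]                           # k x d neighbour matrix
    # Solve min ||x - w @ N|| s.t. sum(w) = 1 via the local Gram system.
    G = (N - x) @ (N - x).T
    G += 1e-6 * np.trace(G) * np.eye(k)   # regularize, as in standard LLE
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()
    return float(np.linalg.norm(x - w @ N))

x = np.full(4, 0.1)                       # test point near class A
pred = min([("A", recon_error(x, A)), ("B", recon_error(x, B))],
           key=lambda t: t[1])[0]
```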

  12. Two-step Classification Algorithm Based on Decision-Theoretic Rough Set Theory

    Directory of Open Access Journals (Sweden)

    Jun Wang

    2013-07-01

    Full Text Available This paper introduces rough set theory and decision-theoretic rough set (DTRS) theory. Based on the latter, a two-step classification algorithm is proposed. Compared with primitive DTRS algorithms, our method decreases the range of the negative domain and employs a two-step strategy in classification. When new or unknown samples are found, the algorithm estimates whether they belong to the negative domain, so fewer samples are wrongly classified into the negative domain. Therefore, the error rate and loss of classification are lowered. Compared with traditional information filtering methods, such as the Naive Bayes algorithm and the primitive DTRS algorithm, the proposed method achieves high accuracy and low loss.
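
    The two-step, three-region idea can be sketched with scalar posteriors: thresholds α and β carve out positive, negative and boundary regions, and only boundary samples go to a second, finer step. The threshold values here are illustrative:

```python
def three_way(p, alpha=0.7, beta=0.3):
    """Decision-theoretic rough set style three-way decision: accept,
    reject, or defer to a second classification step."""
    if p >= alpha:
        return "positive"
    if p <= beta:
        return "negative"
    return "boundary"      # deferred instead of being forced negative

def two_step(p, alpha=0.7, beta=0.3, second_cut=0.5):
    """Step two re-examines boundary samples with a finer threshold,
    shrinking the negative region as the abstract describes."""
    region = three_way(p, alpha, beta)
    if region != "boundary":
        return region
    return "positive" if p >= second_cut else "negative"

labels = [two_step(p) for p in (0.9, 0.6, 0.4, 0.1)]
```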

  13. Application of a Genetic Algorithm to Nearest Neighbour Classification

    NARCIS (Netherlands)

    Simkin, S.; Verwaart, D.; Vrolijk, H.C.J.

    2005-01-01

    This paper describes the application of a genetic algorithm to nearest-neighbour based imputation of sample data into a census dataset. The genetic algorithm optimises the selection and weights of the variables used for measuring distance. The results show that the measure of fit can be improved by

  14. Woven fabric defects detection based on texture classification algorithm

    International Nuclear Information System (INIS)

    In this paper we compare two well-known texture classification methods to solve the problem of recognition and classification of defects occurring in textile manufacture: the local binary patterns (LBP) method and the co-occurrence matrix. The classifier used is the support vector machine (SVM). The system has been tested using the TILDA database. The results obtained are interesting and show that LBP is a good method for defect recognition and classification problems; it gives a good running time, especially for real-time applications.
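
    The LBP half of the comparison can be sketched with the basic 8-neighbour operator; a histogram of the resulting codes would then be the texture feature fed to the SVM:

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour LBP: threshold each pixel's ring of neighbours at
    the center value and pack the comparison bits into one byte per pixel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # clockwise offsets starting at the top-left neighbour
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            c = img[i, j]
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                if img[i + di, j + dj] >= c:
                    code |= 1 << bit
            out[i - 1, j - 1] = code
    return out

img = np.array([[5, 5, 5],
                [5, 4, 5],
                [5, 5, 5]], dtype=np.uint8)
codes = lbp_image(img)   # every neighbour >= center, so all 8 bits are set
```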

  15. Study on Increasing the Accuracy of Classification Based on Ant Colony algorithm

    Science.gov (United States)

    Yu, M.; Chen, D.-W.; Dai, C.-Y.; Li, Z.-L.

    2013-05-01

    The application of GIS advances the ability of data analysis on remote sensing images, and the classification and extraction of remote sensing images is the primary information source for GIS in LUCC applications. How to increase the accuracy of classification is an important topic in remote sensing research; adding features and developing new classification methods are the main ways to do so. The ant colony algorithm, from the field of nature-inspired computation, is a swarm-intelligence method whose agents exhibit a uniform intelligent computation mode, and its application to remote sensing image classification is still preliminary. Studying the applicability of the ant colony algorithm with richer feature sets and exploring its advantages and performance is therefore of great significance. The study takes the outskirts of Fuzhou, an area with complicated land use in Fujian Province, as the study area. A multi-source database integrating spectral information (TM1-5, TM7, NDVI, NDBI), topographic characters (DEM, Slope, Aspect) and textural information (Mean, Variance, Homogeneity, Contrast, Dissimilarity, Entropy, Second Moment, Correlation) was built. Classification rules based on the different characters are discovered from the samples through the ant colony algorithm, and a classification test is performed based on these rules. For comparison, the accuracies are checked against the traditional maximum likelihood method, the C4.5 algorithm and rough set classification. The study showed that the accuracy of classification based on the ant colony algorithm is higher than that of the other methods. In addition, near-term land use and cover changes in Fuzhou are studied and mapped using remote sensing technology based on the ant colony algorithm.

  16. Analysis and Evaluation of IKONOS Image Fusion Algorithm Based on Land Cover Classification

    Institute of Scientific and Technical Information of China (English)

    Xia; JING; Yan; BAO

    2015-01-01

    Each fusion algorithm has its own advantages and limitations, so it is difficult to simply rank fusion algorithms as good or bad; whether an algorithm is selected to fuse images also depends on the sensor types and the particular research purpose. Firstly, five fusion methods, i.e. IHS, Brovey, PCA, SFIM and Gram-Schmidt, are briefly described in the paper. Then visual judgment and quantitative statistical parameters are used to assess the five algorithms. Finally, in order to determine the most suitable fusion method for land cover classification of IKONOS images, maximum likelihood classification (MLC) was applied to the five fused images. The results showed that the SFIM and Gram-Schmidt transforms were better than the other three fusion methods in improving spatial detail and preserving spectral fidelity, with the Gram-Schmidt technique superior to SFIM in expressing image detail. The classification accuracy of the images fused with Gram-Schmidt and SFIM was higher than that of the other three methods, with an overall accuracy greater than 98%. The IHS-fused image gave the lowest classification accuracy, with an overall accuracy of 83.14% and a kappa coefficient of 0.76. Thus the IKONOS fusion images obtained with Gram-Schmidt and SFIM were better for improving land cover classification accuracy.
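
The reported evaluation metrics can be reproduced from a confusion matrix; the sketch below computes overall accuracy and the kappa coefficient, the two figures quoted for the IHS-fused image. The example matrix is made up for illustration.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion
    matrix (rows = reference classes, columns = mapped classes)."""
    n = sum(sum(row) for row in confusion)
    diag = sum(confusion[i][i] for i in range(len(confusion)))
    p_o = diag / n                               # observed agreement
    p_e = sum(sum(row) * sum(col)                # chance agreement from
              for row, col                       # row and column marginals
              in zip(confusion, zip(*confusion))) / n ** 2
    return p_o, (p_o - p_e) / (1 - p_e)
```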

  17. A method for classification of network traffic based on C5.0 Machine Learning Algorithm

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Riaz, M. Tahir; Pedersen, Jens Myrup

    2012-01-01

    Monitoring of the network performance in high-speed Internet infrastructure is a challenging task, as the requirements for the given quality level are service-dependent. Backbone QoS monitoring and analysis in Multi-hop Networks requires therefore knowledge about types of applications forming...... current network traffic. To overcome the drawbacks of existing methods for traffic classification, usage of C5.0 Machine Learning Algorithm (MLA) was proposed. On the basis of statistical traffic information received from volunteers and C5.0 algorithm we constructed a boosted classifier, which was shown...... and classification, an algorithm for recognizing flow direction and the C5.0 itself. Classified applications include Skype, FTP, torrent, web browser traffic, web radio, interactive gaming and SSH. We performed subsequent tries using different sets of parameters and both training and classification options...

  18. A hierarchical classification ant colony algorithm for predicting gene ontology terms

    OpenAIRE

    Otero, Fernando E. B.; Freitas, Alex. A.; Johnson, Colin G.

    2009-01-01

    This paper proposes a novel Ant Colony Optimisation algorithm for the hierarchical problem of predicting protein functions using the Gene Ontology (GO). The GO structure represents a challenging case of hierarchical classification, since its terms are organised in a direct acyclic graph fashion where a term can have more than one parent in contrast to only one parent in tree structures. The proposed method discovers an ordered list of classification rules which is able to predict all GO terms...

  19. A Supervised Classification Algorithm for Note Onset Detection

    Directory of Open Access Journals (Sweden)

    Douglas Eck

    2007-01-01

    Full Text Available This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
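
The moving-average peak-picking step described above can be sketched as follows; the window size and threshold `delta` are illustrative assumptions, not the paper's tuned values.

```python
def pick_onsets(p, window=3, delta=0.1):
    """Peak-picking over per-frame onset probabilities p: frame i is
    reported as an onset if it is a local maximum and exceeds the
    moving average of the surrounding window by at least delta."""
    onsets = []
    for i in range(1, len(p) - 1):
        lo, hi = max(0, i - window), min(len(p), i + window + 1)
        avg = sum(p[lo:hi]) / (hi - lo)
        if p[i] >= p[i - 1] and p[i] > p[i + 1] and p[i] > avg + delta:
            onsets.append(i)
    return onsets
```

The moving-average criterion suppresses broad plateaus of moderately high onset probability while keeping sharp peaks.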

  20. A RBF classification method of remote sensing image based on genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Remote sensing image classification has attracted considerable interest as an effective method for retrieving information from the rapidly growing volume of complex, distributed, large-scale and cross-time satellite imaging data, driven by the increase in image quantity and resolution. In this paper, genetic algorithms were employed to optimize the weights of radial basis function networks in order to improve the precision of remote sensing image classification. The resulting classification was also introduced into GIS spatial analysis and spatial online analytical processing (OLAP), and its effectiveness was demonstrated in an analysis of land utilization change in Daqing city.

  1. Adaptative initialisation of a EvKNN classification algorithm

    OpenAIRE

    Chan Wai Tim, Stefen; Rombaut, Michele; Pellerin, Denis

    2012-01-01

    International audience; The establishment of the learning database is a long and tedious task that must be carried out before starting the classification process. An Evidential KNN (EvKNN) has been developed to help the user; it proposes the "best" samples to label according to a strategy. However, at the beginning of this task, the classes are not clearly defined and are represented by a number of labeled samples smaller than the k samples required by EvKNN. In this paper, we p...

  2. Real Time Motif Classification from Database Using Intelligent Algorithms

    Directory of Open Access Journals (Sweden)

    Paresh Kotak

    2012-12-01

    Full Text Available The amount of raw data being accumulated in databases is increasing at an inconceivable rate. However, these data-rich databases are poor in providing substantial information. This is where data mining comes into the picture. Specifically, data mining is "the process of extracting or mining information from large amounts of data". Motif classification has been an active area of research in data mining. It consists of assigning a data instance to one of the predefined classes/groups based upon the knowledge gained from previously seen (classified) data.

  3. PCIU: Hardware Implementations of an Efficient Packet Classification Algorithm with an Incremental Update Capability

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2011-01-01

    Full Text Available Packet classification plays a crucial role in a number of network services such as policy-based routing, firewalls, and traffic billing, to name a few. However, classification can be a bottleneck in the above-mentioned applications if not implemented properly and efficiently. In this paper, we propose PCIU, a novel classification algorithm, which improves upon previously published work. PCIU provides lower preprocessing time, lower memory consumption, ease of incremental rule update, and reasonable classification time compared to state-of-the-art algorithms. The proposed algorithm was evaluated and compared to RFC and HiCut using several benchmarks. The results obtained indicate that PCIU outperforms these algorithms in terms of speed, memory usage, incremental update capability, and preprocessing time. The algorithm, furthermore, was improved and made more accessible for a variety of applications through implementation in hardware. Two such implementations are detailed and discussed in this paper. The results indicate that a hardware/software co-design approach yields a PCIU solution that is slower but easier to optimize and improve within time constraints. A hardware accelerator based on an ESL approach using Handel-C, on the other hand, achieved a 31x speed-up over a pure software implementation running on a state-of-the-art Xeon processor.
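
For contrast with PCIU, the naive baseline that such algorithms improve upon, a linear first-match classifier with trivial incremental updates, can be sketched as follows; the field layout and rule format are illustrative assumptions, not PCIU's data structures.

```python
class NaiveClassifier:
    """First-match packet classifier over rules given as per-field
    (low, high) ranges. Linear search is the O(n) baseline that
    algorithms such as PCIU improve upon; its one virtue is that
    incremental rule updates are trivial."""

    def __init__(self):
        self.rules = []   # list of (priority, ranges, action)

    def add_rule(self, priority, ranges, action):
        self.rules.append((priority, ranges, action))
        self.rules.sort(key=lambda r: r[0])   # lower number = higher priority

    def remove_rule(self, priority):
        self.rules = [r for r in self.rules if r[0] != priority]

    def classify(self, packet):
        """packet is a tuple of field values, one per rule field."""
        for _, ranges, action in self.rules:
            if all(lo <= f <= hi for f, (lo, hi) in zip(packet, ranges)):
                return action
        return "default"
```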

  4. EVALUATION OF SOUND CLASSIFICATION USING MODIFIED CLASSIFIER AND SPEECH ENHANCEMENT USING ICA ALGORITHM FOR HEARING AID APPLICATION

    OpenAIRE

    N. Shanmugapriya; E. Chandra

    2016-01-01

    Hearing aid users are exposed to diversified vocal scenarios, so sound classification algorithms become a vital factor in yielding a good listening experience. In this work, an approach is proposed to improve speech quality in hearing aids based on the Independent Component Analysis (ICA) algorithm with modified speech signal classification methods. The proposed algorithm yields better speech intelligibility than other existing algorithms, and this result has been proved ...

  5. Data classification with radial basis function networks based on a novel kernel density estimation algorithm.

    Science.gov (United States)

    Oyang, Yen-Jen; Hwang, Shien-Ching; Ou, Yu-Yen; Chen, Chien-Yu; Chen, Zhi-Wei

    2005-01-01

    This paper presents a novel learning algorithm for efficient construction of radial basis function (RBF) networks that can deliver the same level of accuracy as support vector machines (SVMs) in data classification applications. The proposed learning algorithm works by constructing one RBF subnetwork to approximate the probability density function of each class of objects in the training data set. With respect to algorithm design, the main distinction of the proposed learning algorithm is its novel kernel density estimation algorithm, which features an average time complexity of O(n log n), where n is the number of samples in the training data set. One important advantage of the proposed learning algorithm, in comparison with the SVM, is that it generally takes far less time to construct a data classifier with an optimized parameter setting. This feature is significant for many contemporary applications, in particular those in which new objects are continuously added to an already large database. Another desirable feature is that the RBF networks constructed are capable of carrying out data classification with more than two classes of objects in a single run; unlike with the SVM, there is no need to resort to mechanisms such as one-against-one or one-against-all for handling datasets with more than two classes. The comparison with the SVM is of particular interest, because a number of recent studies have shown that SVMs are generally able to deliver higher classification accuracy than other existing data classification algorithms. As the proposed learning algorithm is instance-based, the data reduction issue is also addressed in this paper. One interesting observation in this regard is that, for all three data sets used in data reduction experiments, the number of training samples remaining after a naive data reduction mechanism is
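
The per-class density idea can be illustrated with a naive Parzen-window classifier: estimate one kernel density per class and pick the class whose density is highest at the query point, which is naturally multiclass in a single run. This sketch uses a plain O(n)-per-query Gaussian KDE, not the paper's O(n log n) estimator, and the bandwidth `h` is an assumption.

```python
import math

def gauss_kde(point, samples, h):
    """Naive Gaussian kernel density estimate at `point`."""
    d = len(point)
    norm = (2 * math.pi) ** (d / 2) * h ** d * len(samples)
    total = sum(math.exp(-sum((a - b) ** 2 for a, b in zip(point, s))
                         / (2 * h * h))
                for s in samples)
    return total / norm

def kde_classify(point, classes, h=0.5):
    """Assign `point` to the class whose estimated density is highest.
    `classes` maps class label -> list of training points."""
    return max(classes, key=lambda c: gauss_kde(point, classes[c], h))
```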

  6. A Comparison of Two Open Source LiDAR Surface Classification Algorithms

    OpenAIRE

    Danny G Marks; Nancy F. Glenn; Timothy E. Link; Hudak, Andrew T.; Rupesh Shrestha; Michael J. Falkowski; Alistair M. S. Smith; Hongyu Huang; Wade T. Tinkham

    2011-01-01

    With the progression of LiDAR (Light Detection and Ranging) towards a mainstream resource management tool, it has become necessary to understand how best to process and analyze the data. While most ground surface identification algorithms remain proprietary and have high purchase costs, a few are openly available, free to use, and supported by published results. Two of the latter are the multiscale curvature classification and the Boise Center Aerospace Laboratory LiDAR (BCAL) algorithms....

  7. Research of information classification and strategy intelligence extract algorithm based on military strategy hall

    Science.gov (United States)

    Chen, Lei; Li, Dehua; Yang, Jie

    2007-12-01

    Constructing a virtual international strategy environment requires many kinds of information, such as economy, politics, military affairs, diplomacy, culture and science. So it is very important to build an efficient system for automatic information extraction, classification, recombination and analysis management as the foundation and a component of the military strategy hall. This paper first uses an improved Boost algorithm to classify the obtained initial information, and then uses a strategy intelligence extraction algorithm to extract strategy intelligence from the initial information to help strategists analyze it.

  8. Automated detection and classification of cryptographic algorithms in binary programs through machine learning

    OpenAIRE

    Hosfelt, Diane Duros

    2015-01-01

    Threats from the internet, particularly malicious software (i.e., malware) often use cryptographic algorithms to disguise their actions and even to take control of a victim's system (as in the case of ransomware). Malware and other threats proliferate too quickly for the time-consuming traditional methods of binary analysis to be effective. By automating detection and classification of cryptographic algorithms, we can speed program analysis and more efficiently combat malware. This thesis wil...

  9. Walking pattern classification and walking distance estimation algorithms using gait phase information.

    Science.gov (United States)

    Wang, Jeen-Shing; Lin, Che-Wei; Yang, Ya-Ting C; Ho, Yu-Jen

    2012-10-01

    This paper presents a walking pattern classification and a walking distance estimation algorithm using gait phase information. A gait phase information retrieval algorithm was developed to analyze the duration of the phases in a gait cycle (i.e., stance, push-off, swing, and heel-strike phases). Based on the gait phase information, a decision tree based on the relations between gait phases was constructed for classifying three different walking patterns (level walking, walking upstairs, and walking downstairs). Gait phase information was also used for developing a walking distance estimation algorithm. The walking distance estimation algorithm consists of the processes of step count and step length estimation. The proposed walking pattern classification and walking distance estimation algorithm have been validated by a series of experiments. The accuracy of the proposed walking pattern classification was 98.87%, 95.45%, and 95.00% for level walking, walking upstairs, and walking downstairs, respectively. The accuracy of the proposed walking distance estimation algorithm was 96.42% over a walking distance.
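
The step-count part of the distance estimator can be sketched from a labelled gait-phase sequence; the fixed step length used here is an illustrative stand-in for the paper's step-length estimation process.

```python
def walking_distance(phases, step_length=0.7):
    """Count steps from a gait-phase sequence (one label per frame:
    'stance', 'push-off', 'swing', 'heel-strike') and estimate distance
    as steps x a step length. A heel-strike that follows a swing phase
    marks one step; the fixed step length in metres is an illustrative
    assumption, not the paper's per-step estimator."""
    steps = 0
    prev = None
    for phase in phases:
        if phase == "heel-strike" and prev == "swing":
            steps += 1
        prev = phase
    return steps, steps * step_length
```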

  10. Data classification using metaheuristic Cuckoo Search technique for Levenberg Marquardt back propagation (CSLM) algorithm

    Science.gov (United States)

    Nawi, Nazri Mohd.; Khan, Abdullah; Rehman, M. Z.

    2015-05-01

    Nature-inspired metaheuristic techniques provide derivative-free solutions to complex problems. One of the latest additions to this group of optimization procedures is the Cuckoo Search (CS) algorithm. Artificial neural network (ANN) training is an optimization task, since the goal of the training process is to find an optimal weight set for the network. Traditional training algorithms have limitations such as getting trapped in local minima and a slow convergence rate. This study proposes a new technique, CSLM, which combines Cuckoo Search with the best features of two known algorithms, back-propagation (BP) and the Levenberg-Marquardt algorithm (LM), to improve the convergence speed of ANN training and avoid the local-minima problem. Selected benchmark classification datasets are used for simulation. The experimental results show that the proposed Cuckoo Search with Levenberg-Marquardt algorithm performs better than the other algorithms used in this study.

  11. A New Function-based Framework for Classification and Evaluation of Mutual Exclusion Algorithms

    Directory of Open Access Journals (Sweden)

    Leila Omrani

    2011-05-01

    Full Text Available This paper presents a new function-based framework for mutual exclusion algorithms in distributed systems. In the traditional classification, mutual exclusion algorithms were divided into two groups: token-based and permission-based. Recently, new algorithms have been proposed to increase fault tolerance, minimize message complexity and decrease synchronization delay. Although prior studies in this field can compare and evaluate the algorithms, this paper takes a step further and proposes a new function-based framework as a brief introduction to the algorithms in four groups: token-based, permission-based, hybrid and k-mutual exclusion. In addition, because the performance criteria in use are dispersed and obscure, it introduces four parameters that can be used to compare various distributed mutual exclusion algorithms: message complexity, synchronization delay, decision theory and node configuration. We hope the proposed framework provides a suitable context for technical and clear evaluation of existing and future methods.

  12. Model classification rate control algorithm for video coding

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    A model-based classification rate control method for video coding is proposed. Macroblocks are classified according to their prediction errors, and different parameters are used in the rate-quantization and distortion-quantization models. The model parameters for each class are calculated from the previous frame of the same type during coding, and the models are used to estimate the relations among rate, distortion and quantization for the current frame. Further steps, such as R-D-optimization-based quantization adjustment and smoothing of the quantization of adjacent macroblocks, are used to improve quality. Experimental results prove that the technique is effective and easily realized; the method presented in the paper is well suited to MPEG and H.264 rate control.
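
A common concrete form of such a rate-quantization model is the quadratic model R(Q) = a*MAD/Q + b*MAD/Q^2; the sketch below inverts it to pick a quantizer for a given bit budget. The quadratic form and the parameter names are assumptions drawn from standard rate-control practice, not taken from this paper.

```python
def pick_quantizer(target_bits, mad, a, b):
    """Invert the quadratic rate-quantization model
        R(Q) = a*MAD/Q + b*MAD/Q**2
    for the quantization step Q that meets a bit budget. MAD is the
    mean absolute prediction error of the macroblock class; a and b
    are the per-class model parameters fitted on the previous frame.
    Setting R(Q) = target gives target*Q**2 - a*MAD*Q - b*MAD = 0,
    whose positive root is returned."""
    disc = (a * mad) ** 2 + 4 * target_bits * b * mad
    return (a * mad + disc ** 0.5) / (2 * target_bits)
```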

  13. Improved Fault Classification in Series Compensated Transmission Line: Comparative Evaluation of Chebyshev Neural Network Training Algorithms.

    Science.gov (United States)

    Vyas, Bhargav Y; Das, Biswarup; Maheshwari, Rudra Prakash

    2016-08-01

    This paper presents the Chebyshev neural network (ChNN) as an improved artificial intelligence technique for power system protection studies and examines the performances of two ChNN learning algorithms for fault classification of series compensated transmission line. The training algorithms are least-square Levenberg-Marquardt (LSLM) and recursive least-square algorithm with forgetting factor (RLSFF). The performances of these algorithms are assessed based on their generalization capability in relating the fault current parameters with an event of fault in the transmission line. The proposed algorithm is fast in response as it utilizes postfault samples of three phase currents measured at the relaying end corresponding to half-cycle duration only. After being trained with only a small part of the generated fault data, the algorithms have been tested over a large number of fault cases with wide variation of system and fault parameters. Based on the studies carried out in this paper, it has been found that although the RLSFF algorithm is faster for training the ChNN in the fault classification application for series compensated transmission lines, the LSLM algorithm has the best accuracy in testing. The results prove that the proposed ChNN-based method is accurate, fast, easy to design, and immune to the level of compensations. Thus, it is suitable for digital relaying applications. PMID:25314714

  14. Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm

    Directory of Open Access Journals (Sweden)

    E. Parvinnia

    2014-01-01

    Full Text Available Electroencephalogram (EEG) signals are often used to diagnose diseases such as seizure disorders, Alzheimer's disease, and schizophrenia. One main problem with recorded EEG samples is that they are not equally reliable, due to artifacts at the time of recording, so EEG signal classification algorithms should have a mechanism to handle this issue. Adaptive classifiers appear well suited to biological signals such as EEG. In this paper, a general adaptive method named weighted distance nearest neighbor (WDNN) is applied to EEG signal classification to tackle this problem. The algorithm assigns a weight to each training sample to control its influence when classifying test samples; these weights are used to find the nearest neighbor of an input query pattern. To assess the performance of this scheme, EEG signals of thirteen schizophrenic patients and eighteen normal subjects were analyzed for two-group classification. Several features, including fractal dimension, band power and autoregressive (AR) model coefficients, were extracted from the EEG signals. The classification results were evaluated using leave-one-subject-out cross-validation for reliable estimation. The results indicate that the combination of WDNN and the selected features significantly outperforms the basic nearest neighbor and the other methods proposed in the past for classifying these two groups, so this method can be a complementary tool for specialists in distinguishing schizophrenia disorder.
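
The WDNN decision rule can be sketched as follows; the weight learning itself is omitted, and dividing the distance by a per-sample weight is one common formulation, assumed here for illustration rather than taken from the paper.

```python
import math

def wdnn_classify(x, train, weights):
    """Weighted-distance 1-NN: each training sample i has a learned
    reliability weight w_i that scales its distance, so unreliable
    (artifact-laden) recordings attract fewer test samples.
    `train` is a list of (feature_vector, label) pairs."""
    def score(i):
        xi, _ = train[i]
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, xi)))
        return d / weights[i]       # larger weight -> more influence
    best = min(range(len(train)), key=score)
    return train[best][1]
```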

  15. A Comparative Study of Classification and Regression Algorithms for Modelling Students' Academic Performance

    Science.gov (United States)

    Strecht, Pedro; Cruz, Luís; Soares, Carlos; Mendes-Moreira, João; Abreu, Rui

    2015-01-01

    Predicting the success or failure of a student in a course or program is a problem that has recently been addressed using data mining techniques. In this paper we evaluate some of the most popular classification and regression algorithms on this problem. We address two problems: prediction of approval/failure and prediction of grade. The former is…

  16. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    Directory of Open Access Journals (Sweden)

    Huibin Lu

    2015-01-01

    Full Text Available In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the source domain to the target domain; the second step is to construct L1-Graph on the basis of sparse representation, so as to search for the nearest neighbor data with self-adaptation and preserve the samples and the geometric structure; lastly, we integrate two complementary objective functions into the unified optimization issue and use the iterative algorithm to cope with it, and then the estimation of the testing sample classification is completed. Comparative experiments are conducted in USPS-Binary digital database, Three-Domain Object Benchmark database, and ALOI database; the experimental results verify the effectiveness of the proposed algorithm, which improves the recognition accuracy and ensures the robustness of algorithm.

  17. Classification of EEG signals using a greedy algorithm for constructing a committee of weak classifiers

    International Nuclear Information System (INIS)

    A greedy algorithm has been proposed for constructing a committee of weak EEG classifiers, each working in the simplest one-dimensional feature space. It has been shown that the classification accuracy of the committee is several times higher than that of the best weak classifier.
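
The greedy construction can be sketched with one-dimensional threshold stumps as the weak classifiers; the stump form and the majority vote are assumptions consistent with, but not guaranteed identical to, the paper's committee.

```python
def greedy_committee(X, y, size=5):
    """Greedily build a committee of one-dimensional threshold
    classifiers (feature, threshold, sign) for labels in {-1, +1}:
    at each round, add the weak classifier that maximizes the
    majority-vote accuracy of the enlarged committee."""
    def weak(x, clf):
        feat, thr, sign = clf
        return sign if x[feat] >= thr else -sign

    def acc(committee):
        hits = 0
        for x, yi in zip(X, y):
            vote = sum(weak(x, clf) for clf in committee)
            hits += (1 if vote >= 0 else -1) == yi
        return hits / len(X)

    # Candidate thresholds are the observed feature values themselves.
    candidates = [(f, x[f], s) for f in range(len(X[0]))
                  for x in X for s in (1, -1)]
    committee = []
    for _ in range(size):
        best = max(candidates, key=lambda c: acc(committee + [c]))
        committee.append(best)
    return committee, acc(committee)
```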

  18. A Novel Algorithm for Imbalance Data Classification Based on Neighborhood Hypergraph

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2014-01-01

    Full Text Available The classification problem for imbalanced data has been receiving increasing attention. Many significant methods have been proposed and applied in many fields, but more efficient methods are still needed. Although the hypergraph is an efficient tool for knowledge discovery, it may not be powerful enough to deal with data in the boundary region. In this paper, the neighborhood hypergraph is presented, combining rough set theory with hypergraphs. A novel classification algorithm for imbalanced data based on the neighborhood hypergraph is then developed, composed of three steps: initialization of hyperedges, classification of the training data set, and substitution of hyperedges. In an experiment of 10-fold cross-validation on 18 data sets, the proposed algorithm achieved higher average accuracy than the others.

  19. Synthesis of supervised classification algorithm using intelligent and statistical tools

    Directory of Open Access Journals (Sweden)

    Ali Douik

    2009-09-01

    Full Text Available A fundamental task in detecting foreground objects in both static and dynamic scenes is to make the best choice of color-system representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football sporting event. Per-pixel segmentation concerns many applications, and the method proved robust in detecting objects even in the presence of strong shadows and highlights. On the other hand, to refine their playing strategy in sports such as football, handball, volleyball or rugby, the coach needs a maximum of technical-tactical information about the on-going game and the players. We therefore propose a range of algorithms for the problems arising in the automated process of team identification, where each player is assigned to his corresponding team based on visual data. The developed system was tested on a match of the Tunisian national competition. This work is relevant to many future computer vision studies, as detailed in this paper.

  20. Synthesis of supervised classification algorithm using intelligent and statistical tools

    CERN Document Server

    Douik, Ali

    2009-01-01

    A fundamental task in detecting foreground objects in both static and dynamic scenes is to make the best choice of color-system representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football sporting event. Per-pixel segmentation concerns many applications, and the method proved robust in detecting objects even in the presence of strong shadows and highlights. On the other hand, to refine their playing strategy in sports such as football, handball, volleyball, rugby..., the coach needs a maximum of technical-tactical information about the on-going game and the players. We propose in this paper a range of algorithms for the problems arising in the automated process of team identification, where each player is assigned to his corresponding team based on visual data. The developed system was tested on a match of the Tunisian national c...

  1. Land use mapping from CBERS-2 images with open source tools by applying different classification algorithms

    Science.gov (United States)

    Sanhouse-García, Antonio J.; Rangel-Peraza, Jesús Gabriel; Bustos-Terrones, Yaneth; García-Ferrer, Alfonso; Mesas-Carrascosa, Francisco J.

    2016-02-01

    Land cover classification is often based on differences between classes combined with great homogeneity within each class. Land cover is mapped through field work or by means of processing satellite images. Field work involves high costs, so digital image processing techniques have become an important alternative for this task. However, in some developing countries, and particularly in the Casacoima municipality in Venezuela, geographic information systems are lacking because of outdated information and the high cost of software license acquisition. This research proposes a low-cost methodology to develop thematic mapping of local land use and cover types in areas with scarce resources. Thematic maps were developed from CBERS-2 images and spatial information available on the network using open source tools. Supervised per-pixel and per-region classification was applied using different algorithms and comparing them among themselves. Per-pixel classification was based on the Maxver (maximum likelihood) and Euclidean distance (minimum distance) algorithms, while per-region classification was based on the Bhattacharyya algorithm. Satisfactory results were obtained from the per-region classification, with an overall reliability of 83.93% and a kappa index of 0.81. The Maxver algorithm showed a reliability of 73.36% and a kappa index of 0.69, while Euclidean distance obtained 67.17% and 0.61, respectively. The proposed methodology proved very useful for cartographic processing and updating, which in turn supports the development of management and land-use plans. Hence, open source tools are an economically viable alternative not only for forestry organizations but for the general public, allowing projects in economically depressed and/or environmentally threatened areas.
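
The minimum-distance (Euclidean) per-pixel classifier used in the comparison can be sketched in a few lines; the band values and class names below are made up for illustration.

```python
def class_means(samples):
    """Mean spectral signature per class from training pixels.
    `samples` maps class label -> list of per-band pixel tuples."""
    means = {}
    for label, pixels in samples.items():
        n = len(pixels)
        means[label] = [sum(p[b] for p in pixels) / n
                        for b in range(len(pixels[0]))]
    return means

def min_distance_classify(pixel, means):
    """Assign the pixel to the class with the nearest mean signature
    (squared Euclidean distance; the square root is monotone, so it
    can be skipped)."""
    return min(means, key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(pixel, means[c])))
```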

  2. An arrhythmia classification algorithm using a dedicated wavelet adapted to different subjects

    Directory of Open Access Journals (Sweden)

    Min Se Dong

    2011-06-01

    Full Text Available Abstract Background Numerous studies on heartbeat classification algorithms have been conducted over the past several decades. Many algorithms aim for robust performance, since biosignals vary considerably among individuals. Various methods have been proposed to reduce the differences arising from personal characteristics, but these can amplify the differences caused by arrhythmia. Methods In this paper, an arrhythmia classification algorithm using a dedicated wavelet adapted to individual subjects is proposed. We reduced the performance variation by using dedicated wavelets matched to the ECG morphologies of the subjects. The proposed algorithm uses morphological filtering and a continuous wavelet transform with a dedicated wavelet. Principal component analysis and linear discriminant analysis were used to compress the morphological data transformed by the dedicated wavelets. An extreme learning machine was used as the classifier. Results A performance evaluation was conducted with the MIT-BIH arrhythmia database. The results showed a high sensitivity of 97.51%, specificity of 85.07%, accuracy of 97.94%, and a positive predictive value of 97.26%. Conclusions The proposed algorithm achieves better accuracy than other state-of-the-art algorithms with no intrasubject overlap between the training and evaluation datasets, and it significantly reduces the amount of intervention needed from physicians.
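
    The PCA compression step used above can be sketched in a few lines via the SVD; the function name and toy data are illustrative, not the paper's implementation:

```python
import numpy as np

def pca_compress(X, k):
    """Project rows of X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T                        # compressed representation
    return scores, Vt[:k]

# rank-1 toy data: a single principal direction carries all the variance
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
scores, comps = pca_compress(X, 1)
```

For this rank-1 toy data, the single retained component reconstructs the centered data exactly.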

  3. Image processing and classification algorithm for yeast cell morphology in a microfluidic chip

    Science.gov (United States)

    Yang Yu, Bo; Elbuken, Caglar; Ren, Carolyn L.; Huissoon, Jan P.

    2011-06-01

    The study of yeast cell morphology requires consistent identification of cell cycle phases based on cell bud size. A computer-based image processing algorithm was designed to automatically classify microscopic images of yeast cells in a microfluidic channel environment. The images were enhanced to reduce background noise, and a robust segmentation algorithm was developed to extract geometrical features including compactness, axis ratio, and bud size. These features were then used for classification, and the accuracy of various machine-learning classifiers was compared. The classifiers used in this experiment were the linear support vector machine, distance-based classification, and the k-nearest-neighbor algorithm. The performance of the system under various illumination and focusing conditions was also tested. The results suggest it is possible to automatically classify yeast cells based on their morphological characteristics even with noisy, low-contrast images.
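
    Geometrical features like the axis ratio can be estimated directly from a binary segmentation mask. A minimal sketch under our own naming, using covariance eigenvalues of the pixel coordinates rather than the paper's exact procedure:

```python
import numpy as np

def shape_features(mask):
    """Area and major/minor axis ratio of a binary object mask,
    estimated from the eigenvalues of the pixel-coordinate covariance."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    cov = np.cov(np.stack([xs, ys]).astype(float))
    evals = np.sort(np.linalg.eigvalsh(cov))
    axis_ratio = float(np.sqrt(evals[1] / max(evals[0], 1e-12)))
    return area, axis_ratio

# synthetic elongated object: an ellipse twice as long as it is wide
yy, xx = np.mgrid[-15:16, -15:16]
ellipse = (xx / 10.0) ** 2 + (yy / 5.0) ** 2 <= 1.0
area, ratio = shape_features(ellipse)
```

The recovered axis ratio is close to the true 2:1 shape, up to pixel discretization.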

  4. Robust algorithm for arrhythmia classification in ECG using extreme learning machine

    Directory of Open Access Journals (Sweden)

    Shin Kwangsoo

    2009-10-01

    Full Text Available Abstract Background Recently, extensive studies have been carried out on arrhythmia classification algorithms using artificial intelligence pattern recognition methods such as neural networks. To improve practicality, many studies have focused on the learning speed and accuracy of neural networks. However, algorithms based on neural networks still have some problems concerning practical application, such as slow learning speeds and unstable performance caused by local minima. Methods In this paper we propose a novel arrhythmia classification algorithm which has a fast learning speed and high accuracy, and uses Morphology Filtering, Principal Component Analysis, and the Extreme Learning Machine (ELM). The proposed algorithm can classify six beat types: normal beat, left bundle branch block, right bundle branch block, premature ventricular contraction, atrial premature beat, and paced beat. Results The experimental results on the entire MIT-BIH arrhythmia database demonstrate that the proposed algorithm achieves 98.00% average sensitivity, 97.95% average specificity, and 98.72% average accuracy. These accuracy levels are higher than or comparable with those of existing methods. We conducted a comparative study of algorithms using an ELM, a back-propagation neural network (BPNN), a radial basis function network (RBFN), and a support vector machine (SVM). In terms of learning time, the proposed algorithm using the ELM is about 290, 70, and 3 times faster than algorithms using a BPNN, an RBFN, and an SVM, respectively. Conclusion The proposed algorithm shows effective accuracy with a short learning time. In addition, we ascertained its robustness by evaluating it on the entire MIT-BIH arrhythmia database.
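
    The speed advantage of an ELM comes from training in one shot: hidden-layer weights are random and only the output layer is solved, in closed form. A toy sketch (names and data are ours, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, Y, n_hidden=50):
    """Extreme learning machine: a random hidden layer plus a
    least-squares output layer solved in closed form (no iteration)."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                  # random nonlinear feature map
    beta = np.linalg.pinv(H) @ Y            # output weights via pseudoinverse
    return W, b, beta

def elm_predict(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# two well-separated clusters with one-hot targets
X = np.r_[rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))]
Y = np.eye(2)[[0] * 20 + [1] * 20]
model = elm_train(X, Y)
pred = elm_predict(X, model)
```

There is no gradient descent anywhere, which is why ELM training is orders of magnitude faster than a BPNN.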

  5. Seasonal cultivated and fallow cropland mapping using MODIS-based automated cropland classification algorithm

    Science.gov (United States)

    Wu, Zhuoting; Thenkabail, Prasad S.; Mueller, Rick; Zakzeski, Audra; Melton, Forrest; Johnson, Lee; Rosevelt, Carolyn; Dwyer, John; Jones, Jeanine; Verdin, James P.

    2013-01-01

    Increasing drought occurrences and growing populations demand accurate, routine, and consistent cultivated and fallow cropland products to enable water and food security analysis. The overarching goal of this research was to develop and test an automated cropland classification algorithm (ACCA) that provides accurate, consistent, and repeatable information on seasonal cultivated as well as seasonal fallow cropland extents and areas based on Moderate Resolution Imaging Spectroradiometer (MODIS) remote sensing data. The seasonal ACCA development process involves writing a series of iterative decision-tree codes to separate cultivated and fallow croplands from noncroplands, aiming to accurately mirror reliable reference data sources. A pixel-by-pixel accuracy assessment against U.S. Department of Agriculture (USDA) cropland data showed, on average, a producer's accuracy of 93% and a user's accuracy of 85% across all months. Further, ACCA-derived cropland maps agreed well with USDA Farm Service Agency crop acreage-reported data for both cultivated and fallow croplands, with R-square values over 0.7, and with field surveys, with an accuracy of ≥95% for cultivated croplands and ≥76% for fallow croplands. Our results demonstrate the ability of ACCA to generate cropland products, such as cultivated and fallow cropland extents and areas, accurately, automatically, and repeatedly throughout the growing season.
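
    ACCA's decision-tree codes are cascades of threshold rules on time-series metrics. A toy illustration of the idea only; the thresholds and metric names below are invented for illustration and are not the published rules:

```python
def acca_rule(ndvi_max, ndvi_mean):
    """Toy two-rule decision cascade in the spirit of ACCA's iterative
    decision-tree codes. Thresholds are illustrative placeholders."""
    if ndvi_max < 0.3:          # never greens up during the season
        return "noncropland"
    if ndvi_mean < 0.4:         # cropland that stayed bare this season
        return "fallow cropland"
    return "cultivated cropland"
```

In the real algorithm, many such rules are tuned iteratively until the output mirrors the reference data.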

  6. Classification of Aerosol Retrievals from Spaceborne Polarimetry Using a Multiparameter Algorithm

    Science.gov (United States)

    Russell, Philip B.; Kacenelenbogen, Meloe; Livingston, John M.; Hasekamp, Otto P.; Burton, Sharon P.; Schuster, Gregory L.; Johnson, Matthew S.; Knobelspiesse, Kirk D.; Redemann, Jens; Ramachandran, S.; Holben, Brent

    2013-01-01

    In this presentation, we demonstrate the application of a new aerosol classification algorithm to retrievals from the POLDER-3 polarimeter on the PARASOL spacecraft. Motivation and method: Since the advent of global aerosol measurements by satellites and AERONET, classification of observed aerosols into several types (e.g., urban-industrial, biomass burning, mineral dust, maritime, and various subtypes or mixtures of these) has proven useful for understanding aerosol sources, transformations, effects, and feedback mechanisms; for improving the accuracy of satellite retrievals; and for quantifying assessments of aerosol radiative impacts on climate.

  7. A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

    Directory of Open Access Journals (Sweden)

    Lavner Yizhar

    2009-01-01

    Full Text Available We present an efficient algorithm for segmentation of audio signals into speech or music. The central motivation for our study is consumer audio applications, where various real-time enhancements are often applied. The algorithm consists of a learning phase and a classification phase. In the learning phase, predefined training data is used for computing various time-domain and frequency-domain features, for speech and music signals separately, and for estimating the optimal speech/music thresholds based on the probability density functions of the features. An automatic procedure is employed to select the best features for separation. In the classification phase, initial classification is performed for each segment of the audio signal using a three-stage, sieve-like approach, applying both Bayesian and rule-based methods. To avoid erroneous rapid alternations in the classification, a smoothing technique is applied, averaging the decision on each segment with past segment decisions. Extensive evaluation of the algorithm, on a database of more than 12 hours of speech and more than 22 hours of music, showed correct identification rates of 99.4% and 97.8%, respectively, and quick adjustment to alternating speech/music sections. In addition to its accuracy and robustness, the algorithm can be easily adapted to different audio types and is suitable for real-time operation.
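
    The smoothing step described above, averaging each segment's decision with past segment decisions, can be sketched as a sliding majority vote (window size and names are ours, not the paper's):

```python
import numpy as np

def smooth_decisions(raw, window=3):
    """Majority-vote smoothing of per-segment speech(1)/music(0)
    decisions over a window of past segments, suppressing rapid flips."""
    raw = np.asarray(raw)
    out = raw.copy()
    for i in range(len(raw)):
        lo = max(0, i - window + 1)
        out[i] = 1 if raw[lo:i + 1].mean() >= 0.5 else 0
    return out

# one spurious music decision (index 2) inside a speech run is removed
smoothed = smooth_decisions([1, 1, 0, 1, 1, 1, 0, 0, 0, 0])
```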

  8. Comparison of some classification algorithms based on deterministic and nondeterministic decision rules

    KAUST Repository

    Delimata, Paweł

    2010-01-01

    We discuss two, in a sense extreme, kinds of nondeterministic rules in decision tables. The first kind, called inhibitory rules, block only one decision value (i.e., they have all but one of the possible decisions on their right-hand sides). In contrast, any rule of the second kind, called a bounded nondeterministic rule, can have only a few decisions on its right-hand side. We show that both kinds of rules can be used to improve the quality of classification. In the paper, two lazy classification algorithms of polynomial time complexity are considered. These algorithms are based on deterministic and inhibitory decision rules, but direct generation of the rules is not required. Instead, for any new object, the considered algorithms efficiently extract from a given decision table some information about the set of rules. This information is then used by a decision-making procedure. The reported results of experiments show that the algorithms based on inhibitory decision rules are often better than those based on deterministic decision rules. We also present an application of bounded nondeterministic rules in the construction of rule-based classifiers. We include the results of experiments showing that by combining rule-based classifiers based on minimal decision rules with bounded nondeterministic rules having confidence close to 1 and sufficiently large support, it is possible to improve the classification quality. © 2010 Springer-Verlag.

  9. Human Talent Prediction in HRM using C4.5 Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Hamidah Jantan,

    2010-11-01

    Full Text Available In HRM, one of the challenges for HR professionals is to manage an organization's talent, especially ensuring the right person is in the right job at the right time. Human talent prediction is an alternative way to handle this issue. For that reason, classification and prediction in data mining, which is commonly used in many areas, can also be applied to human talent. There are many classification techniques in data mining, such as decision trees, neural networks, rough set theory, Bayesian theory, and fuzzy logic. The decision tree is among the most popular classification techniques, as it produces interpretable rules or logic statements. The generated rules from the selected technique can be used for future prediction. In this article, we present a study on how potential human talent can be predicted using a decision tree classifier. With this technique, patterns of talent performance can be identified through the classification process, and the hidden, valuable knowledge discovered in the related databases is summarized in the decision tree structure. In this study, we use the C4.5 decision tree classification algorithm to generate classification rules from human talent performance records. Finally, the generated rules are evaluated on unseen data in order to estimate the accuracy of the prediction result.
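
    C4.5 chooses splits by gain ratio rather than raw information gain, penalizing attributes with many distinct values. A minimal sketch of that criterion (function names are ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature, labels):
    """C4.5 split criterion: information gain normalized by split
    information, which penalizes many-valued attributes."""
    n = len(labels)
    gain, split_info = entropy(labels), 0.0
    for v in set(feature):
        subset = [y for f, y in zip(feature, labels) if f == v]
        p = len(subset) / n
        gain -= p * entropy(subset)
        split_info -= p * math.log2(p)
    return gain / split_info if split_info else 0.0
```

A perfectly predictive attribute scores 1.0; an uninformative one scores 0.0.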

  10. Application of the probability-based covering algorithm model in text classification

    Institute of Scientific and Technical Information of China (English)

    ZHOU; Ying

    2009-01-01

    The probability-based covering algorithm (PBCA) is a new algorithm based on probability distributions. It decides, by voting, the class of tested samples on the border of the coverage area, based on the probability of the training samples. When the original covering algorithm (CA) is used, many tested samples located on the border of the coverage cannot be classified by the spherical neighborhoods obtained. The network structure of the PBCA is a mixed structure composed of both a feed-forward network and a feedback network. By adding some heterogeneous samples and enlarging the coverage radius, it is possible to decrease the number of rejected samples and improve recognition accuracy. Computer experiments indicate that the algorithm improves precision and achieves reasonably good results in text classification.

  11. Acoustic diagnosis of pulmonary hypertension: automated speech-recognition-inspired classification algorithm outperforms physicians

    Science.gov (United States)

    Kaddoura, Tarek; Vadlamudi, Karunakar; Kumar, Shine; Bobhate, Prashant; Guo, Long; Jain, Shreepal; Elgendi, Mohamed; Coe, James Y.; Kim, Daniel; Taylor, Dylan; Tymchak, Wayne; Schuurmans, Dale; Zemp, Roger J.; Adatia, Ian

    2016-09-01

    We hypothesized that an automated speech-recognition-inspired classification algorithm could differentiate between the heart sounds of subjects with and without pulmonary hypertension (PH) and outperform physicians. Heart sounds, electrocardiograms, and mean pulmonary artery pressures (mPAp) were recorded simultaneously. Heart sound recordings were digitized to train and test speech-recognition-inspired classification algorithms. We used mel-frequency cepstral coefficients to extract features from the heart sounds. Gaussian mixture models classified the features as PH (mPAp ≥ 25 mmHg) or normal (mPAp < 25 mmHg). Physicians blinded to patient data listened to the same heart sound recordings and attempted a diagnosis. We studied 164 subjects: 86 with mPAp ≥ 25 mmHg (mPAp 41 ± 12 mmHg) and 78 with mPAp < 25 mmHg (mPAp 17 ± 5 mmHg) (p < 0.005). The correct diagnostic rate of the automated speech-recognition-inspired algorithm was 74%, compared to 56% by physicians (p = 0.005). The false positive rate for the algorithm was 34% versus 50% (p = 0.04) for clinicians. The false negative rate for the algorithm was 23% versus 68% (p = 0.0002) for physicians. We developed an automated speech-recognition-inspired classification algorithm for the acoustic diagnosis of PH that outperforms physicians and could be used to screen for PH and encourage earlier specialist referral.
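
    The Gaussian-mixture decision step, labeling a recording by whichever model assigns its features higher likelihood, can be sketched as follows. The toy single-component models below stand in for the trained PH and normal mixtures; names and parameters are ours:

```python
import numpy as np

def gmm_loglik(x, weights, means, variances):
    """Total log-likelihood of feature frames x (n, d) under a
    diagonal-covariance Gaussian mixture."""
    x = np.atleast_2d(np.asarray(x, float))
    comps = []
    for w, mu, var in zip(weights, means, variances):
        mu, var = np.asarray(mu, float), np.asarray(var, float)
        ll = -0.5 * (np.log(2 * np.pi * var).sum()
                     + ((x - mu) ** 2 / var).sum(axis=1))
        comps.append(np.log(w) + ll)
    # log-sum-exp over components, then sum over frames
    return float(np.logaddexp.reduce(comps, axis=0).sum())

def classify(x, model_ph, model_normal):
    """Label a recording PH if the PH mixture explains it better."""
    if gmm_loglik(x, *model_ph) > gmm_loglik(x, *model_normal):
        return "PH"
    return "normal"

# toy 1-D single-component models centered at 0 (PH) and 5 (normal)
ph = ([1.0], [[0.0]], [[1.0]])
normal = ([1.0], [[5.0]], [[1.0]])
```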

  12. Recent processing string and fusion algorithm improvements for automated sea mine classification in shallow water

    Science.gov (United States)

    Aridgides, Tom; Fernandez, Manuel F.; Dobeck, Gerald J.

    2003-09-01

    A novel sea mine computer-aided-detection / computer-aided-classification (CAD/CAC) processing string has been developed. The overall CAD/CAC processing string consists of pre-processing, adaptive clutter filtering (ACF), normalization, detection, feature extraction, feature orthogonalization, optimal subset feature selection, classification and fusion processing blocks. The range-dimension ACF is matched both to average highlight and shadow information, while also adaptively suppressing background clutter. For each detected object, features are extracted and processed through an orthogonalization transformation, enabling an efficient application of the optimal log-likelihood-ratio-test (LLRT) classification rule, in the orthogonal feature space domain. The classified objects of 4 distinct processing strings are fused using the classification confidence values as features and logic-based, "M-out-of-N", or LLRT-based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new shallow water high-resolution sonar imagery data. The processing string detection and classification parameters were tuned and the string classification performance was optimized, by appropriately selecting a subset of the original feature set. A significant improvement was made to the CAD/CAC processing string by utilizing a repeated application of the subset feature selection / LLRT classification blocks. It was shown that LLRT-based fusion algorithms outperform the logic based and the "M-out-of-N" ones. The LLRT-based fusion of the CAD/CAC processing strings resulted in up to a nine-fold false alarm rate reduction, compared to the best single CAD/CAC processing string results, while maintaining a constant correct mine classification probability.

  13. A simulation of remote sensor systems and data processing algorithms for spectral feature classification

    Science.gov (United States)

    Arduini, R. F.; Aherron, R. M.; Samms, R. W.

    1984-01-01

    A computational model of the deterministic and stochastic processes involved in multispectral remote sensing was designed to evaluate the performance of sensor systems and data processing algorithms for spectral feature classification. Accuracy in distinguishing between categories of surfaces or between specific types is developed as a means to compare sensor systems and data processing algorithms. The model allows studies to be made of the effects of variability of the atmosphere and of surface reflectance, as well as the effects of channel selection and sensor noise. Examples of these effects are shown.

  14. Research on Text Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    王斌; 赵智超; 邵华清

    2011-01-01

    With the rapid development of Internet technology, information processing has become an indispensable tool for obtaining useful information. This paper gives a general introduction to the concept of text classification and the classification process, and introduces the basic ideas, applicable domains, and advantages and disadvantages of several typical text classification algorithms.

  15. Survey on Parameters of Fingerprint Classification Methods Based On Algorithmic Flow

    Directory of Open Access Journals (Sweden)

    Dimple Parekh

    2011-09-01

    Full Text Available Classification refers to assigning a given fingerprint to one of the existing classes already recognized in the literature. A search over all the records in a database takes a long time, so the goal is to reduce the size of the search space by choosing an appropriate subset of the database to search. Classifying fingerprint images is a very difficult pattern recognition problem, due to the minimal interclass variability and maximal intraclass variability. This paper presents a sequence flow diagram to help clarify the design of classification algorithms based on various parameters extracted from the fingerprint image. It briefly discusses the ways in which these parameters are extracted from the image. Existing fingerprint classification approaches take these parameters as input for classifying the image. Parameters such as the orientation map, singular points, spurious singular points, ridge flow, transforms, and hybrid features are discussed in the paper.

  16. Study on the classification algorithm of degree of arteriosclerosis based on fuzzy pattern recognition

    Science.gov (United States)

    Ding, Li; Zhou, Runjing; Liu, Guiying

    2010-08-01

    The pulse wave of the human body contains a large amount of physiological and pathological information, so a classification algorithm for the degree of arteriosclerosis based on fuzzy pattern recognition is studied in this paper. Taking the human pulse wave as the research object, we extract time-domain and frequency-domain characteristics of the pulse signal and select the parameters with the best clustering effect for arteriosclerosis identification. Moreover, the validity of the characteristic parameters is verified by the fuzzy ISODATA clustering method (FISOCM). Finally, the fuzzy pattern recognition system can quantitatively distinguish the degree of arteriosclerosis in patients. Testing on the 50 samples in the assembled pulse database shows that the algorithm is practical and achieves good classification and recognition results.
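
    Fuzzy ISODATA builds on fuzzy c-means-style soft clustering, alternating soft-membership and centroid updates. A compact sketch of plain fuzzy c-means (parameter values, names, and toy data are illustrative, not the paper's):

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means: alternate soft-membership and centroid
    updates until the partition stabilizes."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)              # random soft partition
    for _ in range(iters):
        um = U ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        U = 1.0 / d ** (2.0 / (m - 1.0))           # closer centers get
        U /= U.sum(axis=1, keepdims=True)          # larger membership
    return centers, U

# two well-separated 1-D clusters
X = np.array([[0.0], [0.5], [1.0], [9.0], [9.5], [10.0]])
centers, U = fuzzy_cmeans(X)
lo, hi = np.sort(centers[:, 0])
```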

  17. A low-latency Glitch Classification Algorithm Based on Waveform Morphology

    Science.gov (United States)

    Gabbard, Hunter; Mukherjee, Soma; Stone, Robert

    2016-03-01

    We present a novel and efficient algorithm for the classification of signals that arise in the gravitational wave channels of the Laser Interferometer Gravitational-Wave Observatory (LIGO). Using data from LIGO's sixth science run (S6), we developed a new glitch classification algorithm based mainly on the morphology of the waveform, along with several other parameters (signal-to-noise ratio (SNR), duration, bandwidth, etc.). This is done using two novel methods: Kohonen Self-Organizing Feature Maps (SOMs) and discrete wavelet transform coefficients. This study shows the feasibility of utilizing unsupervised machine learning techniques (SOMs) to display a multidimensional trigger set in a low-latency, two-dimensional format. UTRGV NSF REU Program.

  18. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by Combining Landsat, MODIS, and Secondary Data

    OpenAIRE

    Prasad S. Thenkabail; Zhuoting Wu

    2012-01-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan u...

  19. Determination of Optimum Classification System for Hyperspectral Imagery and LIDAR Data Based on Bees Algorithm

    Science.gov (United States)

    Samadzadega, F.; Hasani, H.

    2015-12-01

    Hyperspectral imagery is a rich source of spectral information and plays a very important role in the discrimination of similar land-cover classes. In the past, several efforts have investigated improving hyperspectral imagery classification. Recently, interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR provides structural information about the scene while hyperspectral imagery provides spectral and spatial information. The complementary information in LiDAR and hyperspectral data may greatly improve classification performance, especially in complex urban areas. In this paper, feature-level fusion of hyperspectral and LiDAR data is proposed: spectral and structural features are extracted from both datasets, and a hybrid feature space is generated by feature stacking. A Support Vector Machine (SVM) classifier is applied to the hybrid feature space to classify the urban area. To optimize classification performance, two issues must be considered: determining the SVM parameter values and selecting the feature subset. The Bees Algorithm (BA), a powerful meta-heuristic optimization algorithm, is applied to determine the optimum SVM parameters and select the optimum feature subset simultaneously. The obtained results show that the proposed method can improve classification accuracy while significantly reducing the dimension of the feature space.

  20. Real-time algorithms for human versus animal classification using a pyroelectric sensor

    Science.gov (United States)

    Hossen, Jakir; Jacobs, Eddie; Chari, Srikant

    2013-06-01

    Classification of human and animal targets imaged by a linear pyroelectric array sensor presents some unique challenges, especially in target segmentation and feature extraction. In this paper, we apply two approaches to this problem. Both techniques start with the variational energy functional level-set segmentation technique to separate the object from the background. After segmentation, the first technique extracts features such as texture, invariant moments, edges, shape information, and the spectral content of the segmented object. These features are fed to classifiers including Naïve Bayes (NB) and Support Vector Machine (SVM) for human-versus-animal classification. In the second technique, the speeded-up robust feature (SURF) extraction algorithm is applied to the segmented objects, and a codebook technique is used to classify objects based on SURF features. Human and animal data acquired using the pyroelectric sensor in different terrains are used for performance evaluation of the algorithms. The evaluation indicates that the features extracted in the first technique, in conjunction with the NB classifier, provide the highest classification rates. While the SURF-plus-codebook approach provides a slightly lower classification rate, it offers better computational efficiency, lending itself to real-time implementation.

  1. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis

    Science.gov (United States)

    Lo, Benjamin W. Y.; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R. Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A. H.

    2016-01-01

    Background: Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). Methods: The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Results: Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split-sample validation revealed classification accuracies of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1 week post-aneurysmal SAH was associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56–2.45). This prognostic decision-making algorithm also shed light on the complex interactions among a number of risk factors in determining outcome after aneurysmal SAH. PMID:27512607
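
    Recursive partitioning repeatedly picks the split that most purifies the outcome in each subgroup. A minimal sketch of one such split search using Gini impurity, CART's usual criterion (function names and toy data are ours):

```python
def gini(labels):
    """Gini impurity of a label list (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Scan candidate thresholds on one predictor for the binary split
    that minimizes weighted child impurity -- the step that recursive
    partitioning repeats to grow the tree."""
    best = (None, gini(labels))
    for t in sorted(set(values)):
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# a predictor that separates outcomes perfectly at threshold 3
t, impurity = best_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
```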

  2. Evolving Neural Network Using Variable String Genetic Algorithm for Color Infrared Aerial Image Classification

    Institute of Scientific and Technical Information of China (English)

    FU Xiaoyang; P E R Dale; ZHANG Shuqing

    2008-01-01

    Coastal wetlands are characterized by complex patterns in both their geomorphic and ecological features. Besides field observations, it is necessary to analyze wetland land cover through color infrared (CIR) aerial photography or remote sensing images. In this paper, we designed an evolving neural network classifier using a variable-string genetic algorithm (VGA) for the land cover classification of CIR aerial images. With the VGA, the classifier is able to automatically evolve the appropriate number of hidden nodes, modeling the neural network topology optimally, and to find a near-optimal set of connection weights globally. Then, with the backpropagation (BP) algorithm, it finds the best connection weights. The VGA-BP classifier, derived from the hybrid algorithms mentioned above, is demonstrated to be effective for CIR image classification. Compared with standard classifiers, such as the Bayes maximum-likelihood classifier, the VGA classifier, and the BP-MLP (multi-layer perceptron) classifier, the VGA-BP classifier shows better performance on high-resolution land cover classification.

  3. A Hybrid Multiobjective Differential Evolution Algorithm and Its Application to the Optimization of Grinding and Classification

    Directory of Open Access Journals (Sweden)

    Yalin Wang

    2013-01-01

    Full Text Available Grinding-classification is the prerequisite process for full recovery of nonrenewable minerals, with both production quality and quantity objectives of concern. Its natural formulation is a constrained multiobjective optimization problem with a complex expression, since the process is composed of one grinding machine and two classification machines. In this paper, a hybrid differential evolution (DE) algorithm with multiple populations is proposed. Some infeasible solutions with better performance are allowed to be saved and participate randomly in the evolution. In order to exploit these meaningful infeasible solutions, a functionally partitioned multi-population mechanism is designed to search for an optimal solution from all possible directions. Meanwhile, a simplex method for local search is inserted into the evolution process to enhance the search strategy. Simulation results on some benchmark problems indicate that the proposed algorithm tends to converge quickly and effectively to the Pareto frontier with a good distribution. Finally, the proposed algorithm is applied to solve a multiobjective optimization model of a grinding and classification process. Based on the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), a satisfactory solution is obtained using a decision-making method for multiple attributes.
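
    TOPSIS picks, among candidate solutions, the one closest to an ideal point and farthest from an anti-ideal point. A minimal sketch (function name and toy data are ours, not the paper's model):

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Rank alternatives (rows) against criteria (columns).
    benefit[j] is True when larger values of criterion j are better."""
    M = np.asarray(matrix, float)
    M = M / np.linalg.norm(M, axis=0)            # vector-normalize criteria
    M = M * np.asarray(weights, float)
    benefit = np.asarray(benefit)
    ideal = np.where(benefit, M.max(axis=0), M.min(axis=0))
    worst = np.where(benefit, M.min(axis=0), M.max(axis=0))
    d_best = np.linalg.norm(M - ideal, axis=1)
    d_worst = np.linalg.norm(M - worst, axis=1)
    return d_worst / (d_best + d_worst)          # closeness: higher is better

# three alternatives, two benefit criteria: the third dominates
scores = topsis([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
                weights=[0.5, 0.5], benefit=[True, True])
```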

  4. Classification of Medical Datasets Using SVMs with Hybrid Evolutionary Algorithms Based on Endocrine-Based Particle Swarm Optimization and Artificial Bee Colony Algorithms.

    Science.gov (United States)

    Lin, Kuan-Cheng; Hsieh, Yi-Hsiu

    2015-10-01

    The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a feature subset selection problem, i.e., a combinatorial optimization problem. Evolutionary algorithms using random search methods have proven highly effective at solving optimization problems in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms, in conjunction with a support vector machine (SVM), for the selection of optimal feature subsets for the classification of datasets. The results of experiments using specific UCI medical datasets demonstrate that the accuracy of the proposed hybrid evolutionary algorithm is superior to that of the basic PSO, EPSO, and ABC algorithms with regard to classification accuracy using subsets with a reduced number of features. PMID:26289628

  6. EVALUATION OF SOUND CLASSIFICATION USING MODIFIED CLASSIFIER AND SPEECH ENHANCEMENT USING ICA ALGORITHM FOR HEARING AID APPLICATION

    Directory of Open Access Journals (Sweden)

    N. Shanmugapriya

    2016-03-01

    Full Text Available Hearing aid users are exposed to diverse acoustic scenarios, so sound classification algorithms become a vital factor in providing a good listening experience. In this work, an approach is proposed to improve speech quality in hearing aids based on the Independent Component Analysis (ICA) algorithm combined with modified speech signal classification methods. The proposed algorithm yields better speech intelligibility than other existing algorithms, and this result has been confirmed by intelligibility experiments. The ICA algorithm, together with a modified Bayesian classifier and an Adaptive Neuro-Fuzzy Inference System (ANFIS), improves speech quality, and this classification increases the noise resistance of the new speech processing algorithm proposed in this work. The results indicate that the new modified classifier is feasible for hearing aid applications.

  7. The Application of Multiobjective Genetic Algorithm to the Parameter Optimization of Single-Well Potential Stochastic Resonance Algorithm Aimed at Simultaneous Determination of Multiple Weak Chromatographic Peaks

    Directory of Open Access Journals (Sweden)

    Haishan Deng

    2014-01-01

    Full Text Available Simultaneous determination of multiple weak chromatographic peaks via stochastic resonance algorithms has attracted much attention in recent years. However, the optimization of the parameters is complicated and time consuming, although the single-well potential stochastic resonance algorithm (SSRA) has already reduced the number of parameters to only one and simplified the process significantly. Even worse, it is often difficult to preserve good peak shape in the amplified peaks. Therefore, a multiobjective genetic algorithm was employed to optimize the parameter of SSRA for multiple optimization objectives (i.e., S/N and peak shape) and multiple chromatographic peaks. The applicability of the proposed method was evaluated with an experimental data set of Sudan dyes, and the results showed an excellent quantitative relationship between different concentrations and responses.
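The single-well stochastic resonance idea can be illustrated with a minimal sketch: an overdamped system in a quartic single-well potential, dx/dt = -b·x³ + s(t), driven by the noisy chromatogram, with b as the single tunable parameter. This is an illustrative reading of SSRA, not necessarily the authors' exact formulation; the signal and parameter values below are hypothetical.

```python
import math
import random

random.seed(1)

def ssra(signal, b, dt=0.01):
    """Euler integration of dx/dt = -b*x**3 + s(t): a particle in a
    single-well (quartic) potential driven by the noisy input signal.
    The cubic damping suppresses broadband noise while the sustained
    push under a peak builds up a smoothed, enhanced response."""
    x, out = 0.0, []
    for s in signal:
        x += dt * (-b * x ** 3 + s)
        out.append(x)
    return out

# Hypothetical weak Gaussian chromatographic peak buried in noise.
n = 2000
peak = [2.0 * math.exp(-((i - n / 2) ** 2) / (2 * 50 ** 2)) for i in range(n)]
noisy = [p + random.gauss(0, 1.0) for p in peak]
amplified = ssra(noisy, b=5.0)
```

Sweeping b and scoring each output on S/N and peak-shape criteria is exactly the multiobjective search the paper hands to the genetic algorithm.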

  8. A BENCHMARK TO SELECT DATA MINING BASED CLASSIFICATION ALGORITHMS FOR BUSINESS INTELLIGENCE AND DECISION SUPPORT SYSTEMS

    Directory of Open Access Journals (Sweden)

    Pardeep Kumar

    2012-09-01

    Full Text Available In today’s business scenario, we perceive major changes in how managers use computerized support in making decisions. As more decision-makers use computerized support in decision making, decision support systems (DSS) are developing from their beginnings as personal support tools into a common resource in an organization. DSS serve the management, operations, and planning levels of an organization and help in making decisions that may be rapidly changing and not easily specified in advance. Data mining plays a vital role in extracting important information to help the decision making of a decision support system, and has been an active field of research over the last two to three decades. Integration of data mining and decision support systems (DSS) can lead to improved performance and can enable the tackling of new types of problems. Artificial intelligence methods are improving the quality of decision support and have become embedded in many applications, ranging from anti-lock automobile brakes to today's interactive search engines, and they provide various machine learning techniques to support data mining. Classification is one of the main and most valuable tasks of data mining. Several types of classification algorithms have been suggested, tested and compared to determine future trends based on unseen data. No single algorithm has been found to be superior over all others for all data sets. Various issues such as predictive accuracy, training time to build the model, robustness and scalability must be considered and can involve trade-offs, further complicating the quest for an overall superior method. The objective of this paper is to compare various classification algorithms that have been frequently used in data mining for decision support systems. Three decision tree based algorithms, one artificial neural network, one statistical method, one support vector machine with and without AdaBoost, and one clustering algorithm are tested and compared on

  10. Classification decision tree algorithm assisting in diagnosing solitary pulmonary nodule by SPECT/CT fusion imaging

    Institute of Scientific and Technical Information of China (English)

    Qiang Yongqian; Guo Youmin; Jin Chenwang; Liu Min; Yang Aimin; Wang Qiuping; Niu Gang

    2008-01-01

    Objective To develop a classification tree algorithm to improve the diagnostic performance of 99mTc-MIBI SPECT/CT fusion imaging in differentiating solitary pulmonary nodules (SPNs). Methods Forty-four SPNs, including 30 malignant and 14 benign cases that were eventually pathologically identified, were included in this prospective study. All patients received 99mTc-MIBI SPECT/CT scanning at an early stage and a delayed stage before operation. Thirty predictor variables, including 11 clinical variables, 4 variables of emission and 15 variables of transmission information from SPECT/CT scanning, were analyzed independently by the classification tree algorithm and by radiological residents. Diagnostic rules were demonstrated in tree topology, and diagnostic performances were compared using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Results A classification decision tree with a lowest relative cost of 0.340 was developed for 99mTc-MIBI SPECT/CT scanning, in which the target/normal-region ratio of 99mTc-MIBI uptake in the delayed stage and in the early stage, age, cough and spiculation sign were the five most important contributors. The sensitivity and specificity were 93.33% and 78.57%, respectively, slightly higher than those of the expert. The sensitivity and specificity achieved by Grade one residents were 76.67% and 28.57%, respectively. The AUC of CART and of the expert was 0.886±0.055 and 0.829±0.062, respectively, and the corresponding AUC of the residents was 0.566±0.092. Comparisons of AUCs suggest that the performance of CART was similar to that of the expert (P=0.204) but greater than that of the residents (P<0.001). Conclusion Our data mining technique using a classification decision tree has a much higher accuracy than residents. It suggests that the application of this algorithm will significantly improve the diagnostic performance of residents.

  11. Library Event Matching event classification algorithm for electron neutrino interactions in the NOνA detectors

    Science.gov (United States)

    Backhouse, C.; Patterson, R. B.

    2015-04-01

    We describe the Library Event Matching classification algorithm implemented for use in the NOνA νμ →νe oscillation measurement. Library Event Matching, developed in a different form by the earlier MINOS experiment, is a powerful approach in which input trial events are compared to a large library of simulated events to find those that best match the input event. A key feature of the algorithm is that the comparisons are based on all the information available in the event, as opposed to higher-level derived quantities. The final event classifier is formed by examining the details of the best-matched library events. We discuss the concept, definition, optimization, and broader applications of the algorithm as implemented here. Library Event Matching is well-suited to the monolithic, segmented detectors of NOνA and thus provides a powerful technique for event discrimination.
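The core matching idea can be sketched as a nearest-neighbour search over a library of simulated events, classifying by the labels of the best matches. The tiny "hit pattern" library below is a hypothetical illustration, far removed from real NOνA event records, and the squared-difference score is a placeholder for the experiment's actual hit-level comparison metric.

```python
# Hypothetical "library" of simulated events: each event is a small grid of
# calorimeter hit charges (flattened) plus its true class label.
library = [
    ([9, 8, 1, 0], "nu_e"), ([8, 9, 2, 1], "nu_e"), ([7, 9, 1, 1], "nu_e"),
    ([1, 2, 8, 9], "nu_mu"), ([0, 1, 9, 8], "nu_mu"), ([2, 1, 8, 7], "nu_mu"),
]

def match_score(a, b):
    """Dissimilarity of two hit patterns; lower = better match. The real
    algorithm compares all hit-level information, not derived quantities."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(event, n_best=3):
    """Classifier formed from the labels of the best-matched library events."""
    best = sorted(library, key=lambda e: match_score(event, e[0]))[:n_best]
    labels = [lbl for _, lbl in best]
    return max(set(labels), key=labels.count)

print(classify([8, 8, 2, 0]))
```

The expensive part in practice is searching a very large library quickly; the final classifier here is a simple majority over the best matches, whereas the paper examines the matched events in more detail.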

  12. The efficiency of the RULES-4 classification learning algorithm in predicting the density of agents

    Directory of Open Access Journals (Sweden)

    Ziad Salem

    2014-12-01

    Full Text Available Learning is the act of obtaining new, or modifying existing, knowledge, behaviours, skills or preferences. The ability to learn is found in humans, other organisms and some machines. Learning is always based on some sort of observations or data, such as examples, direct experience or instruction. This paper presents a classification algorithm that learns the density of agents in an arena based on the measurements of six proximity sensors of a combined actuator-sensor unit (CASU). Rules are presented that were induced by the learning algorithm, which was trained with data-sets based on the CASU’s sensor data streams collected during a number of experiments with “Bristlebots” (agents) in the arena (environment). It was found that a set of rules generated by the learning algorithm is able to predict the number of bristlebots in the arena based on the CASU’s sensor readings with satisfactory accuracy.

  13. A Wavelet-Based Algorithm for Delineation and Classification of Wave Patterns in Continuous Holter ECG Recordings

    OpenAIRE

    Johannesen, L; Grove, USL; Sørensen, JS; Schmidt, ML; Couderc, J-P; Graff, C

    2010-01-01

    Quantitative analysis of the electrocardiogram (ECG) requires delineation and classification of the individual ECG wave patterns. We propose a wavelet-based waveform classifier that uses the fiducial points identified by a delineation algorithm. For validation of the algorithm, manually annotated ECG records from the QT database (Physionet) were used. ECG waveform classification accuracies were: 85.6% (P-wave), 89.7% (QRS complex), 92.8% (T-wave) and 76.9% (U-wave). The proposed classificatio...
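Wavelet-based delineation rests on a multiscale decomposition of the ECG. As an illustration (the paper's actual wavelet family and delineation rules are not specified here), a single level of the Haar transform splits a signal into pairwise averages (a coarse approximation) and differences (details that highlight sharp deflections such as the QRS complex):

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: pairwise averages
    (approximation) and pairwise differences (detail), each scaled by 1/2."""
    approx = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact reconstruction of the original signal from one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out += [a + d, a - d]
    return out

sig = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 0.0]
approx, detail = haar_step(sig)
print(approx)  # [3.0, 5.0, 2.0, 0.0]
print(detail)  # [1.0, 0.0, -1.0, 0.0]
```

Repeating the step on the approximation yields the multiscale pyramid in which wave onsets, peaks and offsets can be located and classified.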

  14. Heterogeneous Ensemble Combination Search Using Genetic Algorithm for Class Imbalanced Data Classification.

    Science.gov (United States)

    Haque, Mohammad Nazmul; Noman, Nasimul; Berretta, Regina; Moscato, Pablo

    2016-01-01

    Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble depends on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers' decisions into the ensemble's output, we used the simple and widely used majority voting approach. The proposed algorithm, along with a random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) - k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI Machine Learning repository, one Alzheimer's disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction, and we expect that the proposed GA-EoC would perform consistently in other cases. PMID:26764911
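Majority voting over a chosen subset of base classifiers is simple to state in code. The sketch below evaluates subsets exhaustively in place of the paper's genetic search (feasible here because the pool is tiny), and scores them on cached predictions rather than cross-validation; the predictions and labels are hypothetical.

```python
from collections import Counter
from itertools import combinations

def majority_vote(predictions):
    """Combine one label per base classifier into a single ensemble output."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical cached predictions of 4 base classifiers on 6 samples,
# plus the true labels.
preds = [
    [1, 0, 1, 1, 0, 0],   # classifier A
    [1, 1, 1, 0, 0, 0],   # classifier B
    [0, 0, 1, 1, 0, 1],   # classifier C
    [1, 0, 0, 1, 1, 0],   # classifier D
]
truth = [1, 0, 1, 1, 0, 0]

def accuracy(subset):
    votes = [majority_vote([preds[c][i] for c in subset])
             for i in range(len(truth))]
    return sum(v == t for v, t in zip(votes, truth)) / len(truth)

# Exhaustive search stands in for the GA; odd subset sizes avoid voting ties.
best = max((s for r in (1, 3) for s in combinations(range(4), r)),
           key=accuracy)
print(best, accuracy(best))
```

With a realistic pool the 2^n subset space is why a genetic search (bitmask chromosomes, crossover, mutation) replaces enumeration.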

  15. Integrity Classification Algorithm of Images obtained from Impact Damaged Composite Structures

    Directory of Open Access Journals (Sweden)

    Mahmoud Z. Iskandarani

    2010-01-01

    Full Text Available Problem statement: Many NDT systems used for damage detection in composites are difficult to apply to complex geometric structures, also, they are time-consuming. As a solution to the problems associated with NDT applications, an intelligent analysis system that supports a portable testing environment, which allowed various types of inputs and provided sufficient data regarding level of damage in a tested structure was designed and tested. The developed technique was a novel approach that allowed locating defects with good accuracy. Approach: This research presented a novel approach to fast NDT using intelligent image analysis through a specifically developed algorithm that checks the integrity of composite structures. Such a novel approach allowed not only to determine the level of damage, but also, to correlate damage detected by one imaging technique using available instruments and methods to results that would be obtained using other instruments and techniques. Results: Using the developed ICA algorithm, accurate classification was achieved using C-Scan and Low Temperature Thermal imaging (LTT). Both techniques agreed on damage classification and structural integrity. Conclusion: This very successful approach to damage detection and classification is further supported by its ability to correlate different NDT technologies and predict others.

  16. SMOTE_EASY: AN ALGORITHM TO TREAT THE CLASSIFICATION ISSUE IN REAL DATABASES

    Directory of Open Access Journals (Sweden)

    Hugo Leonardo Pereira Rufino

    2016-04-01

    Full Text Available Most classification tools assume that the data distribution is balanced or that misclassification costs are similar. Nevertheless, in practice, databases with unbalanced classes are commonplace, such as in the diagnosis of diseases, where confirmed cases are usually rare compared with the healthy population. Other examples are the detection of fraudulent calls and the detection of system intruders. In these cases, the improper classification of a minority class (for instance, diagnosing a person with cancer as healthy) may have more serious consequences than incorrectly classifying a majority class. Therefore, it is important to treat databases in which unbalanced classes occur. This paper presents the SMOTE_Easy algorithm, which can classify data even when there is a high level of unbalance between the classes. In order to prove its efficiency, a comparison with the main algorithms for classifying unbalanced data was made. This process was successful in nearly all tested databases.
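Although the SMOTE_Easy variant is not specified in this abstract, the underlying SMOTE idea it builds on is easy to sketch: synthesize new minority samples by interpolating between a minority point and one of its nearest minority-class neighbours. The data and parameters below are hypothetical.

```python
import random

random.seed(42)

def smote(minority, n_synthetic, k=3):
    """Generate synthetic minority samples by interpolating each chosen
    sample toward one of its k nearest minority neighbours (basic SMOTE)."""
    synthetic = []
    for _ in range(n_synthetic):
        x = random.choice(minority)
        # k nearest neighbours of x within the minority class (excluding x)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = random.choice(neighbours)
        gap = random.random()  # how far along the segment x -> nb to go
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

minority = [(1.0, 1.2), (1.1, 0.9), (0.8, 1.0), (1.3, 1.1)]
new = smote(minority, n_synthetic=8)
print(len(new))  # 8
```

Because the synthetic points lie on segments between real minority samples, they densify the minority region rather than merely duplicating points, which is what lets downstream classifiers learn a wider decision region for the rare class.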

  17. Defining and evaluating classification algorithm for high-dimensional data based on latent topics.

    Directory of Open Access Journals (Sweden)

    Le Luo

    Full Text Available Automatic text categorization is one of the key techniques in information retrieval and the data mining field. Classification is usually time-consuming when the training dataset is large and high-dimensional. Many methods have been proposed to solve this problem, but few achieve satisfactory efficiency. In this paper, we present a method which combines the Latent Dirichlet Allocation (LDA) algorithm and the Support Vector Machine (SVM). LDA is first used to generate a reduced-dimensional representation of topics as features in the vector space model. It reduces the number of features dramatically while keeping the necessary semantic information. The SVM is then employed to classify the data based on the generated features. We evaluate the algorithm on the 20 Newsgroups and Reuters-21578 datasets, respectively. The experimental results show that classification based on our proposed LDA+SVM model achieves high performance in terms of precision, recall and F1 measure, and can do so within a much shorter time-frame. Our process improves greatly upon the previous work in this field and displays strong potential to achieve a streamlined classification process for a wide range of applications.

  18. Hybrid Ant Bee Algorithm for Fuzzy Expert System Based Sample Classification.

    Science.gov (United States)

    GaneshKumar, Pugalendhi; Rani, Chellasamy; Devaraj, Durairaj; Victoire, T Aruldoss Albert

    2014-01-01

    Accuracy maximization and complexity minimization are the two main goals of a fuzzy expert system based microarray data classification. Our previous Genetic Swarm Algorithm (GSA) approach improved the classification accuracy of the fuzzy expert system at the cost of interpretability. The if-then rules produced by the GSA are lengthy and complex, which is difficult for the physician to understand. To address this interpretability-accuracy tradeoff, the rule set is represented using integer numbers and the task of rule generation is treated as a combinatorial optimization task. Ant colony optimization (ACO) with local and global pheromone updating is applied to find the fuzzy partition based on the gene expression values for generating a simpler rule set. In order to handle the formless and continuous expression values of a gene, this paper employs an artificial bee colony (ABC) algorithm to evolve the points of the membership function. Mutual information is used for identification of informative genes. The performance of the proposed hybrid Ant Bee Algorithm (ABA) is evaluated using six gene expression data sets. From the simulation study, it is found that the proposed approach generated an accurate fuzzy system with highly interpretable and compact rules for all the data sets when compared with other approaches. PMID:26355782

  19. Unraveling cognitive traits using the Morris water maze unbiased strategy classification (MUST-C) algorithm.

    Science.gov (United States)

    Illouz, Tomer; Madar, Ravit; Louzon, Yoram; Griffioen, Kathleen J; Okun, Eitan

    2016-02-01

    The assessment of spatial cognitive learning in rodents is a central approach in neuroscience, as it enables one to assess and quantify the effects of treatments and genetic manipulations from a broad perspective. Although the Morris water maze (MWM) is a well-validated paradigm for testing spatial learning abilities, manual categorization of performance in the MWM into behavioral strategies is subject to individual interpretation, and thus to biases. Here we offer a support vector machine (SVM)-based, automated, MWM unbiased strategy classification (MUST-C) algorithm, as well as a cognitive score scale. This model was examined and validated by analyzing data obtained from five MWM experiments with changing platform sizes, revealing a limitation in the spatial capacity of the hippocampus. We have further employed this algorithm to extract novel mechanistic insights on the impact of members of the Toll-like receptor pathway on cognitive spatial learning and memory. The MUST-C algorithm can greatly benefit MWM users as it provides a standardized method of strategy classification as well as a cognitive scoring scale, which cannot be derived from typical analysis of MWM data. PMID:26522398

  20. Hybrid Medical Image Classification Using Association Rule Mining with Decision Tree Algorithm

    CERN Document Server

    Rajendran, P

    2010-01-01

    The main focus of image mining in the proposed method is the classification of brain tumors in CT scan brain images. The major steps involved in the system are: pre-processing, feature extraction, association rule mining and hybrid classification. The pre-processing step has been done using median filtering, and edge features have been extracted using the Canny edge detection technique. Two image mining approaches combined in a hybrid manner are proposed in this paper. The frequent patterns from the CT scan images are generated by the frequent pattern tree (FP-Tree) algorithm, which mines the association rules. The decision tree method has been used to classify the medical images for diagnosis. This system makes the classification process more accurate, and the hybrid method improves efficiency over traditional image mining methods. The experimental results on a prediagnosed database of brain images showed 97% sensitivity and 95% accuracy. The ph...

  1. Classification and authentication of unknown water samples using machine learning algorithms.

    Science.gov (United States)

    Kundu, Palash K; Panchariya, P C; Kundu, Madhusree

    2011-07-01

    This paper proposes the development of real-life water sample classification and authentication based on machine learning algorithms. The proposed techniques use experimental measurements from a pulse voltammetry method based on an electronic tongue (E-tongue) instrumentation system with silver and platinum electrodes. E-tongues include arrays of solid-state ion sensors, transducers (even of different types), data collectors and data analysis tools, all oriented toward the classification of liquid samples and the authentication of unknown liquid samples. The time series signal and the corresponding raw data represent the measurement from a multi-sensor system. The E-tongue system, implemented in a laboratory environment for six different ISI (Bureau of Indian Standards) certified water samples (Aquafina, Bisleri, Kingfisher, Oasis, Dolphin, and McDowell), was the data source for developing two types of machine learning algorithms: classification and regression. A water data set consisting of six sample classes with 4402 features was considered. A PCA (principal component analysis) based classification and authentication tool was developed in this study as the machine learning component of the E-tongue system. A partial least squares (PLS) based classifier, dedicated to authenticating a specific category of water sample, evolved as an integral part of the E-tongue instrumentation system. The developed PCA and PLS based E-tongue system delivered encouraging overall authentication accuracy, with excellent performance for the aforesaid categories of water samples. PMID:21507400
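A PCA-based classifier of the kind described can be sketched from first principles: centre the data, extract the leading principal component by power iteration on the covariance matrix, then classify by the nearest class centroid in the projected space. The two-feature "sensor responses" below are hypothetical stand-ins for E-tongue measurements, and nearest-centroid is an assumed decision rule, not necessarily the paper's.

```python
import random

random.seed(3)

def pca_first_component(samples, iters=100):
    """Leading principal component of mean-centred data via power
    iteration on the sample covariance matrix."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    centred = [[s[j] - means[j] for j in range(d)] for s in samples]
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def project(sample, means, v):
    return sum((sample[j] - means[j]) * v[j] for j in range(len(v)))

# Hypothetical two-class sensor responses (e.g. two water brands).
brand_a = [[1.0 + random.gauss(0, 0.1), 2.0 + random.gauss(0, 0.1)]
           for _ in range(10)]
brand_b = [[3.0 + random.gauss(0, 0.1), 6.0 + random.gauss(0, 0.1)]
           for _ in range(10)]
means, v = pca_first_component(brand_a + brand_b)

# Nearest-centroid classification in the 1-D principal subspace.
ca = sum(project(s, means, v) for s in brand_a) / len(brand_a)
cb = sum(project(s, means, v) for s in brand_b) / len(brand_b)

def classify(sample):
    z = project(sample, means, v)
    return "A" if abs(z - ca) < abs(z - cb) else "B"

print(classify([1.1, 2.1]))
```

Real E-tongue data has thousands of features per sample, which is exactly why a projection step (PCA here, PLS in the authentication stage) precedes the decision rule.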

  2. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameter optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, specificity, precision, the ROC curve, and so forth are adopted to appraise the performance of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkably higher efficiency compared with traditional SVM algorithms.

  3. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    Energy Technology Data Exchange (ETDEWEB)

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.; Pounds, Joel G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.
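A minimal version of the approach: a GA evolving binary feature masks whose fitness is a purity score computed from the conditional probabilities of the binary response within each group of identical feature values. The fitness function and data below are illustrative assumptions, not the paper's exact objective.

```python
import random

random.seed(7)

# Hypothetical records: (feature tuple, binary response). Feature 0 is
# predictive; features 1-2 are noise.
data = ([((1, random.randint(0, 1), random.randint(0, 1)), 1) for _ in range(30)]
        + [((0, random.randint(0, 1), random.randint(0, 1)), 0) for _ in range(30)])

def fitness(mask):
    """Purity objective from conditional probabilities: the size-weighted
    average over feature-value groups of max(P(y=1|group), P(y=0|group))."""
    groups = {}
    for x, y in data:
        key = tuple(v for v, m in zip(x, mask) if m)
        groups.setdefault(key, []).append(y)
    return sum(max(ys.count(1), ys.count(0)) for ys in groups.values()) / len(data)

def ga(pop_size=8, generations=15, n_feat=3):
    """Simple GA over feature bitmasks: elitist selection, one-point
    crossover and bit-flip mutation."""
    pop = [[random.random() < 0.5 for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_feat)
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:          # mutation
                i = random.randrange(n_feat)
                child[i] = not child[i]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = ga()
print(best, fitness(best))
```

In the paper the fitness of a mask comes from a binary classification tree built on the selected features; the group-purity score above plays the same role in far fewer lines.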

  4. Code Syntax-Comparison Algorithm Based on Type-Redefinition-Preprocessing and Rehash Classification

    Directory of Open Access Journals (Sweden)

    Baojiang Cui

    2011-08-01

    Full Text Available Code comparison technology plays an important role in the fields of software security protection and plagiarism detection. Nowadays, there are five main approaches to plagiarism detection: file-attribute-based, text-based, token-based, syntax-based and semantic-based. The first three approaches have their own limitations, while the syntax-based technique suffers from weak detection ability and low efficiency, so none of these approaches meets the requirements of large-scale software plagiarism detection. Based on our prior research, we propose an algorithm for type-redefinition plagiarism detection, which can detect simple type redefinition, repeating-pattern redefinition, and redefinition of types with pointers. Besides, this paper also proposes a code syntax-comparison algorithm based on rehash classification, which enhances the node storage structure of the syntax tree and greatly improves efficiency.
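The flavour of hash-based code comparison can be shown with a simpler cousin of the paper's rehash scheme: hash token n-grams after a normalization pass that renames identifiers to placeholder tokens, then compare the hash sets. This is an illustrative sketch, not the paper's syntax-tree node structure; the snippets and the normalization are hypothetical.

```python
def token_ngram_hashes(tokens, n=3):
    """Hash every n-gram of the token stream; comparing hash sets
    approximates syntactic similarity cheaply."""
    return {hash(tuple(tokens[i:i + n])) for i in range(len(tokens) - n + 1)}

def jaccard(a, b):
    """Set-overlap similarity in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Two snippets that differ in one operator, after a (hypothetical)
# normalization pass that maps identifiers/literals to placeholder tokens.
code1 = "ID = ID + NUM ; while ( ID < NUM ) { ID += NUM ; }".split()
code2 = "ID = ID + NUM ; while ( ID < NUM ) { ID -= NUM ; }".split()

sim = jaccard(token_ngram_hashes(code1), token_ngram_hashes(code2))
print(round(sim, 2))  # 0.68
```

Because only the three trigrams touching the changed operator differ, the similarity stays high (13 shared of 19 total trigrams), which is the behaviour a plagiarism detector wants from small edits.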

  5. Generation of a Supervised Classification Algorithm for Time-Series Variable Stars with an Application to the LINEAR Dataset

    CERN Document Server

    Johnston, Kyle B

    2016-01-01

    With the advent of digital astronomy, new benefits and new problems have been presented to the modern day astronomer. While data can be captured in a more efficient and accurate manner using digital means, the efficiency of data retrieval has led to an overload of scientific data for processing and storage. This paper focuses on the construction and application of a supervised pattern classification algorithm for the identification of variable stars. Given the reduction of a survey of stars into a standard feature space, the problem of using prior patterns to identify new observed patterns can be reduced to time-tested classification methodologies and algorithms. Such supervised methods, so called because the user trains the algorithms prior to application using patterns with known classes or labels, provide a means to probabilistically determine the estimated class type of new observations. This paper will demonstrate the construction and application of a supervised classification algorithm on variable sta...

  6. Optimal Combination of Classification Algorithms and Feature Ranking Methods for Object-Based Classification of Submeter Resolution Z/I-Imaging DMC Imagery

    Directory of Open Access Journals (Sweden)

    Fulgencio Cánovas-García

    2015-04-01

    Full Text Available Object-based image analysis allows several different features to be calculated for the resulting objects. However, a large number of features means longer computing times and might even result in a loss of classification accuracy. In this study, we use four feature ranking methods (maximum correlation, average correlation, Jeffries–Matusita distance and mean decrease in the Gini index) and five classification algorithms (linear discriminant analysis, naive Bayes, weighted k-nearest neighbors, support vector machines and random forest). The objective is to discover the optimal algorithm and feature subset to maximize accuracy when classifying a set of 1,076,937 objects, produced by the prior segmentation of a 0.45-m resolution multispectral image, with 356 features calculated on each object. The study area is both large (9070 ha) and diverse, which increases the possibility to generalize the results. The mean decrease in the Gini index was found to be the feature ranking method that provided highest accuracy for all of the classification algorithms. In addition, support vector machines and random forest obtained the highest accuracy in the classification, both using their default parameters. This is a useful result that could be taken into account in the processing of high-resolution images in large and diverse areas to obtain a land cover classification.
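The winning ranking criterion, mean decrease in the Gini index, accumulates over a forest's trees the impurity decrease each feature achieves at its splits. A single-split version of that quantity can be sketched directly (toy data, two classes):

```python
def gini(labels):
    """Gini impurity of a binary label list: 1 - p0^2 - p1^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def gini_decrease(values, labels, threshold):
    """Impurity decrease of splitting `values` at `threshold`, weighted by
    branch sizes -- the per-split quantity a random forest accumulates per
    feature for its mean-decrease-in-Gini ranking."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    return (gini(labels)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))

labels = [0, 0, 0, 1, 1, 1]
informative = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
noisy = [0.9, 0.1, 0.8, 0.2, 1.0, 0.3]

print(gini_decrease(informative, labels, 0.5))  # 0.5 (perfect split)
print(gini_decrease(noisy, labels, 0.5))        # much smaller
```

Averaging such decreases over all splits and trees gives each of the 356 object features a single importance score, which is what the ranking in the paper sorts on.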

  7. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm to analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from a microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR combined with a genetic algorithm (mRMR-GA) and mRMR combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results show that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on both dataset types and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  8. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from a microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028

  9. New Classification Method Based on Support-Significant Association Rules Algorithm

    Science.gov (United States)

    Li, Guoxin; Shi, Wen

    One of the most well-studied problems in data mining is mining for association rules. Research has also introduced association rule mining methods to conduct classification tasks, and these classification methods can be applied to customer segmentation. Currently, most association rule mining methods are based on a support-confidence framework, where rules satisfying both minimum support and minimum confidence are returned to the analyst as strong association rules. However, this type of association rule mining method lacks a rigorous statistical guarantee and can even be misleading. A new classification model for customer segmentation, based on an association rule mining algorithm, was proposed in this paper. The new model is based on the support-significance association rule mining method, in which the confidence measure for an association rule is replaced by its statistical significance, a better evaluation standard for association rules. A data experiment on customer segmentation data from the UCI repository indicated the effectiveness of the new model.
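A minimal sketch of replacing confidence with a significance test: score a candidate rule A→B by the chi-square statistic of its 2×2 contingency table, so that rules whose co-occurrence is explainable by chance are rejected even if their confidence looks high. The exact significance measure in the paper may differ:

```python
def rule_significance(n_ab, n_a, n_b, n):
    """Chi-square statistic (1 dof, no continuity correction) testing
    independence of antecedent A and consequent B."""
    # observed 2x2 table: [A & B, A & not-B; not-A & B, not-A & not-B]
    obs = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row = [n_a, n - n_a]
    col = [n_b, n - n_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            exp = row[i] * col[j] / n   # expected count under independence
            chi2 += (obs[i][j] - exp) ** 2 / exp
    return chi2

# A and B co-occur far more often than chance would predict
strong = rule_significance(n_ab=80, n_a=100, n_b=100, n=1000)
# co-occurrence exactly at the chance level: not significant
weak = rule_significance(n_ab=10, n_a=100, n_b=100, n=1000)
print(strong > 3.84, weak > 3.84)  # → True False (3.84 ≈ 95% critical value)
```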

  10. A Template Matching Approach to Classification of QAM Modulation using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Negar Ahmadi

    2009-11-01

    Full Text Available The automatic recognition of the modulation format of a detected signal, the intermediate step between signal detection and demodulation, is a major task of an intelligent receiver, with various civilian and military applications. Obviously, with no knowledge of the transmitted data and many unknown parameters at the receiver, such as the signal power, carrier frequency and phase offsets, timing information, etc., blind identification of the modulation is a difficult task. This becomes even more challenging in real-world scenarios. In this paper, modulation classification for QAM is performed by a genetic algorithm followed by template matching, considering the constellation of the received signal. In addition, this classification finds the decision boundary of the signal, which is critical information for bit detection. I have proposed and implemented a technique that casts modulation recognition into shape recognition. The constellation diagram is a traditional and powerful tool for the design and evaluation of digital modulations. The simulation results show the capability of this method for modulation classification with high accuracy and appropriate convergence in the presence of noise.
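The template-matching step can be sketched as follows: score each candidate QAM order by the mean distance from the received symbols to the nearest point of the ideal constellation, and pick the minimizer. The genetic-algorithm search over unknown phase and scale parameters from the paper is omitted here:

```python
import numpy as np

def qam_template(m):
    # Ideal square M-QAM constellation, normalized to unit average power
    side = int(np.sqrt(m))
    pts = np.array([complex(2 * i - side + 1, 2 * q - side + 1)
                    for i in range(side) for q in range(side)])
    return pts / np.sqrt(np.mean(np.abs(pts) ** 2))

def classify_qam(symbols, orders=(4, 16, 64)):
    """Pick the QAM order whose template minimizes the mean distance
    from each received symbol to its nearest template point."""
    def score(m):
        t = qam_template(m)
        d = np.abs(symbols[:, None] - t[None, :])  # symbol-to-point distances
        return d.min(axis=1).mean()
    return min(orders, key=score)

rng = np.random.default_rng(2)
ideal = qam_template(16)
# 500 random 16-QAM symbols with mild complex Gaussian noise
rx = rng.choice(ideal, 500) + 0.02 * (rng.normal(size=500)
                                      + 1j * rng.normal(size=500))
print(classify_qam(rx))  # → 16
```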

  11. A HYBRID CLASSIFICATION ALGORITHM TO CLASSIFY ENGINEERING STUDENTS’ PROBLEMS AND PERKS

    Directory of Open Access Journals (Sweden)

    Mitali Desai

    2016-03-01

    Full Text Available Social networking sites have brought a new horizon for expressing the views and opinions of individuals. Moreover, they provide a medium for students to share their sentiments, including struggles and joy, during the learning process. Such informal information provides a valuable venue for decision making. The large and growing scale of this information requires automatic classification techniques. Sentiment analysis is one of the automated techniques for classifying large data. The existing predictive sentiment analysis techniques are widely used to classify reviews on e-commerce sites to provide business intelligence. However, they are not very useful for drawing decisions in the education system, since they classify the sentiments into merely three pre-set categories: positive, negative and neutral. Moreover, classifying students’ sentiments into positive or negative categories does not provide deeper insight into their problems and perks. In this paper, we propose a novel Hybrid Classification Algorithm to classify engineering students’ sentiments. Unlike traditional predictive sentiment analysis techniques, the proposed algorithm makes the sentiment analysis process descriptive. Moreover, it classifies engineering students’ perks, in addition to problems, into several categories to help future students and the education system in decision making.

  12. Fast Algorithm for Vectorcardiogram and Interbeat Intervals Analysis: Application for Premature Ventricular Contractions Classification

    Directory of Open Access Journals (Sweden)

    Irena Jekova

    2005-12-01

    Full Text Available In this study we investigated the adequacy of two non-orthogonal ECG leads from Holter recordings to provide reliable vectorcardiogram (VCG) parameters. The VCG loop was constructed using the QRS samples in a fixed-size window around the fiducial point. We developed an algorithm for fast approximation of the VCG loop, estimation of its area and calculation of relative VCG characteristics, which are expected to be minimally dependent on patient individuality and the ECG recording conditions. Moreover, in order to obtain temporal QRS characteristics independent of the heart rate, we introduced a parameter estimating the differences of the interbeat RR intervals. The statistical assessment of the proposed VCG and RR interval parameters showed distinct distributions for N and PVC beats. The reliability of the extracted parameter set for PVC detection was estimated independently with two classification methods - a stepwise discriminant analysis and a decision-tree-like classification algorithm - using the publicly available MIT-BIH arrhythmia database. The stepwise discriminant analysis achieved a sensitivity of 91% and a specificity of 95.6%, while the decision-tree-like technique assured a sensitivity of 93.3% and a specificity of 94.6%. We suggest possibilities for accuracy improvement through adequate electrode placement of the Holter leads, supplementary analysis of the type of the predominant beats in the reference VCG matrix, and a smaller step for the VCG loop approximation.
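The loop-area estimation can be illustrated with the shoelace formula applied to the two-lead QRS samples treated as a closed planar curve; the paper's fast approximation and relative characteristics are more elaborate than this sketch:

```python
import numpy as np

def vcg_loop_area(lead1, lead2):
    """Area of the VCG loop traced by two ECG leads inside the QRS
    window, via the shoelace formula; the loop is closed implicitly
    back to the first sample by np.roll."""
    x, y = np.asarray(lead1, float), np.asarray(lead2, float)
    return 0.5 * abs(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))

# sanity check: a unit square traced as a loop has area 1
print(vcg_loop_area([0, 1, 1, 0], [0, 0, 1, 1]))  # → 1.0
```

Dividing this area by, say, the squared QRS amplitude would give a relative characteristic of the kind the authors aim for, less dependent on recording gain.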

  13. Development of an algorithm for heartbeats detection and classification in Holter records based on temporal and morphological features

    Science.gov (United States)

    García, A.; Romano, H.; Laciar, E.; Correa, R.

    2011-12-01

    In this work a detection and classification algorithm for heartbeat analysis in Holter records was developed. First, a QRS complex detector was implemented and the temporal and morphological characteristics of the complexes were extracted. A vector was built with these features; this vector is the input of the classification module, based on discriminant analysis. The beats were classified into three groups: Premature Ventricular Contraction beat (PVC), Atrial Premature Contraction beat (APC) and Normal Beat (NB). These beat categories represent the most important groups for commercial Holter systems. The developed algorithms were evaluated on 76 ECG records from two validated open-access databases, the "MIT-BIH arrhythmia database" and the "MIT-BIH supraventricular arrhythmias database". A total of 166343 beats were detected and analyzed; the QRS detection algorithm provides a sensitivity of 99.69 % and a positive predictive value of 99.84 %. The classification stage gives sensitivities of 97.17% for NB, 97.67% for PVC and 92.78% for APC.
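The reported detector figures are the standard sensitivity and positive predictive value; with hypothetical beat counts (not the actual MIT-BIH tallies) they are computed as:

```python
def detector_metrics(tp, fn, fp):
    """Sensitivity (Se) and positive predictive value (PPV), the two
    figures reported for QRS detectors."""
    se = tp / (tp + fn)    # fraction of true beats that were detected
    ppv = tp / (tp + fp)   # fraction of detections that were true beats
    return se, ppv

# hypothetical counts for illustration
se, ppv = detector_metrics(tp=995, fn=5, fp=2)
print(round(se * 100, 2), round(ppv * 100, 2))  # → 99.5 99.8
```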

  14. Development of an algorithm for heartbeats detection and classification in Holter records based on temporal and morphological features

    International Nuclear Information System (INIS)

    In this work a detection and classification algorithm for heartbeat analysis in Holter records was developed. First, a QRS complex detector was implemented and the temporal and morphological characteristics of the complexes were extracted. A vector was built with these features; this vector is the input of the classification module, based on discriminant analysis. The beats were classified into three groups: Premature Ventricular Contraction beat (PVC), Atrial Premature Contraction beat (APC) and Normal Beat (NB). These beat categories represent the most important groups for commercial Holter systems. The developed algorithms were evaluated on 76 ECG records from two validated open-access databases, the 'MIT-BIH arrhythmia database' and the 'MIT-BIH supraventricular arrhythmias database'. A total of 166343 beats were detected and analyzed; the QRS detection algorithm provides a sensitivity of 99.69 % and a positive predictive value of 99.84 %. The classification stage gives sensitivities of 97.17% for NB, 97.67% for PVC and 92.78% for APC.

  15. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine.

    Directory of Open Access Journals (Sweden)

    Fei Gao

    Full Text Available For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes could not adequately address this problem due to the lack of a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on an incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a "soft-start" approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computational complexity is reduced effectively. In addition, the possible appearance of new labeled samples in the learning process is analyzed in detail. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment.
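The core idea of absorbing only confidently predicted unlabeled samples can be sketched with a toy self-training loop. A nearest-centroid classifier stands in for the SVM, confidence is the margin between class distances, and the soft-start and data-cleaning mechanisms of the paper are omitted:

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, rounds=5, conf=2.0):
    """Toy incremental semi-supervised loop: in each round, label the
    unlabeled points whose distance margin between the two class
    centroids exceeds `conf`, absorb them, and recompute centroids."""
    X_lab, y_lab, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        cents = np.array([X_lab[y_lab == c].mean(axis=0) for c in (0, 1)])
        d = np.linalg.norm(pool[:, None, :] - cents[None, :, :], axis=2)
        pred = d.argmin(axis=1)
        margin = np.abs(d[:, 0] - d[:, 1])
        take = margin > conf          # absorb only confident points
        if not take.any():
            break
        X_lab = np.vstack([X_lab, pool[take]])
        y_lab = np.concatenate([y_lab, pred[take]])
        pool = pool[~take]
    return X_lab, y_lab

rng = np.random.default_rng(3)
X0 = rng.normal(loc=(-3.0, 0.0), scale=0.5, size=(50, 2))  # class 0 blob
X1 = rng.normal(loc=(3.0, 0.0), scale=0.5, size=(50, 2))   # class 1 blob
X_lab = np.array([[-3.0, 0.0], [3.0, 0.0]])                # one seed each
y_lab = np.array([0, 1])
Xf, yf = self_train(X_lab, y_lab, np.vstack([X0, X1]))
print(Xf.shape[0])   # labeled set grows beyond the two seeds
```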

  16. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine

    Science.gov (United States)

    Gao, Fei; Mei, Jingyuan; Sun, Jinping; Wang, Jun; Yang, Erfu; Hussain, Amir

    2015-01-01

    For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes could not adequately address this problem due to the lack of a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computational complexity is reduced effectively. In addition, the possible appearance of new labeled samples in the learning process is analyzed in detail. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment. PMID:26275294

  17. MODIS Collection 6 shortwave-derived cloud phase classification algorithm and comparisons with CALIOP

    Science.gov (United States)

    Marchant, Benjamin; Platnick, Steven; Meyer, Kerry; Arnold, G. Thomas; Riedi, Jérôme

    2016-04-01

    Cloud thermodynamic phase (ice, liquid, undetermined) classification is an important first step for cloud retrievals from passive sensors such as MODIS (Moderate Resolution Imaging Spectroradiometer). Because ice and liquid phase clouds have very different scattering and absorbing properties, an incorrect cloud phase decision can lead to substantial errors in the cloud optical and microphysical property products such as cloud optical thickness or effective particle radius. Furthermore, it is well established that ice and liquid clouds have different impacts on the Earth's energy budget and hydrological cycle, so accurately monitoring the spatial and temporal distribution of these clouds is of continued importance. For MODIS Collection 6 (C6), the shortwave-derived cloud thermodynamic phase algorithm used by the optical and microphysical property retrievals has been completely rewritten to improve the phase discrimination skill for a variety of cloudy scenes (e.g., thin/thick clouds, over ocean/land/desert/snow/ice surfaces, etc.). To evaluate the performance of the C6 cloud phase algorithm, extensive granule-level and global comparisons have been conducted against the heritage C5 algorithm and CALIOP. A wholesale improvement is seen for C6 compared to C5.

  18. An Automated Algorithm to Screen Massive Training Samples for a Global Impervious Surface Classification

    Science.gov (United States)

    Tan, Bin; Brown de Colstoun, Eric; Wolfe, Robert E.; Tilton, James C.; Huang, Chengquan; Smith, Sarah E.

    2012-01-01

    An algorithm is developed to automatically screen the outliers from massive training samples for the Global Land Survey - Imperviousness Mapping Project (GLS-IMP). GLS-IMP is to produce a global 30 m spatial resolution impervious cover data set for the years 2000 and 2010 based on the Landsat Global Land Survey (GLS) data set. This unprecedented high resolution impervious cover data set is not only significant to urbanization studies but also desired by global carbon, hydrology, and energy balance research. A supervised classification method, regression tree, is applied in this project. A set of accurate training samples is the key to supervised classifications. Here we developed the global scale training samples from fine resolution (about 1 m) satellite data (Quickbird and Worldview2) and then aggregated the fine resolution impervious cover map to 30 m resolution. In order to improve the classification accuracy, the training samples should be screened before being used to train the regression tree. It is impossible to manually screen 30 m resolution training samples collected globally. For example, in Europe alone there are 174 training sites, whose sizes range from 4.5 km by 4.5 km to 8.1 km by 3.6 km, and the total number of training samples is over six million. Therefore, we developed this automated statistics-based algorithm to screen the training samples at two levels: the site level and the scene level. At the site level, all the training samples are divided into 10 groups according to the percentage of impervious surface within a sample pixel; the samples falling in each 10% interval form one group. For each group, both univariate and multivariate outliers are detected and removed. Then the screening process escalates to the scene level. A similar screening process, but with a looser threshold, is applied at the scene level, considering the possible variance due to site differences.
We do not perform the screening process across the scenes because the scenes might vary due to
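The site-level screening step can be sketched as follows, with a univariate z-score test standing in for the paper's univariate and multivariate outlier detection (the bin layout follows the abstract; the threshold is illustrative):

```python
import numpy as np

def screen_samples(imperv_pct, feature, z_max=3.0):
    """Group samples into 10% impervious-cover bins and drop, within
    each bin, the samples whose feature value is a |z| > z_max outlier.
    Returns a boolean keep-mask."""
    imperv_pct = np.asarray(imperv_pct, float)
    feature = np.asarray(feature, float)
    keep = np.zeros(len(feature), dtype=bool)
    bins = np.clip((imperv_pct // 10).astype(int), 0, 9)
    for b in range(10):
        idx = np.where(bins == b)[0]
        if len(idx) < 3:              # too few samples to screen
            keep[idx] = True
            continue
        vals = feature[idx]
        z = (vals - vals.mean()) / (vals.std() + 1e-12)
        keep[idx[np.abs(z) <= z_max]] = True
    return keep

rng = np.random.default_rng(4)
pct = rng.uniform(0, 100, 500)        # impervious percentage per sample
feat = rng.normal(size=500)           # some spectral feature
feat[0] = 50.0                        # inject a gross outlier
keep = screen_samples(pct, feat)
print(keep[0], int(keep.sum()))       # the outlier is dropped
```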

  19. Aims & Scope

    Institute of Scientific and Technical Information of China (English)

    2014-01-01

    Asian Pacific Journal of Tropical Biomedicine(APJTB)aims to set up and provide an international academic communication platform for physicians,medical scientists,allied health scientists and public health workers,especially those in the Asian Pacific region and worldwide on tropical biomedicine,infectious diseases and public health,and to meet the growing challenges

  20. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  1. Classification of Atrial Septal Defect and Ventricular Septal Defect with Documented Hemodynamic Parameters via Cardiac Catheterization by Genetic Algorithms and Multi-Layered Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Mustafa Yıldız

    2012-08-01

    Full Text Available Introduction: We aimed to develop a classification method to discriminate ventricular septal defect from atrial septal defect by using several hemodynamic parameters. Patients and Methods: Forty-three patients (30 atrial septal defect, 13 ventricular septal defect; 26 female, 17 male) with hemodynamic parameters documented via cardiac catheterization are included in the study. Parameters such as blood pressure values of different areas, gender, age and Qp/Qs ratios are used for classification. The parameters used in classification are determined by the divergence analysis method. Those parameters are: (i) pulmonary artery diastolic pressure, (ii) Qp/Qs ratio, (iii) right atrium pressure, (iv) age, (v) pulmonary artery systolic pressure, (vi) left ventricular systolic pressure, (vii) aorta mean pressure, (viii) left ventricular diastolic pressure, (ix) aorta diastolic pressure, (x) aorta systolic pressure. The parameters detected from our study population are fed to a multi-layered artificial neural network, and the network was trained by a genetic algorithm. Results: The training cluster consists of 14 cases (7 atrial septal defect and 7 ventricular septal defect). The overall success ratio is 79.2%, and with proper training of the artificial neural network this ratio increases up to 89%. Conclusion: Parameters of the artificial neural network, which in classical methods need to be set by the investigator, can easily be determined with the help of genetic algorithms. During the training of the artificial neural network by genetic algorithms, both the topology of the network and the factors of the network can be determined. During the test stage, elements not included in the training cluster are assumed to be in the test cluster, and as a result of this study we observed that a multi-layered artificial neural network can be trained properly, and that the neural network is a successful method for the aimed classification.

  2. Aims & Scope

    Institute of Scientific and Technical Information of China (English)

    2015-01-01

    Asian Pacific Journal of Tropical Biomedicine(APJTB)aims to set up and provide an international academic communication platform for physicians,medical scientists,allied health scientists and public health workers,especially those in the Asian Pacific region and worldwide on tropical biomedicine,infectious diseases and public health,and to meet the growing challenges of understanding,preventing and controlling the dramatic global emergence and reemergence of infectious diseases in the Asian Pacific region.

  3. Aims & Scope

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    Asian Pacific Journal of Tropical Biomedicine(APJTB)aims to set up and provide an international academic communication platform for physicians,medical scientists,allied health scientists and public health workers,especially those in the Asian Pacific region and worldwide on tropical biomedicine,infectious diseases and public health,and to meet the growing challenges of understanding,preventing and controlling the dramatic global emergence and reemergence of infectious diseases in the Asian Pacific region.

  4. HOS network-based classification of power quality events via regression algorithms

    Science.gov (United States)

    Palomares Salas, José Carlos; González de la Rosa, Juan José; Sierra Fernández, José María; Pérez, Agustín Agüera

    2015-12-01

    This work compares seven regression algorithms implemented in artificial neural networks (ANNs), supported by 14 power-quality features based on higher-order statistics. Combining time and frequency domain estimators to deal with non-stationary measurement sequences, the final goal of the system is implementation in the future smart grid to guarantee compatibility among all connected equipment. The principal results are based on spectral kurtosis measurements, which easily adapt to the impulsive nature of power quality events. These results verify that the proposed technique is capable of offering interesting results for power quality (PQ) disturbance classification. The best results are obtained using radial basis networks, generalized regression, and multilayer perceptrons, mainly due to the non-linear nature of the data.
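Kurtosis, the higher-order statistic underlying the spectral-kurtosis features mentioned above, responds strongly to impulsive disturbances. A minimal time-domain illustration (the paper works with spectral estimators, which this sketch does not reproduce):

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3: ~0 for Gaussian noise,
    -1.5 for a pure sinusoid, large for impulsive signals."""
    x = np.asarray(x, float) - np.mean(x)
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

t = np.arange(2000) / 2000.0
clean = np.sin(2 * np.pi * 50 * t)   # undisturbed 50 Hz waveform
disturbed = clean.copy()
disturbed[1000:1005] += 8.0          # short impulsive transient
print(excess_kurtosis(clean) < excess_kurtosis(disturbed))  # → True
```

The impulsive event sharply raises the fourth moment relative to the variance, which is what makes such features effective PQ-disturbance discriminators.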

  5. Support vector machines and evolutionary algorithms for classification single or together?

    CERN Document Server

    Stoean, Catalin

    2014-01-01

    When discussing classification, support vector machines are known to be a capable and efficient technique for learning and predicting with high accuracy within a quick time frame. Yet, their black-box way of doing so makes practical users quite circumspect about relying on them without much understanding of the how and why of their predictions. The question raised in this book is how this ‘masked hero’ can be made more comprehensible and friendly to the public: provide a surrogate model for its hidden optimization engine, replace the method completely, or appoint a more friendly approach to tag along and offer the much desired explanations? Evolutionary algorithms can do all of these, and this book presents such possibilities for achieving high accuracy, comprehensibility, reasonable runtime as well as unconstrained performance.

  6. Segmentation and Classification of Brain MRI Images Using Improved Logismos-B Algorithm

    Directory of Open Access Journals (Sweden)

    S. Dilip Kumar

    2014-12-01

    Full Text Available Automated reconstruction and diagnosis of brain MRI images is one of the most challenging problems in medical imaging. Accurate segmentation of MRI images is a key step in contouring during radiotherapy analysis. Computed tomography (CT) and magnetic resonance (MR) imaging are the most widely used radiographic techniques in diagnosis and treatment planning. Segmentation of brain Magnetic Resonance Imaging (MRI) is one of the methods used by the radiographer to detect any abnormality, specifically in the brain. The method also identifies important regions in the brain such as white matter (WM), gray matter (GM) and cerebrospinal fluid spaces (CSF). These regions are significant for the physician or radiographer to analyze and diagnose the disease. We propose a novel clustering algorithm, improved LOGISMOS-B, to classify tissue regions based on probabilistic tissue classification and generalized gradient vector flows with cost and distance functions. The approach builds on the LOGISMOS graph segmentation framework, expanding it to allow regionally aware graph construction and segmentation.

  7. Study of Fault Diagnosis Method for Wind Turbine with Decision Classification Algorithms and Expert System

    Directory of Open Access Journals (Sweden)

    Feng Yongxin

    2012-09-01

    Full Text Available We study a fault diagnosis method based on the combination of decision classification algorithms and an expert system. A method of extracting diagnosis rules with the CTree software is given, and a fault diagnosis system based on CLIPS was developed. In order to verify the feasibility of the method, sample data were first obtained through simulations of faults in a direct-drive wind turbine and a gearbox; the diagnosis rules were then extracted with the CTree software; finally, the proposed fault diagnosis system and the extracted rules were used to diagnose the simulated faults. Test results showed that both misdiagnosis rates were within 5%, verifying the feasibility of the method.

  8. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies in the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in the medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples; the difficulty thus lies in the high dimensionality of the data combined with the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), as well as the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  9. Content-based and Algorithmic Classifications of Journals: Perspectives on the Dynamics of Scientific Communication and Indexer Effects

    CERN Document Server

    Rafols, Ismael

    2008-01-01

    The aggregated journal-journal citation matrix, based on the Journal Citation Reports (JCR) of the Science Citation Index, can be decomposed by indexers and/or algorithmically. In this study, we test the results of two recently available algorithms for the decomposition of large matrices against two content-based classifications of journals: the ISI Subject Categories and the field/subfield classification of Glaenzel & Schubert (2003). The content-based schemes allow for the attribution of more than a single category to a journal, whereas the algorithms maximize the ratio of within-category citations over between-category citations in the aggregated category-category citation matrix. By adding categories, indexers generate between-category citations, which may enrich the database, for example, in the case of inter-disciplinary developments. The consequent indexer effects are significant in sparse areas of the matrix more than in denser ones. Algorithmic decompositions, on the other hand, are more heavily ...

  10. A Population Classification Evolution Algorithm for the Parameter Extraction of Solar Cell Models

    Directory of Open Access Journals (Sweden)

    Yiqun Zhang

    2016-01-01

    Full Text Available To quickly and precisely extract the parameters of solar cell models, a new optimization technique referred to as population classification evolution (PCE), inspired by the simplified bird mating optimizer (SBMO), is proposed. PCE divides the population into two groups, elite and ordinary, to reach a better compromise between exploitation and exploration. For the evolution of elite individuals, we adopt the idea of parthenogenesis in nature to afford fast exploitation. For the evolution of ordinary individuals, we adopt an effective differential evolution strategy, and a random movement of small probability is added to strengthen the ability to jump out of a local optimum, which affords fast exploration. The proposed PCE is first evaluated on 13 classic benchmark functions. The experimental results demonstrate that PCE yields the best results on 11 functions when compared with six evolutionary algorithms. Then, PCE is applied to extract the parameters of solar cell models, namely the single-diode and double-diode models. The experimental analyses demonstrate that the proposed PCE is superior to other optimization algorithms for parameter identification. Moreover, PCE is tested using three different sources of data with good accuracy.
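The elite/ordinary split can be sketched as below on a toy minimization problem. The mutation scale, contraction factor and jump probability are illustrative guesses, not the paper's settings, and the solar-cell objective is replaced by the sphere function:

```python
import numpy as np

def pce_minimize(f, dim=5, pop=30, iters=200,
                 elite_frac=0.2, jump_p=0.05, seed=6):
    """Sketch of population classification evolution: elites evolve by
    parthenogenesis (small self-mutations); ordinary members take a
    differential-evolution-style move toward the best individual, with
    a rare random jump to escape local optima. Greedy replacement."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (pop, dim))
    fit = np.apply_along_axis(f, 1, X)
    n_elite = max(1, int(elite_frac * pop))
    for _ in range(iters):
        order = np.argsort(fit)           # classify: best first
        X, fit = X[order], fit[order]
        best = X[0]
        for i in range(pop):
            if i < n_elite:               # elite: parthenogenesis
                trial = X[i] + 0.1 * rng.normal(size=dim)
            else:                         # ordinary: DE-style move
                a, b = X[rng.integers(pop)], X[rng.integers(pop)]
                trial = X[i] + 0.5 * (best - X[i]) + 0.5 * (a - b)
                if rng.random() < jump_p: # rare random jump
                    trial = rng.uniform(-5, 5, dim)
            ft = f(trial)
            if ft < fit[i]:               # keep only improvements
                X[i], fit[i] = trial, ft
    return fit.min()

print(pce_minimize(lambda x: np.sum(x ** 2)))  # converges near 0
```

For the actual application, `f` would be the sum of squared current errors of the single- or double-diode model over measured I-V points.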

  11. The CR‐Ω+ Classification Algorithm for Spatio‐Temporal Prediction of Criminal Activity

    Directory of Open Access Journals (Sweden)

    S. Godoy‐Calderón

    2010-04-01

    Full Text Available We present a spatio-temporal prediction model that allows forecasting of criminal activity behavior in a particular region by using supervised classification. The degree of membership of each pattern is interpreted as the forecasted increase or decrease in criminal activity for the specified time and location. The proposed forecasting model (CR-Ω+) is based on the family of Kora-Ω logical-combinatorial algorithms operating on large data volumes from several heterogeneous sources using an inductive learning process. We propose several modifications to the original algorithms by Bongard and by Baskakova and Zhuravlëv which improve the prediction performance on the studied dataset of criminal activity. We perform two analyses, punctual prediction and tendency analysis, which show that it is possible to predict punctually one of four crimes to be perpetrated (crime family) in a specific space and time, with 66% effectiveness in the prediction of the place of the crime, despite the noise in the dataset. The tendency analysis yielded an STRMSE (Spatio-Temporal RMSE) of less than 1.0.

  12. Binary classification SVM-based algorithms with interval-valued training data using triangular and Epanechnikov kernels.

    Science.gov (United States)

    Utkin, Lev V; Chekh, Anatoly I; Zhuk, Yulia A

    2016-08-01

    Classification algorithms based on different forms of support vector machines (SVMs) for dealing with interval-valued training data are proposed in the paper. L2-norm and L∞-norm SVMs are used for constructing the algorithms. The main idea allowing us to represent the complex optimization problems as a set of simple linear or quadratic programming problems is to approximate the Gaussian kernel by the well-known triangular and Epanechnikov kernels. The minimax strategy is used to choose an optimal probability distribution from the set and to construct optimal separating functions. Numerical experiments illustrate the algorithms. PMID:27179616
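Both substitute kernels are compactly supported functions of the distance ||x − z||, one piecewise linear and one quadratic, which is what lets the Gaussian-kernel problems be approximated by linear or quadratic programs. A direct sketch of their plain (non-interval) forms:

```python
import numpy as np

def triangular_kernel(x, z, h=1.0):
    # piecewise linear in ||x - z||, zero beyond bandwidth h
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(z, float))
    return max(0.0, 1.0 - d / h)

def epanechnikov_kernel(x, z, h=1.0):
    # quadratic in ||x - z||, zero beyond bandwidth h
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(z, float))
    return max(0.0, 1.0 - (d / h) ** 2)

print(triangular_kernel([0.0, 0.0], [0.5, 0.0]))    # → 0.5
print(epanechnikov_kernel([0.0, 0.0], [0.5, 0.0]))  # → 0.75
```

In the paper these kernels enter the L2- and L∞-norm SVM formulations, with the minimax step choosing among the distributions consistent with each training interval.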

  13. Classification algorithms with multi-modal data fusion could accurately distinguish neuromyelitis optica from multiple sclerosis

    Directory of Open Access Journals (Sweden)

    Arman Eshaghi

    2015-01-01

    Full Text Available Neuromyelitis optica (NMO) exhibits substantial similarities to multiple sclerosis (MS) in clinical manifestations and imaging results and has long been considered a variant of MS. With the advent of a specific biomarker in NMO, known as anti-aquaporin 4, this assumption has changed; however, the differential diagnosis remains challenging and it is still not clear whether a combination of neuroimaging and clinical data could be used to aid clinical decision-making. Computer-aided diagnosis is a rapidly evolving process that holds great promise to facilitate objective differential diagnoses of disorders that show similar presentations. In this study, we aimed to use a powerful method for multi-modal data fusion, known as multi-kernel learning, to perform automatic diagnosis of subjects. We included 30 patients with NMO, 25 patients with MS and 35 healthy volunteers and performed multi-modal imaging with T1-weighted high resolution scans, diffusion tensor imaging (DTI) and resting-state functional MRI (fMRI). In addition, subjects underwent clinical examinations and cognitive assessments. We included 18 a priori predictors from neuroimaging, clinical and cognitive measures in the initial model. We used 10-fold cross-validation to learn the importance of each modality, train, and finally test the model performance. The mean accuracy in differentiating between MS and NMO was 88%, where visible white matter lesion load, normal-appearing white matter (DTI) and functional connectivity had the most important contributions to the final classification. In a multi-class classification problem we distinguished between all 3 groups (MS, NMO and healthy controls) with an average accuracy of 84%. In this classification, visible white matter lesion load, functional connectivity, and cognitive scores were the 3 most important modalities. Our work provides preliminary evidence that computational tools can be used to help make an objective differential diagnosis.
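The multi-kernel fusion step in this record — combining per-modality kernels with learned weights — reduces to a weighted sum of kernel evaluations. A minimal sketch (Python; the modality names, linear kernels and fixed weights are illustrative assumptions, whereas the paper learns the weights by cross-validation):

```python
def linear_kernel(x, y):
    """Dot product; stands in here for each modality's kernel."""
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(kernels, weights, x, y):
    """Multi-kernel fusion: a weighted sum of per-modality kernel values."""
    return sum(w * k(x, y) for w, k in zip(weights, kernels))

# Two hypothetical modalities per subject, e.g. imaging and cognitive scores.
x = {"mri": [0.2, 0.5], "cognitive": [1.0, 0.0]}
y = {"mri": [0.1, 0.4], "cognitive": [0.9, 0.1]}
kernels = [lambda a, b: linear_kernel(a["mri"], b["mri"]),
           lambda a, b: linear_kernel(a["cognitive"], b["cognitive"])]
weights = [0.7, 0.3]   # the paper learns such weights by cross-validation
k_xy = combined_kernel(kernels, weights, x, y)
```

The learned weights double as the "importance of each modality" reported in the abstract.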

  14. Research on Commonly Used Data Mining Classification Algorithms

    Institute of Scientific and Technical Information of China (English)

    王明星; 刘锋

    2013-01-01

    Databases, data warehouses and other information repositories hold large volumes of data and knowledge relevant to decision-making in commercial, scientific and other activities. Data mining commonly supports data analysis with two methods, classification and prediction: the data in the database are first classified and summarized, more valuable data are then obtained according to the classification rules, and the information contained in these data can be used to predict future trends. Among common classification algorithms, the decision tree algorithm scales well, can be applied to large databases, can handle a variety of data types, converts its classification model easily into classification rules, and yields results that are plain and easy to understand. This paper first introduces several commonly used classification algorithms, and then describes the decision tree algorithm in detail, together with the advantages and disadvantages of classification algorithms in practical applications.
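The abstract's point that a decision tree converts directly into classification rules can be shown with a one-level tree (a decision stump) chosen by information gain; the income/credit data below are invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_stump(xs, ys):
    """One-level decision tree: the threshold with maximum information gain."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        gain = (entropy(ys)
                - len(left) / len(ys) * entropy(left)
                - len(right) / len(ys) * entropy(right))
        if best is None or gain > best[0]:
            best = (gain, t, majority(left), majority(right))
    return best

# Toy data: one numeric feature (income), hypothetical credit labels.
income = [20, 25, 30, 60, 65, 70]
label = ["deny", "deny", "deny", "approve", "approve", "approve"]
gain, thr, left_lbl, right_lbl = best_stump(income, label)
# The fitted tree reads off directly as a classification rule:
rule = f"IF income <= {thr} THEN {left_lbl} ELSE {right_lbl}"
```

A full tree simply recurses on each side of the split; every root-to-leaf path then yields one such rule.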

  15. A Novel User Classification Method for Femtocell Network by Using Affinity Propagation Algorithm and Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Afaz Uddin Ahmed

    2014-01-01

    Full Text Available An artificial neural network (ANN) and affinity propagation (AP) algorithm based user categorization technique is presented. The proposed algorithm is designed for a closed access femtocell network. The ANN is used for the user classification process and the AP algorithm is used to optimize the ANN training process: AP selects the best possible training samples for a faster ANN training cycle. The users are distinguished by using the difference of received signal strength in a multielement femtocell device. A previously developed directive microstrip antenna is used to configure the femtocell device. Simulation results show that, for a particular house pattern, the categorization technique without the AP algorithm requires 5 indoor users and 10 outdoor users to attain error-free operation. When the AP algorithm is integrated with the ANN, the system needs 60% fewer training samples, reducing the training time by up to 50%. This procedure makes the femtocell more effective for closed access operation.
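For reference, affinity propagation itself is plain message passing between "responsibility" and "availability" matrices; a minimal sketch (Python; the toy 1-D data and median-similarity preference are our assumptions — the paper applies AP to received-signal-strength samples to pick ANN training exemplars):

```python
def affinity_propagation(S, damping=0.5, iters=100):
    """Minimal affinity-propagation message passing on a similarity matrix S
    (list of lists). Returns, for each point, the index of its exemplar."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]   # responsibilities
    A = [[0.0] * n for _ in range(n)]   # availabilities
    for _ in range(iters):
        for i in range(n):
            for k in range(n):
                best = max(A[i][kk] + S[i][kk] for kk in range(n) if kk != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        for k in range(n):
            pos = [max(0.0, R[ii][k]) for ii in range(n)]
            for i in range(n):
                if i == k:
                    a = sum(pos) - pos[k]
                else:
                    a = min(0.0, R[k][k] + sum(pos) - pos[i] - pos[k])
                A[i][k] = damping * A[i][k] + (1 - damping) * a
    return [max(range(n), key=lambda k, i=i: A[i][k] + R[i][k])
            for i in range(n)]

points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
S = [[-(a - b) ** 2 for b in points] for a in points]
offdiag = sorted(S[i][k] for i in range(len(points))
                 for k in range(len(points)) if i != k)
for i in range(len(points)):
    S[i][i] = offdiag[len(offdiag) // 2]   # median similarity as preference
exemplars = affinity_propagation(S)
```

On data this cleanly separated the messages typically settle on one exemplar per cluster; those exemplars are what the paper feeds to the ANN as training samples.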

  16. Automatic classification of endogenous seismic sources within a landslide body using random forest algorithm

    Science.gov (United States)

    Provost, Floriane; Hibert, Clément; Malet, Jean-Philippe; Stumpf, André; Doubre, Cécile

    2016-04-01

    Different studies have shown the presence of microseismic activity in soft-rock landslides. The seismic signals exhibit significantly different features in the time and frequency domains, which allow their classification and interpretation. Most of the classes could be associated with different mechanisms of deformation occurring within and at the surface (e.g. rockfall, slide-quake, fissure opening, fluid circulation). However, some signals remain not fully understood and some classes contain few examples that prevent any interpretation. To move toward a more complete interpretation of the links between the dynamics of soft-rock landslides and the physical processes controlling their behaviour, a complete catalog of the endogenous seismicity is needed. We propose a multi-class detection method based on the random forest algorithm to automatically classify the source of seismic signals. Random forest is a supervised machine learning technique based on the computation of a large number of decision trees. The multiple decision trees are constructed from training sets including each of the target classes. In the case of seismic signals, these attributes may encompass spectral features but also waveform characteristics, multi-station observations and other relevant information. The Random Forest classifier is used because it provides state-of-the-art performance when compared with other machine learning techniques (e.g. SVM, Neural Networks) and requires no fine tuning. Furthermore, it is relatively fast, robust, easy to parallelize, and inherently suitable for multi-class problems. In this work, we present the first results of the classification method applied to the seismicity recorded at the Super-Sauze landslide between 2013 and 2015. We selected a dozen seismic signal features that precisely characterize a signal's spectral content (e.g. central frequency, spectrum width, energy in several frequency bands, spectrogram shape, and spectrum local and global maxima).
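Several of the listed signal features can be computed directly from a power spectrum; a self-contained sketch (Python, with a slow direct DFT; the feature definitions below are common choices, not necessarily the authors' exact ones):

```python
import math

def power_spectrum(signal):
    """Magnitude-squared DFT over the positive-frequency bins
    (direct O(N^2) transform -- adequate for a sketch)."""
    n = len(signal)
    spec = []
    for k in range(n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append(re * re + im * im)
    return spec

def spectral_features(signal):
    """Central frequency (spectral centroid), spectrum width (spread)
    and the bin of the global spectral maximum, in DFT-bin units."""
    p = power_spectrum(signal)
    total = sum(p)
    centroid = sum(k * pk for k, pk in enumerate(p)) / total
    width = math.sqrt(sum((k - centroid) ** 2 * pk
                          for k, pk in enumerate(p)) / total)
    peak = max(range(len(p)), key=p.__getitem__)
    return {"centroid": centroid, "width": width, "peak_bin": peak}

# A pure 10-cycle sinusoid concentrates its energy in DFT bin 10.
sig = [math.sin(2 * math.pi * 10 * t / 128) for t in range(128)]
feats = spectral_features(sig)
```

Feature vectors of this kind, one per event, are what a random forest classifier is trained on.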

  17. DTBF+ Algorithm Based on Network Behavior Preference Classification

    Institute of Scientific and Technical Information of China (English)

    杨忠明; 秦勇; 蔡昭权; 武玉刚

    2011-01-01

    Aiming at the problem of unreasonable application-layer traffic distribution in the DTBF dynamic token allocation algorithm under the best-effort mechanism, an improved algorithm, DTBF+, is proposed. It extends DTBF with a link application behavior preference classification mechanism built on the classification of the content users transfer. The algorithm dynamically allocates the surplus tokens of idle users' links to the busy links of non-Peer-to-Peer (P2P) applications, which effectively improves bandwidth utilization and reduces the blindness of resource allocation. Tests in a real environment show that the algorithm markedly improves the instantaneous bandwidth utilization of non-P2P application links.
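The token-reallocation idea can be sketched as a single pass over the links (Python; the link fields `tokens`, `demand` and `p2p` are our invention — the paper works at the level of a token-bucket scheduler):

```python
def reallocate_tokens(links):
    """DTBF+-style pass: reclaim surplus tokens from idle links, then top up
    busy non-P2P links. Returns the number of tokens left unassigned."""
    pool = 0
    for link in links:
        spare = link["tokens"] - link["demand"]
        if spare > 0:                        # idle link: reclaim its surplus
            link["tokens"] -= spare
            pool += spare
    for link in links:                       # busy non-P2P link: grant tokens
        if link["p2p"]:
            continue                         # P2P links get no extra tokens
        deficit = link["demand"] - link["tokens"]
        if deficit > 0:
            grant = min(deficit, pool)
            link["tokens"] += grant
            pool -= grant
    return pool

links = [
    {"tokens": 10, "demand": 2, "p2p": False},   # idle: 8 surplus tokens
    {"tokens": 5, "demand": 12, "p2p": False},   # busy non-P2P: deficit 7
    {"tokens": 5, "demand": 20, "p2p": True},    # busy P2P: left unchanged
]
leftover = reallocate_tokens(links)
```

Tokens are conserved: everything reclaimed is either granted to busy non-P2P links or returned as leftover.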

  18. The Self-Directed Violence Classification System and the Columbia Classification Algorithm for Suicide Assessment: A Crosswalk

    Science.gov (United States)

    Matarazzo, Bridget B.; Clemans, Tracy A.; Silverman, Morton M.; Brenner, Lisa A.

    2013-01-01

    The lack of a standardized nomenclature for suicide-related thoughts and behaviors prompted the Centers for Disease Control and Prevention, with the Veterans Integrated Service Network 19 Mental Illness Research Education and Clinical Center, to create the Self-Directed Violence Classification System (SDVCS). SDVCS has been adopted by the…

  19. Verdict Accuracy of Quick Reduct Algorithm using Clustering and Classification Techniques for Gene Expression Data

    Directory of Open Access Journals (Sweden)

    T.Chandrasekhar

    2012-01-01

    Full Text Available In most gene expression data, the number of training samples is very small compared to the large number of genes involved in the experiments. However, among the large number of genes, only a small fraction is effective for performing a certain task. Furthermore, a small subset of genes is desirable in developing gene expression based diagnostic tools for delivering reliable and understandable results. With the gene selection results, the cost of biological experiments and decisions can be greatly reduced by analyzing only the marker genes. An important application of gene expression data in functional genomics is to classify samples according to their gene expression profiles. Feature selection (FS) is a process which attempts to select more informative features. It is one of the important steps in knowledge discovery. Conventional supervised FS methods evaluate various feature subsets using an evaluation function or metric to select only those features which are related to the decision classes of the data under consideration. This paper studies a feature selection method based on rough set theory. Further, the K-Means and Fuzzy C-Means (FCM) algorithms have been implemented for the reduced feature set without considering class labels. The obtained results are then compared with the original class labels. A Back Propagation Network (BPN) has also been used for classification. The performance of K-Means, FCM, and BPN is then analyzed through the confusion matrix. It is found that the BPN performs comparatively well.
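Comparing unsupervised cluster assignments against the original class labels, as done here for K-Means and FCM, comes down to a confusion matrix and a purity-style score; a minimal sketch with invented two-class leukemia-style labels:

```python
from collections import Counter

def confusion_matrix(true_labels, cluster_ids):
    """Cross-tabulate cluster assignments against known class labels."""
    matrix = {}
    for t, c in zip(true_labels, cluster_ids):
        matrix.setdefault(c, Counter())[t] += 1
    return matrix

def purity(matrix, n_samples):
    """Fraction of samples falling in their cluster's majority class."""
    return sum(c.most_common(1)[0][1] for c in matrix.values()) / n_samples

# Invented two-class gene-expression labels and a 2-cluster result.
true = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
clusters = [0, 0, 1, 1, 1, 1]
m = confusion_matrix(true, clusters)
p = purity(m, len(true))
```

Here cluster 1 mixes one ALL sample with three AML samples, so purity is 5/6; the paper reads the same kind of table off for K-Means, FCM and BPN.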

  20. Performance analysis of image processing algorithms for classification of natural vegetation in the mountains of southern California

    Science.gov (United States)

    Yool, S. R.; Star, J. L.; Estes, J. E.; Botkin, D. B.; Eckhardt, D. W.

    1986-01-01

    The earth's forests fix carbon from the atmosphere during photosynthesis. Scientists are concerned that massive forest removals may promote an increase in atmospheric carbon dioxide, with possible global warming and related environmental effects. Space-based remote sensing may enable the production of accurate world forest maps needed to examine this concern objectively. To test the limits of remote sensing for large-area forest mapping, we use Landsat data acquired over a site in the forested mountains of southern California to examine the relative capacities of a variety of popular image processing algorithms to discriminate different forest types. Results indicate that certain algorithms are best suited to forest classification. Differences in performance between the algorithms tested appear related to variations in their sensitivities to spectral variations caused by background reflectance, differential illumination, and spatial pattern by species. Results emphasize the complexity of the relationships among the land-cover regime, the remotely sensed data, and the algorithms used to process these data.

  1. Business Analysis and Decision Making Through Unsupervised Classification of Mixed Data Type of Attributes Through Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Rohit Rastogi

    2014-01-01

    Full Text Available Grouping, or unsupervised classification, makes a variety of demands, the major one being the capability of the chosen clustering approach to scale and to handle mixed data sets. Data sets may contain categorical/nominal, ordinal, binary (symmetric or asymmetric), ratio and interval scaled variables. In the present scenario, the latest approaches to unsupervised classification are Swarm Optimization based, Customer Segmentation based, Soft Computing methods such as Fuzzy based and GA based, Entropy based methods and hierarchical approaches. These approaches have two serious bottlenecks: either they are hybrid mathematical techniques, or they demand large amounts of computation, which increases their complexity and hence compromises accuracy. It is easy to compare and analyze that unsupervised classification by Genetic Algorithm is feasible, suitable and efficient for high-dimensional data sets with mixed data values obtained from real life results, events and happenings.

  2. A Novel Algorithm for Fault Classification on Transmission Lines using a Combined Adaptive Network-based Fuzzy Inference System

    Energy Technology Data Exchange (ETDEWEB)

    Yeo, S.M.; Kim, C.H. [Sungkyunkwan University (Korea); Chai, Y.M. [Chungju National University (Korea); Choi, J.D. [Daelim College (Korea)

    2001-07-01

    Accurate detection and classification of faults on transmission lines is vitally important. High impedance faults (HIFs) in particular pose difficulties for the commonly employed conventional overcurrent and distance relays, and if not detected, can cause damage to expensive equipment, threaten life and cause fire hazards. Although HIFs are far less common than low impedance faults (LIFs), it is imperative that any protection device be able to deal satisfactorily with both HIFs and LIFs. This paper proposes an algorithm for fault detection and classification for both LIFs and HIFs using an Adaptive Network-based Fuzzy Inference System (ANFIS). The performance of the proposed algorithm is tested on a typical 154[kV] Korean transmission line system under various fault conditions. Test results show that the ANFIS can detect and classify faults (both LIFs and HIFs) accurately within half a cycle. (author). 11 refs., 7 figs., 3 tabs.

  3. Classification of Aerosol Retrievals from Spaceborne Polarimetry Using a Multi-Parameter Algorithm

    Science.gov (United States)

    Russell, P. B.; Kacenelenbogen, M. S.; Livingston, J. M.; Hasekamp, O.; Burton, S. P.; Schuster, G. L.; Redemann, J.; Ramachandran, S.; Holben, B. N.

    2013-12-01

    In this presentation we demonstrate application of a new aerosol classification algorithm to retrievals from the POLDER-3 polarimeter on the PARASOL spacecraft. Motivation and method: Since the development of global aerosol measurements by satellites and AERONET, classification of observed aerosols into several types (e.g., urban-industrial, biomass burning, mineral dust, maritime, and various subtypes or mixtures of these) has proven useful to: understanding aerosol sources, transformations, effects, and feedback mechanisms; improving accuracy of satellite retrievals; and quantifying assessments of aerosol radiative impacts on climate. With ongoing improvements in satellite measurement capability, the number of aerosol parameters retrieved from spaceborne sensors has been growing, from the initial aerosol optical depth at one or a few wavelengths to a list that now includes complex refractive index, single scattering albedo (SSA), and depolarization of backscatter, each at several wavelengths; wavelength dependences of extinction, scattering, absorption, SSA, and backscatter; and several particle size and shape parameters. Making optimal use of these varied data products requires objective, multi-dimensional analysis methods. We describe such a method, which uses a modified Mahalanobis distance to quantify how far a data point described by N aerosol parameters is from each of several prespecified classes. The method makes explicit use of uncertainties in input parameters, treating a point and its N-dimensional uncertainty as an extended data point or pseudo-cluster E. It then uses a modified Mahalanobis distance, D_EC, to assign that observation to the class (cluster) C that has minimum D_EC from the point (equivalently, the class to which the point has maximum probability of belonging). The method also uses Wilks' overall lambda to indicate how well the input data lend themselves to separation into classes and Wilks' partial lambda to indicate the relative
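Stripped of the uncertainty handling and the Wilks' lambda diagnostics, the core assignment rule is minimum Mahalanobis distance over prespecified classes; a 2-D sketch (Python) with two invented aerosol classes described by (Ångström exponent, UV Aerosol Index) means and diagonal covariances:

```python
def inv2(m):
    """Inverse of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mahalanobis2(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a class."""
    dx = [x[0] - mean[0], x[1] - mean[1]]
    p = inv2(cov)
    return (dx[0] * (p[0][0] * dx[0] + p[0][1] * dx[1])
            + dx[1] * (p[1][0] * dx[0] + p[1][1] * dx[1]))

# Two hypothetical aerosol classes: (Angstrom exponent, UV Aerosol Index).
classes = {
    "dust":    {"mean": (0.3, 2.0), "cov": [[0.04, 0.0], [0.0, 0.25]]},
    "biomass": {"mean": (1.8, 1.5), "cov": [[0.09, 0.0], [0.0, 0.16]]},
}

def classify(x):
    """Assign an observation to the class at minimum Mahalanobis distance."""
    return min(classes, key=lambda c: mahalanobis2(x, classes[c]["mean"],
                                                   classes[c]["cov"]))
```

The paper's D_EC additionally folds the observation's own N-dimensional uncertainty into the distance; this sketch shows only the plain class-covariance version.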

  4. Development and comparative assessment of Raman spectroscopic classification algorithms for lesion discrimination in stereotactic breast biopsies with microcalcifications

    Science.gov (United States)

    Dingari, Narahara Chari; Barman, Ishan; Saha, Anushree; McGee, Sasha; Galindo, Luis H.; Liu, Wendy; Plecha, Donna; Klein, Nina; Dasari, Ramachandra Rao; Fitzmaurice, Maryann

    2014-01-01

    Microcalcifications are an early mammographic sign of breast cancer and a target for stereotactic breast needle biopsy. Here, we develop and compare different approaches for developing Raman classification algorithms to diagnose invasive and in situ breast cancer, fibrocystic change and fibroadenoma that can be associated with microcalcifications. In this study, Raman spectra were acquired from tissue cores obtained from fresh breast biopsies and analyzed using a constituent-based breast model. Diagnostic algorithms based on the breast model fit coefficients were devised using logistic regression, C4.5 decision tree classification, k-nearest neighbor (k-NN) and support vector machine (SVM) analysis, and subjected to leave-one-out cross validation. The best performing algorithm was based on SVM analysis (with radial basis function), which yielded a positive predictive value of 100% and negative predictive value of 96% for cancer diagnosis. Importantly, these results demonstrate that Raman spectroscopy provides adequate diagnostic information for lesion discrimination even in the presence of microcalcifications, which to the best of our knowledge has not been previously reported. Raman spectroscopy and multivariate classification provide accurate discrimination among lesions in stereotactic breast biopsies, irrespective of microcalcification status. PMID:22815240
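The leave-one-out cross-validation protocol used in this record is easy to reproduce generically; a sketch with a stand-in 1-nearest-neighbour classifier and invented model-fit coefficients (the paper's actual classifiers were logistic regression, C4.5, k-NN and SVM):

```python
def nn_predict(train_x, train_y, x):
    """1-nearest-neighbour prediction with squared Euclidean distance."""
    dists = [(sum((a - b) ** 2 for a, b in zip(t, x)), y)
             for t, y in zip(train_x, train_y)]
    return min(dists, key=lambda d: d[0])[1]

def leave_one_out_accuracy(xs, ys, predict):
    """Hold out each sample in turn, train on the rest, score the held-out one."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

# Invented breast-model fit-coefficient pairs for two lesion classes.
xs = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85),
      (0.9, 0.1), (0.8, 0.2), (0.85, 0.15)]
ys = ["benign", "benign", "benign", "cancer", "cancer", "cancer"]
acc = leave_one_out_accuracy(xs, ys, nn_predict)
```

Any classifier with the same `(train_x, train_y, x)` signature can be dropped in, which is how the paper's four algorithms were compared under identical validation.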

  5. A global aerosol classification algorithm incorporating multiple satellite data sets of aerosol and trace gas abundances

    Science.gov (United States)

    Penning de Vries, M. J. M.; Beirle, S.; Hörmann, C.; Kaiser, J. W.; Stammes, P.; Tilstra, L. G.; Tuinder, O. N. E.; Wagner, T.

    2015-09-01

    Detecting the optical properties of aerosols using passive satellite-borne measurements alone is a difficult task due to the broadband effect of aerosols on the measured spectra and the influences of surface and cloud reflection. We present another approach to determine aerosol type, namely by studying the relationship of aerosol optical depth (AOD) with trace gas abundance, aerosol absorption, and mean aerosol size. Our new Global Aerosol Classification Algorithm, GACA, examines relationships between aerosol properties (AOD and extinction Ångström exponent from the Moderate Resolution Imaging Spectroradiometer (MODIS), UV Aerosol Index from the second Global Ozone Monitoring Experiment, GOME-2) and trace gas column densities (NO2, HCHO, SO2 from GOME-2, and CO from MOPITT, the Measurements of Pollution in the Troposphere instrument) on a monthly mean basis. First, aerosol types are separated based on size (Ångström exponent) and absorption (UV Aerosol Index), then the dominating sources are identified based on mean trace gas columns and their correlation with AOD. In this way, global maps of dominant aerosol type and main source type are constructed for each season and compared with maps of aerosol composition from the global MACC (Monitoring Atmospheric Composition and Climate) model. Although GACA cannot correctly characterize transported or mixed aerosols, GACA and MACC show good agreement regarding the global seasonal cycle, particularly for urban/industrial aerosols. The seasonal cycles of both aerosol type and source are also studied in more detail for selected 5° × 5° regions. Again, good agreement between GACA and MACC is found for all regions, but some systematic differences become apparent: the variability of aerosol composition (yearly and/or seasonal) is often not well captured by MACC, the amount of mineral dust outside of the dust belt appears to be overestimated, and the abundance of secondary organic aerosols is underestimated in comparison

  6. Classification-based summation of cerebral digital subtraction angiography series for image post-processing algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Schuldhaus, D; Spiegel, M; Polyanskaya, M; Hornegger, J [Pattern Recognition Lab, University Erlangen-Nuremberg (Germany); Redel, T [Siemens AG Healthcare Sector, Forchheim (Germany); Struffert, T; Doerfler, A, E-mail: martin.spiegel@informatik.uni-erlangen.de [Department of Neuroradiology, University Erlangen-Nuremberg (Germany)

    2011-03-21

    X-ray-based 2D digital subtraction angiography (DSA) plays a major role in the diagnosis, treatment planning and assessment of cerebrovascular disease, i.e. aneurysms, arteriovenous malformations and intracranial stenosis. DSA information is increasingly used for secondary image post-processing such as vessel segmentation, registration and comparison to hemodynamic calculation using computational fluid dynamics. Depending on the amount of injected contrast agent and the duration of injection, these DSA series may not exhibit one single DSA image showing the entire vessel tree. The interesting information for these algorithms, however, is usually depicted within a few images. If these images were combined into one image, the complexity of segmentation or registration methods using DSA series would decrease drastically. In this paper, we propose a novel method that automatically splits a DSA series into three parts, i.e. mask, arterial and parenchymal phase, to provide one final image showing all important vessels with less noise and fewer motion artifacts. This final image covers all arterial phase images, either by image summation or by taking the minimum intensities. The phase classification is done by a two-step approach. The mask/arterial phase border is determined by a Perceptron-based method trained from a set of DSA series. The arterial/parenchymal phase border is specified by a threshold-based method. The evaluation of the proposed method is two-sided: (1) comparison between automatic and medical expert-based phase selection and (2) the quality of the final image is measured by gradient magnitudes inside the vessels and signal-to-noise ratio (SNR) outside. Experimental results show a match between expert and automatic phase separation of 93%/50% and an average SNR increase of up to 182% compared to summing up the entire series.

  7. Classification-based summation of cerebral digital subtraction angiography series for image post-processing algorithms

    Science.gov (United States)

    Schuldhaus, D.; Spiegel, M.; Redel, T.; Polyanskaya, M.; Struffert, T.; Hornegger, J.; Doerfler, A.

    2011-03-01

    X-ray-based 2D digital subtraction angiography (DSA) plays a major role in the diagnosis, treatment planning and assessment of cerebrovascular disease, i.e. aneurysms, arteriovenous malformations and intracranial stenosis. DSA information is increasingly used for secondary image post-processing such as vessel segmentation, registration and comparison to hemodynamic calculation using computational fluid dynamics. Depending on the amount of injected contrast agent and the duration of injection, these DSA series may not exhibit one single DSA image showing the entire vessel tree. The interesting information for these algorithms, however, is usually depicted within a few images. If these images were combined into one image, the complexity of segmentation or registration methods using DSA series would decrease drastically. In this paper, we propose a novel method that automatically splits a DSA series into three parts, i.e. mask, arterial and parenchymal phase, to provide one final image showing all important vessels with less noise and fewer motion artifacts. This final image covers all arterial phase images, either by image summation or by taking the minimum intensities. The phase classification is done by a two-step approach. The mask/arterial phase border is determined by a Perceptron-based method trained from a set of DSA series. The arterial/parenchymal phase border is specified by a threshold-based method. The evaluation of the proposed method is two-sided: (1) comparison between automatic and medical expert-based phase selection and (2) the quality of the final image is measured by gradient magnitudes inside the vessels and signal-to-noise ratio (SNR) outside. Experimental results show a match between expert and automatic phase separation of 93%/50% and an average SNR increase of up to 182% compared to summing up the entire series.

  8. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    Science.gov (United States)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were: KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care
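The evaluation harness here — k-fold cross-validated accuracy used to rank several classifiers — can be sketched independently of any particular model; below with a simple nearest-centroid stand-in and invented one-dimensional tissue features (the paper used 10 folds over six tissue classes):

```python
from statistics import mean

def kfold_accuracy(xs, ys, predict, k=3):
    """Mean accuracy of a classifier over k interleaved folds."""
    n = len(xs)
    accs = []
    for f in range(k):
        test_idx = set(range(f, n, k))
        tr_x = [x for i, x in enumerate(xs) if i not in test_idx]
        tr_y = [y for i, y in enumerate(ys) if i not in test_idx]
        hits = sum(predict(tr_x, tr_y, xs[i]) == ys[i] for i in test_idx)
        accs.append(hits / len(test_idx))
    return mean(accs)

def nearest_centroid(tr_x, tr_y, x):
    """Predict the class whose training centroid is closest to x."""
    cents = {}
    for label in set(tr_y):
        pts = [p for p, y in zip(tr_x, tr_y) if y == label]
        cents[label] = tuple(sum(col) / len(pts) for col in zip(*pts))
    return min(cents, key=lambda l: sum((a - b) ** 2
                                        for a, b in zip(cents[l], x)))

# Invented, well-separated one-feature samples for two tissue classes.
xs = [(0.1,), (0.2,), (0.3,), (0.9,), (1.0,), (1.1,)]
ys = ["healthy"] * 3 + ["injury"] * 3
acc = kfold_accuracy(xs, ys, nearest_centroid, k=3)
```

Running the same harness over several `predict` functions and comparing mean accuracies is exactly the ranking procedure reported in the abstract.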

  9. Prediction of S-Nitrosylation Modification Sites Based on Kernel Sparse Representation Classification and mRMR Algorithm

    Directory of Open Access Journals (Sweden)

    Guohua Huang

    2014-01-01

    Full Text Available Protein S-nitrosylation plays a very important role in a wide variety of cellular biological activities. Hitherto, accurate prediction of S-nitrosylation sites remains a great challenge. In this paper, we present a framework to computationally predict S-nitrosylation sites based on kernel sparse representation classification and the minimum Redundancy Maximum Relevance (mRMR) algorithm. As many as 666 features derived from five categories of amino acid properties and one protein structure feature are used for the numerical representation of proteins. A total of 529 protein sequences collected from open-access databases and published literature are used to train and test our predictor. Computational results show that our predictor achieves Matthews’ correlation coefficients of 0.1634 and 0.2919 for the training set and the testing set, respectively, which are better than those of the k-nearest neighbor algorithm, random forest algorithm, and sparse representation classification algorithm. The experimental results also indicate that 134 optimal features can better represent the peptides of protein S-nitrosylation than the original 666 redundant features. Furthermore, we constructed an independent testing set of 113 protein sequences to evaluate the robustness of our predictor. The experimental results showed that our predictor also yielded good performance on the independent testing set, with a Matthews’ correlation coefficient of 0.2239.

  10. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays rely on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  11. A Spectral Signature Shape-Based Algorithm for Landsat Image Classification

    Directory of Open Access Journals (Sweden)

    Yuanyuan Chen

    2016-08-01

    Full Text Available Land-cover datasets are crucial for earth system modeling and human-nature interaction research at local, regional and global scales. They can be obtained from remotely sensed data using image classification methods. However, in image classification, spectral values have received considerable attention in most classification methods, while the shape of the spectral curve has seldom been used because it is difficult to quantify. This study presents a classification method based on the observation that the spectral curve is composed of segments and certain extreme values. The presented classification method quantifies the spectral curve shape and makes full use of the spectral shape differences among land covers to classify remotely sensed images. Using this method, classification maps from TM (Thematic Mapper) data were obtained with overall accuracies of 0.834 and 0.854 for two respective test areas. The approach presented in this paper, which differs from previous image classification methods that were mostly concerned with spectral "value" similarity characteristics, emphasizes the "shape" similarity characteristics of the spectral curve. Moreover, this study will be helpful for classification research on hyperspectral and multi-temporal images.
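One straightforward way to quantify spectral curve shape, in the spirit of this record, is to encode the sign of each segment's slope together with the position of the global maximum (this encoding is our illustrative assumption, not the authors' exact formulation):

```python
def shape_signature(spectrum):
    """Encode a spectral curve as the slope sign of each segment
    ('+', '-' or '0') plus the band index of the global maximum."""
    signs = "".join("+" if b > a else "-" if b < a else "0"
                    for a, b in zip(spectrum, spectrum[1:]))
    peak_band = max(range(len(spectrum)), key=spectrum.__getitem__)
    return signs, peak_band

def same_shape(s1, s2):
    """Two pixels match if their segment signs and global-maximum band
    agree, regardless of absolute reflectance values."""
    return shape_signature(s1) == shape_signature(s2)

# Hypothetical band reflectances: two vegetation pixels with different
# brightness but the same curve shape, and one water pixel.
veg_bright = [0.08, 0.10, 0.06, 0.45, 0.40, 0.35]
veg_dark   = [0.04, 0.05, 0.03, 0.30, 0.27, 0.22]
water      = [0.10, 0.08, 0.06, 0.04, 0.03, 0.02]
```

The point of a shape encoding is exactly what the toy data shows: the two vegetation pixels differ everywhere in value yet share a signature, while water does not.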

  12. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

    Directory of Open Access Journals (Sweden)

    Arran Schlosberg

    2014-05-01

    Full Text Available Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV.
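The core trick, using DEFLATE-compressed length as a proxy for expected variation, can be shown with Python's `zlib`. The residue columns below are hypothetical, and this is a sketch of the compression idea only, not the CompressGV scoring pipeline.

```python
import zlib

def compression_ratio(column):
    # DEFLATE-compressed length of a residue column, normalised by raw
    # length: conserved columns compress well (low ratio), variable
    # columns across many species do not (high ratio).
    raw = column.encode()
    return len(zlib.compress(raw, 9)) / len(raw)

# Hypothetical alignment columns across 20 species (one residue each).
conserved = "LLLLLLLLLLLLLLLLLLLL"
variable  = "LIVMFWLKTAGSYHQNEDCR"
print(compression_ratio(conserved) < compression_ratio(variable))  # True
```

A column's compressibility thus gives a cheap, alignment-size-aware measure of how much variation is "expected" at a position, which is what lets the adjusted Grantham measure stay stable as MSAs grow.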

  13. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by Combining Landsat, MODIS, and Secondary Data

    Directory of Open Access Journals (Sweden)

    Prasad S. Thenkabail

    2012-09-01

    Full Text Available The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan using mega file data cubes (MFDCs) involving data from Landsat Global Land Survey (GLS), Landsat Enhanced Thematic Mapper Plus (ETM+) 30 m, Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m time-series, a suite of secondary data (e.g., elevation, slope, precipitation, temperature), and in situ data. First, the process involved producing an accurate reference (or truth) cropland layer (TCL), consisting of cropland extent, areas, and irrigated vs. rainfed cropland areas, for the entire country of Tajikistan based on the MFDC of year 2005 (MFDC2005). The methods involved in producing the TCL included ISOCLASS clustering, Tasseled Cap bi-spectral plots, spectro-temporal characteristics from MODIS 250 m monthly normalized difference vegetation index (NDVI) maximum value composite (MVC) time-series, and textural characteristics of higher resolution imagery. The TCL statistics accurately matched the national statistics of Tajikistan for irrigated and rainfed croplands, where about 70% of croplands were irrigated and the rest rainfed. Second, a rule-based ACCA was developed to replicate the TCL accurately (~80% producer’s and user’s accuracies, or within 20% quantity disagreement, involving about 10 million Landsat 30 m sized cropland pixels of Tajikistan). Development of ACCA was an iterative process involving a series of rules that are coded, refined, tweaked, and re-coded till ACCA derived croplands (ACLs) match accurately with TCLs. Third, the ACCA derived

  14. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by combining Landsat, MODIS, and secondary data

    Science.gov (United States)

    Thenkabail, Prasad S.; Wu, Zhuoting

    2012-01-01

    The overarching goal of this research was to develop and demonstrate an automated Cropland Classification Algorithm (ACCA) that will rapidly, routinely, and accurately classify agricultural cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed) over large areas such as a country or a region through combination of multi-sensor remote sensing and secondary data. In this research, a rule-based ACCA was conceptualized, developed, and demonstrated for the country of Tajikistan using mega file data cubes (MFDCs) involving data from Landsat Global Land Survey (GLS), Landsat Enhanced Thematic Mapper Plus (ETM+) 30 m, Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m time-series, a suite of secondary data (e.g., elevation, slope, precipitation, temperature), and in situ data. First, the process involved producing an accurate reference (or truth) cropland layer (TCL), consisting of cropland extent, areas, and irrigated vs. rainfed cropland areas, for the entire country of Tajikistan based on MFDC of year 2005 (MFDC2005). The methods involved in producing TCL included using ISOCLASS clustering, Tasseled Cap bi-spectral plots, spectro-temporal characteristics from MODIS 250 m monthly normalized difference vegetation index (NDVI) maximum value composites (MVC) time-series, and textural characteristics of higher resolution imagery. The TCL statistics accurately matched with the national statistics of Tajikistan for irrigated and rainfed croplands, where about 70% of croplands were irrigated and the rest rainfed. Second, a rule-based ACCA was developed to replicate the TCL accurately (~80% producer’s and user’s accuracies or within 20% quantity disagreement involving about 10 million Landsat 30 m sized cropland pixels of Tajikistan). Development of ACCA was an iterative process involving a series of rules that are coded, refined, tweaked, and re-coded till ACCA derived croplands (ACLs) match accurately with TCLs. Third, the ACCA derived cropland
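A rule-based cropland classifier of this kind can be sketched as a cascade of thresholds on an NDVI time-series plus secondary data. The rules and thresholds below are illustrative inventions, not the published ACCA rule set.

```python
def classify_pixel(ndvi_series, elevation, precip_mm):
    # Toy rule cascade in the spirit of a rule-based cropland classifier.
    # All thresholds are invented for illustration.
    peak = max(ndvi_series)
    amplitude = peak - min(ndvi_series)
    if elevation > 3500 or peak < 0.3:
        return "non-cropland"        # too high or never green enough
    if amplitude > 0.4 and precip_mm < 300:
        return "irrigated cropland"  # strong green-up despite low rainfall
    if amplitude > 0.3:
        return "rainfed cropland"
    return "non-cropland"

# Hypothetical monthly NDVI values for three pixels.
print(classify_pixel([0.2, 0.4, 0.8, 0.5], elevation=900, precip_mm=180))
print(classify_pixel([0.2, 0.3, 0.6, 0.4], elevation=900, precip_mm=550))
print(classify_pixel([0.1, 0.15, 0.2, 0.1], elevation=900, precip_mm=400))
```

The real ACCA iterates on such rules against the truth layer until the derived cropland map matches it; the sketch only shows the shape a single pass of rules takes.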

  15. Testing the Generalization Efficiency of Oil Slick Classification Algorithm Using Multiple SAR Data for Deepwater Horizon Oil Spill

    Science.gov (United States)

    Ozkan, C.; Osmanoglu, B.; Sunar, F.; Staples, G.; Kalkan, K.; Balık Sanlı, F.

    2012-07-01

    Marine oil spills due to releases of crude oil from tankers, offshore platforms, drilling rigs and wells, etc. seriously affect the fragile marine and coastal ecosystem and cause political and environmental concern. A catastrophic explosion and subsequent fire on the Deepwater Horizon oil platform caused the platform to burn and sink, and oil leaked continuously between April 20th and July 15th of 2010, releasing about 780,000 m3 of crude oil into the Gulf of Mexico. Today, space-borne SAR sensors are extensively used for the detection of oil spills in the marine environment, as they are independent of sunlight, are not affected by cloudiness, and are more cost-effective than air patrolling due to their large area coverage. In this study, the generalization extent of an object-based classification algorithm was tested for oil spill detection using multiple SAR imagery data. Among many geometrical, physical and textural features, some of the more distinctive ones were selected to distinguish oil and look-alike objects from each other. The tested classifier was constructed from a Multilayer Perceptron Artificial Neural Network trained by ABC, LM and BP optimization algorithms. The training data for the classifier were constituted from SAR data of an oil spill that originated from Lebanon in 2007. The classifier was then applied to the Deepwater Horizon oil spill data in the Gulf of Mexico on RADARSAT-2 and ALOS PALSAR images to demonstrate the generalization efficiency of the oil slick classification algorithm.
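A multilayer perceptron trained by plain backpropagation, one of the three trainers the record mentions (BP; ABC and LM are not shown), can be sketched in pure Python. The two "oil vs. look-alike" features and labels below are invented toy data, not SAR measurements.

```python
import math, random

random.seed(1)
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

# One hidden layer; weights updated by plain backpropagation of
# squared error (the "BP" trainer).
n_in, n_hid = 2, 4
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [random.uniform(-1, 1) for _ in range(n_hid)]
W2 = [random.uniform(-1, 1) for _ in range(n_hid)]
b2 = 0.0

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return h, sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)

# Invented features: (backscatter darkness, shape complexity); label 1 = oil.
data = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.2, 0.1], 0), ([0.1, 0.2], 0)]
lr = 1.0
for _ in range(3000):
    for x, t in data:
        h, y = forward(x)
        d_out = (y - t) * y * (1 - y)            # output-layer delta
        for j in range(n_hid):
            d_hid = d_out * W2[j] * h[j] * (1 - h[j])
            W2[j] -= lr * d_out * h[j]
            b1[j] -= lr * d_hid
            for i in range(n_in):
                W1[j][i] -= lr * d_hid * x[i]
        b2 -= lr * d_out

print(forward([0.85, 0.85])[1], forward([0.15, 0.15])[1])
```

After training, the first output should sit near 1 (oil-like) and the second near 0 (look-alike); real pipelines feed many geometric and textural features, not two.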

  16. Development and comparative assessment of Raman spectroscopic classification algorithms for lesion discrimination in stereotactic breast biopsies with microcalcifications.

    Science.gov (United States)

    Dingari, Narahara Chari; Barman, Ishan; Saha, Anushree; McGee, Sasha; Galindo, Luis H; Liu, Wendy; Plecha, Donna; Klein, Nina; Dasari, Ramachandra Rao; Fitzmaurice, Maryann

    2013-04-01

    Microcalcifications are an early mammographic sign of breast cancer and a target for stereotactic breast needle biopsy. Here, we develop and compare different approaches for developing Raman classification algorithms to diagnose invasive and in situ breast cancer, fibrocystic change and fibroadenoma that can be associated with microcalcifications. In this study, Raman spectra were acquired from tissue cores obtained from fresh breast biopsies and analyzed using a constituent-based breast model. Diagnostic algorithms based on the breast model fit coefficients were devised using logistic regression, C4.5 decision tree classification, k-nearest neighbor (k-NN) and support vector machine (SVM) analysis, and subjected to leave-one-out cross validation. The best performing algorithm was based on SVM analysis (with radial basis function), which yielded a positive predictive value of 100% and negative predictive value of 96% for cancer diagnosis. Importantly, these results demonstrate that Raman spectroscopy provides adequate diagnostic information for lesion discrimination even in the presence of microcalcifications, which to the best of our knowledge has not been previously reported. PMID:22815240
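Leave-one-out cross validation, the validation scheme used above, can be sketched with a nearest-centroid classifier standing in for the SVM. The "fit coefficient" vectors below are invented, not real Raman model coefficients.

```python
def nearest_centroid(x, classes):
    # Assign x to the class with the closest mean feature vector.
    def dist2(a, b): return sum((i - j) ** 2 for i, j in zip(a, b))
    cents = {c: [sum(col) / len(pts) for col in zip(*pts)]
             for c, pts in classes.items()}
    return min(cents, key=lambda c: dist2(x, cents[c]))

def leave_one_out(samples):
    # samples: list of (feature_vector, label); hold each sample out in
    # turn, train on the rest, and report the fraction predicted right.
    hits = 0
    for i, (x, label) in enumerate(samples):
        classes = {}
        for j, (xj, lj) in enumerate(samples):
            if j != i:
                classes.setdefault(lj, []).append(xj)
        hits += nearest_centroid(x, classes) == label
    return hits / len(samples)

# Hypothetical two-coefficient vectors per biopsy core.
samples = [([1.0, 0.1], "cancer"), ([0.9, 0.2], "cancer"), ([1.1, 0.0], "cancer"),
           ([0.1, 1.0], "benign"), ([0.2, 0.9], "benign"), ([0.0, 1.1], "benign")]
print(leave_one_out(samples))  # 1.0 on this cleanly separated toy set
```

LOOCV is attractive for small biopsy datasets because every sample serves as a test case exactly once while nearly all data is used for training.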

  17. TEXTURE BASED LAND COVER CLASSIFICATION ALGORITHM USING GABOR WAVELET AND ANFIS CLASSIFIER

    Directory of Open Access Journals (Sweden)

    S. Jenicka

    2016-05-01

    Full Text Available Texture features play a predominant role in land cover classification of remotely sensed images. In this study, for extracting texture features from data intensive remotely sensed image, Gabor wavelet has been used. Gabor wavelet transform filters frequency components of an image through decomposition and produces useful features. For classification of fuzzy land cover patterns in the remotely sensed image, Adaptive Neuro Fuzzy Inference System (ANFIS) has been used. The strength of ANFIS classifier is that it combines the merits of fuzzy logic and neural network. Hence in this article, land cover classification of remotely sensed image has been performed using Gabor wavelet and ANFIS classifier. The classification accuracy of the classified image obtained is found to be 92.8%.
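A real-valued Gabor kernel and a simple texture-energy feature can be sketched straight from the filter's definition (a Gaussian envelope modulating an oriented cosine). The stripe image is synthetic, and no ANFIS stage is shown.

```python
import math

def gabor_kernel(size, theta, lam, sigma, gamma=0.5):
    # Real part of a 2-D Gabor filter: Gaussian envelope times a cosine
    # wave of wavelength lam oriented at angle theta.
    half = size // 2
    k = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            g = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(g * math.cos(2 * math.pi * xr / lam))
        k.append(row)
    return k

def filter_energy(img, kernel):
    # Mean squared filter response over valid positions: a basic
    # Gabor texture feature.
    half = len(kernel) // 2
    total, n = 0.0, 0
    for i in range(half, len(img) - half):
        for j in range(half, len(img[0]) - half):
            r = sum(kernel[u][v] * img[i + u - half][j + v - half]
                    for u in range(len(kernel)) for v in range(len(kernel)))
            total += r * r
            n += 1
    return total / n

# Vertical stripes of period 4 resonate with a matching Gabor wave.
stripes = [[(j % 4 < 2) * 1.0 for j in range(16)] for _ in range(16)]
flat = [[0.5] * 16 for _ in range(16)]
k = gabor_kernel(7, theta=0.0, lam=4.0, sigma=2.0)
print(filter_energy(stripes, k) > filter_energy(flat, k))  # True
```

A bank of such kernels at several orientations and wavelengths yields the per-pixel feature vector that a classifier like ANFIS would then consume.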

  18. Feasibility of Genetic Algorithm for Textile Defect Classification Using Neural Network

    Directory of Open Access Journals (Sweden)

    Md. Tarek Habib

    2012-08-01

    Full Text Available The global market for the textile industry is highly competitive nowadays. Quality control in the production process in the textile industry has been a key factor for retaining existence in such a competitive market. Automated textile inspection systems are very useful in this respect, because manual inspection is time consuming and not accurate enough. Hence, automated textile inspection systems have been drawing plenty of attention of researchers of different countries in order to replace manual inspection. Defect detection and defect classification are the two major problems that are posed by the research of automated textile inspection systems. In this paper, we perform an extensive investigation on the applicability of the genetic algorithm (GA) in the context of textile defect classification using a neural network (NN). We observe the effect of tuning different network parameters and explain the reasons. We empirically find a suitable NN model in the context of textile defect classification. We compare the performance of this model with that of the classification models implemented by others.
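The GA-for-network-parameters investigation can be sketched with a toy genetic algorithm searching over hidden-unit count and learning rate. The fitness function below is an invented stand-in, not a textile-defect benchmark.

```python
import random

random.seed(7)

# Hypothetical search space: hidden units and an index into a
# learning-rate table.
HIDDEN = list(range(2, 33))
LRATES = [0.001, 0.01, 0.1, 0.5]

def fitness(ind):
    # Pretend validation accuracy peaking at 12 hidden units, lr 0.01;
    # a real GA would train and score the NN here.
    h, lr_i = ind
    return -abs(h - 12) - 3 * abs(lr_i - 1)

def mutate(ind):
    h, lr_i = ind
    if random.random() < 0.5:
        h = max(2, min(32, h + random.choice([-2, -1, 1, 2])))
    else:
        lr_i = random.randrange(len(LRATES))
    return (h, lr_i)

def crossover(a, b):
    return (a[0], b[1])  # hidden units from one parent, lr from the other

pop = [(random.choice(HIDDEN), random.randrange(len(LRATES))) for _ in range(20)]
for _ in range(40):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                       # elitist selection
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(10)]

best = max(pop, key=fitness)
print(best[0], LRATES[best[1]])  # drifts towards 12 hidden units, lr 0.01
```

Selection, crossover and mutation over encoded hyper-parameters is exactly the loop such feasibility studies run, with the expensive NN training standing where the toy fitness is.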

  19. DOA Estimation of Low Altitude Target Based on Adaptive Step Glowworm Swarm Optimization-multiple Signal Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Zhou Hao

    2015-06-01

    Full Text Available The traditional MUltiple SIgnal Classification (MUSIC) algorithm requires significant computational effort and can not be employed for the Direction Of Arrival (DOA) estimation of targets in a low-altitude multipath environment. As such, a novel MUSIC approach is proposed on the basis of the Adaptive Step Glowworm Swarm Optimization (ASGSO) algorithm. The virtual spatial smoothing of the matrix formed by each snapshot is used to realize the decorrelation of the multipath signal and the establishment of a full-order correlation matrix. ASGSO optimizes the function and estimates the elevation of the target. The simulation results suggest that the proposed method can overcome the low-altitude multipath effect and estimate the DOA of the target readily and precisely without radar effective aperture loss.
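Plain narrowband MUSIC on a uniform linear array can be sketched with NumPy; the multipath decorrelation and glowworm-swarm search of the paper are not reproduced, and the array geometry and angles below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

M, d, N = 8, 0.5, 200          # sensors, spacing (wavelengths), snapshots
true_deg = [-20.0, 25.0]       # arbitrary source directions

def steering(theta_deg):
    # ULA steering vector for a plane wave from theta_deg.
    return np.exp(2j * np.pi * d * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

A = np.column_stack([steering(t) for t in true_deg])
S = rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))
noise = 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
X = A @ S + noise

R = X @ X.conj().T / N          # sample covariance
_, vecs = np.linalg.eigh(R)     # eigenvalues ascending
En = vecs[:, :M - 2]            # noise subspace (source count assumed known)

grid = np.arange(-90.0, 90.0, 0.1)
P = np.array([1.0 / np.linalg.norm(En.conj().T @ steering(t)) ** 2
              for t in grid])   # MUSIC pseudo-spectrum
locmax = [(P[i], grid[i]) for i in range(1, len(grid) - 1)
          if P[i - 1] < P[i] > P[i + 1]]
est = sorted(t for _, t in sorted(locmax, reverse=True)[:2])
print([round(t, 1) for t in est])  # close to [-20.0, 25.0]
```

The exhaustive grid scan is the costly step the paper replaces with a swarm search, and coherent multipath is precisely what breaks the covariance rank that this plain version relies on.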

  20. Greedy heuristic algorithm for solving series of EEE components classification problems

    Science.gov (United States)

    Kazakovtsev, A. L.; Antamoshkin, A. N.; Fedosov, V. V.

    2016-04-01

    Algorithms based on the agglomerative greedy heuristics demonstrate precise and stable results for clustering problems based on k-means and p-median models. Such algorithms are successfully implemented in the production of specialized EEE components for use in space systems, which includes testing each EEE device and detecting homogeneous production batches of the EEE components based on the results of the tests using p-median models. In this paper, the authors propose a new version of the genetic algorithm with the greedy agglomerative heuristic which allows solving a series of problems. Such an algorithm is useful for solving the k-means and p-median clustering problems when the number of clusters is unknown. Computational experiments on real data show that the preciseness of the result decreases insignificantly in comparison with the initial genetic algorithm for solving a single problem.
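The greedy agglomerative idea, start with surplus centres and repeatedly remove the least useful one, can be sketched in one dimension with centres restricted to data points (closer to the p-median flavour). The "batch" measurements below are invented.

```python
import random

random.seed(3)

def assign_cost(points, centres):
    # Total squared distance of every point to its nearest centre.
    return sum(min((p - c) ** 2 for c in centres) for p in points)

def greedy_kmedian_1d(points, k, start_k=8):
    # Agglomerative greedy heuristic: begin with an excess of centres,
    # then drop the centre whose removal raises total cost least,
    # until only k centres remain.
    centres = random.sample(points, start_k)
    while len(centres) > k:
        best = min(range(len(centres)),
                   key=lambda i: assign_cost(points, centres[:i] + centres[i + 1:]))
        centres.pop(best)
    return sorted(centres)

# Hypothetical 1-D test measurements of two production batches.
batch = [1.0, 1.1, 0.9, 1.05, 5.0, 5.2, 4.9, 5.1]
print(greedy_kmedian_1d(batch, k=2))  # one centre near 1, one near 5
```

Because removals are scored against the full assignment cost, the heuristic never drops the last representative of a well-separated batch, which is what makes it usable when sweeping over unknown cluster counts.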

  1. Comparative analysis of different implementations of a parallel algorithm for automatic target detection and classification of hyperspectral images

    Science.gov (United States)

    Paz, Abel; Plaza, Antonio; Plaza, Javier

    2009-08-01

    Automatic target detection in hyperspectral images is a task that has attracted a lot of attention recently. In the last few years, several algorithms have been developed for this purpose, including the well-known RX algorithm for anomaly detection, or the automatic target detection and classification algorithm (ATDCA), which uses an orthogonal subspace projection (OSP) approach to extract a set of spectrally distinct targets automatically from the input hyperspectral data. Depending on the complexity and dimensionality of the analyzed image scene, the target/anomaly detection process may be computationally very expensive, a fact that limits the possibility of utilizing this process in time-critical applications. In this paper, we develop computationally efficient parallel versions of both the RX and ATDCA algorithms for near real-time exploitation of these algorithms. In the case of ATDCA, we use several distance metrics in addition to the OSP approach. The parallel versions are quantitatively compared in terms of target detection accuracy, using hyperspectral data collected by NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) over the World Trade Center in New York, five days after the terrorist attack of September 11th, 2001, and also in terms of parallel performance, using a massively parallel Beowulf cluster available at NASA's Goddard Space Flight Center in Maryland.

  2. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data.

    Science.gov (United States)

    Yang, Sheng; Guo, Li; Shao, Fang; Zhao, Yang; Chen, Feng

    2015-01-01

    Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genome Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes. PMID:26508990

  3. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data

    Directory of Open Access Journals (Sweden)

    Sheng Yang

    2015-01-01

    Full Text Available Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genome Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes.
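The Apriori step that surfaced the frequent miRNA item sets can be sketched with a classic level-wise implementation. The per-dataset miRNA sets below are invented groupings of the names quoted in the abstract, not TCGA data.

```python
from itertools import combinations

def apriori(transactions, min_support):
    # Classic level-wise Apriori: an itemset can only be frequent if
    # all of its subsets are frequent (the anti-monotone property).
    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent, level = {}, [frozenset([i]) for i in items]
    while level:
        level = [s for s in level if support(s) >= min_support]
        frequent.update({s: support(s) for s in level})
        nxt = {a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1}
        level = [s for s in nxt
                 if all(frozenset(c) in frequent
                        for c in combinations(s, len(s) - 1))]
    return frequent

# Hypothetical deregulated-miRNA sets per dataset.
datasets = [frozenset(t) for t in [
    {"mir-133a", "mir-133b", "mir-96"},
    {"mir-133a", "mir-133b", "mir-183"},
    {"mir-133a", "mir-133b", "mir-937"},
    {"mir-133a", "mir-96", "mir-183"},
]]
freq = apriori(datasets, min_support=0.75)
print(sorted(tuple(sorted(s)) for s in freq if len(s) >= 2))
# [('mir-133a', 'mir-133b')]
```

Pruning candidates by their subsets is what keeps the search tractable even though the number of possible item sets is exponential in the number of miRNAs.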

  4. A Benchmark Data Set to Evaluate the Illumination Robustness of Image Processing Algorithms for Object Segmentation and Classification.

    Science.gov (United States)

    Khan, Arif Ul Maula; Mikut, Ralf; Reischl, Markus

    2015-01-01

    Developers of image processing routines rely on benchmark data sets to give qualitative comparisons of new image analysis algorithms and pipelines. Such data sets need to include artifacts in order to occlude and distort the required information to be extracted from an image. Robustness, i.e. the quality of an algorithm with respect to the amount of distortion, is often important. However, with the available benchmark data sets an evaluation of illumination robustness is difficult or even impossible due to missing ground truth data about object margins and classes and missing information about the distortion. We present a new framework for robustness evaluation. The key aspect is an image benchmark containing 9 object classes and the required ground truth for segmentation and classification. Varying levels of shading and background noise are integrated to distort the data set. To quantify the illumination robustness, we provide measures for image quality, segmentation and classification success, and robustness. We set a high value on giving users easy access to the new benchmark; therefore, all routines are provided within a software package, but can just as easily be replaced to emphasize other aspects.

  5. An enhanced algorithm for knee joint sound classification using feature extraction based on time-frequency analysis.

    Science.gov (United States)

    Kim, Keo Sik; Seo, Jeong Hwan; Kang, Jin U; Song, Chul Gyu

    2009-05-01

    Vibroarthrographic (VAG) signals, generated by human knee movement, are non-stationary and multi-component in nature, and their time-frequency distribution (TFD) provides a powerful means to analyze such signals. The objective of this paper is to improve the classification accuracy of the features, obtained from the TFD of normal and abnormal VAG signals, using segmentation by dynamic time warping (DTW) and a denoising algorithm based on singular value decomposition (SVD). VAG and knee angle signals, recorded simultaneously during one flexion and one extension of the knee, were segmented and normalized at 0.5 Hz by the DTW method. Also, the noise within the TFD of the segmented VAG signals was reduced by the SVD algorithm, and a back-propagation neural network (BPNN) was used to classify the normal and abnormal VAG signals. The characteristic parameters of VAG signals consist of the energy, energy spread, frequency and frequency spread parameters extracted from the TFD. A total of 1408 segments (normal 1031, abnormal 377) were used for training and evaluating the BPNN. As a result, the average classification accuracy was 91.4% (standard deviation ±1.7%). The proposed method showed good potential for the non-invasive diagnosis and monitoring of joint disorders such as osteoarthritis. PMID:19217685
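Dynamic time warping, used above to align and segment the signals, reduces to a small dynamic program. The waveforms below are toy sequences, not VAG recordings.

```python
def dtw(a, b):
    # Classic dynamic-programming DTW distance: D[i][j] is the minimal
    # cumulative cost of aligning a[:i] with b[:j].
    INF = float("inf")
    D = [[INF] * (len(b) + 1) for _ in range(len(a) + 1)]
    D[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[len(a)][len(b)]

# A time-stretched copy of a waveform stays close under DTW,
# while a different waveform does not.
template = [0, 1, 2, 3, 2, 1, 0]
stretched = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]
other = [3, 2, 1, 0, 1, 2, 3]
print(dtw(template, stretched))  # 0.0
print(dtw(template, other) > 0)  # True
```

This insensitivity to local time stretching is exactly why DTW can normalize knee cycles recorded at slightly different speeds before the TFD features are computed.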

  6. A novel algorithm for fault classification in transmission lines using a combined adaptive network and fuzzy inference system

    Energy Technology Data Exchange (ETDEWEB)

    Yeo, S.M.; Kim, C.H.; Hong, K.S. [Sungkyunkwan Univ., Suwon (Korea). School of Information and Computer Engineering; Lim, Y.B. [LG Electronics CDMA Handsets Lab., Seoul (Korea); Aggarwal, R.K.; Johns, A.T. [University of Bath (United Kingdom). Dept. of Electronic and Electrical Engineering; Choi, M.S. [Myongji Univ., Yongin (Korea). Division of Electrical and Information Control Engineering

    2003-11-01

    Accurate detection and classification of faults on transmission lines is vitally important. In this respect, many different types of faults occur, inter alia low impedance faults (LIF) and high impedance faults (HIF). The latter in particular pose difficulties for the commonly employed conventional overcurrent and distance relays, and if not detected, can cause damage to expensive equipment, threaten life and cause fire hazards. Although HIFs are far less common than LIFs, it is imperative that any protection device should be able to satisfactorily deal with both HIFs and LIFs. Because of the randomness and asymmetric characteristics of HIFs, the modelling of HIFs is difficult and many papers relating to various HIF models have been published. In this paper, the model of HIFs in transmission lines is accomplished using the characteristics of a ZnO arrester, which is then implemented within the overall transmission system model based on the electromagnetic transients programme. This paper proposes an algorithm for fault detection and classification for both LIFs and HIFs using the Adaptive Network-based Fuzzy Inference System (ANFIS). The inputs into ANFIS are current signals only, based on Root-Mean-Square values of three-phase currents and zero sequence current. The performance of the proposed algorithm is tested on a typical 154 kV Korean transmission line system under various fault conditions. Test results show that the ANFIS can detect and classify faults (including LIFs and HIFs) accurately within half a cycle. (author)

  7. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    the subtype classification 3 data sets solely on the basis of molecular-level monitoring. Compared to unsupervised clustering, the supervised method performed better for discriminating between cancer types and cancer subtypes for the leukemia data set. The performance of the proposed method, using only...

  8. Deciphering behavioral changes in animal movement with a ‘multiple change point algorithm-classification tree’ framework

    Directory of Open Access Journals (Sweden)

    Bénédicte Madon

    2014-07-01

    Full Text Available The recent development of tracking tools has improved our nascent knowledge of animal movement. Because of model complexity, unrealistic a priori hypotheses and heavy computational resources, behavioral changes along an animal path are still often assessed visually. A new avenue has recently been opened with change point algorithms, because tracking data can be organized as time series with potential periodic change points segregating the movement into segments of different statistical properties. So far this approach was restricted to single change point detection, and we propose a straightforward analytical framework based on a recent multiple change point algorithm: the PELT algorithm, a dynamic programming pruning search method to find, within time series, the optimal combination of number and locations of change points. Data segments found by the algorithm are then sorted with a supervised classification tree procedure to organize segments by movement classes. We apply this framework to investigate changes in variance in daily distances of a migratory bird, the Macqueen’s Bustard, Chlamydotis macqueenii, and describe its movements in three classes: staging, non-migratory and migratory movements. Using simulation experiments, we show that the algorithm is robust in identifying exact behavioral shifts (on average more than 80% of the time) but that positive autocorrelation, when present, is likely to lead to the detection of false change points (in 36% of the iterations, with an average of 1.97 (se = 0.06) additional change points). A case study is provided to illustrate the biases associated with visual analysis of movement patterns compared to the reliability of our analytical framework. Technological improvement will provide new opportunities for the study of animal behavior, bringing along huge and various data sets, a growing challenge for biologists, and this straightforward and standardized framework could be an asset in the attempt to
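The multiple change point search can be sketched as exact optimal partitioning, i.e. PELT without its pruning step, here for changes in mean rather than variance to keep the sketch short. The series below is synthetic, not bustard telemetry.

```python
def changepoints_mean(x, penalty):
    # Exact optimal partitioning for changes in mean: minimise the sum
    # of per-segment squared errors plus a fixed penalty per segment.
    # (PELT adds a pruning rule to this same recursion.)
    n = len(x)
    cs, cs2 = [0.0], [0.0]
    for v in x:                      # prefix sums for O(1) segment costs
        cs.append(cs[-1] + v)
        cs2.append(cs2[-1] + v * v)
    def seg_cost(i, j):              # squared error of x[i:j] about its mean
        s, s2, m = cs[j] - cs[i], cs2[j] - cs2[i], j - i
        return s2 - s * s / m
    F = [0.0] + [float("inf")] * n   # F[t]: best cost of x[:t]
    last = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            c = F[s] + seg_cost(s, t) + penalty
            if c < F[t]:
                F[t], last[t] = c, s
    cps, t = [], n                   # backtrack the optimal breaks
    while t > 0:
        t = last[t]
        if t:
            cps.append(t)
    return sorted(cps)

# Toy daily-distance-like series with two regime shifts.
series = [1.0] * 20 + [5.0] * 20 + [1.5] * 20
print(changepoints_mean(series, penalty=3.0))  # [20, 40]
```

The penalty term is what keeps the optimizer from splitting endlessly: an extra break is accepted only when it reduces the residual cost by more than the penalty.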

  9. DETERMINATION OF OPTIMUM CLASSIFICATION SYSTEM FOR HYPERSPECTRAL IMAGERY AND LIDAR DATA BASED ON BEES ALGORITHM

    OpenAIRE

    Samadzadegan, F.; Hasani, H.

    2015-01-01

    Hyperspectral imagery is a rich source of spectral information and plays a very important role in the discrimination of similar land-cover classes. In the past, several efforts have investigated improvement of hyperspectral imagery classification. Recently the interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR can provide structural information about the scene while hyperspectral imagery provides spectral and spatial information. The comp...

  10. Analytical Study of Some Selected Classification Algorithms in WEKA Using Real Crime Data

    Directory of Open Access Journals (Sweden)

    Obuandike Georgina N.

    2015-12-01

    Full Text Available Data mining in the field of computer science is an answered prayer to the demands of this digital age. It is used to unravel hidden information from large volumes of data, usually kept in data repositories, to help improve management decision making. Classification is an essential task in data mining which is used to predict unknown class labels. It has been applied in the classification of different types of data. There are different techniques that can be applied in building a classification model. In this study, the performance of techniques such as J48 (a type of decision tree classifier), Naïve Bayes (a classifier that applies probability functions) and ZeroR (a rule induction classifier) is examined. These classifiers are tested using real crime data collected from the Nigeria Prisons Service. The metrics used to measure the performance of each classifier include accuracy, time, True Positive Rate (TP Rate), False Positive Rate (FP Rate), Kappa Statistic, Precision and Recall. The study showed that the J48 classifier has the highest accuracy compared to the other two classifiers in consideration. Choosing the right classifier for a data mining task will help increase the mining accuracy.
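ZeroR, the simplest of the three classifiers, predicts the majority class regardless of the features, which is why accuracy alone can mislead and metrics like TP Rate and Recall matter. The label counts below are hypothetical, not the Nigeria Prisons data.

```python
from collections import Counter

def zero_r(train_labels):
    # ZeroR ignores all attributes and always predicts the training
    # majority class: the floor any real classifier must beat.
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: majority

# Hypothetical imbalanced labels.
train = ["no-reoffend"] * 80 + ["reoffend"] * 20
predict = zero_r(train)

test = ["no-reoffend"] * 40 + ["reoffend"] * 10
acc = sum(predict(None) == y for y in test) / len(test)
print(acc)  # 0.8 despite never detecting a single "reoffend" case
```

On imbalanced data a 0.8 accuracy from ZeroR looks respectable while its recall on the minority class is exactly zero, which is why the study reports TP/FP rates and Kappa alongside accuracy.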

  11. REAL TIME CLASSIFICATION AND CLUSTERING OF IDS ALERTS USING MACHINE LEARNING ALGORITHMS

    Directory of Open Access Journals (Sweden)

    T. Subbulakshmi

    2010-01-01

    Full Text Available Intrusion Detection Systems (IDS) monitor a secured network for evidence of malicious activities originating either inside or outside. Upon identifying suspicious traffic, the IDS generates and logs an alert. Unfortunately, most of the alerts generated are either false positives, i.e. benign traffic that has been classified as intrusions, or irrelevant, i.e. attacks that are not successful. The abundance of false positive alerts makes it difficult for the security analyst to find successful attacks and take remedial action. This paper describes a two-phase automatic alert classification system to assist the human analyst in identifying the false positives. In the first phase, the alerts collected from one or more sensors are normalized and similar alerts are grouped to form a meta-alert. These meta-alerts are passively verified with an asset database to find out irrelevant alerts. In addition, an optional alert generalization is also performed for root cause analysis, thereby reducing false positives with human interaction. In the second phase, the reduced alerts are labeled and passed to an alert classifier which uses machine learning techniques for building the classification rules. This helps the analyst in automatic classification of the alerts. The system is tested in real environments and found to be effective in reducing the number of alerts as well as false positives dramatically, thereby reducing the workload of the human analyst.

  12. Direction of Arrival Estimation using Multiple Signal Classification, Estimation of Signal Parameters via Rotational Invariance Techniques and Maximum-likelihood Algorithms for Antenna Arrays

    Directory of Open Access Journals (Sweden)

    Yuvaraja, T.

    2015-12-01

    Full Text Available In this paper, a comparison of the performance of three famous eigenstructure-based Direction of Arrival (DOA) algorithms, the MUltiple SIgnal Classification (MUSIC) algorithm, the Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) algorithm, and a non-subspace method, maximum-likelihood estimation (MLE), has been extensively studied in this research work. The performance of these DOA estimation algorithms is evaluated on a Uniform Linear Array (ULA). We estimated various DOAs using MATLAB; results show that the MUSIC algorithm is more accurate and stable compared to the ESPRIT and MLE algorithms.

  13. Scalable Algorithms for Unsupervised Classification and Anomaly Detection in Large Geospatiotemporal Data Sets

    Science.gov (United States)

    Mills, R. T.; Hoffman, F. M.; Kumar, J.

    2015-12-01

    The increasing availability of high-resolution geospatiotemporal datasets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery and mining of ecological data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe some unsupervised knowledge discovery and anomaly detection approaches based on highly scalable parallel algorithms for k-means clustering and singular value decomposition, consider a few practical applications thereof to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
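Low-rank structure plus reconstruction error is one standard way to combine the singular value decomposition with anomaly detection, as described above. The rank-3 data and the injected outlier below are synthetic, and this single-node NumPy sketch omits the parallel, hierarchical machinery the abstract is about.

```python
import numpy as np

rng = np.random.default_rng(42)

# Build data with a rank-3 "climate signal" plus small noise, then
# inject one anomalous record.
base = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 12))
X = base + 0.05 * rng.normal(size=base.shape)
X[7] += 2.0 * rng.normal(size=12)       # the anomaly to be found

# Score each record by its reconstruction error outside the dominant
# low-rank structure captured by a truncated SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
recon = (U[:, :k] * s[:k]) @ Vt[:k]     # rank-k reconstruction
score = np.linalg.norm(Xc - recon, axis=1)
print(int(score.argmax()))              # index of the injected record
```

Records well explained by the leading singular vectors score near zero; the injected record's perturbation lies mostly outside that subspace, so its residual dominates. The same residual idea scales to the parallel SVD and k-means machinery the abstract describes.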

  14. Algorithm for the classification of multi-modulating signals on the electrocardiogram.

    Science.gov (United States)

    Mita, Mitsuo

    2007-03-01

    This article discusses an algorithm that measures the electrocardiogram (ECG) and respiration simultaneously and has diagnostic potential for sleep apnoea from ECG recordings. The algorithm combines three particular scale transforms, a(j)(t), u(j)(t), and o(j)(a(j)), with the statistical Fourier transform (SFT). The time and magnitude scale transforms a(j)(t) and u(j)(t) turn the source into a periodic signal, and tau(j) = o(j)(a(j)) confines its harmonics to a few instantaneous components, tau(j) being a common instant on the two scales t and tau(j). As a result, the multi-modulating source is decomposed by the SFT and reconstructed into ECG, respiration and other signals by the inverse transform. The algorithm is expected to extract partial ventilation and heart rate variability from the scale transforms among a(j)(t), a(j+1)(t) and u(j+1)(t), joining with each modulation. The algorithm has high potential as a clinical checkup for the diagnosis of sleep apnoea from ECG recordings.

  15. An Index-Inspired Algorithm for Anytime Classification on Evolving Data Streams

    DEFF Research Database (Denmark)

    Kranen, Phillip; Assent, Ira; Seidl, Thomas

    2012-01-01

    Due to the ever growing presence of data streams there has been a considerable amount of research on stream data mining over the past years. Anytime algorithms are particularly well suited for stream mining, since they flexibly use all available time on streams of varying data rates, and are also...

  16. Remote Sensing Classification based on Improved Ant Colony Rules Mining Algorithm

    Directory of Open Access Journals (Sweden)

    Shuying Liu

    2014-09-01

    Full Text Available Data mining can uncover previously undetected relationships among data items using automated data analysis techniques. In data mining, association rule mining is a prevalent and well researched method for discovering useful relations between variables in large databases. This paper investigates the principle of traditional rule mining, which produces many non-essential candidate sets when it reads data into candidate items. In particular, when dealing with massive data, if the minimum support and minimum confidence are relatively small, a combinatorial explosion of frequent item sets occurs and the required computing power and storage space are likely to exceed the limits of the machine. A new ant colony algorithm based on the conventional Ant-Miner algorithm is proposed and used for rule mining. The formula measuring rule effectiveness is improved and a new pheromone concentration update strategy is introduced. The experimental results show that the proposed algorithm has a lower execution time than the traditional algorithm, with better accuracy.

  17. Mucinous Adenocarcinoma Involving the Ovary: Comparative Evaluation of the Classification Algorithms using Tumor Size and Laterality

    Science.gov (United States)

    Jung, Eun Sun; Bae, Jeong Hoon; Choi, Yeong Jin; Park, Jong-Sup; Lee, Kyo-Young

    2010-01-01

    For intraoperative consultation on mucinous adenocarcinoma involving the ovary, it would be useful to have additional assessment approaches beyond the traditionally limited microscopic findings in order to determine the nature of the tumors. Mucinous adenocarcinomas involving the ovaries were evaluated in 91 cases of metastatic mucinous adenocarcinomas and 19 cases of primary mucinous adenocarcinomas using both an original algorithm (unilateral ≥10 cm tumors were considered primary and unilateral <10 cm tumors or bilateral tumors were considered metastatic) and a modified cut-off size algorithm. With 10 cm, 13 cm, and 15 cm size cut-offs, the algorithm correctly classified primary and metastatic tumors in 82.7%, 87.3%, and 89.1% of cases and in 80.6%, 84.9%, and 87.1% of signet ring cell carcinoma (SRC) excluded cases. In total cases and SRC excluded cases, 98.0% and 97.2% of bilateral tumors were metastatic and 100% and 100% of unilateral tumors <10 cm were metastatic, respectively. In total cases and SRC excluded cases, 68.4% and 68.4% of unilateral tumors ≥15 cm were primary, respectively. The diagnostic algorithm using size and laterality, in addition to clinical history, preoperative image findings, and operative findings, is a useful adjunct tool for differentiation of metastatic mucinous adenocarcinomas from primary mucinous adenocarcinomas of the ovary. PMID:20119573

  18. Pap-smear Classification Using Efficient Second Order Neural Network Training Algorithms

    DEFF Research Database (Denmark)

    Ampazis, Nikolaos; Dounias, George; Jantzen, Jan

    2004-01-01

    In this paper we make use of two highly efficient second order neural network training algorithms, namely the LMAM (Levenberg-Marquardt with Adaptive Momentum) and OLMAM (Optimized Levenberg-Marquardt with Adaptive Momentum), for the construction of an efficient pap-smear test classifier...

  19. An application of the Self Organizing Map Algorithm to computer aided classification of ASTER multispectral data

    Directory of Open Access Journals (Sweden)

    Ferdinando Giacco

    2008-01-01

    Full Text Available In this paper we employ Kohonen's Self Organizing Map (SOM) as a strategy for an unsupervised analysis of ASTER multispectral (MS) images. In order to obtain an accurate clusterization we introduce as input to the network, in addition to spectral data, some texture measures extracted from IKONOS images, which contribute to the classification of man-made structures. After clustering the SOM outcomes, we associated each cluster with a major land cover class and compared them with prior knowledge of the analyzed scene.

  20. A Novel Dispersion Degree and EBFNN-based Fingerprint Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    罗菁; 林树忠; 倪建云; 宋丽梅

    2009-01-01

    Aiming at shift and rotation in fingerprint images, a novel dispersion degree and Ellipsoidal Basis Function Neural Network (EBFNN)-based fingerprint classification algorithm is proposed in this paper. Firstly, the feature space is obtained through a wavelet transform of the fingerprint image. Then, the optimal feature combinations of different dimensions are acquired by searching the feature space, and the feature vector is determined by studying how the dispersion degree of those optimal feature combinations changes with dimension. Finally, the EBFNN is trained with the feature vector and fingerprint classification is accomplished. Experimental results on FVC2000 and FVC2002-DB1 show that the average classification accuracy is 91.45% when the number of hidden neurons is 11. Moreover, the proposed algorithm is robust to shift and rotation in fingerprint images and thus has practical value.

  1. Real Time Classification and Clustering Of IDS Alerts Using Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    T. Subbulakshmi

    2010-01-01

    Full Text Available Intrusion Detection Systems (IDS) monitor a secured network for the evidence of malicious activities originating either inside or outside. Upon identifying suspicious traffic, the IDS generates and logs an alert. Unfortunately, most of the alerts generated are either false positives, i.e. benign traffic that has been classified as intrusions, or irrelevant, i.e. attacks that are not successful. The abundance of false positive alerts makes it difficult for the security analyst to find successful attacks and take remedial action. This paper describes a two phase automatic alert classification system to assist the human analyst in identifying the false positives. In the first phase, the alerts collected from one or more sensors are normalized and similar alerts are grouped to form a meta-alert. These meta-alerts are passively verified with an asset database to find out irrelevant alerts. In addition, an optional alert generalization is also performed for root cause analysis and thereby reduces false positives with human interaction. In the second phase, the reduced alerts are labeled and passed to an alert classifier which uses machine learning techniques for building the classification rules. This helps the analyst in automatic classification of the alerts. The system is tested in real environments and found to be effective in reducing the number of alerts as well as false positives dramatically, and thereby reducing the workload of the human analyst.

  2. Shape classification of wear particles by image boundary analysis using machine learning algorithms

    Science.gov (United States)

    Yuan, Wei; Chin, K. S.; Hua, Meng; Dong, Guangneng; Wang, Chunhui

    2016-05-01

    The shape features of wear particles generated from a wear track usually contain plenty of information about the wear state of a machine's operational condition. Techniques to quickly identify types of wear particles, so as to respond to the machine operation and prolong the machine's life, appear to be lacking and are yet to be established. To bridge rapid off-line feature recognition with on-line wear mode identification, this paper presents a new radial concave deviation (RCD) method that mainly involves the use of the particle boundary signal to analyze wear particle features. The signal output from the RCDs subsequently facilitates the determination of several other feature parameters, typically relevant to the shape and size of the wear particle. Debris feature and type are identified through the use of various classification methods, such as linear discriminant analysis, quadratic discriminant analysis, the naïve Bayesian method, and the classification and regression tree (CART) method. The average errors of training and testing via ten-fold cross validation suggest CART is a highly suitable approach for classifying and analyzing particle features. Furthermore, the results of the wear debris analysis enable the maintenance team to diagnose faults appropriately.
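Boundary-derived shape parameters of the kind this abstract refers to (size and shape descriptors computed from the particle outline) can be illustrated with a tiny sketch. Perimeter, area, and circularity are standard generic descriptors used here for orientation only; they are not the paper's RCD signal itself.

```python
import math

def shape_features(boundary):
    """Perimeter, area (shoelace formula), and circularity 4*pi*A/P^2
    for a closed polygonal particle boundary given as (x, y) vertices."""
    n = len(boundary)
    perim = sum(math.dist(boundary[i], boundary[(i + 1) % n]) for i in range(n))
    area = abs(sum(boundary[i][0] * boundary[(i + 1) % n][1]
                   - boundary[(i + 1) % n][0] * boundary[i][1]
                   for i in range(n))) / 2.0
    circularity = 4 * math.pi * area / perim ** 2 if perim else 0.0
    return perim, area, circularity

# A unit square: perimeter 4, area 1, circularity pi/4 (a circle gives 1.0).
p, a, c = shape_features([(0, 0), (1, 0), (1, 1), (0, 1)])
```

Elongated cutting debris scores low on circularity while near-spherical fatigue debris scores close to 1, which is why such descriptors feed usefully into the classifiers compared in the paper.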

  3. Decision making in double-pedicled DIEP and SIEA abdominal free flap breast reconstructions: An algorithmic approach and comprehensive classification.

    Directory of Open Access Journals (Sweden)

    Charles M Malata

    2015-10-01

    Full Text Available Introduction: The deep inferior epigastric artery perforator (DIEP) free flap is the gold standard for autologous breast reconstruction. However, using a single vascular pedicle may not yield sufficient tissue in patients with midline scars or insufficient lower abdominal pannus. Double-pedicled free flaps overcome this problem using different vascular arrangements to harvest the entire lower abdominal flap. The literature is, however, sparse regarding technique selection. We therefore reviewed our experience in order to formulate an algorithm and comprehensive classification for this purpose. Methods: All patients undergoing unilateral double-pedicled abdominal perforator free flap breast reconstruction (AFFBR) by a single surgeon (CMM) over 40 months were reviewed from a prospectively collected database. Results: Of the 112 consecutive breast free flaps performed, 25 (22%) utilised two vascular pedicles. The mean patient age was 45 years (range = 27-54). All flaps but one (which used the thoracodorsal system) were anastomosed to the internal mammary vessels using the rib-preservation technique. The surgical duration was 656 minutes (range = 468-690 mins). The median flap weight was 618 g (range = 432-1275 g) and the mastectomy weight was 445 g (range = 220-896 g). All flaps were successful and only three patients requested minor liposuction to reduce and reshape their reconstructed breasts. Conclusion: Bipedicled free abdominal perforator flaps, employed in a fifth of all our AFFBRs, are a reliable and safe option for unilateral breast reconstruction. They, however, necessitate clear indications to justify the additional technical complexity and surgical duration. Our algorithm and comprehensive classification facilitate technique selection for the anastomotic permutations and successful execution of these operations.

  4. Mucinous Adenocarcinoma Involving the Ovary: Comparative Evaluation of the Classification Algorithms using Tumor Size and Laterality

    OpenAIRE

    Jung, Eun Sun; Bae, Jeong Hoon; Lee, Ahwon; Choi, Yeong Jin; Park, Jong-Sup; Lee, Kyo-Young

    2010-01-01

    For intraoperative consultation on mucinous adenocarcinoma involving the ovary, it would be useful to have additional assessment approaches beyond the traditionally limited microscopic findings in order to determine the nature of the tumors. Mucinous adenocarcinomas involving the ovaries were evaluated in 91 cases of metastatic mucinous adenocarcinomas and 19 cases of primary mucinous adenocarcinomas using both an original algorithm (unilateral ≥10 cm tumors were considered primary and unilateral

  5. Analysis of Speed Sign Classification Algorithms Using Shape Based Segmentation of Binary Images

    OpenAIRE

    Muhammad, Azam Sheikh; Lavesson, Niklas; Davidsson, Paul; Nilsson, Mikael

    2009-01-01

    Traffic Sign Recognition is a widely studied problem and its dynamic nature calls for the application of a broad range of preprocessing, segmentation, and recognition techniques but few databases are available for evaluation. We have produced a database consisting of 1,300 images captured by a video camera. On this database we have conducted a systematic experimental study. We used four different preprocessing techniques and designed a generic speed sign segmentation algorithm. Then we select...

  6. Blog Classification: Adding Linguistic Knowledge to Improve the K-NN Algorithm

    Science.gov (United States)

    Bayoudh, Ines; Bechet, Nicolas; Roche, Mathieu

    Blogs are interactive and regularly updated websites which can be seen as diaries. These websites are composed of articles on distinct topics. Thus, it is necessary to develop Information Retrieval approaches for this new web knowledge. The first important step of this process is the categorization of the articles. This paper compares several methods that combine linguistic knowledge with the k-NN algorithm for automatic categorization of weblog articles.
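A minimal k-NN text categorizer over bag-of-words vectors, for orientation only: the paper's point is precisely that adding linguistic knowledge to the representation improves on a baseline of this kind. The whitespace tokenization and cosine similarity below are illustrative choices, not the authors'.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency Counters."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_classify(doc, labelled_docs, k=3):
    """labelled_docs: list of (text, label). Majority vote among k nearest."""
    q = Counter(doc.lower().split())
    scored = sorted(labelled_docs,
                    key=lambda tl: cosine(q, Counter(tl[0].lower().split())),
                    reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

train = [("football match goal team", "sports"),
         ("recipe bake oven flour", "cooking"),
         ("tennis player serve match", "sports"),
         ("soup boil onion recipe", "cooking")]
label = knn_classify("the team won the football match", train, k=3)
```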

  7. Algorithm construction methodology for diagnostic classification of near-infrared spectroscopy data

    OpenAIRE

    Guevara, Ramón; Stothers, Lynn; Macnab, Andrew

    2011-01-01

    Background: Near-infrared spectroscopy (NIRS) has recognized potential but limited application for non-invasive diagnostic evaluation. Data analysis methodology that reproducibly distinguishes between the presence or absence of physiologic abnormality could broaden clinical application of this optical technique. Methods: Sample data sets from simultaneous NIRS bladder monitoring and invasive urodynamic pressure-flow studies (UDS) are used to illustrate how a diagnostic algorithm is constructe...

  8. A review on speech enhancement algorithms and why to combine with environment classification

    Science.gov (United States)

    Nidhyananthan, S. Selva; Kumari, R. Shantha Selva; Prakash, A. Arun

    2014-04-01

    Speech enhancement has been an intensive research area for several decades, aiming to enhance noisy speech corrupted by additive noise, multiplicative noise or convolutional noise. Even after decades of research it is still a most challenging problem, because most papers rely on estimating the noise during non-speech activity, assuming that the background noise is uncorrelated (statistically independent of the speech signal), nonstationary and slowly varying, so that the noise characteristics estimated in the absence of speech can be used subsequently in the presence of speech; in a real-time environment such assumptions do not hold all the time. In this paper, we discuss the historical development of approaches, from 1970 to the recent 2013, for enhancing noisy speech corrupted by additive background noise. Looking at this history, there are algorithms that enhance noisy speech very well as long as a specific application is concerned, such as in-car noisy environments. It has to be observed that a speech enhancement algorithm performs well given a good estimate of the noise power spectral density (PSD) from the noisy speech. Our idea arises from this observation: for online speech enhancement (i.e. in a real-time environment) such as mobile phone applications, instead of estimating the noise from the noisy speech alone, the system should continuously monitor the environment and classify it. Based on the user's current environment, the system should then adapt the algorithm (i.e. the enhancement or estimation algorithm) to enhance the noisy speech.

  9. Energy Based Feature Extraction for Classification of Respiratory Signals Using Modified Threshold Based Algorithm

    Directory of Open Access Journals (Sweden)

    A. Bhavani Sankar

    2010-10-01

    Full Text Available In this work, we carried out a detailed study of various features of the respiratory signal. Respiratory signals contain potentially precise information that could assist clinicians in making appropriate and timely decisions during sleep disorders and labour. The extraction and detection of sleep apnea from composite abdominal signals with powerful and advanced methodologies is becoming a very important requirement in apnea patient monitoring. The method we propose in this work is based on the extraction of four main features of the respiratory signal. The automatic signal classification starts by extracting signal features from 30 seconds of respiratory data through autoregressive (AR) modeling and other techniques. The four features are: signal energy, zero crossing frequency, dominant frequency estimated by AR, and strength of the dominant frequency based on AR. These features are then compared to threshold values and introduced to a series of conditions to determine the signal category for each specific epoch.
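The first two of the four features (signal energy and zero-crossing frequency) are straightforward to compute per 30-second epoch. A small sketch follows, with an illustrative 0.25 Hz test tone standing in for a breathing signal; the AR-based dominant-frequency features and the paper's threshold values are omitted.

```python
import numpy as np

def epoch_features(x, fs):
    """Signal energy and zero-crossing frequency for one epoch.

    The crossing count is halved and divided by the epoch duration, so a pure
    tone of frequency f yields a zero-crossing frequency close to f (in Hz).
    """
    energy = float(np.sum(x ** 2))
    crossings = int(np.count_nonzero(np.diff(np.signbit(x).astype(np.int8))))
    zero_cross_freq = crossings / (len(x) / fs) / 2.0
    return energy, zero_cross_freq

fs = 10.0
t = np.arange(int(30 * fs)) / fs            # one 30-second epoch
x = np.sin(2 * np.pi * 0.25 * t + 0.1)      # 0.25 Hz "breathing" tone
energy, zcf = epoch_features(x, fs)
```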

  10. Operational algorithm for ice/water classification on dual-polarized RADARSAT-2 images

    OpenAIRE

    Zakhvatkina, Natalia; Korosov, Anton; Muckenhuber, Stefan; Sandven, Stein; Babiker, Mohamed

    2016-01-01

    Synthetic aperture radar (SAR) data from RADARSAT-2 (RS2) taken in dual-polarization mode provide additional information for discriminating sea ice and open water compared to single-polarization data. We have developed a fully automatic algorithm to distinguish between open water (rough/calm) and sea ice based on dual-polarized RS2 SAR images. Several technical problems inherent in RS2 data were solved at the pre-processing stage, including thermal noise reduction in the HV-polarization channel an...

  11. An Improved Polarimetric Radar Rainfall Algorithm With Hydrometeor Classification Optimized For Rainfall Estimation

    Science.gov (United States)

    Cifelli, R.; Wang, Y.; Lim, S.; Kennedy, P.; Chandrasekar, V.; Rutledge, S. A.

    2009-05-01

    The efficacy of dual polarimetric radar for quantitative precipitation estimation (QPE) is firmly established. Specifically, rainfall retrievals using combinations of reflectivity (ZH), differential reflectivity (ZDR), and specific differential phase (KDP) have advantages over traditional Z-R methods because more information about the drop size distribution and hydrometeor type are available. In addition, dual-polarization radar measurements are generally less susceptible to error and biases due to the presence of ice in the sampling volume. A number of methods have been developed to estimate rainfall from dual-polarization radar measurements. However, the robustness of these techniques in different precipitation regimes is unknown. Because the National Weather Service (NWS) will soon upgrade the WSR 88-D radar network to dual-polarization capability, it is important to test retrieval algorithms in different meteorological environments in order to better understand the limitations of the different methodologies. An important issue in dual-polarimetric rainfall estimation is determining which method to employ for a given set of polarimetric observables. For example, under what circumstances does differential phase information provide superior rain estimates relative to methods using reflectivity and differential reflectivity? At Colorado State University (CSU), a "blended" algorithm has been developed and used for a number of years to estimate rainfall based on ZH, ZDR, and KDP (Cifelli et al. 2002). The rainfall estimators for each sampling volume are chosen on the basis of fixed thresholds, which maximize the measurement capability of each polarimetric variable and combinations of variables. Tests have shown, however, that the retrieval is sensitive to the calculation of ice fraction in the radar volume via the difference reflectivity (ZDP - Golestani et al. 1989) methodology such that an inappropriate estimator can be selected in situations where radar echo is

  12. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC) model is used to improve classification accuracy, convergence speed to maximum accuracy, and maximum bitrates in a brain computer interface (BCI) system based on extracting EEG-P300 signals. First, the EEG signal is filtered in order to eliminate high frequency noise. Then, the parameters of the filtered EEG signal are extracted using the LPC model. Finally, the samples are reconstructed from the LPC coefficients, and two classifiers, (a) Bayesian linear discriminant analysis (BLDA) and (b) the υ-support vector machine (υ-SVM), are applied in order to classify them. The proposed algorithm's performance is compared with Fisher linear discriminant analysis (FLDA). Results show that our algorithm is much more efficient at improving classification accuracy and convergence speed to maximum accuracy. For example, with the 8-electrode configuration for subject S1, the proposed algorithms, BLDA with the LPC model and υ-SVM with the LPC model, improve total classification accuracy by 9.4% and 1.7%, respectively. Also, for subject 7 the BLDA and υ-SVM with LPC model algorithms (LPC+BLDA and LPC+υ-SVM) converged to maximum accuracy after the 11th block, whereas the FLDA algorithm did not converge to maximum accuracy (with the same configuration). Thus, it can be used as a promising tool in designing BCI systems.
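A minimal autocorrelation-method LPC parameter extractor, of the general kind used in this abstract to model the filtered EEG signal; the paper's model order, preprocessing, and reconstruction details are not reproduced, and the AR(1) check below is purely synthetic.

```python
import numpy as np

def lpc_coeffs(x, order):
    """Autocorrelation-method LPC: solve the Yule-Walker normal equations.

    Returns a such that x[n] is predicted by sum_k a[k] * x[n-1-k].
    """
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

# Sanity check on a synthetic AR(1) process x[n] = 0.8 x[n-1] + e[n]:
# the order-1 LPC coefficient should recover roughly 0.8.
rng = np.random.default_rng(0)
e = rng.standard_normal(5000)
x = np.zeros(5000)
for n in range(1, 5000):
    x[n] = 0.8 * x[n - 1] + e[n]
a = lpc_coeffs(x, order=1)
```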

  13. Improved HyperSplit Packet Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    马腾; 陈庶樵; 张校辉

    2014-01-01

    In order to solve the problem of excessive memory usage common to existing high-speed, large-volume, multi-field packet classification algorithms, an improved HyperSplit algorithm is proposed. By analyzing the cause of the excessive memory usage, the heuristic algorithms that choose the cutting dimensions and cutting points and eliminate redundancy are modified and redesigned. Rule replication is greatly reduced, redundant rules and nodes are removed, and the decision tree's structure is optimized. Simulation results demonstrate that, compared with existing multi-field packet classification algorithms, the algorithm is independent of the rule base's type and characteristics and can greatly reduce memory usage without increasing the number of memory accesses, ensuring that packets are still processed at wire speed; when the classifier holds 10^5 rules, the algorithm consumes about 80% of the memory of HyperSplit.

  14. Application of a DBSCAN algorithm based on an adjustable threshold in emergency plan classification management

    Institute of Scientific and Technical Information of China (English)

    金保华; 林青; 赵家明

    2012-01-01

    Aiming at the difficulty of classifying a huge collection of plan texts, an adjustable neighborhood threshold Eps replaces the original global Eps, yielding an improved density-based DBSCAN clustering algorithm. The similarity between plan texts is taken as the basic clustering metric, and the improved DBSCAN is applied to emergency plan classification management to remove boundary identification errors. Simulation results show that the method does not affect the plans' basic classification scheme, is easier to implement, alleviates the misidentification of noise points to some extent, and offers a useful reference for improving the accuracy of classification and the reusability of plan texts.
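A compact DBSCAN sketch in which the neighbourhood radius is a per-point function rather than a single global Eps, in the spirit of the adjustable threshold described above. The text-similarity metric and weighting details of the paper are not reproduced; `eps_of` is a hypothetical hook, and passing a constant function recovers standard global-Eps DBSCAN.

```python
import numpy as np

def dbscan(X, eps_of, min_pts=3):
    """DBSCAN where eps_of(i) returns the neighbourhood radius for point i.

    Returns labels: cluster ids starting at 1, or -1 for noise.
    """
    n = len(X)
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        seeds = [j for j in range(n)
                 if np.linalg.norm(X[i] - X[j]) <= eps_of(i)]
        if len(seeds) < min_pts:
            continue                    # not a core point (may be claimed later)
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:                    # expand the cluster density-reachably
            j = queue.pop()
            if labels[j] != -1:
                continue
            labels[j] = cluster
            nbrs = [m for m in range(n)
                    if np.linalg.norm(X[j] - X[m]) <= eps_of(j)]
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)
    return labels

# Two tight clusters plus one outlier.
X = np.array([[0, 0], [0, .1], [.1, 0], [.1, .1],
              [5, 5], [5, 5.1], [5.1, 5], [5.1, 5.1],
              [10, 10]], dtype=float)
labels = dbscan(X, eps_of=lambda i: 0.5, min_pts=3)
```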

  15. Effectiveness of Partition and Graph Theoretic Clustering Algorithms for Multiple Source Partial Discharge Pattern Classification Using Probabilistic Neural Network and Its Adaptive Version: A Critique Based on Experimental Studies

    Directory of Open Access Journals (Sweden)

    S. Venkatesh

    2012-01-01

    Full Text Available Partial discharge (PD) is a major cause of failure of power apparatus and hence its measurement and analysis have emerged as a vital field in assessing the condition of the insulation system. Several efforts have been undertaken by researchers to classify PD pulses utilizing artificial intelligence techniques. Recently, the focus has shifted to the identification of multiple sources of PD since it is often encountered in real-time measurements. Studies have indicated that classification of multi-source PD becomes difficult with the degree of overlap and that several techniques such as mixed Weibull functions, neural networks, and wavelet transformation have been attempted with limited success. Since digital PD acquisition systems record data for a substantial period, the database becomes large, posing considerable difficulties during classification. This research work aims firstly at analyzing aspects concerning classification capability during the discrimination of multisource PD patterns. Secondly, it attempts at extending the previous work of the authors, which utilized the novel approach of probabilistic neural network versions for classifying moderate sets of PD sources, to large sets. The third focus is on comparing the ability of partition-based algorithms, namely the labelled (learning vector quantization) and unlabelled (K-means) versions, with that of a novel hypergraph-based clustering method in providing parsimonious sets of centers during classification.

  16. Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study

    Directory of Open Access Journals (Sweden)

    Kanthida Kusonmano

    2012-01-01

    Full Text Available A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers: support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying levels of pool sizes and considering effects from feature selection. Our results show that feature selection significantly improves classifier performance for non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mainly outperforms the other investigated algorithms, while accuracy levels are comparable among all the remaining ones. Guidelines are derived to identify an optimal pooling scheme for obtaining adequate predictive power and, hence, to motivate a study design that best meets experimental objectives and budgetary conditions, including time constraints.
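Virtual pooling as studied in this abstract amounts to averaging groups of same-class samples before classification. A small sketch of that preprocessing step follows; the random group assignment here is one plausible reading of such a design, not necessarily the authors' exact scheme.

```python
import numpy as np

def virtual_pool(X, y, pool_size, seed=0):
    """Average random groups of `pool_size` same-class rows of X into one
    'pooled' sample each; leftover rows that do not fill a pool are dropped."""
    rng = np.random.default_rng(seed)
    Xp, yp = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.where(y == c)[0])
        for start in range(0, len(idx) - pool_size + 1, pool_size):
            Xp.append(X[idx[start:start + pool_size]].mean(axis=0))
            yp.append(c)
    return np.array(Xp), np.array(yp)

# 6 samples (4 of class 0, 2 of class 1), pooled in pairs -> 3 pooled samples.
X = np.arange(12.0).reshape(6, 2)
y = np.array([0, 0, 0, 0, 1, 1])
Xp, yp = virtual_pool(X, y, pool_size=2)
```

Larger pool sizes shrink the effective sample count, which is consistent with the paper's finding that misclassification rates rise as pools grow.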

  17. A New Nearest Neighbor Classification Algorithm Based on Local Probability Centers

    Directory of Open Access Journals (Sweden)

    I-Jing Li

    2014-01-01

    Full Text Available The nearest neighbor is one of the most popular classifiers, and it has been successfully used in pattern recognition and machine learning. One drawback of kNN is that it performs poorly when class distributions overlap. Recently, the local probability center (LPC) algorithm was proposed to solve this problem; its main idea is to weight samples according to their posterior probability. However, LPC performs poorly when the value of k is very small and when higher-dimensional datasets are used. To deal with this problem, this paper suggests that the gradient of the posterior probability function can be estimated under suitable assumptions. This theoretical property makes it possible to faithfully calculate the inner product of two vectors. To increase performance on high-dimensional datasets, the multidimensional Parzen window and the Euler-Richardson method are utilized, and a new classifier based on local probability centers is developed in this paper. Experimental results show that the proposed method yields stable performance over a wide range of k, is robust to the overlap issue, and handles dimensionality well. The proposed theorem can be applied to mathematical problems and other applications. Furthermore, the proposed method is an attractive classifier because of its simplicity.

  18. Method of network traffic classification using improved LM algorithm

    Institute of Scientific and Technical Information of China (English)

    胡婷; 王勇; 陶晓玲

    2011-01-01

    In order to solve the problems in current work that rely on the traditional method of traffic classification,such as low accuracy, limited application region, an effective approach for network traffic classification named GA-LM is proposed. This method employs the classification method based on neural network as the classification model of network traffic,and applies L-M algorithm which is an improved BP algorithms and Genetic Algorithm (GA) that optimizes neural network weights, which will speed up the convergence of the neural network and improve the classification performance.The experimental data sets that are collected from actual networks are conducted experiments.The results show that the convergence speed of the method is faster and GA-LM has better feasibility and high accuracy which can be used effectively to the application of network traffic classification.%针对传统的流量分类方法准确率低、开销大、应用范围受限等问题,提出一种有效的网络流量分类方法(GA-LM).该方法将基于神经网络的分类方法作为网络流量的分类模型,采用L-M算法构造分类器,并用遗传算法优化网络初始连接权值,加速了网络收敛过程,提高了分类性能.通过对收集到的实际网络流量数据进行分类,实验结果表明GA-LM比标准BP算法和L-M算法的收敛速度快,具有较好的可行性和高准确性,从而可有效地用于网络流量分类中.

  19. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Science.gov (United States)

    McDonnell, Mark D; Tissera, Migel D; Vladusich, Tony; van Schaik, André; Tapson, Jonathan

    2015-01-01

    Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼ 10 minutes). Adding distortions, as is common practice for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.
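The core of the ELM training procedure referred to in this abstract is simple enough to sketch: the input weights are random and fixed, and only the output weights are solved for by least squares. The toy two-class data below is illustrative and unrelated to MNIST or NORB, and the authors' receptive-field sparsification enhancement is omitted.

```python
import numpy as np

def elm_train(X, Y, n_hidden=50, seed=0):
    """Single-hidden-layer ELM: random fixed input weights and biases, then a
    least-squares solve for the output weights (the only trained parameters)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                       # hidden activations
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Two well-separated 2-D classes, one-hot targets.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (100, 2)),
               rng.normal(2.0, 0.5, (100, 2))])
y = np.repeat([0, 1], 100)
Y = np.eye(2)[y]
W, b, beta = elm_train(X, Y)
acc = float((elm_predict(X, W, b, beta) == y).mean())
```

Because only `beta` is fitted, training reduces to one linear solve, which is the source of the very rapid training times the abstract reports.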

  20. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Full Text Available Recent advances in training deep (multi-layer) architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM) approach, which also enables a very rapid training time (∼ 10 minutes). Adding distortions, as is common practice for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

  1. Love thy neighbour: automatic animal behavioural classification of acceleration data using the K-nearest neighbour algorithm.

    Directory of Open Access Journals (Sweden)

    Owen R Bidder

    Full Text Available Researchers hoping to elucidate the behaviour of species that aren't readily observed are able to do so using biotelemetry methods. Accelerometers in particular are proving effective and have been used on terrestrial, aquatic and volant species with success. In the past, behavioural modes were detected in accelerometer data through manual inspection, but with developments in technology, modern accelerometers now record at frequencies that make this impractical. In light of this, some researchers have suggested the use of various machine learning approaches as a means to classify accelerometer data automatically. We feel uptake of this approach by the scientific community is inhibited for two reasons: (1) most machine learning algorithms require selection of summary statistics, which obscure the decision mechanisms by which classifications are arrived at, and (2) they are difficult to implement without appreciable computational skill. We present a method which allows researchers to classify accelerometer data into behavioural classes automatically using a primitive machine learning algorithm, k-nearest neighbour (KNN). Raw acceleration data may be used in KNN without selection of summary statistics, and it is easily implemented using the freeware program R. The method is evaluated by detecting 5 behavioural modes in 8 species, with examples of quadrupedal, bipedal and volant species. Accuracy and precision were found to be comparable with other, more complex methods. In order to assist in the application of this method, the script required to run KNN analysis in R is provided. We envisage that the KNN method may be coupled with methods for investigating animal position, such as GPS telemetry or dead-reckoning, in order to implement an integrated approach to movement ecology research.
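
The key property stressed above, that raw acceleration windows can feed KNN directly with no summary statistics, can be sketched in a few lines (the paper's implementation is in R; this is a minimal Python analogue on hypothetical single-axis windows):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical single-axis acceleration windows, 30 raw samples each, used
# directly as feature vectors: no means, variances or other summaries.
def make_windows(freq, n):
    t = np.linspace(0, 1, 30)
    return np.stack([np.sin(2 * np.pi * freq * t + rng.normal(0, 0.1, 30))
                     for _ in range(n)])

walk, rest = make_windows(3.0, 20), make_windows(0.5, 20)
X = np.vstack([walk, rest])
y = np.array([0] * 20 + [1] * 20)              # 0 = "walking", 1 = "resting"

def knn_predict(X_train, y_train, x, k=5):
    d = np.linalg.norm(X_train - x, axis=1)    # Euclidean distance on raw data
    votes = y_train[np.argsort(d)[:k]]         # labels of the k nearest windows
    return np.bincount(votes).argmax()         # majority vote

pred = knn_predict(X, y, make_windows(3.0, 1)[0])
print(pred)
```

The transparency argued for in the abstract is visible here: a classification is explained entirely by which stored windows were nearest.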

  2. BIANCA (Brain Intensity AbNormality Classification Algorithm): A new tool for automated segmentation of white matter hyperintensities.

    Science.gov (United States)

    Griffanti, Ludovica; Zamboni, Giovanna; Khan, Aamira; Li, Linxin; Bonifacio, Guendalina; Sundaresan, Vaanathi; Schulz, Ursula G; Kuker, Wilhelm; Battaglini, Marco; Rothwell, Peter M; Jenkinson, Mark

    2016-11-01

    Reliable quantification of white matter hyperintensities of presumed vascular origin (WMHs) is increasingly needed, given the presence of these MRI findings in patients with several neurological and vascular disorders, as well as in elderly healthy subjects. We present BIANCA (Brain Intensity AbNormality Classification Algorithm), a fully automated, supervised method for WMH detection, based on the k-nearest neighbour (k-NN) algorithm. Relative to previous k-NN based segmentation methods, BIANCA offers options for weighting the spatial information and for local spatial intensity averaging, as well as for the choice of the number and location of the training points. BIANCA is multimodal and highly flexible, so that users can adapt the tool to their protocol and specific needs. We optimised and validated BIANCA on two datasets with different MRI protocols and patient populations (a "predominantly neurodegenerative" and a "predominantly vascular" cohort). BIANCA was first optimised on a subset of images for each dataset in terms of overlap and volumetric agreement with a manually segmented WMH mask. The correlation between the volumes extracted with BIANCA (using the optimised set of options), the volumes extracted from the manual masks and visual ratings showed that BIANCA is a valid alternative to manual segmentation. The optimised set of options was then applied to the whole cohorts, and the resulting WMH volume estimates showed good correlations with visual ratings and with age. Finally, we performed a reproducibility test to evaluate the robustness of BIANCA, and compared BIANCA's performance against existing methods. Our findings suggest that BIANCA, which will be freely available as part of the FSL package, is a reliable method for automated WMH segmentation in large cross-sectional cohort studies. PMID:27402600
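
The spatial-weighting option described above can be illustrated with a toy k-NN voxel classifier: coordinates are scaled by a weight before entering the distance, so the classifier trades intensity similarity against spatial proximity. The 2-D "voxels", intensities and weight value are all synthetic illustrations, not BIANCA's actual features:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic 2-D "voxels": one side of the image is the "lesion" class with
# bright intensities; the other side is normal tissue.
n = 60
coords = rng.uniform(0, 10, (n, 2))
lab = (coords[:, 0] > 5).astype(int)                   # 1 = "lesion" side
inten = np.where(lab == 1, rng.normal(1.0, 0.1, n), rng.normal(0.0, 0.1, n))

sw = 0.2                                               # spatial weight option
feats = np.c_[inten, sw * coords]                      # intensity + weighted x, y

def knn_label(train_feats, train_lab, q, k=3):
    d = np.linalg.norm(train_feats - q, axis=1)
    return np.bincount(train_lab[np.argsort(d)[:k]]).argmax()

q = np.r_[0.95, sw * np.array([7.0, 3.0])]             # bright voxel at (7, 3)
pred = knn_label(feats, lab, q)
print(pred)
```

Raising `sw` makes location dominate the vote; setting it to zero reduces the classifier to pure intensity matching, which is the trade-off the option controls.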

  3. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    Science.gov (United States)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam depend strongly on the decisions made by reservoir operators rather than on a natural hydrological process, so a difference exists between the natural upstream inflows to reservoirs and the controlled outflows that supply downstream users. As decision makers become aware of a changing climate, reservoir management requires adaptable means of incorporating more information into decision making, such as water delivery requirements, environmental constraints and dry/wet conditions. In this paper, a robust reservoir outflow simulation model is presented, which incorporates a well-established data-mining model, the Classification and Regression Tree (CART), to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, the original CART, and a random forest are compared with observations. The statistical measures show that the enhanced CART and the random forest generally outperform the CART control run, and that the enhanced CART gives better predictive performance than the random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation at Lake Oroville is strongly dominated by the SWP allocation amount, and that reservoirs at low elevation are more sensitive to inflow amount than others.
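
The shuffled cross-validation idea above is model-agnostic and can be sketched independently of CART: indices are permuted before being cut into K folds, so each fold mixes wet and dry years instead of consecutive blocks of the record. The fold sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Shuffled K-fold split: permute the sample indices, then cut into K folds,
# returning (train, test) index pairs for each fold.
def shuffled_kfold(n, k, rng):
    idx = rng.permutation(n)
    parts = np.array_split(idx, k)
    return [(np.concatenate(parts[:i] + parts[i + 1:]), parts[i])
            for i in range(k)]

folds = shuffled_kfold(12, 3, rng)
for train_idx, test_idx in folds:
    print(sorted(test_idx.tolist()))
```

Any regressor, CART included, can then be fitted on each `train_idx` and scored on the matching `test_idx`, with the shuffle preventing the hydrological record's temporal ordering from biasing the folds.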

  4. Emotion Recognition of Weblog Sentences Based on an Ensemble Algorithm of Multi-label Classification and Word Emotions

    Science.gov (United States)

    Li, Ji; Ren, Fuji

    Weblogs have greatly changed the communication ways of mankind. Affective analysis of blog posts is found valuable for many applications such as text-to-speech synthesis or computer-assisted recommendation. Traditional emotion recognition in text based on single-label classification cannot satisfy the higher requirements of affective computing. In this paper, the automatic identification of sentence emotion in weblogs is modeled as a multi-label text categorization task. Experiments are carried out on 12273 blog sentences from the Chinese emotion corpus Ren_CECps with 8-dimension emotion annotation. The ensemble algorithm RAKEL is used to recognize dominant emotions from the writer's perspective. Our emotion feature, using a detailed intensity representation for word emotions, outperforms the other main features such as the word frequency feature and the traditional lexicon-based feature. In order to deal with relatively complex sentences, we integrate grammatical characteristics of punctuation, disjunctive connectives, modification relations and negation into the features. This achieves 13.51% and 12.49% increases in Micro-averaged F1 and Macro-averaged F1 respectively, compared to the traditional lexicon-based feature. Results show that a multiple-dimension emotion representation with grammatical features can efficiently classify sentence emotion in a multi-label problem.

  5. Multi-layer Attribute Selection and Classification Algorithm for the Diagnosis of Cardiac Autonomic Neuropathy Based on HRV Attributes

    Directory of Open Access Journals (Sweden)

    Herbert F. Jelinek

    2015-12-01

    Full Text Available Cardiac autonomic neuropathy (CAN) poses an important clinical problem, which often remains undetected due to the difficulty of conducting the current tests and their lack of sensitivity. CAN has been associated with an increased risk of unexpected death in cardiac patients with diabetes mellitus. Heart rate variability (HRV) attributes have been actively investigated, since they are important for diagnostics in diabetes, Parkinson's disease, and cardiac and renal disease. Due to the adverse effects of CAN, it is important to obtain a robust and highly accurate diagnostic tool for identification of early CAN, when treatment has the best outcome. Use of HRV attributes to enhance the effectiveness of diagnosis of CAN progression may provide such a tool. In the present paper we propose a new machine learning algorithm, the Multi-Layer Attribute Selection and Classification (MLASC), for the diagnosis of CAN progression based on HRV attributes. It incorporates our new automated attribute selection procedure, the Double Wrapper Subset Evaluator with Particle Swarm Optimization (DWSE-PSO). We present the results of experiments which compare MLASC with simpler versions and counterpart methods. The experiments used our large and well-known diabetes complications database. The results demonstrate that MLASC significantly outperforms the simpler techniques.
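
The wrapper-style attribute selection described above can be sketched with a toy binary particle swarm optimiser, a simplified, hypothetical stand-in for the paper's DWSE-PSO procedure: the fitness of a 0/1 feature mask is the leave-one-out accuracy of a 1-NN classifier restricted to those features. All data and parameters are synthetic:

```python
import numpy as np

rng = np.random.default_rng(8)

# Synthetic task: only features 0 and 1 carry the class signal.
X = rng.standard_normal((80, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def fitness(mask):
    sel = mask.astype(bool)
    if not sel.any():
        return 0.0
    Xs = X[:, sel]
    d = np.linalg.norm(Xs[:, None] - Xs[None, :], axis=2)
    np.fill_diagonal(d, np.inf)               # leave-one-out: ignore self
    return float((y[d.argmin(1)] == y).mean())

n_p, dim = 12, 6
pos = (rng.random((n_p, dim)) > 0.5).astype(float)   # binary particle positions
vel = rng.normal(0, 1, (n_p, dim))
pbest, pfit = pos.copy(), np.array([fitness(p) for p in pos])
for _ in range(20):
    gbest = pbest[pfit.argmax()]
    vel = (0.7 * vel + rng.random((n_p, dim)) * (pbest - pos)
           + rng.random((n_p, dim)) * (gbest - pos))
    pos = (rng.random((n_p, dim)) < 1 / (1 + np.exp(-vel))).astype(float)
    fit = np.array([fitness(p) for p in pos])
    better = fit > pfit
    pbest[better], pfit[better] = pos[better], fit[better]

print(pbest[pfit.argmax()], round(float(pfit.max()), 3))
```

The sigmoid-of-velocity update is the standard binary-PSO trick for searching over subsets; the wrapper aspect is that the classifier itself scores every candidate subset.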

  6. Taking Triple Aim at the Triple Aim.

    Science.gov (United States)

    Bryan, Stirling; Donaldson, Cam

    2016-01-01

    Since its introduction to the USA, the Triple Aim is now being adopted in the healthcare systems of other advanced economies. Verma and Bhatia (2016) (V&B) argue that provincial governments in Canada now need to step up to the plate and lead on the implementation of a Triple Aim reform program here. Their proposals are wide-ranging and ambitious, looking for governments to act as the "integrators" within the healthcare system and lead the reforms. Our view is that, as a vision and set of goals for the healthcare system, the Triple Aim is all well and good, but as a pathway for system reform, as articulated by V&B, it misses the mark in at least three important respects. First, the emphasis on improvement driven by performance measurement and pay-for-performance is troubling and flies in the face of emerging evidence. Second, we know that scarcity can be recognized and managed, even in politically complex systems, and so we urge the Triple Aim proponents to embrace more fully notions of resource stewardship. Third, if we want to take seriously "population health" goals, we need to think very differently and consider broader health determinants; Triple Aim innovation targeted at healthcare systems will not deliver the goals. PMID:27009583

  7. A Bearings-Only Algorithm for Determining Whether a Target Holds Steady Speed and Course

    Institute of Scientific and Technical Information of China (English)

    徐功慧; 郝阳

    2016-01-01

    A bearings-only mathematical model is proposed for determining whether a target maintains a steady speed and course, intended for the bearings-only fire-control calculations of a submarine making a covert attack. The algorithm requires the submarine to move at steady speed and course for a set time, divided into at least three equal periods, during which its sonar measures four relative-bearing (shipboard) angles to the target. A model built from the logical relations among the four angles is computed and its results compared and analysed, confirming whether the target is holding a steady speed and course. Simulation shows that the algorithm is logically consistent, and that this bearings-only model for determining target speed and course stability meets practical requirements.

  8. A Background-Adaptive Algorithm for Moving Object Detection

    Institute of Scientific and Technical Information of China (English)

    李凌

    2014-01-01

    To address the inadequate background updating of the traditional Gaussian mixture model, this paper proposes a background-adaptive moving object detection algorithm that fuses edge detection with inter-frame differencing. The algorithm extracts image edge information with the Sobel operator and uses a three-frame difference method to divide each frame into a background region, an exposed-background region and a moving-target region, applying a different background update strategy to each. Experiments show that the algorithm adapts well to slowly moving objects, sudden illumination changes and objects merging into the background, and can detect moving targets effectively.
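
The two ingredients named above can be sketched with plain NumPy: a Sobel edge magnitude computed by direct 3x3 convolution (no image library), and a three-frame difference that flags pixels changed in both consecutive differences as the moving region. The frame sizes and threshold are illustrative, and the full region split into background/exposed/moving is simplified to the moving mask:

```python
import numpy as np

# Sobel edge magnitude by direct convolution with the two 3x3 kernels.
def sobel_mag(img):
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    ky = kx.T
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            out[i, j] = np.hypot((patch * kx).sum(), (patch * ky).sum())
    return out

# Three-frame difference: a pixel belongs to the moving region only if it
# changed in BOTH the (f0, f1) and (f1, f2) differences.
def three_frame_mask(f0, f1, f2, thresh=10.0):
    d1 = np.abs(f1 - f0) > thresh
    d2 = np.abs(f2 - f1) > thresh
    return d1 & d2

f0, f1, f2 = np.zeros((8, 8)), np.zeros((8, 8)), np.zeros((8, 8))
f1[3:5, 3:5] = 100.0                     # a small bright block...
f2[4:6, 4:6] = 100.0                     # ...moving diagonally
print(int(three_frame_mask(f0, f1, f2).sum()))
```

Pixels that changed only in the first difference correspond to the exposed-background region the abstract mentions, which is exactly why it receives a different update strategy.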

  9. Application of the Ant Colony Algorithm to Classification in Data Mining

    Institute of Scientific and Technical Information of China (English)

    熊斌; 熊娟

    2012-01-01

    Classification is an important task in data mining. Using the ant foraging principle to search a database, an ant colony algorithm is introduced for classification rule discovery: a randomly generated group of rules is selected and optimised until the database is covered by the rule set, thereby mining the rules implicit in the database and building an optimal classification model.

  10. Absolute calibration of the colour index and O4 absorption derived from Multi AXis (MAX-)DOAS measurements and their application to a standardised cloud classification algorithm

    Science.gov (United States)

    Wagner, Thomas; Beirle, Steffen; Remmers, Julia; Shaiganfar, Reza; Wang, Yang

    2016-09-01

    A method is developed for the calibration of the colour index (CI) and the O4 absorption derived from differential optical absorption spectroscopy (DOAS) measurements of scattered sunlight. The method is based on the comparison of measurements and radiative transfer simulations for well-defined atmospheric conditions and viewing geometries. Calibrated measurements of the CI and the O4 absorption are important for the detection and classification of clouds from MAX-DOAS observations. Such information is needed for the identification and correction of the cloud influence on Multi AXis (MAX-)DOAS profile inversion results, but may also be of interest in its own right, e.g. for meteorological applications. The calibration algorithm was successfully applied to measurements at two locations: Cabauw in the Netherlands and Wuxi in China. We used CI and O4 observations calibrated by the new method as input for our recently developed cloud classification scheme and also adapted the corresponding threshold values accordingly. For the observations at Cabauw, good agreement is found with the results of the original algorithm. Together with the calibration procedure of the CI and O4 absorption, the cloud classification scheme, which so far has been tuned to specific locations/conditions, can now be applied consistently to MAX-DOAS measurements at different locations. In addition to the new threshold values, further improvements were introduced to the cloud classification algorithm, namely a better description of the SZA (solar zenith angle) dependence of the threshold values and a new set of wavelengths for the determination of the CI. We also indicate specific areas for future research to further improve the cloud classification scheme.

  11. Mapping the distributions of C3 and C4 grasses in the mixed-grass prairies of southwest Oklahoma using the Random Forest classification algorithm

    Science.gov (United States)

    Yan, Dong; de Beurs, Kirsten M.

    2016-05-01

    The objective of this paper is to demonstrate a new method to map the distributions of C3 and C4 grasses at 30 m resolution and over a 25-year period (1988-2013) by combining the Random Forest (RF) classification algorithm with patch stable areas identified using the spatial pattern analysis software FRAGSTATS. Predictor variables for the RF classifications consisted of ten spectral variables, four soil edaphic variables and three topographic variables. We provided a confidence score for obtaining pure land cover at each pixel location by retrieving the classification tree votes. Classification accuracy assessments and predictor variable importance evaluations were conducted based on a repeated stratified sampling approach. Results show that patch stable areas obtained from larger patches are more appropriate for use as sample data pools to train and validate RF classifiers for historical land cover mapping, and that it is more reasonable to use patch stable areas as sample pools to map land cover in a year closer to the present than in years further back in time. The percentage of high confidence prediction pixels across the study area ranges from 71.18% in 1988 to 73.48% in 2013. The repeated stratified sampling approach is necessary for reducing the positive bias in the estimated classification accuracy caused by possible selection of training and validation pixels from the same patch stable areas. The RF classification algorithm was able to identify the important environmental factors affecting the distributions of C3 and C4 grasses in our study area, such as elevation, soil pH, soil organic matter and soil texture.
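
The per-pixel confidence score described above, the fraction of tree votes won by the predicted class, can be illustrated with a toy "forest" of bootstrapped decision stumps. The two-class synthetic points stand in for the paper's spectral/soil/topographic variables:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two synthetic classes in 3 features (stand-ins for spectral variables).
X = np.r_[rng.normal(0, 1, (50, 3)), rng.normal(3, 1, (50, 3))]
y = np.r_[np.zeros(50, int), np.ones(50, int)]

# Each "tree" is a depth-1 stump trained on a bootstrap sample with one
# randomly chosen feature, a drastic simplification of a full random forest.
stumps = []
for _ in range(25):
    boot = rng.integers(0, 100, 100)          # bootstrap sample indices
    feat = rng.integers(0, 3)                 # random feature for this tree
    thr = X[boot, feat].mean()
    left = np.bincount(y[boot][X[boot, feat] <= thr], minlength=2).argmax()
    stumps.append((feat, thr, left, 1 - left))

def forest_predict(q):
    votes = np.array([l if q[f] <= t else r for f, t, l, r in stumps])
    counts = np.bincount(votes, minlength=2)
    return counts.argmax(), counts.max() / len(stumps)   # (class, confidence)

cls, conf = forest_predict(np.array([3.0, 3.0, 3.0]))
print(cls, round(float(conf), 2))
```

A pixel where the trees disagree gets a confidence near 0.5 and would fall below the "high confidence prediction" threshold reported in the abstract.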

  12. Webpage Classification Based on a Deep Learning Algorithm

    Institute of Scientific and Technical Information of China (English)

    陈芊希; 范磊

    2016-01-01

    Webpage classification can be used to select accurate webpages for users and present them, improving the accuracy of information retrieval. Deep learning is a relatively new field of machine learning: in essence, a multi-layer neural network learning algorithm that achieves very high accuracy through layer-by-layer initialization, and it has been applied to image recognition, speech recognition and text classification. This paper applies a deep learning algorithm to webpage classification. Experiments show that deep learning offers clear advantages for webpage classification and can effectively improve its accuracy.

  13. Knowledge discovery and sequence-based prediction of pandemic influenza using an integrated classification and association rule mining (CBA) algorithm.

    Science.gov (United States)

    Kargarfard, Fatemeh; Sami, Ashkan; Ebrahimie, Esmaeil

    2015-10-01

    Pandemic influenza is a major concern worldwide. The availability of advanced technologies and of the nucleotide sequences of a large number of pandemic and non-pandemic influenza viruses in 2009 provides a great opportunity to investigate the underlying rules of pandemic induction through data mining tools. Here, for the first time, an integrated classification and association rule mining algorithm (CBA) was used to discover the rules underpinning the alteration of non-pandemic sequences to pandemic ones. We hypothesized that the extracted rules can lead to the development of an efficient expert system for prediction of influenza pandemics. To this end, we used a large dataset containing 5373 HA (hemagglutinin) segments of the 2009 H1N1 pandemic and non-pandemic influenza sequences. The analysis was carried out for both nucleotide and protein sequences. We found a number of new rules which potentially indicate undiscovered antigenic sites in the influenza structure. At the nucleotide level, alteration of thymine (T) at position 260 was the key discriminating feature in distinguishing non-pandemic from pandemic sequences. At the protein level, rules involving I233K and M334L were the differentiating features. CBA efficiently classifies pandemic and non-pandemic sequences with high accuracy at both the nucleotide and protein level. Finding hotspots in influenza sequences is a significant result, as they represent regions with low antibody reactivity. We argue that the virus evades the host immune response by mutation at these spots. Based on the discovered rules, we developed the software "Prediction of Pandemic Influenza" for discrimination of pandemic from non-pandemic sequences. This study opens a new vista in the discovery of association rules between mutation points during the evolution of pandemic influenza.

  14. An Image Classification Algorithm Based on Ant Colony Optimisation

    Institute of Scientific and Technical Information of China (English)

    屠莉; 杨立志

    2015-01-01

    Existing image dimension-reduction methods compress feature information excessively, which degrades image classification performance. This paper presents the IC-ACO algorithm, which employs ant colony optimisation to solve the image classification problem. The algorithm fully extracts and retains the image's various morphological features, then uses ant colony optimisation to automatically mine effective features and feature values from the feature set and to construct classification rules for each class, thereby achieving classified recognition of images. Experimental results on a real vehicle-logo image dataset show that IC-ACO outperforms other similar algorithms in classification recognition accuracy.

  15. Dynamic classification of a document stream: a preliminary static evaluation of the GERMEN algorithm

    CERN Document Server

    Lelu, Alain; Johansson, Joel

    2008-01-01

    Data-stream clustering is an ever-expanding subdomain of knowledge extraction. Most past and present research effort aims at efficient scaling for huge data repositories. Our approach focuses on qualitative improvement, mainly for "weak signals" detection and precise tracking of topical evolutions in the framework of information watch, though scalability is intrinsically guaranteed in a possibly distributed implementation. Our GERMEN algorithm exhaustively picks up the whole set of density peaks of the data at time t by identifying the local perturbations induced by the current document vector, such as changing cluster borders, or new/vanishing clusters. Optimality follows from the uniqueness (1) of the density landscape for any value of our zoom parameter, and (2) of the cluster allocation operated by our border propagation rule. This results in a rigorous independence from the data presentation order or any initialization parameter. We present here as a first step the only assessment of a static ...

  16. A Multi-Class Classification Algorithm Based on Contraction of the Closed Convex Hull

    Institute of Scientific and Technical Information of China (English)

    李雪辉; 魏立力

    2011-01-01

    Building on the maximal-margin linear classifier and the idea of contracting the closed convex hull, a linearly non-separable two-class problem can be transformed into a linearly separable one by the proposed closed-convex-hull contraction technique. Extending this idea to multi-class problems, a family of multi-class classification algorithms based on closed-convex-hull contraction is presented. The optimization problem has a clear geometric meaning and, to some extent, avoids the overly complex objective functions of previous multi-class methods; using the kernel trick, the approach extends to nonlinearly separable multi-class problems.

  17. Precision disablement aiming system

    Energy Technology Data Exchange (ETDEWEB)

    Monda, Mark J.; Hobart, Clinton G.; Gladwell, Thomas Scott

    2016-02-16

    A disrupter may be precisely aimed at a target by positioning a radiation source to direct radiation towards the target and a detector to capture the radiation that passes through the target. An aiming device is positioned between the radiation source and the target, so that a mechanical feature of the aiming device is superimposed on the target in a captured radiographic image. The location of the aiming device in the radiographic image is then used to aim the disrupter towards the target.

  18. The "Life Potential": a new complex algorithm to assess "Heart Rate Variability" from Holter records for cognitive and diagnostic aims. Preliminary experimental results showing its dependence on age, gender and health conditions

    CERN Document Server

    Barra, Orazio A

    2013-01-01

    Although HRV (Heart Rate Variability) analyses have been carried out for several decades, several limiting factors still make these analyses useless from a clinical point of view. The present paper aims at overcoming some of these limits by introducing the "Life Potential" (BMP), a new mathematical algorithm which seems to exhibit surprising cognitive and predictive capabilities. BMP is defined as a linear combination of five HRV Non-Linear Variables, in turn derived from the thermodynamic formalism of chaotic dynamic systems. The paper presents experimental measurements of BMP (Average Values and Standard Deviations) derived from 1048 Holter tests, matched in age and gender, including a control group of 356 healthy subjects. The main results are: (a) BMP always decreases when the age increases, and its dependence on age and gender is well established; (b) the shape of the age dependence within "healthy people" is different from that found in the general group: this behavior provides evidence of possible illn...

  19. A KD-Tree-Based KNN Algorithm for Text Classification

    Institute of Scientific and Technical Information of China (English)

    刘忠; 刘洋; 建晓

    2012-01-01

    This paper applies a KD-tree to the KNN text classification algorithm. A KD-tree is first built over the training text set; the tree is then searched for all ancestor nodes of the tested text's node, and this set of ancestor texts serves as the nearest-neighbour set of the text under test. The tested text is assigned the class of the nearest text with which it has the greatest similarity. This algorithm greatly reduces the number of text vectors that must be compared, with a time complexity of only O(log2N). Experiments show that the improved KNN text classification algorithm is more efficient than the traditional KNN text classifier.
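
For context, a standard KD-tree nearest-neighbour query (which is what gives the O(log N) behaviour on balanced data) can be sketched as below. Note this is the textbook backtracking search, not the paper's ancestor-set scheme, and the 2-D "document" vectors and class labels are hypothetical stand-ins for high-dimensional text features:

```python
import numpy as np

# Build a KD-tree: split on axes cyclically, median point at each node.
def build(points, labels, depth=0):
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]
    order = points[:, axis].argsort()
    points, labels = points[order], labels[order]
    m = len(points) // 2
    return {"pt": points[m], "label": labels[m], "axis": axis,
            "left": build(points[:m], labels[:m], depth + 1),
            "right": build(points[m + 1:], labels[m + 1:], depth + 1)}

# Nearest-neighbour query with backtracking: descend towards the query,
# then revisit the far side only if the best hypersphere crosses the split.
def nearest(node, q, best=None):
    if node is None:
        return best
    d = np.linalg.norm(q - node["pt"])
    if best is None or d < best[0]:
        best = (d, node["label"])
    diff = q[node["axis"]] - node["pt"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff <= 0
                 else (node["right"], node["left"]))
    best = nearest(near, q, best)
    if abs(diff) < best[0]:              # hypersphere crosses the split plane
        best = nearest(far, q, best)
    return best

X = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8]])
y = np.array(["sports", "sports", "finance", "finance"])
tree = build(X, y)
print(nearest(tree, np.array([4.9, 5.2]))[1])  # → finance
```

In high-dimensional text spaces the backtracking step degrades towards a linear scan, which is one reason approximate schemes like the ancestor-set idea trade exactness for speed.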

  20. A Complete Solution Classification and Unified Algorithmic Treatment for the One- and Two-Step Asymmetric S-Transverse Mass (MT2) Event Scale Statistic

    CERN Document Server

    Walker, Joel W

    2014-01-01

    The MT2, or "s-transverse mass", statistic was developed to cope with the difficulty of associating a parent mass scale with a missing transverse energy signature, given that models of new physics generally predict production of escaping particles in pairs, while collider experiments are sensitive to just a single vector sum over all sources of missing transverse momentum. This document focuses on the generalized extension of that statistic to asymmetric one- and two-step decay chains, with arbitrary child particle masses and upstream missing transverse momentum. It provides a unified theoretical formulation, complete solution classification, taxonomy of critical points, and technical algorithmic prescription for treatment of the MT2 event scale. An implementation of the described algorithm is available for download, and is also a deployable component of the author's fully-featured selection cut software package AEACuS (Algorithmic Event Arbiter and Cut Selector).

  1. A non-contact method based on the multiple signal classification algorithm to reduce the measurement time for accurate heart rate detection

    Science.gov (United States)

    Bechet, P.; Mitran, R.; Munteanu, M.

    2013-08-01

    Non-contact methods for the assessment of vital signs are of great interest to specialists due to the benefits obtained in both medical and special applications, such as surveillance, monitoring, and search and rescue. This paper investigates the possibility of implementing a digital processing algorithm based on MUSIC (Multiple Signal Classification) parametric spectral estimation in order to reduce the observation time needed to accurately measure the heart rate. It demonstrates that, by properly dimensioning the signal subspace, the MUSIC algorithm can be optimized to accurately assess the heart rate during an 8-28 s time interval. The performance of the processing algorithm was validated by minimizing the mean error of the heart rate after performing simultaneous comparative measurements on several subjects. To calculate the error, the reference heart rate was measured with a classic direct-contact measurement system.
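
The MUSIC step above can be sketched for a single noisy tone: build a snapshot covariance from overlapping windows, split it into signal and noise subspaces by eigendecomposition, and locate the frequency where the steering vector is most orthogonal to the noise subspace. All parameters here (sampling rate, window length, subspace dimension, noise level) are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(6)

fs, f0, N, m = 20.0, 1.3, 400, 20            # sample rate, true rate (Hz)
t = np.arange(N) / fs
x = np.sin(2 * np.pi * f0 * t) + 0.2 * rng.standard_normal(N)

# Sample covariance from overlapping length-m snapshots of the record.
S = np.lib.stride_tricks.sliding_window_view(x, m)
R = S.T @ S / len(S)
w, V = np.linalg.eigh(R)                     # eigenvalues in ascending order
En = V[:, :-2]                               # noise subspace (one real tone
                                             # occupies a 2-D signal subspace)

# MUSIC pseudospectrum: peaks where the steering vector is orthogonal to En.
freqs = np.linspace(0.5, 3.0, 251)
steer = np.exp(2j * np.pi * np.outer(freqs, np.arange(m)) / fs)
p_music = 1.0 / np.linalg.norm(steer.conj() @ En, axis=1) ** 2
f_hat = freqs[p_music.argmax()]
print(round(float(f_hat), 2))
```

Because MUSIC resolves frequencies well below the Rayleigh limit of a single window, a short record can suffice, which is the mechanism behind the reduced measurement time claimed above.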

  2. A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    Directory of Open Access Journals (Sweden)

    Li Zhen

    2008-05-01

    Full Text Available Abstract Background Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-toxicology data sets and use this model to evaluate the relative performance of different machine learning (ML) methods. Results The classification performance of Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Support Vector Machines (SVM) in the presence and absence of filter-based feature selection was analyzed using K-way cross-validation testing and independent validation on simulated in vitro assay data sets with varying levels of model complexity, number of irrelevant features and measurement noise. While the prediction accuracy of all ML methods decreased as non-causal (irrelevant) features were added, some ML methods performed better than others. In the limit of using a large number of features, ANN and SVM were always in the top performing set of methods while RPART and KNN (k = 5) were always in the poorest performing set. The addition of measurement noise and irrelevant features decreased the classification accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA performance is especially sensitive to the use of feature selection. Filter-based feature selection generally improved performance, most strikingly for LDA. Conclusion We have developed a novel simulation model to evaluate machine learning methods for the

  3. Research on SVM KNN Classification Algorithm Based on Hadoop Platform

    Institute of Scientific and Technical Information of China (English)

    李正杰; 黄刚

    2016-01-01

    The reform of data has brought unprecedented development: monitoring, analyzing, collecting, storing and applying rich and complex structured, semi-structured or unstructured data has become the mainstream of the information age. Classifying and processing the information contained in mass data requires better solutions, and the traditional data mining classification methods can no longer meet the demand. Facing these problems, this paper analyzes and improves classification algorithms in data mining and, by combining the algorithms, proposes an improved SVM KNN classification algorithm. On this basis, using the Hadoop cloud computing platform, the new classification algorithm is parallelized in the MapReduce model so that the improved algorithm can be applied to large-scale data processing. Finally, a data set is used for experimental verification. Compared with the traditional SVM classification algorithm, the results show that the improved algorithm meets the requirements of being efficient, fast, accurate and low-cost, and can effectively perform big data classification.
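The abstract does not spell out how SVM and KNN are combined. One common SVM-KNN hybrid (an assumption here, not necessarily the authors' scheme) trusts a linear decision far from the boundary and falls back to a KNN vote inside a margin band; the weights `w`, bias `b` and `margin` width below are hypothetical.

```python
import math
from collections import Counter

def hybrid_predict(x, w, b, train, k=3, margin=0.5):
    """Linear decision far from the boundary; KNN majority vote inside it."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    if abs(score) >= margin:                 # confident region: use the linear sign
        return 1 if score > 0 else -1
    # ambiguous region: vote among the k nearest training samples
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    return Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
```

In a MapReduce setting, the expensive neighbour search would be the parallelized step; that part is omitted here.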

  4. Applied Research on Telecom Customer Classification Based on the DBSCAN Algorithm

    Institute of Scientific and Technical Information of China (English)

    左国才; 周荣华; 符开耀

    2012-01-01

    In order to win in fierce market competition, telecommunications enterprises have realized the importance of customer classification, with different marketing strategies studied for different customers. The DBSCAN algorithm can achieve customer classification, but it is very sensitive to the initial parameters Eps and MinPts: different values produce different clustering results. An improved DBSCAN algorithm is proposed in order to achieve more accurate and comprehensive customer classification.
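The parameter sensitivity described above is easy to reproduce with a textbook DBSCAN; this sketch is the standard algorithm, not the paper's improved variant.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: cluster ids >= 0, noise labelled -1."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:      # not a core point: mark as noise for now
            labels[i] = -1
            continue
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:                 # grow the cluster from core points
            j = seeds.pop()
            if labels[j] is None:
                labels[j] = cluster
                jn = neighbors(j)
                if len(jn) >= min_pts:
                    seeds.extend(jn)
            elif labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core: border point
        cluster += 1
    return labels
```

With a tight radius the six test points form two clusters; with a huge radius they collapse into one, which is exactly the sensitivity the improved algorithm targets.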

  5. Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services

    Science.gov (United States)

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information, and data mining techniques can be used to analyze this data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public-sector hospital in Iran, with the idea that there is a contrast between patient and customer loyalty, to estimate customer lifetime value (CLV) for each patient. We used Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and find target, potential and loyal customers in order to implement stronger CRM. Two approaches are used for classification: first, the result of clustering is considered as the decision attribute in the classification process; second, the result of segmentation based on the CLV value of patients (estimated by RFML) is considered as the decision attribute. Finally, the results of the CHAID algorithm show significant hidden rules and identify existing patterns of hospital consumers.
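The clustering stage can be illustrated on a one-dimensional CLV-like score with plain Lloyd's k-means; the scores below are invented, and the paper's Two-step clustering and CHAID stages are omitted.

```python
def kmeans_1d(values, k, iters=20):
    """Lloyd's k-means on scalars (k >= 2): returns (centroids, labels)."""
    srt = sorted(values)
    # initialize centroids spread across the sorted value range
    cents = [srt[i * (len(srt) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[min(range(k), key=lambda c: abs(v - cents[c]))].append(v)
        cents = [sum(g) / len(g) if g else cents[i] for i, g in enumerate(groups)]
    labels = [min(range(k), key=lambda c: abs(v - cents[c])) for v in values]
    return cents, labels
```

The resulting segment label is what the first classification approach would then use as the decision attribute.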

  8. Binary classification of chalcone derivatives with LDA or KNN based on their antileishmanial activity and molecular descriptors selected using the Successive Projections Algorithm feature-selection technique.

    Science.gov (United States)

    Goodarzi, Mohammad; Saeys, Wouter; de Araujo, Mario Cesar Ugulino; Galvão, Roberto Kawakami Harrop; Vander Heyden, Yvan

    2014-01-23

    Chalcones are naturally occurring aromatic ketones, which consist of an α,β-unsaturated carbonyl system joining two aryl rings. These compounds are reported to exhibit several pharmacological activities, including antiparasitic, antibacterial, antifungal, anticancer, immunomodulatory, nitric oxide inhibition and anti-inflammatory effects. In the present work, a Quantitative Structure-Activity Relationship (QSAR) study is carried out to classify chalcone derivatives with respect to their antileishmanial activity (active/inactive) on the basis of molecular descriptors. For this purpose, two techniques to select descriptors are employed, the Successive Projections Algorithm (SPA) and the Genetic Algorithm (GA). The selected descriptors are initially employed to build Linear Discriminant Analysis (LDA) models. An additional investigation is then carried out to determine whether the results can be improved by using a non-parametric classification technique (One Nearest Neighbour, 1NN). In a case study involving 100 chalcone derivatives, the 1NN models were found to provide better rates of correct classification than LDA, both in the training and test sets. The best result was achieved by a SPA-1NN model with six molecular descriptors, which provided correct classification rates of 97% and 84% for the training and test sets, respectively.
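The role of descriptor selection can be illustrated with a far simpler filter than SPA or a GA; the two-class separation score below is a generic stand-in, not the Successive Projections Algorithm, and the descriptor values are invented.

```python
import statistics

def separation_scores(X, y):
    """Score each descriptor column by |mean difference| / pooled spread for a
    two-class problem; higher means more discriminative (generic filter, not SPA)."""
    classes = sorted(set(y))
    a = [row for row, lbl in zip(X, y) if lbl == classes[0]]
    b = [row for row, lbl in zip(X, y) if lbl == classes[1]]
    scores = []
    for j in range(len(X[0])):
        ca = [r[j] for r in a]
        cb = [r[j] for r in b]
        spread = statistics.pstdev(ca) + statistics.pstdev(cb) or 1e-12
        scores.append(abs(statistics.fmean(ca) - statistics.fmean(cb)) / spread)
    return scores
```

Descriptors ranked this way could then feed an LDA or 1NN model as in the study.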

  9. An Evaluation of Different Training Sample Allocation Schemes for Discrete and Continuous Land Cover Classification Using Decision Tree-Based Algorithms

    Directory of Open Access Journals (Sweden)

    René Roland Colditz

    2015-07-01

    Full Text Available Land cover mapping for large regions often employs satellite images of medium to coarse spatial resolution, which complicates mapping of discrete classes. Class memberships, which estimate the proportion of each class for every pixel, have been suggested as an alternative. This paper compares different strategies of training data allocation for discrete and continuous land cover mapping using classification and regression tree algorithms. In addition to measures of discrete and continuous map accuracy, the correct estimation of the area is another important criterion. A subset of the 30 m national land cover dataset of 2006 (NLCD2006) of the United States was used as the reference set to classify NADIR BRDF-adjusted surface reflectance time series of MODIS at 900 m spatial resolution. Results show that sampling of heterogeneous pixels and sample allocation according to the expected area of each class is best for classification trees. Regression trees for continuous land cover mapping should be trained with random allocation, and predictions should be normalized with a linear scaling function to correctly estimate the total area. From the tested algorithms, random forest classification yields lower errors than boosted trees of C5.0, and Cubist shows higher accuracies than random forest regression.
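Sample allocation according to the expected area of each class, the strategy found best for classification trees, is a one-liner; the class areas below are invented, and rounding can make the per-class counts miss the total by a few samples.

```python
def allocate_by_area(class_areas, total_samples):
    """Proportional allocation: training samples per class ~ expected class area."""
    total_area = sum(class_areas.values())
    return {c: round(total_samples * a / total_area)
            for c, a in class_areas.items()}
```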

  10. A Document Classification Algorithm Based on LDA

    Institute of Scientific and Technical Information of China (English)

    何锦群; 刘朋杰

    2014-01-01

    Latent Dirichlet Allocation (LDA) is a classic topic model which can extract latent topics from a large corpus. The model assumes that if a document is relevant to a topic, then all tokens in the document are relevant to that topic. In practice this widens the scope of each topic and prevents accurate localization of the latent semantics of topic words, limiting the model's robustness and effectiveness. By narrowing the scope from which each document is generated, this paper presents an improved text classification algorithm, gLDA, which adds a topic-category distribution parameter to Latent Dirichlet Allocation so that documents are generated from the category to which they are most relevant. Gibbs sampling is employed for approximate inference. Preliminary experiments on the Reuters-21578 dataset and the Fudan University text corpus show that the classification performance improves to a certain extent over the traditional topic classification model.

  11. Classification of textures in satellite image with Gabor filters and a multi layer perceptron with back propagation algorithm obtaining high accuracy

    Directory of Open Access Journals (Sweden)

    Adriano Beluco, Paulo M. Engel, Alexandre Beluco

    2015-01-01

    Full Text Available The classification of images, in many cases, is applied to identify an alphanumeric string, a facial expression or any other characteristic. In the case of satellite images it is necessary to classify all the pixels of the image. This article describes a supervised classification method for remote sensing images that integrates the importance of attributes in selecting features with the efficiency of artificial neural networks in the classification process, resulting in high accuracy for real images. The method consists of a texture segmentation based on Gabor filtering followed by the image classification itself with a multilayer artificial neural network trained by a back propagation algorithm. The method was first applied to a synthetic image, for training, and then applied to a satellite image. Some results of experiments are presented in detail and discussed. The application of the method to the synthetic image resulted in the identification of 89.05% of the pixels of the image, while applying it to the satellite image resulted in the identification of 85.15% of the pixels. The result for the satellite image can be considered a result of high accuracy.

  12. Precision laser aiming system

    Energy Technology Data Exchange (ETDEWEB)

    Ahrens, Brandon R. (Albuquerque, NM); Todd, Steven N. (Rio Rancho, NM)

    2009-04-28

    A precision laser aiming system comprises a disrupter tool, a reflector, and a laser fixture. The disrupter tool, the reflector and the laser fixture are configurable for iterative alignment and aiming toward an explosive device threat. The invention enables a disrupter to be quickly and accurately set up, aligned, and aimed in order to render safe or to disrupt a target from a standoff position.

  13. Performance Analysis of Anti-Phishing Tools and Study of Classification Data Mining Algorithms for a Novel Anti-Phishing System

    Directory of Open Access Journals (Sweden)

    Rajendra Gupta

    2015-11-01

    Full Text Available Phishing is a kind of website spoofing used to steal sensitive and important information from web users, such as online banking passwords and credit card information. In a phishing attack, the attacker generates warning messages to the user about security issues, asks for confidential information through phishing emails, asks the user to update account information, and so on. Several experimental designs have been proposed earlier to counter phishing attacks, but the earlier systems do not give more than 90 percent successful results, and in some cases a tool gives only 50-60 percent successful results. In this paper, a novel algorithm is developed to check the performance of an anti-phishing system, and the resulting data set is compared with the data sets of existing anti-phishing tools. The performance of the novel anti-phishing system is evaluated with four different classification data mining algorithms, Class Imbalance Problem (CIP), rule-based classification (Sequential Covering Algorithm, SCA), Nearest Neighbour Classification (NNC) and Bayesian Classifier (BC), on a data set of phishing and legitimate websites. The proposed system shows a lower error rate and better performance than existing system tools.

  14. Steganography Detection Algorithm Design Based on Histogram Statistical Classification

    Institute of Scientific and Technical Information of China (English)

    邱志宏

    2013-01-01

    In order to improve the detection performance of steganography detection algorithms, this paper proposes a steganography detection algorithm based on histogram statistical classification. By extracting histogram characteristic parameters from the image information and using a classification model built on artificial neural networks, the algorithm accurately identifies images with embedded steganographic information. The design principle and process of this steganography detection algorithm are analysed in detail, and an experimental test environment is constructed. Test results indicate that both the detection success rate and the false alarm rate of the proposed algorithm are better than those of the EzStego detection tool.
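The histogram feature-extraction step might look like the sketch below; the artificial-neural-network classification stage is omitted, and the 8-bin resolution is an arbitrary choice.

```python
def histogram_features(pixels, bins=8):
    """Normalized intensity histogram of an 8-bit grayscale image (values 0-255),
    usable as the input vector for a classifier."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    return [c / len(pixels) for c in counts]
```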

  15. Classification Rule Mining Based on Improved Ant-miner Algorithm

    Institute of Scientific and Technical Information of China (English)

    肖菁; 梁燕辉

    2012-01-01

    In order to improve the classification rule accuracy of the classical Ant-miner algorithm, this paper proposes an improved Ant-miner algorithm for classification rule mining. A heuristic function based on sample density and sample proportion is constructed to avoid the bias caused by equal selection probabilities in Ant-miner. A pruning strategy with a mutation probability is employed to expand the search space and improve rule accuracy. An evaporation coefficient is added to Ant-miner's pheromone update formula, bringing it closer to the foraging behavior of real ants and preventing premature convergence. Experimental results on UCI datasets show that the proposed algorithm is promising and obtains higher prediction accuracy than the original Ant-miner algorithm.

  16. An evidence gathering and assessment technique designed for a forest cover classification algorithm based on the Dempster-Shafer theory of evidence

    Science.gov (United States)

    Szymanski, David Lawrence

    This thesis presents a new approach for classifying Landsat 5 Thematic Mapper (TM) imagery that utilizes digitally represented, non-spectral data in the classification step. A classification algorithm that is based on the Dempster-Shafer theory of evidence is developed and tested for its ability to provide an accurate representation of forest cover on the ground at the Anderson et al (1976) level II. The research focuses on defining an objective, systematic method of gathering and assessing the evidence from digital sources including TM data, the normalized difference vegetation index, soils, slope, aspect, and elevation. The algorithm is implemented using the ESRI ArcView Spatial Analyst software package and the Grid spatial data structure with software coded in both ArcView Avenue and also C. The methodology uses frequency of occurrence information to gather evidence and also introduces measures of evidence quality that quantify the ability of the evidence source to differentiate the Anderson forest cover classes. The measures are derived objectively and empirically and are based on common principles of legal argument. The evidence assessment measures augment the Dempster-Shafer theory and the research will determine if they provide an argument that is mentally sound, credible, and consistent. This research produces a method for identifying, assessing, and combining evidence sources using the Dempster-Shafer theory that results in a classified image containing the Anderson forest cover class. Test results indicate that the new classifier performs with accuracy that is similar to the traditional maximum likelihood approach. However, confusion among the deciduous and mixed classes remains. The utility of the evidence gathering method and also the evidence assessment method is demonstrated and confirmed. The algorithm presents an operational method of using the Dempster-Shafer theory of evidence for forest classification.
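The core of evidence combination in the Dempster-Shafer theory is Dempster's rule; the sketch below combines two mass functions over frozenset focal elements, with invented masses for two forest-cover hypotheses.

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination: intersect focal elements,
    then renormalize by (1 - conflict)."""
    combined, conflict = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb   # mass assigned to the empty set
    return {s: m / (1.0 - conflict) for s, m in combined.items()}
```

Each evidence source (TM bands, NDVI, soils, slope, etc.) would contribute one such mass function, combined pairwise.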

  17. A Classification Algorithm for Noisy and Dynamic Data Streams

    Institute of Scientific and Technical Information of China (English)

    黄树成; 刘悦

    2016-01-01

    Tracking concept drift in data streams has become a hot topic in data stream mining. However, noise directly affects both the detection of concept drift and the classification result, so a classification algorithm with good noise tolerance has important research and application value. On this basis, a new classification algorithm for mining dynamic data streams is proposed, FDBCA (FDBSCAN-based classification algorithm), which filters noisy data with Fast-DBSCAN (FDBSCAN), an improved version of the density-based spatial clustering algorithm DBSCAN. The algorithm builds a weighted ensemble of UFFT-based decision trees as base classifiers and uses the μ hypothesis test to detect concept drift and update the classification model dynamically. Experimental results show that FDBCA achieves better classification accuracy than several existing algorithms when handling noisy data streams with concept drift.

  18. Graph Classification Algorithm Based on Divide-and-Conquer Strategy and Hash Linked List

    Institute of Scientific and Technical Information of China (English)

    孙伟; 朱正礼

    2013-01-01

    Mainstream graph data classification algorithms are based on frequent substructure mining, which inevitably leads to repeated searches of the global data space; the related algorithms are therefore inefficient and cannot meet specific requirements. To address these shortcomings, a divide-and-conquer strategy is used to design a modular data space and an algorithm that uses a hash linked list to store addresses and support degrees. The original database is partitioned into a limited number of sub-modules according to rules, and the gSpan algorithm is applied to each module to obtain locally frequent sub-patterns. Hash functions then map each module's mining results to unique storage addresses while recording the corresponding supports in a hash linked list. Finally, the globally frequent sub-patterns are obtained and a graph data classifier is built. The algorithm avoids repeated searches of the global space, which substantially improves execution efficiency, and the modularized data can be loaded into memory at once, saving memory overhead. Experiments show that the new algorithm builds the classification model 1.2 to 3.2 times faster than mainstream graph classification algorithms, with no loss of classification accuracy.
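The merge step, in which per-module gSpan results are accumulated through a hash structure, can be sketched with an ordinary Python dict standing in for the hash linked list; the pattern keys, counts and threshold are invented, and the gSpan mining itself is omitted.

```python
from collections import defaultdict

def merge_local_patterns(module_results, min_support):
    """Sum per-module pattern supports in a hash map and keep the
    globally frequent patterns."""
    support = defaultdict(int)
    for local in module_results:          # one {pattern: support} dict per module
        for pattern, count in local.items():
            support[pattern] += count
    return {p: s for p, s in support.items() if s >= min_support}
```

Because each pattern is looked up by hash, the merge never rescans the original graph database, which is the efficiency argument the paper makes.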

  19. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available The existing material classification was proposed to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system considering material quality character, quality cost and quality influence. The Analytic Hierarchy Process helps with feature selection and classification decisions. We use an improved Kraljic Portfolio Matrix to establish a three-dimensional classification model, by which aircraft materials can be divided into eight types, including general, key, risk and leveraged types. Aiming to improve the classification accuracy for various materials, the Support Vector Machine algorithm is introduced. Finally, we compare the SVM and a BP neural network in the application. The results prove that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  20. Content-based and algorithmic classifications of journals: perspectives on the dynamics of scientific communication and indexer effects

    NARCIS (Netherlands)

    I. Rafols; L. Leydesdorff

    2009-01-01

    The aggregated journal-journal citation matrix—based on the Journal Citation Reports (JCR) of the Science Citation Index—can be decomposed by indexers and/or algorithmically. In this study, we test the results of two recently available algorithms for the decomposition of large matrices against two c

  1. Prediction models discriminating between nonlocomotive and locomotive activities in children using a triaxial accelerometer with a gravity-removal physical activity classification algorithm.

    Directory of Open Access Journals (Sweden)

    Yuki Hikihara

    Full Text Available The aims of our study were to examine whether a gravity-removal physical activity classification algorithm (GRPACA) is applicable for discrimination between nonlocomotive and locomotive activities for various physical activities (PAs) of children and to prove that this approach improves the estimation accuracy of a prediction model for children using an accelerometer. Japanese children (42 boys and 26 girls) attending primary school were invited to participate in this study. We used a triaxial accelerometer with a sampling interval of 32 Hz and within a measurement range of ±6 G. Participants were asked to perform 6 nonlocomotive and 5 locomotive activities. We measured raw synthetic acceleration with the triaxial accelerometer and monitored oxygen consumption and carbon dioxide production during each activity with the Douglas bag method. In addition, the resting metabolic rate (RMR) was measured with the subject sitting on a chair to calculate metabolic equivalents (METs). When the ratio of unfiltered synthetic acceleration (USA) and filtered synthetic acceleration (FSA) was 1.12, the rate of correct discrimination between nonlocomotive and locomotive activities was excellent, at 99.1% on average. As a result, a strong linear relationship was found for both nonlocomotive (METs = 0.013×synthetic acceleration + 1.220, R2 = 0.772) and locomotive (METs = 0.005×synthetic acceleration + 0.944, R2 = 0.880) activities, except for climbing down and up. The mean differences between the values predicted by our model and measured METs were -0.50 to 0.23 for moderate to vigorous intensity (>3.5 METs) PAs like running, ball throwing and washing the floor, which were regarded as unpredictable PAs. In addition, the difference was within 0.25 METs for sedentary to mild moderate PAs (<3.5 METs). Our specific calibration model that discriminates between nonlocomotive and locomotive activities for children can be useful to evaluate the sedentary to vigorous
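The calibration model reduces to a ratio threshold followed by one of two linear regressions, using the coefficients reported above. Treating USA/FSA ratios of at least 1.12 as nonlocomotive, and feeding the filtered acceleration into the regression, are assumptions made purely for illustration.

```python
NONLOCOMOTIVE = (0.013, 1.220)   # METs = 0.013 * acc + 1.220 (R^2 = 0.772)
LOCOMOTIVE = (0.005, 0.944)      # METs = 0.005 * acc + 0.944 (R^2 = 0.880)
RATIO_CUT = 1.12                 # USA/FSA discrimination threshold

def predict_mets(usa, fsa):
    """Pick the regression by the USA/FSA ratio, then apply it.
    Branch direction and regression input are assumptions (see lead-in)."""
    slope, intercept = NONLOCOMOTIVE if usa / fsa >= RATIO_CUT else LOCOMOTIVE
    return slope * fsa + intercept
```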

  2. Efficient and robust phase unwrapping algorithm based on unscented Kalman filter, the strategy of quantizing paths-guided map, and pixel classification strategy.

    Science.gov (United States)

    Xie, Xian Ming; Zeng, Qing Ning

    2015-11-01

    This paper presents an efficient and robust phase unwrapping algorithm which combines an unscented Kalman filter (UKF) with a strategy of quantizing a paths-guided map and a pixel classification strategy based on phase quality information. The advantages of the proposed method depend on the following contributions: (1) the strategy of quantizing the paths-guided map can accelerate the search for unwrapping paths and greatly reduce the time consumed by the unwrapping procedure; (2) the pixel classification strategy proposed by this paper can reduce the error propagation effect by decreasing the number of pixels with equal quantized paths-guided values in the process of unwrapping; and (3) the unscented Kalman filter enables simultaneous filtering and unwrapping without the information loss caused by linearization of a nonlinear model. In addition, a new paths-guided map derived from a phase quality map is inserted into the strategy of quantizing the paths-guided map to provide a more robust unwrapping path, and thus ensures better unwrapping results. Results obtained from synthetic data and real data show that the proposed method efficiently obtains better solutions than some of the most-used algorithms. PMID:26560585
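The quality-guided UKF method builds on basic phase unwrapping; the one-dimensional Itoh sketch below shows only that foundation, with no paths-guided map or filtering.

```python
import math

def unwrap_1d(wrapped):
    """Itoh's method: add multiples of 2*pi so that successive phase
    differences stay within (-pi, pi]."""
    out = [wrapped[0]]
    for p in wrapped[1:]:
        d = p - out[-1]
        d -= 2.0 * math.pi * round(d / (2.0 * math.pi))  # wrap the step
        out.append(out[-1] + d)
    return out
```

This recovers the true phase exactly whenever neighbouring true-phase steps are smaller than pi; the paper's 2-D, noise-robust setting is precisely where this naive assumption breaks down.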

  3. Paper 5 : Surveillance of Multiple Congenital Anomalies: Implementation of a Computer Algorithm in European Registers for Classification of Cases

    NARCIS (Netherlands)

    Garne, Ester; Dolk, Helen; Loane, Maria; Wellesley, Diana; Barisic, Ingeborg; Calzolari, Elisa; Densem, James

    2011-01-01

    BACKGROUND: Surveillance of multiple congenital anomalies is considered to be more sensitive for the detection of new teratogens than surveillance of all or isolated congenital anomalies. Current literature proposes the manual review of all cases for classification into isolated or multiple congenit

  4. Paper 5: Surveillance of multiple congenital anomalies: implementation of a computer algorithm in European registers for classification of cases

    DEFF Research Database (Denmark)

    Garne, Ester; Dolk, Helen; Loane, Maria;

    2011-01-01

    Surveillance of multiple congenital anomalies is considered to be more sensitive for the detection of new teratogens than surveillance of all or isolated congenital anomalies. Current literature proposes the manual review of all cases for classification into isolated or multiple congenital anomal...

  5. Supervised Semi-definite Embedding Algorithms for Classification

    Institute of Scientific and Technical Information of China (English)

    董文明; 孔德庸

    2014-01-01

    Based on the spectral-analysis manifold learning algorithm Semi-definite Embedding (SDE), we put forward two supervised SSDE algorithms: weighted SSDE and optimal-distance SSDE. Simulation results verify the effectiveness of the algorithms.

  6. An Improved KNN Algorithm Based on Multi-attribute Classification

    Institute of Scientific and Technical Information of China (English)

    张炯辉; 许尧舜

    2013-01-01

    To improve the classification accuracy of the conventional Euclidean KNN algorithm and the improved KNN algorithm based on information entropy, this paper proposes an improved KNN algorithm based on multi-attribute classification. The new algorithm proceeds as follows: i) classify the attributes, according to the number of distinct attribute values as a proportion of the samples in the set, into discrete attributes suitable for the entropy-based KNN algorithm and continuous attributes suitable for the conventional Euclidean KNN similarity; ii) process the two types of attributes separately, then sum the two results with weighting and take the sum as the distance between samples; iii) select the k samples closest to the test sample to determine the decision attribute class of the test sample.
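A simplified sketch of the mixed-distance idea: a 0/1 mismatch over discrete attributes (where the algorithm above uses an entropy-based measure) combined with Euclidean distance over continuous ones. The weights and sample data are invented.

```python
import math
from collections import Counter

def mixed_distance(a, b, discrete_idx, w_disc=0.5, w_cont=0.5):
    """Weighted sum of a 0/1 mismatch distance over discrete attributes and a
    Euclidean distance over the remaining continuous attributes."""
    disc = sum(a[i] != b[i] for i in discrete_idx)
    cont = math.sqrt(sum((a[i] - b[i]) ** 2
                         for i in range(len(a)) if i not in discrete_idx))
    return w_disc * disc + w_cont * cont

def knn_predict(train, x, k, discrete_idx):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(train,
                     key=lambda s: mixed_distance(s[0], x, discrete_idx))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```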

  7. Image Classification System Design Based on the Bayesian Algorithm

    Institute of Scientific and Technical Information of China (English)

    席伟

    2012-01-01

    Image classification is an important research direction in information processing, involving image feature extraction, construction of an image data decision table, and selection of an appropriate pattern recognition algorithm. This paper adopts the minimum-error-probability Bayesian classifier commonly used in pattern recognition to solve a two-class image classification problem. A main interface with good human-computer interaction is designed using the MATLAB graphical user interface (GUI) approach, and the running results of a practical example are given. The work has practical significance for promoting the application and popularization of pattern recognition theory in image classification.
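The minimum-error-probability rule picks the class maximizing prior times likelihood; a one-dimensional Gaussian version with invented training samples looks like this (the paper's MATLAB GUI and image features are omitted).

```python
import math

def fit_gaussian(samples):
    """Estimate mean and variance of a 1-D class-conditional density."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / len(samples)
    return m, v

def bayes_classify(x, params, priors):
    """Minimum-error-probability rule: pick the class with the largest
    log(prior) + log Gaussian likelihood."""
    def score(c):
        m, v = params[c]
        return (math.log(priors[c])
                - 0.5 * math.log(2 * math.pi * v)
                - (x - m) ** 2 / (2 * v))
    return max(params, key=score)
```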

  8. Aims of the Workshop

    CERN Document Server

    Dornan, P J

    2010-01-01

    There are challenges and opportunities for the European particle physics community to engage with innovative and exciting developments which could lead to precision measurements in the neutrino sector. These have the potential to yield significant advances in the understanding of CP violation, the flavour riddle and theories beyond the Standard Model. This workshop aims to start the process of a dialogue in Europe so that informed decisions on the appropriate directions to pursue can be made in a few years time.

  9. A hidden Markov model based algorithm for data stream classification

    Institute of Scientific and Technical Information of China (English)

    潘怡; 何可可; 李国徽

    2014-01-01

    To improve classification accuracy under periodic concept drift in data streams, a hidden Markov model based stream data classification (HMM-SDC) algorithm is presented. The algorithm builds a state transition probability matrix for the sequence of drifting concept states, aligning the hidden concept states with the observable output sequence through a hidden Markov chain model, so that the next concept state can be predicted from the observed probability distribution. When the mean prediction error exceeds a user-defined threshold, the algorithm updates the transition matrix parameters automatically, so concept drift can be predicted effectively without re-learning historical concepts. In addition, the algorithm trains its sample set with a semi-supervised K-Means method, which reduces the cost of manual labeling and avoids the under-training of the hidden Markov model caused by insufficient labeled samples. Experimental results show that the new algorithm achieves better classification accuracy and timeliness on periodically drifting data streams than traditional ensemble classification algorithms.

  10. A genetic algorithm-Bayesian network approach for the analysis of metabolomics and spectroscopic data: application to the rapid identification of Bacillus spores and classification of Bacillus species

    Directory of Open Access Journals (Sweden)

    Goodacre Royston

    2011-01-01

    Full Text Available Abstract Background The rapid identification of Bacillus spores and bacterial identification are paramount because of their implications in food poisoning, pathogenesis and their use as potential biowarfare agents. Many automated analytical techniques such as Curie-point pyrolysis mass spectrometry (Py-MS) have been used to identify bacterial spores, giving rise to large amounts of analytical data. This high number of features makes interpretation of the data extremely difficult. We analysed Py-MS data from 36 different strains of aerobic endospore-forming bacteria encompassing seven different species. These bacteria were grown axenically on nutrient agar, and vegetative biomass and spores were analyzed by Curie-point Py-MS. Results We developed a novel genetic algorithm-Bayesian network algorithm that accurately identifies and selects a small subset of key relevant mass spectra (biomarkers) to be further analysed. Once identified, this subset of relevant biomarkers was used to identify Bacillus spores successfully and to identify Bacillus species via a Bayesian network model built specifically for this reduced set of features. Conclusions The final compact Bayesian network classification model is parsimonious and computationally fast to run, and its graphical visualization allows easy interpretation of the probabilistic relationships among the selected biomarkers. In addition, we compare the features selected by the genetic algorithm-Bayesian network approach with those selected by partial least squares-discriminant analysis (PLS-DA). The classification accuracy results show that the set of features selected by the GA-BN is far superior to that of PLS-DA.

  11. SPEECH CLASSIFICATION USING ZERNIKE MOMENTS

    Directory of Open Access Journals (Sweden)

    Manisha Pacharne

    2011-07-01

    Full Text Available Speech recognition is a very popular field of research, and speech classification improves the performance of speech recognition. Different patterns are identified for classification using various characteristics, or features, of speech. A typical speech feature set consists of many parameters representing the speech signal, such as standard deviation, magnitude and zero-crossing rate. Considering all of these parameters greatly increases computation load and time, so they need to be reduced by selecting the important features. Feature selection aims to obtain an optimal subset of features from a given space, leading to high classification performance; feature selection methods should therefore derive features that reduce the amount of data used for classification. High recognition accuracy is in demand for speech recognition systems. In this paper, Zernike moments of the speech signal are extracted and used as its features. Zernike moments are shape descriptors generally used to describe the shape of a region. To extract them, the one-dimensional audio signal is converted into a two-dimensional image file. Various feature selection and ranking algorithms, namely t-Test, Chi Square, Fisher Score, ReliefF, Gini Index and Information Gain, are then used to select the important features of the speech signal. The performance of the algorithms is evaluated using classifier accuracy. A Support Vector Machine (SVM) is used as the classifier's learning algorithm, and it is observed that accuracy improves considerably after removing unwanted features.
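Of the ranking criteria listed, the Fisher score is the simplest to sketch: for a two-class problem, a feature is scored by the squared difference of the class means over the sum of the within-class variances, and the top-scoring features are kept. A minimal illustration, not the paper's pipeline:

```python
def fisher_scores(X, y):
    """Fisher score per feature for a two-class problem:
    (difference of class means)^2 / (sum of within-class variances)."""
    c0, c1 = sorted(set(y))
    mean = lambda v: sum(v) / len(v)
    var = lambda v: sum((x - mean(v)) ** 2 for x in v) / len(v)
    scores = []
    for i in range(len(X[0])):
        a = [row[i] for row, lab in zip(X, y) if lab == c0]
        b = [row[i] for row, lab in zip(X, y) if lab == c1]
        den = var(a) + var(b)
        scores.append((mean(a) - mean(b)) ** 2 / (den if den else 1e-12))
    return scores

def top_k_features(X, y, k):
    """Indices of the k highest-scoring features."""
    s = fisher_scores(X, y)
    return sorted(range(len(s)), key=lambda i: -s[i])[:k]
```

A feature that separates the two classes cleanly gets a large score, while a noise feature with overlapping class means scores near zero and is dropped.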

  12. Visual-Based Clothing Attribute Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘聪; 丁贵广

    2016-01-01

    We propose an algorithm for classifying clothing image attributes. To handle the noise in clothing images, key parts of clothing are located by a well-trained human part detector and redundant information is eliminated, which improves the accuracy of attribute classification. Additionally, a novel feature descriptor based on the human skeleton and skin is proposed; it describes clothing features with fewer dimensions, which significantly speeds up the classifiers of the related attributes. To deal with the complex semantics of clothing attributes and the diversity of classification needs, different SVM decision tree models are built for different attributes, which improves classification efficiency and satisfies both coarse-grained and fine-grained clothing classification. Experiments demonstrate the effectiveness of the proposed algorithm on multiple clothing attribute classification tasks.

  13. Dual-energy cone-beam CT with a flat-panel detector: Effect of reconstruction algorithm on material classification

    Energy Technology Data Exchange (ETDEWEB)

    Zbijewski, W., E-mail: wzbijewski@jhu.edu; Gang, G. J.; Xu, J.; Wang, A. S.; Stayman, J. W. [Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205 (United States); Taguchi, K.; Carrino, J. A. [Russell H. Morgan Department of Radiology, Johns Hopkins University, Baltimore, Maryland 21205 (United States); Siewerdsen, J. H. [Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205 and Russell H. Morgan Department of Radiology, Johns Hopkins University, Baltimore, Maryland 21205 (United States)

    2014-02-15

    Purpose: Cone-beam CT (CBCT) with a flat-panel detector (FPD) is finding application in areas such as breast and musculoskeletal imaging, where dual-energy (DE) capabilities offer potential benefit. The authors investigate the accuracy of material classification in DE CBCT using filtered backprojection (FBP) and penalized likelihood (PL) reconstruction and optimize contrast-enhanced DE CBCT of the joints as a function of dose, material concentration, and detail size. Methods: Phantoms consisting of a 15 cm diameter water cylinder with solid calcium inserts (50–200 mg/ml, 3–28.4 mm diameter) and solid iodine inserts (2–10 mg/ml, 3–28.4 mm diameter), as well as a cadaveric knee with intra-articular injection of iodine were imaged on a CBCT bench with a Varian 4343 FPD. The low energy (LE) beam was 70 kVp (+0.2 mm Cu), and the high energy (HE) beam was 120 kVp (+0.2 mm Cu, +0.5 mm Ag). Total dose (LE+HE) was varied from 3.1 to 15.6 mGy with equal dose allocation. Image-based DE classification involved a nearest distance classifier in the space of LE versus HE attenuation values. Recognizing the differences in noise between LE and HE beams, the LE and HE data were differentially filtered (in FBP) or regularized (in PL). Both a quadratic (PLQ) and a total-variation penalty (PLTV) were investigated for PL. The performance of DE CBCT material discrimination was quantified in terms of voxelwise specificity, sensitivity, and accuracy. Results: Noise in the HE image was primarily responsible for classification errors within the contrast inserts, whereas noise in the LE image mainly influenced classification in the surrounding water. For inserts of diameter 28.4 mm, DE CBCT reconstructions were optimized to maximize the total combined accuracy across the range of calcium and iodine concentrations, yielding values of ∼88% for FBP and PLQ, and ∼95% for PLTV at 3.1 mGy total dose, increasing to ∼95% for FBP and PLQ, and ∼98% for PLTV at 15.6 mGy total dose. For a
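The image-based DE classification step described above is a nearest-distance rule in the space of LE versus HE attenuation values. A minimal sketch; the reference attenuation pairs below are purely illustrative placeholders, not values from the study:

```python
import math

# Hypothetical (LE, HE) reference attenuation pairs per material.
MATERIALS = {"water": (0.20, 0.18), "calcium": (0.60, 0.40), "iodine": (0.90, 0.30)}

def classify_voxel(le, he, materials=MATERIALS):
    """Assign a voxel to the material whose reference (LE, HE) attenuation
    pair lies nearest in attenuation space."""
    return min(materials, key=lambda m: math.hypot(le - materials[m][0],
                                                   he - materials[m][1]))
```

Noise in either channel displaces a voxel in this plane, which is why the study regularizes the LE and HE reconstructions differently before classifying.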

  14. Pap Smear Diagnosis Using a Hybrid Intelligent Scheme Focusing on Genetic Algorithm Based Feature Selection and Nearest Neighbor Classification

    DEFF Research Database (Denmark)

    Marinakis, Yannis; Dounias, Georgios; Jantzen, Jan

    2009-01-01

    The term pap-smear refers to samples of human cells stained by the so-called Papanicolaou method. The purpose of the Papanicolaou method is to diagnose pre-cancerous cell changes before they progress to invasive carcinoma. In this paper a metaheuristic algorithm is proposed in order to classify t...... other previously applied intelligent approaches....

  15. Impact of Reducing Polarimetric SAR Input on the Uncertainty of Crop Classifications Based on the Random Forests Algorithm

    DEFF Research Database (Denmark)

    Loosvelt, Lien; Peters, Jan; Skriver, Henning;

    2012-01-01

    features in multidate SAR data sets: an accuracy-oriented reduction and an efficiency-oriented reduction. For both strategies, the effect of feature reduction on the quality of the land cover map is assessed. The analyzed data set consists of 20 polarimetric features derived from L-band (1.25 GHz) SAR...... general and specific features for crop classification. Based on the importance ranking, features are gradually removed from the single-date data sets in order to construct several multidate data sets with decreasing dimensionality. In the accuracy-oriented and efficiency-oriented reduction, the input is...

  16. Comparison of Classification Algorithms in a Coal Data Analysis System

    Institute of Scientific and Technical Information of China (English)

    莫洪武; 万荣泽

    2013-01-01

    During coal mining, the collected exploration data must be analyzed and studied in order to mine more valuable information from them. In data mining there are several kinds of data classification algorithms, which a coal data system can apply in real work according to the data type. Focusing on data classification algorithms, this paper studies and analyzes their role in coal exploration data analysis. By studying and comparing the performance of multiple classification algorithms on this data analysis task, we identify the classification algorithms that process exploration data most effectively.

  17. Comprehensive gene expression profiling and immunohistochemical studies support application of immunophenotypic algorithm for molecular subtype classification in diffuse large B-cell lymphoma

    DEFF Research Database (Denmark)

    Visco, C; Xu-Monette, Z Y; Miranda, R N;

    2012-01-01

    Gene expression profiling (GEP) has stratified diffuse large B-cell lymphoma (DLBCL) into molecular subgroups that correspond to different stages of lymphocyte development, namely germinal center B-cell like and activated B-cell like. This classification has prognostic significance, but GEP...... is expensive and not readily applicable to daily practice, which has led to immunohistochemical algorithms proposed as a surrogate for GEP analysis. We assembled tissue microarrays from 475 de novo DLBCL patients who were treated with rituximab-CHOP chemotherapy. All cases were successfully profiled by GEP...... on formalin-fixed, paraffin-embedded tissue samples. Sections were stained with antibodies reactive with CD10, GCET1, FOXP1, MUM1 and BCL6 and cases were classified following a rationale of sequential steps of differentiation of B cells. Cutoffs for each marker were obtained using receiver...

  18. [Aiming for zero blindness].

    Science.gov (United States)

    Nakazawa, Toru

    2015-03-01

    -independent factors, as well as our investigation of ways to improve the clinical evaluation of the disease. Our research was prompted by the multifactorial nature of glaucoma. There is a high degree of variability in the pattern and speed of the progression of visual field defects in individual patients, presenting a major obstacle for successful clinical trials. To overcome this, we classified the eyes of glaucoma patients into 4 types, corresponding to the 4 patterns of glaucomatous optic nerve head morphology described: by Nicolela et al. and then tested the validity of this method by assessing the uniformity of clinical features in each group. We found that in normal tension glaucoma (NTG) eyes, each disc morphology group had a characteristic location in which the loss of circumpapillary retinal nerve fiber layer thickness (cpRNFLT; measured with optical coherence tomography: OCT) was most likely to occur. Furthermore, the incidence of reductions in visual acuity differed between the groups, as did the speed of visual field loss, the distribution of defective visual field test points, and the location of test points that were most susceptible to progressive damage, measured by Humphrey static perimetry. These results indicate that Nicolela's method of classifying eyes with glaucoma was able to overcome the difficulties caused by the diverse nature of the disease, at least to a certain extent. Building on these findings, we then set out to identify sectors of the visual field that correspond to the distribution of retinal nerve fibers, with the aim of detecting glaucoma progression with improved sensitivity. We first mapped the statistical correlation between visual field test points and cpRNFLT in each temporal clock-hour sector (from 6 to 12 o'clock), using OCT data from NTG patients. The resulting series of maps allowed us to identify areas containing visual field test points that were prone to be affected together as a group. 
We also used a similar method to identify visual

  19. Integrated binary-class classification algorithm based on Logistic and SVM

    Institute of Scientific and Technical Information of China (English)

    谢玲; 刘琼荪

    2011-01-01

    The output of Logistic regression can be divided by probability value into four continuous intervals, and the frequency of correct classification of the training samples in each interval can be calculated. On this basis, an integrated binary-class classification algorithm combining Logistic regression and the Support Vector Machine (SVM) is proposed. The validity of the integrated algorithm (BLR-SVM) is illustrated by numerical results on several UCI datasets.


  1. Research on Task Scheduling Algorithm Based on QoS Classification

    Institute of Scientific and Technical Information of China (English)

    赵科伟; 洪龙; 周宁宁

    2016-01-01

    In order to improve the quality of service of cloud providers and users' satisfaction with cloud services, a task scheduling algorithm based on QoS classification is put forward. The algorithm targets independent tasks, i.e., tasks with no dependencies among them. First, a fuzzy clustering algorithm is used to classify the task set by QoS; then the traditional segmented Min-Min algorithm is applied for task allocation. Compared with the plain Min-Min algorithm, segmented Min-Min allocates resources at a finer granularity, so it improves the matching between tasks and resources. On this basis, overloaded nodes are further optimized, which reduces completion time and achieves a degree of load balancing. The experimental results show that the proposed method can both meet users' QoS needs and obtain a shorter completion time.
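The Min-Min step can be sketched as follows. This is plain Min-Min over an expected-time-to-compute matrix; the QoS fuzzy clustering and the segmentation described above are omitted. Each round assigns the task whose earliest possible completion time is smallest to the machine that achieves it, then updates that machine's ready time.

```python
def min_min(etc):
    """etc[t][m] = expected execution time of task t on machine m.
    Returns (assignment dict task -> machine, makespan)."""
    n_tasks, n_machines = len(etc), len(etc[0])
    ready = [0.0] * n_machines        # time at which each machine becomes free
    unassigned = set(range(n_tasks))
    assignment = {}
    while unassigned:
        # For each remaining task, find its best machine (earliest completion).
        best = {t: min(range(n_machines), key=lambda m: ready[m] + etc[t][m])
                for t in unassigned}
        # Pick the task with the overall minimum completion time.
        t = min(unassigned, key=lambda t: ready[best[t]] + etc[t][best[t]])
        m = best[t]
        ready[m] += etc[t][m]
        assignment[t] = m
        unassigned.remove(t)
    return assignment, max(ready)
```

Because short tasks are scheduled first, Min-Min tends to pile work onto fast machines, which is the load-imbalance the paper's segmented variant and node-level optimization address.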

  2. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  3. Algorithm for Chinese short-text classification using concept description

    Institute of Scientific and Technical Information of China (English)

    杨天平; 朱征宇

    2012-01-01

    Traditional classification performs poorly on short texts because they contain few features. To address this, an algorithm using concept description is presented. First, a global semantic concept word list is built. Then both the test set and the training set are conceptualized with this word list: each test short text is combined with the training short texts that share the same concept description to form a longer test text, and the training short texts are likewise combined among themselves into longer training texts. Finally, a traditional long-text classification algorithm is applied. Experiments show that the proposed method can effectively mine the implicit semantic information in short texts, fully expand them semantically, and improve the accuracy of short-text classification.

  4. Automated classification of seismic sources in large database using random forest algorithm: First results at Piton de la Fournaise volcano (La Réunion).

    Science.gov (United States)

    Hibert, Clément; Provost, Floriane; Malet, Jean-Philippe; Stumpf, André; Maggi, Alessia; Ferrazzini, Valérie

    2016-04-01

    In the past decades the increasing quality of seismic sensors and capability to transfer remotely large quantity of data led to a fast densification of local, regional and global seismic networks for near real-time monitoring. This technological advance permits the use of seismology to document geological and natural/anthropogenic processes (volcanoes, ice-calving, landslides, snow and rock avalanches, geothermal fields), but also led to an ever-growing quantity of seismic data. This wealth of seismic data makes the construction of complete seismicity catalogs, that include earthquakes but also other sources of seismic waves, more challenging and very time-consuming as this critical pre-processing stage is classically done by human operators. To overcome this issue, the development of automatic methods for the processing of continuous seismic data appears to be a necessity. The classification algorithm should satisfy the need of a method that is robust, precise and versatile enough to be deployed to monitor the seismicity in very different contexts. We propose a multi-class detection method based on the random forests algorithm to automatically classify the source of seismic signals. Random forests is a supervised machine learning technique that is based on the computation of a large number of decision trees. The multiple decision trees are constructed from training sets including each of the target classes. In the case of seismic signals, these attributes may encompass spectral features but also waveform characteristics, multi-stations observations and other relevant information. The Random Forests classifier is used because it provides state-of-the-art performance when compared with other machine learning techniques (e.g. SVM, Neural Networks) and requires no fine tuning. Furthermore it is relatively fast, robust, easy to parallelize, and inherently suitable for multi-class problems. 
In this work, we present the first results of the classification method applied

  5. Automated classifications of topography from DEMs by an unsupervised nested-means algorithm and a three-part geometric signature

    Science.gov (United States)

    Iwahashi, J.; Pike, R.J.

    2007-01-01

    An iterative procedure that implements the classification of continuous topography as a problem in digital image processing automatically divides an area into categories of surface form; three taxonomic criteria (slope gradient, local convexity, and surface texture) are calculated from a square-grid digital elevation model (DEM). The sequence of programmed operations combines twofold-partitioned maps of the three variables converted to greyscale images, using the mean of each variable as the dividing threshold. To subdivide increasingly subtle topography, grid cells sloping at less than the mean gradient of the input DEM are classified by designating mean values of successively lower-sloping subsets of the study area (nested means) as taxonomic thresholds, thereby increasing the number of output categories from the minimum 8 to 12 or 16. Program output is exemplified by 16 topographic types for the world at 1-km spatial resolution (SRTM30 data), the Japanese Islands at 270 m, and part of Hokkaido at 55 m. Because the procedure is unsupervised and reflects frequency distributions of the input variables rather than pre-set criteria, the resulting classes are undefined and must be calibrated empirically by subsequent analysis. Maps of the example classifications reflect physiographic regions, geological structure, and landform as well as slope materials and processes; fine-textured terrain categories tend to correlate with erosional topography or older surfaces, coarse-textured classes with areas of little dissection. In Japan the resulting classes approximate landform types mapped from airphoto analysis, while in the Americas they create map patterns resembling Hammond's terrain types or surface-form classes; SRTM30 output for the United States compares favorably with Fenneman's physical divisions. Experiments are suggested for further developing the method; the Arc/Info AML and the map of terrain classes for the world are available as online downloads. © 2006 Elsevier
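The two core devices of the procedure can be sketched briefly (a minimal illustration, not the published Arc/Info AML): nested-means thresholds take the overall mean, then the mean of the below-mean subset, and so on; and the minimum 8-category output packs the three twofold partitions into a 3-bit class index.

```python
def nested_mean_thresholds(values, levels=3):
    """Successive 'nested means': the overall mean, then the mean of the
    below-mean subset, and so on, for use as taxonomic thresholds."""
    thresholds, subset = [], list(values)
    for _ in range(levels):
        if not subset:
            break
        t = sum(subset) / len(subset)
        thresholds.append(t)
        subset = [v for v in subset if v < t]
    return thresholds

def eight_class(gradient, convexity, texture, mean_g, mean_c, mean_t):
    """Minimum 8-category output: each variable is twofold-partitioned at its
    mean and the three bits are packed into a class index 0-7."""
    return ((gradient >= mean_g) << 2) | ((convexity >= mean_c) << 1) | int(texture >= mean_t)
```

Adding one or two nested-mean thresholds to the gradient partition is what grows the 8 classes to the 12 or 16 the abstract mentions.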

  6. Data Fragment Classification and Identification Algorithm Based on PCA-LDA and KNN-SMO

    Institute of Scientific and Technical Information of China (English)

    傅德胜; 经正俊

    2015-01-01

    In the field of computer forensics, the forensic analysis of data fragments has become an important means of obtaining digital evidence. Aiming at this problem, this paper proposes a novel content-feature-based algorithm for identifying the type of a data fragment. First, principal component analysis (PCA) is applied to each block of the data fragment; second, linear discriminant analysis (LDA) is applied to the PCA feature vectors to obtain a combined feature vector; finally, the type of the data fragment is identified from the combined feature vector by a fused classifier built from the k-nearest-neighbor (KNN) algorithm and the sequential minimal optimization (SMO) algorithm. Experimental results show that, compared with related algorithms, the proposed algorithm achieves better identification accuracy and identification speed.

  7. An Efficient, Scalable Time-Frequency Method for Tracking Energy Usage of Domestic Appliances Using a Two-Step Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Paula Meehan

    2014-10-01

    Full Text Available Load monitoring is the practice of measuring electrical signals in a domestic environment in order to identify which electrical appliances are consuming power. One reason for developing a load monitoring system is to reduce power consumption by increasing consumers’ awareness of which appliances consume the most energy. Another example of an application of load monitoring is activity sensing in the home for the provision of healthcare services. This paper outlines the development of a load disaggregation method that measures the aggregate electrical signals of a domestic environment and extracts features to identify each power-consuming appliance. A single sensor is deployed at the main incoming power point to sample the aggregate current signal. The method senses when an appliance switches ON or OFF and uses a two-step classification algorithm to identify which appliance has caused the event. Parameters from the current in the temporal and frequency domains are used as features to define each appliance. These parameters are the steady-state current harmonics and the rate of change of the transient signal. Each appliance’s electrical characteristics are distinguishable using these parameters. There are three types of loads that an appliance can fall into: linear nonreactive, linear reactive, or nonlinear reactive. It has been found that by identifying the load type first and then using a second classifier to identify individual appliances within each type, the overall accuracy of the identification algorithm is improved.

  8. Automatic Detection and Classification of Pole-Like Objects in Urban Point Cloud Data Using an Anomaly Detection Algorithm

    Directory of Open Access Journals (Sweden)

    Borja Rodríguez-Cuenca

    2015-09-01

    Full Text Available Detecting and modeling urban furniture are of particular interest for urban management and the development of autonomous driving systems. This paper presents a novel method for detecting and classifying vertical urban objects and trees from unstructured three-dimensional mobile laser scanner (MLS or terrestrial laser scanner (TLS point cloud data. The method includes an automatic initial segmentation to remove the parts of the original cloud that are not of interest for detecting vertical objects, by means of a geometric index based on features of the point cloud. Vertical object detection is carried out through the Reed and Xiaoli (RX anomaly detection algorithm applied to a pillar structure in which the point cloud was previously organized. A clustering algorithm is then used to classify the detected vertical elements as man-made poles or trees. The effectiveness of the proposed method was tested in two point clouds from heterogeneous street scenarios and measured by two different sensors. The results for the two test sites achieved detection rates higher than 96%; the classification accuracy was around 95%, and the completion quality of both procedures was 90%. Non-detected poles come from occlusions in the point cloud and low-height traffic signs; most misclassifications occurred in man-made poles adjacent to trees.
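The Reed and Xiaoli (RX) detector at the core of the method scores each sample by its squared Mahalanobis distance to the background statistics. A minimal two-feature sketch using the global mean and covariance (2x2 inverse in closed form); the pillar-wise organization of the point cloud is omitted:

```python
def rx_scores(samples):
    """RX anomaly score for 2-D feature vectors: squared Mahalanobis
    distance of each sample to the global mean and covariance."""
    n = len(samples)
    mx = sum(p[0] for p in samples) / n
    my = sum(p[1] for p in samples) / n
    sxx = sum((p[0] - mx) ** 2 for p in samples) / n
    syy = sum((p[1] - my) ** 2 for p in samples) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in samples) / n
    det = sxx * syy - sxy * sxy  # assumes a non-degenerate covariance
    # (dx, dy) * inverse covariance * (dx, dy)^T, with the 2x2 inverse expanded.
    return [((p[0] - mx) ** 2 * syy
             - 2 * (p[0] - mx) * (p[1] - my) * sxy
             + (p[1] - my) ** 2 * sxx) / det for p in samples]
```

Samples whose feature vector deviates from the background far beyond its covariance, such as a vertical pole among ground points, receive the highest scores and are flagged as anomalies.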

  9. DBN classification algorithm for numerical attributes

    Institute of Scientific and Technical Information of China (English)

    孙劲光; 蒋金叶; 孟祥福; 李秀娟

    2014-01-01

    A Deep Belief Network (DBN) is a deep architecture consisting of several stacked Restricted Boltzmann Machines (RBMs). Generally the inputs of an RBM are binary vectors, which causes information loss for numerical attributes and in turn degrades classification performance. To address this, a DBN classification algorithm for numerical attributes is proposed: the input is scaled into the interval [0,1] by adding noise to the sigmoid units, and classification is achieved with a single Gaussian hidden node on the top-level RBM. A DBN can serve both as a feature extraction method and as a neural network with pre-learned initial weights; because the connection weights are initialized by pre-training rather than set randomly, a DBN classifier should outperform a traditional neural network. Experiments on several UCI datasets show that the proposed algorithm achieves better accuracy than traditional classification algorithms such as SVM.

  10. A Fuzzy-Evidential k Nearest Neighbor Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    吕锋; 杜妮; 文成林

    2012-01-01

    Existing classification algorithms based on the k Nearest Neighbor (kNN) rule, such as Fuzzy kNN (FkNN) and Evidential kNN (EkNN), have two problems: the differences among sample features cannot be recognized, and the effect caused by the different distances between neighbors and the centers of their classes is not taken into account. To overcome these limitations, the fuzzy-evidential kNN (FEkNN) algorithm is proposed. First, the feature weights are determined by the features' fuzzy entropy values, and k neighbors are selected according to the weighted Euclidean distance. Then samples are classified by a method that first fuzzifies the memberships of the neighbors and then fuses the information; this combines the advantage of FkNN in information expression with that of EkNN in decision-making. Meanwhile, neighbors are distinguished by their information entropy values. The presented method is tested on the UCI datasets, and the results show that it outperforms the other kNN-based classification algorithms.
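The weighted-distance step of such a method can be sketched as follows. The per-feature weights are hard-coded here, whereas the FEkNN paper derives them from fuzzy entropy; the fuzzification and evidence-fusion stages are omitted:

```python
import math
from collections import Counter

def weighted_knn(train, labels, weights, query, k=3):
    """k-NN with a weighted Euclidean distance and a plain majority vote."""
    dists = []
    for xi, yi in zip(train, labels):
        d = math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, xi, query)))
        dists.append((d, yi))
    dists.sort(key=lambda t: t[0])               # nearest first
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Feature 0 is informative, feature 1 is noise; the weights suppress the noise.
X = [(0.0, 9.0), (0.1, 1.0), (1.0, 5.0), (0.9, 0.0)]
y = ["a", "a", "b", "b"]
print(weighted_knn(X, y, weights=(1.0, 0.0), query=(0.05, 7.0), k=3))  # "a"
```

With equal weights the noisy second feature would dominate the distances; down-weighting it is exactly what the entropy-derived weights aim to do automatically.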

  11. Application of the Honeybee Mating Optimization Algorithm to Patent Document Classification in Combination with the Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Chui-Yu Chiu

    2013-08-01

    Full Text Available Patent rights have the property of exclusiveness. Inventors can protect their rights within the legal range and hold a monopoly on their published inventions. People are not allowed to use an invention before the inventors permit them to use it. Companies try to avoid research and development investment in inventions that have been protected by patent. Patent retrieval and categorization technologies are used to uncover patent information to reduce the cost of torts. In this research, we propose a novel method which integrates the Honey-Bee Mating Optimization algorithm with Support Vector Machines for patent categorization. First, the CKIP method is utilized to extract phrases from the patent summary and title. Then we calculate the probability that a specific key phrase contains a certain concept based on Term Frequency - Inverse Document Frequency (TF-IDF) methods. By combining the frequencies and probabilities of key phrases generated by using the Honey-Bee Mating Optimization algorithm, our proposed method is expected to obtain better representative input values for the SVM model. Finally, this research uses patents from Chemical Mechanical Polishing (CMP) as case examples to illustrate and demonstrate the superior results produced by the proposed methodology.
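The TF-IDF step can be illustrated on its own; the toy documents below are hypothetical token lists standing in for CKIP-extracted key phrases, and the HBMO and SVM stages are not shown:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Per-document TF-IDF weights: tf(t, d) * log(N / df(t))."""
    n = len(docs)
    df = Counter()                       # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return out

docs = [["polish", "pad", "wafer"],
        ["polish", "slurry", "wafer"],
        ["patent", "claim", "scope"]]
w = tf_idf(docs)
# "wafer" appears in 2 of 3 documents, "pad" in only 1, so "pad" outweighs "wafer".
print(w[0]["pad"] > w[0]["wafer"])  # True
```

These weights are the kind of "representative input values" the HBMO search then refines before feeding the SVM.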

  12. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    2016-01-01

    Full Text Available Among non-small cell lung cancers (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to an NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. Therefore, SAMGSR is indeed a feature selection algorithm. Additionally, we applied SAMGSR to the AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. The small overlap between these two resulting gene signatures illustrates that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.

  13. Audio Classification from Time-Frequency Texture

    CERN Document Server

    Yu, Guoshen

    2008-01-01

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While solely based on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.
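A toy magnitude spectrogram, the time-frequency "image" such texture-based methods operate on, can be computed with a naive DFT over fixed non-overlapping frames (real systems use windowed, overlapping FFTs; this sketch only illustrates the representation):

```python
import cmath
import math

def spectrogram(signal, frame=8):
    """Magnitude spectrogram via a naive DFT over non-overlapping frames.
    Returns a time x frequency grid that can be treated as a texture image."""
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame + 1, frame)]
    spec = []
    for fr in frames:
        mags = []
        for k in range(frame // 2):  # keep the non-redundant half of the spectrum
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                    for n, x in enumerate(fr))
            mags.append(abs(s))
        spec.append(mags)
    return spec

# A pure tone with exactly 2 cycles per 8-sample frame concentrates in DFT bin 2.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
row = spectrogram(tone)[0]
print(row.index(max(row)))  # 2
```

Texture descriptors (patch statistics, filter-bank responses) are then computed on this grid exactly as they would be on a photographic texture.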

  14. A Comparison of Machine Learning Algorithms for Chemical Toxicity Classification Using a Simulated Multi-Scale Data Model

    Science.gov (United States)

    Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in hig...

  15. Improvement of Dijkstra algorithm aiming at energy conservation optimization design of geometrical shape of twist drills

    Institute of Scientific and Technical Information of China (English)

    熊良山

    2015-01-01

    The Dijkstra algorithm was introduced into the energy-saving optimization design of complicated cutting tool structures, and a designing procedure for the geometrical shape and dimensions of the cutting edge of twist drills based on the Dijkstra algorithm was proposed. When the Dijkstra algorithm is used to determine the main cutting edge curve with minimal drilling power, the calculation efficiency is low, the calculated results are not sufficiently precise, and the smoothness and machinability of the cutting edge curve cannot be guaranteed. Since the main cutting edge is a curve with no inflexion along the radius direction, a combination of two methods was proposed to improve the Dijkstra algorithm, reduce its time complexity and improve its calculation efficiency: dividing the discretized grid on the rake face into several parts along the radial direction, and gradually decreasing the searching scope to refine the mesh along the circumference. Calculation shows that the improved Dijkstra algorithm increases calculation efficiency by over 1000 times and yields smooth cutting edge curves and cutting angle distribution curves, so that the machinability of the main cutting edge curve is guaranteed.
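The underlying search is standard Dijkstra on a weighted graph. A textbook sketch on a toy graph, with edge weights standing in for per-segment drilling power (the paper's radial segmentation and mesh-refinement improvements are not reproduced here):

```python
import heapq

def dijkstra(graph, src):
    """Minimum path cost from src to every reachable node.
    graph maps node -> {neighbor: edge weight}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy grid: the cheapest route a -> b -> c -> d costs 3.
g = {"a": {"b": 1, "c": 4}, "b": {"c": 1, "d": 5}, "c": {"d": 1}}
print(dijkstra(g, "a")["d"])  # 3
```

In the drill-design setting, nodes would be discretized rake-face grid points and the returned minimum-cost path traces the candidate cutting edge curve.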

  16. Towards automatic classification of all WISE sources

    CERN Document Server

    Kurcz, Agnieszka; Solarz, Aleksandra; Krupa, Magdalena; Pollo, Agnieszka; Małek, Katarzyna

    2016-01-01

    The WISE satellite has detected hundreds of millions sources over the entire sky. Classifying them reliably is however a challenging task due to degeneracies in WISE multicolour space and low levels of detection in its two longest-wavelength bandpasses. Here we aim at obtaining comprehensive and reliable star, galaxy and quasar catalogues based on automatic source classification in full-sky WISE data. This means that the final classification will employ only parameters available from WISE itself, in particular those reliably measured for a majority of sources. For the automatic classification we applied the support vector machines (SVM) algorithm, which requires a training sample with relevant classes already identified, and we chose to use the SDSS spectroscopic dataset for that purpose. By calibrating the classifier on the test data drawn from SDSS, we first established that a polynomial kernel is preferred over a radial one for this particular dataset. Next, using three classification parameters (W1 magnit...

  17. A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    OpenAIRE

    Li Zhen; Setzer R Woodrow; Elloumi Fathi; Judson Richard; Shah Imran

    2008-01-01

    Abstract Background Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in com...

  18. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    Energy Technology Data Exchange (ETDEWEB)

    Ahmed, M; Gu, F; Ball, A, E-mail: M.Ahmed@hud.ac.uk [Diagnostic Engineering Research Group, University of Huddersfield, HD1 3DH (United Kingdom)

    2011-07-19

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  19. Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

    Science.gov (United States)

    Ahmed, M.; Gu, F.; Ball, A.

    2011-07-01

    Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.

  20. A One-Class Classification-Based Control Chart Using the K-Means Data Description Algorithm

    Directory of Open Access Journals (Sweden)

    Walid Gani

    2014-01-01

    referred to as OC-charts, and extend their applications. We propose a new OC-chart using the K-means data description (KMDD) algorithm, referred to as KM-chart. The proposed KM-chart gives the minimum closed spherical boundary around the in-control process data. It measures the distance between the center of the KMDD-based sphere and the new incoming sample to be monitored. Any sample having a distance greater than the radius of the KMDD-based sphere is considered as an out-of-control sample. Phase I and II analysis of KM-chart was evaluated through a real industrial application. In a comparative study based on the average run length (ARL) criterion, KM-chart was compared with the kernel-distance based control chart, referred to as K-chart, and the k-nearest neighbor data description-based control chart, referred to as KNN-chart. Results revealed that, in terms of ARL, KM-chart performed better than KNN-chart in detecting small shifts in the mean vector. Furthermore, the paper provides the MATLAB code for KM-chart, developed by the authors.
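The boundary idea can be sketched with a single centroid: KMDD proper runs K-means with several prototypes and calibrates the control limit via ARL analysis, while this toy version uses one centroid and the largest training distance as the radius:

```python
import math

def km_boundary(samples):
    """One-cluster data description: center = centroid of the in-control
    samples, radius = largest distance from the centroid to a sample."""
    dim = len(samples[0])
    center = tuple(sum(s[i] for s in samples) / len(samples) for i in range(dim))
    radius = max(math.dist(center, s) for s in samples)
    return center, radius

def in_control(center, radius, sample):
    """A monitored sample is in control while it stays inside the sphere."""
    return math.dist(center, sample) <= radius

train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)]
center, radius = km_boundary(train)
print(in_control(center, radius, (1.0, 1.05)),
      in_control(center, radius, (3.0, 3.0)))  # True False
```

Plotting the distance of each incoming sample against the radius over time is what turns this rule into a control chart.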

  1. Text Classification Combined with Probabilistic Neural Network (PNN) and Learning Vector Quantization (LVQ) Algorithm

    Institute of Scientific and Technical Information of China (English)

    李敏; 余正涛

    2012-01-01

    Aiming at the problem of text classification, a text classification method based on the probabilistic neural network (PNN) and learning vector quantization (LVQ) is proposed. Text features and feature values are extracted by the TF-IDF method to form text categorization feature vectors. A classification model is then constructed based on the probabilistic neural network, and the learning of the competitive-layer network is completed using the LVQ algorithm, so that corresponding pattern vectors move closer to each other and away from other patterns, thereby realizing text classification. The experimental results show that the method performs very well in text classification, exhibiting not only good classification accuracy but also good learning efficiency.

  2. APPLYING MULTIDIMENSIONAL PACKET CLASSIFICATION ALGORITHM IN FIREWALL

    Institute of Scientific and Technical Information of China (English)

    夏淑华

    2011-01-01

    With the globalisation of Internet applications, the attendant problems of network information security have affected users' trust in the safety and reliability of Internet services, and hence their use of them. Firewall technology is currently an important security means for dealing with network security problems. On the basis of an introduction to the classification of firewall technologies, this paper studies the main idea of the AFBV algorithm. To solve the problem that this algorithm may consume excessive time on large multidimensional rule libraries, we optimise and improve it, effectively overcoming the deficiency of the AFBV algorithm in complex network environments. Comparative experimental results are given through simulation experiments.

  3. Classification and uptake of reservoir operation optimization methods

    Science.gov (United States)

    Dobson, Barnaby; Pianosi, Francesca; Wagener, Thorsten

    2016-04-01

    Reservoir operation optimization algorithms aim to improve the quality of reservoir release and transfer decisions. They achieve this by creating and optimizing the reservoir operating policy; a function that returns decisions based on the current system state. A range of mathematical optimization algorithms and techniques has been applied to the reservoir operation problem of policy optimization. In this work, we propose a classification of reservoir optimization approaches by focusing on the formulation of the water management problem rather than the optimization algorithm type. We believe that decision makers and operators will find it easier to navigate a classification system based on the problem characteristics, something they can clearly define, rather than the optimization algorithm. Part of this study includes an investigation regarding the extent of algorithm uptake and the possible reasons that limit real world application.
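An operating policy in the sense used here is simply a function from the current system state to a release decision. A hypothetical piecewise-linear hedging rule (the zone thresholds and numbers are made up for illustration, not drawn from the paper):

```python
def operating_policy(storage, capacity, demand):
    """Piecewise-linear release rule: hedge releases when storage is low,
    meet demand in the normal range, spill the surplus above capacity."""
    if storage <= 0.2 * capacity:        # drought zone: ration supply
        return demand * storage / (0.2 * capacity)
    if storage > capacity:               # flood zone: release the surplus
        return demand + (storage - capacity)
    return demand                        # normal operation: meet demand

cap, dem = 100.0, 10.0
print(operating_policy(10.0, cap, dem),
      operating_policy(60.0, cap, dem),
      operating_policy(110.0, cap, dem))  # 5.0 10.0 20.0
```

Policy optimization then amounts to tuning the shape of this function (here, the 0.2 threshold and the hedging slope) against an objective such as minimized shortage.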

  4. Development of a clinical decision support system using genetic algorithms and Bayesian classification for improving the personalised management of women attending a colposcopy room.

    Science.gov (United States)

    Bountris, Panagiotis; Topaka, Elena; Pouliakis, Abraham; Haritou, Maria; Karakitsos, Petros; Koutsouris, Dimitrios

    2016-06-01

    Cervical cancer (CxCa) is often the result of underestimated abnormalities in the Papanicolaou test (Pap test). The recent advances in the study of the human papillomavirus (HPV) infection (the necessary cause for CxCa development) have guided clinical practice to add HPV related tests alongside the Pap test. In this way, today, HPV DNA testing is well accepted as an ancillary test and it is used for the triage of women with abnormal findings in cytology. However, these tests are either highly sensitive or highly specific, and therefore none of them provides an optimal solution. In this Letter, a clinical decision support system based on a hybrid genetic algorithm - Bayesian classification framework is presented, which combines the results of the Pap test with those of the HPV DNA test in order to exploit the benefits of each method and produce more accurate outcomes. Compared with the medical tests and their combinations (co-testing), the proposed system produced the best receiver operating characteristic curve and the most balanced combination of sensitivity and specificity in detecting high-grade cervical intraepithelial neoplasia and CxCa (CIN2+). This system may support decision-making for the improved management of women who attend a colposcopy room following a positive test result. PMID:27382484

  5. ARTIFICIAL BEE COLONY ALGORITHM INTEGRATED WITH FUZZY C-MEAN OPERATOR FOR DATA CLUSTERING

    OpenAIRE

    M. Krishnamoorthi; A.M.Natarajan

    2013-01-01

    The clustering task aims at the unsupervised classification of patterns into different groups. To enhance the quality of results, emerging swarm-based algorithms have become an alternative to conventional clustering methods. In this study, an optimization method based on a swarm intelligence algorithm is proposed for the purpose of clustering. The significance of the proposed algorithm is that it uses a Fuzzy C-Means (FCM) operator in the Artificial Bee Colony (ABC) algorithm. The ...

  6. Efficient-cutting packet classification algorithm based on the statistical decision tree

    Institute of Scientific and Technical Information of China (English)

    陈立南; 刘阳; 马严; 黄小红; 赵庆聪; 魏伟

    2014-01-01

    Packet classification algorithms based on decision trees are easy to implement and widely employed in high-speed packet classification. The primary objective when constructing a decision tree is minimal storage and searching time complexity. An improved decision-tree algorithm, HyperEC, is proposed based on statistics and evaluation of filter sets. HyperEC is a multidimensional packet classification algorithm that allows a tradeoff between storage and throughput while constructing the decision tree. Because it is not sensitive to IP address length, it is suitable for IPv6 packet classification as well as IPv4. The algorithm applies a natural and performance-guided decision-making process: the storage budget is preset and then the best throughput is achieved. The results show that the HyperEC algorithm outperforms the HiCuts and HyperCuts algorithms, improving storage and throughput performance and scaling to large filter sets.

  7. Classification, disease, and diagnosis.

    Science.gov (United States)

    Jutel, Annemarie

    2011-01-01

    Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.

  8. Two Projection Pursuit Algorithms for Machine Learning under Non-Stationarity

    CERN Document Server

    Blythe, Duncan A J

    2011-01-01

    This thesis derives, tests and applies two linear projection algorithms for machine learning under non-stationarity. The first finds a direction in a linear space upon which a data set is maximally non-stationary. The second aims to robustify two-way classification against non-stationarity. The algorithm is tested on a key application scenario, namely Brain Computer Interfacing.

  9. Cloud classification algorithm for CloudSat satellite based on fuzzy logic method

    Institute of Scientific and Technical Information of China (English)

    任建奇; 严卫; 杨汉乐; 施健康

    2011-01-01

    Following a role-based classification approach, a cloud classification algorithm based on fuzzy logic was established to improve the accuracy of cloud classification from spaceborne millimeter-wave radar detection. Characteristic parameters are extracted from the 2B-GEOPROF-LIDAR cloud geometrical profile product, derived from CloudSat/CPR (Cloud Profiling Radar) and CALIPSO lidar data and related sources, and are processed with fuzzy logic techniques to complete the cloud classification. The classification results were compared with the cloud classification product released by the CloudSat Data Processing Center (DPC) and with CALIPSO lidar observations, and the comparison shows a high degree of consistency.
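The fuzzy-logic step can be sketched with triangular membership functions combined by min (a common fuzzy AND); the cloud roles, features and thresholds below are purely illustrative, not those of the 2B-GEOPROF-LIDAR algorithm:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def classify_cloud(base_km, thickness_km):
    """Toy fuzzy classifier over two hypothetical cloud classes: each class
    degree is the min (fuzzy AND) of its feature memberships; the class with
    the largest degree wins."""
    degrees = {
        "cirrus":        min(tri(base_km, 6, 10, 14), tri(thickness_km, 0, 1, 3)),
        "cumulonimbus":  min(tri(base_km, 0, 1, 3),   tri(thickness_km, 4, 9, 14)),
    }
    return max(degrees, key=degrees.get)

print(classify_cloud(base_km=9.0, thickness_km=1.5))  # cirrus
print(classify_cloud(base_km=1.2, thickness_km=8.0))  # cumulonimbus
```

The real algorithm applies the same pattern to many more radar-derived features (echo top, temperature, reflectivity profile) and cloud classes.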

  10. Graduates employment classification using data mining approach

    Science.gov (United States)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data Mining is a platform to extract hidden knowledge in a collection of data. This study investigates a suitable classification model to classify graduates' employment for one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify the graduates as either employed, unemployed or in further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, Logistic regression, Multilayer perceptron, k-nearest neighbor and Decision tree J48. Based on the obtained results, Logistic regression produces the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The produced classification model will benefit the management of the college as it provides insight into the quality of graduates that they produce and how their curriculum can be improved to cater to the needs of the industry.

  11. Product Classification in Supply Chain

    OpenAIRE

    Xing, Lihong; Xu, Yaoxuan

    2010-01-01

    Oriflame is a famous international direct-sale cosmetics company with a complicated supply chain operation, but it lacks a product classification system. It is vital to design a product classification method in order to support Oriflame's global supply planning and improve supply chain performance. This article aims to investigate and design multi-criteria product classification, propose a classification model, suggest application areas for the product classification results and intro...

  12. An Improved Unequal Clustering Algorithm Based on Cluster Head Classification

    Institute of Scientific and Technical Information of China (English)

    康琳; 董增寿

    2015-01-01

    Because the intra-cluster and inter-cluster broadcasting overhead caused by periodic Cluster Head (CH) rotation in unequal clustering routing shortens the lifetime of wireless sensor networks, a protocol based on a Cluster Head Classified Improved algorithm (CHCI) is proposed. In CHCI, a classification model of nodes is constructed according to the energy of the intra-cluster nodes, dividing them into Primary CHs (PCH), Secondary CHs (SCH) and Cluster Members (CM), and a Re-election Parameter (RP) is introduced for the PCH. Further, an energy-efficient relay route via the SCH is selected by solving a quadratic programming problem, which lowers the energy consumption of the SCH and extends the re-election time of the PCH. Simulation results demonstrate that CHCI outperforms the traditional LEACH and EEUC protocols and prolongs the lifetime of the network.

  13. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    Science.gov (United States)

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  14. The Methods of Knowledge Database Integration Based on the Rough Set Classification and Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    郭平; 程代杰

    2003-01-01

    As the basis of intelligent systems, it is very important to guarantee the consistency and non-redundancy of knowledge in a knowledge database. Given the variety of knowledge sources, it is necessary to handle knowledge with redundancy, inclusion and even contradiction during the integration of knowledge databases. This paper researches integration methods based on multiple knowledge databases. Firstly, it finds the inconsistent knowledge sets between the knowledge databases by rough set classification and presents a method for eliminating the inconsistency using test data. Then, it regards the consistent knowledge sets as the initial population of a genetic calculation and constructs a genetic fitness function based on the accuracy, practicability and spreadability of the knowledge representation to carry out the genetic calculation. Lastly, classifying the results of the genetic calculation reduces the knowledge redundancy of the knowledge database. This paper also presents a framework for knowledge database integration based on rough set classification and genetic algorithms.

  15. Significance of Classification Techniques in Prediction of Learning Disabilities

    Directory of Open Access Journals (Sweden)

    Julie M. David

    2010-10-01

    Full Text Available The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.
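The clustering half of such a study can be illustrated with plain Lloyd's K-means on toy 2-D attribute vectors (the J48 decision-tree half is not shown, and the data are invented for illustration):

```python
import math
import random

def k_means(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: assign each point to the nearest center,
    then move each center to the mean of its cluster; repeat."""
    rnd = random.Random(seed)
    centers = rnd.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old center if a cluster emptied out
                centers[i] = tuple(sum(q[d] for q in cluster) / len(cluster)
                                   for d in range(len(points[0])))
    return centers, clusters

# Two well-separated groups of three points each.
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, clusters = k_means(pts, 2)
print(sorted(len(c) for c in clusters))  # [3, 3]
```

In the learning-disabilities setting, each point would be a child's vector of sign/symptom attributes, and the resulting clusters group children with similar profiles.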

  16. Significance of Classification Techniques in Prediction of Learning Disabilities

    CERN Document Server

    Balakrishnan, Julie M David And Kannan

    2010-01-01

    The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.

  17. Nominal classification

    OpenAIRE

    Senft, G.

    2007-01-01

    This handbook chapter summarizes some of the problems of nominal classification in language, presents and illustrates the various systems or techniques of nominal classification, and points out why nominal classification is one of the most interesting topics in Cognitive Linguistics.

  18. Improving Classification Performance with Single-category Concept Match

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Discarding more and more complicated algorithms, this paper presents a new classification algorithm with singlecategory concept match. It also introduces the method to find such concepts, which is important to the algorithm. Experiment results show that it can improve classification precision and accelerate classification speed to some extent.

  19. Brain CT image classification based on least squares support vector machine optimized by improved harmony search algorithm

    Institute of Scientific and Technical Information of China (English)

    郭正红; 赵丙辰

    2013-01-01

    To improve brain CT image classification accuracy, this paper proposes a brain CT image classification model (IHS-LSSVM) based on the least squares support vector machine and an improved harmony search algorithm. The LSSVM parameters are treated as combinations of musical tones, and the harmony search algorithm is used to find the optimal parameters; the optimal-position update strategy of particle swarm optimization is introduced into the search to enhance the algorithm's ability to escape local minima. A brain CT image classification model is then built from the optimal parameters and its performance is tested. The simulation results show that, compared with the reference models, IHS-LSSVM not only improves image classification accuracy but also accelerates classification, making it an effective brain CT image classification model.
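
    Harmony search keeps a memory of candidate parameter sets ("harmonies"), improvises new ones from it, and replaces the worst member when the new one is better. A minimal sketch, with a toy quadratic surrogate standing in for the real LSSVM cross-validation error (the bounds, HS constants and surrogate are illustrative assumptions, not the paper's settings):

```python
import random

def harmony_search(objective, bounds, hms=8, hmcr=0.9, par=0.3, iters=200, seed=1):
    """Minimal harmony search minimizing `objective` over box `bounds`."""
    random.seed(seed)
    dim = len(bounds)
    # harmony memory: random initial parameter combinations
    memory = [[random.uniform(*bounds[d]) for d in range(dim)] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iters):
        new = []
        for d in range(dim):
            if random.random() < hmcr:            # take a value from memory ...
                x = random.choice(memory)[d]
                if random.random() < par:         # ... with optional pitch adjustment
                    x += random.uniform(-1, 1) * 0.05 * (bounds[d][1] - bounds[d][0])
                x = min(max(x, bounds[d][0]), bounds[d][1])
            else:                                 # or explore randomly
                x = random.uniform(*bounds[d])
            new.append(x)
        s = objective(new)
        worst = max(range(hms), key=lambda i: scores[i])
        if s < scores[worst]:                     # replace the worst harmony
            memory[worst], scores[worst] = new, s
    best = min(range(hms), key=lambda i: scores[i])
    return memory[best], scores[best]

# toy surrogate for LSSVM cross-validation error over (C, sigma); minimum at (10, 0.5)
surrogate = lambda p: (p[0] - 10.0) ** 2 + (p[1] - 0.5) ** 2
best, err = harmony_search(surrogate, bounds=[(0.1, 100.0), (0.01, 5.0)])
print(best, err)
```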

  20. Web Topic Classification Based on Modified Multi-classifier Integration AdaBoost Algorithm

    Institute of Scientific and Technical Information of China (English)

    伍杰华; 倪振声

    2013-01-01

    Existing Web topic classification algorithms are generally built on a single model, or simply superimpose several single models for decision-making. To address this, we propose a Web topic classification method based on a modified multi-classifier integration AdaBoost algorithm. The method first uses the VIPS algorithm to obtain page blocks and their visual and text features, and trains a weak classifier for each feature type according to its dimensionality. It then computes the corresponding error rates and modifies the rejection strategy for misclassification, producing an optimal classifier for each feature type; finally, the two optimal classifiers are cascaded for decision-making. Experimental results demonstrate that the method improves the classification accuracy of AdaBoost on complex Web topic information, and it also provides a new scheme for research in the field of Web topic classification.
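
    The boosting core of such a method can be sketched with standard AdaBoost over decision stumps; this is the generic weighting scheme, not the paper's VIPS-based cascade, and the two features below are hypothetical stand-ins for one visual and one text feature per page block:

```python
import numpy as np

def train_adaboost(X, y, rounds=5):
    """AdaBoost with depth-1 threshold stumps on each feature (y in {-1, +1})."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)               # sample weights
    ensemble = []                         # (feature, threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for j in range(d):                # exhaustive stump search
            for t in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] <= t, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, pol)
        err, j, t, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # stump weight from its error rate
        pred = pol * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(-alpha * y * pred)          # up-weight misclassified samples
        w /= w.sum()
        ensemble.append((j, t, pol, alpha))
    return ensemble

def predict(ensemble, X):
    score = sum(a * p * np.where(X[:, j] <= t, 1, -1) for j, t, p, a in ensemble)
    return np.sign(score)

# toy 2-feature data (hypothetical visual score, text score per block)
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9],
              [0.9, 0.7], [0.15, 0.85], [0.85, 0.15]])
y = np.array([-1, -1, 1, 1, -1, 1])
model = train_adaboost(X, y)
print(predict(model, X))
```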

  1. Fusion Recognition Algorithm Based on Fuzzy Density Determination with Classification Capability and Supportability

    Institute of Scientific and Technical Information of China (English)

    詹永照; 张娟; 毛启容

    2012-01-01

    Fuzzy integral theory can effectively handle uncertainty in classification decisions. However, current methods for determining fuzzy density consider neither the distinguishability of each classifier's recognition results nor each classifier's support for the recognized object, which loses information important for fusion recognition. To overcome this disadvantage, a fusion recognition algorithm based on fuzzy density determination with classification capability and supportability for each classifier is presented. In this algorithm, the fuzzy densities used for classifier fusion are adaptively assigned according to the distinguishability and support of each classifier's results for the sample to be recognized, so that multi-classifier fusion recognition is realized effectively. The proposed algorithm is applied to facial expression recognition in natural interaction settings and on the Cohn-Kanade facial expression database. The experimental results show that it effectively raises the overall expression recognition rate.

  2. An Incremental Learning Vector Quantization Algorithm Based on Pattern Density and Classification Error Ratio

    Institute of Scientific and Technical Information of China (English)

    李娟; 王宇平

    2015-01-01

    As a simple and mature classification method, the K nearest neighbor (KNN) algorithm has been widely applied in fields such as data mining and pattern recognition, but it faces serious challenges when the processed dataset is large: huge computation load, high memory consumption and long runtimes. To deal with these problems, building on the single-layer competitive learning of the incremental learning vector quantization (ILVQ) network, we propose a new incremental learning vector quantization method that combines pattern density and classification error rate. By adopting a series of new competitive learning strategies, the proposed method quickly obtains an incremental prototype set from the original training set by learning, inserting, merging, splitting and deleting representative points adaptively, achieving high reduction efficiency while simultaneously guaranteeing high classification accuracy on large-scale datasets. In addition, we improve the classical nearest neighbor classification algorithm by absorbing the pattern density and classification error ratio of the final prototype neighborhood set into the classification decision criteria. The method generates an effective representative prototype set after a single-pass scan of the training data, and hence has strong generality. Experimental results show that the method not only maintains, and even improves, classification accuracy and reduction ratio, but also acquires prototypes and classifies faster than its counterpart algorithms.

  3. Classification Algorithm of l2-norm LS-SVM via Coordinate Descent

    Institute of Scientific and Technical Information of China (English)

    刘建伟; 付捷; 汪韶雷; 罗雄麟

    2013-01-01

    The coordinate descent approach for the l2-norm regularized least squares support vector machine is studied. In image processing, human genome analysis, information retrieval, data management and data mining, the datasets involved in the machine learning objective function are often too large to fit in memory. Recent work has shown that coordinate descent methods for large-scale linear SVMs achieve good classification performance on large datasets. In this paper, those results are extended to the least squares support vector machine, and a coordinate descent algorithm for the l2-norm regularized LS-SVM is proposed. The algorithm reduces the vector optimization of the LS-SVM objective function to a sequence of single-variable optimizations. Experimental results on high-dimensional small-sample, medium-scale and large-scale datasets demonstrate its effectiveness. Compared with state-of-the-art LS-SVM classifiers, the proposed method is a good candidate when the data cannot fit in memory.
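
    The single-variable subproblem described above has a closed form for an l2-regularized least squares objective. A minimal sketch of the coordinate updates, using a ridge-style objective as a stand-in for the full LS-SVM formulation:

```python
import numpy as np

def cd_ridge(X, y, lam=1.0, sweeps=50):
    """Coordinate descent on min_w ||Xw - y||^2 + lam * ||w||^2.
    Each coordinate update is a closed-form single-variable solve."""
    n, d = X.shape
    w = np.zeros(d)
    r = y - X @ w                       # residual, kept up to date incrementally
    for _ in range(sweeps):
        for j in range(d):
            xj = X[:, j]
            # solve for w_j with all other coordinates held fixed
            rho = xj @ (r + xj * w[j])
            w_new = rho / (xj @ xj + lam)
            r += xj * (w[j] - w_new)    # cheap residual update, no full recompute
            w[j] = w_new
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true
w = cd_ridge(X, y, lam=1e-6)
print(np.round(w, 3))
```

Keeping the residual updated in place is what makes each coordinate step O(n), which is the property that lets such methods scale to data that barely fits in memory.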

  4. Predicting Students’ Performance using Modified ID3 Algorithm

    Directory of Open Access Journals (Sweden)

    Ramanathan L

    2013-06-01

    Full Text Available The ability to predict the performance of students is crucial in our present education system, and data mining concepts can be used for this purpose. The ID3 algorithm is one of the best-known algorithms for generating decision trees, but it has the shortcoming of being biased towards attributes with many values. This research aims to overcome that shortcoming by using gain ratio (instead of information gain) and by giving weights to each attribute at every decision-making point. Several other algorithms, such as J48 and the Naive Bayes classifier, are also applied to the dataset; the WEKA tool was used for the analysis of the J48 and Naive Bayes algorithms. The results are compared and presented. The dataset used in our study is taken from the School of Computing Sciences and Engineering (SCSE), VIT University.
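
    The bias that gain ratio corrects can be shown directly: an identifier-like attribute with one value per record maximizes information gain but is penalized by the split-information term. A small sketch with invented student records:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    n = len(labels)
    rem = 0.0
    for v in set(r[attr] for r in rows):
        sub = [l for r, l in zip(rows, labels) if r[attr] == v]
        rem += len(sub) / n * entropy(sub)
    return entropy(labels) - rem

def split_info(rows, attr):
    n = len(rows)
    counts = Counter(r[attr] for r in rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def gain_ratio(rows, labels, attr):
    si = split_info(rows, attr)
    return info_gain(rows, labels, attr) / si if si > 0 else 0.0

# invented records: 'id' is many-valued and useless, 'attendance' is informative
rows = [{"id": i, "attendance": "high" if i < 3 else "low"} for i in range(6)]
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

print(info_gain(rows, labels, "id"))          # 1.0 (splits into singletons)
print(info_gain(rows, labels, "attendance"))  # 1.0 (ties with the useless 'id')
print(gain_ratio(rows, labels, "id"))         # ~0.387, penalized by split info
print(gain_ratio(rows, labels, "attendance")) # 1.0, correctly preferred
```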

  5. Comparison of Different Classification Techniques Using WEKA for Hematological Data

    Directory of Open Access Journals (Sweden)

    Md. Nurul Amin

    2015-03-01

    Full Text Available Medical professionals need a reliable prediction methodology to diagnose hematological data comments. Large quantities of information about patients and their medical conditions are available. Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information; data mining software is one of a number of analytical tools that lets users analyze data from many dimensions or angles, categorize it, and summarize the relationships identified. WEKA is a data mining tool that contains many machine learning algorithms and provides the facility to classify data through various algorithms. Classification, an important data mining technique with broad applications, classifies each item in a set of data into one of a predefined set of classes or groups. In this paper we study various classification algorithms. The main aim is to compare different classification algorithms using the Waikato Environment for Knowledge Analysis (WEKA) and to find out which algorithm is most suitable for users working on hematological data. Using the proposed model, a new doctor or patient can predict hematological data comments; a mobile app that can easily diagnose hematological data comments was also developed. The best algorithm on the hematological data is the J48 classifier, with an accuracy of 97.16% and a total model build time of 0.03 seconds. The Naive Bayes classifier has the lowest average error, at 29.71%, compared to the others.

  6. The Effect of Adaptive Gain and Adaptive Momentum in Improving Training Time of Gradient Descent Back Propagation Algorithm on Classification Problems

    Directory of Open Access Journals (Sweden)

    Norhamreeza Abdul Hamid

    2011-01-01

    Full Text Available The back propagation algorithm has been successfully applied to a wide range of practical problems. Since it uses a gradient descent method, it has some limitations: slow learning convergence and easy convergence to local minima. The convergence behaviour of the back propagation algorithm depends on the choice of initial weights and biases, network topology, learning rate, momentum, activation function, and the value of the gain in the activation function. Previous researchers demonstrated that in feed-forward networks the slope of the activation function is directly influenced by a parameter referred to as 'gain'. This research proposes an algorithm for improving the performance of the current Gradient Descent Method with Adaptive Gain by changing the momentum coefficient adaptively for each node. The influence of adaptive momentum together with adaptive gain on the learning ability of a neural network is analysed, and multilayer feed-forward neural networks are assessed. A physical interpretation of the relationship between the momentum value, the learning rate and the weight values is given. The efficiency of the proposed algorithm was compared with the conventional Gradient Descent Method and the current Gradient Descent Method with Adaptive Gain by means of simulation on three benchmark problems. The simulation results demonstrate that the proposed algorithm converged faster: an improvement ratio of nearly 1.8 on Wisconsin breast cancer, 6.6 on the Mushroom problem, and 36% better on the Soybean dataset. The results clearly show that the proposed algorithm significantly improves the learning speed of the current gradient descent back propagation algorithm.
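
    The speed-up a momentum term buys can be illustrated on an ill-conditioned quadratic, where plain gradient descent is forced to use a small step size by the stiff direction; this is a generic fixed-momentum demonstration, not the paper's adaptive-gain/adaptive-momentum scheme:

```python
def minimize(lr=0.015, momentum=0.0, steps=300):
    """Gradient descent with an optional momentum term on the ill-conditioned
    quadratic f(x, y) = x**2 + 50*y**2 (gradient (2x, 100y), minimum at origin)."""
    x, y = 5.0, 5.0
    vx = vy = 0.0
    for _ in range(steps):
        gx, gy = 2.0 * x, 100.0 * y
        vx = momentum * vx - lr * gx   # velocity accumulates past gradients
        vy = momentum * vy - lr * gy
        x, y = x + vx, y + vy
    return (x * x + 50.0 * y * y) ** 0.5  # distance-like error from the optimum

plain = minimize(momentum=0.0)
accelerated = minimize(momentum=0.9)
print(plain, accelerated)
```

The slow (x) direction contracts by only 0.97 per step without momentum; the momentum run makes progress at roughly the square root of that rate per step, so after a few hundred steps its error is orders of magnitude smaller.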

  7. Automatic classification of Deep Web sources based on KNN algorithm

    Institute of Scientific and Technical Information of China (English)

    张智; 顾韵华

    2011-01-01

    To meet the needs of Deep Web querying, an algorithm for classifying Deep Web sources based on KNN is put forward. The algorithm extracts form features from Web pages and normalizes the form feature vectors, then classifies Deep Web pages according to the target topic by computing distances. The experimental results show that the K-nearest-neighbor classification algorithm can classify Deep Web data sources fairly effectively, with improved precision and recall.
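
    The distance-based classification step can be sketched as a plain k-nearest-neighbor vote over normalized form-feature vectors (the features and labels below are hypothetical stand-ins for real form attributes):

```python
import math
from collections import Counter

def knn_classify(query, samples, labels, k=3):
    """Majority vote among the k nearest labeled form-feature vectors."""
    nearest = sorted(range(len(samples)),
                     key=lambda i: math.dist(query, samples[i]))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# hypothetical normalized form features per page:
# (share of text inputs, share of select menus, has 'search' keyword)
samples = [
    (0.9, 0.1, 1.0), (0.8, 0.2, 1.0), (0.7, 0.1, 1.0),   # search-style forms
    (0.1, 0.9, 0.0), (0.2, 0.8, 0.0), (0.1, 0.7, 0.0),   # login/registration forms
]
labels = ["search", "search", "search", "login", "login", "login"]
print(knn_classify((0.85, 0.15, 1.0), samples, labels))  # "search"
```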

  8. Random Forests for Poverty Classification

    OpenAIRE

    Ruben Thoplan

    2014-01-01

    This paper applies a relatively novel method in data mining to address the issue of poverty classification in Mauritius. The random forests algorithm is applied to the census data in view of improving classification accuracy for poverty status. The analysis shows that the numbers of hours worked, age, education and sex are the most important variables in the classification of the poverty status of an individual. In addition, a clear poverty-gender gap is identified as women have higher chance...

  9. Quantum computing for pattern classification

    OpenAIRE

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2014-01-01

    It is well known that for certain tasks, quantum computing outperforms classical computing. A growing number of contributions try to use this advantage in order to improve or extend classical machine learning algorithms by methods of quantum information theory. This paper gives a brief introduction into quantum machine learning using the example of pattern classification. We introduce a quantum pattern classification algorithm that draws on Trugenberger's proposal for measuring the Hamming di...

  10. China Aims to Promote Import

    Institute of Scientific and Technical Information of China (English)

    Wang Ting

    2010-01-01

    With the theme of "An Opening Market and Global Trade", and aiming to promote communication and exchange among governments, industries and businesses to achieve mutual benefit and a win-win situation, nearly 300 representatives from the relevant departments of the Chinese government, foreign embassies in China, industrial associations and major enterprises, as well as well-known Chinese and foreign experts and scholars, were invited to take part in the forum and share their views on the Chinese market and foreign trade policies.

  11. Research on Sporting Images Classification and Application Based on SIFT Algorithm

    Institute of Scientific and Technical Information of China (English)

    朱飞; 王兴起

    2011-01-01

    Given the huge data volume and computational complexity of content-based image classification, the authors propose a SIFT-based image classification method and apply it to sporting image classification. The method extracts feature points from images, then uses the DBScan and K-Means algorithms to analyse the feature data and obtain the data that best represent the image features; images are then classified with these data. Experiments and analysis show that the proposed method has the advantages of speed and precision.

  12. Improvement of ant colony algorithm for determining reactive compensation classification capacity

    Institute of Scientific and Technical Information of China (English)

    董张卓; 李哲; 赵元鹏

    2013-01-01

    Traditional methods for determining reactive compensation classification capacity cannot make effective use of historical reactive load information, so over-compensation and under-compensation occur easily. An optimization model is proposed that effectively uses historical reactive load data to determine the classification capacity, and it is solved with an improved ant colony algorithm. The pheromone is corrected in a timely way by setting a pheromone threshold, and searching both vertically and horizontally improves the ants' search efficiency. The algorithm is better protected from local optima, and its execution efficiency is increased severalfold.

  13. A Fuzzy Logic Based Sentiment Classification

    Directory of Open Access Journals (Sweden)

    J.I.Sheeba

    2014-07-01

    Full Text Available Sentiment classification aims to detect information such as opinions and explicit or implicit feelings expressed in text. Most existing approaches detect either explicit or implicit expressions of sentiment in text, but only separately. The proposed framework detects both implicit and explicit expressions available in meeting transcripts; it classifies positive, negative and neutral words, and also identifies the topic of the particular meeting transcript, using fuzzy logic. This paper aims to add features that improve the classification method. The quality of the sentiment classification is improved using the proposed fuzzy logic framework, which includes fuzzy rules and the Fuzzy C-means algorithm. The quality of the output is evaluated using precision, recall and F-measure, and the Fuzzy C-means clustering is measured in terms of purity and entropy. The dataset was validated using 10-fold cross-validation, and a 95% confidence interval was observed for the accuracy values. The proposed fuzzy logic method produced more than 85% accurate results, with a much lower error rate than existing sentiment classification techniques.
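
    Unlike hard k-means, Fuzzy C-means assigns every sample a degree of membership in each cluster. A minimal NumPy sketch using the standard m = 2 update rules, on invented two-dimensional sentiment scores:

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=50, seed=0):
    """Fuzzy C-means: soft memberships u[i, k] instead of hard cluster labels."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)      # each row is a membership distribution
    for _ in range(iters):
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1))        # standard update: u_ik prop. to d_ik^(-2/(m-1))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers

# invented (polarity, intensity) scores for two opinion groups
X = np.array([[0.9, 0.8], [0.85, 0.9], [0.95, 0.7],
              [0.1, 0.2], [0.15, 0.1], [0.05, 0.3]])
u, centers = fuzzy_cmeans(X)
print(np.round(u, 2))
```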

  14. Unsupervised Training Approaches Using Genetic Algorithms for Multispectral Image Classification Systems

    Institute of Scientific and Technical Information of China (English)

    HUNG Chih-Cheng; XIANG Mei; Minh Pham; KUO Bor-Chen; Tommy L. Coleman

    2007-01-01

    This paper conveys the application of genetic algorithms (GA) to improve unsupervised training and thereby increase the classification accuracy of remotely sensed data. The genetic competitive learning algorithm (GA-CL), an integrated approach combining the GA with the simple competitive learning (CL) algorithm, was developed for unsupervised training. The GA is used to improve the training results and to prevent falling into local minima during the learning of cluster prototypes. The algorithm is evaluated with the Jeffries-Matusita (J-M) distance, a measure of the statistical separability of pairs of trained clusters. Experiments on Landsat Thematic Mapper (TM) data show that the GA improves the simple competitive learning algorithm. Comparisons with other unsupervised training algorithms, K-means, GA-K-means and the simple competitive learning algorithm, are provided.

  15. China's educational aim and theory

    Science.gov (United States)

    Guang-Wei, Zou

    1985-12-01

    The aim and theory of Chinese socialist education is to provide scientific and technological knowledge so as to develop the productive forces and to meet the demands of the socialist cause. Since education is the main vehicle towards modernizing science and technology, any investment in education is viewed as being productive as it feeds directly into economics. Faced with the demands of industrial and agricultural production, training a technical as well as a labour force becomes crucial. This is made possible by the provision of two labour systems for workers both from rural as well as urban areas and by two kinds of educational systems for both urban and rural students. Chinese educational theory is seen as a fusion of principles from its own educational legacy with those of Marxist-Leninist principles.

  16. Maximum mutual information regularized classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-09-07

    In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label. We argue that, with the learned classifier, the uncertainty of the true class label of a data sample should be reduced by knowing its classification response as much as possible. The reduced uncertainty is measured by the mutual information between the classification response and the true class label. To this end, when learning a linear classifier, we propose to maximize the mutual information between classification responses and true class labels of training samples, besides minimizing the classification error and reducing the classifier complexity. An objective function is constructed by modeling mutual information with entropy estimation, and it is optimized by a gradient descent method in an iterative algorithm. Experiments on two real-world pattern classification problems show the significant improvements achieved by maximum mutual information regularization.

  17. Online Feature Selection of Class Imbalance via PA Algorithm

    Institute of Scientific and Technical Information of China (English)

    Chao Han; Yun-Kun Tan; Jin-Hui Zhu; Yong Guo; Jian Chen; Qing-Yao Wu

    2016-01-01

    Imbalance classification techniques have been frequently applied in many machine learning application domains where the number of the majority (or positive) class of a dataset is much larger than that of the minority (or negative) class. Meanwhile, feature selection (FS) is one of the key techniques for the high-dimensional classification task in a manner which greatly improves the classification performance and the computational efficiency. However, most studies of feature selection and imbalance classification are restricted to off-line batch learning, which is not well adapted to some practical scenarios. In this paper, we aim to solve high-dimensional imbalanced classification problem accurately and efficiently with only a small number of active features in an online fashion, and we propose two novel online learning algorithms for this purpose. In our approach, a classifier which involves only a small and fixed number of features is constructed to classify a sequence of imbalanced data received in an online manner. We formulate the construction of such online learner into an optimization problem and use an iterative approach to solve the problem based on the passive-aggressive (PA) algorithm as well as a truncated gradient (TG) method. We evaluate the performance of the proposed algorithms based on several real-world datasets, and our experimental results have demonstrated the effectiveness of the proposed algorithms in comparison with the baselines.
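
    The passive-aggressive update under a hard feature budget can be sketched as follows; the truncation step here simply zeroes all but the largest-magnitude weights after each update, a simplified stand-in for the paper's PA-plus-truncated-gradient scheme:

```python
import numpy as np

def pa_truncated(stream, dim, keep=2):
    """Online passive-aggressive classifier that keeps only the `keep`
    largest-magnitude weights after each update."""
    w = np.zeros(dim)
    for x, y in stream:                     # labels y in {-1, +1}
        loss = max(0.0, 1.0 - y * (w @ x))  # hinge loss on the current sample
        if loss > 0:
            tau = loss / (x @ x)            # PA step size: smallest change fixing x
            w = w + tau * y * x
        if np.count_nonzero(w) > keep:      # truncate to the fixed feature budget
            idx = np.argsort(np.abs(w))[:-keep]
            w[idx] = 0.0
    return w

rng = np.random.default_rng(0)
# only features 0 and 1 carry signal; the remaining four are noise
X = rng.normal(size=(200, 6))
y = np.sign(X[:, 0] + X[:, 1])
w = pa_truncated(zip(X, y), dim=6, keep=2)
print(np.round(w, 2))
```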

  18. Research on efficient classification mining algorithm for large data features of cloud computing equipment

    Institute of Scientific and Technical Information of China (English)

    王昌辉

    2015-01-01

    Big-data classification mining in cloud computing equipment is the basis of practical pattern recognition and intelligent control. Traditional big-data mining in cloud computing equipment adopts a topological-structure grid-partition mining algorithm, which cannot effectively extract the detailed characteristics of big data, so its classification accuracy is poor. An efficient classification mining algorithm for the big-data features of cloud computing equipment is proposed, based on fractional Fourier transform feature matching and K-L classification. The big-data storage mechanism of cloud computing equipment is analysed, and the fractional Fourier transform is used for feature extraction and feature matching of the big data. On the basis of the K-L transform, the optimal path is chosen to guide the categorical space, and a K-L big-data feature classifier is established to realize classification mining in cloud computing equipment. The simulation results show that the algorithm offers high-accuracy feature classification mining with less energy consumption and higher efficiency, realizing efficient classification mining of the big-data features of cloud computing equipment.

  19. Absolute calibration of the colour index and O4 absorption derived from Multi-AXis (MAX-) DOAS measurements and their application to a standardised cloud classification algorithm

    OpenAIRE

    Wagner, Thomas; Beirle, Steffen; Remmers, Julia; Shaiganfar, Reza; Wang, Yang

    2016-01-01

    A method is developed for the calibration of the colour index (CI) and the O4 absorption derived from Differential Optical Absorption Spectroscopy (DOAS) measurements of scattered sunlight. The method is based on the comparison of measurements and radiative transfer simulations for well-defined atmospheric conditions and viewing geometries. Calibrated measurements of the CI and the O4 absorption are important for the detection and classification of clouds from MAX-DOAS observations. Such info...

  20. Evaluation of noisy digital image classification based on genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    吴限光; 李素梅; 吴兆阳

    2012-01-01

    This paper proposes two methods, a BP neural network and an SVM, both optimized by a genetic algorithm, for digital image recognition in realistic environments. A model is first built under noise-free conditions, and artificial noise is then added to simulate real-life noise interference. The results show that, with genetic algorithm optimization, the SVM network solution has better noise immunity, and the genetic algorithm is more valuable in the field of noisy digital image classification.

  1. An Improved K_means Algorithm and Its Application to Tourist Classification

    Institute of Scientific and Technical Information of China (English)

    汪永旗

    2012-01-01

    An improved density-based K_means algorithm is presented to address the problems of the traditional K_means clustering algorithm: the selection of initial center points is optimized using a density-based method, and the triangle inequality over the three sides of a geometric triangle is used to reduce the number of distance calculations in each iteration, shortening clustering time on large datasets. For the characteristics of tourism electronic business, an extended RFM model (RFMVCI) is designed. The validity of the new algorithm and the rationality of the extended model are verified in experiments and in the practice of tourism customer classification.
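
    The triangle-inequality trick rests on a simple lemma: if d(c_best, c_j) >= 2*d(x, c_best), then d(x, c_j) >= d(c_best, c_j) - d(x, c_best) >= d(x, c_best), so c_j cannot be closer to x and its distance need not be computed. A small sketch of the pruned assignment step (the points and centers are invented):

```python
import math

def assign_with_skips(points, centers):
    """Nearest-center assignment that prunes distance computations
    via the triangle inequality."""
    # pairwise center-to-center distances, computed once per iteration
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    labels, skipped = [], 0
    for p in points:
        best, best_d = 0, math.dist(p, centers[0])
        for j in range(1, len(centers)):
            if cc[best][j] >= 2.0 * best_d:
                skipped += 1          # pruned: d(p, centers[j]) never computed
                continue
            d = math.dist(p, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped

points = [(0.1, 0.0), (0.0, 0.2), (100.2, 100.0), (95.0, 95.0)]
centers = [(0.0, 0.0), (100.0, 100.0), (101.0, 100.0)]
labels, skipped = assign_with_skips(points, centers)
print(labels, skipped)  # assignments match brute force; several distances skipped
```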

  2. Action Potential Classification Based on PCA and Improved K-means Algorithm

    Institute of Scientific and Technical Information of China (English)

    师黎; 杨振兴; 王治忠; 王岩

    2011-01-01

    A neural signal recorded by a microelectrode array is often a mixture of the action potentials of several neurons near the electrodes and a large amount of background noise. Research on the information processing mechanism of the nervous system and on neural coding and decoding requires knowing every related neuron's action potential, so each neuron's action potentials must be separated from the recorded signal. This paper proposes a method for action potential classification based on Principal Component Analysis (PCA) combined with an improved K-means algorithm: the action potential features are extracted by PCA, and the classification is implemented by the improved K-means algorithm. Experimental results show that the method reduces the feature dimensionality of the action potentials and the K-means algorithm's dependence on the initial cluster centers, and increases the accuracy and stability of the classification results. In particular, when processing signals with a low signal-to-noise ratio (SNR), the classification accuracy still reaches the expected level.
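
    The PCA step projects each recorded waveform onto a few principal components before clustering, which is where most of the dimensionality reduction happens. A minimal sketch with two invented spike templates plus noise:

```python
import numpy as np

def pca(X, k=2):
    """Project waveforms onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    return Xc @ vecs[:, ::-1][:, :k]       # reverse to take the top-k components

# invented spike waveforms from two units, plus small noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 30)
unit_a = np.sin(2 * np.pi * t) * 2.0       # template of hypothetical neuron A
unit_b = np.sin(4 * np.pi * t)             # template of hypothetical neuron B
X = np.vstack([unit_a + 0.1 * rng.normal(size=30) for _ in range(10)] +
              [unit_b + 0.1 * rng.normal(size=30) for _ in range(10)])
Y = pca(X, k=2)                            # 30-sample waveforms -> 2-D features
print(Y.shape)
```

In the 2-D projection the two units form well-separated groups, which is what makes the subsequent k-means step easy and stable.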

  3. Image classification based on extreme learning machine optimized by improved bat algorithm

    Institute of Scientific and Technical Information of China (English)

    陈海挺

    2014-01-01

    For the parameter optimization problem of the extreme learning machine used in classifiers, this paper proposes an image classification model based on an extreme learning machine optimized by an improved bat algorithm. The ELM parameters are treated as bat positions and solved by the improved bat algorithm, in which a virus population infects the main population: the main population passes information vertically between successive generations, while the virus population passes information horizontally within a generation through infection operations, enhancing the algorithm's ability to escape local minima. An image classification model is then built from the optimal parameters and its performance is tested by simulation. The simulation results show that, compared with the reference models, the proposed model not only improves image classification accuracy but also accelerates classification, making it an effective image classification model.

  4. Text Classification using Data Mining

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content: the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way, such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for automatic text classification need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of individual words, word relations, i.e. association rules over these words, are used to derive the feature set from pre-classified text documents. The concept of the Naive Bayes classifier is then applied to the derived features, and finally a single concept from genetic algorithms is added for the final classification. A system based on the...

  5. Text Classification using Artificial Intelligence

    CERN Document Server

    Kamruzzaman, S M

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content: the automated assignment of natural language texts to predefined categories. It is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way, such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using an artificial intelligence technique that requires fewer documents for training. Instead of individual words, word relations, i.e. association rules over these words, are used to derive the feature set from pre-classified text documents. The concept of the naïve Bayes classifier is then applied to the derived features, and finally a single concept from genetic algorithms is added for the final classification. A syste...

  6. Classification of hand eczema

    DEFF Research Database (Denmark)

    Agner, T; Aalto-Korte, K; Andersen, K E;

    2015-01-01

    BACKGROUND: Classification of hand eczema (HE) is mandatory in epidemiological and clinical studies, and also important in clinical work. OBJECTIVES: The aim was to test a recently proposed classification system of HE in clinical practice in a prospective multicentre study. METHODS: Patients were......%) could not be classified. 38% had one additional diagnosis and 26% had two or more additional diagnoses. Eczema on feet was found in 30% of the patients, statistically significantly more frequently associated with hyperkeratotic and vesicular endogenous eczema. CONCLUSION: We find that the classification...

  7. Classification of Dams

    OpenAIRE

    Berg, Johan; Linder, Maria

    2013-01-01

    In a comparative survey, this thesis investigates classification systems for dams in Sweden, Norway, Finland, Switzerland, Canada and the USA. The investigation aims at understanding how the potential consequences of a dam failure are taken into account when classifying dams. Furthermore, the significance of the classification for the requirements placed on dam owners and surveillance authorities concerning dam safety is considered and reviewed. The thesis points out similarities and ...

  8. Minimum Error Entropy Classification

    CERN Document Server

    Marques de Sá, Joaquim P; Santos, Jorge M F; Alexandre, Luís A

    2013-01-01

    This book explains the minimum error entropy (MEE) concept applied to data classification machines. Theoretical results on the inner workings of the MEE concept, in its application to solving a variety of classification problems, are presented in the wider realm of risk functionals. Researchers and practitioners will also find in the book a detailed presentation of practical data classifiers using MEE. These include multi-layer perceptrons, recurrent neural networks, complex-valued neural networks, modular neural networks, and decision trees. A clustering algorithm using an MEE-like concept is also presented. Examples, tests, evaluation experiments and comparisons with similar machines using classic approaches complement the descriptions.
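The quantity MEE training works with can be made concrete with a short sketch: Rényi's quadratic entropy of the classifier errors, estimated with a Gaussian Parzen window. The kernel width `sigma` and the two toy error samples are illustrative assumptions, not values from the book.

```python
import numpy as np

def renyi_quadratic_entropy(errors, sigma=0.5):
    """H2 = -log(information potential), estimated with a Gaussian Parzen
    window of width sigma -- the error-entropy quantity MEE training minimises."""
    e = np.asarray(errors, dtype=float)
    diff = e[:, None] - e[None, :]                     # all pairwise error gaps
    ip = np.exp(-diff**2 / (4 * sigma**2)).mean() / np.sqrt(4 * np.pi * sigma**2)
    return -np.log(ip)

# Concentrated errors have lower entropy than spread-out ones, which is why
# minimising error entropy drives the errors toward a sharp peak at zero.
low = renyi_quadratic_entropy([0.0, 0.01, -0.02, 0.01])
high = renyi_quadratic_entropy([-1.0, 0.5, 1.2, -0.7])
```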

  9. Structural Damage Detection and Classification Algorithm Based on Artificial Immune Pattern Recognition

    Institute of Scientific and Technical Information of China (English)

    周悦; 唐世; 贾雪松; 张东伟; 臧传治

    2013-01-01

    For structural health monitoring, this paper studies structural damage detection and classification using an artificial immune system, exploiting its bionic principles of autonomy, initiative, adaptivity, learning and memory. By imitating the immune recognition and learning mechanisms, an artificial immune pattern recognition algorithm for structural damage detection and classification based on the diagonal distance is proposed. Damage detection and classification are tested on the benchmark structure proposed by the IASC-ASCE SHM working group. Simulation results show that the classification success rate based on the diagonal distance is slightly higher than with the Euclidean and Mahalanobis distances. The influence of the clone rate and the memory-cell replacement threshold on the classification rate is also examined for the diagonal distance, showing that suitable parameter values must be chosen to obtain a good classification success rate. Based on immune learning and evolution, the algorithm produces high-quality memory cells that effectively identify the various structural damage patterns.

  10. Efficient segmentation by sparse pixel classification

    DEFF Research Database (Denmark)

    Dam, Erik B; Loog, Marco

    2008-01-01

    Segmentation methods based on pixel classification are powerful but often slow. We introduce two general algorithms, based on sparse classification, for optimizing the computation while still obtaining accurate segmentations. The computational costs of the algorithms are derived, and they are demonstrated on real 3-D magnetic resonance imaging and 2-D radiograph data. We show that each algorithm is optimal for specific tasks, and that both algorithms allow a speedup of one or more orders of magnitude on typical segmentation tasks.

  11. Study on BBS Sentiment Classification Based on Random Principal Component Analysis Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘林; 刘三(女牙); 刘智; 铁璐

    2014-01-01

    For sentiment classification of Bulletin Board System (BBS) texts, an improved random subspace method (RSM) is proposed. The method exploits the discriminative information in the high-dimensional feature space. When generating subspaces, a weighting function evaluates the discriminative ability of each feature, so that more discriminative feature dimensions are selected with higher probability to preserve classification accuracy; at the same time, the selected subspace dimension is enlarged and principal component analysis is used to reduce the subspace dimension, ensuring both efficiency and subspace diversity. Experimental results show that the proposed algorithm reaches a classification accuracy of 91.3% and has more stable performance than the baseline algorithms.
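The subspace-plus-PCA pipeline can be sketched as follows. This is a simplified stand-in: the paper's weighting function is replaced by uniform random feature sampling, nearest-centroid voting replaces the unspecified base classifier, and the toy data, subspace sizes and ensemble size are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy two-class data: class means differ along every dimension.
X0 = rng.normal(0.0, 1.0, size=(150, 40))
X1 = rng.normal(0.8, 1.0, size=(150, 40))
X, y = np.vstack([X0, X1]), np.repeat([0, 1], 150)

def pca_project(Z, k):
    """Project centred data onto its top-k principal components."""
    Zc = Z - Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:k].T

votes = np.zeros((len(y), 2))
for _ in range(15):                                  # 15 random subspaces
    dims = rng.choice(40, size=20, replace=False)    # enlarged subspace ...
    P = pca_project(X[:, dims], 5)                   # ... reduced by PCA
    centroids = np.array([P[y == c].mean(axis=0) for c in (0, 1)])
    d = ((P[:, None, :] - centroids[None]) ** 2).sum(-1)
    votes[np.arange(len(y)), d.argmin(1)] += 1       # majority voting
acc = (votes.argmax(1) == y).mean()
```

Enlarging the sampled subspace and then reducing it with PCA is what keeps each ensemble member both cheap and diverse.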

  13. Classification prediction of G protein-coupled receptor (GPCR) family based on genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    王敏琦; 张力耘; 田雪; 蒲雪梅; 李梦龙

    2012-01-01

    G protein-coupled receptors (GPCRs) participate widely in the regulation of various physiological functions, and roughly half of the small-molecule drugs on the market target GPCRs. Owing to the lack of GPCR crystal structures, computational prediction of GPCR coupling selectivity is of considerable academic and practical value in drug design. In this work, a pseudo-amino-acid composition algorithm, a genetic algorithm and the support vector machine method are combined to carry out classification prediction for GPCRs from protein sequence information alone. The prediction accuracy of the classification model reaches 82.5%.

  14. Segmentation and classification of gait cycles.

    Science.gov (United States)

    Agostini, Valentina; Balestra, Gabriella; Knaflitz, Marco

    2014-09-01

    Gait abnormalities can be studied by means of instrumented gait analysis. Foot-switches are useful to study the foot-floor contact and for timing the gait phases in many gait disorders, provided that a reliable foot-switch signal may be collected. Considering long walks allows reducing the intra-subject variability, but requires automatic and user-independent methods to analyze a large number of gait cycles. The aim of this work is to describe and validate an algorithm for the segmentation of the foot-switch signal and the classification of the gait cycles. The performance of the algorithm was assessed comparing its results against the manual segmentation and classification performed by a gait analysis expert on the same signal. The performance was found to be equal to 100% for healthy subjects and over 98% for pathological subjects. The algorithm allows determining the atypical cycles (cycles that do not match the standard sequence of gait phases) for many different kinds of pathological gait, since it is not based on pathology-specific templates.
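The segmentation step described above can be sketched on a synthetic foot-switch trace. The signal values and the heel-strike convention are illustrative assumptions; the paper's algorithm additionally classifies the phase sequence within each cycle, which is not reproduced here.

```python
import numpy as np

# Synthetic foot-switch trace (1 = foot contact, 0 = swing); a real recording
# would be a thresholded sensor signal sampled over a long walk.
signal = np.array([1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1])

# Gait cycles are conventionally delimited by successive heel strikes,
# i.e. the 0 -> 1 transitions of the foot-switch signal.
strikes = np.flatnonzero(np.diff(signal) == 1) + 1
cycles = list(zip(strikes[:-1], strikes[1:]))
# Share of contact time in each cycle -- one simple descriptor a classifier
# could use to flag atypical cycles.
stance_fraction = [signal[a:b].mean() for a, b in cycles]
```

A cycle whose stance fraction or phase sequence deviates strongly from the rest of the walk would be a candidate "atypical cycle" in the sense of the abstract.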

  15. Decoding the Encoding of Functional Brain Networks: an fMRI Classification Comparison of Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and Sparse Coding Algorithms

    OpenAIRE

    Xie, Jianwen; Douglas, Pamela K.; Wu, Ying Nian; Brody, Arthur L.; Anderson, Ariana E.

    2016-01-01

    Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet mathematical constraints such as sparse coding and positivity both provide alternate biologically-plausible frameworks for generating brain networks. Non-negative Matrix Factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms ($L1$ Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking,...

  16. Towards automatic classification of all WISE sources

    Science.gov (United States)

    Kurcz, A.; Bilicki, M.; Solarz, A.; Krupa, M.; Pollo, A.; Małek, K.

    2016-07-01

    Context. The Wide-field Infrared Survey Explorer (WISE) has detected hundreds of millions of sources over the entire sky. Classifying them reliably is, however, a challenging task owing to degeneracies in WISE multicolour space and low levels of detection in its two longest-wavelength bandpasses. Simple colour cuts are often not sufficient; for satisfactory levels of completeness and purity, more sophisticated classification methods are needed. Aims: Here we aim to obtain comprehensive and reliable star, galaxy, and quasar catalogues based on automatic source classification in full-sky WISE data. This means that the final classification will employ only parameters available from WISE itself, in particular those which are reliably measured for the majority of sources. Methods: For the automatic classification we applied a supervised machine learning algorithm, support vector machines (SVM). It requires a training sample with relevant classes already identified, and we chose to use the SDSS spectroscopic dataset (DR10) for that purpose. We tested the performance of two kernels used by the classifier, and determined the minimum number of sources in the training set required to achieve stable classification, as well as the minimum dimension of the parameter space. We also tested SVM classification accuracy as a function of extinction and apparent magnitude. Thus, the calibrated classifier was finally applied to all-sky WISE data, flux-limited to 16 mag (Vega) in the 3.4 μm channel. Results: By calibrating on the test data drawn from SDSS, we first established that a polynomial kernel is preferred over a radial one for this particular dataset. Next, using three classification parameters (W1 magnitude, W1-W2 colour, and a differential aperture magnitude) we obtained very good classification efficiency in all the tests. At the bright end, the completeness for stars and galaxies reaches ~95%, deteriorating to ~80% at W1 = 16 mag, while for quasars it stays at a level of

  17. The Research and Application of the Multi-classification Algorithm of Error-Correcting Codes Based on Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    祖文超; 苑津莎; 王峰; 刘磊

    2012-01-01

    To improve the accuracy of transformer fault diagnosis, a multiclass classification algorithm combining error-correcting output codes with support vector machines (SVMs) is proposed. A mathematical model of transformer fault diagnosis is set up according to SVM theory. First, the error-correcting code matrix is used to construct several mutually independent SVMs, which raises the accuracy of the classification model. Finally, dissolved gas analysis (DGA) data from transformer oil are used as training and test samples for the error-correcting-code SVMs to realise transformer fault diagnosis, and the algorithm is checked against UCI data. The multiclass classification algorithm was verified with VS2008 combined with Libsvm, and the results show that the method achieves high classification accuracy.
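The error-correcting-output-code construction can be sketched with least-squares linear classifiers standing in for the SVMs. The three-class toy data and the 3-bit code matrix are invented for illustration and are not the paper's DGA features.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 3-class data standing in for gas-ratio features (hypothetical).
means = np.array([[0, 0], [3, 0], [0, 3]])
X = np.vstack([rng.normal(m, 0.7, size=(60, 2)) for m in means])
y = np.repeat([0, 1, 2], 60)

# Error-correcting output code: each row is a class codeword over 3 binary
# tasks; any two codewords differ in 2 bits, so one wrong bit is tolerated.
code = np.array([[+1, +1, -1],
                 [+1, -1, +1],
                 [-1, +1, +1]])

Xb = np.hstack([X, np.ones((len(X), 1))])        # add a bias column
# One least-squares linear classifier per code bit (stand-in for the SVMs).
W = np.linalg.lstsq(Xb, code[y].astype(float), rcond=None)[0]
scores = Xb @ W                                  # predicted code bits
# Decode: pick the class whose codeword is closest in Hamming distance.
pred = np.argmin(((np.sign(scores)[:, None, :] - code[None]) != 0).sum(-1), axis=1)
acc = (pred == y).mean()
```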

  18. Detection, identification and classification of defects using ANN and a robotic manipulator of 2 G.L. (Kohonen and MLP algorithms)

    International Nuclear Information System (INIS)

    The ultrasonic inspection technique has seen sustained growth since the 1980s, and has several advantages compared with the contact technique. A flexible, low-cost solution is presented, based on virtual instrumentation, for the servomechanism (manipulator) control of the ultrasound inspection transducer in the immersion technique. The developed system uses a personal computer (PC), a Windows operating system, virtual instrumentation software, DAQ cards and a GPIB card. As a solution for the detection, classification and evaluation of defects, an artificial neural network technique is proposed. It consists of the characterisation and interpretation of acoustic signals (echoes) acquired by the immersion ultrasonic inspection technique. Two neural networks are proposed: Kohonen and multilayer perceptron (MLP). With these techniques, complex non-linear processes can be modelled with great precision. The control of the 2-degree-of-freedom manipulator, the data acquisition and the network training were carried out in a virtual instrument environment using LabVIEW and Data Engine. (Author) 14 refs

  19. An improved DS acoustic-seismic modality fusion algorithm based on a new cascaded fuzzy classifier for ground-moving targets classification in wireless sensor networks

    Science.gov (United States)

    Pan, Qiang; Wei, Jianming; Cao, Hongbing; Li, Na; Liu, Haitao

    2007-04-01

    A new cascaded fuzzy classifier (CFC) is proposed to perform ground-moving target classification locally at sensor nodes in wireless sensor networks (WSNs). The CFC is composed of three binary fuzzy classifiers (BFCs) in the seismic channel and two in the acoustic channel, in order to classify persons, light-wheeled (LW) vehicles and heavy-wheeled (HW) vehicles in the presence of environmental background noise. Based on the CFC, a new basic belief assignment (bba) function is defined for each component BFC, giving a piece of evidence instead of a hard decision label. An evidence generator synthesises the available evidence from the BFCs into channel evidence, which is further fused temporally. Finally, acoustic-seismic modality fusion using the Dempster-Shafer method is performed. Our implementation gives significantly better performance than an implementation with majority-voting fusion in leave-one-out experiments.
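The fusion step named in the abstract, Dempster's rule of combination, can be sketched directly. The two basic belief assignments below are invented examples over the three target classes, not values from the paper.

```python
def dempster_combine(m1, m2):
    """Combine two basic belief assignments (dicts mapping frozensets of the
    frame of discernment to masses) using Dempster's rule."""
    combined, conflict = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb          # mass on contradictory pairs
    k = 1.0 - conflict                       # normalisation constant
    return {s: v / k for s, v in combined.items()}

# Hypothetical bba's from an acoustic and a seismic channel over
# {person, light-wheeled, heavy-wheeled}.
P, LW, HW = "person", "LW", "HW"
acoustic = {frozenset([P]): 0.6, frozenset([LW, HW]): 0.3, frozenset([P, LW, HW]): 0.1}
seismic  = {frozenset([P]): 0.5, frozenset([P, LW]): 0.3, frozenset([P, LW, HW]): 0.2}
fused = dempster_combine(acoustic, seismic)
```

Because both channels put most of their mass on "person", the fused assignment concentrates there, which is the behaviour that lets soft bba evidence outperform hard majority voting.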

  20. A New Cross-multidomain Classification Algorithm and Its Fast Version for Large Datasets

    Institute of Scientific and Technical Information of China (English)

    顾鑫; 王士同; 许敏

    2014-01-01

    The cross-domain learning and classification addressed in this paper attempts to effectively transfer classification results obtained from supervised multi-source domains to an unsupervised target domain. Although current cross-domain learning methods have been very successful on cross-single-domain problems, they run into serious difficulties in classification accuracy and running speed when applied to large multi-source datasets. In this paper, based on the logistic regression model and a proposed consensus measure, a multi-source cross-domain classification (MSCC) algorithm is developed to realise effective cross-domain classification for the target domain. To make MSCC work well on large datasets, a fast version, MSCC-CDdual, is derived and theoretically analysed, building on the CDdual (dual coordinate descent) method, a recent advance in large-scale logistic regression. Experimental results on artificial, text and image data indicate that MSCC-CDdual is fast, highly accurate and well adapted to large multi-source cross-domain datasets. The contributions of this work are threefold: 1) a novel consensus measure is proposed, which is suitable for boosting multiple classifiers and convenient for deriving a fast version of MSCC for large datasets; 2) the proposed MSCC-CDdual algorithm is shown to be suitable for cross-multisource learning on both small and large datasets; 3) MSCC-CDdual is additionally applicable to high-dimensional datasets, i.e. datasets that are "large" in another sense.

  1. Feature selection using a genetic algorithm-based hybrid approach

    Directory of Open Access Journals (Sweden)

    Luis Felipe Giraldo

    2010-04-01

    Full Text Available The present work proposes a hybrid feature selection model aimed at reducing training time whilst maintaining classification accuracy. The model includes adjusting a decision tree to produce feature subsets, whose statistical relevance was evaluated from the resulting classification error using the k-nearest-neighbour rule. Dimension reduction techniques usually assume an element of error; in this work, however, the hybrid selection model was tuned by means of genetic algorithms, which simultaneously minimise the number of features and the training error. In contrast with conventional methods, this model also quantifies the relevance of each of the training set's features. The model was tested on speech signals (hypernasality classification) and ECG identification (ischaemic cardiopathy). The speech database consisted of 90 children (45 recordings per sample); the ECG database had 100 electrocardiograph records (50 recordings per sample). Results showed average reduction rates of up to 88%, with classification error below 6%.
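The genetic-algorithm wrapper described above can be sketched as a tiny elitist GA over boolean feature masks, with leave-one-out 1-NN error as the wrapped classifier. The toy data, penalty weight, population size and mutation rate are all invented for illustration, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: only the first 3 of 12 features carry class information.
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 12))
X[:, :3] += y[:, None] * 1.5

def loo_1nn_error(mask):
    if not mask.any():
        return 1.0
    Z = X[:, mask]
    d = ((Z[:, None] - Z[None]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)              # leave-one-out 1-NN
    return (y[d.argmin(1)] != y).mean()

def fitness(mask):
    # Jointly penalise error and subset size (penalty weight is a guess).
    return loo_1nn_error(mask) + 0.01 * mask.sum()

pop = rng.random((20, 12)) < 0.5
pop[0] = True                                # seed the all-features subset
for _ in range(25):
    order = np.argsort([fitness(m) for m in pop])
    parents = pop[order[:10]]                # truncation selection (elitist)
    cut = rng.integers(1, 12, size=10)
    kids = np.array([np.concatenate([parents[i][:c], parents[(i + 1) % 10][c:]])
                     for i, c in enumerate(cut)])   # one-point crossover
    kids ^= rng.random(kids.shape) < 0.05    # bit-flip mutation
    pop = np.vstack([parents, kids])
best = min(pop, key=fitness)
```

Because the top half of each generation survives unchanged, the best fitness never worsens, so the final subset is at least as good as using all features.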

  2. ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURISTIC ALGORITHMS

    Directory of Open Access Journals (Sweden)

    Roghayeh Saneifar

    2015-11-01

    Full Text Available With the increasing use of data mining techniques for improving the operation of educational systems, Educational Data Mining has been introduced as a new and fast-growing research area. It aims to analyse data from educational environments in order to solve educational research problems. In this paper a new associative classification technique is proposed to predict students' final performance. Unlike several machine learning approaches such as ANNs and SVMs, associative classifiers maintain interpretability along with high accuracy. In this research work, honeybee colony optimisation and particle swarm optimisation are employed to extract association rules for student performance prediction, treated as a multi-objective classification problem. Results indicate that the proposed swarm-based algorithm outperforms well-known classification techniques on the student performance prediction problem.

  3. Multiple Observation Sets Classification Algorithm Based on Graphical Presentation of Inconsistent Similarity Measure

    Institute of Scientific and Technical Information of China (English)

    胡正平; 赵艳霜; 荆楠

    2012-01-01

    In the classification of multiple observation sets, samples are represented as points on Grassmannian manifolds. To exploit this manifold structure for better classification performance, a multiple-observation-set classification algorithm based on a graphical representation with an inconsistent similarity measure is presented. First, considering both global and local data structure, an inconsistent similarity measure is constructed that emphasises the distinction between within-class and between-class relations and effectively reflects the actual clustering distribution of the data. Second, a graph built from this measure yields the similarity matrix between samples; the computation of the optimal mapping is then transformed, via a combined Grassmannian kernel, into the search for the largest eigenvectors of a Rayleigh quotient, giving the projection matrix. Finally, points on the manifold are mapped into another space and classification is completed with the nearest-neighbour classifier. Comparative experiments on the ETH-80 object recognition dataset and the CMU-PIE and BANCA face recognition datasets show that the algorithm outperforms traditional algorithms.

  4. The Improved Job Scheduling Algorithm of Hadoop Platform

    OpenAIRE

    Guo, Yingjie; Wu, Linzhi; Yu, Wei; Wu, Bin; Wang, Xiaotian

    2015-01-01

    This paper discusses job scheduling algorithms for the Hadoop platform and, in view of the shortcomings of the algorithms currently in use, proposes a job scheduling optimization algorithm based on Bayes classification. The proposed algorithm can be summarized as follows: the jobs in the job queue are classified into good jobs and bad jobs by Bayes classification; when the JobTracker receives a task request, it selects a good job from the job queue,...

  5. Novel semi-supervised classification algorithm based on TSVM

    Institute of Scientific and Technical Information of China (English)

    王安娜; 李云路; 赵锋云; 史成龙

    2011-01-01

    The traditional SVM is a supervised learning method and needs a large number of labelled samples. In practical applications, however, labelled samples are limited and difficult to obtain. How to use unlabelled samples effectively therefore becomes an important task in machine learning when labelled samples are scarce and the system has a large amount of unlabelled data. In this paper, a novel semi-supervised transductive SVM classification algorithm is proposed that combines a semi-supervised algorithm with SVM: in the iterative process, unlabelled samples are combined with labelled samples, gradually yielding a more credible separating hyperplane. Both theoretical analysis and computer simulation results indicate that the proposed algorithm uses unlabelled samples effectively, and that adding unlabelled samples improves the classification accuracy.
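The iterative idea of the abstract can be sketched as a self-training loop, a simplified stand-in for the TSVM: a nearest-centroid classifier (an assumption, chosen for brevity instead of an SVM) labels the unlabelled points it is most confident about, then is refitted on the enlarged labelled set. The toy data and confidence rule are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two Gaussian classes; only 5 samples per class start out labelled.
Xa = rng.normal(-2, 1, size=(100, 2))
Xb = rng.normal(2, 1, size=(100, 2))
X = np.vstack([Xa, Xb])
y_true = np.repeat([0, 1], 100)
labeled = np.zeros(200, dtype=bool)
labeled[:5] = labeled[100:105] = True
y = np.where(labeled, y_true, -1)            # -1 marks unlabelled samples

for _ in range(10):
    centroids = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
    d = np.linalg.norm(X[:, None] - centroids[None], axis=-1)
    margin = np.abs(d[:, 0] - d[:, 1])       # confidence proxy
    cand = np.flatnonzero((y == -1) & (margin > np.median(margin)))
    if cand.size == 0:
        break
    y[cand] = d[cand].argmin(1)              # adopt confident pseudo-labels

centroids = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
d = np.linalg.norm(X[:, None] - centroids[None], axis=-1)
acc = (np.where(y == -1, d.argmin(1), y) == y_true).mean()
```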

  6. Snow event classification with a 2D video disdrometer - A decision tree approach

    Science.gov (United States)

    Bernauer, F.; Hürkamp, K.; Rühm, W.; Tschiersch, J.

    2016-05-01

    Snowfall classification according to crystal type or the degree of riming of the snowflakes is important for many atmospheric processes, e.g. wet deposition of aerosol particles. 2D video disdrometers (2DVD) have recently proved their capability to measure microphysical parameters of snowfall. The present work aims to classify snowfall according to microphysical properties of single hydrometeors (e.g. shape and fall velocity) measured by means of a 2DVD. The constraints on the shape and velocity parameters used in a decision tree for classifying the 2DVD measurements are derived from detailed on-site observations, combining automatic 2DVD classification with visual inspection. The developed decision-tree algorithm subdivides the detected events into three classes of dominating crystal type (single crystals, complex crystals and pellets) and three classes of dominating degree of riming (weak, moderate and strong). The classification results for crystal type were validated against an independent dataset, proving the unambiguousness of the classification. In addition, for three long-term events, good agreement was found between the classification results and independently measured maximum snowflake dimension, snowflake bulk density and ambient temperature. The developed classification algorithm is applicable for wind speeds below 5.0 m s^-1 and has the advantage of being easily implemented by other users.
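The shape of such a decision tree can be sketched as a few nested threshold tests. The thresholds and the two descriptors below are invented for illustration; the paper's actual constraints come from its on-site observations and are not reproduced in the abstract.

```python
# Toy decision tree over two hydrometeor descriptors; the thresholds are
# hypothetical placeholders, not the values derived in the paper.
def classify_crystal_type(fall_speed, complexity):
    """fall_speed in m/s; complexity is a shape descriptor (higher = more
    complex outline)."""
    if fall_speed > 2.0:         # dense, fast-falling particles
        return "pellets"
    if complexity > 1.5:         # aggregates of several crystals
        return "complex crystals"
    return "single crystals"
```

A second tree of the same form, over riming-sensitive descriptors, would yield the weak/moderate/strong riming classes.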

  7. Identifying types of physical activity with a single accelerometer: Evaluating laboratory trained algorithms in daily life

    NARCIS (Netherlands)

    Cuba Gyllensten, I.; Bonomi, A.G.

    2011-01-01

    Accurate identification of physical activity types has been achieved in laboratory conditions using single-site accelerometers and classification algorithms. This methodology is then applied to free-living subjects to determine activity behaviour. This study aimed at analysing the reproducibility of

  8. A simple and robust classification tree for differentiation between benign and malignant lesions in MR-mammography

    Energy Technology Data Exchange (ETDEWEB)

    Baltzer, Pascal A.T. [Medical University Vienna, Department of Radiology, Vienna (Austria); Dietzel, Matthias [University hospital Erlangen, Department of Neuroradiology, Erlangen (Germany); Kaiser, Werner A. [University Hospital Jena, Institute of Diagnostic and Interventional Radiology 1, Jena (Germany)

    2013-08-15

    In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. (orig.)

  9. Quantum image classification using principal component analysis

    Directory of Open Access Journals (Sweden)

    Mateusz Ostaszewski

    2016-01-01

    Full Text Available We present a novel quantum algorithm for the classification of images. The algorithm is constructed using principal component analysis and von Neumann quantum measurements. In order to apply the algorithm, we present a new quantum representation of grayscale images.

  10. Biogeography based Satellite Image Classification

    CERN Document Server

    Panchal, V K; Kaur, Navdeep; Kundra, Harish

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The engineer's mindset is that we can learn from nature. Biogeography-based optimization is a burgeoning nature-inspired technique for finding the optimal solution to a problem. Satellite image classification is an important task because it is the only way we can learn about the land cover of inaccessible areas. Though satellite images have been classified in the past using various techniques, researchers are always looking for alternative strategies for satellite image classification, so that they may be prepared to select the most appropriate technique for the feature extraction task at hand. This paper focuses on the classification of a satellite image of a particular land cover using the theory of biogeography-based optimization. The original BBO algorithm does not have the inbuilt clustering property required for image classification; hence, modifications to the original algorithm have been proposed and...

  11. Classification of Medical Brain Images

    Institute of Scientific and Technical Information of China (English)

    Pan Haiwei(潘海为); Li Jianzhong; Zhang Wei

    2003-01-01

    Since brain tumors endanger people's quality of life and even their lives, classification accuracy is all the more important. Conventional classification techniques are designed for datasets of characters and numbers. It is difficult, however, to apply them to datasets that include brain images and medical history (alphanumeric data), especially while guaranteeing accuracy. For these datasets, this paper combines knowledge of the medical field with an improved version of the traditional decision tree. The new classification algorithm, guided by medical knowledge, not only adds interaction with doctors but also enhances the quality of classification. The algorithm has been applied to real brain CT images, and valuable rules have been obtained from the experiments. This paper shows that the algorithm works well for real CT data.

  12. Hubble Classification

    Science.gov (United States)

    Murdin, P.

    2000-11-01

    A classification scheme for galaxies, devised in its original form in 1925 by Edwin P Hubble (1889-1953), and still widely used today. The Hubble classification recognizes four principal types of galaxy—elliptical, spiral, barred spiral and irregular—and arranges these in a sequence that is called the tuning-fork diagram....

  13. Sentiment Analysis of Movie Reviews using Hybrid Method of Naive Bayes and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    M.Govindarajan

    2013-12-01

    Full Text Available The area of sentiment mining (also called sentiment extraction, opinion mining, opinion extraction, sentiment analysis, etc.) has seen a large increase in academic interest in the last few years. Researchers in the areas of natural language processing, data mining, machine learning, and others have tested a variety of methods for automating the sentiment analysis process. In this research work, a new hybrid classification method is proposed based on coupling classification methods using an arcing classifier, and their performances are analyzed in terms of accuracy. A classifier ensemble was designed using Naive Bayes (NB) and a Genetic Algorithm (GA). In the proposed work, a comparative study of the effectiveness of the ensemble technique is made for sentiment classification. The ensemble framework is applied to sentiment classification tasks, with the aim of efficiently integrating different feature sets and classification algorithms to synthesize a more accurate classification procedure. The feasibility and the benefits of the proposed approaches are demonstrated by means of movie reviews, a corpus widely used in the field of sentiment classification. A wide range of comparative experiments are conducted; finally, some in-depth discussion is presented and conclusions are drawn about the effectiveness of the ensemble technique for sentiment classification.
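The Naive Bayes base learner used in the hybrid above can be sketched in a few lines; the toy reviews are invented, and the GA/arcing ensemble layer described in the abstract is omitted:

```python
# Minimal multinomial Naive Bayes for sentiment, with Laplace smoothing.
# Only the NB base learner of the abstract's ensemble is shown; documents
# and labels are invented toy data.
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns per-label log-priors/likelihoods."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        model[label] = {
            "prior": math.log(label_counts[label] / len(docs)),
            "likelihood": {w: math.log((word_counts[label][w] + 1) /
                                       (total + len(vocab)))
                           for w in vocab},
        }
    return model

def classify_nb(model, tokens):
    # Words never seen in training are simply ignored (weight 0.0).
    def score(label):
        m = model[label]
        return m["prior"] + sum(m["likelihood"].get(w, 0.0) for w in tokens)
    return max(model, key=score)

docs = [
    ("great wonderful film".split(), "pos"),
    ("loved great acting".split(), "pos"),
    ("boring terrible plot".split(), "neg"),
    ("awful boring film".split(), "neg"),
]
model = train_nb(docs)
```

An ensemble in the spirit of the abstract would train several such base models on reweighted samples and combine their votes.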

  14. Semi-supervised Text Classification Algorithm Based on Ant Colony Aggregation Pheromone

    Institute of Scientific and Technical Information of China (English)

    杜芳华; 冀俊忠; 吴晨生; 吴金源

    2014-01-01

    Semi-supervised text classification can perform poorly when the labeled data are distributed differently from the unlabeled data. This paper presents a semi-supervised text classification algorithm based on aggregation pheromone, the mechanism real ants and other insects use for species aggregation. The proposed method makes no assumption about the data distribution and can therefore be applied to data distributed in any way. Aggregation pheromone is fused with traditional text-similarity computation: the colonies an unlabeled ant may belong to are selected with a Top-k strategy, the confidence of the unlabeled ant is determined by a judgment rule, and unlabeled ants with high confidence are added to the training colony that attracts them most by a random selection strategy. Experiments on a benchmark dataset show that, compared with Naïve Bayes and the EM algorithm, this algorithm performs better on precision, recall and macro-F1.

  15. Mimicking human texture classification

    Science.gov (United States)

    van Rikxoort, Eva M.; van den Broek, Egon L.; Schouten, Theo E.

    2005-03-01

    In an attempt to mimic human (colorful) texture classification by a clustering algorithm, three lines of research were pursued, with a test set of 180 texture images (both their color and gray-scale equivalents) drawn from the OuTex and VisTex databases. First, a k-means algorithm was applied with three feature vectors, based on color/gray values, four texture features, and their combination. Second, 18 participants clustered the images using a newly developed card sorting program. The mutual agreement between the participants was 57% and 56%, and between the algorithm and the participants it was 47% and 45%, for color and gray-scale texture images respectively. Third, in a benchmark, 30 participants judged the algorithm's clusters with gray-scale textures as more homogeneous than those with colored textures. However, a high interpersonal variability was present for both the color and the gray-scale clusters. So, despite the promising results, it is questionable whether average human texture classification can be mimicked (if it exists at all).
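The first line of research above applies k-means to feature vectors extracted from the textures. A minimal k-means sketch on invented 2-D feature vectors (standing in for the color/texture features of the paper):

```python
# Plain Lloyd's k-means on toy 2-D feature vectors; initialization is the
# naive "first k points" for determinism. The paper's feature extraction
# (color values, texture features) is not reproduced here.
def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(pts):
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(len(pts[0])))

def kmeans(points, k, iters=20):
    centroids = list(points[:k])          # naive deterministic init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                   # assignment step
            idx = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[idx].append(p)
        centroids = [mean(c) if c else centroids[i]   # update step
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
          (0.9, 1.0), (1.0, 0.9), (0.8, 1.1)]
centroids, clusters = kmeans(points, 2)
```

On this toy data the two visually obvious groups separate within a few iterations.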

  16. The Classification Algorithm of Multiple Observation Samples Based on L1 Norm Convex Hull Data Description

    Institute of Scientific and Technical Information of China (English)

    胡正平; 王玲丽

    2012-01-01

    In order to construct a high-dimensional data model that best covers the distribution of high-dimensional samples, a classification algorithm for multiple observation samples based on L1-norm convex hull data description is proposed. First, a convex hull is constructed for each class in the training set and for the multiple observation samples in the test set, so that classifying the multiple observation samples reduces to measuring the similarity of convex hulls. If the test convex hull does not overlap any training hull, an L1-norm distance measure is used for the similarity between convex hulls; otherwise, the L1-norm distance is computed between reduced convex hulls. A nearest-neighbor rule is then used as the classification decision for the multiple observation samples. Experiments on three types of databases show that the proposed method is valid and efficient.

  17. AN INTELLIGENT CLASSIFICATION MODEL FOR PHISHING EMAIL DETECTION

    Directory of Open Access Journals (Sweden)

    Adwan Yasin

    2016-07-01

    Full Text Available Phishing attacks are among the trending cyber-attacks. They apply socially engineered messages, crafted by professional hackers and aimed at fooling users into revealing their sensitive information; the most popular communication channel for those messages is users' email. This paper presents an intelligent classification model for detecting phishing emails using knowledge discovery, data mining and text processing techniques. It introduces the concept of phishing term weighting, which evaluates the weight of phishing terms in each email. The pre-processing phase is enhanced by applying text stemming and the WordNet ontology to enrich the model with word synonyms. The model applied the knowledge discovery procedures using five popular classification algorithms and achieved a notable enhancement in classification accuracy; 99.1% accuracy was achieved using the Random Forest algorithm and 98.4% using J48, which is, to our knowledge, the highest accuracy rate reported for an accredited data set. This paper also presents a comparative study with similar proposed classification techniques.
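The phishing-term-weighting idea can be illustrated as follows; the term list, weights, and threshold are invented for illustration, whereas the paper learns term weights from a corpus and feeds the resulting features to classifiers such as Random Forest:

```python
# Toy illustration of phishing-term weighting: score an email by the summed
# weight of known phishing terms it contains. Terms, weights, and threshold
# are all invented; the paper derives weights from training data.
PHISHING_TERMS = {"verify": 2.0, "account": 1.0, "password": 2.5,
                  "urgent": 1.5, "click": 1.0}

def phishing_weight(email_text):
    """Sum the weights of phishing terms appearing in the email."""
    tokens = email_text.lower().split()
    return sum(PHISHING_TERMS.get(t, 0.0) for t in tokens)

def flag(email_text, threshold=3.0):
    """Flag the email as suspicious when its weight crosses the threshold."""
    return phishing_weight(email_text) >= threshold

mail = "urgent please verify your account password"
```

In the paper this score is one feature among several (after stemming and WordNet-based synonym expansion) rather than a standalone decision rule.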

  18. Pornography Filtering Algorithm Based on Classification Space Multi-instance Learning

    Institute of Scientific and Technical Information of China (English)

    李博; 曹鹏; 栗伟; 赵大哲

    2013-01-01

    Traditional algorithms for automatically filtering objectionable images are hard to apply in the complex Internet environment. A novel filtering algorithm based on multi-instance learning over a constructed classification space is therefore presented. First, the Hessian matrix is extended to the YCgCr color space to detect feature points, which serve as the instances of an image; the LBP operator is likewise extended to YCgCr, and a YCgCr-LBP descriptor is constructed to describe each image instance. Finally, a classification-space model based on bag-instance frequency statistics is proposed, and cosine similarity completes the image recognition. Comparisons on data sets of differing composition show that the proposed method is more accurate than conventional skin-proportion filtering on images with large skin or skin-like regions, describes images better than generic multi-instance learning, and achieves good experimental results of practical value.

  19. Label Propagation Classification Algorithm of Multiple Observation Sets Based on L1-Graph Representation

    Institute of Scientific and Technical Information of China (English)

    胡正平; 王玲丽

    2011-01-01

    Samples of the same class can be assumed to lie on a low-dimensional manifold of the same high-dimensional observation space. To exploit this manifold structure for classifying multiple observation sets, a label propagation classification algorithm based on L1-Graph representation is proposed. First, an L1-Graph is constructed via sparse representation, yielding a similarity matrix between samples. Then, building on the semi-supervised label propagation algorithm and under the constraint that all observation samples belong to the same class, a label matrix with a special structure is obtained. Finally, the search for the optimal label matrix is transformed into a discrete objective-function optimization problem, from which the class of the test samples is computed. Experiments on the USPS handwritten digit database, the ETH-80 object recognition database and the Cropped Yale face recognition database show that the proposed method is valid and efficient.

  20. Memetic Algorithm and its Application Research in Classification

    Institute of Scientific and Technical Information of China (English)

    吉利鹏; 张洪伟

    2014-01-01

    The Memetic Algorithm (MA), a swarm-intelligence optimization method, follows the workflow of evolutionary algorithms while introducing a local search operator, so that it obtains high-quality solutions while maintaining good convergence; it thus overcomes the premature-convergence problem of genetic algorithms and other traditional global optimizers while avoiding entrapment in local optima. Building on the MA framework, a globally and dynamically adaptive MA is proposed that uses a genetic algorithm as the global search operator and k-means as the local search operator. The algorithm was implemented in Java and tested on UCI classification data sets; the results show that the globally and dynamically adaptive MA combining a genetic algorithm with k-means achieves high accuracy on classification problems.

  1. Brain source localization: A new method based on MUltiple SIgnal Classification algorithm and spatial sparsity of the field signal for electroencephalogram measurements

    Science.gov (United States)

    Vergallo, P.; Lay-Ekuakille, A.

    2013-08-01

    Brain activity can be recorded by means of EEG (Electroencephalogram) electrodes placed on the scalp of the patient. The EEG reflects the activity of groups of neurons located in the head, and the fundamental problem in neurophysiology is the identification of the sources responsible for brain activity, which is especially important to identify when a seizure occurs. Studies conducted to formalize the relationship between the electromagnetic activity in the head and the recording of the generated external field make it possible to know patterns of brain activity. The inverse problem, in which the underlying activity must be determined from the field sampled at different electrodes, is more difficult because it may not have a unique solution, or the search for the solution is hampered by a low spatial resolution that may not allow activities involving sources close to each other to be distinguished. Thus, sources of interest may be obscured or not detected, and a well-known source localization method such as MUSIC (MUltiple SIgnal Classification) can fail. Many advanced source localization techniques achieve better resolution by exploiting sparsity: if the number of sources is small, the neural power as a function of location is sparse. In this work a solution based on the spatial sparsity of the field signal is presented and analyzed to improve the MUSIC method. For this purpose, a priori information about the sparsity of the signal must be supplied. The problem is formulated and solved using a regularization method such as Tikhonov's, which calculates the solution that is the best compromise between two cost functions to minimize: one related to fitting the data and another enforcing the sparsity of the signal. First, the method is tested on simulated EEG signals obtained by solving the forward problem. Relative to the model considered for the head and brain sources, the result obtained allows to
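The Tikhonov step described above trades off data fit against a penalty on the solution. A closed-form miniature for two unknowns, using the plain quadratic penalty lambda*||x||^2 rather than the sparsity-preserving term of the paper:

```python
# Tikhonov-regularized least squares in miniature: minimize
# ||A x - b||^2 + lam * ||x||^2 by solving the normal equations
# (A^T A + lam*I) x = A^T b, written out by hand for 2 unknowns.
def tikhonov_2x2(A, b, lam):
    """Solve (A^T A + lam*I) x = A^T b for a 2-column matrix A."""
    m00 = sum(r[0] * r[0] for r in A) + lam     # normal-equation matrix
    m01 = sum(r[0] * r[1] for r in A)
    m11 = sum(r[1] * r[1] for r in A) + lam
    v0 = sum(r[0] * y for r, y in zip(A, b))    # right-hand side A^T b
    v1 = sum(r[1] * y for r, y in zip(A, b))
    det = m00 * m11 - m01 * m01
    return ((m11 * v0 - m01 * v1) / det, (m00 * v1 - m01 * v0) / det)

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x = tikhonov_2x2(A, b, lam=0.0)          # lam=0 recovers ordinary least squares
x_shrunk = tikhonov_2x2(A, b, lam=1.0)   # regularization shrinks the solution
```

Increasing `lam` pulls the solution toward zero, illustrating the fit-versus-penalty compromise the abstract describes; a sparsity-promoting penalty would instead drive small components exactly to zero.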

  2. A Novel Rule Induction Algorithm

    Institute of Scientific and Technical Information of China (English)

    ZHENG Jianguo; LIU Fang; WANG Lei; JIAO Licheng

    2001-01-01

    Knowledge discovery in databases is concerned with extracting useful information from databases, and the immune algorithm is a biologically inspired, globally searching algorithm. A specific immune algorithm is designed for discovering a few interesting, high-level prediction rules from databases, rather than discovering classification knowledge as is usual in the literature. Simulations show that this novel algorithm is able to improve the stability of the population, increase the holistic performance and make the extracted rules more precise.

  3. A comparison of CA125, HE4, risk ovarian malignancy algorithm (ROMA), and risk malignancy index (RMI) for the classification of ovarian masses

    Directory of Open Access Journals (Sweden)

    Cristina Anton

    2012-01-01

    Full Text Available OBJECTIVE: Differentiation between benign and malignant ovarian neoplasms is essential for creating a system for patient referrals. Therefore, the contributions of the tumor markers CA125 and human epididymis protein 4 (HE4) as well as the risk ovarian malignancy algorithm (ROMA) and risk malignancy index (RMI) values were considered individually and in combination to evaluate their utility for establishing this type of patient referral system. METHODS: Patients who had been diagnosed with ovarian masses through imaging analyses (n = 128) were assessed for their expression of the tumor markers CA125 and HE4. The ROMA and RMI values were also determined. The sensitivity and specificity of each parameter were calculated using receiver operating characteristic curves according to the area under the curve (AUC) for each method. RESULTS: The sensitivities associated with the ability of CA125, HE4, ROMA, or RMI to distinguish between malignant versus benign ovarian masses were 70.4%, 79.6%, 74.1%, and 63%, respectively. Among carcinomas, the sensitivities of CA125, HE4, ROMA (pre- and post-menopausal), and RMI were 93.5%, 87.1%, 80%, 95.2%, and 87.1%, respectively. The most accurate numerical values were obtained with RMI, although the four parameters were shown to be statistically equivalent. CONCLUSION: There were no differences in accuracy between CA125, HE4, ROMA, and RMI for differentiating between types of ovarian masses. RMI had the lowest sensitivity but was the most numerically accurate method. HE4 demonstrated the best overall sensitivity for the evaluation of malignant ovarian tumors and the differential diagnosis of endometriosis. All of the parameters demonstrated increased sensitivity when tumors with low malignancy potential were considered low-risk, which may be used as an acceptable assessment method for referring patients to reference centers.

  4. Sequential Computation-based Hash Nearest Neighbor Algorithm and Its Application in Spectrum Classification

    Institute of Scientific and Technical Information of China (English)

    李乡儒

    2012-01-01

    The nearest neighbor (NN) method is one of the most typical methods in spectral retrieval, automatic processing and data mining; its main problem is low efficiency. This work therefore focuses on the efficient-implementation problem and introduces a novel, efficient algorithm, SHNN (sequential computation-based hash nearest neighbor). In SHNN, the spectrum flux components are first decomposed and organized according to their hashing power via an orthogonal PCA transform; the nearest neighbor is then computed in PCA space following a sequential-computation idea, in which hashing quickly prunes the set of putative nearest spectra and non-nearest spectra are rejected as early as possible. The contributions of this work are: 1) a novel algorithm, SHNN, which significantly improves the efficiency of the nearest neighbor method, one of the most popular spectra-mining methods; 2) an investigation of its application to the classification of star, normal galaxy and QSO spectra. The efficiency of the proposed algorithm was evaluated experimentally on released SDSS (Sloan Digital Sky Survey) spectra. The experimental results show that the proposed SHNN algorithm improves the efficiency of the nearest neighbor method by more than 96%. Since the nearest neighbor is one of the most popular and typical methods in spectra mining, this work is useful in a wide range of automatic spectra analysis scenarios, for example spectra classification, spectral parameter estimation, and redshift estimation based on spectra.
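The early-rejection idea behind SHNN can be sketched as a partial-distance nearest-neighbour search: accumulate the squared distance term by term and abandon a candidate as soon as the partial sum exceeds the best distance found so far. In the paper the coordinates are PCA components ordered by discriminating power, which makes rejection happen early; plain invented vectors are used here:

```python
# Partial-distance nearest-neighbour search with early rejection, a sketch
# of the sequential-computation idea (PCA ordering not reproduced here).
def sequential_nn(query, candidates):
    best_idx, best_d2 = -1, float("inf")
    for idx, cand in enumerate(candidates):
        d2 = 0.0
        for q, c in zip(query, cand):
            d2 += (q - c) ** 2
            if d2 >= best_d2:          # early rejection: cannot beat the best
                break
        else:                          # ran through all coordinates: new best
            best_idx, best_d2 = idx, d2
    return best_idx, best_d2

spectra = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0), (1.0, 1.0, 1.0)]
idx, d2 = sequential_nn((0.9, 1.1, 1.0), spectra)
```

When the leading coordinates carry most of the variance, as after PCA, most candidates are rejected after only a few terms, which is where the reported speed-up comes from.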

  5. Unsupervised Classification of Images: A Review

    OpenAIRE

    Abass Olaode; Golshah Naghdy; Catherine Todd

    2014-01-01

    Unsupervised image classification is the process by which each image in a dataset is identified to be a member of one of the inherent categories present in the image collection without the use of labelled training samples. Unsupervised categorisation of images relies on unsupervised machine learning algorithms for its implementation. This paper identifies clustering algorithms and dimension reduction algorithms as the two main classes of unsupervised machine learning algorithms needed in unsu...

  6. Multicore Processing for Clustering Algorithms

    Directory of Open Access Journals (Sweden)

    Rekhansh Rao

    2012-03-01

    Full Text Available Data mining algorithms such as classification and clustering are the future of computation, though multidimensional data processing is required. People are using multicore processors with GPUs. Most programming languages do not provide multiprocessing facilities, so processing resources are wasted. Clustering and classification algorithms are especially resource consuming. In this paper we show strategies to overcome such deficiencies using the multicore processing platform OpenCL.

  7. Texture Image Classification Based on Gabor Wavelet

    Institute of Scientific and Technical Information of China (English)

    DENG Wei-bing; LI Hai-fei; SHI Ya-li; YANG Xiao-hui

    2014-01-01

    For a texture image, by recognizing the class of every pixel of the image, it can be partitioned into disjoint regions of uniform texture. This paper proposes a texture image classification algorithm based on the Gabor wavelet. In this algorithm, the feature of each pixel is obtained from the pixel and its neighborhood, and the algorithm can transfer information between neighborhoods of different sizes. Experiments on the standard Brodatz texture image dataset show that the proposed algorithm achieves good classification rates.

  8. Supervised Classification Performance of Multispectral Images

    CERN Document Server

    Perumal, K

    2010-01-01

    Nowadays government and private agencies use remote sensing imagery for a wide range of applications, from military applications to farm development. The images may be panchromatic, multispectral, hyperspectral or even ultraspectral, amounting to terabytes. Remote sensing image classification is one of the most significant application fields of remote sensing. A number of image classification algorithms have demonstrated good precision in classifying remote sensing data. But, of late, due to the increasing spatiotemporal dimensions of remote sensing data, traditional classification algorithms have exposed weaknesses, necessitating further research in the field of remote sensing image classification. So an efficient classifier is needed to classify remote sensing images and extract information. We experiment with both supervised and unsupervised classification. Here we compare the different classification methods and their performances. It is found that the Mahalanobis classifier performed the best in our...

  9. Application of Decision Tree ID3 Algorithm in Classification of Customer Information

    Institute of Scientific and Technical Information of China (English)

    吴建源

    2014-01-01

    In modern enterprises, how to retain customers is an important research direction of enterprise customer management. This paper uses the decision tree ID3 algorithm to analyze the characteristics of customer attributes and classify customer information, finding the characteristics of each class of customers so that customer relationships can be improved in a targeted way, customer loss avoided, and market share increased.
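ID3's attribute choice is by information gain (entropy reduction), sketched here on invented customer records:

```python
# ID3's core computation: pick the attribute with the highest information
# gain against the target label. Customer records are invented toy data.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target="churned"):
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

customers = [
    {"plan": "basic",   "region": "north", "churned": True},
    {"plan": "basic",   "region": "south", "churned": True},
    {"plan": "premium", "region": "north", "churned": False},
    {"plan": "premium", "region": "south", "churned": False},
]
best = max(["plan", "region"], key=lambda a: info_gain(customers, a))
```

Here `plan` separates churners perfectly (gain 1 bit) while `region` carries no information (gain 0), so ID3 would split on `plan` first and recurse on each branch.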

  10. Entire Solution Path for Support Vector Machine for Positive and Unlabeled Classification

    Institute of Scientific and Technical Information of China (English)

    YAO Limin; TANG Jie; LI Juanzi

    2009-01-01

    Support vector machines (SVMs) aim to find an optimal separating hyper-plane that maximizes separation between two classes of training examples (more precisely, maximizes the margin between the two classes of examples). The choice of the cost parameter for training the SVM model is always a critical issue. This analysis studies how the cost parameter determines the hyper-plane, especially for classifications using only positive data and unlabeled data. An algorithm is given for the entire solution path by choosing the 'best' cost parameter while training the SVM model. The performance of the algorithm is compared with conventional implementations that use default values as the cost parameter on two synthetic data sets and two real-world data sets. The results show that the algorithm achieves better results when dealing with positive-data and unlabeled classification.

  11. Improvement and Validation of the BOAT Algorithm

    Directory of Open Access Journals (Sweden)

    Yingchun Liu

    2014-04-01

    Full Text Available The main objective of this paper is to improve the BOAT classification algorithm and apply it to credit card big data analysis. The decision tree algorithm is a data-analysis method for classification that can be used to extract important data-class models or predict future data trends. The BOAT algorithm reduces read and write operations on the data, and the improved algorithm raises operating efficiency on large data sets, in line with popular big data analysis. The improvement in this paper further raises the performance of the algorithm, including over distributed data sources. Credit card data from large banking sectors serve as the test data sets, on which the improved algorithm, the original BOAT algorithm, and other classical classification algorithms are compared and analyzed.

  12. AN APPROACH FOR BREAST CANCER DIAGNOSIS CLASSIFICATION USING NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    Htet Thazin Tike Thein

    2015-01-01

    Full Text Available The artificial neural network has been widely used in various fields as an intelligent tool in recent years, such as artificial intelligence, pattern recognition, medical diagnosis, machine learning and so on. The classification of breast cancer is a medical application that poses a great challenge for researchers and scientists. Recently, the neural network has become a popular tool in the classification of cancer datasets, and classification is one of the most active research and application areas of neural networks. The major disadvantages of the artificial neural network (ANN) classifier are its sluggish convergence and its tendency to become trapped in local minima. To overcome this problem, the differential evolution algorithm (DE) has been used to determine optimal or near-optimal values for ANN parameters, and DE has been applied successfully to improve ANN learning in previous studies. However, there are still some issues with the DE approach, such as longer training time and lower classification accuracy. To overcome these problems, an island-based model is proposed in this system. The aim of our study is to propose an approach for distinguishing between different classes of breast cancer, based on the Wisconsin Diagnostic and Prognostic Breast Cancer data and the classification of different types of breast cancer datasets. The proposed system implements the island-based training method to achieve better accuracy and less training time, using and analysing two different migration topologies.
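The DE optimizer named above can be sketched in its classic DE/rand/1/bin form, here minimizing a stand-in loss (the sphere function) rather than real ANN weights; the island model of the paper would run several such populations and occasionally migrate individuals between them:

```python
# Classic DE/rand/1/bin sketch: mutate with a scaled difference of two
# population members, crossover with rate CR, keep the trial if it is no
# worse. The loss is a toy sphere function standing in for ANN training error.
import random

def de_minimize(loss, dim, pop_size=20, F=0.5, CR=0.9, gens=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            # pick three distinct members other than i
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            trial = [a[d] + F * (b[d] - c[d]) if rng.random() < CR
                     else pop[i][d]
                     for d in range(dim)]
            if loss(trial) <= loss(pop[i]):   # greedy selection
                pop[i] = trial
    return min(pop, key=loss)

def sphere(x):
    return sum(v * v for v in x)

best = de_minimize(sphere, dim=3)
```

For ANN training, `dim` would be the number of weights and `loss` the network's training error; an island model runs this loop per island with periodic migration of the best individuals.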

  13. Classification des rongeurs

    OpenAIRE

    Mignon, Jacques; Hardouin, Jacques

    2003-01-01

    Readers of the BEDIM Bulletin sometimes seem to have difficulties with the scientific classification of the animals known as "rodents" in everyday language. Given the disputes that still surround this classification today, this is hardly surprising. The brief synthesis that follows concerns animals that are, or could become, part of mini-livestock farming. The note aims at providing the main characteristics of the principal families of rodents relevan...

  14. Application of a New Fuzzy Clustering Algorithm in Intrusion Detection

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    This paper presents a new Section Set Adaptive FCM algorithm. The algorithm resolves the earlier shortcomings of local optimality, uncertain classification, and the need to fix the number of clusters in advance. It improves on the architecture of the FCM algorithm and enhances the analysis for effective clustering: during the clustering process it may adjust the number of clusters dynamically, and it finally uses the section-set method to decrease classification time. Experiments show that the algorithm can improve the dependability of clustering and the correctness of classification.
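Standard FCM, on which the Section Set variant builds, assigns each point a membership in every cluster rather than a hard label. One membership update for the usual fuzzifier m = 2 (1-D points for brevity):

```python
# One fuzzy C-means membership update for fuzzifier m = 2:
# u_j = 1 / sum_k (d_j / d_k)^2, where d_j is the distance to centre j.
# This is plain FCM; the abstract's Section Set adaptation (dynamic cluster
# counts) is layered on top of updates like this.
def memberships(point, centers):
    d = [abs(point - c) for c in centers]
    if 0.0 in d:                       # point coincides with a centre
        return [1.0 if di == 0.0 else 0.0 for di in d]
    return [1.0 / sum((d[j] / d[k]) ** 2 for k in range(len(centers)))
            for j in range(len(centers))]

u_mid = memberships(1.0, [0.0, 2.0])    # equidistant point: 0.5 / 0.5
u_near = memberships(0.5, [0.0, 2.0])   # closer to the first centre
```

A full FCM iteration alternates this membership update with recomputing each centre as the membership-weighted mean of the points.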

  15. Emergency Logistics Service Facilities Center Location Algorithm Based on Relative Classification of Disaster Degree

    Institute of Scientific and Technical Information of China (English)

    骆达荣

    2013-01-01

    While ordinary logistics usually takes cost efficiency as its primary objective, emergency logistics pursues maximum time efficiency and minimum disaster loss, so methods for locating ordinary logistics service centers often do not meet the requirements of emergency logistics service center location. Moreover, existing location models have certain shortcomings and do not consider the degree of damage at each affected point or the resulting differences in demand for emergency supplies. An emergency logistics service facility center location algorithm based on relative classification of disaster degree is therefore put forward in this paper. In the algorithm, the more seriously damaged points are selected as service facility points using a clustering method, and the emergency service center location is then set using the center-of-gravity method. Finally, a simulation shows the algorithm is effective and feasible.
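The center-of-gravity step mentioned above reduces to a demand-weighted mean of the service points; the coordinates and weights below are invented, whereas the paper derives the weights from the relative disaster-degree classification:

```python
# Centre-of-gravity facility location, sketched: the candidate centre is the
# demand-weighted mean of the service points. Sites and demand weights are
# invented toy data.
def gravity_center(points, weights):
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    return (x, y)

sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
demand = [1.0, 1.0, 2.0]          # e.g. higher disaster degree -> more weight
center = gravity_center(sites, demand)
```

Doubling a site's weight pulls the centre toward it, which is how a disaster-degree classification would translate into the final location.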

  16. A Super-resolution Algorithm for Face Images Based on Pre-classification and Matching

    Institute of Scientific and Technical Information of China (English)

    窦翔; 陶青川

    2015-01-01

    Existing example-based super-resolution algorithms for face images adopt a global search, which causes non-local mismatches and poor visual quality in the restored image. A new matching- and learning-based face image super-resolution restoration algorithm is proposed. First, a pre-classification of the input image yields a sub-sample library from the image library, and the corresponding feature images are created. In the matching process, two new search strategies for different face images are used that consider the similarity and consistency between image patches, making the recovered image look more coherent and natural. Experimental results show that, compared with other methods, the proposed algorithm synthesizes high-resolution faces with better visual quality and obtains a higher average peak signal-to-noise ratio (PSNR), and thus has good practical value.

  17. Multiply Connected Lie Group Covering Learning Algorithm for Image Classification

    Institute of Scientific and Technical Information of China (English)

    严晨; 李凡长; 邹鹏

    2014-01-01

    As a novel learning paradigm, Lie group machine learning has attracted much attention in academia. Using the connectivity of Lie groups, this paper maps research objects with different category characteristics into a multiply connected Lie group space. Based on the homotopy equivalence of paths on each simply connected Lie group, it uses the idea of covering to find an equivalent representation of the optimal path for each category, so that the multiple-valued representation of the multiply connected Lie group presents the category information of images. A new covering learning algorithm on multiply connected Lie groups is thus proposed. Experimental results on the MPEG7_CE-Shape01_Part_B and MNIST handwritten digit datasets show that the proposed algorithm has better classification performance than two algorithms based on Lie group means.

  18. Incrementally Maintaining Classification using an RDBMS

    OpenAIRE

    Koc, Mehmet Levent; Ré, Christopher

    2011-01-01

    The proliferation of imprecise data has motivated both researchers and the database industry to push statistical techniques into relational database management systems (RDBMSs). We study algorithms to maintain model-based views for a popular statistical technique, classification, inside an RDBMS in the presence of updates to the training examples. We make three technical contributions: (1) An algorithm that incrementally maintains classification inside an RDBMS. (2) An analysis of the above a...

  19. Application of Data Mining in Protein Sequence Classification

    Directory of Open Access Journals (Sweden)

    Suprativ Saha

    2012-11-01

    Full Text Available Protein sequence classification involves feature selection for accurate classification. Popular protein sequence classification techniques involve extraction of specific features from the sequences. Researchers apply well-known classification techniques such as neural networks, genetic algorithms, Fuzzy ARTMAP, and rough set classifiers for accurate classification. This paper presents a review of three different classification models: the neural network model, the fuzzy ARTMAP model, and the rough set classifier model. This is followed by a new technique for classifying protein sequences. The proposed model is implemented with a purpose-built tool and aims to reduce the computational overheads encountered by earlier approaches and increase the accuracy of classification.

  20. Classification of positive blood cultures

    DEFF Research Database (Denmark)

    Gradel, Kim Oren; Knudsen, Jenny Dahl; Arpi, Magnus;

    2012-01-01

    For each classification, we tabulated episodes derived by the physicians' assessment and the computer algorithm and compared 30-day mortality between concordant and discrepant groups with adjustment for age, gender, and comorbidity. RESULTS: Physicians derived 9,482 reference episodes from 21,705 positive...

  1. Soil Classification Using GATree

    CERN Document Server

    Bhargavi, P

    2010-01-01

    This paper details the application of a genetic programming framework for decision-tree classification of soil data to classify soil texture. The database contains measurements of soil profile data. We have applied GATree for generating the classification decision tree. GATree is a decision tree builder based on Genetic Algorithms (GAs). The idea behind it is rather simple but powerful: instead of using statistical metrics that are biased towards specific trees, it uses a more flexible, global metric of tree quality that tries to optimize both accuracy and size. GATree offers some unique features not found in other tree inducers, while at the same time it can produce better results for many difficult problems. Experimental results are presented which illustrate its performance in generating the best decision tree for classifying soil texture on the soil data set.

  2. Graph Colouring Algorithms

    DEFF Research Database (Denmark)

    Husfeldt, Thore

    2015-01-01

    This chapter presents an introduction to graph colouring algorithms. The focus is on vertex-colouring algorithms that work for general classes of graphs with worst-case performance guarantees in a sequential model of computation. The presentation aims to demonstrate the breadth of available techniques and is organized by algorithmic paradigm.

  3. Development of a reject classification method, applied to the diagnosis of a nuclear reactor core: processing of thermal signals provided by out-of-reactor simulation

    International Nuclear Information System (INIS)

    Development of an evolution detection algorithm whose aim is to extend the application field of pattern recognition analysis to the diagnosis and follow-up of a complex system: study of the data from the out-of-reactor test loop with forced convection in sodium, and study and description of a reject classification algorithm developed from the general point of view of evolution detection. This method is tested with theoretical data and with experimental data provided by the second test loop ISIS.

  4. Discrimination and the aim of proportional representation

    DEFF Research Database (Denmark)

    Lippert-Rasmussen, Kasper

    2008-01-01

    Many organizations, companies, and so on are committed to certain representational aims as regards the composition of their workforce. One motivation for such aims is the assumption that numerical underrepresentation of groups manifests discrimination against them. In this article, I articulate representational aims in a way that best captures this rationale. My main claim is that the achievement of such representational aims is reducible to the elimination of the effects of wrongful discrimination on individuals and that this very important concern is, in principle, compatible with the representation... The time-relative account of representational aims expounded here shows that, and how, representational aims should accommodate the changing composition of populations over time.

  5. A Cascade Algorithm of Quantum Attribute Evolution Reduction and Classification Learning Based on Dynamic Crossover Cooperation

    Institute of Scientific and Technical Information of China (English)

    丁卫平; 王建东; 管致锦; 施佺

    2011-01-01

    Attribute reduction and rule classification learning are important topics in the research and application of rough set theory. Taking advantage of quantum computing to accelerate the algorithm and of the efficient cooperative search of the shuffled frog leaping algorithm, a cascade algorithm of attribute reduction and classification learning based on dynamic quantum frog-leaping crossover cooperation is proposed. Individuals in the frog swarm are encoded by multi-state gene qubits, and a dynamic quantum rotation angle adjustment strategy is applied to accelerate attribute-chromosome reduction. Within a rough-entropy thresholding classification criterion, classification rules are extracted and reduced, and decision rule chains are combined, through a quantum frog-swarm crossover coevolution mechanism. Finally, a dual-function cascade model of attribute reduction and classification learning is constructed. Simulation experiments show that the proposed algorithm not only has good global optimization performance, but also exceeds similar algorithms in the accuracy and efficiency of attribute reduction and rule classification learning.

  6. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is constructed.

  7. Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches

    Science.gov (United States)

    Çelik, Ufuk; Yurtay, Nilüfer; Koç, Emine Rabia; Tepe, Nermin; Güllüoğlu, Halil; Ertaş, Mustafa

    2015-01-01

    The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. After we took this main objective into consideration, three different neurologists were required to fill in the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms that are inspired by the biological immune system mechanism that involves significant and distinct capabilities. These algorithms simulate the specialties of the immune system such as discrimination, learning, and the memorizing process in order to be used for classification, optimization, or pattern recognition. According to the results, the accuracy level of the classifier used in this study reached a success continuum ranging from 95% to 99%, except for the inconvenient one that yielded 71% accuracy. PMID:26075014

  8. Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches

    Directory of Open Access Journals (Sweden)

    Ufuk Çelik

    2015-01-01

    Full Text Available The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. After we took this main objective into consideration, three different neurologists were required to fill in the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms that are inspired by the biological immune system mechanism that involves significant and distinct capabilities. These algorithms simulate the specialties of the immune system such as discrimination, learning, and the memorizing process in order to be used for classification, optimization, or pattern recognition. According to the results, the accuracy level of the classifier used in this study reached a success continuum ranging from 95% to 99%, except for the inconvenient one that yielded 71% accuracy.

  9. A Novel Approach to ECG Classification Based upon Two-Layered HMMs in Body Sensor Networks

    OpenAIRE

    Wei Liang; Yinlong Zhang; Jindong Tan; Yang Li

    2014-01-01

    This paper presents a novel approach to ECG signal filtering and classification. Unlike the traditional techniques which aim at collecting and processing the ECG signals with the patient being still, lying in bed in hospitals, our proposed algorithm is intentionally designed for monitoring and classifying the patient’s ECG signals in the free-living environment. The patients are equipped with wearable ambulatory devices the whole day, which facilitates the real-time heart attack detection. ...

  10. ARTIFICIAL BEE COLONY ALGORITHM INTEGRATED WITH FUZZY C-MEAN OPERATOR FOR DATA CLUSTERING

    Directory of Open Access Journals (Sweden)

    M. Krishnamoorthi

    2013-01-01

    Full Text Available The clustering task aims at the unsupervised classification of patterns into different groups. To enhance the quality of results, emerging swarm-based algorithms have nowadays become an alternative to conventional clustering methods. In this study, an optimization method based on a swarm intelligence algorithm is proposed for the purpose of clustering. The significance of the proposed algorithm is that it uses a Fuzzy C-Means (FCM) operator in the Artificial Bee Colony (ABC) algorithm. The FCM operator acts in the scout bee phase of the ABC algorithm, as the scout bees are introduced by the FCM operator. The experimental results show that the proposed approach provides significant results in terms of solution quality. A comparative study of the proposed approach with existing algorithms in the literature, using datasets from the UCI Machine Learning Repository, is satisfactory.
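The Fuzzy C-Means operator at the core of this record can be sketched as one standard FCM iteration (membership update followed by centroid update). This is a generic FCM step under the usual fuzzifier m = 2; the ABC integration and the toy data are not from the paper:

```python
import numpy as np

def fcm_step(X, centers, m=2.0):
    """One fuzzy c-means iteration: membership update, then centroid update.
    Generic sketch of the FCM operator only."""
    # distances from each point to each center, shape (n_points, n_clusters)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                       # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))
    U = inv / inv.sum(axis=1, keepdims=True)    # memberships sum to 1 per point
    Um = U ** m
    new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return U, new_centers

# two tight clusters; memberships should be nearly crisp after one step
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
U, centers = fcm_step(X, centers=np.array([[0.0, 0.0], [5.0, 5.0]]))
```

In the paper's scheme, a step like this would replace random scout-bee generation in the ABC loop.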

  11. A novel block-classification and difference-expansion based reversible data hiding algorithm

    Institute of Scientific and Technical Information of China (English)

    宋伟; 侯建军; 李赵红

    2011-01-01

    The good statistical character of image blocks was used to present a novel reversible data hiding algorithm based on block classification and difference expansion. The host image was divided into a data embedding area and an auxiliary information area. In the former, the type of each image block was determined by the statistical relationship between the surrounding image blocks and the target image block, so that a different amount of data was embedded according to the block type. At the same time, data were embedded in the direction of least impact, selected by a direction-determination criterion, which solved the problem of image distortion caused by a single embedding direction. In the auxiliary information area, modified prediction-error (MPE) embedding, which has zero overflow, was used to embed the location map and other auxiliary information, avoiding the new location map that embedding auxiliary information would otherwise generate. The algorithm was simulated on different types of images and the results were compared with those of existing algorithms. The results show that the proposed algorithm performs well: it not only increases the amount of embedded data but also improves the quality of the embedded images.

  12. Automatic design of decision-tree algorithms with evolutionary algorithms.

    Science.gov (United States)

    Barros, Rodrigo C; Basgalupp, Márcio P; de Carvalho, André C P L F; Freitas, Alex A

    2013-01-01

    This study reports the empirical analysis of a hyper-heuristic evolutionary algorithm that is capable of automatically designing top-down decision-tree induction algorithms. Top-down decision-tree algorithms are of great importance, considering their ability to provide an intuitive and accurate knowledge representation for classification problems. The automatic design of these algorithms seems timely, given the large literature accumulated over more than 40 years of research in the manual design of decision-tree induction algorithms. The proposed hyper-heuristic evolutionary algorithm, HEAD-DT, is extensively tested using 20 public UCI datasets and 10 microarray gene expression datasets. The algorithms automatically designed by HEAD-DT are compared with traditional decision-tree induction algorithms, such as C4.5 and CART. Experimental results show that HEAD-DT is capable of generating algorithms which are significantly more accurate than C4.5 and CART.

  13. Innovating Web Page Classification Through Reducing Noise

    Institute of Scientific and Technical Information of China (English)

    LI Xiaoli (李晓黎); SHI Zhongzhi(史忠植)

    2002-01-01

    This paper presents a new method that eliminates noise in Web page classification. It first describes the representation of a Web page based on HTML tags. Then, through a novel distance formula, it eliminates noise in the similarity measure. After carefully analyzing Web pages, we design an algorithm that can distinguish related hyperlinks from noisy ones. We can utilize non-noisy hyperlinks to improve the performance of Web page classification (the CAWN algorithm). For any page, we can classify it through the text and category of the neighbor pages related to that page. The experimental results show that our approach improves classification accuracy.

  14. Online co-regularized algorithms

    NARCIS (Netherlands)

    Ruijter, T. de; Tsivtsivadze, E.; Heskes, T.

    2012-01-01

    We propose an online co-regularized learning algorithm for classification and regression tasks. We demonstrate that by sequentially co-regularizing prediction functions on unlabeled data points, our algorithm provides improved performance in comparison to supervised methods on several UCI benchmarks.

  15. Fault Tolerant Neural Network for ECG Signal Classification Systems

    Directory of Open Access Journals (Sweden)

    MERAH, M.

    2011-08-01

    Full Text Available The aim of this paper is to apply a new robust hardware Artificial Neural Network (ANN) to ECG classification systems. This ANN includes a penalization criterion which improves its performance in terms of robustness. Specifically, in this method, the ANN weights are normalized using the auto-prune method. Simulations performed on the MIT-BIH ECG signals have shown that significant robustness improvements are obtained with regard to potential hardware artificial neuron failures. Moreover, we show that the proposed design achieves better generalization performance compared to the standard back-propagation algorithm.
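The prune-then-normalize idea in this record can be illustrated with a minimal stand-in: zero out the smallest-magnitude weights and rescale the survivors so the layer's total weight magnitude is preserved. This is a generic sketch, not the authors' auto-prune procedure; the function name and keep ratio are assumptions:

```python
def prune_and_normalize(weights, keep_ratio=0.75):
    """Zero the smallest-|w| weights, then rescale so the L1 norm of the
    layer is unchanged. Generic stand-in for auto-prune normalization."""
    n_keep = max(1, int(len(weights) * keep_ratio))
    keep = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]),
                      reverse=True)[:n_keep])
    pruned = [w if i in keep else 0.0 for i, w in enumerate(weights)]
    # rescale survivors to preserve the original total magnitude
    scale = sum(abs(w) for w in weights) / sum(abs(w) for w in pruned)
    return [w * scale for w in pruned]

# keep the 2 largest of 4 weights; total |w| mass (1.36) is preserved
w = prune_and_normalize([0.9, -0.05, 0.4, 0.01], keep_ratio=0.5)
```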

  16. Medical image data mining using a classification algorithm based on association rules

    Institute of Scientific and Technical Information of China (English)

    邓薇薇; 卢延鑫

    2012-01-01

    Objective To assist clinicians in the diagnosis and treatment of brain disease, a classifier for medical images containing tumors, based on association-rule data mining techniques, was constructed. Methods After a pre-processing phase, the relevant features were extracted from the medical images and discretized as the input of association rules; the medical image classifier was then constructed with an improved Apriori algorithm. Results The medical image classifier was constructed. Medical images of known type were used to train the classifier so as to mine the association rules that satisfy the constraint conditions. The brain tumor in a medical image of unknown type was then classified by the constructed classifier. Conclusion A classification algorithm based on association rules can effectively mine image features and construct an image classifier to identify benign and malignant tumors.
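The frequent-itemset mining that underlies association-rule classification can be sketched with a minimal Apriori pass. This is textbook Apriori over toy "image feature" transactions; the paper's improved Apriori and its rule-construction step are not reproduced, and the feature names are invented for illustration:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori frequent-itemset miner: grow itemsets level by level,
    keeping only those whose support meets the threshold."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    items = sorted({i for t in sets for i in t})
    freq = {}
    # L1: frequent single items
    current = [frozenset([i]) for i in items
               if sum(i in t for t in sets) / n >= min_support]
    k = 1
    while current:
        for c in current:
            freq[c] = sum(c <= t for t in sets) / n
        k += 1
        # candidates: unions of frequent (k-1)-itemsets that have size k
        cands = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        current = [c for c in cands if sum(c <= t for t in sets) / n >= min_support]
    return freq

# toy discretized image features per image (names are illustrative)
tx = [{"edema", "mass"}, {"edema", "mass", "ring"}, {"edema"}, {"mass"}]
freq = apriori(tx, min_support=0.5)
```

A classifier would then keep only rules whose consequent is a class label (e.g. benign/malignant) and that satisfy confidence constraints.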

  17. Video Detection System Design for Traffic Incidents Based on Speed Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    熊昕; 徐建闽

    2013-01-01

    A real-time video traffic incident detection method based on a speed classification algorithm is proposed. In addition, the traffic detection method, cross-lane vehicle handling, speed detection, traffic flow detection, and the identification of traffic events are discussed. Based on vehicle detection and tracking, events such as vehicle stops, lane changes, slow traffic, and congestion can be identified and detected automatically, yielding traffic flow, occupancy, queue length, vehicle type, average speed, and other traffic parameters. In comparison with traditional traffic incident detection systems, the system is intuitive, convenient, and low-cost, and has good market demand and practical value.

  18. Research on Gastric Cancer Clinical Medical Data Mining Based on the SPRINT Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    郑丹青

    2012-01-01

    To meet the demand for data mining, a decision-tree based model is proposed for gastric cancer clinical medical information analysis and application. The model is developed from the existing operational database or data warehouse, from which the factors related to gastric cancer recurrence are extracted to form a decision-tree training data set. Using the SPRINT classification algorithm, the model is capable of analyzing the risk factors for gastric cancer recurrence. Based on the analysis of all the potential factors affecting clinical diagnosis, treatment, and prognosis, the model confirmed that the primary risk factor for gastric cancer recurrence is family heredity.

  19. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  20. DIAGNOSIS OF DIABETES USING CLASSIFICATION MINING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Aiswarya Iyer

    2015-01-01

    Full Text Available Diabetes has affected over 246 million people worldwide, with a majority of them being women. According to the WHO report, by 2025 this number is expected to rise to over 380 million. The disease has been named the fifth-deadliest disease in the United States, with no imminent cure in sight. With the rise of information technology and its continued advent into the medical and healthcare sector, cases of diabetes as well as their symptoms are well documented. This paper aims at finding solutions to diagnose the disease by analyzing the patterns found in the data through classification analysis, employing Decision Tree and Naïve Bayes algorithms. The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients.
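One of the two classifiers this record employs, Naïve Bayes, can be sketched from scratch for continuous features. The Gaussian variant, feature names, and synthetic rows below are illustrative assumptions, not the paper's data or exact model:

```python
import math

def fit_gnb(X, y):
    """Gaussian Naive Bayes: per-class feature means/variances plus priors."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        vars_ = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                 for col, m in zip(zip(*rows), means)]
        model[c] = (len(rows) / len(y), means, vars_)
    return model

def predict_gnb(model, x):
    """Pick the class maximizing log prior + Gaussian log likelihood."""
    def log_post(c):
        prior, means, vars_ = model[c]
        ll = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                 for xi, m, v in zip(x, means, vars_))
        return math.log(prior) + ll
    return max(model, key=log_post)

# toy (glucose, BMI) rows; label 1 = diabetic, 0 = not (synthetic values)
X = [[85, 22], [90, 24], [160, 33], [155, 31]]
y = [0, 0, 1, 1]
model = fit_gnb(X, y)
pred = predict_gnb(model, [150, 30])
```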

  1. Evaluation for Uncertain Image Classification and Segmentation

    CERN Document Server

    Martin, Arnaud; Arnold-Bos, Andreas

    2008-01-01

    Each year, numerous segmentation and classification algorithms are invented or reused to solve problems where machine vision is needed. Generally, the efficiency of these algorithms is compared against the results given by one or many human experts. However, in many situations, the location of the real boundaries of the objects as well as their classes are not known with certainty by the human experts. Furthermore, only one aspect of the segmentation and classification problem is generally evaluated. In this paper we present a new evaluation method for image classification and segmentation, in which we take into account both the classification and segmentation results as well as the level of certainty given by the experts. As a concrete example of our method, we evaluate an automatic seabed characterization algorithm based on sonar images.

  2. Spectral band selection for classification of soil organic matter content

    Science.gov (United States)

    Henderson, Tracey L.; Szilagyi, Andrea; Baumgardner, Marion F.; Chen, Chih-Chien Thomas; Landgrebe, David A.

    1989-01-01

    This paper describes the spectral-band-selection (SBS) algorithm of Chen and Landgrebe (1987, 1988, and 1989) and uses the algorithm to classify the organic matter content in the earth's surface soil. The effectiveness of the algorithm was evaluated by comparing the results of classification of soil organic matter using SBS bands with those obtained using Landsat MSS bands and TM bands, showing that the algorithm was successful in finding important spectral bands for classification of organic matter content. Using the calculated bands, the probabilities of correct classification for climate-stratified data were found to range from 0.910 to 0.980.

  3. Landslide hazards mapping using uncertain Naïve Bayesian classification method

    Institute of Scientific and Technical Information of China (English)

    毛伊敏; 张茂省; 王根龙; 孙萍萍

    2015-01-01

    Landslide hazard mapping is a fundamental tool for disaster management activities in loess terrains. Aiming at a major issue with landslide hazard assessment methods based on the Naïve Bayesian classification technique, namely the difficulty of quantifying uncertain triggering factors, the main purpose of this work is to evaluate the predictive power of landslide spatial models based on an uncertain Naïve Bayesian classification method in the Baota district of Yan'an city, Shaanxi province, China. Firstly, thematic maps representing various factors related to landslide activity were generated. Secondly, using field data and GIS techniques, a landslide hazard map was produced. To improve the accuracy of the resulting landslide hazard map, strategies were designed that quantified the uncertain triggering factors in landslide spatial models based on an uncertain Naïve Bayesian classification method, named the NBU algorithm. The accuracies in terms of the area under the relative operating characteristic curve (AUC) of the NBU and Naïve Bayesian algorithms are 87.29% and 82.47%, respectively. Thus, the NBU algorithm can be used efficiently for landslide hazard analysis and might be widely applied to the prediction of various spatial events based on uncertain classification techniques.

  4. Aims of education in South Africa

    Science.gov (United States)

    Morrow, Walter Eugene

    1990-06-01

    The first part of this paper gives a historical account of the aims of education under Apartheid, and discusses the ideological success of Apartheid education. The second part argues that a significant discussion — that is one which could have some purchase on schooling policy and educational practice — of aims of education in South Africa is not possible at present because the historical preconditions for such a discussion are not satisfied. It is argued that Apartheid has generated a political perspective which is unsympathetic to a discussion of aims of education; that the dominance of a social engineering model of schooling distorts a discussion of aims of education; and that a shared moral discourse, which is a necessary condition for a significant discussion of aims of education, does not yet exist in South Africa.

  5. Seismic texture classification. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Vinther, R.

    1997-12-31

    The seismic texture classification method is a seismic attribute that can both recognize general reflectivity styles and locate variations from them. The seismic texture classification performs a statistical analysis of the seismic section (or volume) aimed at describing the reflectivity. Based on a set of reference reflectivities, the seismic textures are classified. The result of the seismic texture classification is a display of seismic texture categories showing both the styles of reflectivity from the reference set and interpolations and extrapolations from these. The display is interpreted as statistical variations in the seismic data. The seismic texture classification is applied to seismic sections and volumes from the Danish North Sea representing both horizontal stratifications and salt diapirs. The attribute succeeded in recognizing both the general structure of successions and variations from these. Also, the seismic texture classification is not only able to display variations in prospective areas (1-7 sec. TWT) but can also be applied to deep seismic sections. The seismic texture classification is tested on a deep reflection seismic section (13-18 sec. TWT) from the Baltic Sea. Applied to this section, the seismic texture classification succeeded in locating the Moho, which could not be located using conventional interpretation tools. The seismic texture classification is a seismic attribute which can display general reflectivity styles and deviations from these and enhance variations not found by conventional interpretation tools. (LN)

  6. Efficient Cancer Classification using Fast Adaptive Neuro-Fuzzy Inference System (FANFIS) based on Statistical Techniques

    Directory of Open Access Journals (Sweden)

    K.Ananda Kumar

    2011-09-01

    Full Text Available An increasing number of cancers is detected throughout the world. This leads to the requirement of developing new techniques which can detect the occurrence of cancer, helping better diagnosis and reducing the number of cancer patients. This paper aims at finding the smallest set of genes that can ensure highly accurate classification of cancer from microarray data by using supervised machine learning algorithms. The significance of finding the minimum subset is threefold: (a) the computational burden and noise arising from irrelevant genes are much reduced; (b) the cost of cancer testing is reduced significantly as it simplifies the gene expression tests to include only a very small number of genes rather than thousands of genes; (c) it calls for more investigation into the probable biological relationship between these small numbers of genes and cancer development and treatment. The proposed method involves two steps. In the first step, some important genes are chosen with the help of an Analysis of Variance (ANOVA) ranking scheme. In the second step, the classification capability is tested for all simple combinations of those important genes using a better classifier. The proposed method uses a Fast Adaptive Neuro-Fuzzy Inference System (FANFIS) as the classification model. This classification model uses a modified Levenberg-Marquardt algorithm for the learning phase. The experimental results suggest that the proposed method results in better accuracy and also takes less time for classification when compared to conventional techniques.
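The ANOVA ranking step in this record scores each gene by how well its expression separates the classes. The one-way F statistic can be computed directly; the toy expression values and gene names below are invented for illustration, and the FANFIS classifier itself is not reproduced:

```python
def anova_f(values_by_class):
    """One-way ANOVA F statistic for a single gene: between-class variance
    over within-class variance. Higher F means better class separation."""
    groups = list(values_by_class)
    all_vals = [v for g in groups for v in g]
    grand = sum(all_vals) / len(all_vals)
    k, n = len(groups), len(all_vals)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# gene A separates the two classes cleanly, gene B does not
gene_a = anova_f([[1.0, 1.1, 0.9], [5.0, 5.2, 4.8]])
gene_b = anova_f([[1.0, 5.0, 3.0], [1.1, 4.9, 3.2]])
scores = {"A": gene_a, "B": gene_b}
top_gene = max(scores, key=scores.get)
```

Ranking genes by F and keeping the top few is what shrinks the candidate set before the combination search in step two.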

  7. Machine Learning for Biological Trajectory Classification Applications

    Science.gov (United States)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms are compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.
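Classifiers like those in this record typically consume numeric descriptors of each trajectory. Two simple ones, mean step length and mean absolute turning angle, can be computed as below; these are illustrative features, not the ones used in the paper:

```python
import math

def trajectory_features(points):
    """Return (mean step length, mean absolute turning angle in radians)
    for a list of (x, y) points. Illustrative descriptors only."""
    steps, turns = [], []
    for i in range(1, len(points)):
        (x0, y0), (x1, y1) = points[i - 1], points[i]
        steps.append(math.hypot(x1 - x0, y1 - y0))
        if i >= 2:
            xp, yp = points[i - 2]
            a0 = math.atan2(y0 - yp, x0 - xp)          # previous heading
            a1 = math.atan2(y1 - y0, x1 - x0)          # current heading
            # wrap the heading change into (-pi, pi] before taking |.|
            turns.append(abs(math.atan2(math.sin(a1 - a0), math.cos(a1 - a0))))
    return sum(steps) / len(steps), (sum(turns) / len(turns)) if turns else 0.0

straight = trajectory_features([(0, 0), (1, 0), (2, 0), (3, 0)])
wiggly = trajectory_features([(0, 0), (1, 0), (1, 1), (0, 1)])
```

A straight path has zero mean turning angle, while the square path turns 90 degrees at each step; a downstream SVM or HMM would separate such motion styles.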

  8. Fuzzy One-Class Classification Model Using Contamination Neighborhoods

    Directory of Open Access Journals (Sweden)

    Lev V. Utkin

    2012-01-01

    Full Text Available A fuzzy classification model is studied in the paper. It is based on the contaminated (robust) model which produces fuzzy expected risk measures characterizing classification errors. Optimal classification parameters of the model are derived by minimizing the fuzzy expected risk. It is shown that an algorithm for computing the classification parameters reduces to a set of standard support vector machine tasks with weighted data points. Experimental results with synthetic data illustrate the proposed fuzzy model.

  9. Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment

    Directory of Open Access Journals (Sweden)

    Rashmi Mukherjee

    2014-01-01

    Full Text Available The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images grabbed by normal digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the "S" component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793).

  10. Automated tissue classification framework for reproducible chronic wound assessment.

    Science.gov (United States)

    Mukherjee, Rashmi; Manohar, Dhiraj Dhane; Das, Dev Kumar; Achar, Arun; Mitra, Analava; Chakraborty, Chandan

    2014-01-01

    The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images grabbed by normal digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the "S" component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793).
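
    The RGB-to-HSI step used in this work can be sketched per pixel. The formulas below are the textbook HSI definitions (an assumption, since the paper does not spell out its exact conversion variant):

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (hue, saturation, intensity).

    The "S" channel is what the wound-segmentation pipeline keeps, because
    it gives higher wound/skin contrast than the raw RGB channels.
    """
    total = r + g + b
    i = total / 3.0
    s = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total
    # Hue from the usual arccos form; set to 0 for achromatic pixels.
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        h = 0.0
    else:
        theta = math.acos(max(-1.0, min(1.0, num / den)))
        h = theta if b <= g else 2.0 * math.pi - theta
    return h, s, i

# A reddish granulation-like pixel is far more saturated than a pale one,
# which is why thresholding on S separates wound from surrounding skin.
_, s_red, _ = rgb_to_hsi(0.8, 0.2, 0.2)
_, s_pale, _ = rgb_to_hsi(0.8, 0.75, 0.7)
```

    Applying this to every pixel and thresholding the resulting S image is the shape of the segmentation front end described above.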

  11. On the Aims and Responsibilities of Science

    Directory of Open Access Journals (Sweden)

    Hugh Lacey

    2007-06-01

    Full Text Available I offer a view of the aims and responsibilities of science, and use it to analyze critically van Fraassen’s view that ‘objectifying inquiry’ is fundamental to the nature of science.

  12. Presentation of Database on Aims and Visions

    DEFF Research Database (Denmark)

    Lundgaard, Jacob

    2005-01-01

    This presentation presents a database on aims and visions regarding regional development and transport and infrastructure in the corridor from Oslo-Göteborg-Copenhagen-Berlin. The process, the developed methodology, as well as the resulting database are presented.

  13. HIV classification using coalescent theory

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Ming [Los Alamos National Laboratory]; Leitner, Thomas K [Los Alamos National Laboratory]; Korber, Bette T [Los Alamos National Laboratory]

    2008-01-01

    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1; hence, their quality depends heavily on this system. Owing to the history of the creation of the current HIV-1 nomenclature, it contains inconsistencies such as: the phylogenetic distance between subtypes B and D is remarkably small compared with other pairs of subtypes; in fact, it is more like the distance between a pair of sub-subtypes (Robertson et al., 2000). Subtypes E and I no longer exist, since they were discovered to be composed of recombinants (Robertson et al., 2000). It is currently discussed whether -- instead of CRF02 being a recombinant of subtypes A and G -- subtype G should be designated as a circulating recombinant form (CRF) and CRF02 as a subtype (Abecasis et al., 2007). There are 8 complete and over 400 partial HIV genomes in the LANL database which belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary, like all complex classification systems that were created manually. It is therefore desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV but applies to all fast-mutating and recombining viruses. Our work addresses the simpler subproblem of scoring classifications of given input sequences of some virus species (a classification denotes a partition of the input sequences into subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARGs) of the input sequences under restrictions determined by the given classification; these restrictions are imposed to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification.
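
    The core idea, scoring a candidate classification by the probability of the most probable configuration found via Markov chain Monte Carlo, can be illustrated with a generic Metropolis-Hastings loop that tracks the best state seen. The one-dimensional toy target below is purely illustrative; real ARG state spaces and their proposal moves are far more involved:

```python
import math
import random

def mh_max_probability(log_prob, propose, init, steps=2000, seed=42):
    """Metropolis-Hastings with symmetric proposals, recording the best state.

    Returns the highest-probability state encountered and its log-probability,
    which (as in the abstract) can serve as a score for the search space
    defined by a candidate classification.
    """
    rng = random.Random(seed)
    state = init
    lp = log_prob(state)
    best_state, best_lp = state, lp
    for _ in range(steps):
        cand = propose(state, rng)
        cand_lp = log_prob(cand)
        # Accept with probability min(1, p(cand)/p(state)).
        if cand_lp >= lp or rng.random() < math.exp(cand_lp - lp):
            state, lp = cand, cand_lp
            if lp > best_lp:
                best_state, best_lp = state, lp
    return best_state, best_lp

# Toy target over integers: log p(x) = -(x - 7)^2, so the walk should find 7.
best, score = mh_max_probability(
    log_prob=lambda x: -float((x - 7) ** 2),
    propose=lambda x, rng: x + rng.choice((-1, 1)),
    init=0,
)
```

    Running one such chain per candidate classification and comparing the returned scores is the scoring scheme the abstract outlines.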

  14. Malware Detection, Supportive Software Agents and Its Classification Schemes

    Directory of Open Access Journals (Sweden)

    Adebayo, Olawale Surajudeen

    2012-12-01

    Full Text Available Over time, the task of curbing the emergence of malware and its dastardly activities has been identified in terms of analysis, detection and containment of malware. Malware is a general term used to describe the category of malicious software that is part of security threats to the computer and internet system. It is a malignant program designed to hamper the effectiveness of a computer and internet system. This paper aims at identifying malware as one of the most dreaded threats to emerging computer and communication technology. The paper identifies the categories of malware, malware classification algorithms, malware activities, and ways of preventing and removing malware if it eventually infects a system. The research also describes tools that classify a malware dataset using a rule-based classification scheme and machine learning algorithms to distinguish malicious programs from normal programs through pattern recognition.
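
    A rule-based classification scheme of the kind the survey mentions can be sketched as an ordered list of if-then rules over behavioural features. The feature names, thresholds, and labels below are hypothetical, chosen only to show the general shape:

```python
def classify_sample(features):
    """Label a program from simple behavioural features using ordered rules.

    Rules are checked top to bottom; the first matching rule wins, and a
    sample matching no rule is treated as benign.
    """
    rules = [
        (lambda f: f["writes_to_system_dir"] and f["autostart_entry"], "trojan"),
        (lambda f: f["self_replicates"] and f["network_scans"] > 10, "worm"),
        (lambda f: f["encrypts_user_files"], "ransomware"),
    ]
    for condition, label in rules:
        if condition(features):
            return label
    return "benign"

# A sample that replicates itself and scans the network aggressively.
sample = {
    "writes_to_system_dir": False,
    "autostart_entry": False,
    "self_replicates": True,
    "network_scans": 40,
    "encrypts_user_files": False,
}
verdict = classify_sample(sample)
```

    Machine learning approaches replace the hand-written conditions with learned decision boundaries, but the feature-vector-in, label-out interface stays the same.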

  15. Is Fitts' law continuous in discrete aiming?

    Directory of Open Access Journals (Sweden)

    Rita Sleimen-Malkoun

    Full Text Available The lawful continuous linear relation between movement time and task difficulty (i.e., index of difficulty, ID) in a goal-directed rapid aiming task (Fitts' law) has recently been challenged in reciprocal performance. Specifically, a discontinuity was observed at a critical ID and was attributed to a transition between two distinct dynamic regimes that occurs with increasing difficulty. In the present paper, we show that such a discontinuity is also present in discrete aiming when ID is manipulated via target width (experiment 1) but not via target distance (experiment 2). Fitts' law's discontinuity therefore appears to be a suitable indicator of the underlying functional adaptations of the neuro-muscular-skeletal system to task properties/requirements, independently of the reciprocal or discrete nature of the task. These findings open new perspectives on the study of the dynamic regimes involved in discrete aiming and the sensori-motor mechanisms underlying the speed-accuracy trade-off.
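
    The two ID manipulations can be made concrete with the Shannon formulation of Fitts' law (an assumption; the abstract does not state which formulation was used). The same ID can be reached by shrinking the target width or by lengthening the movement distance, which is exactly the contrast between the two experiments:

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty (ID), in bits."""
    return math.log2(distance / width + 1.0)

def movement_time(distance, width, a=0.05, b=0.12):
    """Fitts' law MT = a + b * ID; a and b are illustrative regression constants."""
    return a + b * index_of_difficulty(distance, width)

# Matching IDs via the two different manipulations (units are arbitrary):
id_width = index_of_difficulty(distance=150, width=10)     # W manipulated: 4.0 bits
id_distance = index_of_difficulty(distance=300, width=20)  # D manipulated: also 4.0 bits
```

    Fitts' law predicts identical movement times for both conditions; the discontinuity reported above shows that the equivalence breaks down past a critical ID when width, but not distance, drives the difficulty.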

  16. PRAGUE SEMINAR ON LANGUAGE TEACHING AIMS, 1967.

    Science.gov (United States)

    FRIED, VILEM

    An international seminar, whose purpose was to discuss the structural differences in language teaching aims, is reported on in this article. The three papers presented by Ivan Poldauf, Valentina Zetlina, and John B. Carroll are reviewed for their discussions on linguistics, didactics, and psychology. The discussion following the presentation of…

  17. Aims and harvest of moral case deliberation.

    Science.gov (United States)

    Weidema, Froukje C; Molewijk, Bert A C; Kamsteeg, Frans; Widdershoven, Guy A M

    2013-09-01

    Deliberative ways of dealing with ethical issues in health care are expanding. Moral case deliberation is one example, providing group-wise, structured reflection on dilemmas from practice. Although moral case deliberation is well described in the literature, the aims and results of moral case deliberation sessions are unknown. This research shows (a) why managers introduce moral case deliberation and (b) what moral case deliberation participants experience as its results. A responsive evaluation was conducted, explicating moral case deliberation experiences by analysing aims (N = 78) and harvest (N = 255). Naturalistic data collection included interviews with managers and evaluation questionnaires from moral case deliberation participants (nurses). The analysis shows that moral case deliberation fosters cooperation, team bonding, a critical attitude towards routines, and nurses' empowerment. A difference is that managers aim to foster the identity of the nursing profession, whereas nurses emphasize learning processes and understanding perspectives. We conclude that moral case deliberation influences team cooperation in ways that cannot be controlled with traditional management tools but require time and dialogue. Exchanging aims and harvest between manager and team could result in co-creating (moral) practice in which improvements for daily cooperation result from bringing together the perspectives of managers and team members.

  18. Neuronal Classification of Atrial Fibrillation

    OpenAIRE

    Mohamed BEN MESSAOUD

    2008-01-01

    Motivation. In the medical field, particularly cardiology, diagnosis systems constitute an essential domain of research. In some applications, traditional classification methods present some limitations. The neuronal technique is considered one of the promising algorithms for solving such problems. Method. In this paper, two approaches of the Artificial Neural Network (ANN) technique are investigated to classify the heart beats: the Multi Layer Perceptron (MLP) and Radial B...
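
    A radial-basis-function network of the kind the record names can be sketched in a few lines: one Gaussian unit per training point, with output weights obtained by solving the interpolation system exactly. The XOR-style toy data here stands in for real heart-beat features (an illustrative assumption, not the paper's dataset):

```python
import math

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gaussian_kernel(xi, xj, gamma=2.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))

def train_rbf(X, y):
    """RBF network: one Gaussian unit per training point; the output-layer
    weights come from exactly solving the kernel interpolation system."""
    K = [[gaussian_kernel(xi, xj) for xj in X] for xi in X]
    return gauss_solve(K, y)

def rbf_predict(X, w, x):
    return sum(wi * gaussian_kernel(xi, x) for wi, xi in zip(w, X))

# XOR-like labels: not linearly separable, but an RBF net fits them exactly.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1.0, 1.0, 1.0, -1.0]
w = train_rbf(X, y)
preds = [1 if rbf_predict(X, w, x) > 0 else -1 for x in X]
```

    An MLP would instead learn hidden-layer weights by backpropagation; the RBF variant trades that iterative training for a direct linear solve over fixed basis functions.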

  19. Classification of Agri-Tourism / Rural Tourism SMEs in Poland (on the Example of the Wielkopolska Region)

    OpenAIRE

    Przezborska, Lucyna

    2005-01-01

    The paper is based on data from a questionnaire survey (interviews) conducted in the western part of Poland on 183 rural tourism and agri-tourism small and medium enterprises. The classification of the enterprises was based on the methodology proposed by Wysocki (1996) and included the k-means clustering algorithm. As a result of the research, three types of SMEs were separated, including the top resilient enterprises aimed mainly at tourism activity and usually connected with horse recreation, ...
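
    The k-means clustering step mentioned above can be sketched as a plain Lloyd iteration. The two-feature "enterprise" data and the deterministic initialization are illustrative assumptions, not the survey's actual variables:

```python
def kmeans(points, centers, iters=20):
    """Lloyd-style k-means: alternate nearest-centre assignment and
    centre updates; returns final labels and centres."""
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [min(range(len(centers)),
                      key=lambda c: sum((p - q) ** 2
                                        for p, q in zip(pt, centers[c])))
                  for pt in points]
        # Update step: move each centre to the mean of its members.
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(col) for col in zip(*members))
    return labels, centers

# Two obvious groups of "enterprises" described by two numeric features;
# centres are seeded from the first and last points for reproducibility.
data = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
labels, centers = kmeans(data, centers=[data[0], data[-1]])
```

    In the study, each point would carry the survey variables for one SME and k would be set to the number of enterprise types sought (three, per the abstract).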

  20. Classification of ASKAP Vast Radio Light Curves

    Science.gov (United States)

    Rebbapragada, Umaa; Lo, Kitty; Wagstaff, Kiri L.; Reed, Colorado; Murphy, Tara; Thompson, David R.

    2012-01-01

    The VAST survey is a wide-field survey that observes with unprecedented instrument sensitivity (0.5 mJy or lower) and repeat cadence (a goal of 5 seconds) that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Given the unprecedented observing characteristics of VAST, it is important to estimate source classification performance and determine best practices prior to the launch of ASKAP's BETA in 2012. The goal of this study is to identify the light curve characterization and classification algorithms best suited for archival VAST light curve classification. We perform our experiments on light curve simulations of eight source types and achieve a best-case performance of approximately 90% accuracy. We note that classification performance is influenced most by light curve characterization rather than by the choice of classifier algorithm.
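
    Since the study finds that characterization matters more than the classifier, it is worth showing what a minimal light curve characterization might look like. The features below (mean, spread, amplitude, fractional variability) are generic choices, not the study's actual feature set:

```python
import math

def characterize(light_curve):
    """Summary statistics of a flux time series, for use as classifier input.

    A classifier never sees the raw samples, only this fixed-length feature
    vector, which is why the feature choice dominates final accuracy.
    """
    n = len(light_curve)
    mean = sum(light_curve) / n
    var = sum((f - mean) ** 2 for f in light_curve) / n
    std = math.sqrt(var)
    amplitude = max(light_curve) - min(light_curve)
    modulation = std / mean if mean else float("inf")  # fractional variability
    return {"mean": mean, "std": std, "amplitude": amplitude,
            "modulation": modulation}

# A steady source versus one with a single flare-like excursion:
steady = [10.0, 10.1, 9.9, 10.0]
flare = [10.0, 10.2, 55.0, 10.1]
f_steady = characterize(steady)
f_flare = characterize(flare)
```

    Even these crude features separate the two behaviours cleanly; richer characterizations (periodicity, skewness, structure functions) refine the same idea.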