WorldWideScience

Sample records for machine svm-based classifier

  1. A support vector machine (SVM) based voltage stability classifier

    Energy Technology Data Exchange (ETDEWEB)

    Dosano, R.D.; Song, H. [Kunsan National Univ., Kunsan, Jeonbuk (Korea, Republic of); Lee, B. [Korea Univ., Seoul (Korea, Republic of)

    2007-07-01

    Power system stability has become even more complex and critical with the advent of deregulated energy markets and the growing desire to completely employ existing transmission and infrastructure. The economic pressure on electricity markets forces the operation of power systems and components to their limit of capacity and performance. System conditions can be more exposed to instability due to greater uncertainty in day to day system operations and increase in the number of potential components for system disturbances potentially resulting in voltage stability. This paper proposed a support vector machine (SVM) based power system voltage stability classifier using local measurements of voltage and active power of load. It described the procedure for fast classification of long-term voltage stability using the SVM algorithm. The application of the SVM based voltage stability classifier was presented with reference to the choice of input parameters; input data preconditioning; moving window for feature vector; determination of learning samples; and other considerations in SVM applications. The paper presented a case study with numerical examples of an 11-bus test system. The test results for the feasibility study demonstrated that the classifier could offer an excellent performance in classification with time-series measurements in terms of long-term voltage stability. 9 refs., 14 figs.

  2. An SVM-Based Classifier for Estimating the State of Various Rotating Components in Agro-Industrial Machinery with a Vibration Signal Acquired from a Single Point on the Machine Chassis

    Directory of Open Access Journals (Sweden)

    Ruben Ruiz-Gonzalez

    2014-11-01

    Full Text Available The goal of this article is to assess the feasibility of estimating the state of various rotating components in agro-industrial machinery by employing just one vibration signal acquired from a single point on the machine chassis. To do so, a Support Vector Machine (SVM-based system is employed. Experimental tests evaluated this system by acquiring vibration data from a single point of an agricultural harvester, while varying several of its working conditions. The whole process included two major steps. Initially, the vibration data were preprocessed through twelve feature extraction algorithms, after which the Exhaustive Search method selected the most suitable features. Secondly, the SVM-based system accuracy was evaluated by using Leave-One-Out cross-validation, with the selected features as the input data. The results of this study provide evidence that (i accurate estimation of the status of various rotating components in agro-industrial machinery is possible by processing the vibration signal acquired from a single point on the machine structure; (ii the vibration signal can be acquired with a uniaxial accelerometer, the orientation of which does not significantly affect the classification accuracy; and, (iii when using an SVM classifier, an 85% mean cross-validation accuracy can be reached, which only requires a maximum of seven features as its input, and no significant improvements are noted between the use of either nonlinear or linear kernels.

  3. Effective Sequential Classifier Training for SVM-Based Multitemporal Remote Sensing Image Classification

    Science.gov (United States)

    Guo, Yiqing; Jia, Xiuping; Paull, David

    2018-06-01

    The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, a SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is firstly predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with those obtained without the assistance from previous images. These results demonstrate that the leverage of a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.

  4. A Hierarchical Method for Transient Stability Prediction of Power Systems Using the Confidence of a SVM-Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Yanzhen Zhou

    2016-09-01

    Full Text Available Machine learning techniques have been widely used in transient stability prediction of power systems. When using the post-fault dynamic responses, it is difficult to draw a definite conclusion about how long the duration of response data used should be in order to balance the accuracy and speed. Besides, previous studies have the problem of lacking consideration for the confidence level. To solve these problems, a hierarchical method for transient stability prediction based on the confidence of ensemble classifier using multiple support vector machines (SVMs is proposed. Firstly, multiple datasets are generated by bootstrap sampling, then features are randomly picked up to compress the datasets. Secondly, the confidence indices are defined and multiple SVMs are built based on these generated datasets. By synthesizing the probabilistic outputs of multiple SVMs, the prediction results and confidence of the ensemble classifier will be obtained. Finally, different ensemble classifiers with different response times are built to construct different layers of the proposed hierarchical scheme. The simulation results show that the proposed hierarchical method can balance the accuracy and rapidity of the transient stability prediction. Moreover, the hierarchical method can reduce the misjudgments of unstable instances and cooperate with the time domain simulation to insure the security and stability of power systems.

  5. Settlement Prediction of Road Soft Foundation Using a Support Vector Machine (SVM Based on Measured Data

    Directory of Open Access Journals (Sweden)

    Yu Huiling

    2016-01-01

    Full Text Available The suppor1t vector machine (SVM is a relatively new artificial intelligence technique which is increasingly being applied to geotechnical problems and is yielding encouraging results. SVM is a new machine learning method based on the statistical learning theory. A case study based on road foundation engineering project shows that the forecast results are in good agreement with the measured data. The SVM model is also compared with BP artificial neural network model and traditional hyperbola method. The prediction results indicate that the SVM model has a better prediction ability than BP neural network model and hyperbola method. Therefore, settlement prediction based on SVM model can reflect actual settlement process more correctly. The results indicate that it is effective and feasible to use this method and the nonlinear mapping relation between foundation settlement and its influence factor can be expressed well. It will provide a new method to predict foundation settlement.

  6. Patients on weaning trials classified with support vector machines

    International Nuclear Information System (INIS)

    Garde, Ainara; Caminal, Pere; Giraldo, Beatriz F; Schroeder, Rico; Voss, Andreas; Benito, Salvador

    2010-01-01

    The process of discontinuing mechanical ventilation is called weaning and is one of the most challenging problems in intensive care. An unnecessary delay in the discontinuation process and an early weaning trial are undesirable. This study aims to characterize the respiratory pattern through features that permit the identification of patients' conditions in weaning trials. Three groups of patients have been considered: 94 patients with successful weaning trials, who could maintain spontaneous breathing after 48 h (GSucc); 39 patients who failed the weaning trial (GFail) and 21 patients who had successful weaning trials, but required reintubation in less than 48 h (GRein). Patients are characterized by their cardiorespiratory interactions, which are described by joint symbolic dynamics (JSD) applied to the cardiac interbeat and breath durations. The most discriminating features in the classification of the different groups of patients (GSucc, GFail and GRein) are identified by support vector machines (SVMs). The SVM-based feature selection algorithm has an accuracy of 81% in classifying GSucc versus the rest of the patients, 83% in classifying GRein versus GSucc patients and 81% in classifying GRein versus the rest of the patients. Moreover, a good balance between sensitivity and specificity is achieved in all classifications

  7. Generalized SMO algorithm for SVM-based multitask learning.

    Science.gov (United States)

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n(3)) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up, in comparison with general-purpose optimization routines.

  8. Classifying smoking urges via machine learning.

    Science.gov (United States)

    Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin

    2016-12-01

    Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision. The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications. In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions. Copyright © 2016 Elsevier Ireland Ltd. All rights

  9. A SVM bases AI design for interactive gaming

    OpenAIRE

    Jiang, Yang; Jiang, Jianmin; Palmer, Ian

    2008-01-01

    Interactive gaming requires automatic processing on large volume of random data produced by players on spot, such as shooting, football kicking, boxing etc. In this paper, we describe an artificial intelligence approach in processing such random data for interactive gaming by using a one-class support vector machine (OC-SVM). In comparison with existing techniques, our OC-SVM based interactive gaming design has the features of: (i): high speed processing, providing instant response to the pla...

  10. Power quality events recognition using a SVM-based method

    Energy Technology Data Exchange (ETDEWEB)

    Cerqueira, Augusto Santiago; Ferreira, Danton Diego; Ribeiro, Moises Vidal; Duque, Carlos Augusto [Department of Electrical Circuits, Federal University of Juiz de Fora, Campus Universitario, 36036 900, Juiz de Fora MG (Brazil)

    2008-09-15

    In this paper, a novel SVM-based method for power quality event classification is proposed. A simple approach for feature extraction is introduced, based on the subtraction of the fundamental component from the acquired voltage signal. The resulting signal is presented to a support vector machine for event classification. Results from simulation are presented and compared with two other methods, the OTFR and the LCEC. The proposed method shown an improved performance followed by a reasonable computational cost. (author)

  11. SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling.

    Science.gov (United States)

    Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin

    2013-03-01

    Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variety of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in overall accuracy of ∼80 % with 10 fold cross validation using 40 top ranked molecular descriptors selected out of total 314 descriptors. Moreover, SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptors space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding over fitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Ship localization in Santa Barbara Channel using machine learning classifiers.

    Science.gov (United States)

    Niu, Haiqiang; Ozanich, Emma; Gerstoft, Peter

    2017-11-01

    Machine learning classifiers are shown to outperform conventional matched field processing for a deep water (600 m depth) ocean acoustic-based ship range estimation problem in the Santa Barbara Channel Experiment when limited environmental information is known. Recordings of three different ships of opportunity on a vertical array were used as training and test data for the feed-forward neural network and support vector machine classifiers, demonstrating the feasibility of machine learning methods to locate unseen sources. The classifiers perform well up to 10 km range whereas the conventional matched field processing fails at about 4 km range without accurate environmental information.

  13. Reconfigurable support vector machine classifier with approximate computing

    NARCIS (Netherlands)

    van Leussen, M.J.; Huisken, J.; Wang, L.; Jiao, H.; De Gyvez, J.P.

    2017-01-01

    Support Vector Machine (SVM) is one of the most popular machine learning algorithms. An energy-efficient SVM classifier is proposed in this paper, where approximate computing is utilized to reduce energy consumption and silicon area. A hardware architecture with reconfigurable kernels and

  14. Machine Learning Based Classifier for Falsehood Detection

    Science.gov (United States)

    Mallikarjun, H. M.; Manimegalai, P., Dr.; Suresh, H. N., Dr.

    2017-08-01

    The investigation of physiological techniques for Falsehood identification tests utilizing the enthusiastic aggravations started as a part of mid 1900s. The need of Falsehood recognition has been a piece of our general public from hundreds of years back. Different requirements drifted over the general public raising the need to create trick evidence philosophies for Falsehood identification. The established similar addressing tests have been having a tendency to gather uncertain results against which new hearty strategies are being explored upon for acquiring more productive Falsehood discovery set up. Electroencephalography (EEG) is a non-obtrusive strategy to quantify the action of mind through the anodes appended to the scalp of a subject. Electroencephalogram is a record of the electric signs produced by the synchronous activity of mind cells over a timeframe. The fundamental goal is to accumulate and distinguish the important information through this action which can be acclimatized for giving surmising to Falsehood discovery in future analysis. This work proposes a strategy for Falsehood discovery utilizing EEG database recorded on irregular people of various age gatherings and social organizations. The factual investigation is directed utilizing MATLAB v-14. It is a superior dialect for specialized registering which spares a considerable measure of time with streamlined investigation systems. In this work center is made on Falsehood Classification by Support Vector Machine (SVM). 72 Samples are set up by making inquiries from standard poll with a Wright and wrong replies in a diverse era from the individual in wearable head unit. 52 samples are trained and 20 are tested. By utilizing Bluetooth based Neurosky’s Mindwave kit, brain waves are recorded and qualities are arranged appropriately. In this work confusion matrix is derived by matlab programs and accuracy of 56.25 % is achieved.

  15. Support vector machines classifiers of physical activities in preschoolers

    Science.gov (United States)

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  16. Representative Vector Machines: A Unified Framework for Classical Classifiers.

    Science.gov (United States)

    Gui, Jie; Liu, Tongliang; Tao, Dacheng; Sun, Zhenan; Tan, Tieniu

    2016-08-01

    Classifier design is a fundamental problem in pattern recognition. A variety of pattern classification methods such as the nearest neighbor (NN) classifier, support vector machine (SVM), and sparse representation-based classification (SRC) have been proposed in the literature. These typical and widely used classifiers were originally developed from different theory or application motivations and they are conventionally treated as independent and specific solutions for pattern classification. This paper proposes a novel pattern classification framework, namely, representative vector machines (or RVMs for short). The basic idea of RVMs is to assign the class label of a test example according to its nearest representative vector. The contributions of RVMs are twofold. On one hand, the proposed RVMs establish a unified framework of classical classifiers because NN, SVM, and SRC can be interpreted as the special cases of RVMs with different definitions of representative vectors. Thus, the underlying relationship among a number of classical classifiers is revealed for better understanding of pattern classification. On the other hand, novel and advanced classifiers are inspired in the framework of RVMs. For example, a robust pattern classification method called discriminant vector machine (DVM) is motivated from RVMs. Given a test example, DVM first finds its k -NNs and then performs classification based on the robust M-estimator and manifold regularization. Extensive experimental evaluations on a variety of visual recognition tasks such as face recognition (Yale and face recognition grand challenge databases), object categorization (Caltech-101 dataset), and action recognition (Action Similarity LAbeliNg) demonstrate the advantages of DVM over other classifiers.

  17. Defending Malicious Script Attacks Using Machine Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Nayeem Khan

    2017-01-01

    Full Text Available The web application has become a primary target for cyber criminals by injecting malware especially JavaScript to perform malicious activities for impersonation. Thus, it becomes an imperative to detect such malicious code in real time before any malicious activity is performed. This study proposes an efficient method of detecting previously unknown malicious java scripts using an interceptor at the client side by classifying the key features of the malicious code. Feature subset was obtained by using wrapper method for dimensionality reduction. Supervised machine learning classifiers were used on the dataset for achieving high accuracy. Experimental results show that our method can efficiently classify malicious code from benign code with promising results.

  18. SVM-based glioma grading. Optimization by feature reduction analysis

    International Nuclear Information System (INIS)

    Zoellner, Frank G.; Schad, Lothar R.; Emblem, Kyrre E.; Harvard Medical School, Boston, MA; Oslo Univ. Hospital

    2012-01-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features; (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (∝87%) while reducing the number of features by up to 98%. (orig.)

  19. Classifying images using restricted Boltzmann machines and convolutional neural networks

    Science.gov (United States)

    Zhao, Zhijun; Xu, Tongde; Dai, Chenyu

    2017-07-01

    To improve the feature recognition ability of deep model transfer learning, we propose a hybrid deep transfer learning method for image classification based on restricted Boltzmann machines (RBM) and convolutional neural networks (CNNs). It integrates learning abilities of two models, which conducts subject classification by exacting structural higher-order statistics features of images. While the method transfers the trained convolutional neural networks to the target datasets, fully-connected layers can be replaced by restricted Boltzmann machine layers; then the restricted Boltzmann machine layers and Softmax classifier are retrained, and BP neural network can be used to fine-tuned the hybrid model. The restricted Boltzmann machine layers has not only fully integrated the whole feature maps, but also learns the statistical features of target datasets in the view of the biggest logarithmic likelihood, thus removing the effects caused by the content differences between datasets. The experimental results show that the proposed method has improved the accuracy of image classification, outperforming other methods on Pascal VOC2007 and Caltech101 datasets.

  20. Classifying BCI signals from novice users with extreme learning machine

    Directory of Open Access Journals (Sweden)

    Rodríguez-Bermúdez Germán

    2017-07-01

    Full Text Available Brain computer interface (BCI allows to control external devices only with the electrical activity of the brain. In order to improve the system, several approaches have been proposed. However it is usual to test algorithms with standard BCI signals from experts users or from repositories available on Internet. In this work, extreme learning machine (ELM has been tested with signals from 5 novel users to compare with standard classification algorithms. Experimental results show that ELM is a suitable method to classify electroencephalogram signals from novice users.

  1. Machine learning classifiers and fMRI: a tutorial overview.

    Science.gov (United States)

    Pereira, Francisco; Mitchell, Tom; Botvinick, Matthew

    2009-03-01

    Interpreting brain image experiments requires analysis of complex, multivariate data. In recent years, one analysis approach that has grown in popularity is the use of machine learning algorithms to train classifiers to decode stimuli, mental states, behaviours and other variables of interest from fMRI data and thereby show the data contain information about them. In this tutorial overview we review some of the key choices faced in using this approach as well as how to derive statistically significant results, illustrating each point from a case study. Furthermore, we show how, in addition to answering the question of 'is there information about a variable of interest' (pattern discrimination), classifiers can be used to tackle other classes of question, namely 'where is the information' (pattern localization) and 'how is that information encoded' (pattern characterization).

  2. Implementation of a classifier didactical machine for learning mechatronic processes

    Directory of Open Access Journals (Sweden)

    Alex De La Cruz

    2017-06-01

    Full Text Available The present article shows the design and construction of a classifier didactical machine through artificial vision. The implementation of the machine is to be used as a learning module of mechatronic processes. In the project, it is described the theoretical aspects that relate concepts of mechanical design, electronic design and software management which constitute popular field in science and technology, which is mechatronics. The design of the machine was developed based on the requirements of the user, through the concurrent design methodology to define and materialize the appropriate hardware and software solutions. LabVIEW 2015 was implemented for high-speed image acquisition and analysis, as well as for the establishment of data communication with a programmable logic controller (PLC via Ethernet and an open communications platform known as Open Platform Communications - OPC. In addition, the Arduino MEGA 2560 platform was used to control the movement of the step motor and the servo motors of the module. Also, is used the Arduino MEGA 2560 to control the movement of the stepper motor and servo motors in the module. Finally, we assessed whether the equipment meets the technical specifications raised by running specific test protocols.

  3. Machine learning algorithms to classify spinal muscular atrophy subtypes.

    Science.gov (United States)

    Srivastava, Tuhin; Darras, Basil T; Wu, Jim S; Rutkove, Seward B

    2012-07-24

    The development of better biomarkers for disease assessment remains an ongoing effort across the spectrum of neurologic illnesses. One approach for refining biomarkers is based on the concept of machine learning, in which individual, unrelated biomarkers are simultaneously evaluated. In this cross-sectional study, we assess the possibility of using machine learning, incorporating both quantitative muscle ultrasound (QMU) and electrical impedance myography (EIM) data, for classification of muscles affected by spinal muscular atrophy (SMA). Twenty-one normal subjects, 15 subjects with SMA type 2, and 10 subjects with SMA type 3 underwent EIM and QMU measurements of unilateral biceps, wrist extensors, quadriceps, and tibialis anterior. EIM and QMU parameters were then applied in combination using a support vector machine (SVM), a type of machine learning, in an attempt to accurately categorize 165 individual muscles. For all 3 classification problems, normal vs SMA, normal vs SMA 3, and SMA 2 vs SMA 3, use of SVM provided the greatest accuracy in discrimination, surpassing both EIM and QMU individually. For example, the accuracy, as measured by the receiver operating characteristic area under the curve (ROC-AUC) for the SVM discriminating SMA 2 muscles from SMA 3 muscles was 0.928; in comparison, the ROC-AUCs for EIM and QMU parameters alone were only 0.877 (p < 0.05) and 0.627 (p < 0.05), respectively. Combining EIM and QMU data categorizes individual SMA-affected muscles with very high accuracy. Further investigation of this approach for classifying and for following the progression of neuromuscular illness is warranted.

  4. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    Directory of Open Access Journals (Sweden)

    C. V. Subbulakshmi

    2015-01-01

    Full Text Available Medical data classification is a prime data mining problem being discussed about for a decade that has attracted several researchers around the world. Most classifiers are designed so as to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on machine learning paradigm. This paradigm integrates the successful exploration mechanism called self-regulated learning capability of the particle swarm optimization (PSO algorithm with the extreme learning machine (ELM classifier. As a recent off-line learning method, ELM is a single-hidden layer feedforward neural network (FFNN, proved to be an excellent classifier with large number of hidden layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden layer neurons, and it further improves the network generalization performance. The proposed method is experimented on five benchmarked datasets of the UCI Machine Learning Repository for handling medical dataset classification. Simulation results show that the proposed approach is able to achieve good generalization performance, compared to the results of other classifiers.

  5. SVM-based glioma grading. Optimization by feature reduction analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zoellner, Frank G.; Schad, Lothar R. [University Medical Center Mannheim, Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Emblem, Kyrre E. [Massachusetts General Hospital, Charlestown, A.A. Martinos Center for Biomedical Imaging, Boston MA (United States). Dept. of Radiology; Harvard Medical School, Boston, MA (United States); Oslo Univ. Hospital (Norway). The Intervention Center

    2012-11-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features; (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values ({proportional_to}87%) while reducing the number of features by up to 98%. (orig.)

  6. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons.

    Science.gov (United States)

    Long, Yi; Du, Zhi-Jiang; Wang, Wei-Dong; Zhao, Guang-Yu; Xu, Guo-Qiang; He, Long; Mao, Xi-Wang; Dong, Wei

    2016-09-02

    Locomotion mode identification is essential for the control of a robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM) optimized by particle swarm optimization (PSO) to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS) attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with sampling frequency of 40 Hz), a three-layer wavelet packet analysis (WPA) is used for feature extraction, after which, the kernel principal component analysis (kPCA) is utilized to reduce the dimension of the feature set to reduce computation cost of the SVM. Since the signals are from two types of different sensors, the normalization is conducted to scale the input into the interval of [0, 1]. Five-fold cross validation is adapted to train the classifier, which prevents the classifier over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, majority vote algorithm (MVA) is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.

  7. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons

    Directory of Open Access Journals (Sweden)

    Yi Long

    2016-09-01

    Full Text Available Locomotion mode identification is essential for the control of a robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM optimized by particle swarm optimization (PSO to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with sampling frequency of 40 Hz, a three-layer wavelet packet analysis (WPA is used for feature extraction, after which, the kernel principal component analysis (kPCA is utilized to reduce the dimension of the feature set to reduce computation cost of the SVM. Since the signals are from two types of different sensors, the normalization is conducted to scale the input into the interval of [0, 1]. Five-fold cross validation is adapted to train the classifier, which prevents the classifier over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, majority vote algorithm (MVA is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.

  8. LaSVM-based big data learning system for dynamic prediction of air pollution in Tehran.

    Science.gov (United States)

    Ghaemi, Z; Alimohammadi, A; Farnaghi, M

    2018-04-20

    Due to critical impacts of air pollution, prediction and monitoring of air quality in urban areas are important tasks. However, because of the dynamic nature and high spatio-temporal variability, prediction of the air pollutant concentrations is a complex spatio-temporal problem. Distribution of pollutant concentration is influenced by various factors such as the historical pollution data and weather conditions. Conventional methods such as the support vector machine (SVM) or artificial neural networks (ANN) show some deficiencies when huge amount of streaming data have to be analyzed for urban air pollution prediction. In order to overcome the limitations of the conventional methods and improve the performance of urban air pollution prediction in Tehran, a spatio-temporal system is designed using a LaSVM-based online algorithm. Pollutant concentration and meteorological data along with geographical parameters are continually fed to the developed online forecasting system. Performance of the system is evaluated by comparing the prediction results of the Air Quality Index (AQI) with those of a traditional SVM algorithm. Results show an outstanding increase of speed by the online algorithm while preserving the accuracy of the SVM classifier. Comparison of the hourly predictions for next coming 24 h, with those of the measured pollution data in Tehran pollution monitoring stations shows an overall accuracy of 0.71, root mean square error of 0.54 and coefficient of determination of 0.81. These results are indicators of the practical usefulness of the online algorithm for real-time spatial and temporal prediction of the urban air quality.

  9. Agarwood oil quality classifier using machine learning | Abas ...

    African Journals Online (AJOL)

    Agarwood oil is known as one of the most expensive and precious oils being traded. It is widely used in traditional ceremonies and religious prayers. Its quality plays an important role on the market price that it can be traded. This paper proposes on a proper classification method of the agarwood oil quality using machine ...

  10. FaaPred: a SVM-based prediction method for fungal adhesins and adhesin-like proteins.

    Directory of Open Access Journals (Sweden)

    Jayashree Ramana

    Full Text Available Adhesion constitutes one of the initial stages of infection in microbial diseases and is mediated by adhesins. Hence, identification and comprehensive knowledge of adhesins and adhesin-like proteins is essential to understand adhesin mediated pathogenesis and how to exploit its therapeutic potential. However, the knowledge about fungal adhesins is rudimentary compared to that of bacterial adhesins. In addition to host cell attachment and mating, the fungal adhesins play a significant role in homotypic and xenotypic aggregation, foraging and biofilm formation. Experimental identification of fungal adhesins is labor- as well as time-intensive. In this work, we present a Support Vector Machine (SVM based method for the prediction of fungal adhesins and adhesin-like proteins. The SVM models were trained with different compositional features, namely, amino acid, dipeptide, multiplet fractions, charge and hydrophobic compositions, as well as PSI-BLAST derived PSSM matrices. The best classifiers are based on compositional properties as well as PSSM and yield an overall accuracy of 86%. The prediction method based on best classifiers is freely accessible as a world wide web based server at http://bioinfo.icgeb.res.in/faap. This work will aid rapid and rational identification of fungal adhesins, expedite the pace of experimental characterization of novel fungal adhesins and enhance our knowledge about role of adhesins in fungal infections.

  11. Classifying Structures in the ISM with Machine Learning Techniques

    Science.gov (United States)

    Beaumont, Christopher; Goodman, A. A.; Williams, J. P.

    2011-01-01

    The processes which govern molecular cloud evolution and star formation often sculpt structures in the ISM: filaments, pillars, shells, outflows, etc. Because of their morphological complexity, these objects are often identified manually. Manual classification has several disadvantages; the process is subjective, not easily reproducible, and does not scale well to handle increasingly large datasets. We have explored to what extent machine learning algorithms can be trained to autonomously identify specific morphological features in molecular cloud datasets. We show that the Support Vector Machine algorithm can successfully locate filaments and outflows blended with other emission structures. When the objects of interest are morphologically distinct from the surrounding emission, this autonomous classification achieves >90% accuracy. We have developed a set of IDL-based tools to apply this technique to other datasets.

  12. Using Machine Learning to classify the diffuse interstellar bands

    Science.gov (United States)

    Baron, Dalya; Poznanski, Dovi; Watson, Darach; Yao, Yushu; Cox, Nick L. J.; Prochaska, J. Xavier

    2015-07-01

    Using over a million and a half extragalactic spectra from the Sloan Digital Sky Survey we study the correlations of the diffuse interstellar bands (DIBs) in the Milky Way. We measure the correlation between DIB strength and dust extinction for 142 DIBs using 24 stacked spectra in the reddening range E(B - V) studied before. Most of the DIBs do not correlate with dust extinction. However, we find 10 weak and barely studied DIBs with correlations that are higher than 0.7 with dust extinction and confirm the high correlation of additional five strong DIBs. Furthermore, we find a pair of DIBs, 5925.9 and 5927.5 Å, which exhibits significant negative correlation with dust extinction, indicating that their carrier may be depleted on dust. We use Machine Learning algorithms to divide the DIBs to spectroscopic families based on 250 stacked spectra. By removing the dust dependence, we study how DIBs follow their local environment. We thus obtain six groups of weak DIBs, four of which are tightly associated with C2 or CN absorption lines.

  13. Oblique decision trees using embedded support vector machines in classifier ensembles

    NARCIS (Netherlands)

    Menkovski, V.; Christou, I.; Efremidis, S.

    2008-01-01

    Classifier ensembles have emerged in recent years as a promising research area for boosting pattern recognition systems' performance. We present a new base classifier that utilizes oblique decision tree technology based on support vector machines for the construction of oblique (non-axis parallel)

  14. Statistical and Machine-Learning Classifier Framework to Improve Pulse Shape Discrimination System Design

    Energy Technology Data Exchange (ETDEWEB)

    Wurtz, R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Kaplan, A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-10-28

    Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-­realized statistical classifiers rely on a comprehensive set of tools for designing, building, and implementing. PSD advances rely on improvements to the implemented algorithm. PSD advances can be improved by using conventional statistical classifier or machine learning methods. This paper provides the reader with a glossary of classifier-­building elements and their functions in a fully-­designed and operational classifier framework that can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier’s receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant for realistic applications.

  15. A Novel Approach for Multi Class Fault Diagnosis in Induction Machine Based on Statistical Time Features and Random Forest Classifier

    Science.gov (United States)

    Sonje, M. Deepak; Kundu, P.; Chowdhury, A.

    2017-08-01

    Fault diagnosis and detection is the important area in health monitoring of electrical machines. This paper proposes the recently developed machine learning classifier for multi class fault diagnosis in induction machine. The classification is based on random forest (RF) algorithm. Initially, stator currents are acquired from the induction machine under various conditions. After preprocessing the currents, fourteen statistical time features are estimated for each phase of the current. These parameters are considered as inputs to the classifier. The main scope of the paper is to evaluate effectiveness of RF classifier for individual and mixed fault diagnosis in induction machine. The stator, rotor and mixed faults (stator and rotor faults) are classified using the proposed classifier. The obtained performance measures are compared with the multilayer perceptron neural network (MLPNN) classifier. The results show the much better performance measures and more accurate than MLPNN classifier. For demonstration of planned fault diagnosis algorithm, experimentally obtained results are considered to build the classifier more practical.

  16. Zooniverse: Combining Human and Machine Classifiers for the Big Survey Era

    Science.gov (United States)

    Fortson, Lucy; Wright, Darryl; Beck, Melanie; Lintott, Chris; Scarlata, Claudia; Dickinson, Hugh; Trouille, Laura; Willi, Marco; Laraia, Michael; Boyer, Amy; Veldhuis, Marten; Zooniverse

    2018-01-01

    Many analyses of astronomical data sets, ranging from morphological classification of galaxies to identification of supernova candidates, have relied on humans to classify data into distinct categories. Crowdsourced galaxy classifications via the Galaxy Zoo project provided a solution that scaled visual classification for extant surveys by harnessing the combined power of thousands of volunteers. However, the much larger data sets anticipated from upcoming surveys will require a different approach. Automated classifiers using supervised machine learning have improved considerably over the past decade but their increasing sophistication comes at the expense of needing ever more training data. Crowdsourced classification by human volunteers is a critical technique for obtaining these training data. But several improvements can be made on this zeroth order solution. Efficiency gains can be achieved by implementing a “cascade filtering” approach whereby the task structure is reduced to a set of binary questions that are more suited to simpler machines while demanding lower cognitive loads for humans.Intelligent subject retirement based on quantitative metrics of volunteer skill and subject label reliability also leads to dramatic improvements in efficiency. We note that human and machine classifiers may retire subjects differently leading to trade-offs in performance space. Drawing on work with several Zooniverse projects including Galaxy Zoo and Supernova Hunter, we will present recent findings from experiments that combine cohorts of human and machine classifiers. We show that the most efficient system results when appropriate subsets of the data are intelligently assigned to each group according to their particular capabilities.With sufficient online training, simple machines can quickly classify “easy” subjects, leaving more difficult (and discovery-oriented) tasks for volunteers. We also find humans achieve higher classification purity while samples

  17. A comparative study of machine learning classifiers for modeling travel mode choice

    NARCIS (Netherlands)

    Hagenauer, J; Helbich, M

    2017-01-01

    The analysis of travel mode choice is an important task in transportation planning and policy making in order to understand and predict travel demands. While advances in machine learning have led to numerous powerful classifiers, their usefulness for modeling travel mode choice remains largely

  18. Least Square Support Vector Machine Classifier vs a Logistic Regression Classifier on the Recognition of Numeric Digits

    Directory of Open Access Journals (Sweden)

    Danilo A. López-Sarmiento

    2013-11-01

    Full Text Available In this paper is compared the performance of a multi-class least squares support vector machine (LSSVM mc versus a multi-class logistic regression classifier to problem of recognizing the numeric digits (0-9 handwritten. To develop the comparison was used a data set consisting of 5000 images of handwritten numeric digits (500 images for each number from 0-9, each image of 20 x 20 pixels. The inputs to each of the systems were vectors of 400 dimensions corresponding to each image (not done feature extraction. Both classifiers used OneVsAll strategy to enable multi-classification and a random cross-validation function for the process of minimizing the cost function. The metrics of comparison were precision and training time under the same computational conditions. Both techniques evaluated showed a precision above 95 %, with LS-SVM slightly more accurate. However the computational cost if we found a marked difference: LS-SVM training requires time 16.42 % less than that required by the logistic regression model based on the same low computational conditions.

  19. Hybrid PSO–SVM-based method for forecasting of the remaining useful life for aircraft engines and evaluation of its reliability

    International Nuclear Information System (INIS)

    García Nieto, P.J.; García-Gonzalo, E.; Sánchez Lasheras, F.; Cos Juez, F.J. de

    2015-01-01

    The present paper describes a hybrid PSO–SVM-based model for the prediction of the remaining useful life of aircraft engines. The proposed hybrid model combines support vector machines (SVMs), which have been successfully adopted for regression problems, with the particle swarm optimization (PSO) technique. This optimization technique involves kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not been yet widely explored. Bearing this in mind, remaining useful life values have been predicted here by using the hybrid PSO–SVM-based model from the remaining measured parameters (input variables) for aircraft engines with success. A coefficient of determination equal to 0.9034 was obtained when this hybrid PSO–RBF–SVM-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. One of the main advantages of this predictive model is that it does not require information about the previous operation states of the engine. Finally, the main conclusions of this study are exposed. - Highlights: • A hybrid PSO–SVM-based model is built as a predictive model of the RUL values for aircraft engines. • The remaining physical–chemical variables in this process are studied in depth. • The obtained regression accuracy of our method is about 95%. • The results show that PSO–SVM-based model can assist in the diagnosis of the RUL values with accuracy

  20. Using multivariate machine learning methods and structural MRI to classify childhood onset schizophrenia and healthy controls

    Directory of Open Access Journals (Sweden)

    Deanna eGreenstein

    2012-06-01

    Full Text Available Introduction: Multivariate machine learning methods can be used to classify groups of schizophrenia patients and controls using structural magnetic resonance imaging (MRI. However, machine learning methods to date have not been extended beyond classification and contemporaneously applied in a meaningful way to clinical measures. We hypothesized that brain measures would classify groups, and that increased likelihood of being classified as a patient using regional brain measures would be positively related to illness severity, developmental delays and genetic risk. Methods: Using 74 anatomic brain MRI sub regions and Random Forest, we classified 98 COS patients and 99 age, sex, and ethnicity-matched healthy controls. We also used Random Forest to determine the likelihood of being classified as a schizophrenia patient based on MRI measures. We then explored relationships between brain-based probability of illness and symptoms, premorbid development, and presence of copy number variation associated with schizophrenia. Results: Brain regions jointly classified COS and control groups with 73.7% accuracy. Greater brain-based probability of illness was associated with worse functioning (p= 0.0004 and fewer developmental delays (p=0.02. Presence of copy number variation (CNV was associated with lower probability of being classified as schizophrenia (p=0.001. The regions that were most important in classifying groups included left temporal lobes, bilateral dorsolateral prefrontal regions, and left medial parietal lobes. Conclusions: Schizophrenia and control groups can be well classified using Random Forest and anatomic brain measures, and brain-based probability of illness has a positive relationship with illness severity and a negative relationship with developmental delays/problems and CNV-based risk.

  1. Diagnosis of Elevator Faults with LS-SVM Based on Optimization by K-CV

    Directory of Open Access Journals (Sweden)

    Zhou Wan

    2015-01-01

    Full Text Available Several common elevator malfunctions were diagnosed with a least square support vector machine (LS-SVM. After acquiring vibration signals of various elevator functions, their energy characteristics and time domain indicators were extracted by theoretically analyzing the optimal wavelet packet, in order to construct a feature vector of malfunctions for identifying causes of the malfunctions as input of LS-SVM. Meanwhile, parameters about LS-SVM were optimized by K-fold cross validation (K-CV. After diagnosing deviated elevator guide rail, deviated shape of guide shoe, abnormal running of tractor, erroneous rope groove of traction sheave, deviated guide wheel, and tension of wire rope, the results suggested that the LS-SVM based on K-CV optimization was one of effective methods for diagnosing elevator malfunctions.

  2. A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

    Science.gov (United States)

    S K, Somasundaram; P, Alli

    2017-11-09

    The main complication of diabetes is Diabetic retinopathy (DR), retinal vascular disease and it leads to the blindness. Regular screening for early DR disease detection is considered as an intensive labor and resource oriented task. Therefore, automatic detection of DR diseases is performed only by using the computational technique is the great solution. An automatic method is more reliable to determine the presence of an abnormality in Fundus images (FI) but, the classification process is poorly performed. Recently, few research works have been designed for analyzing texture discrimination capacity in FI to distinguish the healthy images. However, the feature extraction (FE) process was not performed well, due to the high dimensionality. Therefore, to identify retinal features for DR disease diagnosis and early detection using Machine Learning and Ensemble Classification method, called, Machine Learning Bagging Ensemble Classifier (ML-BEC) is designed. The ML-BEC method comprises of two stages. The first stage in ML-BEC method comprises extraction of the candidate objects from Retinal Images (RI). The candidate objects or the features for DR disease diagnosis include blood vessels, optic nerve, neural tissue, neuroretinal rim, optic disc size, thickness and variance. These features are initially extracted by applying Machine Learning technique called, t-distributed Stochastic Neighbor Embedding (t-SNE). Besides, t-SNE generates a probability distribution across high-dimensional images where the images are separated into similar and dissimilar pairs. Then, t-SNE describes a similar probability distribution across the points in the low-dimensional map. This lessens the Kullback-Leibler divergence among two distributions regarding the locations of the points on the map. The second stage comprises of application of ensemble classifiers to the extracted features for providing accurate analysis of digital FI using machine learning. In this stage, an automatic detection

  3. Classifying cognitive profiles using machine learning with privileged information in Mild Cognitive Impairment

    Directory of Open Access Journals (Sweden)

    Hanin Hamdan Alahmadi

    2016-11-01

    Full Text Available Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-or-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalised Matrix Learning Vector Quantization (GMLVQ classifiers to discriminate patients with Mild Cognitive Impairment (MCI from healthy controls based on their cognitive skills. Further, we adopted a ``Learning with privileged information'' approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants.MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls based on the learning performance and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on the learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition became also relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1 when overall fMRI signal for structured stimuli is

  4. Use of Machine Learning Classifiers and Sensor Data to Detect Neurological Deficit in Stroke Patients.

    Science.gov (United States)

    Park, Eunjeong; Chang, Hyuk-Jae; Nam, Hyo Suk

    2017-04-18

    The pronator drift test (PDT), a neurological examination, is widely used in clinics to measure motor weakness of stroke patients. The aim of this study was to develop a PDT tool with machine learning classifiers to detect stroke symptoms based on quantification of proximal arm weakness using inertial sensors and signal processing. We extracted features of drift and pronation from accelerometer signals of wearable devices on the inner wrists of 16 stroke patients and 10 healthy controls. Signal processing and feature selection approach were applied to discriminate PDT features used to classify stroke patients. A series of machine learning techniques, namely support vector machine (SVM), radial basis function network (RBFN), and random forest (RF), were implemented to discriminate stroke patients from controls with leave-one-out cross-validation. Signal processing by the PDT tool extracted a total of 12 PDT features from sensors. Feature selection abstracted the major attributes from the 12 PDT features to elucidate the dominant characteristics of proximal weakness of stroke patients using machine learning classification. Our proposed PDT classifiers had an area under the receiver operating characteristic curve (AUC) of .806 (SVM), .769 (RBFN), and .900 (RF) without feature selection, and feature selection improves the AUCs to .913 (SVM), .956 (RBFN), and .975 (RF), representing an average performance enhancement of 15.3%. Sensors and machine learning methods can reliably detect stroke signs and quantify proximal arm weakness. Our proposed solution will facilitate pervasive monitoring of stroke patients. ©Eunjeong Park, Hyuk-Jae Chang, Hyo Suk Nam. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 18.04.2017.

  5. Machinery Bearing Fault Diagnosis Using Variational Mode Decomposition and Support Vector Machine as a Classifier

    Science.gov (United States)

    Rama Krishna, K.; Ramachandran, K. I.

    2018-02-01

    Crack propagation is a major cause of failure in rotating machines. It adversely affects the productivity, safety, and the machining quality. Hence, detecting the crack’s severity accurately is imperative for the predictive maintenance of such machines. Fault diagnosis is an established concept in identifying the faults, for observing the non-linear behaviour of the vibration signals at various operating conditions. In this work, we find the classification efficiencies for both original and the reconstructed vibrational signals. The reconstructed signals are obtained using Variational Mode Decomposition (VMD), by splitting the original signal into three intrinsic mode functional components and framing them accordingly. Feature extraction, feature selection and feature classification are the three phases in obtaining the classification efficiencies. All the statistical features from the original signals and reconstructed signals are found out in feature extraction process individually. A few statistical parameters are selected in feature selection process and are classified using the SVM classifier. The obtained results show the best parameters and appropriate kernel in SVM classifier for detecting the faults in bearings. Hence, we conclude that better results were obtained by VMD and SVM process over normal process using SVM. This is owing to denoising and filtering the raw vibrational signals.

  6. Feature Import Vector Machine: A General Classifier with Flexible Feature Selection.

    Science.gov (United States)

    Ghosh, Samiran; Wang, Yazhen

    2015-02-01

    The support vector machine (SVM) and other reproducing kernel Hilbert space (RKHS) based classifier systems are drawing much attention recently due to its robustness and generalization capability. General theme here is to construct classifiers based on the training data in a high dimensional space by using all available dimensions. The SVM achieves huge data compression by selecting only few observations which lie close to the boundary of the classifier function. However when the number of observations are not very large (small n ) but the number of dimensions/features are large (large p ), then it is not necessary that all available features are of equal importance in the classification context. Possible selection of an useful fraction of the available features may result in huge data compression. In this paper we propose an algorithmic approach by means of which such an optimal set of features could be selected. In short, we reverse the traditional sequential observation selection strategy of SVM to that of sequential feature selection. To achieve this we have modified the solution proposed by Zhu and Hastie (2005) in the context of import vector machine (IVM), to select an optimal sub-dimensional model to build the final classifier with sufficient accuracy.

  7. Fault Diagnosis for Distribution Networks Using Enhanced Support Vector Machine Classifier with Classical Multidimensional Scaling

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Cho

    2017-09-01

    Full Text Available In this paper, a new fault diagnosis techniques based on time domain reflectometry (TDR method with pseudo-random binary sequence (PRBS stimulus and support vector machine (SVM classifier has been investigated to recognize the different types of fault in the radial distribution feeders. This novel technique has considered the amplitude of reflected signals and the peaks of cross-correlation (CCR between the reflected and incident wave for generating fault current dataset for SVM. Furthermore, this multi-layer enhanced SVM classifier is combined with classical multidimensional scaling (CMDS feature extraction algorithm and kernel parameter optimization to increase training speed and improve overall classification accuracy. The proposed technique has been tested on a radial distribution feeder to identify ten different types of fault considering 12 input features generated by using Simulink software and MATLAB Toolbox. The success rate of SVM classifier is over 95% which demonstrates the effectiveness and the high accuracy of proposed method.

  8. DisArticle: a web server for SVM-based discrimination of articles on traditional medicine.

    Science.gov (United States)

    Kim, Sang-Kyun; Nam, SeJin; Kim, SangHyun

    2017-01-28

    Much research has been done in Northeast Asia to show the efficacy of traditional medicine. While MEDLINE contains many biomedical articles including those on traditional medicine, it does not categorize those articles by specific research area. The aim of this study was to provide a method that searches for articles only on traditional medicine in Northeast Asia, including traditional Chinese medicine, from among the articles in MEDLINE. This research established an SVM-based classifier model to identify articles on traditional medicine. The TAK + HM classifier, trained with the features of title, abstract, keywords, herbal data, and MeSH, has a precision of 0.954 and a recall of 0.902. In particular, the feature of herbal data significantly increased the performance of the classifier. By using the TAK + HM classifier, a total of about 108,000 articles were discriminated as articles on traditional medicine from among all articles in MEDLINE. We also built a web server called DisArticle ( http://informatics.kiom.re.kr/disarticle ), in which users can search for the articles and obtain statistical data. Because much evidence-based research on traditional medicine has been published in recent years, it has become necessary to search for articles on traditional medicine exclusively in literature databases. DisArticle can help users to search for and analyze the research trends in traditional medicine.

  9. Relevance Vector Machine and Support Vector Machine Classifier Analysis of Scanning Laser Polarimetry Retinal Nerve Fiber Layer Measurements

    Science.gov (United States)

    Bowd, Christopher; Medeiros, Felipe A.; Zhang, Zuohua; Zangwill, Linda M.; Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.; Weinreb, Robert N.; Goldbaum, Michael H.

    2010-01-01

    Purpose To classify healthy and glaucomatous eyes using relevance vector machine (RVM) and support vector machine (SVM) learning classifiers trained on retinal nerve fiber layer (RNFL) thickness measurements obtained by scanning laser polarimetry (SLP). Methods Seventy-two eyes of 72 healthy control subjects (average age = 64.3 ± 8.8 years, visual field mean deviation =−0.71 ± 1.2 dB) and 92 eyes of 92 patients with glaucoma (average age = 66.9 ± 8.9 years, visual field mean deviation =−5.32 ± 4.0 dB) were imaged with SLP with variable corneal compensation (GDx VCC; Laser Diagnostic Technologies, San Diego, CA). RVM and SVM learning classifiers were trained and tested on SLP-determined RNFL thickness measurements from 14 standard parameters and 64 sectors (approximately 5.6° each) obtained in the circumpapillary area under the instrument-defined measurement ellipse (total 78 parameters). Tenfold cross-validation was used to train and test RVM and SVM classifiers on unique subsets of the full 164-eye data set and areas under the receiver operating characteristic (AUROC) curve for the classification of eyes in the test set were generated. AUROC curve results from RVM and SVM were compared to those for 14 SLP software-generated global and regional RNFL thickness parameters. Also reported was the AUROC curve for the GDx VCC software-generated nerve fiber indicator (NFI). Results The AUROC curves for RVM and SVM were 0.90 and 0.91, respectively, and increased to 0.93 and 0.94 when the training sets were optimized with sequential forward and backward selection (resulting in reduced dimensional data sets). AUROC curves for optimized RVM and SVM were significantly larger than those for all individual SLP parameters. The AUROC curve for the NFI was 0.87. Conclusions Results from RVM and SVM trained on SLP RNFL thickness measurements are similar and provide accurate classification of glaucomatous and healthy eyes. RVM may be preferable to SVM, because it provides a

  10. Modeling the milling tool wear by using an evolutionary SVM-based model from milling runs experimental data

    Science.gov (United States)

    Nieto, Paulino José García; García-Gonzalo, Esperanza; Vilán, José Antonio Vilán; Robleda, Abraham Segade

    2015-12-01

    The main aim of this research work is to build a new practical hybrid regression model to predict the milling tool wear in a regular cut as well as entry cut and exit cut of a milling tool. The model was based on Particle Swarm Optimization (PSO) in combination with support vector machines (SVMs). This optimization mechanism involved kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. Bearing this in mind, a PSO-SVM-based model, which is based on the statistical learning theory, was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. To accomplish the objective of this study, the experimental dataset represents experiments from runs on a milling machine under various operating conditions. In this way, data sampled by three different types of sensors (acoustic emission sensor, vibration sensor and current sensor) were acquired at several positions. A second aim is to determine the factors with the greatest bearing on the milling tool flank wear with a view to proposing milling machine's improvements. Firstly, this hybrid PSO-SVM-based regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the flank wear (output variable) and input variables (time, depth of cut, feed, etc.). Indeed, regression with optimal hyperparameters was performed and a determination coefficient of 0.95 was obtained. The agreement of this model with experimental data confirmed its good performance. Secondly, the main advantages of this PSO-SVM-based model are its capacity to produce a simple, easy-to-interpret model, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, the main conclusions of this study are exposed.

  11. Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology

    Directory of Open Access Journals (Sweden)

    Ashley I. Heinson

    2017-02-01

    Full Text Available Reverse vaccinology (RV is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML techniques to distinguish bacterial protective antigens (BPAs from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM classifier that could discriminate BPAs (n = 200 from non-BPAs (n = 200 with an area under the curve (AUC of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.

  12. Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology

    KAUST Repository

    Heinson, Ashley

    2017-02-01

    Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.

  13. SVM Classifier – a comprehensive java interface for support vector machine classification of microarray data

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-01-01

    Motivation Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. Results The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1–BRCA2 samples with RBF kernel of SVM. Conclusion We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at . PMID:17217518

  14. SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-12-12

    Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.

  15. Enhancing the Biological Relevance of Machine Learning Classifiers for Reverse Vaccinology

    KAUST Repository

    Heinson, Ashley; Gunawardana, Yawwani; Moesker, Bastiaan; Hume, Carmen; Vataga, Elena; Hall, Yper; Stylianou, Elena; McShane, Helen; Williams, Ann; Niranjan, Mahesan; Woelk, Christopher

    2017-01-01

    Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.

  16. Classifying spatially heterogeneous wetland communities using machine learning algorithms and spectral and textural features.

    Science.gov (United States)

    Szantoi, Zoltan; Escobedo, Francisco J; Abd-Elrahman, Amr; Pearlstine, Leonard; Dewitt, Bon; Smith, Scot

    2015-05-01

    Mapping of wetlands (marsh vs. swamp vs. upland) is a common remote sensing application.Yet, discriminating between similar freshwater communities such as graminoid/sedge fromremotely sensed imagery is more difficult. Most of this activity has been performed using medium to low resolution imagery. There are only a few studies using highspatial resolutionimagery and machine learning image classification algorithms for mapping heterogeneouswetland plantcommunities. This study addresses this void by analyzing whether machine learning classifierssuch as decisiontrees (DT) and artificial neural networks (ANN) can accurately classify graminoid/sedgecommunities usinghigh resolution aerial imagery and image texture data in the Everglades National Park, Florida.In addition tospectral bands, the normalized difference vegetation index, and first- and second-order texturefeatures derivedfrom the near-infrared band were analyzed. Classifier accuracies were assessed using confusiontablesand the calculated kappa coefficients of the resulting maps. The results indicated that an ANN(multilayerperceptron based on backpropagation) algorithm produced a statistically significantly higheraccuracy(82.04%) than the DT (QUEST) algorithm (80.48%) or the maximum likelihood (80.56%)classifier (αtexture features.

  17. Development of a skateboarding trick classifier using accelerometry and machine learning

    Directory of Open Access Journals (Sweden)

    Nicholas Kluge Corrêa

    Full Text Available Abstract Introduction Skateboarding is one of the most popular cultures in Brazil, with more than 8.5 million skateboarders. Nowadays, the discipline of street skating has gained recognition among other more classical sports and awaits its debut at the Tokyo 2020 Summer Olympic Games. This study aimed to explore the state-of-the-art for inertial measurement unit (IMU use in skateboarding trick detection, and to develop new classification methods using supervised machine learning and artificial neural networks (ANN. Methods State-of-the-art knowledge regarding motion detection in skateboarding was used to generate 543 artificial acceleration signals through signal modeling, corresponding to 181 flat ground tricks divided into five classes (NOLLIE, NSHOV, FLIP, SHOV, OLLIE. The classifier consisted of a multilayer feed-forward neural network created with three layers and a supervised learning algorithm (backpropagation. Results The use of ANNs trained specifically for each measured axis of acceleration resulted in error percentages inferior to 0.05%, with a computational efficiency that makes real-time application possible. Conclusion Machine learning can be a useful technique for classifying skateboarding flat ground tricks, assuming that the classifiers are properly constructed and trained, and the acceleration signals are preprocessed correctly.

  18. Support vector machine as a binary classifier for automated object detection in remotely sensed data

    International Nuclear Information System (INIS)

    Wardaya, P D

    2014-01-01

    In the present paper, author proposes the application of Support Vector Machine (SVM) for the analysis of satellite imagery. One of the advantages of SVM is that, with limited training data, it may generate comparable or even better results than the other methods. The SVM algorithm is used for automated object detection and characterization. Specifically, the SVM is applied in its basic nature as a binary classifier where it classifies two classes namely, object and background. The algorithm aims at effectively detecting an object from its background with the minimum training data. The synthetic image containing noises is used for algorithm testing. Furthermore, it is implemented to perform remote sensing image analysis such as identification of Island vegetation, water body, and oil spill from the satellite imagery. It is indicated that SVM provides the fast and accurate analysis with the acceptable result

  19. Support vector machine as a binary classifier for automated object detection in remotely sensed data

    Science.gov (United States)

    Wardaya, P. D.

    2014-02-01

    In the present paper, author proposes the application of Support Vector Machine (SVM) for the analysis of satellite imagery. One of the advantages of SVM is that, with limited training data, it may generate comparable or even better results than the other methods. The SVM algorithm is used for automated object detection and characterization. Specifically, the SVM is applied in its basic nature as a binary classifier where it classifies two classes namely, object and background. The algorithm aims at effectively detecting an object from its background with the minimum training data. The synthetic image containing noises is used for algorithm testing. Furthermore, it is implemented to perform remote sensing image analysis such as identification of Island vegetation, water body, and oil spill from the satellite imagery. It is indicated that SVM provides the fast and accurate analysis with the acceptable result.

  20. Hybrid Radar Emitter Recognition Based on Rough k-Means Classifier and Relevance Vector Machine

    Science.gov (United States)

    Yang, Zhutian; Wu, Zhilu; Yin, Zhendong; Quan, Taifan; Sun, Hongjian

    2013-01-01

    Due to the increasing complexity of electromagnetic signals, there exists a significant challenge for recognizing radar emitter signals. In this paper, a hybrid recognition approach is presented that classifies radar emitter signals by exploiting the different separability of samples. The proposed approach comprises two steps, namely the primary signal recognition and the advanced signal recognition. In the former step, a novel rough k-means classifier, which comprises three regions, i.e., certain area, rough area and uncertain area, is proposed to cluster the samples of radar emitter signals. In the latter step, the samples within the rough boundary are used to train the relevance vector machine (RVM). Then RVM is used to recognize the samples in the uncertain area; therefore, the classification accuracy is improved. Simulation results show that, for recognizing radar emitter signals, the proposed hybrid recognition approach is more accurate, and presents lower computational complexity than traditional approaches. PMID:23344380

  1. Stacking machine learning classifiers to identify Higgs bosons at the LHC

    International Nuclear Information System (INIS)

    Alves, A.

    2017-01-01

    Machine learning (ML) algorithms have been employed in the problem of classifying signal and background events with high accuracy in particle physics. In this paper, we compare the performance of a widespread ML technique, namely, stacked generalization , against the results of two state-of-art algorithms: (1) a deep neural network (DNN) in the task of discovering a new neutral Higgs boson and (2) a scalable machine learning system for tree boosting, in the Standard Model Higgs to tau leptons channel, both at the 8 TeV LHC. In a cut-and-count analysis, stacking three algorithms performed around 16% worse than DNN but demanding far less computation efforts, however, the same stacking outperforms boosted decision trees. Using the stacked classifiers in a multivariate statistical analysis (MVA), on the other hand, significantly enhances the statistical significance compared to cut-and-count in both Higgs processes, suggesting that combining an ensemble of simpler and faster ML algorithms with MVA tools is a better approach than building a complex state-of-art algorithm for cut-and-count.

  2. Machine learning classifier using abnormal brain network topological metrics in major depressive disorder.

    Science.gov (United States)

    Guo, Hao; Cao, Xiaohua; Liu, Zhifen; Li, Haifang; Chen, Junjie; Zhang, Kerang

    2012-12-05

    Resting state functional brain networks have been widely studied in brain disease research. However, it is currently unclear whether abnormal resting state functional brain network metrics can be used with machine learning for the classification of brain diseases. Resting state functional brain networks were constructed for 28 healthy controls and 38 major depressive disorder patients by thresholding partial correlation matrices of 90 regions. Three nodal metrics were calculated using graph theory-based approaches. Nonparametric permutation tests were then used for group comparisons of topological metrics, which were used as classified features in six different algorithms. We used statistical significance as the threshold for selecting features and measured the accuracies of six classifiers with different number of features. A sensitivity analysis method was used to evaluate the importance of different features. The result indicated that some of the regions exhibited significantly abnormal nodal centralities, including the limbic system, basal ganglia, medial temporal, and prefrontal regions. Support vector machine with radial basis kernel function algorithm and neural network algorithm exhibited the highest average accuracy (79.27 and 78.22%, respectively) with 28 features (Pdisorder is associated with abnormal functional brain network topological metrics and statistically significant nodal metrics can be successfully used for feature selection in classification algorithms.

  3. Bias correction for selecting the minimal-error classifier from many machine learning models.

    Science.gov (United States)

    Ding, Ying; Tang, Shaowu; Liao, Serena G; Jia, Jia; Oesterreich, Steffi; Lin, Yan; Tseng, George C

    2014-11-15

    Supervised machine learning is commonly applied in genomic research to construct a classifier from the training data that is generalizable to predict independent testing data. When test datasets are not available, cross-validation is commonly used to estimate the error rate. Many machine learning methods are available, and it is well known that no universally best method exists in general. It has been a common practice to apply many machine learning methods and report the method that produces the smallest cross-validation error rate. Theoretically, such a procedure produces a selection bias. Consequently, many clinical studies with moderate sample sizes (e.g. n = 30-60) risk reporting a falsely small cross-validation error rate that could not be validated later in independent cohorts. In this article, we illustrated the probabilistic framework of the problem and explored the statistical and asymptotic properties. We proposed a new bias correction method based on learning curve fitting by inverse power law (IPL) and compared it with three existing methods: nested cross-validation, weighted mean correction and Tibshirani-Tibshirani procedure. All methods were compared in simulation datasets, five moderate size real datasets and two large breast cancer datasets. The result showed that IPL outperforms the other methods in bias correction with smaller variance, and it has an additional advantage to extrapolate error estimates for larger sample sizes, a practical feature to recommend whether more samples should be recruited to improve the classifier and accuracy. An R package 'MLbias' and all source files are publicly available. tsenglab.biostat.pitt.edu/software.htm. ctseng@pitt.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Application of Machine Learning Approaches for Classifying Sitting Posture Based on Force and Acceleration Sensors

    Directory of Open Access Journals (Sweden)

    Roland Zemp

    2016-01-01

    Full Text Available Occupational musculoskeletal disorders, particularly chronic low back pain (LBP, are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user’s sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest. Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples. Sixteen force sensor values and the backrest angle were used as the explanatory variables (features for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.

  5. Application of Machine Learning Approaches for Classifying Sitting Posture Based on Force and Acceleration Sensors.

    Science.gov (United States)

    Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio

    2016-01-01

    Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.

  6. Mapping Woodland Cover in the Miombo Ecosystem: A Comparison of Machine Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Courage Kamusoko

    2014-06-01

    Full Text Available Miombo woodlands in Southern Africa are experiencing accelerated changes due to natural and anthropogenic disturbances. In order to formulate sustainable woodland management strategies in the Miombo ecosystem, timely and up-to-date land cover information is required. Recent advances in remote sensing technology have improved land cover mapping in tropical evergreen ecosystems. However, woodland cover mapping remains a challenge in the Miombo ecosystem. The objective of the study was to evaluate the performance of decision trees (DT, random forests (RF, and support vector machines (SVM in the context of improving woodland and non-woodland cover mapping in the Miombo ecosystem in Zimbabwe. We used Multidate Landsat 8 spectral and spatial dependence (Moran’s I variables to map woodland and non-woodland cover. Results show that RF classifier outperformed the SVM and DT classifiers by 4% and 15%, respectively. The RF importance measures show that multidate Landsat 8 spectral and spatial variables had the greatest influence on class-separability in the study area. Therefore, the RF classifier has potential to improve woodland cover mapping in the Miombo ecosystem.

  7. Discovering mammography-based machine learning classifiers for breast cancer diagnosis.

    Science.gov (United States)

    Ramos-Pollán, Raúl; Guevara-López, Miguel Angel; Suárez-Ortega, Cesar; Díaz-Herrero, Guillermo; Franco-Valiente, Jose Miguel; Rubio-Del-Solar, Manuel; González-de-Posada, Naimy; Vaz, Mario Augusto Pires; Loureiro, Joana; Ramos, Isabel

    2012-08-01

    This work explores the design of mammography-based machine learning classifiers (MLC) and proposes a new method to build MLC for breast cancer diagnosis. We massively evaluated MLC configurations to classify features vectors extracted from segmented regions (pathological lesion or normal tissue) on craniocaudal (CC) and/or mediolateral oblique (MLO) mammography image views, providing BI-RADS diagnosis. Previously, appropriate combinations of image processing and normalization techniques were applied to reduce image artifacts and increase mammograms details. The method can be used under different data acquisition circumstances and exploits computer clusters to select well performing MLC configurations. We evaluated 286 cases extracted from the repository owned by HSJ-FMUP, where specialized radiologists segmented regions on CC and/or MLO images (biopsies provided the golden standard). Around 20,000 MLC configurations were evaluated, obtaining classifiers achieving an area under the ROC curve of 0.996 when combining features vectors extracted from CC and MLO views of the same case.

  8. Radiomic Machine Learning Classifiers for Prognostic Biomarkers of Head & Neck Cancer

    Directory of Open Access Journals (Sweden)

    Chintan eParmar

    2015-12-01

    Full Text Available Introduction: Radiomics extracts and mines large number of medical imaging features in a non-invasive and cost-effective way. The underlying assumption of radiomics is that these imaging features quantify phenotypic characteristics of entire tumor. In order to enhance applicability of radiomics in clinical oncology, highly accurate and reliable machine learning approaches are required. In this radiomic study, thirteen feature selection methods and eleven machine learning classification methods were evaluated in terms of their performance and stability for predicting overall survival in head and neck cancer patients. Methods: Two independent head and neck cancer cohorts were investigated. Training cohort HN1 consisted 101 HNSCC patients. Cohort HN2 (n=95 was used for validation. A total of 440 radiomic features were extracted from the segmented tumor regions in CT images. Feature selection and classification methods were compared using an unbiased evaluation framework. Results: We observed that the three feature selection methods MRMR (AUC = 0.69, Stability = 0.66, MIFS (AUC = 0.66, Stability = 0.69, and CIFE (AUC = 0.68, Stability = 0.7 had high prognostic performance and stability. The three classifiers BY (AUC = 0.67, RSD = 11.28, RF (AUC = 0.61, RSD = 7.36, and NN (AUC = 0.62, RSD = 10.52 also showed high prognostic performance and stability. Analysis investigating performance variability indicated that the choice of classification method is the major factor driving the performance variation (29.02% of total variance. Conclusions: Our study identified prognostic and reliable machine learning methods for the prediction of overall survival of head and neck cancer patients. Identification of optimal machine-learning methods for radiomics based prognostic analyses could broaden the scope of radiomics in precision oncology and cancer care.

  9. A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

    Science.gov (United States)

    Bastani, Meysam; Vos, Larissa; Asgarian, Nasimeh; Deschenes, Jean; Graham, Kathryn; Mackey, John; Greiner, Russell

    2013-01-01

    Background Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. Methods To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. Results This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. Conclusions Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions. PMID:24312637

  10. A machine learned classifier that uses gene expression data to accurately predict estrogen receptor status.

    Directory of Open Access Journals (Sweden)

    Meysam Bastani

    Full Text Available BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. METHODS: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. RESULTS: This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. CONCLUSIONS: Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions.

  11. Sensitivity and specificity of machine learning classifiers and spectral domain OCT for the diagnosis of glaucoma.

    Science.gov (United States)

    Vidotti, Vanessa G; Costa, Vital P; Silva, Fabrício R; Resende, Graziela M; Cremasco, Fernanda; Dias, Marcelo; Gomi, Edson S

    2012-06-15

    Purpose. To investigate the sensitivity and specificity of machine learning classifiers (MLC) and spectral domain optical coherence tomography (SD-OCT) for the diagnosis of glaucoma. Methods. Sixty-two patients with early to moderate glaucomatous visual field damage and 48 healthy individuals were included. All subjects underwent a complete ophthalmologic examination, achromatic standard automated perimetry, and RNFL imaging with SD-OCT (Cirrus HD-OCT; Carl Zeiss Meditec, Inc., Dublin, California, USA). Receiver operating characteristic (ROC) curves were obtained for all SD-OCT parameters. Subsequently, the following MLCs were tested: Classification Tree (CTREE), Random Forest (RAN), Bagging (BAG), AdaBoost M1 (ADA), Ensemble Selection (ENS), Multilayer Perceptron (MLP), Radial Basis Function (RBF), Naive-Bayes (NB), and Support Vector Machine (SVM). Areas under the ROC curves (aROCs) obtained for each parameter and each MLC were compared. Results. The mean age was 57.0±9.2 years for healthy individuals and 59.9±9.0 years for glaucoma patients (p=0.103). Mean deviation values were -4.1±2.4 dB for glaucoma patients and -1.5±1.6 dB for healthy individuals (pposition (0.765), and 6 o'clock position (0.754). The aROCs from classifiers varied from 0.785 (ADA) to 0.818 (BAG). The aROC obtained with BAG was not significantly different from the aROC obtained with the best single SD-OCT parameter (p=0.93). Conclusions. The SD-OCT showed good diagnostic accuracy in a group of patients with early glaucoma. In this series, MLCs did not improve the sensitivity and specificity of SD-OCT for the diagnosis of glaucoma.

  12. The Construction of Support Vector Machine Classifier Using the Firefly Algorithm

    Directory of Open Access Journals (Sweden)

    Chih-Feng Chao

    2015-01-01

    Full Text Available The setting of parameters in the support vector machines (SVMs is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM. This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI, machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM. The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy.

  13. A comparison of rule-based and machine learning approaches for classifying patient portal messages.

    Science.gov (United States)

    Cronin, Robert M; Fabbri, Daniel; Denny, Joshua C; Rosenbloom, S Trent; Jackson, Gretchen Purcell

    2017-09-01

    Secure messaging through patient portals is an increasingly popular way that consumers interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care. We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag of words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with area under the receiver-operator curve (AUC). Portal messages often contain more than one type of communication. To predict all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers. The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). The best performing classification approach of classifiers for individual communication subtypes was random forests for Logistical-Contact Information (AUC: 0.963). The Jaccard Indices by approach were: basic classifier, Jaccard Index: 0.674; Naïve Bayes, Jaccard Index: 0.799; random forests, Jaccard Index: 0.859; and logistic regression, Jaccard

  14. Machine-learning classifiers applied to habitat and geological substrate mapping offshore South Carolina

    Science.gov (United States)

    White, S. M.; Maschmeyer, C.; Anderson, E.; Knapp, C. C.; Brantley, D.

    2017-12-01

    classification as the classifier confused flat parts with relatively flat sand data. 100% of testing data representing rocky portions of the seafloor were correctly classified. The use of machine-learning classifiers to determine seafloor-type provides a new solution to habitat mapping and offshore engineering problems.

  15. Application of support vector machine classifiers to preoperative risk stratification with myocardial perfusion scintigraphy

    International Nuclear Information System (INIS)

    Kasamatsu, Tomotaka; Hashimoto, Jun; Nakahara, Tadaki; Bai, Jingming; Kitamura, Naoto; Kubo, Atsushi; Iyatomi, Hitoshi; Ogawa, Koichi

    2008-01-01

    Myocardial perfusion single-photon emission computed tomography (SPECT) has been used for risk stratification before non-cardiac surgery. However, few authors have used mathematical models for evaluating the likelihood of perioperative cardiac events. This retrospective cohort study collected data of 1,351 patients referred for SPECT before non-cardiac surgery. We generated binary classifiers using support vector machine (SVM) and conventional linear models for predicting perioperative cardiac events. We used clinical and surgical risk, and SPECT findings as input data, and the occurrence of all and hard cardiac events as output data. The area under the receiver-operating characteristic curve (AUC) was calculated for assessing the prediction accuracy. The AUC values were 0.884 and 0.748 in the SVM and linear models, respectively in predicting all cardiac events with clinical and surgical risk, and SPECT variables. The values were 0.861 (SVM) and 0.677 (linear) when not using SPECT data as input. In hard events, the AUC values were 0.892 (SVM) and 0.864 (linear) with SPECT, and 0.867 (SVM) and 0.768 (linear) without SPECT. The SVM was superior to the linear model in risk stratification. We also found an incremental prognostic value of SPECT results over information about clinical and surgical risk. (author)

  16. Perbandingan Simple Logistic Classifier dengan Support Vector Machine dalam Memprediksi Kemenangan Atlet

    Directory of Open Access Journals (Sweden)

    Ednawati Rainarli

    2017-10-01

    Full Text Available A coach must be able to select which athlete has a good prospect of winning a game. There are a lot of aspects which influence the athlete in winning a game, so it's not easy by coach to decide it.This research would compare Simple Logistic Classifier (SLC and Support Vector Machine (SVM usage applied to predict winning game of athlete based on health and physical condition record. The data get from 28 sports. The accuracy of SLC and SVM are 80% and 88% meanwhile processing times of SLC and SVM method are 1.6 seconds dan 0.2 seconds.The result shows the SVM usage superior to the SLC both of speed process and the value of accuracy. There were also testing of 24 features used in the classifications process. Based on the test, features selection process can cause decreasing the accuracy value. This result concludes that all features used in this research influence the determination of a victory athletes prediction.

  17. An MR Brain Images Classifier System via Particle Swarm Optimization and Kernel Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Yudong Zhang

    2013-01-01

    Full Text Available Automated abnormal brain detection is extremely of importance for clinical diagnosis. Over last decades numerous methods had been presented. In this paper, we proposed a novel hybrid system to classify a given MR brain image as either normal or abnormal. The proposed method first employed digital wavelet transform to extract features then used principal component analysis (PCA to reduce the feature space. Afterwards, we constructed a kernel support vector machine (KSVM with RBF kernel, using particle swarm optimization (PSO to optimize the parameters C and σ. Fivefold cross-validation was utilized to avoid overfitting. In the experimental procedure, we created a 90 images dataset brain downloaded from Harvard Medical School website. The abnormal brain MR images consist of the following diseases: glioma, metastatic adenocarcinoma, metastatic bronchogenic carcinoma, meningioma, sarcoma, Alzheimer, Huntington, motor neuron disease, cerebral calcinosis, Pick’s disease, Alzheimer plus visual agnosia, multiple sclerosis, AIDS dementia, Lyme encephalopathy, herpes encephalitis, Creutzfeld-Jakob disease, and cerebral toxoplasmosis. The 5-folded cross-validation classification results showed that our method achieved 97.78% classification accuracy, higher than 86.22% by BP-NN and 91.33% by RBF-NN. For the parameter selection, we compared PSO with those of random selection method. The results showed that the PSO is more effective to build optimal KSVM.

  18. An SVM Based Approach for the Analysis Of Mammography Images

    Science.gov (United States)

    Gan, X.; Kapsokalivas, L.; Skaliotis, A.; Steinhöfel, K.; Tangaro, S.

    2007-09-01

    Mammography is among the most popular imaging techniques used in the diagnosis of breast cancer. Nevertheless distinguishing between healthy and ill images is hard even for an experienced radiologist, because a single image usually includes several regions of interest (ROIs). The hardness of this classification problem along with the substantial amount of data, gathered from patients' medical history, motivates the use of a machine learning approach as part of a CAD (Computer Aided Detection) tool, aiming to assist radiologists in the characterization of mammography images. Specifically, our approach involves: i) the ROI extraction, ii) the Feature Vector extraction, iii) the Support Vector Machine (SVM) classification of ROIs and iv) the characterization of the whole image. We evaluate the performance of our approach in terms of the SVM's training and testing error and in terms of ROI specificity—sensitivity. The results show a relation between the number of features used and the SVM's performance.

  19. An SVM Based Approach for the Analysis Of Mammography Images

    International Nuclear Information System (INIS)

    Gan, X.; Kapsokalivas, L.; Skaliotis, A.; Steinhoefel, K.; Tangaro, S.

    2007-01-01

    Mammography is among the most popular imaging techniques used in the diagnosis of breast cancer. Nevertheless distinguishing between healthy and ill images is hard even for an experienced radiologist, because a single image usually includes several regions of interest (ROIs). The hardness of this classification problem along with the substantial amount of data, gathered from patients' medical history, motivates the use of a machine learning approach as part of a CAD (Computer Aided Detection) tool, aiming to assist radiologists in the characterization of mammography images. Specifically, our approach involves: i) the ROI extraction, ii) the Feature Vector extraction, iii) the Support Vector Machine (SVM) classification of ROIs and iv) the characterization of the whole image. We evaluate the performance of our approach in terms of the SVM's training and testing error and in terms of ROI specificity - sensitivity. The results show a relation between the number of features used and the SVM's performance

  20. A SVM-based quantitative fMRI method for resting-state functional network detection.

    Science.gov (United States)

    Song, Xiaomu; Chen, Nan-kuei

    2014-09-01

    Resting-state functional magnetic resonance imaging (fMRI) aims to measure baseline neuronal connectivity independent of specific functional tasks and to capture changes in the connectivity due to neurological diseases. Most existing network detection methods rely on a fixed threshold to identify functionally connected voxels under the resting state. Due to fMRI non-stationarity, the threshold cannot adapt to variation of data characteristics across sessions and subjects, and generates unreliable mapping results. In this study, a new method is presented for resting-state fMRI data analysis. Specifically, the resting-state network mapping is formulated as an outlier detection process that is implemented using one-class support vector machine (SVM). The results are refined by using a spatial-feature domain prototype selection method and two-class SVM reclassification. The final decision on each voxel is made by comparing its probabilities of functionally connected and unconnected instead of a threshold. Multiple features for resting-state analysis were extracted and examined using an SVM-based feature selection method, and the most representative features were identified. The proposed method was evaluated using synthetic and experimental fMRI data. A comparison study was also performed with independent component analysis (ICA) and correlation analysis. The experimental results show that the proposed method can provide comparable or better network detection performance than ICA and correlation analysis. The method is potentially applicable to various resting-state quantitative fMRI studies. Copyright © 2014 Elsevier Inc. All rights reserved.

  1. Study of Machine-Learning Classifier and Feature Set Selection for Intent Classification of Korean Tweets about Food Safety

    Directory of Open Access Journals (Sweden)

    Yeom, Ha-Neul

    2014-09-01

    Full Text Available In recent years, several studies have proposed making use of the Twitter micro-blogging service to track various trends in online media and discussion. In this study, we specifically examine the use of Twitter to track discussions of food safety in the Korean language. Given the irregularity of keyword use in most tweets, we focus on optimistic machine-learning and feature set selection to classify collected tweets. We build the classifier model using Naive Bayes & Naive Bayes Multinomial, Support Vector Machine, and Decision Tree Algorithms, all of which show good performance. To select an optimum feature set, we construct a basic feature set as a standard for performance comparison, so that further test feature sets can be evaluated. Experiments show that precision and F-measure performance are best when using a Naive Bayes Multinomial classifier model with a test feature set defined by extracting Substantive, Predicate, Modifier, and Interjection parts of speech.

  2. Glaucomatous patterns in Frequency Doubling Technology (FDT) perimetry data identified by unsupervised machine learning classifiers.

    Science.gov (United States)

    Bowd, Christopher; Weinreb, Robert N; Balasubramanian, Madhusudhanan; Lee, Intae; Jang, Giljin; Yousefi, Siamak; Zangwill, Linda M; Medeiros, Felipe A; Girkin, Christopher A; Liebmann, Jeffrey M; Goldbaum, Michael H

    2014-01-01

    The variational Bayesian independent component analysis-mixture model (VIM), an unsupervised machine-learning classifier, was used to automatically separate Matrix Frequency Doubling Technology (FDT) perimetry data into clusters of healthy and glaucomatous eyes, and to identify axes representing statistically independent patterns of defect in the glaucoma clusters. FDT measurements were obtained from 1,190 eyes with normal FDT results and 786 eyes with abnormal FDT results from the UCSD-based Diagnostic Innovations in Glaucoma Study (DIGS) and African Descent and Glaucoma Evaluation Study (ADAGES). For all eyes, VIM input was 52 threshold test points from the 24-2 test pattern, plus age. FDT mean deviation was -1.00 dB (S.D. = 2.80 dB) and -5.57 dB (S.D. = 5.09 dB) in FDT-normal eyes and FDT-abnormal eyes, respectively (p<0.001). VIM identified meaningful clusters of FDT data and positioned a set of statistically independent axes through the mean of each cluster. The optimal VIM model separated the FDT fields into 3 clusters. Cluster N contained primarily normal fields (1109/1190, specificity 93.1%) and clusters G1 and G2 combined, contained primarily abnormal fields (651/786, sensitivity 82.8%). For clusters G1 and G2 the optimal number of axes were 2 and 5, respectively. Patterns automatically generated along axes within the glaucoma clusters were similar to those known to be indicative of glaucoma. Fields located farther from the normal mean on each glaucoma axis showed increasing field defect severity. VIM successfully separated FDT fields from healthy and glaucoma eyes without a priori information about class membership, and identified familiar glaucomatous patterns of loss.

  3. Comparison of Random Forest and Support Vector Machine classifiers using UAV remote sensing imagery

    Science.gov (United States)

    Piragnolo, Marco; Masiero, Andrea; Pirotti, Francesco

    2017-04-01

    Since recent years surveying with unmanned aerial vehicles (UAV) is getting a great amount of attention due to decreasing costs, higher precision and flexibility of usage. UAVs have been applied for geomorphological investigations, forestry, precision agriculture, cultural heritage assessment and for archaeological purposes. It can be used for land use and land cover classification (LULC). In literature, there are two main types of approaches for classification of remote sensing imagery: pixel-based and object-based. On one hand, pixel-based approach mostly uses training areas to define classes and respective spectral signatures. On the other hand, object-based classification considers pixels, scale, spatial information and texture information for creating homogeneous objects. Machine learning methods have been applied successfully for classification, and their use is increasing due to the availability of faster computing capabilities. The methods learn and train the model from previous computation. Two machine learning methods which have given good results in previous investigations are Random Forest (RF) and Support Vector Machine (SVM). The goal of this work is to compare RF and SVM methods for classifying LULC using images collected with a fixed wing UAV. The processing chain regarding classification uses packages in R, an open source scripting language for data analysis, which provides all necessary algorithms. The imagery was acquired and processed in November 2015 with cameras providing information over the red, blue, green and near infrared wavelength reflectivity over a testing area in the campus of Agripolis, in Italy. Images were elaborated and ortho-rectified through Agisoft Photoscan. The ortho-rectified image is the full data set, and the test set is derived from partial sub-setting of the full data set. Different tests have been carried out, using a percentage from 2 % to 20 % of the total. Ten training sets and ten validation sets are obtained from

  4. Classifying injury narratives of large administrative databases for surveillance-A practical approach combining machine learning ensembles and human review.

    Science.gov (United States)

    Marucci-Wellman, Helen R; Corns, Helen L; Lehto, Mark R

    2017-01-01

    Injury narratives are now available real time and include useful information for injury surveillance and prevention. However, manual classification of the cause or events leading to injury found in large batches of narratives, such as workers compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naïve Bayes, Single word and Bi-gram models, Support Vector Machine and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness event leading to injury classifications for a large workers compensation database. These algorithms are known to do well classifying narrative text and are fairly easy to implement with off-the-shelf software packages such as Python. We propose human-machine learning ensemble approaches which maximize the power and accuracy of the algorithms for machine-assigned codes and allow for strategic filtering of rare, emerging or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier vs. agreement between algorithms. Regularized Logistic Regression (LR) was the best performing algorithm alone. Using this algorithm and filtering out the bottom 30% of predictions for manual review resulted in high accuracy (overall sensitivity/positive predictive value of 0.89) of the final machine-human coded dataset. The best pairings of algorithms included Naïve Bayes with Support Vector Machine whereby the triple ensemble NB SW =NB BI-GRAM =SVM had very high performance (0.93 overall sensitivity/positive predictive value and high accuracy (i.e. high sensitivity and positive predictive values)) across both large and small categories leaving 41% of the narratives for manual review. Integrating LR into this ensemble mix improved performance only slightly. For large administrative datasets we propose incorporation of methods based on human-machine pairings such as

  5. In-Vivo Imaging of Cell Migration Using Contrast Enhanced MRI and SVM Based Post-Processing.

    Science.gov (United States)

    Weis, Christian; Hess, Andreas; Budinsky, Lubos; Fabry, Ben

    2015-01-01

    The migration of cells within a living organism can be observed with magnetic resonance imaging (MRI) in combination with iron oxide nanoparticles as an intracellular contrast agent. This method, however, suffers from low sensitivity and specificty. Here, we developed a quantitative non-invasive in-vivo cell localization method using contrast enhanced multiparametric MRI and support vector machines (SVM) based post-processing. Imaging phantoms consisting of agarose with compartments containing different concentrations of cancer cells labeled with iron oxide nanoparticles were used to train and evaluate the SVM for cell localization. From the magnitude and phase data acquired with a series of T2*-weighted gradient-echo scans at different echo-times, we extracted features that are characteristic for the presence of superparamagnetic nanoparticles, in particular hyper- and hypointensities, relaxation rates, short-range phase perturbations, and perturbation dynamics. High detection quality was achieved by SVM analysis of the multiparametric feature-space. The in-vivo applicability was validated in animal studies. The SVM detected the presence of iron oxide nanoparticles in the imaging phantoms with high specificity and sensitivity with a detection limit of 30 labeled cells per mm3, corresponding to 19 μM of iron oxide. As proof-of-concept, we applied the method to follow the migration of labeled cancer cells injected in rats. The combination of iron oxide labeled cells, multiparametric MRI and a SVM based post processing provides high spatial resolution, specificity, and sensitivity, and is therefore suitable for non-invasive in-vivo cell detection and cell migration studies over prolonged time periods.

  6. SVM-Based Control System for a Robot Manipulator

    Directory of Open Access Journals (Sweden)

    Foudil Abdessemed

    2012-12-01

    Full Text Available Real systems are usually non-linear, ill-defined, have variable parameters and are subject to external disturbances. Modelling these systems is often an approximation of the physical phenomena involved. However, it is from this approximate system of representation that we propose - in this paper - to build a robust control, in the sense that it must ensure low sensitivity towards parameters, uncertainties, variations and external disturbances. The computed torque method is a well-established robot control technique which takes account of the dynamic coupling between the robot links. However, its main disadvantage lies on the assumption of an exactly known dynamic model which is not realizable in practice. To overcome this issue, we propose the estimation of the dynamics model of the nonlinear system with a machine learning regression method. The output of this regressor is used in conjunction with a PD controller to achieve the tracking trajectory task of a robot manipulator. In cases where some of the parameters of the plant undergo a change in their values, poor performance may result. To cope with this drawback, a fuzzy precompensator is inserted to reinforce the SVM computed torque-based controller and avoid any deterioration. The theory is developed and the simulation results are carried out on a two-degree of freedom robot manipulator to demonstrate the validity of the proposed approach.

  7. Recognition of acute lymphoblastic leukemia cells in microscopic images using k-means clustering and support vector machine classifier.

    Science.gov (United States)

    Amin, Morteza Moradi; Kermani, Saeed; Talebi, Ardeshir; Oghli, Mostafa Ghelich

    2015-01-01

    Acute lymphoblastic leukemia is the most common form of pediatric cancer which is categorized into three L1, L2, and L3 and could be detected through screening of blood and bone marrow smears by pathologists. Due to being time-consuming and tediousness of the procedure, a computer-based system is acquired for convenient detection of Acute lymphoblastic leukemia. Microscopic images are acquired from blood and bone marrow smears of patients with Acute lymphoblastic leukemia and normal cases. After applying image preprocessing, cells nuclei are segmented by k-means algorithm. Then geometric and statistical features are extracted from nuclei and finally these cells are classified to cancerous and noncancerous cells by means of support vector machine classifier with 10-fold cross validation. These cells are also classified into their sub-types by multi-Support vector machine classifier. Classifier is evaluated by these parameters: Sensitivity, specificity, and accuracy which values for cancerous and noncancerous cells 98%, 95%, and 97%, respectively. These parameters are also used for evaluation of cell sub-types which values in mean 84.3%, 97.3%, and 95.6%, respectively. The results show that proposed algorithm could achieve an acceptable performance for the diagnosis of Acute lymphoblastic leukemia and its sub-types and can be used as an assistant diagnostic tool for pathologists.

  8. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    Energy Technology Data Exchange (ETDEWEB)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug; Seo, Joon Beom [Department of Radiology, University of Ulsan College of Medicine, 388-1 Pungnap2-dong, Songpa-gu, Seoul 138-736 (Korea, Republic of); Lynch, David A. [Department of Radiology, National Jewish Medical and Research Center, Denver, Colorado 80206 (United States)

    2013-05-15

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For

  9. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    International Nuclear Information System (INIS)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug; Seo, Joon Beom; Lynch, David A.

    2013-01-01

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs—normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI

  10. CLASSIFYING STRUCTURES IN THE INTERSTELLAR MEDIUM WITH SUPPORT VECTOR MACHINES: THE G16.05-0.57 SUPERNOVA REMNANT

    International Nuclear Information System (INIS)

    Beaumont, Christopher N.; Williams, Jonathan P.; Goodman, Alyssa A.

    2011-01-01

    We apply Support Vector Machines (SVMs)-a machine learning algorithm-to the task of classifying structures in the interstellar medium (ISM). As a case study, we present a position-position-velocity (PPV) data cube of 12 CO J = 3-2 emission toward G16.05-0.57, a supernova remnant that lies behind the M17 molecular cloud. Despite the fact that these two objects partially overlap in PPV space, the two structures can easily be distinguished by eye based on their distinct morphologies. The SVM algorithm is able to infer these morphological distinctions, and associate individual pixels with each object at >90% accuracy. This case study suggests that similar techniques may be applicable to classifying other structures in the ISM-a task that has thus far proven difficult to automate.

  11. Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information

    OpenAIRE

    Wei-Jong Yang; Wei-Hau Du; Pau-Choo Chang; Jar-Ferr Yang; Pi-Hsia Hung

    2017-01-01

    The demands of smart visual thing recognition in various devices have been increased rapidly for daily smart production, living and learning systems in recent years. This paper proposed a visual thing recognition system, which combines binary scale-invariant feature transform (SIFT), bag of words model (BoW), and support vector machine (SVM) by using color information. Since the traditional SIFT features and SVM classifiers only use the gray information, color information is still an importan...

  12. SVM-based feature extraction and classification of aflatoxin contaminated corn using fluorescence hyperspectral data

    Science.gov (United States)

    Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...

  13. Classifying chemical mode of action using gene networks and machine learning: a case study with the herbicide linuron.

    Science.gov (United States)

    Ornostay, Anna; Cowie, Andrew M; Hindle, Matthew; Baker, Christopher J O; Martyniuk, Christopher J

    2013-12-01

    The herbicide linuron (LIN) is an endocrine disruptor with an anti-androgenic mode of action. The objectives of this study were to (1) improve knowledge of androgen and anti-androgen signaling in the teleostean ovary and to (2) assess the ability of gene networks and machine learning to classify LIN as an anti-androgen using transcriptomic data. Ovarian explants from vitellogenic fathead minnows (FHMs) were exposed to three concentrations of either 5α-dihydrotestosterone (DHT), flutamide (FLUT), or LIN for 12h. Ovaries exposed to DHT showed a significant increase in 17β-estradiol (E2) production while FLUT and LIN had no effect on E2. To improve understanding of androgen receptor signaling in the ovary, a reciprocal gene expression network was constructed for DHT and FLUT using pathway analysis and these data suggested that steroid metabolism, translation, and DNA replication are processes regulated through AR signaling in the ovary. Sub-network enrichment analysis revealed that FLUT and LIN shared more regulated gene networks in common compared to DHT. Using transcriptomic datasets from different fish species, machine learning algorithms classified LIN successfully with other anti-androgens. This study advances knowledge regarding molecular signaling cascades in the ovary that are responsive to androgens and anti-androgens and provides proof of concept that gene network analysis and machine learning can classify priority chemicals using experimental transcriptomic data collected from different fish species. © 2013.

  14. Linear SVM-Based Android Malware Detection for Reliable IoT Services

    Directory of Open Access Journals (Sweden)

    Hyo-Sik Ham

    2014-01-01

    Full Text Available Current many Internet of Things (IoT services are monitored and controlled through smartphone applications. By combining IoT with smartphones, many convenient IoT services have been provided to users. However, there are adverse underlying effects in such services including invasion of privacy and information leakage. In most cases, mobile devices have become cluttered with important personal user information as various services and contents are provided through them. Accordingly, attackers are expanding the scope of their attacks beyond the existing PC and Internet environment into mobile devices. In this paper, we apply a linear support vector machine (SVM to detect Android malware and compare the malware detection performance of SVM with that of other machine learning classifiers. Through experimental validation, we show that the SVM outperforms other machine learning classifiers.

  15. Promises, pitfalls, and basic guidelines for applying machine learning classifiers to psychiatric imaging data, with autism as an example

    Directory of Open Access Journals (Sweden)

    Pegah Kassraian Fard

    2016-12-01

    Full Text Available Most psychiatric disorders are associated with subtle alterations in brain function and are subject to large inter-individual differences. Typically the diagnosis of these disorders requires time-consuming behavioral assessments administered by a multi-disciplinary team with extensive experience. Whilst the application of machine learning classification methods (ML classifiers to neuroimaging data has the potential to speed and simplify diagnosis of psychiatric disorders, the methods, assumptions, and analytical steps are not currently opaque and accessible to researchers and clinicians outside the field. In this paper, we describe potential classification pipelines for Autism Spectrum Disorder, as an example of a psychiatric disorder. The analyses are based on resting-state fMRI data derived from a multi-site data repository (ABIDE. We compare several popular ML classifiers such as support vector machines, neural networks and regression approaches, among others. In a tutorial style, written to be equally accessible for researchers and clinicians, we explain the rationale of each classification approach, clarify the underlying assumptions, and discuss possible pitfalls and challenges. We also provide the data as well as the MATLAB code we used to achieve our results. We show that out-of-the-box ML classifiers can yield classification accuracies of about 60-70%. Finally, we discuss how classification accuracy can be further improved, and we mention methodological developments that are needed to pave the way for the use of ML classifiers in clinical practice.

  16. Use of a machine learning algorithm to classify expertise: analysis of hand motion patterns during a simulated surgical task.

    Science.gov (United States)

    Watson, Robert A

    2014-08-01

    To test the hypothesis that machine learning algorithms increase the predictive power to classify surgical expertise using surgeons' hand motion patterns. In 2012 at the University of North Carolina at Chapel Hill, 14 surgical attendings and 10 first- and second-year surgical residents each performed two bench model venous anastomoses. During the simulated tasks, the participants wore an inertial measurement unit on the dorsum of their dominant (right) hand to capture their hand motion patterns. The pattern from each bench model task performed was preprocessed into a symbolic time series and labeled as expert (attending) or novice (resident). The labeled hand motion patterns were processed and used to train a Support Vector Machine (SVM) classification algorithm. The trained algorithm was then tested for discriminative/predictive power against unlabeled (blinded) hand motion patterns from tasks not used in the training. The Lempel-Ziv (LZ) complexity metric was also measured from each hand motion pattern, with an optimal threshold calculated to separately classify the patterns. The LZ metric classified unlabeled (blinded) hand motion patterns into expert and novice groups with an accuracy of 70% (sensitivity 64%, specificity 80%). The SVM algorithm had an accuracy of 83% (sensitivity 86%, specificity 80%). The results confirmed the hypothesis. The SVM algorithm increased the predictive power to classify blinded surgical hand motion patterns into expert versus novice groups. With further development, the system used in this study could become a viable tool for low-cost, objective assessment of procedural proficiency in a competency-based curriculum.

  17. Applying Support Vector Machine in classifying satellite images for the assessment of urban sprawl

    Science.gov (United States)

    murgante, Beniamino; Nolè, Gabriele; Lasaponara, Rosa; Lanorte, Antonio; Calamita, Giuseppe

    2013-04-01

    In last decades the spreading of new buildings, road infrastructures and a scattered proliferation of houses in zones outside urban areas, produced a countryside urbanization with no rules, consuming soils and impoverishing the landscape. Such a phenomenon generated a huge environmental impact, diseconomies and a decrease in life quality. This study analyzes processes concerning land use change, paying particular attention to urban sprawl phenomenon. The application is based on the integration of Geographic Information Systems and Remote Sensing adopting open source technologies. The objective is to understand size distribution and dynamic expansion of urban areas in order to define a methodology useful to both identify and monitor the phenomenon. In order to classify "urban" pixels, over time monitoring of settlements spread, understanding trends of artificial territories, classifications of satellite images at different dates have been realized. In order to obtain these classifications, supervised classification algorithms have been adopted. More particularly, Support Vector Machine (SVM) learning algorithm has been applied to multispectral remote data. One of the more interesting features in SVM is the possibility to obtain good results also adopting few classification pixels of training areas. SVM has several interesting features, such as the capacity to obtain good results also adopting few classification pixels of training areas, a high possibility of configuration parameters and the ability to discriminate pixels with similar spectral responses. Multi-temporal ASTER satellite data at medium resolution have been adopted because are very suitable in evaluating such phenomena. The application is based on the integration of Geographic Information Systems and Remote Sensing technologies by means of open source software. Tools adopted in managing and processing data are GRASS GIS, Quantum GIS and R statistical project. The area of interest is located south of Bari

  18. Supervised Machine Learning Algorithms Can Classify Open-Text Feedback of Doctor Performance With Human-Level Accuracy

    Science.gov (United States)

    2017-01-01

    Background Machine learning techniques may be an effective and efficient way to classify open-text reports on doctor’s activity for the purposes of quality assurance, safety, and continuing professional development. Objective The objective of the study was to evaluate the accuracy of machine learning algorithms trained to classify open-text reports of doctor performance and to assess the potential for classifications to identify significant differences in doctors’ professional performance in the United Kingdom. Methods We used 1636 open-text comments (34,283 words) relating to the performance of 548 doctors collected from a survey of clinicians’ colleagues using the General Medical Council Colleague Questionnaire (GMC-CQ). We coded 77.75% (1272/1636) of the comments into 5 global themes (innovation, interpersonal skills, popularity, professionalism, and respect) using a qualitative framework. We trained 8 machine learning algorithms to classify comments and assessed their performance using several training samples. We evaluated doctor performance using the GMC-CQ and compared scores between doctors with different classifications using t tests. Results Individual algorithm performance was high (range F score=.68 to .83). Interrater agreement between the algorithms and the human coder was highest for codes relating to “popular” (recall=.97), “innovator” (recall=.98), and “respected” (recall=.87) codes and was lower for the “interpersonal” (recall=.80) and “professional” (recall=.82) codes. A 10-fold cross-validation demonstrated similar performance in each analysis. When combined together into an ensemble of multiple algorithms, mean human-computer interrater agreement was .88. Comments that were classified as “respected,” “professional,” and “interpersonal” related to higher doctor scores on the GMC-CQ compared with comments that were not classified (P.05). Conclusions Machine learning algorithms can classify open-text feedback

  19. Supervised Machine Learning Algorithms Can Classify Open-Text Feedback of Doctor Performance With Human-Level Accuracy.

    Science.gov (United States)

    Gibbons, Chris; Richards, Suzanne; Valderas, Jose Maria; Campbell, John

    2017-03-15

    Machine learning techniques may be an effective and efficient way to classify open-text reports on doctor's activity for the purposes of quality assurance, safety, and continuing professional development. The objective of the study was to evaluate the accuracy of machine learning algorithms trained to classify open-text reports of doctor performance and to assess the potential for classifications to identify significant differences in doctors' professional performance in the United Kingdom. We used 1636 open-text comments (34,283 words) relating to the performance of 548 doctors collected from a survey of clinicians' colleagues using the General Medical Council Colleague Questionnaire (GMC-CQ). We coded 77.75% (1272/1636) of the comments into 5 global themes (innovation, interpersonal skills, popularity, professionalism, and respect) using a qualitative framework. We trained 8 machine learning algorithms to classify comments and assessed their performance using several training samples. We evaluated doctor performance using the GMC-CQ and compared scores between doctors with different classifications using t tests. Individual algorithm performance was high (range F score=.68 to .83). Interrater agreement between the algorithms and the human coder was highest for codes relating to "popular" (recall=.97), "innovator" (recall=.98), and "respected" (recall=.87) codes and was lower for the "interpersonal" (recall=.80) and "professional" (recall=.82) codes. A 10-fold cross-validation demonstrated similar performance in each analysis. When combined together into an ensemble of multiple algorithms, mean human-computer interrater agreement was .88. Comments that were classified as "respected," "professional," and "interpersonal" related to higher doctor scores on the GMC-CQ compared with comments that were not classified (P.05). Machine learning algorithms can classify open-text feedback of doctor performance into multiple themes derived by human raters with high

  20. Comparing statistical and machine learning classifiers: alternatives for predictive modeling in human factors research.

    Science.gov (United States)

    Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann

    2003-01-01

    Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.

  1. Using machine learning to classify image features from canine pelvic radiographs

    DEFF Research Database (Denmark)

    McEvoy, Fintan; Amigo Rubio, Jose Manuel

    2013-01-01

    As the number of images per study increases in the field of veterinary radiology, there is a growing need for computer-assisted diagnosis techniques. The purpose of this study was to evaluate two machine learning statistical models for automatically identifying image regions that contain the canine...

  2. Exploring Machine Learning Techniques Using Patient Interactions in Online Health Forums to Classify Drug Safety

    Science.gov (United States)

    Chee, Brant Wah Kwong

    2011-01-01

    This dissertation explores the use of personal health messages collected from online message forums to predict drug safety using natural language processing and machine learning techniques. Drug safety is defined as any drug with an active safety alert from the US Food and Drug Administration (FDA). It is believed that this is the first…

  3. Using supervised machine learning to code policy issues: Can classifiers generalize across contexts?

    NARCIS (Netherlands)

    Burscher, B.; Vliegenthart, R.; de Vreese, C.H.

    2015-01-01

    Content analysis of political communication usually covers large amounts of material and makes the study of dynamics in issue salience a costly enterprise. In this article, we present a supervised machine learning approach for the automatic coding of policy issues, which we apply to news articles

  4. Machine Learning Methods for Classifying Human Physical Activity from On-Body Accelerometers

    Science.gov (United States)

    Mannini, Andrea; Sabatini, Angelo Maria

    2010-01-01

    The use of on-body wearable sensors is widespread in several academic and industrial domains. Of great interest are their applications in ambulatory monitoring and pervasive computing systems; here, some quantitative analysis of human motion and its automatic classification are the main computational tasks to be pursued. In this paper, we discuss how human physical activity can be classified using on-body accelerometers, with a major emphasis devoted to the computational algorithms employed for this purpose. In particular, we motivate our current interest for classifiers based on Hidden Markov Models (HMMs). An example is illustrated and discussed by analysing a dataset of accelerometer time series. PMID:22205862

  5. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal.

    Science.gov (United States)

    Hosseinifard, Behshad; Moradi, Mohammad Hassan; Rostami, Reza

    2013-03-01

    Diagnosing depression in the early curable stages is very important and may even save the life of a patient. In this paper, we study nonlinear analysis of EEG signal for discriminating depression patients and normal controls. Forty-five unmedicated depressed patients and 45 normal subjects were participated in this study. Power of four EEG bands and four nonlinear features including detrended fluctuation analysis (DFA), higuchi fractal, correlation dimension and lyapunov exponent were extracted from EEG signal. For discriminating the two groups, k-nearest neighbor, linear discriminant analysis and logistic regression as the classifiers are then used. Highest classification accuracy of 83.3% is obtained by correlation dimension and LR classifier among other nonlinear features. For further improvement, all nonlinear features are combined and applied to classifiers. A classification accuracy of 90% is achieved by all nonlinear features and LR classifier. In all experiments, genetic algorithm is employed to select the most important features. The proposed technique is compared and contrasted with the other reported methods and it is demonstrated that by combining nonlinear features, the performance is enhanced. This study shows that nonlinear analysis of EEG can be a useful method for discriminating depressed patients and normal subjects. It is suggested that this analysis may be a complementary tool to help psychiatrists for diagnosing depressed patients. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  6. Classifying bent radio galaxies from a mixture of point-like/extended images with Machine Learning.

    Science.gov (United States)

    Bastien, David; Oozeer, Nadeem; Somanah, Radhakrishna

    2017-05-01

    The hypothesis that bent radio sources are supposed to be found in rich, massive galaxy clusters and the avalibility of huge amount of data from radio surveys have fueled our motivation to use Machine Learning (ML) to identify bent radio sources and as such use them as tracers for galaxy clusters. The shapelet analysis allowed us to decompose radio images into 256 features that could be fed into the ML algorithm. Additionally, ideas from the field of neuro-psychology helped us to consider training the machine to identify bent galaxies at different orientations. From our analysis, we found that the Random Forest algorithm was the most effective with an accuracy rate of 92% for a classification of point and extended sources as well as an accuracy of 80% for bent and unbent classification.

  7. Use of machine learning methods to classify Universities based on the income structure

    Science.gov (United States)

    Terlyga, Alexandra; Balk, Igor

    2017-10-01

    In this paper we discuss use of machine learning methods such as self organizing maps, k-means and Ward’s clustering to perform classification of universities based on their income. This classification will allow us to quantitate classification of universities as teaching, research, entrepreneur, etc. which is important tool for government, corporations and general public alike in setting expectation and selecting universities to achieve different goals.

  8. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    Science.gov (United States)

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  9. Sensitivity and specificity of machine learning classifiers for glaucoma diagnosis using Spectral Domain OCT and standard automated perimetry

    Directory of Open Access Journals (Sweden)

    Fabrício R. Silva

    2013-06-01

    Full Text Available PURPOSE: To evaluate the sensitivity and specificity of machine learning classifiers (MLCs for glaucoma diagnosis using Spectral Domain OCT (SD-OCT and standard automated perimetry (SAP. METHODS: Observational cross-sectional study. Sixty two glaucoma patients and 48 healthy individuals were included. All patients underwent a complete ophthalmologic examination, achromatic standard automated perimetry (SAP and retinal nerve fiber layer (RNFL imaging with SD-OCT (Cirrus HD-OCT; Carl Zeiss Meditec Inc., Dublin, California. Receiver operating characteristic (ROC curves were obtained for all SD-OCT parameters and global indices of SAP. Subsequently, the following MLCs were tested using parameters from the SD-OCT and SAP: Bagging (BAG, Naive-Bayes (NB, Multilayer Perceptron (MLP, Radial Basis Function (RBF, Random Forest (RAN, Ensemble Selection (ENS, Classification Tree (CTREE, Ada Boost M1(ADA,Support Vector Machine Linear (SVML and Support Vector Machine Gaussian (SVMG. Areas under the receiver operating characteristic curves (aROC obtained for isolated SAP and OCT parameters were compared with MLCs using OCT+SAP data. RESULTS: Combining OCT and SAP data, MLCs' aROCs varied from 0.777(CTREE to 0.946 (RAN.The best OCT+SAP aROC obtained with RAN (0.946 was significantly larger the best single OCT parameter (p<0.05, but was not significantly different from the aROC obtained with the best single SAP parameter (p=0.19. CONCLUSION: Machine learning classifiers trained on OCT and SAP data can successfully discriminate between healthy and glaucomatous eyes. The combination of OCT and SAP measurements improved the diagnostic accuracy compared with OCT data alone.

  10. Classification of EEG signals using a genetic-based machine learning classifier.

    Science.gov (United States)

    Skinner, B T; Nguyen, H T; Liu, D K

    2007-01-01

    This paper investigates the efficacy of the genetic-based learning classifier system XCS, for the classification of noisy, artefact-inclusive human electroencephalogram (EEG) signals represented using large condition strings (108bits). EEG signals from three participants were recorded while they performed four mental tasks designed to elicit hemispheric responses. Autoregressive (AR) models and Fast Fourier Transform (FFT) methods were used to form feature vectors with which mental tasks can be discriminated. XCS achieved a maximum classification accuracy of 99.3% and a best average of 88.9%. The relative classification performance of XCS was then compared against four non-evolutionary classifier systems originating from different learning techniques. The experimental results will be used as part of our larger research effort investigating the feasibility of using EEG signals as an interface to allow paralysed persons to control a powered wheelchair or other devices.

  11. Integrating support vector machines and random forests to classify crops in time series of Worldview-2 images

    Science.gov (United States)

    Zafari, A.; Zurita-Milla, R.; Izquierdo-Verdiguier, E.

    2017-10-01

    Crop maps are essential inputs for the agricultural planning done at various governmental and agribusinesses agencies. Remote sensing offers timely and costs efficient technologies to identify and map crop types over large areas. Among the plethora of classification methods, Support Vector Machine (SVM) and Random Forest (RF) are widely used because of their proven performance. In this work, we study the synergic use of both methods by introducing a random forest kernel (RFK) in an SVM classifier. A time series of multispectral WorldView-2 images acquired over Mali (West Africa) in 2014 was used to develop our case study. Ground truth containing five common crop classes (cotton, maize, millet, peanut, and sorghum) were collected at 45 farms and used to train and test the classifiers. An SVM with the standard Radial Basis Function (RBF) kernel, a RF, and an SVM-RFK were trained and tested over 10 random training and test subsets generated from the ground data. Results show that the newly proposed SVM-RFK classifier can compete with both RF and SVM-RBF. The overall accuracies based on the spectral bands only are of 83, 82 and 83% respectively. Adding vegetation indices to the analysis result in the classification accuracy of 82, 81 and 84% for SVM-RFK, RF, and SVM-RBF respectively. Overall, it can be observed that the newly tested RFK can compete with SVM-RBF and RF classifiers in terms of classification accuracy.

  12. A SVM-based method for sentiment analysis in Persian language

    Science.gov (United States)

    Hajmohammadi, Mohammad Sadegh; Ibrahim, Roliana

    2013-03-01

    Persian language is the official language of Iran, Tajikistan and Afghanistan. Local online users often represent their opinions and experiences on the web with written Persian. Although the information in those reviews is valuable to potential consumers and sellers, the huge amount of web reviews make it difficult to give an unbiased evaluation to a product. In this paper, standard machine learning techniques SVM and naive Bayes are incorporated into the domain of online Persian Movie reviews to automatically classify user reviews as positive or negative and performance of these two classifiers is compared with each other in this language. The effects of feature presentations on classification performance are discussed. We find that accuracy is influenced by interaction between the classification models and the feature options. The SVM classifier achieves as well as or better accuracy than naive Bayes in Persian movie. Unigrams are proved better features than bigrams and trigrams in capturing Persian sentiment orientation.

  13. SVM classifier on chip for melanoma detection.

    Science.gov (United States)

    Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak

    2017-07-01

    Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a medical low-cost handheld device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented onto a recent FPGA platform using the latest design methodology to be embedded into the proposed device for realizing online efficient melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 from equivalent software implementation on an embedded processor, with 34% of resources utilization and 2 watts for power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resources utilization and power consumption, while achieving high classification accuracy.

  14. Advanced Cell Classifier: User-Friendly Machine-Learning-Based Software for Discovering Phenotypes in High-Content Imaging Data.

    Science.gov (United States)

    Piccinini, Filippo; Balassa, Tamas; Szkalisity, Abel; Molnar, Csaba; Paavolainen, Lassi; Kujala, Kaisa; Buzas, Krisztina; Sarazova, Marie; Pietiainen, Vilja; Kutay, Ulrike; Smith, Kevin; Horvath, Peter

    2017-06-28

    High-content, imaging-based screens now routinely generate data on a scale that precludes manual verification and interrogation. Software applying machine learning has become an essential tool to automate analysis, but these methods require annotated examples to learn from. Efficiently exploring large datasets to find relevant examples remains a challenging bottleneck. Here, we present Advanced Cell Classifier (ACC), a graphical software package for phenotypic analysis that addresses these difficulties. ACC applies machine-learning and image-analysis methods to high-content data generated by large-scale, cell-based experiments. It features methods to mine microscopic image data, discover new phenotypes, and improve recognition performance. We demonstrate that these features substantially expedite the training process, successfully uncover rare phenotypes, and improve the accuracy of the analysis. ACC is extensively documented, designed to be user-friendly for researchers without machine-learning expertise, and distributed as a free open-source tool at www.cellclassifier.org. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Tool vibration detection with eddy current sensors in machining process and computation of stability lobes using fuzzy classifiers

    Science.gov (United States)

    Devillez, Arnaud; Dudzinski, Daniel

    2007-01-01

    Today the knowledge of a process is very important for engineers to find optimal combination of control parameters warranting productivity, quality and functioning without defects and failures. In our laboratory, we carry out research in the field of high speed machining with modelling, simulation and experimental approaches. The aim of our investigation is to develop a software allowing the cutting conditions optimisation to limit the number of predictive tests, and the process monitoring to prevent any trouble during machining operations. This software is based on models and experimental data sets which constitute the knowledge of the process. In this paper, we deal with the problem of vibrations occurring during a machining operation. These vibrations may cause some failures and defects to the process, like workpiece surface alteration and rapid tool wear. To measure on line the tool micro-movements, we equipped a lathe with a specific instrumentation using eddy current sensors. Obtained signals were correlated with surface finish and a signal processing algorithm was used to determine if a test is stable or unstable. Then, a fuzzy classification method was proposed to classify the tests in a space defined by the width of cut and the cutting speed. Finally, it was shown that the fuzzy classification takes into account of the measurements incertitude to compute the stability limit or stability lobes of the process.

  16. Robust LS-SVM-based adaptive constrained control for a class of uncertain nonlinear systems with time-varying predefined performance

    Science.gov (United States)

    Luo, Jianjun; Wei, Caisheng; Dai, Honghua; Yuan, Jianping

    2018-03-01

    This paper focuses on robust adaptive control for a class of uncertain nonlinear systems subject to input saturation and external disturbance with guaranteed predefined tracking performance. To reduce the limitations of classical predefined performance control method in the presence of unknown initial tracking errors, a novel predefined performance function with time-varying design parameters is first proposed. Then, aiming at reducing the complexity of nonlinear approximations, only two least-square-support-vector-machine-based (LS-SVM-based) approximators with two design parameters are required through norm form transformation of the original system. Further, a novel LS-SVM-based adaptive constrained control scheme is developed under the time-vary predefined performance using backstepping technique. Wherein, to avoid the tedious analysis and repeated differentiations of virtual control laws in the backstepping technique, a simple and robust finite-time-convergent differentiator is devised to only extract its first-order derivative at each step in the presence of external disturbance. In this sense, the inherent demerit of backstepping technique-;explosion of terms; brought by the recursive virtual controller design is conquered. Moreover, an auxiliary system is designed to compensate the control saturation. Finally, three groups of numerical simulations are employed to validate the effectiveness of the newly developed differentiator and the proposed adaptive constrained control scheme.

  17. Data fusion and machine learning to identify threat vectors for the Zika virus and classify vulnerability

    Science.gov (United States)

    Gentle, J. N., Jr.; Kahn, A.; Pierce, S. A.; Wang, S.; Wade, C.; Moran, S.

    2016-12-01

    With the continued spread of the zika virus in the United States in both Florida and Virginia, increased public awareness, prevention and targeted prediction is necessary to effectively mitigate further infection and propagation of the virus throughout the human population. The goal of this project is to utilize publicly accessible data and HPC resources coupled with machine learning algorithms to identify potential threat vectors for the spread of the zika virus in Texas, the United States and globally by correlating available zika case data collected from incident reports in medical databases (e.g., CDC, Florida Department of Health) with known bodies of water in various earth science databases (e.g., USGS NAQWA Data, NASA ASTER Data, TWDB Data) and by using known mosquito population centers as a proxy for trends in population distribution (e.g., WHO, European CDC, Texas Data) while correlating historical trends in the spread of other mosquito borne diseases (e.g., chikungunya, malaria, dengue, yellow fever, west nile, etc.). The resulting analysis should refine the identification of the specific threat vectors for the spread of the virus which will correspondingly increase the effectiveness of the limited resources allocated towards combating the disease through better strategic implementation of defense measures. The minimal outcome of this research is a better understanding of the factors involved in the spread of the zika virus, with the greater potential to save additional lives through more effective resource utilization and public outreach.

  18. Detecting and Classifying Human Touches in a Social Robot Through Acoustic Sensing and Machine Learning.

    Science.gov (United States)

    Alonso-Martín, Fernando; Gamboa-Montero, Juan José; Castillo, José Carlos; Castro-González, Álvaro; Salichs, Miguel Ángel

    2017-05-16

    An important aspect in Human-Robot Interaction is responding to different kinds of touch stimuli. To date, several technologies have been explored to determine how a touch is perceived by a social robot, usually placing a large number of sensors throughout the robot's shell. In this work, we introduce a novel approach, where the audio acquired from contact microphones located in the robot's shell is processed using machine learning techniques to distinguish between different types of touches. The system is able to determine when the robot is touched (touch detection), and to ascertain the kind of touch performed among a set of possibilities: stroke , tap , slap , and tickle (touch classification). This proposal is cost-effective since just a few microphones are able to cover the whole robot's shell since a single microphone is enough to cover each solid part of the robot. Besides, it is easy to install and configure as it just requires a contact surface to attach the microphone to the robot's shell and plug it into the robot's computer. Results show the high accuracy scores in touch gesture recognition. The testing phase revealed that Logistic Model Trees achieved the best performance, with an F -score of 0.81. The dataset was built with information from 25 participants performing a total of 1981 touch gestures.

  19. From Physiological data to Emotional States: Conducting a User Study and Comparing Machine Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Ali Mehmood KHAN

    2016-06-01

    Full Text Available Recognizing emotional states is becoming a major part of a user's context for wearable computing applications. The system should be able to acquire a user's emotional states by using physiological sensors. We want to develop a personal emotional states recognition system that is practical, reliable, and can be used for health-care related applications. We propose to use the eHealth platform 1 which is a ready-made, light weight, small and easy to use device for recognizing a few emotional states like ‘Sad’, ‘Dislike’, ‘Joy’, ‘Stress’, ‘Normal’, ‘No-Idea’, ‘Positive’ and ‘Negative’ using decision tree (J48 and k-Nearest Neighbors (IBK classifiers. In this paper, we present an approach to build a system that exhibits this property and provides evidence based on data for 8 different emotional states collected from 24 different subjects. Our results indicate that the system has an accuracy rate of approximately 98 %. In our work, we used four physiological sensors i.e. ‘Blood Volume Pulse’ (BVP, ‘Electromyogram’ (EMG, ‘Galvanic Skin Response’ (GSR, and ‘Skin Temperature’ in order to recognize emotional states (i.e. Stress, Joy/Happy, Sad, Normal/Neutral, Dislike, No-idea, Positive and Negative.

  20. Classification of fMRI independent components using IC-fingerprints and support vector machine classifiers.

    Science.gov (United States)

    De Martino, Federico; Gentile, Francesco; Esposito, Fabrizio; Balsi, Marco; Di Salle, Francesco; Goebel, Rainer; Formisano, Elia

    2007-01-01

    We present a general method for the classification of independent components (ICs) extracted from functional MRI (fMRI) data sets. The method consists of two steps. In the first step, each fMRI-IC is associated with an IC-fingerprint, i.e., a representation of the component in a multidimensional space of parameters. These parameters are post hoc estimates of global properties of the ICs and are largely independent of a specific experimental design and stimulus timing. In the second step a machine learning algorithm automatically separates the IC-fingerprints into six general classes after preliminary training performed on a small subset of expert-labeled components. We illustrate this approach in a multisubject fMRI study employing visual structure-from-motion stimuli encoding faces and control random shapes. We show that: (1) IC-fingerprints are a valuable tool for the inspection, characterization and selection of fMRI-ICs and (2) automatic classifications of fMRI-ICs in new subjects present a high correspondence with those obtained by expert visual inspection of the components. Importantly, our classification procedure highlights several neurophysiologically interesting processes. The most intriguing of which is reflected, with high intra- and inter-subject reproducibility, in one IC exhibiting a transiently task-related activation in the 'face' region of the primary sensorimotor cortex. This suggests that in addition to or as part of the mirror system, somatotopic regions of the sensorimotor cortex are involved in disambiguating the perception of a moving body part. Finally, we show that the same classification algorithm can be successfully applied, without re-training, to fMRI collected using acquisition parameters, stimulation modality and timing considerably different from those used for training.

  1. Combining deep residual neural network features with supervised machine learning algorithms to classify diverse food image datasets.

    Science.gov (United States)

    McAllister, Patrick; Zheng, Huiru; Bond, Raymond; Moorhead, Anne

    2018-04-01

    Obesity is increasing worldwide and can cause many chronic conditions such as type-2 diabetes, heart disease, sleep apnea, and some cancers. Monitoring dietary intake through food logging is a key method to maintain a healthy lifestyle to prevent and manage obesity. Computer vision methods have been applied to food logging to automate image classification for monitoring dietary intake. In this work we applied pretrained ResNet-152 and GoogleNet convolutional neural networks (CNNs), initially trained using ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset with MatConvNet package, to extract features from food image datasets; Food 5K, Food-11, RawFooT-DB, and Food-101. Deep features were extracted from CNNs and used to train machine learning classifiers including artificial neural network (ANN), support vector machine (SVM), Random Forest, and Naive Bayes. Results show that using ResNet-152 deep features with SVM with RBF kernel can accurately detect food items with 99.4% accuracy using Food-5K validation food image dataset and 98.8% with Food-5K evaluation dataset using ANN, SVM-RBF, and Random Forest classifiers. Trained with ResNet-152 features, ANN can achieve 91.34%, 99.28% when applied to Food-11 and RawFooT-DB food image datasets respectively and SVM with RBF kernel can achieve 64.98% with Food-101 image dataset. From this research it is clear that using deep CNN features can be used efficiently for diverse food item image classification. The work presented in this research shows that pretrained ResNet-152 features provide sufficient generalisation power when applied to a range of food image classification tasks. Copyright © 2018 Elsevier Ltd. All rights reserved.

  2. Cerebral 18F-FDG PET in macrophagic myofasciitis: An individual SVM-based approach.

    Science.gov (United States)

    Blanc-Durand, Paul; Van Der Gucht, Axel; Guedj, Eric; Abulizi, Mukedaisi; Aoun-Sebaiti, Mehdi; Lerman, Lionel; Verger, Antoine; Authier, François-Jérôme; Itti, Emmanuel

    2017-01-01

    Macrophagic myofasciitis (MMF) is an emerging condition with highly specific myopathological alterations. A peculiar spatial pattern of a cerebral glucose hypometabolism involving occipito-temporal cortex and cerebellum have been reported in patients with MMF; however, the full pattern is not systematically present in routine interpretation of scans, and with varying degrees of severity depending on the cognitive profile of patients. Aim was to generate and evaluate a support vector machine (SVM) procedure to classify patients between healthy or MMF 18F-FDG brain profiles. 18F-FDG PET brain images of 119 patients with MMF and 64 healthy subjects were retrospectively analyzed. The whole-population was divided into two groups; a training set (100 MMF, 44 healthy subjects) and a testing set (19 MMF, 20 healthy subjects). Dimensionality reduction was performed using a t-map from statistical parametric mapping (SPM) and a SVM with a linear kernel was trained on the training set. To evaluate the performance of the SVM classifier, values of sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV) and accuracy (Acc) were calculated. The SPM12 analysis on the training set exhibited the already reported hypometabolism pattern involving occipito-temporal and fronto-parietal cortices, limbic system and cerebellum. The SVM procedure, based on the t-test mask generated from the training set, correctly classified MMF patients of the testing set with following Se, Sp, PPV, NPV and Acc: 89%, 85%, 85%, 89%, and 87%. We developed an original and individual approach including a SVM to classify patients between healthy or MMF metabolic brain profiles using 18F-FDG-PET. Machine learning algorithms are promising for computer-aided diagnosis but will need further validation in prospective cohorts.

  3. The employment of Support Vector Machine to classify high and low performance archers based on bio-physiological variables

    Science.gov (United States)

    Taha, Zahari; Muazu Musa, Rabiu; Majeed, Anwar P. P. Abdul; Razali Abdullah, Mohamad; Amirul Abdullah, Muhammad; Hasnun Arif Hassan, Mohd; Khalil, Zubair

    2018-04-01

    The present study employs a machine learning algorithm namely support vector machine (SVM) to classify high and low potential archers from a collection of bio-physiological variables trained on different SVMs. 50 youth archers with the average age and standard deviation of (17.0 ±.056) gathered from various archery programmes completed a one end shooting score test. The bio-physiological variables namely resting heart rate, resting respiratory rate, resting diastolic blood pressure, resting systolic blood pressure, as well as calories intake, were measured prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on variables assessed. SVM models i.e. linear, quadratic and cubic kernel functions, were trained on the aforementioned variables. The k-means clustered the archers into high (HPA) and low potential archers (LPA), respectively. It was demonstrated that the linear SVM exhibited good accuracy with a classification accuracy of 94% in comparison the other tested models. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from the selected bio-physiological variables examined.

  4. A support vector machine and a random forest classifier indicates a 15-miRNA set related to osteosarcoma recurrence

    Directory of Open Access Journals (Sweden)

    He Y

    2018-01-01

    Full Text Available Yunfei He,1,2,* Jun Ma,1,* An Wang,1,3,* Weiheng Wang,1 Shengchang Luo,1 Yaoming Liu,2 Xiaojian Ye1 1Department of Orthopaedics, Changzheng Hospital Affiliated with Second Military Medical University, Shanghai, 2Department of Orthopaedics, Lanzhou General Hospital of Lanzhou Military Command Region, Lanzhou, 3Department of Orthopaedics, Shanghai Armed Police Force Hospital, Shanghai, People’s Republic of China *These authors contributed equally to this work Background: Osteosarcoma, which originates in the mesenchymal tissue, is the prevalent primary solid malignancy of the bone. It is of great importance to explore the mechanisms of metastasis and recurrence, which are two primary reasons accounting for the high death rate in osteosarcoma. Data and methods: Three miRNA expression profiles related to osteosarcoma were downloaded from GEO DataSets. Differentially expressed miRNAs (DEmiRs were screened using MetaDE.ES of the MetaDE package. A support vector machine (SVM classifier was constructed using optimal miRNAs, and its prediction efficiency for recurrence was detected in independent datasets. Finally, a co-expression network was constructed based on the DEmiRs and their target genes. Results: In total, 78 significantly DEmiRs were screened. The SVM classifier constructed by 15 miRNAs could accurately classify 58 samples in 65 samples (89.2% in the GSE39040 database, which was validated in another two databases, GSE39052 (84.62%, 22/26 and GSE79181 (91.3%, 21/23. Cox regression showed that four miRNAs, including hsa-miR-10b, hsa-miR-1227, hsa-miR-146b-3p, and hsa-miR-873, significantly correlated with tumor recurrence time. There were 137, 147, 145, and 77 target genes of the above four miRNAs, respectively, which were assigned to 17 gene ontology functionally annotated terms and 14 Kyoto Encyclopedia of Genes and Genomes pathways. Among them, the “Osteoclast differentiation” pathway contained a total of seven target genes and was

  5. Detection of Driver Drowsiness Using Wavelet Analysis of Heart Rate Variability and a Support Vector Machine Classifier

    Directory of Open Access Journals (Sweden)

    Gang Li

    2013-12-01

    Full Text Available Driving while fatigued is just as dangerous as drunk driving and may result in car accidents. Heart rate variability (HRV analysis has been studied recently for the detection of driver drowsiness. However, the detection reliability has been lower than anticipated, because the HRV signals of drivers were always regarded as stationary signals. The wavelet transform method is a method for analyzing non-stationary signals. The aim of this study is to classify alert and drowsy driving events using the wavelet transform of HRV signals over short time periods and to compare the classification performance of this method with the conventional method that uses fast Fourier transform (FFT-based features. Based on the standard shortest duration for FFT-based short-term HRV evaluation, the wavelet decomposition is performed on 2-min HRV samples, as well as 1-min and 3-min samples for reference purposes. A receiver operation curve (ROC analysis and a support vector machine (SVM classifier are used for feature selection and classification, respectively. The ROC analysis results show that the wavelet-based method performs better than the FFT-based method regardless of the duration of the HRV sample that is used. Finally, based on the real-time requirements for driver drowsiness detection, the SVM classifier is trained using eighty FFT and wavelet-based features that are extracted from 1-min HRV signals from four subjects. The averaged leave-one-out (LOO classification performance using wavelet-based feature is 95% accuracy, 95% sensitivity, and 95% specificity. This is better than the FFT-based results that have 68.8% accuracy, 62.5% sensitivity, and 75% specificity. In addition, the proposed hardware platform is inexpensive and easy-to-use.

  6. Performance of a Machine Learning Classifier of Knee MRI Reports in Two Large Academic Radiology Practices: A Tool to Estimate Diagnostic Yield.

    Science.gov (United States)

    Hassanpour, Saeed; Langlotz, Curtis P; Amrhein, Timothy J; Befera, Nicholas T; Lungren, Matthew P

    2017-04-01

    The purpose of this study is to evaluate the performance of a natural language processing (NLP) system in classifying a database of free-text knee MRI reports at two separate academic radiology practices. An NLP system that uses terms and patterns in manually classified narrative knee MRI reports was constructed. The NLP system was trained and tested on expert-classified knee MRI reports from two major health care organizations. Radiology reports were modeled in the training set as vectors, and a support vector machine framework was used to train the classifier. A separate test set from each organization was used to evaluate the performance of the system. We evaluated the performance of the system both within and across organizations. Standard evaluation metrics, such as accuracy, precision, recall, and F1 score (i.e., the weighted average of the precision and recall), and their respective 95% CIs were used to measure the efficacy of our classification system. The accuracy for radiology reports that belonged to the model's clinically significant concept classes after training data from the same institution was good, yielding an F1 score greater than 90% (95% CI, 84.6-97.3%). Performance of the classifier on cross-institutional application without institution-specific training data yielded F1 scores of 77.6% (95% CI, 69.5-85.7%) and 90.2% (95% CI, 84.5-95.9%) at the two organizations studied. The results show excellent accuracy by the NLP machine learning classifier in classifying free-text knee MRI reports, supporting the institution-independent reproducibility of knee MRI report classification. Furthermore, the machine learning classifier performed well on free-text knee MRI reports from another institution. These data support the feasibility of multiinstitutional classification of radiologic imaging text reports with a single machine learning classifier without requiring institution-specific training data.

  7. Classifying Physical Morphology of Cocoa Beans Digital Images using Multiclass Ensemble Least-Squares Support Vector Machine

    Science.gov (United States)

    Lawi, Armin; Adhitya, Yudhi

    2018-03-01

    The objective of this research is to determine the quality of cocoa beans through morphology of their digital images. Samples of cocoa beans were scattered on a bright white paper under a controlled lighting condition. A compact digital camera was used to capture the images. The images were then processed to extract their morphological parameters. Classification process begins with an analysis of cocoa beans image based on morphological feature extraction. Parameters for extraction of morphological or physical feature parameters, i.e., Area, Perimeter, Major Axis Length, Minor Axis Length, Aspect Ratio, Circularity, Roundness, Ferret Diameter. The cocoa beans are classified into 4 groups, i.e.: Normal Beans, Broken Beans, Fractured Beans, and Skin Damaged Beans. The model of classification used in this paper is the Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM), a proposed improvement model of SVM using ensemble method in which the separate hyperplanes are obtained by least square approach and the multiclass procedure uses One-Against- All method. The result of our proposed model showed that the classification with morphological feature input parameters were accurately as 99.705% for the four classes, respectively.

  8. A cross-sectional evaluation of meditation experience on electroencephalography data by artificial neural network and support vector machine classifiers.

    Science.gov (United States)

    Lee, Yu-Hao; Hsieh, Ya-Ju; Shiah, Yung-Jong; Lin, Yu-Huei; Chen, Chiao-Yun; Tyan, Yu-Chang; GengQiu, JiaCheng; Hsu, Chung-Yao; Chen, Sharon Chia-Ju

    2017-04-01

    To quantitate the meditation experience is a subjective and complex issue because it is confounded by many factors such as emotional state, method of meditation, and personal physical condition. In this study, we propose a strategy with a cross-sectional analysis to evaluate the meditation experience with 2 artificial intelligence techniques: artificial neural network and support vector machine. Within this analysis system, 3 features of the electroencephalography alpha spectrum and variant normalizing scaling are manipulated as the evaluating variables for the detection of accuracy. Thereafter, by modulating the sliding window (the period of the analyzed data) and shifting interval of the window (the time interval to shift the analyzed data), the effect of immediate analysis for the 2 methods is compared. This analysis system is performed on 3 meditation groups, categorizing their meditation experiences in 10-year intervals from novice to junior and to senior. After an exhausted calculation and cross-validation across all variables, the high accuracy rate >98% is achievable under the criterion of 0.5-minute sliding window and 2 seconds shifting interval for both methods. In a word, the minimum analyzable data length is 0.5 minute and the minimum recognizable temporal resolution is 2 seconds in the decision of meditative classification. Our proposed classifier of the meditation experience promotes a rapid evaluation system to distinguish meditation experience and a beneficial utilization of artificial techniques for the big-data analysis.

  9. Stationary Wavelet Transform and AdaBoost with SVM Based Pathological Brain Detection in MRI Scanning.

    Science.gov (United States)

    Nayak, Deepak Ranjan; Dash, Ratnakar; Majhi, Banshidhar

    2017-01-01

    This paper presents an automatic classification system for segregating pathological brain from normal brains in magnetic resonance imaging scanning. The proposed system employs contrast limited adaptive histogram equalization scheme to enhance the diseased region in brain MR images. Two-dimensional stationary wavelet transform is harnessed to extract features from the preprocessed images. The feature vector is constructed using the energy and entropy values, computed from the level- 2 SWT coefficients. Then, the relevant and uncorrelated features are selected using symmetric uncertainty ranking filter. Subsequently, the selected features are given input to the proposed AdaBoost with support vector machine classifier, where SVM is used as the base classifier of AdaBoost algorithm. To validate the proposed system, three standard MR image datasets, Dataset-66, Dataset-160, and Dataset- 255 have been utilized. The 5 runs of k-fold stratified cross validation results indicate the suggested scheme offers better performance than other existing schemes in terms of accuracy and number of features. The proposed system earns ideal classification over Dataset-66 and Dataset-160; whereas, for Dataset- 255, an accuracy of 99.45% is achieved. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  10. Fault detection and diagnosis of an industrial steam turbine using fusion of SVM (support vector machine) and ANFIS (adaptive neuro-fuzzy inference system) classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Salahshoor, Karim [Department of Instrumentation and Automation, Petroleum University of Technology, Tehran (Iran, Islamic Republic of); Kordestani, Mojtaba; Khoshro, Majid S. [Department of Control Engineering, Islamic Azad University South Tehran branch (Iran, Islamic Republic of)

    2010-12-15

    The subject of FDD (fault detection and diagnosis) has gained widespread industrial interest in machine condition monitoring applications. This is mainly due to the potential advantage to be achieved from reduced maintenance costs, improved productivity and increased machine availability. This paper presents a new FDD scheme for condition machinery of an industrial steam turbine using a data fusion methodology. Fusion of a SVM (support vector machine) classifier with an ANFIS (adaptive neuro-fuzzy inference system) classifier, integrated into a common framework, is utilized to enhance the fault detection and diagnostic tasks. For this purpose, a multi-attribute data is fused into aggregated values of a single attribute by OWA (ordered weighted averaging) operators. The simulation studies indicate that the resulting fusion-based scheme outperforms the individual SVM and ANFIS systems to detect and diagnose incipient steam turbine faults. (author)

  11. Fast, Simple and Accurate Handwritten Digit Classification by Training Shallow Neural Network Classifiers with the 'Extreme Learning Machine' Algorithm.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Full Text Available Recent advances in training deep (multi-layer architectures have inspired a renaissance in neural network use. For example, deep convolutional networks are becoming the default option for difficult tasks on large datasets, such as image and speech recognition. However, here we show that error rates below 1% on the MNIST handwritten digit benchmark can be replicated with shallow non-convolutional neural networks. This is achieved by training such networks using the 'Extreme Learning Machine' (ELM approach, which also enables a very rapid training time (∼ 10 minutes. Adding distortions, as is common practise for MNIST, reduces error rates even further. Our methods are also shown to be capable of achieving less than 5.5% error rates on the NORB image database. To achieve these results, we introduce several enhancements to the standard ELM algorithm, which individually and in combination can significantly improve performance. The main innovation is to ensure each hidden-unit operates only on a randomly sized and positioned patch of each image. This form of random 'receptive field' sampling of the input ensures the input weight matrix is sparse, with about 90% of weights equal to zero. Furthermore, combining our methods with a small number of iterations of a single-batch backpropagation method can significantly reduce the number of hidden-units required to achieve a particular performance. Our close to state-of-the-art results for MNIST and NORB suggest that the ease of use and accuracy of the ELM algorithm for designing a single-hidden-layer neural network classifier should cause it to be given greater consideration either as a standalone method for simpler problems, or as the final classification stage in deep neural networks applied to more difficult problems.

  12. Automatic epileptic seizure detection in EEGs using MF-DFA, SVM based on cloud computing.

    Science.gov (United States)

    Zhang, Zhongnan; Wen, Tingxi; Huang, Wei; Wang, Meihong; Li, Chunfeng

    2017-01-01

    Epilepsy is a chronic disease with transient brain dysfunction that results from the sudden abnormal discharge of neurons in the brain. Since electroencephalogram (EEG) is a harmless and noninvasive detection method, it plays an important role in the detection of neurological diseases. However, the process of analyzing EEG to detect neurological diseases is often difficult because the brain electrical signals are random, non-stationary and nonlinear. In order to overcome such difficulty, this study aims to develop a new computer-aided scheme for automatic epileptic seizure detection in EEGs based on multi-fractal detrended fluctuation analysis (MF-DFA) and support vector machine (SVM). New scheme first extracts features from EEG by MF-DFA during the first stage. Then, the scheme applies a genetic algorithm (GA) to calculate parameters used in SVM and classify the training data according to the selected features using SVM. Finally, the trained SVM classifier is exploited to detect neurological diseases. The algorithm utilizes MLlib from library of SPARK and runs on cloud platform. Applying to a public dataset for experiment, the study results show that the new feature extraction method and scheme can detect signals with less features and the accuracy of the classification reached up to 99%. MF-DFA is a promising approach to extract features for analyzing EEG, because of its simple algorithm procedure and less parameters. The features obtained by MF-DFA can represent samples as well as traditional wavelet transform and Lyapunov exponents. GA can always find useful parameters for SVM with enough execution time. The results illustrate that the classification model can achieve comparable accuracy, which means that it is effective in epileptic seizure detection.

  13. SVM-Based Spectral Analysis for Heart Rate from Multi-Channel WPPG Sensor Signals.

    Science.gov (United States)

    Xiong, Jiping; Cai, Lisang; Wang, Fei; He, Xiaowei

    2017-03-03

    Although wrist-type photoplethysmographic (hereafter referred to as WPPG) sensor signals can measure heart rate quite conveniently, the subjects' hand movements can cause strong motion artifacts, and then the motion artifacts will heavily contaminate WPPG signals. Hence, it is challenging for us to accurately estimate heart rate from WPPG signals during intense physical activities. The WWPG method has attracted more attention thanks to the popularity of wrist-worn wearable devices. In this paper, a mixed approach called Mix-SVM is proposed, it can use multi-channel WPPG sensor signals and simultaneous acceleration signals to measurement heart rate. Firstly, we combine the principle component analysis and adaptive filter to remove a part of the motion artifacts. Due to the strong relativity between motion artifacts and acceleration signals, the further denoising problem is regarded as a sparse signals reconstruction problem. Then, we use a spectrum subtraction method to eliminate motion artifacts effectively. Finally, the spectral peak corresponding to heart rate is sought by an SVM-based spectral analysis method. Through the public PPG database in the 2015 IEEE Signal Processing Cup, we acquire the experimental results, i.e., the average absolute error was 1.01 beat per minute, and the Pearson correlation was 0.9972. These results also confirm that the proposed Mix-SVM approach has potential for multi-channel WPPG-based heart rate estimation in the presence of intense physical exercise.

  14. SVM-Based System for Prediction of Epileptic Seizures from iEEG Signal

    Science.gov (United States)

    Cherkassky, Vladimir; Lee, Jieun; Veber, Brandon; Patterson, Edward E.; Brinkmann, Benjamin H.; Worrell, Gregory A.

    2017-01-01

    Objective This paper describes a data-analytic modeling approach for prediction of epileptic seizures from intracranial electroencephalogram (iEEG) recording of brain activity. Even though it is widely accepted that statistical characteristics of iEEG signal change prior to seizures, robust seizure prediction remains a challenging problem due to subject-specific nature of data-analytic modeling. Methods Our work emphasizes understanding of clinical considerations important for iEEG-based seizure prediction, and proper translation of these clinical considerations into data-analytic modeling assumptions. Several design choices during pre-processing and post-processing are considered and investigated for their effect on seizure prediction accuracy. Results Our empirical results show that the proposed SVM-based seizure prediction system can achieve robust prediction of preictal and interictal iEEG segments from dogs with epilepsy. The sensitivity is about 90–100%, and the false-positive rate is about 0–0.3 times per day. The results also suggest good prediction is subject-specific (dog or human), in agreement with earlier studies. Conclusion Good prediction performance is possible only if the training data contain sufficiently many seizure episodes, i.e., at least 5–7 seizures. Significance The proposed system uses subject-specific modeling and unbalanced training data. This system also utilizes three different time scales during training and testing stages. PMID:27362758

  15. SVM-Based Spectral Analysis for Heart Rate from Multi-Channel WPPG Sensor Signals

    Directory of Open Access Journals (Sweden)

    Jiping Xiong

    2017-03-01

    Full Text Available Although wrist-type photoplethysmographic (hereafter referred to as WPPG sensor signals can measure heart rate quite conveniently, the subjects’ hand movements can cause strong motion artifacts, and then the motion artifacts will heavily contaminate WPPG signals. Hence, it is challenging for us to accurately estimate heart rate from WPPG signals during intense physical activities. The WWPG method has attracted more attention thanks to the popularity of wrist-worn wearable devices. In this paper, a mixed approach called Mix-SVM is proposed, it can use multi-channel WPPG sensor signals and simultaneous acceleration signals to measurement heart rate. Firstly, we combine the principle component analysis and adaptive filter to remove a part of the motion artifacts. Due to the strong relativity between motion artifacts and acceleration signals, the further denoising problem is regarded as a sparse signals reconstruction problem. Then, we use a spectrum subtraction method to eliminate motion artifacts effectively. Finally, the spectral peak corresponding to heart rate is sought by an SVM-based spectral analysis method. Through the public PPG database in the 2015 IEEE Signal Processing Cup, we acquire the experimental results, i.e., the average absolute error was 1.01 beat per minute, and the Pearson correlation was 0.9972. These results also confirm that the proposed Mix-SVM approach has potential for multi-channel WPPG-based heart rate estimation in the presence of intense physical exercise.

  16. A review and experimental study on the application of classifiers and evolutionary algorithms in EEG-based brain-machine interface systems

    Science.gov (United States)

    Tahernezhad-Javazm, Farajollah; Azimirad, Vahid; Shoaran, Maryam

    2018-04-01

    Objective. Considering the importance and the near-future development of noninvasive brain-machine interface (BMI) systems, this paper presents a comprehensive theoretical-experimental survey on the classification and evolutionary methods for BMI-based systems in which EEG signals are used. Approach. The paper is divided into two main parts. In the first part, a wide range of different types of the base and combinatorial classifiers including boosting and bagging classifiers and evolutionary algorithms are reviewed and investigated. In the second part, these classifiers and evolutionary algorithms are assessed and compared based on two types of relatively widely used BMI systems, sensory motor rhythm-BMI and event-related potentials-BMI. Moreover, in the second part, some of the improved evolutionary algorithms as well as bi-objective algorithms are experimentally assessed and compared. Main results. In this study two databases are used, and cross-validation accuracy (CVA) and stability to data volume (SDV) are considered as the evaluation criteria for the classifiers. According to the experimental results on both databases, regarding the base classifiers, linear discriminant analysis and support vector machines with respect to CVA evaluation metric, and naive Bayes with respect to SDV demonstrated the best performances. Among the combinatorial classifiers, four classifiers, Bagg-DT (bagging decision tree), LogitBoost, and GentleBoost with respect to CVA, and Bagging-LR (bagging logistic regression) and AdaBoost (adaptive boosting) with respect to SDV had the best performances. Finally, regarding the evolutionary algorithms, single-objective invasive weed optimization (IWO) and bi-objective nondominated sorting IWO algorithms demonstrated the best performances. Significance. We present a general survey on the base and the combinatorial classification methods for EEG signals (sensory motor rhythm and event-related potentials) as well as their optimization methods

  17. A DWT and SVM based method for rolling element bearing fault diagnosis and its comparison with Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Sunil Tyagi

    2017-04-01

    Full Text Available A classification technique using Support Vector Machine (SVM classifier for detection of rolling element bearing fault is presented here.  The SVM was fed from features that were extracted from of vibration signals obtained from experimental setup consisting of rotating driveline that was mounted on rolling element bearings which were run in normal and with artificially faults induced conditions. The time-domain vibration signals were divided into 40 segments and simple features such as peaks in time domain and spectrum along with statistical features such as standard deviation, skewness, kurtosis etc. were extracted. Effectiveness of SVM classifier was compared with the performance of Artificial Neural Network (ANN classifier and it was found that the performance of SVM classifier is superior to that of ANN. The effect of pre-processing of the vibration signal by Discreet Wavelet Transform (DWT prior to feature extraction is also studied and it is shown that pre-processing of vibration signal with DWT enhances the effectiveness of both ANN and SVM classifiers. It has been demonstrated from experiment results that performance of SVM classifier is better than ANN in detection of bearing condition and pre-processing the vibration signal with DWT improves the performance of SVM classifier.

  18. Classifying low-grade and high-grade bladder cancer using label-free serum surface-enhanced Raman spectroscopy and support vector machine

    Science.gov (United States)

    Zhang, Yanjiao; Lai, Xiaoping; Zeng, Qiuyao; Li, Linfang; Lin, Lin; Li, Shaoxin; Liu, Zhiming; Su, Chengkang; Qi, Minni; Guo, Zhouyi

    2018-03-01

    This study aims to classify low-grade and high-grade bladder cancer (BC) patients using serum surface-enhanced Raman scattering (SERS) spectra and support vector machine (SVM) algorithms. Serum SERS spectra are acquired from 88 serum samples with silver nanoparticles as the SERS-active substrate. Diagnostic accuracies of 96.4% and 95.4% are obtained when differentiating the serum SERS spectra of all BC patients versus normal subjects and low-grade versus high-grade BC patients, respectively, with optimal SVM classifier models. This study demonstrates that the serum SERS technique combined with SVM has great potential to noninvasively detect and classify high-grade and low-grade BC patients.

  19. Examining the Capability of Supervised Machine Learning Classifiers in Extracting Flooded Areas from Landsat TM Imagery: A Case Study from a Mediterranean Flood

    Directory of Open Access Journals (Sweden)

    Gareth Ireland

    2015-03-01

    Full Text Available This study explored the capability of Support Vector Machines (SVMs and regularised kernel Fisher’s discriminant analysis (rkFDA machine learning supervised classifiers in extracting flooded area from optical Landsat TM imagery. The ability of both techniques was evaluated using a case study of a riverine flood event in 2010 in a heterogeneous Mediterranean region, for which TM imagery acquired shortly after the flood event was available. For the two classifiers, both linear and non-linear (kernel versions were utilised in their implementation. The ability of the different classifiers to map the flooded area extent was assessed on the basis of classification accuracy assessment metrics. Results showed that rkFDA outperformed SVMs in terms of accurate flooded pixels detection, also producing fewer missed detections of the flooded area. Yet, SVMs showed less false flooded area detections. Overall, the non-linear rkFDA classification method was the more accurate of the two techniques (OA = 96.23%, K = 0.877. Both methods outperformed the standard Normalized Difference Water Index (NDWI thresholding (OA = 94.63, K = 0.818 by roughly 0.06 K points. Although overall accuracy results for the rkFDA and SVMs classifications only showed a somewhat minor improvement on the overall accuracy exhibited by the NDWI thresholding, notably both classifiers considerably outperformed the thresholding algorithm in other specific accuracy measures (e.g. producer accuracy for the “not flooded” class was ~10.5% less accurate for the NDWI thresholding algorithm in comparison to the classifiers, and average per-class accuracy was ~5% less accurate than the machine learning models. This study provides evidence of the successful application of supervised machine learning for classifying flooded areas in Landsat imagery, where few studies so far exist in this direction. Considering that Landsat data is open access and has global coverage, the results of this study

  20. A new intelligent classifier for breast cancer diagnosis based on a rough set and extreme learning machine: RS + ELM

    OpenAIRE

    KAYA, Yılmaz

    2014-01-01

    Breast cancer is one of the leading causes of death among women all around the world. Therefore, true and early diagnosis of breast cancer is an important problem. The rough set (RS) and extreme learning machine (ELM) methods were used collectively in this study for the diagnosis of breast cancer. The unnecessary attributes were discarded from the dataset by means of the RS approach. The classification process by means of ELM was performed using the remaining attributes. The Wisconsin B...

  1. Area Determination of Diabetic Foot Ulcer Images Using a Cascaded Two-Stage SVM-Based Classification.

    Science.gov (United States)

    Wang, Lei; Pedersen, Peder C; Agu, Emmanuel; Strong, Diane M; Tulu, Bengisu

    2017-09-01

    The standard chronic wound assessment method based on visual examination is potentially inaccurate and also represents a significant clinical workload. Hence, computer-based systems providing quantitative wound assessment may be valuable for accurately monitoring wound healing status, with the wound area the best suited for automated analysis. Here, we present a novel approach, using support vector machines (SVM) to determine the wound boundaries on foot ulcer images captured with an image capture box, which provides controlled lighting and range. After superpixel segmentation, a cascaded two-stage classifier operates as follows: in the first stage, a set of k binary SVM classifiers are trained and applied to different subsets of the entire training images dataset, and incorrectly classified instances are collected. In the second stage, another binary SVM classifier is trained on the incorrectly classified set. We extracted various color and texture descriptors from superpixels that are used as input for each stage in the classifier training. Specifically, color and bag-of-word representations of local dense scale invariant feature transformation features are descriptors for ruling out irrelevant regions, and color and wavelet-based features are descriptors for distinguishing healthy tissue from wound regions. Finally, the detected wound boundary is refined by applying the conditional random field method. We have implemented the wound classification on a Nexus 5 smartphone platform, except for training which was done offline. Results are compared with other classifiers and show that our approach provides high global performance rates (average sensitivity = 73.3%, specificity = 94.6%) and is sufficiently efficient for a smartphone-based image analysis.

  2. Support vector machine based estimation of remaining useful life: current research status and future trends

    International Nuclear Information System (INIS)

    Huang, Hong Zhong; Wang, Hai Kun; Li, Yan Feng; Zhang, Longlong; Liu, Zhiliang

    2015-01-01

    Estimation of remaining useful life (RUL) is helpful to manage life cycles of machines and to reduce maintenance cost. Support vector machine (SVM) is a promising algorithm for estimation of RUL because it can easily process small training sets and multi-dimensional data. Many SVM based methods have been proposed to predict RUL of some key components. We did a literature review related to SVM based RUL estimation within a decade. The references reviewed are classified into two categories: improved SVM algorithms and their applications to RUL estimation. The latter category can be further divided into two types: one, to predict the condition state in the future and then build a relationship between state and RUL; two, to establish a direct relationship between current state and RUL. However, SVM is seldom used to track the degradation process and build an accurate relationship between the current health condition state and RUL. Based on the above review and summary, this paper points out that the ability to continually improve SVM, and obtain a novel idea for RUL prediction using SVM will be future works.

  3. Detection of surface cracking in steel pipes based on vibration data using a multi-class support vector machine classifier

    Science.gov (United States)

    Mustapha, S.; Braytee, A.; Ye, L.

    2017-04-01

    In this study, we focused at the development and verification of a robust framework for surface crack detection in steel pipes using measured vibration responses; with the presence of multiple progressive damage occurring in different locations within the structure. Feature selection, dimensionality reduction, and multi-class support vector machine were established for this purpose. Nine damage cases, at different locations, orientations and length, were introduced into the pipe structure. The pipe was impacted 300 times using an impact hammer, after each damage case, the vibration data were collected using 3 PZT wafers which were installed on the outer surface of the pipe. At first, damage sensitive features were extracted using the frequency response function approach followed by recursive feature elimination for dimensionality reduction. Then, a multi-class support vector machine learning algorithm was employed to train the data and generate a statistical model. Once the model is established, decision values and distances from the hyper-plane were generated for the new collected data using the trained model. This process was repeated on the data collected from each sensor. Overall, using a single sensor for training and testing led to a very high accuracy reaching 98% in the assessment of the 9 damage cases used in this study.

  4. Learning to classify organic and conventional wheat - a machine-learning driven approach using the MeltDB 2.0 metabolomics analysis platform

    Directory of Open Access Journals (Sweden)

    Nikolas eKessler

    2015-03-01

    Full Text Available We present results of our machine learning approach to the problem of classifying GC-MS data originating from wheat grains of different farming systems. The aim is to investigate the potential of learning algorithms to classify GC-MS data to be either from conventionally grown or from organically grown samples and considering different cultivars. The motivation of our work is rather obvious on the background of nowadays increased demand for organic food in post-industrialized societies and the necessity to prove organic food authenticity. The background of our data set is given by up to eleven wheat cultivars that have been cultivated in both farming systems, organic and conventional, throughout three years. More than 300 GC-MS measurements were recorded and subsequently processed and analyzed in the MeltDB 2.0 metabolomics analysis platform, being briefly outlined in this paper. We further describe how unsupervised (t-SNE, PCA and supervised (RF, SVM methods can be applied for sample visualization and classification. Our results clearly show that years have most and wheat cultivars have second-most influence on the metabolic composition of a sample. We can also show, that for a given year and cultivar, organic and conventional cultivation can be distinguished by machine-learning algorithms.

  5. Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images.

    Science.gov (United States)

    Wang, Hongkai; Zhou, Zongwei; Li, Yingci; Chen, Zhonghua; Lu, Peiou; Wang, Wenzhi; Liu, Wanyu; Yu, Lijuan

    2017-12-01

    This study aimed to compare one state-of-the-art deep learning method and four classical machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer (NSCLC) from 18 F-FDG PET/CT images. Another objective was to compare the discriminative power of the recently popular PET/CT texture features with the widely used diagnostic features such as tumor size, CT value, SUV, image contrast, and intensity standard deviation. The four classical machine learning methods included random forests, support vector machines, adaptive boosting, and artificial neural network. The deep learning method was the convolutional neural networks (CNN). The five methods were evaluated using 1397 lymph nodes collected from PET/CT images of 168 patients, with corresponding pathology analysis results as gold standard. The comparison was conducted using 10 times 10-fold cross-validation based on the criterion of sensitivity, specificity, accuracy (ACC), and area under the ROC curve (AUC). For each classical method, different input features were compared to select the optimal feature set. Based on the optimal feature set, the classical methods were compared with CNN, as well as with human doctors from our institute. For the classical methods, the diagnostic features resulted in 81~85% ACC and 0.87~0.92 AUC, which were significantly higher than the results of texture features. CNN's sensitivity, specificity, ACC, and AUC were 84, 88, 86, and 0.91, respectively. There was no significant difference between the results of CNN and the best classical method. The sensitivity, specificity, and ACC of human doctors were 73, 90, and 82, respectively. All the five machine learning methods had higher sensitivities but lower specificities than human doctors. The present study shows that the performance of CNN is not significantly different from the best classical methods and human doctors for classifying mediastinal lymph node metastasis of NSCLC from PET/CT images

  6. Introducing instrumental variables in the LS-SVM based identification framework

    NARCIS (Netherlands)

    Laurain, V.; Zheng, W-X.; Toth, R.

    2011-01-01

    Least-Squares Support Vector Machines (LS-SVM) represent a promising approach to identify nonlinear systems via nonparametric estimation of the nonlinearities in a computationally and stochastically attractive way. All the methods dedicated to the solution of this problem rely on the minimization of

  7. A Statistical Parameter Analysis and SVM Based Fault Diagnosis Strategy for Dynamically Tuned Gyroscopes

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Gyro's fault diagnosis plays a critical role in inertia navigation systems for higher reliability and precision. A new fault diagnosis strategy based on the statistical parameter analysis (SPA) and support vector machine(SVM) classification model was proposed for dynamically tuned gyroscopes (DTG). The SPA, a kind of time domain analysis approach, was introduced to compute a set of statistical parameters of vibration signal as the state features of DTG, with which the SVM model, a novel learning machine based on statistical learning theory (SLT), was applied and constructed to train and identify the working state of DTG. The experimental results verify that the proposed diagnostic strategy can simply and effectively extract the state features of DTG, and it outperforms the radial-basis function (RBF) neural network based diagnostic method and can more reliably and accurately diagnose the working state of DTG.

  8. A Combination of Geographically Weighted Regression, Particle Swarm Optimization and Support Vector Machine for Landslide Susceptibility Mapping: A Case Study at Wanzhou in the Three Gorges Area, China

    Directory of Open Access Journals (Sweden)

    Xianyu Yu

    2016-05-01

    Full Text Available In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%–19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.

  9. A Combination of Geographically Weighted Regression, Particle Swarm Optimization and Support Vector Machine for Landslide Susceptibility Mapping: A Case Study at Wanzhou in the Three Gorges Area, China.

    Science.gov (United States)

    Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian

    2016-05-11

    In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%-19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.

  10. Machine Learning-based Individual Assessment of Cortical Atrophy Pattern in Alzheimer's Disease Spectrum: Development of the Classifier and Longitudinal Evaluation.

    Science.gov (United States)

    Lee, Jin San; Kim, Changsoo; Shin, Jeong-Hyeon; Cho, Hanna; Shin, Dae-Seock; Kim, Nakyoung; Kim, Hee Jin; Kim, Yeshin; Lockhart, Samuel N; Na, Duk L; Seo, Sang Won; Seong, Joon-Kyung

    2018-03-07

    To develop a new method for measuring Alzheimer's disease (AD)-specific similarity of cortical atrophy patterns at the individual-level, we employed an individual-level machine learning algorithm. A total of 869 cognitively normal (CN) individuals and 473 patients with probable AD dementia who underwent high-resolution 3T brain MRI were included. We propose a machine learning-based method for measuring the similarity of an individual subject's cortical atrophy pattern with that of a representative AD patient cohort. In addition, we validated this similarity measure in two longitudinal cohorts consisting of 79 patients with amnestic-mild cognitive impairment (aMCI) and 27 patients with probable AD dementia. Surface-based morphometry classifier for discriminating AD from CN showed sensitivity and specificity values of 87.1% and 93.3%, respectively. In the longitudinal validation study, aMCI-converts had higher atrophy similarity at both baseline (p < 0.001) and first year visits (p < 0.001) relative to non-converters. Similarly, AD patients with faster decline had higher atrophy similarity than slower decliners at baseline (p = 0.042), first year (p = 0.028), and third year visits (p = 0.027). The AD-specific atrophy similarity measure is a novel approach for the prediction of dementia risk and for the evaluation of AD trajectories on an individual subject level.

  11. SVM-Based Dynamic Reconfiguration CPS for Manufacturing System in Industry 4.0

    Directory of Open Access Journals (Sweden)

    Hyun-Jun Shin

    2018-01-01

    Full Text Available CPS is potential application in various fields, such as medical, healthcare, energy, transportation, and defense, as well as Industry 4.0 in Germany. Although studies on the equipment aging and prediction of problem have been done by combining CPS with Industry 4.0, such studies were based on small numbers and majority of the papers focused primarily on CPS methodology. Therefore, it is necessary to study active self-protection to enable self-management functions, such as self-healing by applying CPS in shop-floor. In this paper, we have proposed modeling of shop-floor and a dynamic reconfigurable CPS scheme that can predict the occurrence of anomalies and self-protection in the model. For this purpose, SVM was used as a machine learning technology and it was possible to restrain overloading in manufacturing process. In addition, we design CPS framework based on machine learning for Industry 4.0, simulate it, and perform. Simulation results show the simulation model autonomously detects the abnormal situation and it is dynamically reconfigured through self-healing.

  12. An SVM-Based Solution for Fault Detection in Wind Turbines

    Directory of Open Access Journals (Sweden)

    Pedro Santos

    2015-03-01

    Full Text Available Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.

  13. Methodology for selection of attributes and operating conditions for SVM-Based fault locator's

    Directory of Open Access Journals (Sweden)

    Debbie Johan Arredondo Arteaga

    2017-01-01

    Full Text Available Context: Energy distribution companies must employ strategies to meet their timely and high quality service, and fault-locating techniques represent and agile alternative for restoring the electric service in the power distribution due to the size of distribution services (generally large and the usual interruptions in the service. However, these techniques are not robust enough and present some limitations in both computational cost and the mathematical description of the models they use. Method: This paper performs an analysis based on a Support Vector Machine for the evaluation of the proper conditions to adjust and validate a fault locator for distribution systems; so that it is possible to determine the minimum number of operating conditions that allow to achieve a good performance with a low computational effort. Results: We tested the proposed methodology in a prototypical distribution circuit, located in a rural area of Colombia. This circuit has a voltage of 34.5 KV and is subdivided in 20 zones. Additionally, the characteristics of the circuit allowed us to obtain a database of 630.000 records of single-phase faults and different operating conditions. As a result, we could determine that the locator showed a performance above 98% with 200 suitable selected operating conditions. Conclusions: It is possible to improve the performance of fault locators based on Support Vector Machine. Specifically, these improvements are achieved by properly selecting optimal operating conditions and attributes, since they directly affect the performance in terms of efficiency and the computational cost.

  14. An SVM-based solution for fault detection in wind turbines.

    Science.gov (United States)

    Santos, Pedro; Villa, Luisa F; Reñones, Aníbal; Bustillo, Andres; Maudes, Jesús

    2015-03-09

    Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.

  15. Fractal and twin SVM-based handgrip recognition for healthy subjects and trans-radial amputees using myoelectric signal.

    Science.gov (United States)

    Arjunan, Sridhar Poosapadi; Kumar, Dinesh Kant; Jayadeva J

    2016-02-01

    Identifying functional handgrip patterns using surface electromygram (sEMG) signal recorded from amputee residual muscle is required for controlling the myoelectric prosthetic hand. In this study, we have computed the signal fractal dimension (FD) and maximum fractal length (MFL) during different grip patterns performed by healthy and transradial amputee subjects. The FD and MFL of the sEMG, referred to as the fractal features, were classified using twin support vector machines (TSVM) to recognize the handgrips. TSVM requires fewer support vectors, is suitable for data sets with unbalanced distributions, and can simultaneously be trained for improving both sensitivity and specificity. When compared with other methods, this technique resulted in improved grip recognition accuracy, sensitivity, and specificity, and this improvement was significant (κ=0.91).

  16. VLSI Design of SVM-Based Seizure Detection System With On-Chip Learning Capability.

    Science.gov (United States)

    Feng, Lichen; Li, Zunchao; Wang, Yuanfa

    2018-02-01

    Portable automatic seizure detection system is very convenient for epilepsy patients to carry. In order to make the system on-chip trainable with high efficiency and attain high detection accuracy, this paper presents a very large scale integration (VLSI) design based on the nonlinear support vector machine (SVM). The proposed design mainly consists of a feature extraction (FE) module and an SVM module. The FE module performs the three-level Daubechies discrete wavelet transform to fit the physiological bands of the electroencephalogram (EEG) signal and extracts the time-frequency domain features reflecting the nonstationary signal properties. The SVM module integrates the modified sequential minimal optimization algorithm with the table-driven-based Gaussian kernel to enable efficient on-chip learning. The presented design is verified on an Altera Cyclone II field-programmable gate array and tested using the two publicly available EEG datasets. Experiment results show that the designed VLSI system improves the detection accuracy and training efficiency.

  17. A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

    Directory of Open Access Journals (Sweden)

    Hongzhuan Zhao

    2016-04-01

    Full Text Available With the enrichment of perception methods, modern transportation system has many physical objects whose states are influenced by many information factors so that it is a typical Cyber-Physical System (CPS. Thus, the traffic information is generally multi-sourced, heterogeneous and hierarchical. Existing research results show that the multisourced traffic information through accurate classification in the process of information fusion can achieve better parameters forecasting performance. For solving the problem of traffic information accurate classification, via analysing the characteristics of the multi-sourced traffic information and using redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM classification in information fusion, a multi-classification method using improved SVM in information fusion for traffic parameters forecasting is proposed. The experiment was conducted to examine the performance of the proposed scheme, and the results reveal that the method can get more accurate and practical outcomes.

  18. Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection.

    Science.gov (United States)

    Li, Baopu; Meng, Max Q-H

    2012-05-01

    Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.

  19. [Application of optimized parameters SVM based on photoacoustic spectroscopy method in fault diagnosis of power transformer].

    Science.gov (United States)

    Zhang, Yu-xin; Cheng, Zhi-feng; Xu, Zheng-ping; Bai, Jing

    2015-01-01

    In order to solve the problems such as complex operation, consumption for the carrier gas and long test period in traditional power transformer fault diagnosis approach based on dissolved gas analysis (DGA), this paper proposes a new method which is detecting 5 types of characteristic gas content in transformer oil such as CH4, C2H2, C2H4, C2H6 and H2 based on photoacoustic Spectroscopy and C2H2/C2H4, CH4/H2, C2H4/C2H6 three-ratios data are calculated. The support vector machine model was constructed using cross validation method under five support vector machine functions and four kernel functions, heuristic algorithms were used in parameter optimization for penalty factor c and g, which to establish the best SVM model for the highest fault diagnosis accuracy and the fast computing speed. Particles swarm optimization and genetic algorithm two types of heuristic algorithms were comparative studied in this paper for accuracy and speed in optimization. The simulation result shows that SVM model composed of C-SVC, RBF kernel functions and genetic algorithm obtain 97. 5% accuracy in test sample set and 98. 333 3% accuracy in train sample set, and genetic algorithm was about two times faster than particles swarm optimization in computing speed. The methods described in this paper has many advantages such as simple operation, non-contact measurement, no consumption for the carrier gas, long test period, high stability and sensitivity, the result shows that the methods described in this paper can instead of the traditional transformer fault diagnosis by gas chromatography and meets the actual project needs in transformer fault diagnosis.

  20. SVM-based multimodal classification of activities of daily living in Health Smart Homes: sensors, algorithms, and first experimental results.

    Science.gov (United States)

    Fleury, Anthony; Vacher, Michel; Noury, Norbert

    2010-03-01

    By 2050, about one third of the French population will be over 65. Our laboratory's current research focuses on the monitoring of elderly people at home, to detect a loss of autonomy as early as possible. Our aim is to quantify criteria such as the international activities of daily living (ADL) or the French Autonomie Gerontologie Groupes Iso-Ressources (AGGIR) scales, by automatically classifying the different ADL performed by the subject during the day. A Health Smart Home is used for this. Our Health Smart Home includes, in a real flat, infrared presence sensors (location), door contacts (to control the use of some facilities), temperature and hygrometry sensor in the bathroom, and microphones (sound classification and speech recognition). A wearable kinematic sensor also informs postural transitions (using pattern recognition) and walk periods (frequency analysis). This data collected from the various sensors are then used to classify each temporal frame into one of the ADL that was previously acquired (seven activities: hygiene, toilet use, eating, resting, sleeping, communication, and dressing/undressing). This is done using support vector machines. We performed a 1-h experimentation with 13 young and healthy subjects to determine the models of the different activities, and then we tested the classification algorithm (cross validation) with real data.

  1. Comparative Study on KNN and SVM Based Weather Classification Models for Day Ahead Short Term Solar PV Power Forecasting

    Directory of Open Access Journals (Sweden)

    Fei Wang

    2017-12-01

    Full Text Available Accurate solar photovoltaic (PV power forecasting is an essential tool for mitigating the negative effects caused by the uncertainty of PV output power in systems with high penetration levels of solar PV generation. Weather classification based modeling is an effective way to increase the accuracy of day-ahead short-term (DAST solar PV power forecasting because PV output power is strongly dependent on the specific weather conditions in a given time period. However, the accuracy of daily weather classification relies on both the applied classifiers and the training data. This paper aims to reveal how these two factors impact the classification performance and to delineate the relation between classification accuracy and sample dataset scale. Two commonly used classification methods, K-nearest neighbors (KNN and support vector machines (SVM are applied to classify the daily local weather types for DAST solar PV power forecasting using the operation data from a grid-connected PV plant in Hohhot, Inner Mongolia, China. We assessed the performance of SVM and KNN approaches, and then investigated the influences of sample scale, the number of categories, and the data distribution in different categories on the daily weather classification results. The simulation results illustrate that SVM performs well with small sample scale, while KNN is more sensitive to the length of the training dataset and can achieve higher accuracy than SVM with sufficient samples.

  2. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    Science.gov (United States)

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from DSSP method or predicted secondary structures from NPS@ and GOR4 methods. There were three combined input features PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4) used to test and train the SVM models. Similarly, four datasets RS90, DB433, LI1264 and SP1577 were used to develop the SVM models. These four SVM models developed were tested using three different benchmarking tests namely; (i) self consistency, (ii) seven fold cross validation test and (iii) independent case test. The maximum possible prediction accuracy of ~70% was observed in self consistency test for the SVM models of both LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features was used to test. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in independent case test, for the SVM models of above two same datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from NPS@ server were used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  3. Finding New Perovskite Halides via Machine learning

    Directory of Open Access Journals (Sweden)

    Ghanshyam ePilania

    2016-04-01

    Full Text Available Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning via building a support vector machine (SVM based classifier that uses elemental features (or descriptors to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.

  4. Finding New Perovskite Halides via Machine learning

    Science.gov (United States)

    Pilania, Ghanshyam; Balachandran, Prasanna V.; Kim, Chiho; Lookman, Turab

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.

  5. Localized thin-section CT with radiomics feature extraction and machine learning to classify early-detected pulmonary nodules from lung cancer screening

    Science.gov (United States)

    Tu, Shu-Ju; Wang, Chih-Wei; Pan, Kuang-Tse; Wu, Yi-Cheng; Wu, Chen-Te

    2018-03-01

    Lung cancer screening aims to detect small pulmonary nodules and decrease the mortality rate of those affected. However, studies from large-scale clinical trials of lung cancer screening have shown that the false-positive rate is high and positive predictive value is low. To address these problems, a technical approach is greatly needed for accurate malignancy differentiation among these early-detected nodules. We studied the clinical feasibility of an additional protocol of localized thin-section CT for further assessment on recalled patients from lung cancer screening tests. Our approach of localized thin-section CT was integrated with radiomics features extraction and machine learning classification which was supervised by pathological diagnosis. Localized thin-section CT images of 122 nodules were retrospectively reviewed and 374 radiomics features were extracted. In this study, 48 nodules were benign and 74 malignant. There were nine patients with multiple nodules and four with synchronous multiple malignant nodules. Different machine learning classifiers with a stratified ten-fold cross-validation were used and repeated 100 times to evaluate classification accuracy. Of the image features extracted from the thin-section CT images, 238 (64%) were useful in differentiating between benign and malignant nodules. These useful features include CT density (p  =  0.002 518), sigma (p  =  0.002 781), uniformity (p  =  0.032 41), and entropy (p  =  0.006 685). The highest classification accuracy was 79% by the logistic classifier. The performance metrics of this logistic classification model was 0.80 for the positive predictive value, 0.36 for the false-positive rate, and 0.80 for the area under the receiver operating characteristic curve. Our approach of direct risk classification supervised by the pathological diagnosis with localized thin-section CT and radiomics feature extraction may support clinical physicians in determining

  6. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    Science.gov (United States)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.

  7. Application of support vector machines in the evaluation of reliability generation and transmission systems; Aplicacao de maquinas de vetores suporte na avaliacao da confiabilidade de sistemas de geracao e transmissao

    Energy Technology Data Exchange (ETDEWEB)

    Dutra, Wellington Damascena; Resende, Leonidas Chaves de [Universidade Federal de Sao Joao Del-Rei (UFSJ), MG (Brazil); Manso, Luiz Antonio da Fonseca; Silva, Armando Martins Leite da [Universidade Federal de Itajuba (UNIFEI), MG (Brazil)

    2010-07-01

    This paper presents a methodology for assessing the reliability indices for composite generation and transmission systems based on Support Vector Machines (SVM). The importance of SVMs is its high generalization ability. The SVMs are used to classify data into two distinct classes. These can be named positive and negative. Thus, the basic idea is to classify the system states into success or failure. For this, a pre-classification of states is achieved by performing the proposed SVM-based neural network, where the sampled states during the beginning of the non-sequential Monte Carlo simulation (MCS) are considered as input data for training and validation sets. By adopting this procedure, a large number of states are classified by a simple evaluation of the network, providing significant reductions in computational costs. The proposed methodology is applied to the IEEE Reliability Test System and to the IEEE Modified Reliability Test System. (author)

  8. Two Classifiers Based on Serum Peptide Pattern for Prediction of HBV-Induced Liver Cirrhosis Using MALDI-TOF MS

    Directory of Open Access Journals (Sweden)

    Yuan Cao

    2013-01-01

    Full Text Available Chronic infection with hepatitis B virus (HBV is associated with the majority of cases of liver cirrhosis (LC in China. Although liver biopsy is the reference method for evaluation of cirrhosis, it is an invasive procedure with inherent risk. The aim of this study is to discover novel noninvasive specific serum biomarkers for the diagnosis of HBV-induced LC. We performed bead fractionation/MALDI-TOF MS analysis on sera from patients with LC. Thirteen feature peaks which had optimal discriminatory performance were obtained by using support-vector-machine-(SVM- based strategy. Based on the previous results, five supervised machine learning methods were employed to construct classifiers that discriminated proteomic spectra of patients with HBV-induced LC from those of controls. Here, we describe two novel methods for prediction of HBV-induced LC, termed LC-NB and LC-MLP, respectively. We obtained a sensitivity of 90.9%, a specificity of 94.9%, and overall accuracy of 93.8% on an independent test set. Comparisons with the existing methods showed that LC-NB and LC-MLP held better accuracy. Our study suggests that potential serum biomarkers can be determined for discriminating LC and non-LC cohorts by using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. These two classifiers could be used for clinical practice in HBV-induced LC assessment.

  9. Identification of Auditory Object-Specific Attention from Single-Trial Electroencephalogram Signals via Entropy Measures and Machine Learning

    Directory of Open Access Journals (Sweden)

    Yun Lu

    2018-05-01

    Full Text Available Existing research has revealed that auditory attention can be tracked from ongoing electroencephalography (EEG signals. The aim of this novel study was to investigate the identification of peoples’ attention to a specific auditory object from single-trial EEG signals via entropy measures and machine learning. Approximate entropy (ApEn, sample entropy (SampEn, composite multiscale entropy (CmpMSE and fuzzy entropy (FuzzyEn were used to extract the informative features of EEG signals under three kinds of auditory object-specific attention (Rest, Auditory Object1 Attention (AOA1 and Auditory Object2 Attention (AOA2. The linear discriminant analysis and support vector machine (SVM, were used to construct two auditory attention classifiers. The statistical results of entropy measures indicated that there were significant differences in the values of ApEn, SampEn, CmpMSE and FuzzyEn between Rest, AOA1 and AOA2. For the SVM-based auditory attention classifier, the auditory object-specific attention of Rest, AOA1 and AOA2 could be identified from EEG signals using ApEn, SampEn, CmpMSE and FuzzyEn as features and the identification rates were significantly different from chance level. The optimal identification was achieved by the SVM-based auditory attention classifier using CmpMSE with the scale factor τ = 10. This study demonstrated a novel solution to identify the auditory object-specific attention from single-trial EEG signals without the need to access the auditory stimulus.

  10. Solution to Detect, Classify, and Report Illicit Online Marketing and Sales of Controlled Substances via Twitter: Using Machine Learning and Web Forensics to Combat Digital Opioid Access

    Science.gov (United States)

    Klugman, Josh; Kuzmenko, Ella; Gupta, Rashmi

    2018-01-01

    Background On December 6 and 7, 2017, the US Department of Health and Human Services (HHS) hosted its first Code-a-Thon event aimed at leveraging technology and data-driven solutions to help combat the opioid epidemic. The authors—an interdisciplinary team from academia, the private sector, and the US Centers for Disease Control and Prevention—participated in the Code-a-Thon as part of the prevention track. Objective The aim of this study was to develop and deploy a methodology using machine learning to accurately detect the marketing and sale of opioids by illicit online sellers via Twitter as part of participation at the HHS Opioid Code-a-Thon event. Methods Tweets were collected from the Twitter public application programming interface stream filtered for common prescription opioid keywords in conjunction with participation in the Code-a-Thon from November 15, 2017 to December 5, 2017. An unsupervised machine learning–based approach was developed and used during the Code-a-Thon competition (24 hours) to obtain a summary of the content of the tweets to isolate those clusters associated with illegal online marketing and sale using a biterm topic model (BTM). After isolating relevant tweets, hyperlinks associated with these tweets were reviewed to assess the characteristics of illegal online sellers. Results We collected and analyzed 213,041 tweets over the course of the Code-a-Thon containing keywords codeine, percocet, vicodin, oxycontin, oxycodone, fentanyl, and hydrocodone. Using BTM, 0.32% (692/213,041) tweets were identified as being associated with illegal online marketing and sale of prescription opioids. After removing duplicates and dead links, we identified 34 unique “live” tweets, with 44% (15/34) directing consumers to illicit online pharmacies, 32% (11/34) linked to individual drug sellers, and 21% (7/34) used by marketing affiliates. In addition to offering the “no prescription” sale of opioids, many of these vendors also sold other

  11. Multi-view L2-SVM and its multi-view core vector machine.

    Science.gov (United States)

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. a Comparison Study of Different Kernel Functions for Svm-Based Classification of Multi-Temporal Polarimetry SAR Data

    Science.gov (United States)

    Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.

    2014-10-01

    In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imageries. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to higher Hilbert dimension space. These kernel functions include linear, polynomials and Radial Based Function (RBF). The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of H/A/α decomposition method are used in classification. The experimental tests show an SVM classifier with RBF kernel for three dates of data increases the Overall Accuracy (OA) to up to 3% in comparison to using linear kernel function, and up to 1% in comparison to a 3rd degree polynomial kernel function.

  13. Classification of masses on mammograms using support vector machine

    Science.gov (United States)

    Chu, Yong; Li, Lihua; Goldgof, Dmitry B.; Qui, Yan; Clark, Robert A.

    2003-05-01

    Mammography is the most effective method for early detection of breast cancer. However, the positive predictive value for classification of malignant and benign lesion from mammographic images is not very high. Clinical studies have shown that most biopsies for cancer are very low, between 15% and 30%. It is important to increase the diagnostic accuracy by improving the positive predictive value to reduce the number of unnecessary biopsies. In this paper, a new classification method was proposed to distinguish malignant from benign masses in mammography by Support Vector Machine (SVM) method. Thirteen features were selected based on receiver operating characteristic (ROC) analysis of classification using individual feature. These features include four shape features, two gradient features and seven Laws features. With these features, SVM was used to classify the masses into two categories, benign and malignant, in which a Gaussian kernel and sequential minimal optimization learning technique are performed. The data set used in this study consists of 193 cases, in which there are 96 benign cases and 97 malignant cases. The leave-one-out evaluation of SVM classifier was taken. The results show that the positive predict value of the presented method is 81.6% with the sensitivity of 83.7% and the false-positive rate of 30.2%. It demonstrated that the SVM-based classifier is effective in mass classification.

  14. Solution to Detect, Classify, and Report Illicit Online Marketing and Sales of Controlled Substances via Twitter: Using Machine Learning and Web Forensics to Combat Digital Opioid Access.

    Science.gov (United States)

    Mackey, Tim; Kalyanam, Janani; Klugman, Josh; Kuzmenko, Ella; Gupta, Rashmi

    2018-04-27

    On December 6 and 7, 2017, the US Department of Health and Human Services (HHS) hosted its first Code-a-Thon event aimed at leveraging technology and data-driven solutions to help combat the opioid epidemic. The authors—an interdisciplinary team from academia, the private sector, and the US Centers for Disease Control and Prevention—participated in the Code-a-Thon as part of the prevention track. The aim of this study was to develop and deploy a methodology using machine learning to accurately detect the marketing and sale of opioids by illicit online sellers via Twitter as part of participation at the HHS Opioid Code-a-Thon event. Tweets were collected from the Twitter public application programming interface stream filtered for common prescription opioid keywords in conjunction with participation in the Code-a-Thon from November 15, 2017 to December 5, 2017. An unsupervised machine learning–based approach was developed and used during the Code-a-Thon competition (24 hours) to obtain a summary of the content of the tweets to isolate those clusters associated with illegal online marketing and sale using a biterm topic model (BTM). After isolating relevant tweets, hyperlinks associated with these tweets were reviewed to assess the characteristics of illegal online sellers. We collected and analyzed 213,041 tweets over the course of the Code-a-Thon containing keywords codeine, percocet, vicodin, oxycontin, oxycodone, fentanyl, and hydrocodone. Using BTM, 0.32% (692/213,041) tweets were identified as being associated with illegal online marketing and sale of prescription opioids. After removing duplicates and dead links, we identified 34 unique “live” tweets, with 44% (15/34) directing consumers to illicit online pharmacies, 32% (11/34) linked to individual drug sellers, and 21% (7/34) used by marketing affiliates. In addition to offering the “no prescription” sale of opioids, many of these vendors also sold other controlled substances and illicit drugs

  15. On-line detection of apnea/hypopnea events using SpO2 signal: a rule-based approach employing binary classifier models.

    Science.gov (United States)

    Koley, Bijoy Laxmi; Dey, Debangshu

    2014-01-01

    This paper presents an online method for automatic detection of apnea/hypopnea events, with the help of oxygen saturation (SpO2) signal, measured at fingertip by Bluetooth nocturnal pulse oximeter. Event detection is performed by identifying abnormal data segments from the recorded SpO2 signal, employing a binary classifier model based on a support vector machine (SVM). Thereafter the abnormal segment is further analyzed to detect different states within the segment, i.e., steady, desaturation, and resaturation, with the help of another SVM-based binary ensemble classifier model. Finally, a heuristically obtained rule-based system is used to identify the apnea/hypopnea events from the time-sequenced decisions of these classifier models. In the developmental phase, a set of 34 time domain-based features was extracted from the segmented SpO2 signal using an overlapped windowing technique. Later, an optimal set of features was selected on the basis of recursive feature elimination technique. A total of 34 subjects were included in the study. The results show average event detection accuracies of 96.7% and 93.8% for the offline and the online tests, respectively. The proposed system provides direct estimation of the apnea/hypopnea index with the help of a relatively inexpensive and widely available pulse oximeter. Moreover, the system can be monitored and accessed by physicians through LAN/WAN/Internet and can be extended to deploy in Bluetooth-enabled mobile phones.

  16. Classifying Microorganisms

    DEFF Research Database (Denmark)

    Sommerlund, Julie

    2006-01-01

    This paper describes the coexistence of two systems for classifying organisms and species: a dominant genetic system and an older naturalist system. The former classifies species and traces their evolution on the basis of genetic characteristics, while the latter employs physiological characteris......This paper describes the coexistence of two systems for classifying organisms and species: a dominant genetic system and an older naturalist system. The former classifies species and traces their evolution on the basis of genetic characteristics, while the latter employs physiological...... characteristics. The coexistence of the classification systems does not lead to a conflict between them. Rather, the systems seem to co-exist in different configurations, through which they are complementary, contradictory and inclusive in different situations-sometimes simultaneously. The systems come...

  17. 基于信息熵的SVM入侵检测技术%Exploring SVM-based intrusion detection through information entropy theory

    Institute of Scientific and Technical Information of China (English)

    朱文杰; 王强; 翟献军

    2013-01-01

    在传统基于SVM的入侵检测中,核函数构造和特征选择采用先验知识,普遍存在准确度不高、效率低下的问题.通过信息熵理论与SVM算法相结合的方法改进为基于信息熵的SVM入侵检测算法,可以提高入侵检测的准确性,提升入侵检测的效率.基于信息熵的SVM入侵检测算法包括两个方面:一方面,根据样本包含的用户信息熵和方差,将样本特征统一,以特征是否属于置信区间来度量.将得到的样本特征置信向量作为SVM核函数的构造参数,既可保证训练样本集与最优分类面之间的对应关系,又可得到入侵检测需要的最大分类间隔;另一方面,将样本包含的用户信息量作为度量大幅度约简样本特征子集,不但降低了样本计算规模,而且提高了分类器的训练速度.实验表明,该算法在入侵检测系统中的应用优于传统的SVM算法.%In traditional SVM based intrusion detection approaches,both core function construction and feature selection use prior knowdege.Due to this,they are not only inefficient but also inaccurate.It is observed that integrating information entropy theory into SVM-based intrusion detection can enhance both the precision and the speed.Concludely speaking,SVM-based entropy intrusion detection algorithms are made up of two aspects:on one hand,setting sample confidence vector as core function's constructor of SVM algorithm can guarantee the mapping relationship between training sample and optimization classification plane.Also,the intrusion detection's maximum interval can be acquired.On the other hand,simplifying feature subset with samples's entropy as metric standard can not only shrink the computing scale but also improve the speed.Experiments prove that the SVM based entropy intrusion detection algoritm outperfomrs other tradional algorithms.

  18. Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences

    Directory of Open Access Journals (Sweden)

    Ji-Yong An

    2016-01-01

    Full Text Available We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM model and Local Phase Quantization (LPQ to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM, reducing the influence of noise using a Principal Component Analysis (PCA, and using a Relevance Vector Machine (RVM based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.

  19. GAPscreener: An automatic tool for screening human genetic association literature in PubMed using the support vector machine technique

    Directory of Open Access Journals (Sweden)

    Khoury Muin J

    2008-04-01

    Full Text Available Abstract Background Synthesis of data from published human genetic association studies is a critical step in the translation of human genome discoveries into health applications. Although genetic association studies account for a substantial proportion of the abstracts in PubMed, identifying them with standard queries is not always accurate or efficient. Further automating the literature-screening process can reduce the burden of a labor-intensive and time-consuming traditional literature search. The Support Vector Machine (SVM, a well-established machine learning technique, has been successful in classifying text, including biomedical literature. The GAPscreener, a free SVM-based software tool, can be used to assist in screening PubMed abstracts for human genetic association studies. Results The data source for this research was the HuGE Navigator, formerly known as the HuGE Pub Lit database. Weighted SVM feature selection based on a keyword list obtained by the two-way z score method demonstrated the best screening performance, achieving 97.5% recall, 98.3% specificity and 31.9% precision in performance testing. Compared with the traditional screening process based on a complex PubMed query, the SVM tool reduced by about 90% the number of abstracts requiring individual review by the database curator. The tool also ascertained 47 articles that were missed by the traditional literature screening process during the 4-week test period. We examined the literature on genetic associations with preterm birth as an example. Compared with the traditional, manual process, the GAPscreener both reduced effort and improved accuracy. Conclusion GAPscreener is the first free SVM-based application available for screening the human genetic association literature in PubMed with high recall and specificity. The user-friendly graphical user interface makes this a practical, stand-alone application. The software can be downloaded at no charge.

  20. Fingerprint prediction using classifier ensembles

    CSIR Research Space (South Africa)

    Molale, P

    2011-11-01

    Full Text Available ); logistic discrimination (LgD), k-nearest neighbour (k-NN), artificial neural network (ANN), association rules (AR) decision tree (DT), naive Bayes classifier (NBC) and the support vector machine (SVM). The performance of several multiple classifier systems...

  1. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants.

    Science.gov (United States)

    Vieira, Lucas Maciel; Grativol, Clicia; Thiebaut, Flavia; Carvalho, Thais G; Hardoim, Pablo R; Hemerly, Adriana; Lifschitz, Sergio; Ferreira, Paulo Cavalcanti Gomes; Walter, Maria Emilia M T

    2017-03-04

    Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane ( Saccharum spp.) and in maize ( Zea mays ). From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.

  2. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants

    Directory of Open Access Journals (Sweden)

    Lucas Maciel Vieira

    2017-03-01

    Full Text Available Non-coding RNAs (ncRNAs constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs, which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM. We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane (Saccharum spp. and in maize (Zea mays. From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.

  3. Carbon classified?

    DEFF Research Database (Denmark)

    Lippert, Ingmar

    2012-01-01

    . Using an actor- network theory (ANT) framework, the aim is to investigate the actors who bring together the elements needed to classify their carbon emission sources and unpack the heterogeneous relations drawn on. Based on an ethnographic study of corporate agents of ecological modernisation over...... a period of 13 months, this paper provides an exploration of three cases of enacting classification. Drawing on ANT, we problematise the silencing of a range of possible modalities of consumption facts and point to the ontological ethics involved in such performances. In a context of global warming...

  4. SVM-based prediction of propeptide cleavage sites in spider toxins identifies toxin innovation in an Australian tarantula.

    Directory of Open Access Journals (Sweden)

    Emily S W Wong

    Full Text Available Spider neurotoxins are commonly used as pharmacological tools and are a popular source of novel compounds with therapeutic and agrochemical potential. Since venom peptides are inherently toxic, the host spider must employ strategies to avoid adverse effects prior to venom use. It is partly for this reason that most spider toxins encode a protective proregion that upon enzymatic cleavage is excised from the mature peptide. In order to identify the mature toxin sequence directly from toxin transcripts, without resorting to protein sequencing, the propeptide cleavage site in the toxin precursor must be predicted bioinformatically. We evaluated different machine learning strategies (support vector machines, hidden Markov model and decision tree and developed an algorithm (SpiderP for prediction of propeptide cleavage sites in spider toxins. Our strategy uses a support vector machine (SVM framework that combines both local and global sequence information. Our method is superior or comparable to current tools for prediction of propeptide sequences in spider toxins. Evaluation of the SVM method on an independent test set of known toxin sequences yielded 96% sensitivity and 100% specificity. Furthermore, we sequenced five novel peptides (not used to train the final predictor from the venom of the Australian tarantula Selenotypus plumipes to test the accuracy of the predictor and found 80% sensitivity and 99.6% 8-mer specificity. Finally, we used the predictor together with homology information to predict and characterize seven groups of novel toxins from the deeply sequenced venom gland transcriptome of S. plumipes, which revealed structural complexity and innovations in the evolution of the toxins. The precursor prediction tool (SpiderP is freely available on ArachnoServer (http://www.arachnoserver.org/spiderP.html, a web portal to a comprehensive relational database of spider toxins. All training data, test data, and scripts used are available from

  5. Voice Activity Detection Using Fuzzy Entropy and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    R. Johny Elton

    2016-08-01

    Full Text Available This paper proposes support vector machine (SVM based voice activity detection using FuzzyEn to improve detection performance under noisy conditions. The proposed voice activity detection (VAD uses fuzzy entropy (FuzzyEn as a feature extracted from noise-reduced speech signals to train an SVM model for speech/non-speech classification. The proposed VAD method was tested by conducting various experiments by adding real background noises of different signal-to-noise ratios (SNR ranging from −10 dB to 10 dB to actual speech signals collected from the TIMIT database. The analysis proves that FuzzyEn feature shows better results in discriminating noise and corrupted noisy speech. The efficacy of the SVM classifier was validated using 10-fold cross validation. Furthermore, the results obtained by the proposed method was compared with those of previous standardized VAD algorithms as well as recently developed methods. Performance comparison suggests that the proposed method is proven to be more efficient in detecting speech under various noisy environments with an accuracy of 93.29%, and the FuzzyEn feature detects speech efficiently even at low SNR levels.

  6. The method and efficacy of support vector machine classifiers based on texture features and multi-resolution histogram from 18F-FDG PET-CT images for the evaluation of mediastinal lymph nodes in patients with lung cancer

    International Nuclear Information System (INIS)

    Gao, Xuan; Chu, Chunyu; Li, Yingci; Lu, Peiou; Wang, Wenzhi; Liu, Wanyu; Yu, Lijuan

    2015-01-01

    Highlights: • Three support vector machine classifiers were constructed from PET-CT images. • The areas under the ROC curve for SVM1, SVM2, and SVM3 were 0.689, 0.579, and 0.685, respectively. • The areas under curves for maximum short diameter and SUV max were 0.684 and 0.652, respectively. • The algorithm based on SVM was potential in the diagnosis of mediastinal lymph nodes. - Abstract: Objectives: In clinical practice, image analysis is dependent on simply visual perception and the diagnostic efficacy of this analysis pattern is limited for mediastinal lymph nodes in patients with lung cancer. In order to improve diagnostic efficacy, we developed a new computer-based algorithm and tested its diagnostic efficacy. Methods: 132 consecutive patients with lung cancer underwent 18 F-FDG PET/CT examination before treatment. After all data were imported into the database of an on-line medical image analysis platform, the diagnostic efficacy of visual analysis was first evaluated without knowing pathological results, and the maximum short diameter and maximum standardized uptake value (SUV max ) were measured. Then lymph nodes were segmented manually. Three classifiers based on support vector machine (SVM) were constructed from CT, PET, and combined PET-CT images, respectively. The diagnostic efficacy of SVM classifiers was obtained and evaluated. Results: According to ROC curves, the areas under curves for maximum short diameter and SUV max were 0.684 and 0.652, respectively. The areas under the ROC curve for SVM1, SVM2, and SVM3 were 0.689, 0.579, and 0.685, respectively. Conclusion: The algorithm based on SVM was potential in the diagnosis of mediastinal lymph nodes

  7. Machine-Learning Classifier for Patients with Major Depressive Disorder: Multifeature Approach Based on a High-Order Minimum Spanning Tree Functional Brain Network.

    Science.gov (United States)

    Guo, Hao; Qin, Mengna; Chen, Junjie; Xu, Yong; Xiang, Jie

    2017-01-01

    High-order functional connectivity networks are rich in time information that can reflect dynamic changes in functional connectivity between brain regions. Accordingly, such networks are widely used to classify brain diseases. However, traditional methods for processing high-order functional connectivity networks generally include the clustering method, which reduces data dimensionality. As a result, such networks cannot be effectively interpreted in the context of neurology. Additionally, due to the large scale of high-order functional connectivity networks, it can be computationally very expensive to use complex network or graph theory to calculate certain topological properties. Here, we propose a novel method of generating a high-order minimum spanning tree functional connectivity network. This method increases the neurological significance of the high-order functional connectivity network, reduces network computing consumption, and produces a network scale that is conducive to subsequent network analysis. To ensure the quality of the topological information in the network structure, we used frequent subgraph mining technology to capture the discriminative subnetworks as features and combined this with quantifiable local network features. Then we applied a multikernel learning technique to the corresponding selected features to obtain the final classification results. We evaluated our proposed method using a data set containing 38 patients with major depressive disorder and 28 healthy controls. The experimental results showed a classification accuracy of up to 97.54%.

  8. A Tool for Creating Regionally Calibrated High-Resolution Land Cover Data Sets for the West African Sahel: Using Machine Learning to Scale Up Hand-Classified Maps in a Data-Sparse Environment

    Science.gov (United States)

    Van Gordon, M.; Van Gordon, S.; Min, A.; Sullivan, J.; Weiner, Z.; Tappan, G. G.

    2017-12-01

    Using support vector machine (SVM) learning and high-accuracy hand-classified maps, we have developed a publicly available land cover classification tool for the West African Sahel. Our classifier produces high-resolution and regionally calibrated land cover maps for the Sahel, representing a significant contribution to the data available for this region. Global land cover products are unreliable for the Sahel, and accurate land cover data for the region are sparse. To address this gap, the U.S. Geological Survey and the Regional Center for Agriculture, Hydrology and Meteorology (AGRHYMET) in Niger produced high-quality land cover maps for the region via hand-classification of Landsat images. This method produces highly accurate maps, but the time and labor required constrain the spatial and temporal resolution of the data products. By using these hand-classified maps alongside SVM techniques, we successfully increase the resolution of the land cover maps by 1-2 orders of magnitude, from 2km-decadal resolution to 30m-annual resolution. These high-resolution regionally calibrated land cover datasets, along with the classifier we developed to produce them, lay the foundation for major advances in studies of land surface processes in the region. These datasets will provide more accurate inputs for food security modeling, hydrologic modeling, analyses of land cover change and climate change adaptation efforts. The land cover classification tool we have developed will be publicly available for use in creating additional West Africa land cover datasets with future remote sensing data and can be adapted for use in other parts of the world.

  9. SVM-based multisensor data fusion for phase concentration measurement in biomass-coal co-combustion

    Science.gov (United States)

    Wang, Xiaoxin; Hu, Hongli; Jia, Huiqin; Tang, Kaihao

    2018-05-01

    In this paper, the electrical method combines the electrostatic sensor and capacitance sensor to measure the phase concentration of pulverized coal/biomass/air three-phase flow through data fusion technology. In order to eliminate the effects of flow regimes and improve the accuracy of the phase concentration measurement, the mel frequency cepstrum coefficient features extracted from electrostatic signals are used to train the Continuous Gaussian Mixture Hidden Markov Model (CGHMM) for flow regime identification. Support Vector Machine (SVM) is introduced to establish the concentration information fusion model under identified flow regimes. The CGHMM models and SVM models are transplanted on digital signal processing (DSP) to realize on-line accurate measurement. The DSP flow regime identification time is 1.4 ms, and the concentration predict time is 164 μs, which can fully meet the real-time requirement. The average absolute value of the relative error of the pulverized coal is about 1.5% and that of the biomass is about 2.2%.

  10. Fault Diagnosis of a Reconfigurable Crawling–Rolling Robot Based on Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Karthikeyan Elangovan

    2017-10-01

    Full Text Available As robots begin to perform jobs autonomously, with minimal or no human intervention, a new challenge arises: robots also need to autonomously detect errors and recover from faults. In this paper, we present a Support Vector Machine (SVM-based fault diagnosis system for a bio-inspired reconfigurable robot named Scorpio. The diagnosis system needs to detect and classify faults while Scorpio uses its crawling and rolling locomotion modes. Specifically, we classify between faulty and non-faulty conditions by analyzing onboard Inertial Measurement Unit (IMU sensor data. The data capture nine different locomotion gaits, which include rolling and crawling modes, at three different speeds. Statistical methods are applied to extract features and to reduce the dimensionality of original IMU sensor data features. These statistical features were given as inputs for training and testing. Additionally, the c-Support Vector Classification (c-SVC and nu-SVC models of SVM, and their fault classification accuracies, were compared. The results show that the proposed SVM approach can be used to autonomously diagnose locomotion gait faults while the reconfigurable robot is in operation.

  11. Support-Vector-Machine-Based Reduced-Order Model for Limit Cycle Oscillation Prediction of Nonlinear Aeroelastic System

    Directory of Open Access Journals (Sweden)

    Gang Chen

    2012-01-01

    Full Text Available It is not easy for the system identification-based reduced-order model (ROM and even eigenmode based reduced-order model to predict the limit cycle oscillation generated by the nonlinear unsteady aerodynamics. Most of these traditional ROMs are sensitive to the flow parameter variation. In order to deal with this problem, a support vector machine- (SVM- based ROM was investigated and the general construction framework was proposed. The two-DOF aeroelastic system for the NACA 64A010 airfoil in transonic flow was then demonstrated for the new SVM-based ROM. The simulation results show that the new ROM can capture the LCO behavior of the nonlinear aeroelastic system with good accuracy and high efficiency. The robustness and computational efficiency of the SVM-based ROM would provide a promising tool for real-time flight simulation including nonlinear aeroelastic effects.

  12. Quantum ensembles of quantum classifiers.

    Science.gov (United States)

    Schuld, Maria; Petruccione, Francesco

    2018-02-09

    Quantum machine learning witnesses an increasing amount of quantum algorithms for data-driven decision making, a problem with potential applications ranging from automated image recognition to medical diagnosis. Many of those algorithms are implementations of quantum classifiers, or models for the classification of data inputs with a quantum computer. Following the success of collective decision making with ensembles in classical machine learning, this paper introduces the concept of quantum ensembles of quantum classifiers. Creating the ensemble corresponds to a state preparation routine, after which the quantum classifiers are evaluated in parallel and their combined decision is accessed by a single-qubit measurement. This framework naturally allows for exponentially large ensembles in which - similar to Bayesian learning - the individual classifiers do not have to be trained. As an example, we analyse an exponentially large quantum ensemble in which each classifier is weighed according to its performance in classifying the training data, leading to new results for quantum as well as classical machine learning.

  13. LCC: Light Curves Classifier

    Science.gov (United States)

    Vo, Martin

    2017-08-01

    Light Curves Classifier uses data mining and machine learning to obtain and classify desired objects. This task can be accomplished by attributes of light curves or any time series, including shapes, histograms, or variograms, or by other available information about the inspected objects, such as color indices, temperatures, and abundances. After specifying features which describe the objects to be searched, the software trains on a given training sample, and can then be used for unsupervised clustering for visualizing the natural separation of the sample. The package can be also used for automatic tuning parameters of used methods (for example, number of hidden neurons or binning ratio). Trained classifiers can be used for filtering outputs from astronomical databases or data stored locally. The Light Curve Classifier can also be used for simple downloading of light curves and all available information of queried stars. It natively can connect to OgleII, OgleIII, ASAS, CoRoT, Kepler, Catalina and MACHO, and new connectors or descriptors can be implemented. In addition to direct usage of the package and command line UI, the program can be used through a web interface. Users can create jobs for ”training” methods on given objects, querying databases and filtering outputs by trained filters. Preimplemented descriptors, classifier and connectors can be picked by simple clicks and their parameters can be tuned by giving ranges of these values. All combinations are then calculated and the best one is used for creating the filter. Natural separation of the data can be visualized by unsupervised clustering.

  14. Mapping membrane activity in undiscovered peptide sequence space using machine learning.

    Science.gov (United States)

    Lee, Ernest Y; Fulan, Benjamin M; Wong, Gerard C L; Ferguson, Andrew L

    2016-11-29

    There are some ∼1,100 known antimicrobial peptides (AMPs), which permeabilize microbial membranes but have diverse sequences. Here, we develop a support vector machine (SVM)-based classifier to investigate ⍺-helical AMPs and the interrelated nature of their functional commonality and sequence homology. SVM is used to search the undiscovered peptide sequence space and identify Pareto-optimal candidates that simultaneously maximize the distance σ from the SVM hyperplane (thus maximize its "antimicrobialness") and its ⍺-helicity, but minimize mutational distance to known AMPs. By calibrating SVM machine learning results with killing assays and small-angle X-ray scattering (SAXS), we find that the SVM metric σ correlates not with a peptide's minimum inhibitory concentration (MIC), but rather its ability to generate negative Gaussian membrane curvature. This surprising result provides a topological basis for membrane activity common to AMPs. Moreover, we highlight an important distinction between the maximal recognizability of a sequence to a trained AMP classifier (its ability to generate membrane curvature) and its maximal antimicrobial efficacy. As mutational distances are increased from known AMPs, we find AMP-like sequences that are increasingly difficult for nature to discover via simple mutation. Using the sequence map as a discovery tool, we find a unexpectedly diverse taxonomy of sequences that are just as membrane-active as known AMPs, but with a broad range of primary functions distinct from AMP functions, including endogenous neuropeptides, viral fusion proteins, topogenic peptides, and amyloids. The SVM classifier is useful as a general detector of membrane activity in peptide sequences.

  15. A Fast SVM-Based Tongue’s Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Directory of Open Access Journals (Sweden)

    Nur Diyana Kamarudin

    2017-01-01

    Full Text Available In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye’s ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue’s multicolour classification based on a support vector machine (SVM whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black, deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  16. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Science.gov (United States)

    Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds. PMID:29065640

  17. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis.

    Science.gov (United States)

    Kamarudin, Nur Diyana; Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k -means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k -means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  18. Prediction of Banking Systemic Risk Based on Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Shouwei Li

    2013-01-01

    Full Text Available Banking systemic risk is a complex nonlinear phenomenon and has shed light on the importance of safeguarding financial stability by recent financial crisis. According to the complex nonlinear characteristics of banking systemic risk, in this paper we apply support vector machine (SVM to the prediction of banking systemic risk in an attempt to suggest a new model with better explanatory power and stability. We conduct a case study of an SVM-based prediction model for Chinese banking systemic risk and find the experiment results showing that support vector machine is an efficient method in such case.

  19. Hierarchical mixtures of naive Bayes classifiers

    NARCIS (Netherlands)

    Wiering, M.A.

    2002-01-01

    Naive Bayes classifiers tend to perform very well on a large number of problem domains, although their representation power is quite limited compared to more sophisticated machine learning algorithms. In this pa- per we study combining multiple naive Bayes classifiers by using the hierar- chical

  20. Feature extraction for dynamic integration of classifiers

    NARCIS (Netherlands)

    Pechenizkiy, M.; Tsymbal, A.; Puuronen, S.; Patterson, D.W.

    2007-01-01

    Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. In this paper, we present an algorithm for the dynamic integration of classifiers in the space of extracted features (FEDIC). It is based on the technique

  1. Kernel based machine learning algorithm for the efficient prediction of type III polyketide synthase family of proteins

    Directory of Open Access Journals (Sweden)

    Mallika V

    2010-03-01

    Full Text Available Type III Polyketide synthases (PKS are family of proteins considered to have significant role in the biosynthesis of various polyketides in plants, fungi and bacteria. As these proteins show positive effects to human health, more researches are going on regarding this particular protein. Developing a tool to identify the probability of sequence, being a type III polyketide synthase will minimize the time consumption and manpower efforts. In this approach, we have designed and implemented PKSIIIpred, a high performance prediction server for type III PKS where the classifier is Support Vector Machine (SVM. Based on the limited training dataset, the tool efficiently predicts the type III PKS superfamily of proteins with high sensitivity and specificity. PKSIIIpred is available at http://type3pks.in/prediction/. We expect that this tool may serve as a useful resource for type III PKS researchers. Currently work is being progressed for further betterment of prediction accuracy by including more sequence features in the training dataset.

  2. Energy-Efficient Neuromorphic Classifiers.

    Science.gov (United States)

    Martí, Daniel; Rigotti, Mattia; Seok, Mingoo; Fusi, Stefano

    2016-10-01

    Neuromorphic engineering combines the architectural and computational principles of systems neuroscience with semiconductor electronics, with the aim of building efficient and compact devices that mimic the synaptic and neural machinery of the brain. The energy consumptions promised by neuromorphic engineering are extremely low, comparable to those of the nervous system. Until now, however, the neuromorphic approach has been restricted to relatively simple circuits and specialized functions, thereby obfuscating a direct comparison of their energy consumption to that used by conventional von Neumann digital machines solving real-world tasks. Here we show that a recent technology developed by IBM can be leveraged to realize neuromorphic circuits that operate as classifiers of complex real-world stimuli. Specifically, we provide a set of general prescriptions to enable the practical implementation of neural architectures that compete with state-of-the-art classifiers. We also show that the energy consumption of these architectures, realized on the IBM chip, is typically two or more orders of magnitude lower than that of conventional digital machines implementing classifiers with comparable performance. Moreover, the spike-based dynamics display a trade-off between integration time and accuracy, which naturally translates into algorithms that can be flexibly deployed for either fast and approximate classifications, or more accurate classifications at the mere expense of longer running times and higher energy costs. This work finally proves that the neuromorphic approach can be efficiently used in real-world applications and has significant advantages over conventional digital devices when energy consumption is considered.

  3. Hybrid Neuro-Fuzzy Classifier Based On Nefclass Model

    Directory of Open Access Journals (Sweden)

    Bogdan Gliwa

    2011-01-01

    Full Text Available The paper presents hybrid neuro-fuzzy classifier, based on NEFCLASS model, which wasmodified. The presented classifier was compared to popular classifiers – neural networks andk-nearest neighbours. Efficiency of modifications in classifier was compared with methodsused in original model NEFCLASS (learning methods. Accuracy of classifier was testedusing 3 datasets from UCI Machine Learning Repository: iris, wine and breast cancer wisconsin.Moreover, influence of ensemble classification methods on classification accuracy waspresented.

  4. Error minimizing algorithms for nearest eighbor classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory; Hush, Don [Los Alamos National Laboratory; Zimmer, G. Beate [TEXAS A& M

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filter first introd uced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  5. Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values.

    Directory of Open Access Journals (Sweden)

    Talayeh Razzaghi

    Full Text Available This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries, with imbalance in classes of interests, leading to serious bias in predictive modeling. Since standard data mining methods often produce poor performance measures, we argue for development of specialized techniques of data-preprocessing and classification. In this paper, we propose a new method to simultaneously classify large datasets and reduce the effects of missing values. It is based on a multilevel framework of the cost-sensitive SVM and the expected maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values as well as real data in health applications, and show that our multilevel SVM-based method produces fast, and more accurate and robust classification results.

  6. A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data.

    Science.gov (United States)

    Hu, Yongli; Hase, Takeshi; Li, Hui Peng; Prabhakar, Shyam; Kitano, Hiroaki; Ng, See Kiong; Ghosh, Samik; Wee, Lawrence Jin Kiat

    2016-12-22

    The ability to sequence the transcriptomes of single cells using single-cell RNA-seq sequencing technologies presents a shift in the scientific paradigm where scientists, now, are able to concurrently investigate the complex biology of a heterogeneous population of cells, one at a time. However, till date, there has not been a suitable computational methodology for the analysis of such intricate deluge of data, in particular techniques which will aid the identification of the unique transcriptomic profiles difference between the different cellular subtypes. In this paper, we describe the novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector machine (SVM) and Random Forest (RF)). Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, to best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. Also, these genes possessed a higher discriminative power (enhanced prediction accuracy) as compared commonly used statistical techniques or geneset-based approaches. Further downstream network reconstruction analysis was carried out to unravel hidden general regulatory networks where novel interactions could be further validated in web-lab experimentation and be useful candidates to be targeted for the treatment of neuronal developmental diseases. This novel approach reported for is able to identify transcripts, with reported neuronal involvement, which optimally differentiate neocortical cells and neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles like that of the study of the cancer progression and treatment within a highly heterogeneous tumour.

  7. A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks.

    Science.gov (United States)

    Mei, Suyu; Zhu, Hao

    2015-01-26

    Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus.

  8. Joint Training of Deep Boltzmann Machines

    OpenAIRE

    Goodfellow, Ian; Courville, Aaron; Bengio, Yoshua

    2012-01-01

    We introduce a new method for training deep Boltzmann machines jointly. Prior methods require an initial learning pass that trains the deep Boltzmann machine greedily, one layer at a time, or do not perform well on classifi- cation tasks.

  9. Relevance vector machine technique for the inverse scattering problem

    International Nuclear Information System (INIS)

    Wang Fang-Fang; Zhang Ye-Rong

    2012-01-01

    A novel method based on the relevance vector machine (RVM) for the inverse scattering problem is presented in this paper. The nonlinearity and the ill-posedness inherent in this problem are simultaneously considered. The nonlinearity is embodied in the relation between the scattered field and the target property, which can be obtained through the RVM training process. Besides, rather than utilizing regularization, the ill-posed nature of the inversion is naturally accounted for because the RVM can produce a probabilistic output. Simulation results reveal that the proposed RVM-based approach can provide comparative performances in terms of accuracy, convergence, robustness, generalization, and improved performance in terms of sparse property in comparison with the support vector machine (SVM) based approach. (general)

  10. RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences

    Directory of Open Access Journals (Sweden)

    Ji-Yong An

    2016-05-01

    Full Text Available Protein-Protein Interactions (PPIs play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly more important, which has prompted the development of technologies that are capable of discovering large-scale PPIs. Although many high-throughput biological technologies have been proposed to detect PPIs, there are unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For the sake of these reasons, in silico methods are attracting much attention due to their good performances in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM model and Average Blocks (AB to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM, reducing the influence of noise using a Principal Component Analysis (PCA, and using a Relevance Vector Machine (RVM based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets, and achieved very high accuracies of 92.98% and 95.58% respectively, which is significantly better than previous works. In addition, we also obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on other five independent datasets C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli for cross-species prediction. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool. To facilitate extensive studies for future proteomics research, we developed

  11. Using Neural Networks to Classify Digitized Images of Galaxies

    Science.gov (United States)

    Goderya, S. N.; McGuire, P. C.

    2000-12-01

    Automated classification of Galaxies into Hubble types is of paramount importance to study the large scale structure of the Universe, particularly as survey projects like the Sloan Digital Sky Survey complete their data acquisition of one million galaxies. At present it is not possible to find robust and efficient artificial intelligence based galaxy classifiers. In this study we will summarize progress made in the development of automated galaxy classifiers using neural networks as machine learning tools. We explore the Bayesian linear algorithm, the higher order probabilistic network, the multilayer perceptron neural network and Support Vector Machine Classifier. The performance of any machine classifier is dependant on the quality of the parameters that characterize the different groups of galaxies. Our effort is to develop geometric and invariant moment based parameters as input to the machine classifiers instead of the raw pixel data. Such an approach reduces the dimensionality of the classifier considerably, and removes the effects of scaling and rotation, and makes it easier to solve for the unknown parameters in the galaxy classifier. To judge the quality of training and classification we develop the concept of Mathews coefficients for the galaxy classification community. Mathews coefficients are single numbers that quantify classifier performance even with unequal prior probabilities of the classes.

  12. Dynamic integration of classifiers in the space of principal components

    NARCIS (Netherlands)

    Tsymbal, A.; Pechenizkiy, M.; Puuronen, S.; Patterson, D.W.; Kalinichenko, L.A.; Manthey, R.; Thalheim, B.; Wloka, U.

    2003-01-01

    Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. It was shown that, for an ensemble to be successful, it should consist of accurate and diverse base classifiers. However, it is also important that the

  13. Classifying Returns as Extreme

    DEFF Research Database (Denmark)

    Christiansen, Charlotte

    2014-01-01

    I consider extreme returns for the stock and bond markets of 14 EU countries using two classification schemes: One, the univariate classification scheme from the previous literature that classifies extreme returns for each market separately, and two, a novel multivariate classification scheme tha...

  14. Upport vector machines for nonlinear kernel ARMA system identification.

    Science.gov (United States)

    Martínez-Ramón, Manel; Rojo-Alvarez, José Luis; Camps-Valls, Gustavo; Muñioz-Marí, Jordi; Navia-Vázquez, Angel; Soria-Olivas, Emilio; Figueiras-Vidal, Aníbal R

    2006-11-01

    Nonlinear system identification based on support vector machines (SVM) has been usually addressed by means of the standard SVM regression (SVR), which can be seen as an implicit nonlinear autoregressive and moving average (ARMA) model in some reproducing kernel Hilbert space (RKHS). The proposal of this letter is twofold. First, the explicit consideration of an ARMA model in an RKHS (SVM-ARMA2K) is proposed. We show that stating the ARMA equations in an RKHS leads to solving the regularized normal equations in that RKHS, in terms of the autocorrelation and cross correlation of the (nonlinearly) transformed input and output discrete time processes. Second, a general class of SVM-based system identification nonlinear models is presented, based on the use of composite Mercer's kernels. This general class can improve model flexibility by emphasizing the input-output cross information (SVM-ARMA4K), which leads to straightforward and natural combinations of implicit and explicit ARMA models (SVR-ARMA2K and SVR-ARMA4K). Capabilities of these different SVM-based system identification schemes are illustrated with two benchmark problems.

  15. Intelligent Garbage Classifier

    Directory of Open Access Journals (Sweden)

    Ignacio Rodríguez Novelle

    2008-12-01

    Full Text Available IGC (Intelligent Garbage Classifier is a system for visual classification and separation of solid waste products. Currently, an important part of the separation effort is based on manual work, from household separation to industrial waste management. Taking advantage of the technologies currently available, a system has been built that can analyze images from a camera and control a robot arm and conveyor belt to automatically separate different kinds of waste.

  16. Classifying Linear Canonical Relations

    OpenAIRE

    Lorand, Jonathan

    2015-01-01

    In this Master's thesis, we consider the problem of classifying, up to conjugation by linear symplectomorphisms, linear canonical relations (lagrangian correspondences) from a finite-dimensional symplectic vector space to itself. We give an elementary introduction to the theory of linear canonical relations and present partial results toward the classification problem. This exposition should be accessible to undergraduate students with a basic familiarity with linear algebra.

  17. Machine learning techniques in disease forecasting: a case study on rice blast prediction

    Directory of Open Access Journals (Sweden)

    Kapoor Amar S

    2006-11-01

    Full Text Available Abstract Background Diverse modeling approaches viz. neural networks and multiple regression have been followed to date for disease prediction in plant populations. However, due to their inability to predict value of unknown data points and longer training times, there is need for exploiting new prediction softwares for better understanding of plant-pathogen-environment relationships. Further, there is no online tool available which can help the plant researchers or farmers in timely application of control measures. This paper introduces a new prediction approach based on support vector machines for developing weather-based prediction models of plant diseases. Results Six significant weather variables were selected as predictor variables. Two series of models (cross-location and cross-year were developed and validated using a five-fold cross validation procedure. For cross-year models, the conventional multiple regression (REG approach achieved an average correlation coefficient (r of 0.50, which increased to 0.60 and percent mean absolute error (%MAE decreased from 65.42 to 52.24 when back-propagation neural network (BPNN was used. With generalized regression neural network (GRNN, the r increased to 0.70 and %MAE also improved to 46.30, which further increased to r = 0.77 and %MAE = 36.66 when support vector machine (SVM based method was used. Similarly, cross-location validation achieved r = 0.48, 0.56 and 0.66 using REG, BPNN and GRNN respectively, with their corresponding %MAE as 77.54, 66.11 and 58.26. The SVM-based method outperformed all the three approaches by further increasing r to 0.74 with improvement in %MAE to 44.12. Overall, this SVM-based prediction approach will open new vistas in the area of forecasting plant diseases of various crops. Conclusion Our case study demonstrated that SVM is better than existing machine learning techniques and conventional REG approaches in forecasting plant diseases. In this direction, we have also

  18. Machine-Learning Research

    OpenAIRE

    Dietterich, Thomas G.

    1997-01-01

    Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.

  19. Maximum margin classifier working in a set of strings.

    Science.gov (United States)

    Koyano, Hitoshi; Hayashida, Morihiro; Akutsu, Tatsuya

    2016-03-01

    Numbers and numerical vectors account for a large portion of data. However, recently, the amount of string data generated has increased dramatically. Consequently, classifying string data is a common problem in many fields. The most widely used approach to this problem is to convert strings into numerical vectors using string kernels and subsequently apply a support vector machine that works in a numerical vector space. However, this non-one-to-one conversion involves a loss of information and makes it impossible to evaluate, using probability theory, the generalization error of a learning machine, considering that the given data to train and test the machine are strings generated according to probability laws. In this study, we approach this classification problem by constructing a classifier that works in a set of strings. To evaluate the generalization error of such a classifier theoretically, probability theory for strings is required. Therefore, we first extend a limit theorem for a consensus sequence of strings demonstrated by one of the authors and co-workers in a previous study. Using the obtained result, we then demonstrate that our learning machine classifies strings in an asymptotically optimal manner. Furthermore, we demonstrate the usefulness of our machine in practical data analysis by applying it to predicting protein-protein interactions using amino acid sequences and classifying RNAs by the secondary structure using nucleotide sequences.

  20. Ultrasonic fluid quantity measurement in dynamic vehicular applications a support vector machine approach

    CERN Document Server

    Terzic, Jenny; Nagarajah, Romesh; Alamgir, Muhammad

    2013-01-01

    Accurate fluid level measurement in dynamic environments can be assessed using a Support Vector Machine (SVM) approach. SVM is a supervised learning model that analyzes and recognizes patterns. It is a signal classification technique which has far greater accuracy than conventional signal averaging methods. Ultrasonic Fluid Quantity Measurement in Dynamic Vehicular Applications: A Support Vector Machine Approach describes the research and development of a fluid level measurement system for dynamic environments. The measurement system is based on a single ultrasonic sensor. A Support Vector Machines (SVM) based signal characterization and processing system has been developed to compensate for the effects of slosh and temperature variation in fluid level measurement systems used in dynamic environments including automotive applications. It has been demonstrated that a simple ν-SVM model with Radial Basis Function (RBF) Kernel with the inclusion of a Moving Median filter could be used to achieve the high levels...

  1. Machine Shop Grinding Machines.

    Science.gov (United States)

    Dunn, James

    This curriculum manual is one in a series of machine shop curriculum manuals intended for use in full-time secondary and postsecondary classes, as well as part-time adult classes. The curriculum can also be adapted to open-entry, open-exit programs. Its purpose is to equip students with basic knowledge and skills that will enable them to enter the…

  2. Stack filter classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory; Hush, Don [Los Alamos National Laboratory

    2009-01-01

    Just as linear models generalize the sample mean and weighted average, weighted order statistic models generalize the sample median and weighted median. This analogy can be continued informally to generalized additive modeels in the case of the mean, and Stack Filters in the case of the median. Both of these model classes have been extensively studied for signal and image processing but it is surprising to find that for pattern classification, their treatment has been significantly one sided. Generalized additive models are now a major tool in pattern classification and many different learning algorithms have been developed to fit model parameters to finite data. However Stack Filters remain largely confined to signal and image processing and learning algorithms for classification are yet to be seen. This paper is a step towards Stack Filter Classifiers and it shows that the approach is interesting from both a theoretical and a practical perspective.

  3. Transient Classifier Systems and Man-Machine Interface Research.

    Science.gov (United States)

    1987-08-31

    different timbre from two different resonant sources, i.e., like a violin and oboe emitting nearly the same fundamental mode fre- quency, but each with its...the subjects by examing both hits and misses for signal and noise stimuli. A pairwise com- parison of the means resulted in significant differences (at

  4. Evaluating Machine Learning Classifiers for Hybrid Network Intrusion Detection Systems

    Science.gov (United States)

    2015-03-26

    and the value-focused method. Comparing results from the two evaluation methods, fallacies are revealed with 2 of the 5 notional weighting schemes...for them, because of their relentless support, love , and encouragement. I give a sincere thank you to my research advisor, Dr. Robert Mills, for his...though Ad- aBoost.BayesNet dominated the traditional PR space using a single curve approach. This evaluation fallacy has not been demonstrated prior to

  5. Classifying objects in LWIR imagery via CNNs

    Science.gov (United States)

    Rodger, Iain; Connor, Barry; Robertson, Neil M.

    2016-10-01

    The aim of the presented work is to demonstrate enhanced target recognition and improved false alarm rates for a mid to long range detection system, utilising a Long Wave Infrared (LWIR) sensor. By exploiting high quality thermal image data and recent techniques in machine learning, the system can provide automatic target recognition capabilities. A Convolutional Neural Network (CNN) is trained and the classifier achieves an overall accuracy of > 95% for 6 object classes related to land defence. While the highly accurate CNN struggles to recognise long range target classes, due to low signal quality, robust target discrimination is achieved for challenging candidates. The overall performance of the methodology presented is assessed using human ground truth information, generating classifier evaluation metrics for thermal image sequences.

  6. A systematic comparison of supervised classifiers.

    Directory of Open Access Journals (Sweden)

    Diego Raphael Amancio

    Full Text Available Pattern recognition has been employed in a myriad of industrial, commercial and academic applications. Many techniques have been devised to tackle such a diversity of applications. Despite the long tradition of pattern recognition research, there is no technique that yields the best classification in all scenarios. Therefore, as many techniques as possible should be considered in high accuracy applications. Typical related works either focus on the performance of a given algorithm or compare various classification methods. In many occasions, however, researchers who are not experts in the field of machine learning have to deal with practical classification tasks without an in-depth knowledge about the underlying parameters. Actually, the adequate choice of classifiers and parameters in such practical circumstances constitutes a long-standing problem and is one of the subjects of the current paper. We carried out a performance study of nine well-known classifiers implemented in the Weka framework and compared the influence of the parameter configurations on the accuracy. The default configuration of parameters in Weka was found to provide near optimal performance for most cases, not including methods such as the support vector machine (SVM. In addition, the k-nearest neighbor method frequently allowed the best accuracy. In certain conditions, it was possible to improve the quality of SVM by more than 20% with respect to their default parameter configuration.

  7. Sustainable machining

    CERN Document Server

    2017-01-01

    This book provides an overview on current sustainable machining. Its chapters cover the concept in economic, social and environmental dimensions. It provides the reader with proper ways to handle several pollutants produced during the machining process. The book is useful on both undergraduate and postgraduate levels and it is of interest to all those working with manufacturing and machining technology.

  8. Research on bearing life prediction based on support vector machine and its application

    International Nuclear Information System (INIS)

    Sun Chuang; Zhang Zhousuo; He Zhengjia

    2011-01-01

    Life prediction of rolling element bearing is the urgent demand in engineering practice, and the effective life prediction technique is beneficial to predictive maintenance. Support vector machine (SVM) is a novel machine learning method based on statistical learning theory, and is of advantage in prediction. This paper develops SVM-based model for bearing life prediction. The inputs of the model are features of bearing vibration signal and the output is the bearing running time-bearing failure time ratio. The model is built base on a few failed bearing data, and it can fuse information of the predicted bearing. So it is of advantage to bearing life prediction in practice. The model is applied to life prediction of a bearing, and the result shows the proposed model is of high precision.

  9. Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers

    OpenAIRE

    Rosenberg, Ishai; Shabtai, Asaf; Rokach, Lior; Elovici, Yuval

    2017-01-01

    In this paper, we present a black-box attack against API call based machine learning malware classifiers, focusing on generating adversarial sequences combining API calls and static features (e.g., printable strings) that will be misclassified by the classifier without affecting the malware functionality. We show that this attack is effective against many classifiers due to the transferability principle between RNN variants, feed forward DNNs, and traditional machine learning classifiers such...

  10. Machine Learning for Security

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    Applied statistics, aka ‘Machine Learning’, offers a wealth of techniques for answering security questions. It’s a much hyped topic in the big data world, with many companies now providing machine learning as a service. This talk will demystify these techniques, explain the math, and demonstrate their application to security problems. The presentation will include how-to’s on classifying malware, looking into encrypted tunnels, and finding botnets in DNS data. About the speaker Josiah is a security researcher with HP TippingPoint DVLabs Research Group. He has over 15 years of professional software development experience. Josiah used to do AI, with work focused on graph theory, search, and deductive inference on large knowledge bases. As rules only get you so far, he moved from AI to using machine learning techniques identifying failure modes in email traffic. There followed digressions into clustered data storage and later integrated control systems. Current ...

  11. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    International Nuclear Information System (INIS)

    Blanco, A; Rodriguez, R; Martinez-Maranon, I

    2014-01-01

    Mackerel is an infravalored fish captured by European fishing vessels. A manner to add value to this specie can be achieved by trying to classify it attending to its sex. Colour measurements were performed on Mackerel females and males (fresh and defrozen) extracted gonads to obtain differences between sexes. Several linear and non linear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can been applied to this problem. However, theyare usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kind of dissimilarity based classifiers. The diversity is induced considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity

  12. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    Science.gov (United States)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an infravalored fish captured by European fishing vessels. A manner to add value to this specie can be achieved by trying to classify it attending to its sex. Colour measurements were performed on Mackerel females and males (fresh and defrozen) extracted gonads to obtain differences between sexes. Several linear and non linear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can been applied to this problem. However, theyare usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kind of dissimilarity based classifiers. The diversity is induced considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.

  13. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    Full Text Available With the advent of technological era, conversion of scanned document (handwritten or printed into machine editable format has attracted many researchers. This paper deals with the problem of recognition of Gujarati handwritten numerals. Gujarati numeral recognition requires performing some specific steps as a part of preprocessing. For preprocessing digitization, segmentation, normalization and thinning are done with considering that the image have almost no noise. Further affine invariant moments based model is used for feature extraction and finally Support Vector Machine (SVM and Fuzzy classifiers are used for numeral classification. . The comparison of SVM and Fuzzy classifier is made and it can be seen that SVM procured better results as compared to Fuzzy Classifier.

  14. Comparison of random forests and support vector machine for real-time radar-derived rainfall forecasting

    Science.gov (United States)

    Yu, Pao-Shan; Yang, Tao-Chang; Chen, Szu-Yin; Kuo, Chen-Min; Tseng, Hung-Wei

    2017-09-01

    This study aims to compare two machine learning techniques, random forests (RF) and support vector machine (SVM), for real-time radar-derived rainfall forecasting. The real-time radar-derived rainfall forecasting models use the present grid-based radar-derived rainfall as the output variable and use antecedent grid-based radar-derived rainfall, grid position (longitude and latitude) and elevation as the input variables to forecast 1- to 3-h ahead rainfalls for all grids in a catchment. Grid-based radar-derived rainfalls of six typhoon events during 2012-2015 in three reservoir catchments of Taiwan are collected for model training and verifying. Two kinds of forecasting models are constructed and compared, which are single-mode forecasting model (SMFM) and multiple-mode forecasting model (MMFM) based on RF and SVM. The SMFM uses the same model for 1- to 3-h ahead rainfall forecasting; the MMFM uses three different models for 1- to 3-h ahead forecasting. According to forecasting performances, it reveals that the SMFMs give better performances than MMFMs and both SVM-based and RF-based SMFMs show satisfactory performances for 1-h ahead forecasting. However, for 2- and 3-h ahead forecasting, it is found that the RF-based SMFM underestimates the observed radar-derived rainfalls in most cases and the SVM-based SMFM can give better performances than RF-based SMFM.

  15. Daily River Flow Forecasting with Hybrid Support Vector Machine – Particle Swarm Optimization

    Science.gov (United States)

    Zaini, N.; Malek, M. A.; Yusoff, M.; Mardi, N. H.; Norhisham, S.

    2018-04-01

    The application of artificial intelligence techniques for river flow forecasting can further improve the management of water resources and flood prevention. This study concerns the development of support vector machine (SVM) based model and its hybridization with particle swarm optimization (PSO) to forecast short term daily river flow at Upper Bertam Catchment located in Cameron Highland, Malaysia. Ten years duration of historical rainfall, antecedent river flow data and various meteorology parameters data from 2003 to 2012 are used in this study. Four SVM based models are proposed which are SVM1, SVM2, SVM-PSO1 and SVM-PSO2 to forecast 1 to 7 day ahead of river flow. SVM1 and SVM-PSO1 are the models with historical rainfall and antecedent river flow as its input, while SVM2 and SVM-PSO2 are the models with historical rainfall, antecedent river flow data and additional meteorological parameters as input. The performances of the proposed model are measured in term of RMSE and R2 . It is found that, SVM2 outperformed SVM1 and SVM-PSO2 outperformed SVM-PSO1 which meant the additional meteorology parameters used as input to the proposed models significantly affect the model performances. Hybrid models SVM-PSO1 and SVM-PSO2 yield higher performances as compared to SVM1 and SVM2. It is found that hybrid models are more effective in forecasting river flow at 1 to 7 day ahead at the study area.

  16. Improved Reliability-Based Optimization with Support Vector Machines and Its Application in Aircraft Wing Design

    Directory of Open Access Journals (Sweden)

    Yu Wang

    2015-01-01

    Full Text Available A new reliability-based design optimization (RBDO method based on support vector machines (SVM and the Most Probable Point (MPP is proposed in this work. SVM is used to create a surrogate model of the limit-state function at the MPP with the gradient information in the reliability analysis. This guarantees that the surrogate model not only passes through the MPP but also is tangent to the limit-state function at the MPP. Then, importance sampling (IS is used to calculate the probability of failure based on the surrogate model. This treatment significantly improves the accuracy of reliability analysis. For RBDO, the Sequential Optimization and Reliability Assessment (SORA is employed as well, which decouples deterministic optimization from the reliability analysis. The improved SVM-based reliability analysis is used to amend the error from linear approximation for limit-state function in SORA. A mathematical example and a simplified aircraft wing design demonstrate that the improved SVM-based reliability analysis is more accurate than FORM and needs less training points than the Monte Carlo simulation and that the proposed optimization strategy is efficient.

  17. Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.

    Science.gov (United States)

    Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal

    2015-01-01

    Predicting residues that participate in protein-protein interactions (PPI) helps to identify, which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can further be improved with the use of a custom-designed fuzzy membership function, for the partner-specific PPI interface prediction problem. We evaluated the performances of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes of Homo sapiens, Escherichia coli and Saccharomyces Cerevisiae and calculated the statistical significance of the developed F-SVM over classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, local composition of amino acids together with their physico-chemical characteristics are used, where the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under ROC curve) on the test samples in 10-fold cross validation experiment are measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms respectively. Performances on independent test sets are obtained as 72.09, 73.24 and 82.74 percent respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.

  18. Simple machines

    CERN Document Server

    Graybill, George

    2007-01-01

    Just how simple are simple machines? With our ready-to-use resource, they are simple to teach and easy to learn! Chocked full of information and activities, we begin with a look at force, motion and work, and examples of simple machines in daily life are given. With this background, we move on to different kinds of simple machines including: Levers, Inclined Planes, Wedges, Screws, Pulleys, and Wheels and Axles. An exploration of some compound machines follows, such as the can opener. Our resource is a real time-saver as all the reading passages, student activities are provided. Presented in s

  19. Machine learning in virtual screening.

    Science.gov (United States)

    Melville, James L; Burke, Edmund K; Hirst, Jonathan D

    2009-05-01

    In this review, we highlight recent applications of machine learning to virtual screening, focusing on the use of supervised techniques to train statistical learning algorithms to prioritize databases of molecules as active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression techniques. Effective application of these methodologies requires an appreciation of data preparation, validation, optimization, and search methodologies, and we also survey developments in these areas.

  20. Classified

    CERN Multimedia

    Computer Security Team

    2011-01-01

    In the last issue of the Bulletin, we have discussed recent implications for privacy on the Internet. But privacy of personal data is just one facet of data protection. Confidentiality is another one. However, confidentiality and data protection are often perceived as not relevant in the academic environment of CERN.   But think twice! At CERN, your personal data, e-mails, medical records, financial and contractual documents, MARS forms, group meeting minutes (and of course your password!) are all considered to be sensitive, restricted or even confidential. And this is not all. Physics results, in particular when being preliminary and pending scrutiny, are sensitive, too. Just recently, an ATLAS collaborator copy/pasted the abstract of an ATLAS note onto an external public blog, despite the fact that this document was clearly marked as an "Internal Note". Such an act was not only embarrassing to the ATLAS collaboration, and had negative impact on CERN’s reputation --- i...

  1. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. Supervised learning model: Support vector machine (SVM) and Sequential Minimal Optimization algorithm (SMO) for the training of SVM classifier were implemented. The SVM classifier was included in a client-server application which enables to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of the breast cancer. The sensitivity and specificity of SVM classifier were calculated based on the thermographic images from studies. Furthermore, the heuristic method for SVM's parameters tuning was proposed.

  2. Classifying Sluice Occurrences in Dialogue

    DEFF Research Database (Denmark)

    Baird, Austin; Hamza, Anissa; Hardt, Daniel

    2018-01-01

    perform manual annotation with acceptable inter-coder agreement. We build classifier models with Decision Trees and Naive Bayes, with accuracy of 67%. We deploy a classifier to automatically classify sluice occurrences in OpenSubtitles, resulting in a corpus with 1.7 million occurrences. This will support....... Despite this, the corpus can be of great use in research on sluicing and development of systems, and we are making the corpus freely available on request. Furthermore, we are in the process of improving the accuracy of sluice identification and annotation for the purpose of created a subsequent version...

  3. Support vector machine learning model for the prediction of sentinel node status in patients with cutaneous melanoma.

    Science.gov (United States)

    Mocellin, Simone; Ambrosi, Alessandro; Montesco, Maria Cristina; Foletto, Mirto; Zavagno, Giorgio; Nitti, Donato; Lise, Mario; Rossi, Carlo Riccardo

    2006-08-01

    Currently, approximately 80% of melanoma patients undergoing sentinel node biopsy (SNB) have negative sentinel lymph nodes (SLNs), and no prediction system is reliable enough to be implemented in the clinical setting to reduce the number of SNB procedures. In this study, the predictive power of support vector machine (SVM)-based statistical analysis was tested. The clinical records of 246 patients who underwent SNB at our institution were used for this analysis. The following clinicopathologic variables were considered: the patient's age and sex and the tumor's histological subtype, Breslow thickness, Clark level, ulceration, mitotic index, lymphocyte infiltration, regression, angiolymphatic invasion, microsatellitosis, and growth phase. The results of SVM-based prediction of SLN status were compared with those achieved with logistic regression. The SLN positivity rate was 22% (52 of 234). When the accuracy was > or = 80%, the negative predictive value, positive predictive value, specificity, and sensitivity were 98%, 54%, 94%, and 77% and 82%, 41%, 69%, and 93% by using SVM and logistic regression, respectively. Moreover, SVM and logistic regression were associated with a diagnostic error and an SNB percentage reduction of (1) 1% and 60% and (2) 15% and 73%, respectively. The results from this pilot study suggest that SVM-based prediction of SLN status might be evaluated as a prognostic method to avoid the SNB procedure in 60% of patients currently eligible, with a very low error rate. If validated in larger series, this strategy would lead to obvious advantages in terms of both patient quality of life and costs for the health care system.

  4. Face machines

    Energy Technology Data Exchange (ETDEWEB)

    Hindle, D.

    1999-06-01

    The article surveys latest equipment available from the world`s manufacturers of a range of machines for tunnelling. These are grouped under headings: excavators; impact hammers; road headers; and shields and tunnel boring machines. Products of thirty manufacturers are referred to. Addresses and fax numbers of companies are supplied. 5 tabs., 13 photos.

  5. Electric machine

    Science.gov (United States)

    El-Refaie, Ayman Mohamed Fawzi [Niskayuna, NY; Reddy, Patel Bhageerath [Madison, WI

    2012-07-17

    An interior permanent magnet electric machine is disclosed. The interior permanent magnet electric machine comprises a rotor comprising a plurality of radially placed magnets each having a proximal end and a distal end, wherein each magnet comprises a plurality of magnetic segments and at least one magnetic segment towards the distal end comprises a high resistivity magnetic material.

  6. Machine Learning.

    Science.gov (United States)

    Kirrane, Diane E.

    1990-01-01

    As scientists seek to develop machines that can "learn," that is, solve problems by imitating the human brain, a gold mine of information on the processes of human learning is being discovered, expert systems are being improved, and human-machine interactions are being enhanced. (SK)

  7. Nonplanar machines

    International Nuclear Information System (INIS)

    Ritson, D.

    1989-05-01

    This talk examines methods available to minimize, but never entirely eliminate, degradation of machine performance caused by terrain following. Breaking of planar machine symmetry for engineering convenience and/or monetary savings must be balanced against small performance degradation, and can only be decided on a case-by-case basis. 5 refs

  8. IAEA safeguards and classified materials

    International Nuclear Information System (INIS)

    Pilat, J.F.; Eccleston, G.W.; Fearey, B.L.; Nicholas, N.J.; Tape, J.W.; Kratzer, M.

    1997-01-01

    The international community in the post-Cold War period has suggested that the International Atomic Energy Agency (IAEA) utilize its expertise in support of the arms control and disarmament process in unprecedented ways. The pledges of the US and Russian presidents to place excess defense materials, some of which are classified, under some type of international inspections raises the prospect of using IAEA safeguards approaches for monitoring classified materials. A traditional safeguards approach, based on nuclear material accountancy, would seem unavoidably to reveal classified information. However, further analysis of the IAEA's safeguards approaches is warranted in order to understand fully the scope and nature of any problems. The issues are complex and difficult, and it is expected that common technical understandings will be essential for their resolution. Accordingly, this paper examines and compares traditional safeguards item accounting of fuel at a nuclear power station (especially spent fuel) with the challenges presented by inspections of classified materials. This analysis is intended to delineate more clearly the problems as well as reveal possible approaches, techniques, and technologies that could allow the adaptation of safeguards to the unprecedented task of inspecting classified materials. It is also hoped that a discussion of these issues can advance ongoing political-technical debates on international inspections of excess classified materials

  9. Clustering Categories in Support Vector Machines

    DEFF Research Database (Denmark)

    Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero

    2017-01-01

    The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in in...

  10. Hybrid classifiers methods of data, knowledge, and classifier combination

    CERN Document Server

    Wozniak, Michal

    2014-01-01

    This book delivers a definite and compact knowledge on how hybridization can help improving the quality of computer classification systems. In order to make readers clearly realize the knowledge of hybridization, this book primarily focuses on introducing the different levels of hybridization and illuminating what problems we will face with as dealing with such projects. In the first instance the data and knowledge incorporated in hybridization were the action points, and then a still growing up area of classifier systems known as combined classifiers was considered. This book comprises the aforementioned state-of-the-art topics and the latest research results of the author and his team from Department of Systems and Computer Networks, Wroclaw University of Technology, including as classifier based on feature space splitting, one-class classification, imbalance data, and data stream classification.

  11. The Machine within the Machine

    CERN Multimedia

    Katarina Anthony

    2014-01-01

    Although Virtual Machines are widespread across CERN, you probably won't have heard of them unless you work for an experiment. Virtual machines - known as VMs - allow you to create a separate machine within your own, allowing you to run Linux on your Mac, or Windows on your Linux - whatever combination you need.   Using a CERN Virtual Machine, a Linux analysis software runs on a Macbook. When it comes to LHC data, one of the primary issues collaborations face is the diversity of computing environments among collaborators spread across the world. What if an institute cannot run the analysis software because they use different operating systems? "That's where the CernVM project comes in," says Gerardo Ganis, PH-SFT staff member and leader of the CernVM project. "We were able to respond to experimentalists' concerns by providing a virtual machine package that could be used to run experiment software. This way, no matter what hardware they have ...

  12. Machine translation

    Energy Technology Data Exchange (ETDEWEB)

    Nagao, M

    1982-04-01

    Each language has its own structure. In translating one language into another one, language attributes and grammatical interpretation must be defined in an unambiguous form. In order to parse a sentence, it is necessary to recognize its structure. A so-called context-free grammar can help in this respect for machine translation and machine-aided translation. Problems to be solved in studying machine translation are taken up in the paper, which discusses subjects for semantics and for syntactic analysis and translation software. 14 references.

  13. Comparison of confirmed inactive and randomly selected compounds as negative training examples in support vector machine-based virtual screening.

    Science.gov (United States)

    Heikamp, Kathrin; Bajorath, Jürgen

    2013-07-22

    The choice of negative training data for machine learning is a little explored issue in chemoinformatics. In this study, the influence of alternative sets of negative training data and different background databases on support vector machine (SVM) modeling and virtual screening has been investigated. Target-directed SVM models have been derived on the basis of differently composed training sets containing confirmed inactive molecules or randomly selected database compounds as negative training instances. These models were then applied to search background databases consisting of biological screening data or randomly assembled compounds for available hits. Negative training data were found to systematically influence compound recall in virtual screening. In addition, different background databases had a strong influence on the search results. Our findings also indicated that typical benchmark settings lead to an overestimation of SVM-based virtual screening performance compared to search conditions that are more relevant for practical applications.

  14. Silicon nanowire arrays as learning chemical vapour classifiers

    International Nuclear Information System (INIS)

    Niskanen, A O; Colli, A; White, R; Li, H W; Spigone, E; Kivioja, J M

    2011-01-01

    Nanowire field-effect transistors are a promising class of devices for various sensing applications. Apart from detecting individual chemical or biological analytes, it is especially interesting to use multiple selective sensors to look at their collective response in order to perform classification into predetermined categories. We show that non-functionalised silicon nanowire arrays can be used to robustly classify different chemical vapours using simple statistical machine learning methods. We were able to distinguish between acetone, ethanol and water with 100% accuracy while methanol, ethanol and 2-propanol were classified with 96% accuracy in ambient conditions.

  15. Machine Learning

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Machine learning, which builds on ideas in computer science, statistics, and optimization, focuses on developing algorithms to identify patterns and regularities in data, and using these learned patterns to make predictions on new observations. Boosted by its industrial and commercial applications, the field of machine learning is quickly evolving and expanding. Recent advances have seen great success in the realms of computer vision, natural language processing, and broadly in data science. Many of these techniques have already been applied in particle physics, for instance for particle identification, detector monitoring, and the optimization of computer resources. Modern machine learning approaches, such as deep learning, are only just beginning to be applied to the analysis of High Energy Physics data to approach more and more complex problems. These classes will review the framework behind machine learning and discuss recent developments in the field.

  16. Machine Translation

    Indian Academy of Sciences (India)

    Research Mt System Example: The 'Janus' Translating Phone Project. The Janus ... based on laptops, and simultaneous translation of two speakers in a dialogue. For more ..... The current focus in MT research is on using machine learning.

  17. 3D Bayesian contextual classifiers

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours.......We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours....

  18. Security Enrichment in Intrusion Detection System Using Classifier Ensemble

    Directory of Open Access Journals (Sweden)

    Uma R. Salunkhe

    2017-01-01

    Full Text Available In the era of Internet and with increasing number of people as its end users, a large number of attack categories are introduced daily. Hence, effective detection of various attacks with the help of Intrusion Detection Systems is an emerging trend in research these days. Existing studies show effectiveness of machine learning approaches in handling Intrusion Detection Systems. In this work, we aim to enhance detection rate of Intrusion Detection System by using machine learning technique. We propose a novel classifier ensemble based IDS that is constructed using hybrid approach which combines data level and feature level approach. Classifier ensembles combine the opinions of different experts and improve the intrusion detection rate. Experimental results show the improved detection rates of our system compared to reference technique.

  19. Knowledge Uncertainty and Composed Classifier

    Czech Academy of Sciences Publication Activity Database

    Klimešová, Dana; Ocelíková, E.

    2007-01-01

    Roč. 1, č. 2 (2007), s. 101-105 ISSN 1998-0140 Institutional research plan: CEZ:AV0Z10750506 Keywords : Boosting architecture * contextual modelling * composed classifier * knowledge management, * knowledge * uncertainty Subject RIV: IN - Informatics, Computer Science

  20. Correlation Dimension-Based Classifier

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2014-01-01

    Roč. 44, č. 12 (2014), s. 2253-2263 ISSN 2168-2267 R&D Projects: GA MŠk(CZ) LG12020 Institutional support: RVO:67985807 Keywords : classifier * multidimensional data * correlation dimension * scaling exponent * polynomial expansion Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 3.469, year: 2014

  1. Evaluating Classifiers in Detecting 419 Scams in Bilingual Cybercriminal Communities

    OpenAIRE

    Mbaziira, Alex V.; Abozinadah, Ehab; Jones Jr, James H.

    2015-01-01

    Incidents of organized cybercrime are rising because of criminals are reaping high financial rewards while incurring low costs to commit crime. As the digital landscape broadens to accommodate more internet-enabled devices and technologies like social media, more cybercriminals who are not native English speakers are invading cyberspace to cash in on quick exploits. In this paper we evaluate the performance of three machine learning classifiers in detecting 419 scams in a bilingual Nigerian c...

  2. Support Vector Machines for decision support in electricity markets׳ strategic bidding

    DEFF Research Database (Denmark)

    Pinto, Tiago; Sousa, Tiago M.; Praça, Isabel

    2015-01-01

    . The ALBidS system allows MASCEM market negotiating players to take the best possible advantages from the market context. This paper presents the application of a Support Vector Machines (SVM) based approach to provide decision support to electricity market players. This strategy is tested and validated...... by being included in ALBidS and then compared with the application of an Artificial Neural Network (ANN), originating promising results: an effective electricity market price forecast in a fast execution time. The proposed approach is tested and validated using real electricity markets data from MIBEL......׳ research group has developed a multi-agent system: Multi-Agent System for Competitive Electricity Markets (MASCEM), which simulates the electricity markets environment. MASCEM is integrated with Adaptive Learning Strategic Bidding System (ALBidS) that works as a decision support system for market players...

  3. Machine Learning for Medical Imaging.

    Science.gov (United States)

    Erickson, Bradley J; Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy L

    2017-01-01

    Machine learning is a technique for recognizing patterns that can be applied to medical images. Although it is a powerful tool that can help in rendering medical diagnoses, it can be misapplied. Machine learning typically begins with the machine learning algorithm system computing the image features that are believed to be of importance in making the prediction or diagnosis of interest. The machine learning algorithm system then identifies the best combination of these image features for classifying the image or computing some metric for the given image region. There are several methods that can be used, each with different strengths and weaknesses. There are open-source versions of most of these machine learning methods that make them easy to try and apply to images. Several metrics for measuring the performance of an algorithm exist; however, one must be aware of the possible associated pitfalls that can result in misleading metrics. More recently, deep learning has started to be used; this method has the benefit that it does not require image feature identification and calculation as a first step; rather, features are identified as part of the learning process. Machine learning has been used in medical imaging and will have a greater influence in the future. Those working in medical imaging must be aware of how machine learning works. © RSNA, 2017.

  4. Machine Protection

    International Nuclear Information System (INIS)

    Zerlauth, Markus; Schmidt, Rüdiger; Wenninger, Jörg

    2012-01-01

    The present architecture of the machine protection system is being recalled and the performance of the associated systems during the 2011 run will be briefly summarized. An analysis of the causes of beam dumps as well as an assessment of the dependability of the machine protection systems (MPS) itself is being presented. Emphasis will be given to events that risked exposing parts of the machine to damage. Further improvements and mitigations of potential holes in the protection systems will be evaluated along with their impact on the 2012 run. The role of rMPP during the various operational phases (commissioning, intensity ramp up, MDs...) will be discussed along with a proposal for the intensity ramp up for the start of beam operation in 2012

  5. Machine Learning

    Energy Technology Data Exchange (ETDEWEB)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.; Carroll, Thomas E.; Muller, George

    2017-04-21

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and Probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as an example of a widely used data driven classification/modeling strategy.

  6. Machine Protection

    CERN Document Server

    Zerlauth, Markus; Wenninger, Jörg

    2012-01-01

    The present architecture of the machine protection system is being recalled and the performance of the associated systems during the 2011 run will be briefly summarized. An analysis of the causes of beam dumps as well as an assessment of the dependability of the machine protection systems (MPS) itself is being presented. Emphasis will be given to events that risked exposing parts of the machine to damage. Further improvements and mitigations of potential holes in the protection systems will be evaluated along with their impact on the 2012 run. The role of rMPP during the various operational phases (commissioning, intensity ramp up, MDs...) will be discussed along with a proposal for the intensity ramp up for the start of beam operation in 2012.

  7. Machine Protection

    Energy Technology Data Exchange (ETDEWEB)

    Zerlauth, Markus; Schmidt, Rüdiger; Wenninger, Jörg [European Organization for Nuclear Research, Geneva (Switzerland)

    2012-07-01

    The present architecture of the machine protection system is being recalled and the performance of the associated systems during the 2011 run will be briefly summarized. An analysis of the causes of beam dumps as well as an assessment of the dependability of the machine protection systems (MPS) itself is being presented. Emphasis will be given to events that risked exposing parts of the machine to damage. Further improvements and mitigations of potential holes in the protection systems will be evaluated along with their impact on the 2012 run. The role of rMPP during the various operational phases (commissioning, intensity ramp up, MDs...) will be discussed along with a proposal for the intensity ramp up for the start of beam operation in 2012.

  8. Classified facilities for environmental protection

    International Nuclear Information System (INIS)

    Anon.

    1993-02-01

    The legislation of the classified facilities governs most of the dangerous or polluting industries or fixed activities. It rests on the law of 9 July 1976 concerning facilities classified for environmental protection and its application decree of 21 September 1977. This legislation, the general texts of which appear in this volume 1, aims to prevent all the risks and the harmful effects coming from an installation (air, water or soil pollutions, wastes, even aesthetic breaches). The polluting or dangerous activities are defined in a list called nomenclature which subjects the facilities to a declaration or an authorization procedure. The authorization is delivered by the prefect at the end of an open and contradictory procedure after a public survey. In addition, the facilities can be subjected to technical regulations fixed by the Environment Minister (volume 2) or by the prefect for facilities subjected to declaration (volume 3). (A.B.)

  9. Teletherapy machine

    International Nuclear Information System (INIS)

    Panyam, Vinatha S.; Rakshit, Sougata; Kulkarni, M.S.; Pradeepkumar, K.S.

    2017-01-01

    Radiation Standards Section (RSS), RSSD, BARC is the national metrology institute for ionizing radiation. RSS develops and maintains radiation standards for X-ray, beta, gamma and neutron radiations. In radiation dosimetry, traceability, accuracy and consistency of radiation measurements is very important especially in radiotherapy where the success of patient treatment is dependent on the accuracy of the dose delivered to the tumour. Cobalt teletherapy machines have been used in the treatment of cancer since the early 1950s and India had its first cobalt teletherapy machine installed at the Cancer Institute, Chennai in 1956

  10. Machine learning versus knowledge based classification of legal texts

    NARCIS (Netherlands)

    de Maat, E.; Krabben, K.; Winkels, R.; Winkels, R.G.F.

    2010-01-01

    This paper presents results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to the results of a pattern-based classifier. Overall, the ML classifier performs as accurate (>90%) as the pattern based one, but

  11. 76 FR 34761 - Classified National Security Information

    Science.gov (United States)

    2011-06-14

    ... MARINE MAMMAL COMMISSION Classified National Security Information [Directive 11-01] AGENCY: Marine... Commission's (MMC) policy on classified information, as directed by Information Security Oversight Office... of Executive Order 13526, ``Classified National Security Information,'' and 32 CFR part 2001...

  12. Machine testning

    DEFF Research Database (Denmark)

    De Chiffre, Leonardo

    This document is used in connection with a laboratory exercise of 3 hours duration as a part of the course GEOMETRICAL METROLOGY AND MACHINE TESTING. The exercise includes a series of tests carried out by the student on a conventional and a numerically controled lathe, respectively. This document...

  13. Machine rates for selected forest harvesting machines

    Science.gov (United States)

    R.W. Brinker; J. Kinard; Robert Rummer; B. Lanford

    2002-01-01

    Very little new literature has been published on the subject of machine rates and machine cost analysis since 1989 when the Alabama Agricultural Experiment Station Circular 296, Machine Rates for Selected Forest Harvesting Machines, was originally published. Many machines discussed in the original publication have undergone substantial changes in various aspects, not...

  14. Comparing SVM and ANN based Machine Learning Methods for Species Identification of Food Contaminating Beetles.

    Science.gov (United States)

    Bisgin, Halil; Bera, Tanmay; Ding, Hongjian; Semey, Howard G; Wu, Leihong; Liu, Zhichao; Barnes, Amy E; Langley, Darryl A; Pava-Ripoll, Monica; Vyas, Himansu J; Tong, Weida; Xu, Joshua

    2018-04-25

    Insect pests, such as pantry beetles, are often associated with food contaminations and public health risks. Machine learning has the potential to provide a more accurate and efficient solution in detecting their presence in food products, which is currently done manually. In our previous research, we demonstrated such feasibility where Artificial Neural Network (ANN) based pattern recognition techniques could be implemented for species identification in the context of food safety. In this study, we present a Support Vector Machine (SVM) model which improved the average accuracy up to 85%. Contrary to this, the ANN method yielded ~80% accuracy after extensive parameter optimization. Both methods showed excellent genus level identification, but SVM showed slightly better accuracy  for most species. Highly accurate species level identification remains a challenge, especially in distinguishing between species from the same genus which may require improvements in both imaging and machine learning techniques. In summary, our work does illustrate a new SVM based technique and provides a good comparison with the ANN model in our context. We believe such insights will pave better way forward for the application of machine learning towards species identification and food safety.

  15. Restrictions of process machine retooling at machine-building enterprises

    Directory of Open Access Journals (Sweden)

    Kuznetsova Elena

    2017-01-01

    Full Text Available The competitiveness of the national economy depends on the technological level of the machine-building enterprises production equipment. Today in Russia there are objective and subjective restrictions for the optimum policy formation of the manufacturing equipment renewal. The analysis of the manufacturing equipment age structure dynamics in the Russian machine-building complex indicates the negative tendencies intensification: increase in the equipment service life, reduction in the share of up-to-date equipment, and drop in its use efficiency. The article investigates and classifies the main restrictions of the manufacturing equipment renewal process, such as regulatory and legislative, financial, organizational, competency-based. The economic consequences of the revealed restrictions influence on the machine-building enterprises activity are shown.

  16. Electric machines

    CERN Document Server

    Gross, Charles A

    2006-01-01

    BASIC ELECTROMAGNETIC CONCEPTSBasic Magnetic ConceptsMagnetically Linear Systems: Magnetic CircuitsVoltage, Current, and Magnetic Field InteractionsMagnetic Properties of MaterialsNonlinear Magnetic Circuit AnalysisPermanent MagnetsSuperconducting MagnetsThe Fundamental Translational EM MachineThe Fundamental Rotational EM MachineMultiwinding EM SystemsLeakage FluxThe Concept of Ratings in EM SystemsSummaryProblemsTRANSFORMERSThe Ideal n-Winding TransformerTransformer Ratings and Per-Unit ScalingThe Nonideal Three-Winding TransformerThe Nonideal Two-Winding TransformerTransformer Efficiency and Voltage RegulationPractical ConsiderationsThe AutotransformerOperation of Transformers in Three-Phase EnvironmentsSequence Circuit Models for Three-Phase Transformer AnalysisHarmonics in TransformersSummaryProblemsBASIC MECHANICAL CONSIDERATIONSSome General PerspectivesEfficiencyLoad Torque-Speed CharacteristicsMass Polar Moment of InertiaGearingOperating ModesTranslational SystemsA Comprehensive Example: The ElevatorP...

  17. Charging machine

    International Nuclear Information System (INIS)

    Medlin, J.B.

    1976-01-01

    A charging machine for loading fuel slugs into the process tubes of a nuclear reactor includes a tubular housing connected to the process tube, a charging trough connected to the other end of the tubular housing, a device for loading the charging trough with a group of fuel slugs, means for equalizing the coolant pressure in the charging trough with the pressure in the process tubes, means for pushing the group of fuel slugs into the process tube and a latch and a seal engaging the last object in the group of fuel slugs to prevent the fuel slugs from being ejected from the process tube when the pusher is removed and to prevent pressure liquid from entering the charging machine. 3 claims, 11 drawing figures

  18. Current Directional Protection of Series Compensated Line Using Intelligent Classifier

    Directory of Open Access Journals (Sweden)

    M. Mollanezhad Heydarabadi

    2016-12-01

    Full Text Available Current inversion condition leads to incorrect operation of current based directional relay in power system with series compensated device. Application of the intelligent system for fault direction classification has been suggested in this paper. A new current directional protection scheme based on intelligent classifier is proposed for the series compensated line. The proposed classifier uses only half cycle of pre-fault and post fault current samples at relay location to feed the classifier. A lot of forward and backward fault simulations under different system conditions upon a transmission line with a fixed series capacitor are carried out using PSCAD/EMTDC software. The applicability of decision tree (DT, probabilistic neural network (PNN and support vector machine (SVM are investigated using simulated data under different system conditions. The performance comparison of the classifiers indicates that the SVM is a best suitable classifier for fault direction discriminating. The backward faults can be accurately distinguished from forward faults even under current inversion without require to detect of the current inversion condition.

  19. Genesis machines

    CERN Document Server

    Amos, Martyn

    2014-01-01

    Silicon chips are out. Today's scientists are using real, wet, squishy, living biology to build the next generation of computers. Cells, gels and DNA strands are the 'wetware' of the twenty-first century. Much smaller and more intelligent, these organic computers open up revolutionary possibilities. Tracing the history of computing and revealing a brave new world to come, Genesis Machines describes how this new technology will change the way we think not just about computers - but about life itself.

  20. Galaxy Classification using Machine Learning

    Science.gov (United States)

    Fowler, Lucas; Schawinski, Kevin; Brandt, Ben-Elias; widmer, Nicole

    2017-01-01

    We present our current research into the use of machine learning to classify galaxy imaging data with various convolutional neural network configurations in TensorFlow. We are investigating how five-band Sloan Digital Sky Survey imaging data can be used to train on physical properties such as redshift, star formation rate, mass and morphology. We also investigate the performance of artificially redshifted images in recovering physical properties as image quality degrades.

  1. Waste classifying and separation device

    International Nuclear Information System (INIS)

    Kakiuchi, Hiroki.

    1997-01-01

    A flexible plastic bags containing solid wastes of indefinite shape is broken and the wastes are classified. The bag cutting-portion of the device has an ultrasonic-type or a heater-type cutting means, and the cutting means moves in parallel with the transferring direction of the plastic bags. A classification portion separates and discriminates the plastic bag from the contents and conducts classification while rotating a classification table. Accordingly, the plastic bag containing solids of indefinite shape can be broken and classification can be conducted efficiently and reliably. The device of the present invention has a simple structure which requires small installation space and enables easy maintenance. (T.M.)

  2. Defining and Classifying Interest Groups

    DEFF Research Database (Denmark)

    Baroni, Laura; Carroll, Brendan; Chalmers, Adam

    2014-01-01

    The interest group concept is defined in many different ways in the existing literature and a range of different classification schemes are employed. This complicates comparisons between different studies and their findings. One of the important tasks faced by interest group scholars engaged...... in large-N studies is therefore to define the concept of an interest group and to determine which classification scheme to use for different group types. After reviewing the existing literature, this article sets out to compare different approaches to defining and classifying interest groups with a sample...... in the organizational attributes of specific interest group types. As expected, our comparison of coding schemes reveals a closer link between group attributes and group type in narrower classification schemes based on group organizational characteristics than those based on a behavioral definition of lobbying....

  3. An assessment of support vector machines for land cover classification

    Science.gov (United States)

    Huang, C.; Davis, L.S.; Townshend, J.R.G.

    2002-01-01

    The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.

  4. Representational Machines

    DEFF Research Database (Denmark)

    Photography not only represents space. Space is produced photographically. Since its inception in the 19th century, photography has brought to light a vast array of represented subjects. Always situated in some spatial order, photographic representations have been operatively underpinned by social...... to the enterprises of the medium. This is the subject of Representational Machines: How photography enlists the workings of institutional technologies in search of establishing new iconic and social spaces. Together, the contributions to this edited volume span historical epochs, social environments, technological...... possibilities, and genre distinctions. Presenting several distinct ways of producing space photographically, this book opens a new and important field of inquiry for photography research....

  5. Shear machines

    International Nuclear Information System (INIS)

    Astill, M.; Sunderland, A.; Waine, M.G.

    1980-01-01

    A shear machine for irradiated nuclear fuel elements has a replaceable shear assembly comprising a fuel element support block, a shear blade support and a clamp assembly which hold the fuel element to be sheared in contact with the support block. A first clamp member contacts the fuel element remote from the shear blade and a second clamp member contacts the fuel element adjacent the shear blade and is advanced towards the support block during shearing to compensate for any compression of the fuel element caused by the shear blade (U.K.)

  6. Regularized maximum correntropy machine

    KAUST Repository

    Wang, Jim Jing-Yan; Wang, Yunji; Jing, Bing-Yi; Gao, Xin

    2015-01-01

    In this paper we investigate the usage of regularized correntropy framework for learning of classifiers from noisy labels. The class label predictors learned by minimizing transitional loss functions are sensitive to the noisy and outlying labels of training samples, because the transitional loss functions are equally applied to all the samples. To solve this problem, we propose to learn the class label predictors by maximizing the correntropy between the predicted labels and the true labels of the training samples, under the regularized Maximum Correntropy Criteria (MCC) framework. Moreover, we regularize the predictor parameter to control the complexity of the predictor. The learning problem is formulated by an objective function considering the parameter regularization and MCC simultaneously. By optimizing the objective function alternately, we develop a novel predictor learning algorithm. The experiments on two challenging pattern classification tasks show that it significantly outperforms the machines with transitional loss functions.

  7. Regularized maximum correntropy machine

    KAUST Repository

    Wang, Jim Jing-Yan

    2015-02-12

    In this paper we investigate the usage of regularized correntropy framework for learning of classifiers from noisy labels. The class label predictors learned by minimizing transitional loss functions are sensitive to the noisy and outlying labels of training samples, because the transitional loss functions are equally applied to all the samples. To solve this problem, we propose to learn the class label predictors by maximizing the correntropy between the predicted labels and the true labels of the training samples, under the regularized Maximum Correntropy Criteria (MCC) framework. Moreover, we regularize the predictor parameter to control the complexity of the predictor. The learning problem is formulated by an objective function considering the parameter regularization and MCC simultaneously. By optimizing the objective function alternately, we develop a novel predictor learning algorithm. The experiments on two challenging pattern classification tasks show that it significantly outperforms the machines with transitional loss functions.

  8. Electricity of machine tool

    International Nuclear Information System (INIS)

    Gijeon media editorial department

    1977-10-01

    This book is divided into three parts. The first part deals with electricity machine, which can taints from generator to motor, motor a power source of machine tool, electricity machine for machine tool such as switch in main circuit, automatic machine, a knife switch and pushing button, snap switch, protection device, timer, solenoid, and rectifier. The second part handles wiring diagram. This concludes basic electricity circuit of machine tool, electricity wiring diagram in your machine like milling machine, planer and grinding machine. The third part introduces fault diagnosis of machine, which gives the practical solution according to fault diagnosis and the diagnostic method with voltage and resistance measurement by tester.

  9. Environmentally Friendly Machining

    CERN Document Server

    Dixit, U S; Davim, J Paulo

    2012-01-01

    Environment-Friendly Machining provides an in-depth overview of environmentally-friendly machining processes, covering numerous different types of machining in order to identify which practice is the most environmentally sustainable. The book discusses three systems at length: machining with minimal cutting fluid, air-cooled machining and dry machining. Also covered is a way to conserve energy during machining processes, along with useful data and detailed descriptions for developing and utilizing the most efficient modern machining tools. Researchers and engineers looking for sustainable machining solutions will find Environment-Friendly Machining to be a useful volume.

  10. Machine Protection

    CERN Document Server

    Schmidt, R

    2014-01-01

    The protection of accelerator equipment is as old as accelerator technology and was for many years related to high-power equipment. Examples are the protection of powering equipment from overheating (magnets, power converters, high-current cables), of superconducting magnets from damage after a quench and of klystrons. The protection of equipment from beam accidents is more recent. It is related to the increasing beam power of high-power proton accelerators such as ISIS, SNS, ESS and the PSI cyclotron, to the emission of synchrotron light by electron–positron accelerators and FELs, and to the increase of energy stored in the beam (in particular for hadron colliders such as LHC). Designing a machine protection system requires an excellent understanding of accelerator physics and operation to anticipate possible failures that could lead to damage. Machine protection includes beam and equipment monitoring, a system to safely stop beam operation (e.g. dumping the beam or stopping the beam at low energy) and an ...

  11. Nonlinear Knowledge in Kernel-Based Multiple Criteria Programming Classifier

    Science.gov (United States)

    Zhang, Dongling; Tian, Yingjie; Shi, Yong

    Kernel-based Multiple Criteria Linear Programming (KMCLP) model is used as classification methods, which can learn from training examples. Whereas, in traditional machine learning area, data sets are classified only by prior knowledge. Some works combine the above two classification principle to overcome the defaults of each approach. In this paper, we propose a model to incorporate the nonlinear knowledge into KMCLP in order to solve the problem when input consists of not only training example, but also nonlinear prior knowledge. In dealing with real world case breast cancer diagnosis, the model shows its better performance than the model solely based on training data.

  12. Deep Feature Learning and Cascaded Classifier for Large Scale Data

    DEFF Research Database (Denmark)

    Prasoon, Adhish

    from data rather than having a predefined feature set. We explore deep learning approach of convolutional neural network (CNN) for segmenting three dimensional medical images. We propose a novel system integrating three 2D CNNs, which have a one-to-one association with the xy, yz and zx planes of 3D......This thesis focuses on voxel/pixel classification based approaches for image segmentation. The main application is segmentation of articular cartilage in knee MRIs. The first major contribution of the thesis deals with large scale machine learning problems. Many medical imaging problems need huge...... amount of training data to cover sufficient biological variability. Learning methods scaling badly with number of training data points cannot be used in such scenarios. This may restrict the usage of many powerful classifiers having excellent generalization ability. We propose a cascaded classifier which...

  13. MAMMOGRAMS ANALYSIS USING SVM CLASSIFIER IN COMBINED TRANSFORMS DOMAIN

    Directory of Open Access Journals (Sweden)

    B.N. Prathibha

    2011-02-01

    Full Text Available Breast cancer is a primary cause of mortality and morbidity in women. Reports reveal that earlier the detection of abnormalities, better the improvement in survival. Digital mammograms are one of the most effective means for detecting possible breast anomalies at early stages. Digital mammograms supported with Computer Aided Diagnostic (CAD systems help the radiologists in taking reliable decisions. The proposed CAD system extracts wavelet features and spectral features for the better classification of mammograms. The Support Vector Machines classifier is used to analyze 206 mammogram images from Mias database pertaining to the severity of abnormality, i.e., benign and malign. The proposed system gives 93.14% accuracy for discrimination between normal-malign and 87.25% accuracy for normal-benign samples and 89.22% accuracy for benign-malign samples. The study reveals that features extracted in hybrid transform domain with SVM classifier proves to be a promising tool for analysis of mammograms.

  14. Effects of cultural characteristics on building an emotion classifier through facial expression analysis

    Science.gov (United States)

    da Silva, Flávio Altinier Maximiano; Pedrini, Helio

    2015-03-01

    Facial expressions are an important demonstration of humanity's humors and emotions. Algorithms capable of recognizing facial expressions and associating them with emotions were developed and employed to compare the expressions that different cultural groups use to show their emotions. Static pictures of predominantly occidental and oriental subjects from public datasets were used to train machine learning algorithms, whereas local binary patterns, histogram of oriented gradients (HOGs), and Gabor filters were employed to describe the facial expressions for six different basic emotions. The most consistent combination, formed by the association of HOG filter and support vector machines, was then used to classify the other cultural group: there was a strong drop in accuracy, meaning that the subtle differences of facial expressions of each culture affected the classifier performance. Finally, a classifier was trained with images from both occidental and oriental subjects and its accuracy was higher on multicultural data, evidencing the need of a multicultural training set to build an efficient classifier.

  15. Composite Classifiers for Automatic Target Recognition

    National Research Council Canada - National Science Library

    Wang, Lin-Cheng

    1998-01-01

    ...) using forward-looking infrared (FLIR) imagery. Two existing classifiers, one based on learning vector quantization and the other on modular neural networks, are used as the building blocks for our composite classifiers...

  16. An Active Learning Classifier for Further Reducing Diabetic Retinopathy Screening System Cost

    Directory of Open Access Journals (Sweden)

    Yinan Zhang

    2016-01-01

    Full Text Available Diabetic retinopathy (DR screening system raises a financial problem. For further reducing DR screening cost, an active learning classifier is proposed in this paper. Our approach identifies retinal images based on features extracted by anatomical part recognition and lesion detection algorithms. Kernel extreme learning machine (KELM is a rapid classifier for solving classification problems in high dimensional space. Both active learning and ensemble technique elevate performance of KELM when using small training dataset. The committee only proposes necessary manual work to doctor for saving cost. On the publicly available Messidor database, our classifier is trained with 20%–35% of labeled retinal images and comparative classifiers are trained with 80% of labeled retinal images. Results show that our classifier can achieve better classification accuracy than Classification and Regression Tree, radial basis function SVM, Multilayer Perceptron SVM, Linear SVM, and K Nearest Neighbor. Empirical experiments suggest that our active learning classifier is efficient for further reducing DR screening cost.

  17. Classifier fusion for VoIP attacks classification

    Science.gov (United States)

    Safarik, Jakub; Rezac, Filip

    2017-05-01

    SIP is one of the most successful protocols in the field of IP telephony communication. It establishes and manages VoIP calls. As the number of SIP implementation rises, we can expect a higher number of attacks on the communication system in the near future. This work aims at malicious SIP traffic classification. A number of various machine learning algorithms have been developed for attack classification. The paper presents a comparison of current research and the use of classifier fusion method leading to a potential decrease in classification error rate. Use of classifier combination makes a more robust solution without difficulties that may affect single algorithms. Different voting schemes, combination rules, and classifiers are discussed to improve the overall performance. All classifiers have been trained on real malicious traffic. The concept of traffic monitoring depends on the network of honeypot nodes. These honeypots run in several networks spread in different locations. Separation of honeypots allows us to gain an independent and trustworthy attack information.

  18. Learning to classify wakes from local sensory information

    Science.gov (United States)

    Alsalman, Mohamad; Colvert, Brendan; Kanso, Eva; Kanso Team

    2017-11-01

    Aquatic organisms exhibit remarkable abilities to sense local flow signals contained in their fluid environment and to surmise the origins of these flows. For example, fish can discern the information contained in various flow structures and utilize this information for obstacle avoidance and prey tracking. Flow structures created by flapping and swimming bodies are well characterized in the fluid dynamics literature; however, such characterization relies on classical methods that use an external observer to reconstruct global flow fields. The reconstructed flows, or wakes, are then classified according to the unsteady vortex patterns. Here, we propose a new approach for wake identification: we classify the wakes resulting from a flapping airfoil by applying machine learning algorithms to local flow information. In particular, we simulate the wakes of an oscillating airfoil in an incoming flow, extract the downstream vorticity information, and train a classifier to learn the different flow structures and classify new ones. This data-driven approach provides a promising framework for underwater navigation and detection in application to autonomous bio-inspired vehicles.

  19. Performance evaluation of various classifiers for color prediction of rice paddy plant leaf

    Science.gov (United States)

    Singh, Amandeep; Singh, Maninder Lal

    2016-11-01

    The food industry is one of the industries that uses machine vision for a nondestructive quality evaluation of the produce. These quality measuring systems and softwares are precalculated on the basis of various image-processing algorithms which generally use a particular type of classifier. These classifiers play a vital role in making the algorithms so intelligent that it can contribute its best while performing the said quality evaluations by translating the human perception into machine vision and hence machine learning. The crop of interest is rice, and the color of this crop indicates the health status of the plant. An enormous number of classifiers are available to solve the purpose of color prediction, but choosing the best among them is the focus of this paper. Performance of a total of 60 classifiers has been analyzed from the application point of view, and the results have been discussed. The motivation comes from the idea of providing a set of classifiers with excellent performance and implementing them on a single algorithm for the improvement of machine vision learning and, hence, associated applications.

  20. Experimental comparison of support vector machines with random ...

    Indian Academy of Sciences (India)

    dient method, support vector machines, and random forests to improve producer accuracy and overall classification accuracy. The performance comparison of these classifiers is valuable for a decision maker ... ping, surveillance system, resource management, tracking ... rocks, water bodies, and anthropogenic elements,.

  1. Aggregation Operator Based Fuzzy Pattern Classifier Design

    DEFF Research Database (Denmark)

    Mönks, Uwe; Larsen, Henrik Legind; Lohweg, Volker

    2009-01-01

    This paper presents a novel modular fuzzy pattern classifier design framework for intelligent automation systems, developed on the base of the established Modified Fuzzy Pattern Classifier (MFPC) and allows designing novel classifier models which are hardware-efficiently implementable....... The performances of novel classifiers using substitutes of MFPC's geometric mean aggregator are benchmarked in the scope of an image processing application against the MFPC to reveal classification improvement potentials for obtaining higher classification rates....

  2. 15 CFR 4.8 - Classified Information.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Classified Information. 4.8 Section 4... INFORMATION Freedom of Information Act § 4.8 Classified Information. In processing a request for information..., the information shall be reviewed to determine whether it should remain classified. Ordinarily the...

  3. Analysis of machining and machine tools

    CERN Document Server

    Liang, Steven Y

    2016-01-01

    This book delivers the fundamental science and mechanics of machining and machine tools by presenting systematic and quantitative knowledge in the form of process mechanics and physics. It gives readers a solid command of machining science and engineering, and familiarizes them with the geometry and functionality requirements of creating parts and components in today’s markets. The authors address traditional machining topics, such as: single and multiple point cutting processes grinding components accuracy and metrology shear stress in cutting cutting temperature and analysis chatter They also address non-traditional machining, such as: electrical discharge machining electrochemical machining laser and electron beam machining A chapter on biomedical machining is also included. This book is appropriate for advanced undergraduate and graduate mechani cal engineering students, manufacturing engineers, and researchers. Each chapter contains examples, exercises and their solutions, and homework problems that re...

  4. Machine Protection

    International Nuclear Information System (INIS)

    Schmidt, R

    2014-01-01

    The protection of accelerator equipment is as old as accelerator technology and was for many years related to high-power equipment. Examples are the protection of powering equipment from overheating (magnets, power converters, high-current cables), of superconducting magnets from damage after a quench and of klystrons. The protection of equipment from beam accidents is more recent. It is related to the increasing beam power of high-power proton accelerators such as ISIS, SNS, ESS and the PSI cyclotron, to the emission of synchrotron light by electron–positron accelerators and FELs, and to the increase of energy stored in the beam (in particular for hadron colliders such as LHC). Designing a machine protection system requires an excellent understanding of accelerator physics and operation to anticipate possible failures that could lead to damage. Machine protection includes beam and equipment monitoring, a system to safely stop beam operation (e.g. dumping the beam or stopping the beam at low energy) and an interlock system providing the glue between these systems. The most recent accelerator, the LHC, will operate with about 3 × 10 14 protons per beam, corresponding to an energy stored in each beam of 360 MJ. This energy can cause massive damage to accelerator equipment in case of uncontrolled beam loss, and a single accident damaging vital parts of the accelerator could interrupt operation for years. This article provides an overview of the requirements for protection of accelerator equipment and introduces the various protection systems. Examples are mainly from LHC, SNS and ESS

  5. Machine terms dictionary

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1979-04-15

    This book gives descriptions of machine terms which includes machine design, drawing, the method of machine, machine tools, machine materials, automobile, measuring and controlling, electricity, basic of electron, information technology, quality assurance, Auto CAD and FA terms and important formula of mechanical engineering.

  6. Machine-based mapping of innovation portfolios

    NARCIS (Netherlands)

    de Visser, Matthias; Miao, Shengfa; Englebienne, Gwenn; Sools, Anna Maria; Visscher, Klaasjan

    2017-01-01

    Machine learning techniques show a great promise for improving innovation portfolio management. In this paper we experiment with different methods to classify innovation projects of a high-tech firm as either explorative or exploitative, and compare the results with a manual, theory-based mapping of

  7. BENCHMARKING MACHINE LEARNING TECHNIQUES FOR SOFTWARE DEFECT DETECTION

    OpenAIRE

    Saiqa Aleem; Luiz Fernando Capretz; Faheem Ahmed

    2015-01-01

    Machine Learning approaches are good in solving problems that have less information. In most cases, the software domain problems characterize as a process of learning that depend on the various circumstances and changes accordingly. A predictive model is constructed by using machine learning approaches and classified them into defective and non-defective modules. Machine learning techniques help developers to retrieve useful information after the classification and enable them to analyse data...

  8. BEBP: An Poisoning Method Against Machine Learning Based IDSs

    OpenAIRE

    Li, Pan; Liu, Qiang; Zhao, Wentao; Wang, Dongxu; Wang, Siqi

    2018-01-01

    In big data era, machine learning is one of fundamental techniques in intrusion detection systems (IDSs). However, practical IDSs generally update their decision module by feeding new data then retraining learning models in a periodical way. Hence, some attacks that comprise the data for training or testing classifiers significantly challenge the detecting capability of machine learning-based IDSs. Poisoning attack, which is one of the most recognized security threats towards machine learning...

  9. Addiction Machines

    Directory of Open Access Journals (Sweden)

    James Godley

    2011-10-01

    Full Text Available Entry into the crypt William Burroughs shared with his mother opened and shut around a failed re-enactment of William Tell’s shot through the prop placed upon a loved one’s head. The accidental killing of his wife Joan completed the installation of the addictation machine that spun melancholia as manic dissemination. An early encryptment to which was added the audio portion of abuse deposited an undeliverable message in WB. Wil- liam could never tell, although his corpus bears the in- scription of this impossibility as another form of pos- sibility. James Godley is currently a doctoral candidate in Eng- lish at SUNY Buffalo, where he studies psychoanalysis, Continental philosophy, and nineteenth-century litera- ture and poetry (British and American. His work on the concept of mourning and “the dead” in Freudian and Lacanian approaches to psychoanalytic thought and in Gothic literature has also spawned an essay on zombie porn. Since entering the Academy of Fine Arts Karlsruhe in 2007, Valentin Hennig has studied in the classes of Sil- via Bächli, Claudio Moser, and Corinne Wasmuht. In 2010 he spent a semester at the Dresden Academy of Fine Arts. His work has been shown in group exhibi- tions in Freiburg and Karlsruhe.

  10. Machine musicianship

    Science.gov (United States)

    Rowe, Robert

    2002-05-01

    The training of musicians begins by teaching basic musical concepts, a collection of knowledge commonly known as musicianship. Computer programs designed to implement musical skills (e.g., to make sense of what they hear, perform music expressively, or compose convincing pieces) can similarly benefit from access to a fundamental level of musicianship. Recent research in music cognition, artificial intelligence, and music theory has produced a repertoire of techniques that can make the behavior of computer programs more musical. Many of these were presented in a recently published book/CD-ROM entitled Machine Musicianship. For use in interactive music systems, we are interested in those which are fast enough to run in real time and that need only make reference to the material as it appears in sequence. This talk will review several applications that are able to identify the tonal center of musical material during performance. Beyond this specific task, the design of real-time algorithmic listening through the concurrent operation of several connected analyzers is examined. The presentation includes discussion of a library of C++ objects that can be combined to perform interactive listening and a demonstration of their capability.

  11. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA

    Science.gov (United States)

    Ma, Xiaoqi

    2015-01-01

    A novel method is proposed to establish the pancreatic cancer classifier. Firstly, the concept of quantum and fruit fly optimal algorithm (FOA) are introduced, respectively. Then FOA is improved by quantum coding and quantum operation, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of support vector machine (SVM) and the classifier is established by optimized SVM. In order to verify the effectiveness of the proposed method, SVM and other classification methods have been chosen as the comparing methods. The experimental results show that the proposed method can improve the classifier performance and cost less time. PMID:26543867

  12. Multiple classifier systems in texton-based approach for the classification of CT images of Lung

    DEFF Research Database (Denmark)

    Gangeh, Mehrdad J.; Sørensen, Lauge; Shaker, Saher B.

    2010-01-01

    In this paper, we propose using texton signatures based on raw pixel representation along with a parallel multiple classifier system for the classification of emphysema in computed tomography images of the lung. The multiple classifier system is composed of support vector machines on the texton.......e., texton size and k value in k-means. Our results show that while aggregation of single decisions by SVMs over various k values using multiple classifier systems helps to improve the results compared to single SVMs, combining over different texton sizes is not beneficial. The performance of the proposed...

  13. Multiobjective optimization of classifiers by means of 3D convex-hull-based evolutionary algorithms

    NARCIS (Netherlands)

    Zhao, J.; Basto, Fernandes V.; Jiao, L.; Yevseyeva, I.; Asep, Maulana A.; Li, R.; Bäck, T.H.W.; Tang, T.; Michael, Emmerich T. M.

    2016-01-01

    The receiver operating characteristic (ROC) and detection error tradeoff(DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully

  14. Optimization of Support Vector Machine (SVM) for Object Classification

    Science.gov (United States)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  15. Novelty Detection Classifiers in Weed Mapping: Silybum marianum Detection on UAV Multispectral Images.

    Science.gov (United States)

    Alexandridis, Thomas K; Tamouridou, Afroditi Alexandra; Pantazi, Xanthoula Eirini; Lagopodi, Anastasia L; Kashefi, Javid; Ovakoglou, Georgios; Polychronos, Vassilios; Moshou, Dimitrios

    2017-09-01

    In the present study, the detection and mapping of Silybum marianum (L.) Gaertn. weed using novelty detection classifiers is reported. A multispectral camera (green-red-NIR) on board a fixed wing unmanned aerial vehicle (UAV) was employed for obtaining high-resolution images. Four novelty detection classifiers were used to identify S. marianum between other vegetation in a field. The classifiers were One Class Support Vector Machine (OC-SVM), One Class Self-Organizing Maps (OC-SOM), Autoencoders and One Class Principal Component Analysis (OC-PCA). As input features to the novelty detection classifiers, the three spectral bands and texture were used. The S. marianum identification accuracy using OC-SVM reached an overall accuracy of 96%. The results show the feasibility of effective S. marianum mapping by means of novelty detection classifiers acting on multispectral UAV imagery.

  16. Case base classification on digital mammograms: improving the performance of case base classifier

    Science.gov (United States)

    Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.

    2011-10-01

    Breast cancer continues to be a significant public health problem in the world. Early detection is the key for improving breast cancer prognosis. The aim of the research presented here is in twofold. First stage of research involves machine learning techniques, which segments and extracts features from the mass of digital mammograms. Second level is on problem solving approach which includes classification of mass by performance based case base classifier. In this paper we build a case-based Classifier in order to diagnose mammographic images. We explain different methods and behaviors that have been added to the classifier to improve the performance of the classifier. Currently the initial Performance base Classifier with Bagging is proposed in the paper and it's been implemented and it shows an improvement in specificity and sensitivity.

  17. General and Local: Averaged k-Dependence Bayesian Classifiers

    Directory of Open Access Journals (Sweden)

    Limin Wang

    2015-06-01

    Full Text Available The inference of a general Bayesian network has been shown to be an NP-hard problem, even for approximate solutions. Although k-dependence Bayesian (KDB classifier can construct at arbitrary points (values of k along the attribute dependence spectrum, it cannot identify the changes of interdependencies when attributes take different values. Local KDB, which learns in the framework of KDB, is proposed in this study to describe the local dependencies implicated in each test instance. Based on the analysis of functional dependencies, substitution-elimination resolution, a new type of semi-naive Bayesian operation, is proposed to substitute or eliminate generalization to achieve accurate estimation of conditional probability distribution while reducing computational complexity. The final classifier, averaged k-dependence Bayesian (AKDB classifiers, will average the output of KDB and local KDB. Experimental results on the repository of machine learning databases from the University of California Irvine (UCI showed that AKDB has significant advantages in zero-one loss and bias relative to naive Bayes (NB, tree augmented naive Bayes (TAN, Averaged one-dependence estimators (AODE, and KDB. Moreover, KDB and local KDB show mutually complementary characteristics with respect to variance.

  18. A Novel Cascade Classifier for Automatic Microcalcification Detection.

    Directory of Open Access Journals (Sweden)

    Seung Yeon Shin

    Full Text Available In this paper, we present a novel cascaded classification framework for automatic detection of individual and clusters of microcalcifications (μC. Our framework comprises three classification stages: i a random forest (RF classifier for simple features capturing the second order local structure of individual μCs, where non-μC pixels in the target mammogram are efficiently eliminated; ii a more complex discriminative restricted Boltzmann machine (DRBM classifier for μC candidates determined in the RF stage, which automatically learns the detailed morphology of μC appearances for improved discriminative power; and iii a detector to detect clusters of μCs from the individual μC detection results, using two different criteria. From the two-stage RF-DRBM classifier, we are able to distinguish μCs using explicitly computed features, as well as learn implicit features that are able to further discriminate between confusing cases. Experimental evaluation is conducted on the original Mammographic Image Analysis Society (MIAS and mini-MIAS databases, as well as our own Seoul National University Bundang Hospital digital mammographic database. It is shown that the proposed method outperforms comparable methods in terms of receiver operating characteristic (ROC and precision-recall curves for detection of individual μCs and free-response receiver operating characteristic (FROC curve for detection of clustered μCs.

  19. Comparison of artificial intelligence classifiers for SIP attack data

    Science.gov (United States)

    Safarik, Jakub; Slachta, Jiri

    2016-05-01

    Honeypot application is a source of valuable data about attacks on the network. We run several SIP honeypots in various computer networks, which are separated geographically and logically. Each honeypot runs on public IP address and uses standard SIP PBX ports. All information gathered via honeypot is periodically sent to the centralized server. This server classifies all attack data by neural network algorithm. The paper describes optimizations of a neural network classifier, which lower the classification error. The article contains the comparison of two neural network algorithm used for the classification of validation data. The first is the original implementation of the neural network described in recent work; the second neural network uses further optimizations like input normalization or cross-entropy cost function. We also use other implementations of neural networks and machine learning classification algorithms. The comparison test their capabilities on validation data to find the optimal classifier. The article result shows promise for further development of an accurate SIP attack classification engine.

  20. SVM-based Partial Discharge Pattern Classification for GIS

    Science.gov (United States)

    Ling, Yin; Bai, Demeng; Wang, Menglin; Gong, Xiaojin; Gu, Chao

    2018-01-01

    Partial discharges (PD) occur when there are localized dielectric breakdowns in small regions of gas insulated substations (GIS). It is of high importance to recognize the PD patterns, through which we can diagnose the defects caused by different sources so that predictive maintenance can be conducted to prevent from unplanned power outage. In this paper, we propose an approach to perform partial discharge pattern classification. It first recovers the PRPD matrices from the PRPD2D images; then statistical features are extracted from the recovered PRPD matrix and fed into SVM for classification. Experiments conducted on a dataset containing thousands of images demonstrates the high effectiveness of the method.

  1. Intelligent Machine Vision for Automated Fence Intruder Detection Using Self-organizing Map

    OpenAIRE

    Veldin A. Talorete Jr.; Sherwin A Guirnaldo

    2017-01-01

    This paper presents an intelligent machine vision for automated fence intruder detection. A series of still captured images that contain fence events using Internet Protocol cameras was used as input data to the system. Two classifiers were used; the first is to classify human posture and the second one will classify intruder location. The system classifiers were implemented using Self-Organizing Map after the implementation of several image segmentation processes. The human posture classifie...

  2. AN IMPLEMENTATION OF EIS-SVM CLASSIFIER USING RESEARCH ARTICLES FOR TEXT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    B Ramesh

    2016-04-01

    Full Text Available Automatic text classification is a prominent research topic in text mining. The text pre-processing is a major role in text classifier. The efficiency of pre-processing techniques is increasing the performance of text classifier. In this paper, we are implementing ECAS stemmer, Efficient Instance Selection and Pre-computed Kernel Support Vector Machine for text classification using recent research articles. We are using better pre-processing techniques such as ECAS stemmer to find root word, Efficient Instance Selection for dimensionality reduction of text data and Pre-computed Kernel Support Vector Machine for classification of selected instances. In this experiments were performed on 750 research articles with three classes such as engineering article, medical articles and educational articles. The EIS-SVM classifier provides better performance in real-time research articles classification.

  3. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    Science.gov (United States)

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  4. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Full Text Available Maximum likelihood classifier (MLC and support vector machines (SVM are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  5. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Balachandran Manavalan

    2018-03-01

    Full Text Available Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.

  6. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine.

    Science.gov (United States)

    Manavalan, Balachandran; Shin, Tae H; Lee, Gwang

    2018-01-01

    Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.

  7. Support vector machines for prediction and analysis of beta and gamma-turns in proteins.

    Science.gov (United States)

    Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

    2005-04-01

    Tight turns have long been recognized as one of the three important features of proteins, together with alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% tight turns are beta-turns and most of the rest are gamma-turns. Analysis and prediction of beta-turns and gamma-turns is very useful for design of new molecules such as drugs, pesticides, and antigens. In this paper we investigated two aspects of applying support vector machine (SVM), a promising machine learning method for bioinformatics, to prediction and analysis of beta-turns and gamma-turns. First, we developed two SVM-based methods, called BTSVM and GTSVM, which predict beta-turns and gamma-turns in a protein from its sequence. When compared with other methods, BTSVM has a superior performance and GTSVM is competitive. Second, we used SVMs with a linear kernel to estimate the support of amino acids for the formation of beta-turns and gamma-turns depending on their position in a protein. Our analysis results are more comprehensive and easier to use than the previous results in designing turns in proteins.

  8. A Support Vector Machine Approach for Truncated Fingerprint Image Detection from Sweeping Fingerprint Sensors

    Science.gov (United States)

    Chen, Chi-Jim; Pai, Tun-Wen; Cheng, Mox

    2015-01-01

    A sweeping fingerprint sensor converts fingerprints on a row by row basis through image reconstruction techniques. However, a built fingerprint image might appear to be truncated and distorted when the finger was swept across a fingerprint sensor at a non-linear speed. If the truncated fingerprint images were enrolled as reference targets and collected by any automated fingerprint identification system (AFIS), successful prediction rates for fingerprint matching applications would be decreased significantly. In this paper, a novel and effective methodology with low time computational complexity was developed for detecting truncated fingerprints in a real time manner. Several filtering rules were implemented to validate existences of truncated fingerprints. In addition, a machine learning method of supported vector machine (SVM), based on the principle of structural risk minimization, was applied to reject pseudo truncated fingerprints containing similar characteristics of truncated ones. The experimental result has shown that an accuracy rate of 90.7% was achieved by successfully identifying truncated fingerprint images from testing images before AFIS enrollment procedures. The proposed effective and efficient methodology can be extensively applied to all existing fingerprint matching systems as a preliminary quality control prior to construction of fingerprint templates. PMID:25835186

  9. Joint Machine Learning and Game Theory for Rate Control in High Efficiency Video Coding.

    Science.gov (United States)

    Gao, Wei; Kwong, Sam; Jia, Yuheng

    2017-08-25

    In this paper, a joint machine learning and game theory modeling (MLGT) framework is proposed for inter frame coding tree unit (CTU) level bit allocation and rate control (RC) optimization in High Efficiency Video Coding (HEVC). First, a support vector machine (SVM) based multi-classification scheme is proposed to improve the prediction accuracy of CTU-level Rate-Distortion (R-D) model. The legacy "chicken-and-egg" dilemma in video coding is proposed to be overcome by the learning-based R-D model. Second, a mixed R-D model based cooperative bargaining game theory is proposed for bit allocation optimization, where the convexity of the mixed R-D model based utility function is proved, and Nash bargaining solution (NBS) is achieved by the proposed iterative solution search method. The minimum utility is adjusted by the reference coding distortion and frame-level Quantization parameter (QP) change. Lastly, intra frame QP and inter frame adaptive bit ratios are adjusted to make inter frames have more bit resources to maintain smooth quality and bit consumption in the bargaining game optimization. Experimental results demonstrate that the proposed MLGT based RC method can achieve much better R-D performances, quality smoothness, bit rate accuracy, buffer control results and subjective visual quality than the other state-of-the-art one-pass RC methods, and the achieved R-D performances are very close to the performance limits from the FixedQP method.

  10. A Support Vector Machine Approach for Truncated Fingerprint Image Detection from Sweeping Fingerprint Sensors

    Directory of Open Access Journals (Sweden)

    Chi-Jim Chen

    2015-03-01

    Full Text Available A sweeping fingerprint sensor converts fingerprints on a row by row basis through image reconstruction techniques. However, a built fingerprint image might appear to be truncated and distorted when the finger was swept across a fingerprint sensor at a non-linear speed. If the truncated fingerprint images were enrolled as reference targets and collected by any automated fingerprint identification system (AFIS, successful prediction rates for fingerprint matching applications would be decreased significantly. In this paper, a novel and effective methodology with low time computational complexity was developed for detecting truncated fingerprints in a real time manner. Several filtering rules were implemented to validate existences of truncated fingerprints. In addition, a machine learning method of supported vector machine (SVM, based on the principle of structural risk minimization, was applied to reject pseudo truncated fingerprints containing similar characteristics of truncated ones. The experimental result has shown that an accuracy rate of 90.7% was achieved by successfully identifying truncated fingerprint images from testing images before AFIS enrollment procedures. The proposed effective and efficient methodology can be extensively applied to all existing fingerprint matching systems as a preliminary quality control prior to construction of fingerprint templates.

  11. Normal mammogram detection based on local probability difference transforms and support vector machines

    International Nuclear Information System (INIS)

    Chiracharit, W.; Kumhom, P.; Chamnongthai, K.; Sun, Y.; Delp, E.J.; Babbs, C.F

    2007-01-01

    Automatic detection of normal mammograms, as a ''first look'' for breast cancer, is a new approach to computer-aided diagnosis. This approach may be limited, however, by two main causes. The first problem is the presence of poorly separable ''crossed-distributions'' in which the correct classification depends upon the value of each feature. The second problem is overlap of the feature distributions that are extracted from digitized mammograms of normal and abnormal patients. Here we introduce a new Support Vector Machine (SVM) based method utilizing with the proposed uncrossing mapping and Local Probability Difference (LPD). Crossed-distribution feature pairs are identified and mapped into a new features that can be separated by a zero-hyperplane of the new axis. The probability density functions of the features of normal and abnormal mammograms are then sampled and the local probability difference functions are estimated to enhance the features. From 1,000 ground-truth-known mammograms, 250 normal and 250 abnormal cases, including spiculated lesions, circumscribed masses or microcalcifications, are used for training a support vector machine. The classification results tested with another 250 normal and 250 abnormal sets show improved testing performances with 90% sensitivity and 89% specificity. (author)

  12. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    Science.gov (United States)

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  13. Hyperspectral image classification using Support Vector Machine

    International Nuclear Information System (INIS)

    Moughal, T A

    2013-01-01

    Classification of land cover hyperspectral images is a very challenging task due to the unfavourable ratio between the number of spectral bands and the number of training samples. The focus in many applications is to investigate an effective classifier in terms of accuracy. The conventional multiclass classifiers have the ability to map the class of interest but the considerable efforts and large training sets are required to fully describe the classes spectrally. Support Vector Machine (SVM) is suggested in this paper to deal with the multiclass problem of hyperspectral imagery. The attraction to this method is that it locates the optimal hyper plane between the class of interest and the rest of the classes to separate them in a new high-dimensional feature space by taking into account only the training samples that lie on the edge of the class distributions known as support vectors and the use of the kernel functions made the classifier more flexible by making it robust against the outliers. A comparative study has undertaken to find an effective classifier by comparing Support Vector Machine (SVM) to the other two well known classifiers i.e. Maximum likelihood (ML) and Spectral Angle Mapper (SAM). At first, the Minimum Noise Fraction (MNF) was applied to extract the best possible features form the hyperspectral imagery and then the resulting subset of the features was applied to the classifiers. Experimental results illustrate that the integration of MNF and SVM technique significantly reduced the classification complexity and improves the classification accuracy.

  14. Machine technology: a survey

    International Nuclear Information System (INIS)

    Barbier, M.M.

    1981-01-01

    An attempt was made to find existing machines that have been upgraded and that could be used for large-scale decontamination operations outdoors. Such machines are in the building industry, the mining industry, and the road construction industry. The road construction industry has yielded the machines in this presentation. A review is given of operations that can be done with the machines available

  15. Machine Shop Lathes.

    Science.gov (United States)

    Dunn, James

    This guide, the second in a series of five machine shop curriculum manuals, was designed for use in machine shop courses in Oklahoma. The purpose of the manual is to equip students with basic knowledge and skills that will enable them to enter the machine trade at the machine-operator level. The curriculum is designed so that it can be used in…

  16. Superconducting rotating machines

    International Nuclear Information System (INIS)

    Smith, J.L. Jr.; Kirtley, J.L. Jr.; Thullen, P.

    1975-01-01

    The opportunities and limitations of the applications of superconductors in rotating electric machines are given. The relevant properties of superconductors and the fundamental requirements for rotating electric machines are discussed. The current state-of-the-art of superconducting machines is reviewed. Key problems, future developments and the long range potential of superconducting machines are assessed

  17. Comparing classifiers for pronunciation error detection

    NARCIS (Netherlands)

    Strik, H.; Truong, K.; Wet, F. de; Cucchiarini, C.

    2007-01-01

    Providing feedback on pronunciation errors in computer assisted language learning systems requires that pronunciation errors be detected automatically. In the present study we compare four types of classifiers that can be used for this purpose: two acoustic-phonetic classifiers (one of which employs

  18. Non-equilibrium quantum heat machines

    Science.gov (United States)

    Alicki, Robert; Gelbwaser-Klimovsky, David

    2015-11-01

    Standard heat machines (engine, heat pump, refrigerator) are composed of a system (working fluid) coupled to at least two equilibrium baths at different temperatures and periodically driven by an external device (piston or rotor) sometimes called the work reservoir. The aim of this paper is to go beyond this scheme by considering environments which are stationary but cannot be decomposed into a few baths at thermal equilibrium. Such situations are important, for example in solar cells, chemical machines in biology, various realizations of laser cooling or nanoscopic machines driven by laser radiation. We classify non-equilibrium baths depending on their thermodynamic behavior and show that the efficiency of heat machines powered by them is limited by the generalized Carnot bound.

  19. Non-equilibrium quantum heat machines

    International Nuclear Information System (INIS)

    Alicki, Robert; Gelbwaser-Klimovsky, David

    2015-01-01

    Standard heat machines (engine, heat pump, refrigerator) are composed of a system (working fluid) coupled to at least two equilibrium baths at different temperatures and periodically driven by an external device (piston or rotor) sometimes called the work reservoir. The aim of this paper is to go beyond this scheme by considering environments which are stationary but cannot be decomposed into a few baths at thermal equilibrium. Such situations are important, for example in solar cells, chemical machines in biology, various realizations of laser cooling or nanoscopic machines driven by laser radiation. We classify non-equilibrium baths depending on their thermodynamic behavior and show that the efficiency of heat machines powered by them is limited by the generalized Carnot bound. (paper)

  20. Deconvolution When Classifying Noisy Data Involving Transformations

    KAUST Repository

    Carroll, Raymond

    2012-09-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  1. Deconvolution When Classifying Noisy Data Involving Transformations.

    Science.gov (United States)

    Carroll, Raymond; Delaigle, Aurore; Hall, Peter

    2012-09-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  2. Deconvolution When Classifying Noisy Data Involving Transformations

    KAUST Repository

    Carroll, Raymond; Delaigle, Aurore; Hall, Peter

    2012-01-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  3. Classifying Radio Galaxies with the Convolutional Neural Network

    International Nuclear Information System (INIS)

    Aniyan, A. K.; Thorat, K.

    2017-01-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff–Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA)—Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ∼200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our results are comparable to that of manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.

  4. Classifying Radio Galaxies with the Convolutional Neural Network

    Energy Technology Data Exchange (ETDEWEB)

    Aniyan, A. K.; Thorat, K. [Department of Physics and Electronics, Rhodes University, Grahamstown (South Africa)

    2017-06-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff–Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA)—Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ∼200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our results are comparable to that of manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.

  5. Classifying Radio Galaxies with the Convolutional Neural Network

    Science.gov (United States)

    Aniyan, A. K.; Thorat, K.

    2017-06-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff-Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA)—Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ˜200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our results are comparable to that of manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.

  6. Snoring classified: The Munich-Passau Snore Sound Corpus.

    Science.gov (United States)

    Janott, Christoph; Schmitt, Maximilian; Zhang, Yue; Qian, Kun; Pandit, Vedhas; Zhang, Zixing; Heiser, Clemens; Hohenhorst, Winfried; Herzog, Michael; Hemmert, Werner; Schuller, Björn

    2018-03-01

    Snoring can be excited in different locations within the upper airways during sleep. It was hypothesised that the excitation locations are correlated with distinct acoustic characteristics of the snoring noise. To verify this hypothesis, a database of snore sounds is developed, labelled with the location of sound excitation. Video and audio recordings taken during drug induced sleep endoscopy (DISE) examinations from three medical centres have been semi-automatically screened for snore events, which subsequently have been classified by ENT experts into four classes based on the VOTE classification. The resulting dataset containing 828 snore events from 219 subjects has been split into Train, Development, and Test sets. An SVM classifier has been trained using low level descriptors (LLDs) related to energy, spectral features, mel frequency cepstral coefficients (MFCC), formants, voicing, harmonic-to-noise ratio (HNR), spectral harmonicity, pitch, and microprosodic features. An unweighted average recall (UAR) of 55.8% could be achieved using the full set of LLDs including formants. Best performing subset is the MFCC-related set of LLDs. A strong difference in performance could be observed between the permutations of train, development, and test partition, which may be caused by the relatively low number of subjects included in the smaller classes of the strongly unbalanced data set. A database of snoring sounds is presented which are classified according to their sound excitation location based on objective criteria and verifiable video material. With the database, it could be demonstrated that machine classifiers can distinguish different excitation location of snoring sounds in the upper airway based on acoustic parameters. Copyright © 2018 Elsevier Ltd. All rights reserved.

  7. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  8. Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies

    Science.gov (United States)

    Theis, Fabian J.

    2017-01-01

    Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on nonstratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different behaviors between the random forest and other classifiers. In conclusion, we provide guidance for choosing correction methods when training classifiers on biased samples. For random forests, our method outperforms state-of-the-art procedures if distribution assumptions are roughly fulfilled. We provide our implementation in the R package sambia. PMID:29312464

  9. Logarithmic learning for generalized classifier neural network.

    Science.gov (United States)

    Ozyildirim, Buse Melis; Avci, Mutlu

    2014-12-01

    Generalized classifier neural network is introduced as an efficient classifier among the others. Unless the initial smoothing parameter value is close to the optimal one, generalized classifier neural network suffers from convergence problem and requires quite a long time to converge. In this work, to overcome this problem, a logarithmic learning approach is proposed. The proposed method uses logarithmic cost function instead of squared error. Minimization of this cost function reduces the number of iterations used for reaching the minima. The proposed method is tested on 15 different data sets and performance of logarithmic learning generalized classifier neural network is compared with that of standard one. Thanks to operation range of radial basis function included by generalized classifier neural network, proposed logarithmic approach and its derivative has continuous values. This makes it possible to adopt the advantage of logarithmic fast convergence by the proposed learning method. Due to fast convergence ability of logarithmic cost function, training time is maximally decreased to 99.2%. In addition to decrease in training time, classification performance may also be improved till 60%. According to the test results, while the proposed method provides a solution for time requirement problem of generalized classifier neural network, it may also improve the classification accuracy. The proposed method can be considered as an efficient way for reducing the time requirement problem of generalized classifier neural network. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. A hybrid approach to select features and classify diseases based on medical data

    Science.gov (United States)

    AbdelLatif, Hisham; Luo, Jiawei

    2018-03-01

    Feature selection is popular problem in the classification of diseases in clinical medicine. Here, we developing a hybrid methodology to classify diseases, based on three medical datasets, Arrhythmia, Breast cancer, and Hepatitis datasets. This methodology called k-means ANOVA Support Vector Machine (K-ANOVA-SVM) uses K-means cluster with ANOVA statistical to preprocessing data and selection the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we choice three classification algorithms, decision tree Naïve Bayes, Support Vector Machines and applied the medical datasets direct to these algorithms. Our methodology was a much better classification accuracy is given of 98% in Arrhythmia datasets, 92% in Breast cancer datasets and 88% in Hepatitis datasets, Compare to use the medical data directly with decision tree Naïve Bayes, and Support Vector Machines. Also, the ROC curve and precision with (K-ANOVA-SVM) Achieved best results than other algorithms

  11. Machine-z: Rapid Machine-Learned Redshift Indicator for Swift Gamma-Ray Bursts

    Science.gov (United States)

    Ukwatta, T. N.; Wozniak, P. R.; Gehrels, N.

    2016-01-01

    Studies of high-redshift gamma-ray bursts (GRBs) provide important information about the early Universe such as the rates of stellar collapsars and mergers, the metallicity content, constraints on the re-ionization period, and probes of the Hubble expansion. Rapid selection of high-z candidates from GRB samples reported in real time by dedicated space missions such as Swift is the key to identifying the most distant bursts before the optical afterglow becomes too dim to warrant a good spectrum. Here, we introduce 'machine-z', a redshift prediction algorithm and a 'high-z' classifier for Swift GRBs based on machine learning. Our method relies exclusively on canonical data commonly available within the first few hours after the GRB trigger. Using a sample of 284 bursts with measured redshifts, we trained a randomized ensemble of decision trees (random forest) to perform both regression and classification. Cross-validated performance studies show that the correlation coefficient between machine-z predictions and the true redshift is nearly 0.6. At the same time, our high-z classifier can achieve 80 per cent recall of true high-redshift bursts, while incurring a false positive rate of 20 per cent. With 40 per cent false positive rate the classifier can achieve approximately 100 per cent recall. The most reliable selection of high-redshift GRBs is obtained by combining predictions from both the high-z classifier and the machine-z regressor.

  12. A CLASSIFIER SYSTEM USING SMOOTH GRAPH COLORING

    Directory of Open Access Journals (Sweden)

    JORGE FLORES CRUZ

    2017-01-01

    Full Text Available Unsupervised classifiers allow clustering methods with less or no human intervention. Therefore it is desirable to group the set of items with less data processing. This paper proposes an unsupervised classifier system using the model of soft graph coloring. This method was tested with some classic instances in the literature and the results obtained were compared with classifications made with human intervention, yielding as good or better results than supervised classifiers, sometimes providing alternative classifications that considers additional information that humans did not considered.

  13. High dimensional classifiers in the imbalanced case

    DEFF Research Database (Denmark)

    Bak, Britta Anker; Jensen, Jens Ledet

    We consider the binary classification problem in the imbalanced case where the number of samples from the two groups differ. The classification problem is considered in the high dimensional case where the number of variables is much larger than the number of samples, and where the imbalance leads...... to a bias in the classification. A theoretical analysis of the independence classifier reveals the origin of the bias and based on this we suggest two new classifiers that can handle any imbalance ratio. The analytical results are supplemented by a simulation study, where the suggested classifiers in some...

  14. Arabic Handwriting Recognition Using Neural Network Classifier

    African Journals Online (AJOL)

    pc

    2018-03-05

    Mar 5, 2018 ... an OCR using Neural Network classifier preceded by a set of preprocessing .... Artificial Neural Networks (ANNs), which we adopt in this research, consist of ... advantage and disadvantages of each technique. In [9],. Khemiri ...

  15. Classifiers based on optimal decision rules

    KAUST Repository

    Amin, Talha

    2013-11-25

    Based on dynamic programming approach we design algorithms for sequential optimization of exact and approximate decision rules relative to the length and coverage [3, 4]. In this paper, we use optimal rules to construct classifiers, and study two questions: (i) which rules are better from the point of view of classification-exact or approximate; and (ii) which order of optimization gives better results of classifier work: length, length+coverage, coverage, or coverage+length. Experimental results show that, on average, classifiers based on exact rules are better than classifiers based on approximate rules, and sequential optimization (length+coverage or coverage+length) is better than the ordinary optimization (length or coverage).

  16. Classifiers based on optimal decision rules

    KAUST Repository

    Amin, Talha M.; Chikalov, Igor; Moshkov, Mikhail; Zielosko, Beata

    2013-01-01

    Based on dynamic programming approach we design algorithms for sequential optimization of exact and approximate decision rules relative to the length and coverage [3, 4]. In this paper, we use optimal rules to construct classifiers, and study two questions: (i) which rules are better from the point of view of classification-exact or approximate; and (ii) which order of optimization gives better results of classifier work: length, length+coverage, coverage, or coverage+length. Experimental results show that, on average, classifiers based on exact rules are better than classifiers based on approximate rules, and sequential optimization (length+coverage or coverage+length) is better than the ordinary optimization (length or coverage).

  17. Combining multiple classifiers for age classification

    CSIR Research Space (South Africa)

    Van Heerden, C

    2009-11-01

    Full Text Available The authors compare several different classifier combination methods on a single task, namely speaker age classification. This task is well suited to combination strategies, since significantly different feature classes are employed. Support vector...

  18. Neural Network Classifiers for Local Wind Prediction.

    Science.gov (United States)

    Kretzschmar, Ralf; Eckert, Pierre; Cattani, Daniel; Eggimann, Fritz

    2004-05-01

    This paper evaluates the quality of neural network classifiers for wind speed and wind gust prediction with prediction lead times between +1 and +24 h. The predictions were realized based on local time series and model data. The selection of appropriate input features was initiated by time series analysis and completed by empirical comparison of neural network classifiers trained on several choices of input features. The selected input features involved day time, yearday, features from a single wind observation device at the site of interest, and features derived from model data. The quality of the resulting classifiers was benchmarked against persistence for two different sites in Switzerland. The neural network classifiers exhibited superior quality when compared with persistence judged on a specific performance measure, hit and false-alarm rates.

  19. Machine tool structures

    CERN Document Server

    Koenigsberger, F

    1970-01-01

    Machine Tool Structures, Volume 1 deals with fundamental theories and calculation methods for machine tool structures. Experimental investigations into stiffness are discussed, along with the application of the results to the design of machine tool structures. Topics covered range from static and dynamic stiffness to chatter in metal cutting, stability in machine tools, and deformations of machine tool structures. This volume is divided into three sections and opens with a discussion on stiffness specifications and the effect of stiffness on the behavior of the machine under forced vibration c

  20. Consistency Analysis of Nearest Subspace Classifier

    OpenAIRE

    Wang, Yi

    2015-01-01

    The Nearest subspace classifier (NSS) finds an estimation of the underlying subspace within each class and assigns data points to the class that corresponds to its nearest subspace. This paper mainly studies how well NSS can be generalized to new samples. It is proved that NSS is strongly consistent under certain assumptions. For completeness, NSS is evaluated through experiments on various simulated and real data sets, in comparison with some other linear model based classifiers. It is also ...

  1. Profiled support vector machines for antisense oligonucleotide efficacy prediction

    Directory of Open Access Journals (Sweden)

    Martín-Guerrero José D

    2004-09-01

    Full Text Available Abstract Background This paper presents the use of Support Vector Machines (SVMs for prediction and analysis of antisense oligonucleotide (AO efficacy. The collected database comprises 315 AO molecules including 68 features each, inducing a problem well-suited to SVMs. The task of feature selection is crucial given the presence of noisy or redundant features, and the well-known problem of the curse of dimensionality. We propose a two-stage strategy to develop an optimal model: (1 feature selection using correlation analysis, mutual information, and SVM-based recursive feature elimination (SVM-RFE, and (2 AO prediction using standard and profiled SVM formulations. A profiled SVM gives different weights to different parts of the training data to focus the training on the most important regions. Results In the first stage, the SVM-RFE technique was most efficient and robust in the presence of low number of samples and high input space dimension. This method yielded an optimal subset of 14 representative features, which were all related to energy and sequence motifs. The second stage evaluated the performance of the predictors (overall correlation coefficient between observed and predicted efficacy, r; mean error, ME; and root-mean-square-error, RMSE using 8-fold and minus-one-RNA cross-validation methods. The profiled SVM produced the best results (r = 0.44, ME = 0.022, and RMSE= 0.278 and predicted high (>75% inhibition of gene expression and low efficacy (http://aosvm.cgb.ki.se/. Conclusions The SVM approach is well suited to the AO prediction problem, and yields a prediction accuracy superior to previous methods. The profiled SVM was found to perform better than the standard SVM, suggesting that it could lead to improvements in other prediction problems as well.

  2. Sentiment analysis system for movie review in Bahasa Indonesia using naive bayes classifier method

    Science.gov (United States)

    Nurdiansyah, Yanuar; Bukhori, Saiful; Hidayat, Rahmad

    2018-04-01

    There are many ways of implementing the use of sentiments often found in documents; one of which is the sentiments found on the product or service reviews. It is so important to be able to process and extract textual data from the documents. Therefore, we propose a system that is able to classify sentiments from review documents into two classes: positive sentiment and negative sentiment. We use Naive Bayes Classifier method in this document classification system that we build. We choose Movienthusiast, a movie reviews in Bahasa Indonesia website as the source of our review documents. From there, we were able to collect 1201 movie reviews: 783 positive reviews and 418 negative reviews that we use as the dataset for this machine learning classifier. The classifying accuracy yields an average of 88.37% from five times of accuracy measuring attempts using aforementioned dataset.

  3. Speaker gender identification based on majority vote classifiers

    Science.gov (United States)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications namely in automatic speech recognition, interactive voice response systems and audio browsing systems. Gender identification systems performance is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching optimum tuning of one classifier parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models including decision tree, discriminant analysis, nave Bayes, support vector machine and k-nearest neighbor was experimented. The three best perming classifiers among the five ones will contribute by majority voting between their scores. Experimentations were performed on three different datasets spoken in three languages: English, German and Arabic in order to validate language independency of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.

  4. MITS machine operations

    International Nuclear Information System (INIS)

    Flinchem, J.

    1980-01-01

    This document contains procedures which apply to operations performed on individual P-1c machines in the Machine Interface Test System (MITS) at AiResearch Manufacturing Company's Torrance, California Facility

  5. Brain versus Machine Control.

    Directory of Open Access Journals (Sweden)

    Jose M Carmena

    2004-12-01

    Full Text Available Dr. Octopus, the villain of the movie "Spiderman 2", is a fusion of man and machine. Neuroscientist Jose Carmena examines the facts behind this fictional account of a brain- machine interface

  6. Applied machining technology

    CERN Document Server

    Tschätsch, Heinz

    2010-01-01

    Machining and cutting technologies are still crucial for many manufacturing processes. This reference presents all important machining processes in a comprehensive and coherent way. It includes many examples of concrete calculations, problems and solutions.

  7. Machining with abrasives

    CERN Document Server

    Jackson, Mark J

    2011-01-01

    Abrasive machining is key to obtaining the desired geometry and surface quality in manufacturing. This book discusses the fundamentals and advances in the abrasive machining processes. It provides a complete overview of developing areas in the field.

  8. Machine medical ethics

    CERN Document Server

    Pontier, Matthijs

    2015-01-01

    The essays in this book, written by researchers from both humanities and sciences, describe various theoretical and experimental approaches to adding medical ethics to a machine in medical settings. Medical machines are in close proximity with human beings, and getting closer: with patients who are in vulnerable states of health, who have disabilities of various kinds, with the very young or very old, and with medical professionals. In such contexts, machines are undertaking important medical tasks that require emotional sensitivity, knowledge of medical codes, human dignity, and privacy. As machine technology advances, ethical concerns become more urgent: should medical machines be programmed to follow a code of medical ethics? What theory or theories should constrain medical machine conduct? What design features are required? Should machines share responsibility with humans for the ethical consequences of medical actions? How ought clinical relationships involving machines to be modeled? Is a capacity for e...

  9. Machine protection systems

    CERN Document Server

    Macpherson, A L

    2010-01-01

    A summary of the Machine Protection System of the LHC is given, with particular attention given to the outstanding issues to be addressed, rather than the successes of the machine protection system from the 2009 run. In particular, the issues of Safe Machine Parameter system, collimation and beam cleaning, the beam dump system and abort gap cleaning, injection and dump protection, and the overall machine protection program for the upcoming run are summarised.

  10. Dictionary of machine terms

    International Nuclear Information System (INIS)

    1990-06-01

    This book has introduction of dictionary of machine terms, and a compilation committee and introductory remarks. It gives descriptions of the machine terms in alphabetical order from a to Z and also includes abbreviation of machine terms and symbol table, way to read mathematical symbols and abbreviation and terms of drawings.

  11. Mankind, machines and people

    Energy Technology Data Exchange (ETDEWEB)

    Hugli, A

    1984-01-01

    The following questions are addressed: is there a difference between machines and men, between human communication and communication with machines. Will we ever reach the point when the dream of artificial intelligence becomes a reality. Will thinking machines be able to replace the human spirit in all its aspects. Social consequences and philosophical aspects are addressed. 8 references.

  12. A Universal Reactive Machine

    DEFF Research Database (Denmark)

    Andersen, Henrik Reif; Mørk, Simon; Sørensen, Morten U.

    1997-01-01

    Turing showed the existence of a model universal for the set of Turing machines in the sense that given an encoding of any Turing machine asinput the universal Turing machine simulates it. We introduce the concept of universality for reactive systems and construct a CCS processuniversal...

  13. HTS machine laboratory prototype

    DEFF Research Database (Denmark)

    machine. The machine comprises six stationary HTS field windings wound from both YBCO and BiSCOO tape operated at liquid nitrogen temperature and enclosed in a cryostat, and a three phase armature winding spinning at up to 300 rpm. This design has full functionality of HTS synchronous machines. The design...

  14. Your Sewing Machine.

    Science.gov (United States)

    Peacock, Marion E.

    The programed instruction manual is designed to aid the student in learning the parts, uses, and operation of the sewing machine. Drawings of sewing machine parts are presented, and space is provided for the student's written responses. Following an introductory section identifying sewing machine parts, the manual deals with each part and its…

  15. Machine learning applied to crime prediction

    OpenAIRE

    Vaquero Barnadas, Miquel

    2016-01-01

    Machine Learning is a cornerstone when it comes to artificial intelligence and big data analysis. It provides powerful algorithms that are capable of recognizing patterns, classifying data, and, basically, learn by themselves to perform a specific task. This field has incredibly grown in popularity these days, however, it still remains unknown for the majority of people, and even for most professionals. This project intends to provide an understandable explanation of what is it, what types ar...

  16. Machine Learning Methods for Production Cases Analysis

    Science.gov (United States)

    Mokrova, Nataliya V.; Mokrov, Alexander M.; Safonova, Alexandra V.; Vishnyakov, Igor V.

    2018-03-01

    Approach to analysis of events occurring during the production process were proposed. Described machine learning system is able to solve classification tasks related to production control and hazard identification at an early stage. Descriptors of the internal production network data were used for training and testing of applied models. k-Nearest Neighbors and Random forest methods were used to illustrate and analyze proposed solution. The quality of the developed classifiers was estimated using standard statistical metrics, such as precision, recall and accuracy.

  17. Reinforcement Learning Based Artificial Immune Classifier

    Directory of Open Access Journals (Sweden)

    Mehmet Karakose

    2013-01-01

    Full Text Available One of the widely used methods for classification that is a decision-making process is artificial immune systems. Artificial immune systems based on natural immunity system can be successfully applied for classification, optimization, recognition, and learning in real-world problems. In this study, a reinforcement learning based artificial immune classifier is proposed as a new approach. This approach uses reinforcement learning to find better antibody with immune operators. The proposed new approach has many contributions according to other methods in the literature such as effectiveness, less memory cell, high accuracy, speed, and data adaptability. The performance of the proposed approach is demonstrated by simulation and experimental results using real data in Matlab and FPGA. Some benchmark data and remote image data are used for experimental results. The comparative results with supervised/unsupervised based artificial immune system, negative selection classifier, and resource limited artificial immune classifier are given to demonstrate the effectiveness of the proposed new method.

  18. Can single classifiers be as useful as model ensembles to produce benthic seabed substratum maps?

    Science.gov (United States)

    Turner, Joseph A.; Babcock, Russell C.; Hovey, Renae; Kendrick, Gary A.

    2018-05-01

    Numerous machine-learning classifiers are available for benthic habitat map production, which can lead to different results. This study highlights the performance of the Random Forest (RF) classifier, which was significantly better than Classification Trees (CT), Naïve Bayes (NB), and a multi-model ensemble in terms of overall accuracy, Balanced Error Rate (BER), Kappa, and area under the curve (AUC) values. RF accuracy was often higher than 90% for each substratum class, even at the most detailed level of the substratum classification and AUC values also indicated excellent performance (0.8-1). Total agreement between classifiers was high at the broadest level of classification (75-80%) when differentiating between hard and soft substratum. However, this sharply declined as the number of substratum categories increased (19-45%) including a mix of rock, gravel, pebbles, and sand. The model ensemble, produced from the results of all three classifiers by majority voting, did not show any increase in predictive performance when compared to the single RF classifier. This study shows how a single classifier may be sufficient to produce benthic seabed maps and model ensembles of multiple classifiers.

  19. Classifier utility modeling and analysis of hypersonic inlet start/unstart considering training data costs

    Science.gov (United States)

    Chang, Juntao; Hu, Qinghua; Yu, Daren; Bao, Wen

    2011-11-01

    Start/unstart detection is one of the most important issues of hypersonic inlets and is also the foundation of protection control of scramjet. The inlet start/unstart detection can be attributed to a standard pattern classification problem, and the training sample costs have to be considered for the classifier modeling as the CFD numerical simulations and wind tunnel experiments of hypersonic inlets both cost time and money. To solve this problem, the CFD simulation of inlet is studied at first step, and the simulation results could provide the training data for pattern classification of hypersonic inlet start/unstart. Then the classifier modeling technology and maximum classifier utility theories are introduced to analyze the effect of training data cost on classifier utility. In conclusion, it is useful to introduce support vector machine algorithms to acquire the classifier model of hypersonic inlet start/unstart, and the minimum total cost of hypersonic inlet start/unstart classifier can be obtained by the maximum classifier utility theories.

  20. Classifier Fusion With Contextual Reliability Evaluation.

    Science.gov (United States)

    Liu, Zhunga; Pan, Quan; Dezert, Jean; Han, Jun-Wei; He, You

    2018-05-01

    Classifier fusion is an efficient strategy to improve the classification performance for the complex pattern recognition problem. In practice, the multiple classifiers to combine can have different reliabilities and the proper reliability evaluation plays an important role in the fusion process for getting the best classification performance. We propose a new method for classifier fusion with contextual reliability evaluation (CF-CRE) based on inner reliability and relative reliability concepts. The inner reliability, represented by a matrix, characterizes the probability of the object belonging to one class when it is classified to another class. The elements of this matrix are estimated from the -nearest neighbors of the object. A cautious discounting rule is developed under belief functions framework to revise the classification result according to the inner reliability. The relative reliability is evaluated based on a new incompatibility measure which allows to reduce the level of conflict between the classifiers by applying the classical evidence discounting rule to each classifier before their combination. The inner reliability and relative reliability capture different aspects of the classification reliability. The discounted classification results are combined with Dempster-Shafer's rule for the final class decision making support. The performance of CF-CRE have been evaluated and compared with those of main classical fusion methods using real data sets. The experimental results show that CF-CRE can produce substantially higher accuracy than other fusion methods in general. Moreover, CF-CRE is robust to the changes of the number of nearest neighbors chosen for estimating the reliability matrix, which is appealing for the applications.

  1. Quantum machine learning.

    Science.gov (United States)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  2. Asynchronized synchronous machines

    CERN Document Server

    Botvinnik, M M

    1964-01-01

    Asynchronized Synchronous Machines focuses on the theoretical research on asynchronized synchronous (AS) machines, which are "hybrids” of synchronous and induction machines that can operate with slip. Topics covered in this book include the initial equations; vector diagram of an AS machine; regulation in cases of deviation from the law of full compensation; parameters of the excitation system; and schematic diagram of an excitation regulator. The possible applications of AS machines and its calculations in certain cases are also discussed. This publication is beneficial for students and indiv

  3. Classifying sows' activity types from acceleration patterns

    DEFF Research Database (Denmark)

    Cornou, Cecile; Lundbye-Christensen, Søren

    2008-01-01

    An automated method of classifying sow activity using acceleration measurements would allow the individual sow's behavior to be monitored throughout the reproductive cycle; applications for detecting behaviors characteristic of estrus and farrowing or to monitor illness and welfare can be foreseen....... This article suggests a method of classifying five types of activity exhibited by group-housed sows. The method involves the measurement of acceleration in three dimensions. The five activities are: feeding, walking, rooting, lying laterally and lying sternally. Four time series of acceleration (the three...

  4. Data characteristics that determine classifier performance

    CSIR Research Space (South Africa)

    Van der Walt, Christiaan M

    2006-11-01

    Full Text Available available at [11]. The kNN uses a LinearNN nearest neighbour search algorithm with an Euclidean distance metric [8]. The optimal k value is determined by performing 10-fold cross-validation. An optimal k value between 1 and 10 is used for Experiments 1... classifiers. 10-fold cross-validation is used to evaluate and compare the performance of the classifiers on the different data sets. 3.1. Artificial data generation Multivariate Gaussian distributions are used to generate artificial data sets. We use d...

  5. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Full Text Available Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specified to one or more domains is necessary. We have developed a customizable text classifier for users to mine the collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by user. The user can also control the number of domains chosen and decide the standard with which to choose the texts based on demand and abundance of materials. The performance of the classifier varies with the user's choice.

  6. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-state classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  7. Machine learning analysis of binaural rowing sounds

    DEFF Research Database (Denmark)

    Johard, Leonard; Ruffaldi, Emanuele; Hoffmann, Pablo F.

    2011-01-01

    Techniques for machine hearing are increasing their potentiality due to new application domains. In this work we are addressing the analysis of rowing sounds in natural context for the purpose of supporting a training system based on virtual environments. This paper presents the acquisition metho...... methodology and the evaluation of different machine learning techniques for classifying rowing-sound data. We see that a combination of principal component analysis and shallow networks perform equally well as deep architectures, while being much faster to train.......Techniques for machine hearing are increasing their potentiality due to new application domains. In this work we are addressing the analysis of rowing sounds in natural context for the purpose of supporting a training system based on virtual environments. This paper presents the acquisition...

  8. Machine Learning Phases of Strongly Correlated Fermions

    Directory of Open Access Journals (Sweden)

    Kelvin Ch’ng

    2017-08-01

    Full Text Available Machine learning offers an unprecedented perspective for the problem of classifying phases in condensed matter physics. We employ neural-network machine learning techniques to distinguish finite-temperature phases of the strongly correlated fermions on cubic lattices. We show that a three-dimensional convolutional network trained on auxiliary field configurations produced by quantum Monte Carlo simulations of the Hubbard model can correctly predict the magnetic phase diagram of the model at the average density of one (half filling. We then use the network, trained at half filling, to explore the trend in the transition temperature as the system is doped away from half filling. This transfer learning approach predicts that the instability to the magnetic phase extends to at least 5% doping in this region. Our results pave the way for other machine learning applications in correlated quantum many-body systems.

  9. Machine Learning Techniques in Clinical Vision Sciences.

    Science.gov (United States)

    Caixinha, Miguel; Nunes, Sandrina

    2017-01-01

    This review presents and discusses the contribution of machine learning techniques for diagnosis and disease monitoring in the context of clinical vision science. Many ocular diseases leading to blindness can be halted or delayed when detected and treated at its earliest stages. With the recent developments in diagnostic devices, imaging and genomics, new sources of data for early disease detection and patients' management are now available. Machine learning techniques emerged in the biomedical sciences as clinical decision-support techniques to improve sensitivity and specificity of disease detection and monitoring, increasing objectively the clinical decision-making process. This manuscript presents a review in multimodal ocular disease diagnosis and monitoring based on machine learning approaches. In the first section, the technical issues related to the different machine learning approaches will be present. Machine learning techniques are used to automatically recognize complex patterns in a given dataset. These techniques allows creating homogeneous groups (unsupervised learning), or creating a classifier predicting group membership of new cases (supervised learning), when a group label is available for each case. To ensure a good performance of the machine learning techniques in a given dataset, all possible sources of bias should be removed or minimized. For that, the representativeness of the input dataset for the true population should be confirmed, the noise should be removed, the missing data should be treated and the data dimensionally (i.e., the number of parameters/features and the number of cases in the dataset) should be adjusted. The application of machine learning techniques in ocular disease diagnosis and monitoring will be presented and discussed in the second section of this manuscript. To show the clinical benefits of machine learning in clinical vision sciences, several examples will be presented in glaucoma, age-related macular degeneration

  10. Feature extraction using convolutional neural network for classifying breast density in mammographic images

    Science.gov (United States)

    Thomaz, Ricardo L.; Carneiro, Pedro C.; Patrocinio, Ana C.

    2017-03-01

    Breast cancer is the leading cause of death for women in most countries. The high levels of mortality relate mostly to late diagnosis and to the direct proportionally relationship between breast density and breast cancer development. Therefore, the correct assessment of breast density is important to provide better screening for higher risk patients. However, in modern digital mammography the discrimination among breast densities is highly complex due to increased contrast and visual information for all densities. Thus, a computational system for classifying breast density might be a useful tool for aiding medical staff. Several machine-learning algorithms are already capable of classifying small number of classes with good accuracy. However, machinelearning algorithms main constraint relates to the set of features extracted and used for classification. Although well-known feature extraction techniques might provide a good set of features, it is a complex task to select an initial set during design of a classifier. Thus, we propose feature extraction using a Convolutional Neural Network (CNN) for classifying breast density by a usual machine-learning classifier. We used 307 mammographic images downsampled to 260x200 pixels to train a CNN and extract features from a deep layer. After training, the activation of 8 neurons from a deep fully connected layer are extracted and used as features. Then, these features are feedforward to a single hidden layer neural network that is cross-validated using 10-folds to classify among four classes of breast density. The global accuracy of this method is 98.4%, presenting only 1.6% of misclassification. However, the small set of samples and memory constraints required the reuse of data in both CNN and MLP-NN, therefore overfitting might have influenced the results even though we cross-validated the network. Thus, although we presented a promising method for extracting features and classifying breast density, a greater database is

  11. 75 FR 37253 - Classified National Security Information

    Science.gov (United States)

    2010-06-28

    ... ``Secret.'' (3) Each interior page of a classified document shall be marked at the top and bottom either... ``(TS)'' for Top Secret, ``(S)'' for Secret, and ``(C)'' for Confidential will be used. (2) Portions... from the informational text. (1) Conspicuously place the overall classification at the top and bottom...

  12. 75 FR 707 - Classified National Security Information

    Science.gov (United States)

    2010-01-05

    ... classified at one of the following three levels: (1) ``Top Secret'' shall be applied to information, the... exercise this authority. (2) ``Top Secret'' original classification authority may be delegated only by the... official has been delegated ``Top Secret'' original classification authority by the agency head. (4) Each...

  13. Neural Network Classifier Based on Growing Hyperspheres

    Czech Academy of Sciences Publication Activity Database

    Jiřina Jr., Marcel; Jiřina, Marcel

    2000-01-01

    Roč. 10, č. 3 (2000), s. 417-428 ISSN 1210-0552. [Neural Network World 2000. Prague, 09.07.2000-12.07.2000] Grant - others:MŠMT ČR(CZ) VS96047; MPO(CZ) RP-4210 Institutional research plan: AV0Z1030915 Keywords : neural network * classifier * hyperspheres * big -dimensional data Subject RIV: BA - General Mathematics

  14. Histogram deconvolution - An aid to automated classifiers

    Science.gov (United States)

    Lorre, J. J.

    1983-01-01

    It is shown that N-dimensional histograms are convolved by the addition of noise in the picture domain. Three methods are described which provide the ability to deconvolve such noise-affected histograms. The purpose of the deconvolution is to provide automated classifiers with a higher quality N-dimensional histogram from which to obtain classification statistics.

  15. Classifying web pages with visual features

    NARCIS (Netherlands)

    de Boer, V.; van Someren, M.; Lupascu, T.; Filipe, J.; Cordeiro, J.

    2010-01-01

    To automatically classify and process web pages, current systems use the textual content of those pages, including both the displayed content and the underlying (HTML) code. However, a very important feature of a web page is its visual appearance. In this paper, we show that using generic visual

  16. Feature selection and classification of mechanical fault of an induction motor using random forest classifier

    OpenAIRE

    Patel, Raj Kumar; Giri, V.K.

    2016-01-01

    Fault detection and diagnosis is the most important technology in condition-based maintenance (CBM) system for rotating machinery. This paper experimentally explores the development of a random forest (RF) classifier, a recently emerged machine learning technique, for multi-class mechanical fault diagnosis in bearing of an induction motor. Firstly, the vibration signals are collected from the bearing using accelerometer sensor. Parameters from the vibration signal are extracted in the form of...

  17. On the statistical assessment of classifiers using DNA microarray data

    Directory of Open Access Journals (Sweden)

    Carella M

    2006-08-01

    Full Text Available Abstract Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22 and tumor (25 specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependant on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045 as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS and Support Vector Machines (SVM classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035 and e = 18% (p = 0.037 respectively. Moreover, the error rate

  18. Seismic reliability assessment of RC structures including soil–structure interaction using wavelet weighted least squares support vector machine

    International Nuclear Information System (INIS)

    Khatibinia, Mohsen; Javad Fadaee, Mohammad; Salajegheh, Javad; Salajegheh, Eysa

    2013-01-01

    An efficient metamodeling framework in conjunction with the Monte-Carlo Simulation (MCS) is introduced to reduce the computational cost in seismic reliability assessment of existing RC structures. In order to achieve this purpose, the metamodel is designed by combining weighted least squares support vector machine (WLS-SVM) and a wavelet kernel function, called wavelet weighted least squares support vector machine (WWLS-SVM). In this study, the seismic reliability assessment of existing RC structures with consideration of soil–structure interaction (SSI) effects is investigated in accordance with Performance-Based Design (PBD). This study aims to incorporate the acceptable performance levels of PBD into reliability theory for comparing the obtained annual probability of non-performance with the target values for each performance level. The MCS method as the most reliable method is utilized to estimate the annual probability of failure associated with a given performance level in this study. In WWLS-SVM-based MCS, the structural seismic responses are accurately predicted by WWLS-SVM for reducing the computational cost. To show the efficiency and robustness of the proposed metamodel, two RC structures are studied. Numerical results demonstrate the efficiency and computational advantages of the proposed metamodel for the seismic reliability assessment of structures. Furthermore, the consideration of the SSI effects in the seismic reliability assessment of existing RC structures is compared to the fixed base model. It shows which SSI has the significant influence on the seismic reliability assessment of structures.

  19. Support vector machine based fault detection approach for RFT-30 cyclotron

    Energy Technology Data Exchange (ETDEWEB)

    Kong, Young Bae, E-mail: ybkong@kaeri.re.kr; Lee, Eun Je; Hur, Min Goo; Park, Jeong Hoon; Park, Yong Dae; Yang, Seung Dae

    2016-10-21

    An RFT-30 is a 30 MeV cyclotron used for radioisotope applications and radiopharmaceutical researches. The RFT-30 cyclotron is highly complex and includes many signals for control and monitoring of the system. It is quite difficult to detect and monitor the system failure in real time. Moreover, continuous monitoring of the system is hard and time-consuming work for human operators. In this paper, we propose a support vector machine (SVM) based fault detection approach for the RFT-30 cyclotron. The proposed approach performs SVM learning with training samples to construct the classification model. To compensate the system complexity due to the large-scale accelerator, we utilize the principal component analysis (PCA) for transformation of the original data. After training procedure, the proposed approach detects the system faults in real time. We analyzed the performance of the proposed approach utilizing the experimental data of the RFT-30 cyclotron. The performance results show that the proposed SVM approach can provide an efficient way to control the cyclotron system.

  20. CompareSVM: supervised, Support Vector Machine (SVM) inference of gene regularity networks.

    Science.gov (United States)

    Gillani, Zeeshan; Akash, Muhammad Sajid Hamid; Rahaman, M D Matiur; Chen, Ming

    2014-11-30

    Predication of gene regularity network (GRN) from expression data is a challenging task. There are many methods that have been developed to address this challenge ranging from supervised to unsupervised methods. Most promising methods are based on support vector machine (SVM). There is a need for comprehensive analysis on prediction accuracy of supervised method SVM using different kernels on different biological experimental conditions and network size. We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRN. Using CompareSVM, we investigated and evaluated different SVM kernel methods on simulated datasets of microarray of different sizes in detail. The results obtained from CompareSVM showed that accuracy of inference method depends upon the nature of experimental condition and size of the network. For network with nodes (SVM Gaussian kernel outperform on knockout, knockdown, and multifactorial datasets compared to all the other inference methods. For network with large number of nodes (~500), choice of inference method depend upon nature of experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/ .

  1. Freshwater Algal Bloom Prediction by Support Vector Machine in Macau Storage Reservoirs

    Directory of Open Access Journals (Sweden)

    Zhengchao Xie

    2012-01-01

    Full Text Available Understanding and predicting dynamic change of algae population in freshwater reservoirs is particularly important, as algae-releasing cyanotoxins are carcinogens that would affect the health of public. However, the high complex nonlinearity of water variables and their interactions makes it difficult to model the growth of algae species. Recently, support vector machine (SVM was reported to have advantages of only requiring a small amount of samples, high degree of prediction accuracy, and long prediction period to solve the nonlinear problems. In this study, the SVM-based prediction and forecast models for phytoplankton abundance in Macau Storage Reservoir (MSR are proposed, in which the water parameters of pH, SiO2, alkalinity, bicarbonate (HCO3 -, dissolved oxygen (DO, total nitrogen (TN, UV254, turbidity, conductivity, nitrate, total nitrogen (TN, orthophosphate (PO4 3−, total phosphorus (TP, suspended solid (SS and total organic carbon (TOC selected from the correlation analysis of the 23 monthly water variables were included, with 8-year (2001–2008 data for training and the most recent 3 years (2009–2011 for testing. The modeling results showed that the prediction and forecast powers were estimated as approximately 0.76 and 0.86, respectively, showing that the SVM is an effective new way that can be used for monitoring algal bloom in drinking water storage reservoir.

  2. Effluent composition prediction of a two-stage anaerobic digestion process: machine learning and stoichiometry techniques.

    Science.gov (United States)

    Alejo, Luz; Atkinson, John; Guzmán-Fierro, Víctor; Roeckel, Marlene

    2018-05-16

    Computational self-adapting methods (Support Vector Machines, SVM) are compared with an analytical method in effluent composition prediction of a two-stage anaerobic digestion (AD) process. Experimental data for the AD of poultry manure were used. The analytical method considers the protein as the only source of ammonia production in AD after degradation. Total ammonia nitrogen (TAN), total solids (TS), chemical oxygen demand (COD), and total volatile solids (TVS) were measured in the influent and effluent of the process. The TAN concentration in the effluent was predicted, this being the most inhibiting and polluting compound in AD. Despite the limited data available, the SVM-based model outperformed the analytical method for the TAN prediction, achieving a relative average error of 15.2% against 43% for the analytical method. Moreover, SVM showed higher prediction accuracy in comparison with Artificial Neural Networks. This result reveals the future promise of SVM for prediction in non-linear and dynamic AD processes. Graphical abstract ᅟ.

  3. Support Vector Machines with Manifold Learning and Probabilistic Space Projection for Tourist Expenditure Analysis

    Directory of Open Access Journals (Sweden)

    Xin Xu

    2009-03-01

    Full Text Available The significant economic contributions of the tourism industry in recent years impose an unprecedented force for data mining and machine learning methods to analyze tourism data. The intrinsic problems of raw data in tourism are largely related to the complexity, noise and nonlinearity in the data that may introduce many challenges for the existing data mining techniques such as rough sets and neural networks. In this paper, a novel method using SVM- based classification with two nonlinear feature projection techniques is proposed for tourism data analysis. The first feature projection method is based on ISOMAP (Isometric Feature Mapping, which is a class of manifold learning approaches for dimension reduction. By making use of ISOMAP, part of the noisy data can be identified and the classification accuracy of SVMs can be improved by appropriately discarding the noisy training data. The second feature projection method is a probabilistic space mapping technique for scale transformation. Experimental results on expenditure data of business travelers show that the proposed method can improve prediction performance both in terms of testing accuracy and statistical coincidence. In addition, both of the feature projection methods are helpful to reduce the training time of SVMs.

  4. Ball Nut Preload Diagnosis of the Hollow Ball Screw through Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Yi-Cheng Huang

    2018-01-01

    Full Text Available This paper studies the diagnostic results of hollow ball screws with different ball nut preload through the support vector machine (SVM process. The method is testified by considering the use of ball screw pretension and different ball nut preload. SVM was used to discriminate the hollow ball screw preload status through the vibration signals and servo motor current signals. Maximum dynamic preloads of 2%, 4%, and 6% ball screws were predesigned, manufactured, and conducted experimentally. Signal patterns with different preload features are separatedby SVM. The irregularity development of the ball screw driving motion current and rolling balls vibration of the ball screw can be discriminated via SVM based on complexity perception. The experimental results successfully show that the prognostic status of ball nut preload can be envisaged by the proposed methodology. The smart reasoning for the health of the ball screw is available based on classification of SVM. This diagnostic method satisfies the purposes of prognostic effectiveness on knowing the ball nut preload status

  5. A survey on queues in machining system: Progress from 2010 to 2017

    Directory of Open Access Journals (Sweden)

    Shekhar C.

    2017-01-01

    Full Text Available The aim of the present article is to give a historical survey of some important research works related to queues in machining system since 2010. Queues of failed machines in machine repairing problem occur due to the failure of machines at random in the manufacturing industries, where different jobs are performed on machining stations. Machines are subject to failure what may result in significant loss of production, revenue, or goodwill. In addition to the references on queues in machining system, which is also called `Machine Repair Problem' (MRP or `Machine Interference Problem' (MIP, a meticulous list of books and survey papers is also prepared so as to provide a detailed catalog for understanding the research in queueing domain. We have classified the relevant literature according to a year of publishing, methodological, and modeling aspects. The author(s hope that this survey paper could be of help to learners contemplating research on queueing domain.

  6. Automated beam placement for breast radiotherapy using a support vector machine based algorithm

    International Nuclear Information System (INIS)

    Zhao Xuan; Kong, Dewen; Jozsef, Gabor; Chang, Jenghwa; Wong, Edward K.; Formenti, Silvia C.; Wang Yao

    2012-01-01

    Purpose: To develop an automated beam placement technique for whole breast radiotherapy using tangential beams. We seek to find optimal parameters for tangential beams to cover the whole ipsilateral breast (WB) and minimize the dose to the organs at risk (OARs). Methods: A support vector machine (SVM) based method is proposed to determine the optimal posterior plane of the tangential beams. Relative significances of including/avoiding the volumes of interests are incorporated into the cost function of the SVM. After finding the optimal 3-D plane that separates the whole breast (WB) and the included clinical target volumes (CTVs) from the OARs, the gantry angle, collimator angle, and posterior jaw size of the tangential beams are derived from the separating plane equation. Dosimetric measures of the treatment plans determined by the automated method are compared with those obtained by applying manual beam placement by the physicians. The method can be further extended to use multileaf collimator (MLC) blocking by optimizing posterior MLC positions. Results: The plans for 36 patients (23 prone- and 13 supine-treated) with left breast cancer were analyzed. Our algorithm reduced the volume of the heart that receives >500 cGy dose (V5) from 2.7 to 1.7 cm 3 (p = 0.058) on average and the volume of the ipsilateral lung that receives >1000 cGy dose (V10) from 55.2 to 40.7 cm 3 (p = 0.0013). The dose coverage as measured by volume receiving >95% of the prescription dose (V95%) of the WB without a 5 mm superficial layer decreases by only 0.74% (p = 0.0002) and the V95% for the tumor bed with 1.5 cm margin remains unchanged. Conclusions: This study has demonstrated the feasibility of using a SVM-based algorithm to determine optimal beam placement without a physician's intervention. The proposed method reduced the dose to OARs, especially for supine treated patients, without any relevant degradation of dose homogeneity and coverage in general.

  7. Prediction of interactions between viral and host proteins using supervised machine learning methods.

    Directory of Open Access Journals (Sweden)

    Ranjan Kumar Barman

    Full Text Available BACKGROUND: Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs has great implication for therapeutics. METHODS: In this study, a systematic attempt has been made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information using viral-host PPIs from VirusMINT. The three well-known supervised machine learning methods, such as SVM, Naïve Bayes and Random Forest, which are commonly used in the prediction of PPIs, were employed to evaluate the performance measure based on five-fold cross validation techniques. RESULTS: Out of 44 descriptors, best features were found to be domain-domain association and methionine, serine and valine amino acid composition of viral proteins. In this study, SVM-based method achieved better sensitivity of 67% over Naïve Bayes (37.49% and Random Forest (55.66%. However the specificity of Naïve Bayes was the highest (99.52% as compared with SVM (74% and Random Forest (89.08%. Overall, the SVM and Random Forest achieved accuracy of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on blind dataset and attained a sensitivity of 64%, specificity of 83%, and accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs have been predicted through proposed SVM model and validated by gene ontology enrichment analysis. Our proposed model shows that, hepatitis B virus "C protein" binds to membrane docking protein, while "X protein" and "P protein" interacts with cell-killing and metabolic process proteins, respectively. CONCLUSION: The proposed method can predict large scale interspecies viral-human PPIs. The nature and function of unknown viral proteins (HBV and HEV, interacting partners of host

  8. Classifying features in CT imagery: accuracy for some single- and multiple-species classifiers

    Science.gov (United States)

    Daniel L. Schmoldt; Jing He; A. Lynn Abbott

    1998-01-01

    Our current approach to automatically label features in CT images of hardwood logs classifies each pixel of an image individually. These feature classifiers use a back-propagation artificial neural network (ANN) and feature vectors that include a small, local neighborhood of pixels and the distance of the target pixel to the center of the log. Initially, this type of...

  9. Support vector machine-based exergetic modelling of a DI diesel engine running on biodiesel–diesel blends containing expanded polystyrene

    International Nuclear Information System (INIS)

    Shamshirband, Shahaboddin; Tabatabaei, Meisam; Aghbashlo, Mortaza; Yee, Por Lip; Petković, Dalibor

    2016-01-01

    Highlights: • SVM-based thermodynamic modelling of a DI diesel engine working with diesel/biodiesel blends containing EPS. • Comparison of SVM-WT, SVM-FFA, SVM-RBF, SVM-QPSO, and ANN approaches for exergetic modelling of the engine. • Satisfactory performance of the SVM-WT for performance modelling of the engine over the other approaches. - Abstract: In the present study, four Support Vector Machine-based (SVM-based) approaches and the standard artificial neural network (ANN) model were designed and compared in modelling the exergetic parameters of a DI diesel engine running on diesel/biodiesel blends containing expanded polystyrene (EPS) wastes. For this aim, the SVM was coupled with discrete wavelet transform (SVM-WT), firefly algorithm (SVM-FFA), radial basis function (SVM-RBF) and quantum particle swarm optimization (SVM-QPSO). The exergetic data were computed using mass, energy, and exergy balance equations for the engine at different speeds and loads as well as various biodiesel and EPS wastes quantities. Three statistical indicators namely root means square error, coefficient of determination and Pearson coefficient were used to access the capability of the developed approaches for exergetic performance modelling of the DI diesel engine. The modelling results indicated that the SVM-WT approach was more efficient in exergetic modelling of the engine than the other three approaches. Moreover, the results obtained confirmed the effectiveness of the SVM-WT model in identifying the most exergy-efficient combustion conditions and the best fuel composition for achieving the most cost-effective and eco-friendly combustion process.

  10. Identification and optimization of classifier genes from multi-class earthworm microarray dataset.

    Directory of Open Access Journals (Sweden)

    Ying Li

    Full Text Available Monitoring, assessment and prediction of environmental risks that chemicals pose demand rapid and accurate diagnostic assays. A variety of toxicological effects have been associated with explosive compounds TNT and RDX. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. We have developed an earthworm microarray containing 15,208 unique oligo probes and have used it to profile gene expression in 248 earthworms exposed to TNT, RDX or neither. We assembled a new machine learning pipeline consisting of several well-established feature filtering/selection and classification techniques to analyze the 248-array dataset in order to construct classifier models that can separate earthworm samples into three groups: control, TNT-treated, and RDX-treated. First, a total of 869 genes differentially expressed in response to TNT or RDX exposure were identified using a univariate statistical algorithm of class comparison. Then, decision tree-based algorithms were applied to select a subset of 354 classifier genes, which were ranked by their overall weight of significance. A multiclass support vector machine (MC-SVM method and an unsupervised K-mean clustering method were applied to independently refine the classifier, producing a smaller subset of 39 and 30 classifier genes, separately, with 11 common genes being potential biomarkers. The combined 58 genes were considered the refined subset and used to build MC-SVM and clustering models with classification accuracy of 83.5% and 56.9%, respectively. This study demonstrates that the machine learning approach can be used to identify and optimize a small subset of classifier/biomarker genes from high dimensional datasets and generate classification models of acceptable precision for multiple classes.

  11. A Machine Learning Approach for Identifying and Classifying Faults in Wireless Sensor Networks

    NARCIS (Netherlands)

    Warriach, Ehsan Ullah; Aiello, Marco; Tei, Kenji

    2012-01-01

    Wireless Sensor Network (WSN) deployment experiences show that collected data is prone to be faulty. Faults are due to internal and external influences, such as calibration, low battery, environmental interference and sensor aging. However, only few solutions exist to deal with faulty sensory data

  12. Assessing machine learning classifiers for the detection of animals’ behavior using depth-based tracking

    NARCIS (Netherlands)

    Pons, Patricia; Jaen, Javier; Catala, Alejandro

    2017-01-01

    There is growing interest in the automatic detection of animals’ behaviors and body postures within the field of Animal Computer Interaction, and the benefits this could bring to animal welfare, enabling remote communication, welfare assessment, detection of behavioral patterns, interactive and

  13. Knowledge Discovery using Least Squares Support Vector Machine Classifiers: a Direct Marketing Case

    NARCIS (Netherlands)

    Viaene, S.; Baesens, B.; Van Gestel, T.; Suykens, J.A.K.; Van den Poel, D.; Vanthienen, J.; De Moor, B.; Dedene, G.; Zighed, D.A.; Komorowsky, J.; Żytkow, J.

    2000-01-01

    The case involves the detection and qualification of the most relevant predictors for repeat-purchase modelling in a direct marketing setting. Analysis is based on a wrapped form of feature selection using a sensitivity based pruning heuristic to guide a greedy, step-wise and backward traversal of

  14. Machine learning in jet physics

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    High energy collider experiments produce several petabytes of data every year. Given the magnitude and complexity of the raw data, machine learning algorithms provide the best available platform to transform and analyse these data to obtain valuable insights to understand Standard Model and Beyond Standard Model theories. These collider experiments produce both quark and gluon initiated hadronic jets as the core components. Deep learning techniques enable us to classify quark/gluon jets through image recognition and help us to differentiate signals and backgrounds in Beyond Standard Model searches at LHC. We are currently working on quark/gluon jet classification and progressing in our studies to find the bias between event generators using domain adversarial neural networks (DANN). We also plan to investigate top tagging, weak supervision on mixed samples in high energy physics, utilizing transfer learning from simulated data to real experimental data.

  15. Hard-Rock Stability Analysis for Span Design in Entry-Type Excavations with Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Esperanza García-Gonzalo

    2016-06-01

    Full Text Available The mining industry relies heavily on empirical analysis for design and prediction. An empirical design method, called the critical span graph, was developed specifically for rock stability analysis in entry-type excavations, based on an extensive case-history database of cut and fill mining in Canada. This empirical span design chart plots the critical span against rock mass rating for the observed case histories and has been accepted by many mining operations for the initial span design of cut and fill stopes. Different types of analysis have been used to classify the observed cases into stable, potentially unstable and unstable groups. The main purpose of this paper is to present a new method for defining rock stability areas of the critical span graph, which applies machine learning classifiers (support vector machine and extreme learning machine. The results show a reasonable correlation with previous guidelines. These machine learning methods are good tools for developing empirical methods, since they make no assumptions about the regression function. With this software, it is easy to add new field observations to a previous database, improving prediction output with the addition of data that consider the local conditions for each mine.

  16. Hard-Rock Stability Analysis for Span Design in Entry-Type Excavations with Learning Classifiers.

    Science.gov (United States)

    García-Gonzalo, Esperanza; Fernández-Muñiz, Zulima; García Nieto, Paulino José; Bernardo Sánchez, Antonio; Menéndez Fernández, Marta

    2016-06-29

    The mining industry relies heavily on empirical analysis for design and prediction. An empirical design method, called the critical span graph, was developed specifically for rock stability analysis in entry-type excavations, based on an extensive case-history database of cut and fill mining in Canada. This empirical span design chart plots the critical span against rock mass rating for the observed case histories and has been accepted by many mining operations for the initial span design of cut and fill stopes. Different types of analysis have been used to classify the observed cases into stable, potentially unstable and unstable groups. The main purpose of this paper is to present a new method for defining rock stability areas of the critical span graph, which applies machine learning classifiers (support vector machine and extreme learning machine). The results show a reasonable correlation with previous guidelines. These machine learning methods are good tools for developing empirical methods, since they make no assumptions about the regression function. With this software, it is easy to add new field observations to a previous database, improving prediction output with the addition of data that consider the local conditions for each mine.

  17. Vision based nutrient deficiency classification in maize plants using multi class support vector machines

    Science.gov (United States)

    Leena, N.; Saju, K. K.

    2018-04-01

    Nutritional deficiencies in plants are a major concern for farmers as it affects productivity and thus profit. The work aims to classify nutritional deficiencies in maize plant in a non-destructive mannerusing image processing and machine learning techniques. The colored images of the leaves are analyzed and classified with multi-class support vector machine (SVM) method. Several images of maize leaves with known deficiencies like nitrogen, phosphorous and potassium (NPK) are used to train the SVM classifier prior to the classification of test images. The results show that the method was able to classify and identify nutritional deficiencies.

  18. Disassembly and Sanitization of Classified Matter

    International Nuclear Information System (INIS)

    Stockham, Dwight J.; Saad, Max P.

    2008-01-01

    The Disassembly Sanitization Operation (DSO) process was implemented to support weapon disassembly and disposition by using recycling and waste minimization measures. This process was initiated by treaty agreements and reconfigurations within both the DOD and DOE Complexes. The DOE is faced with disassembling and disposing of a huge inventory of retired weapons, components, training equipment, spare parts, weapon maintenance equipment, and associated material. In addition, regulations have caused a dramatic increase in the need for information required to support the handling and disposition of these parts and materials. In the past, huge inventories of classified weapon components were required to have long-term storage at Sandia and at many other locations throughout the DoE Complex. These materials are placed in onsite storage unit due to classification issues and they may also contain radiological and/or hazardous components. Since no disposal options exist for this material, the only choice was long-term storage. Long-term storage is costly and somewhat problematic, requiring a secured storage area, monitoring, auditing, and presenting the potential for loss or theft of the material. Overall recycling rates for materials sent through the DSO process have enabled 70 to 80% of these components to be recycled. These components are made of high quality materials and once this material has been sanitized, the demand for the component metals for recycling efforts is very high. The DSO process for NGPF, classified components established the credibility of this technique for addressing the long-term storage requirements of the classified weapons component inventory. The success of this application has generated interest from other Sandia organizations and other locations throughout the complex. Other organizations are requesting the help of the DSO team and the DSO is responding to these requests by expanding its scope to include Work-for- Other projects. For example

  19. Predicting High Frequency Exchange Rates using Machine Learning

    OpenAIRE

    Palikuca, Aleksandar; Seidl,, Timo

    2016-01-01

    This thesis applies a committee of Artificial Neural Networks and Support Vector Machines on high-dimensional, high-frequency EUR/USD exchange rate data in an effort to predict directional market movements on up to a 60 second prediction horizon. The study shows that combining multiple classifiers into a committee produces improved precision relative to the best individual committee members and outperforms previously reported results. A trading simulation implementing the committee classifier...

  20. Pattern recognition & machine learning

    CERN Document Server

    Anzai, Y

    1992-01-01

    This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artifical intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. Basic for various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.

  1. Support vector machines applications

    CERN Document Server

    Guo, Guodong

    2014-01-01

    Support vector machines (SVM) have both a solid mathematical background and good performance in practical applications. This book focuses on the recent advances and applications of the SVM in different areas, such as image processing, medical practice, computer vision, pattern recognition, machine learning, applied statistics, business intelligence, and artificial intelligence. The aim of this book is to create a comprehensive source on support vector machine applications, especially some recent advances.

  2. The Newest Machine Material

    International Nuclear Information System (INIS)

    Seo, Yeong Seop; Choe, Byeong Do; Bang, Meong Sung

    2005-08-01

    This book gives descriptions of machine material with classification of machine material and selection of machine material, structure and connection of material, coagulation of metal and crystal structure, equilibrium diagram, properties of metal material, elasticity and plasticity, biopsy of metal, material test and nondestructive test. It also explains steel material such as heat treatment of steel, cast iron and cast steel, nonferrous metal materials, non metallic materials, and new materials.

  3. Introduction to machine learning

    OpenAIRE

    Baştanlar, Yalın; Özuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers make successful predictions using past experiences, has exhibited an impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning app...

  4. Machinability of advanced materials

    CERN Document Server

    Davim, J Paulo

    2014-01-01

    Machinability of Advanced Materials addresses the level of difficulty involved in machining a material, or multiple materials, with the appropriate tooling and cutting parameters.  A variety of factors determine a material's machinability, including tool life rate, cutting forces and power consumption, surface integrity, limiting rate of metal removal, and chip shape. These topics, among others, and multiple examples comprise this research resource for engineering students, academics, and practitioners.

  5. Machining of titanium alloys

    CERN Document Server

    2014-01-01

    This book presents a collection of examples illustrating the resent research advances in the machining of titanium alloys. These materials have excellent strength and fracture toughness as well as low density and good corrosion resistance; however, machinability is still poor due to their low thermal conductivity and high chemical reactivity with cutting tool materials. This book presents solutions to enhance machinability in titanium-based alloys and serves as a useful reference to professionals and researchers in aerospace, automotive and biomedical fields.

  6. Comparing cosmic web classifiers using information theory

    International Nuclear Information System (INIS)

    Leclercq, Florent; Lavaux, Guilhem; Wandelt, Benjamin; Jasche, Jens

    2016-01-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  7. Design of Robust Neural Network Classifiers

    DEFF Research Database (Denmark)

    Larsen, Jan; Andersen, Lars Nonboe; Hintz-Madsen, Mads

    1998-01-01

    This paper addresses a new framework for designing robust neural network classifiers. The network is optimized using the maximum a posteriori technique, i.e., the cost function is the sum of the log-likelihood and a regularization term (prior). In order to perform robust classification, we present...... a modified likelihood function which incorporates the potential risk of outliers in the data. This leads to the introduction of a new parameter, the outlier probability. Designing the neural classifier involves optimization of network weights as well as outlier probability and regularization parameters. We...... suggest to adapt the outlier probability and regularisation parameters by minimizing the error on a validation set, and a simple gradient descent scheme is derived. In addition, the framework allows for constructing a simple outlier detector. Experiments with artificial data demonstrate the potential...

  8. Comparing cosmic web classifiers using information theory

    Energy Technology Data Exchange (ETDEWEB)

    Leclercq, Florent [Institute of Cosmology and Gravitation (ICG), University of Portsmouth, Dennis Sciama Building, Burnaby Road, Portsmouth PO1 3FX (United Kingdom); Lavaux, Guilhem; Wandelt, Benjamin [Institut d' Astrophysique de Paris (IAP), UMR 7095, CNRS – UPMC Université Paris 6, Sorbonne Universités, 98bis boulevard Arago, F-75014 Paris (France); Jasche, Jens, E-mail: florent.leclercq@polytechnique.org, E-mail: lavaux@iap.fr, E-mail: j.jasche@tum.de, E-mail: wandelt@iap.fr [Excellence Cluster Universe, Technische Universität München, Boltzmannstrasse 2, D-85748 Garching (Germany)

    2016-08-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  9. Detection of Fundus Lesions Using Classifier Selection

    Science.gov (United States)

    Nagayoshi, Hiroto; Hiramatsu, Yoshitaka; Sako, Hiroshi; Himaga, Mitsutoshi; Kato, Satoshi

    A system for detecting fundus lesions caused by diabetic retinopathy from fundus images is being developed. The system can screen the images in advance in order to reduce the inspection workload on doctors. One of the difficulties that must be addressed in completing this system is how to remove false positives (which tend to arise near blood vessels) without decreasing the detection rate of lesions in other areas. To overcome this difficulty, we developed classifier selection according to the position of a candidate lesion, and we introduced new features that can distinguish true lesions from false positives. A system incorporating classifier selection and these new features was tested in experiments using 55 fundus images with some lesions and 223 images without lesions. The results of the experiments confirm the effectiveness of the proposed system, namely, degrees of sensitivity and specificity of 98% and 81%, respectively.

  10. Proposed hybrid-classifier ensemble algorithm to map snow cover area

    Science.gov (United States)

    Nijhawan, Rahul; Raman, Balasubramanian; Das, Josodhir

    2018-01-01

    Metaclassification ensemble approach is known to improve the prediction performance of snow-covered area. The methodology adopted in this case is based on neural network along with four state-of-art machine learning algorithms: support vector machine, artificial neural networks, spectral angle mapper, K-mean clustering, and a snow index: normalized difference snow index. An AdaBoost ensemble algorithm related to decision tree for snow-cover mapping is also proposed. According to available literature, these methods have been rarely used for snow-cover mapping. Employing the above techniques, a study was conducted for Raktavarn and Chaturangi Bamak glaciers, Uttarakhand, Himalaya using multispectral Landsat 7 ETM+ (enhanced thematic mapper) image. The study also compares the results with those obtained from statistical combination methods (majority rule and belief functions) and accuracies of individual classifiers. Accuracy assessment is performed by computing the quantity and allocation disagreement, analyzing statistic measures (accuracy, precision, specificity, AUC, and sensitivity) and receiver operating characteristic curves. A total of 225 combinations of parameters for individual classifiers were trained and tested on the dataset and results were compared with the proposed approach. It was observed that the proposed methodology produced the highest classification accuracy (95.21%), close to (94.01%) that was produced by the proposed AdaBoost ensemble algorithm. From the sets of observations, it was concluded that the ensemble of classifiers produced better results compared to individual classifiers.

  11. Comparison of Shallow and Deep Learning Methods on Classifying the Regional Pattern of Diffuse Lung Disease.

    Science.gov (United States)

    Kim, Guk Bae; Jung, Kyu-Hwan; Lee, Yeha; Kim, Hyun-Jun; Kim, Namkug; Jun, Sanghoon; Seo, Joon Beom; Lynch, David A

    2017-10-17

    This study aimed to compare shallow and deep learning of classifying the patterns of interstitial lung diseases (ILDs). Using high-resolution computed tomography images, two experienced radiologists marked 1200 regions of interest (ROIs), in which 600 ROIs were each acquired using a GE or Siemens scanner and each group of 600 ROIs consisted of 100 ROIs for subregions that included normal and five regional pulmonary disease patterns (ground-glass opacity, consolidation, reticular opacity, emphysema, and honeycombing). We employed the convolution neural network (CNN) with six learnable layers that consisted of four convolution layers and two fully connected layers. The classification results were compared with the results classified by a shallow learning of a support vector machine (SVM). The CNN classifier showed significantly better performance for accuracy compared with that of the SVM classifier by 6-9%. As the convolution layer increases, the classification accuracy of the CNN showed better performance from 81.27 to 95.12%. Especially in the cases showing pathological ambiguity such as between normal and emphysema cases or between honeycombing and reticular opacity cases, the increment of the convolution layer greatly drops the misclassification rate between each case. Conclusively, the CNN classifier showed significantly greater accuracy than the SVM classifier, and the results implied structural characteristics that are inherent to the specific ILD patterns.

  12. Ensembles of novelty detection classifiers for structural health monitoring using guided waves

    Science.gov (United States)

    Dib, Gerges; Karpenko, Oleksii; Koricho, Ermias; Khomenko, Anton; Haq, Mahmoodul; Udpa, Lalita

    2018-01-01

    Guided wave structural health monitoring uses sparse sensor networks embedded in sophisticated structures for defect detection and characterization. The biggest challenge of those sensor networks is developing robust techniques for reliable damage detection under changing environmental and operating conditions (EOC). To address this challenge, we develop a novelty classifier for damage detection based on one class support vector machines. We identify appropriate features for damage detection and introduce a feature aggregation method which quadratically increases the number of available training observations. We adopt a two-level voting scheme by using an ensemble of classifiers and predictions. Each classifier is trained on a different segment of the guided wave signal, and each classifier makes an ensemble of predictions based on a single observation. Using this approach, the classifier can be trained using a small number of baseline signals. We study the performance using Monte-Carlo simulations of an analytical model and data from impact damage experiments on a glass fiber composite plate. We also demonstrate the classifier performance using two types of baseline signals: fixed and rolling baseline training set. The former requires prior knowledge of baseline signals from all EOC, while the latter does not and leverages the fact that EOC vary slowly over time and can be modeled as a Gaussian process.

  13. Learning for VMM + WTA Embedded Classifiers

    Science.gov (United States)

    2016-03-31

    Learning for VMM + WTA Embedded Classifiers Jennifer Hasler and Sahil Shah Electrical and Computer Engineering Georgia Institute of Technology...enabling correct classification of each novel acoustic signal (generator, idle car, and idle truck ). The classification structure requires, after...measured on our SoC FPAA IC. The test input is composed of signals from urban environment for 3 objects (generator, idle car, and idle truck

  14. Bayes classifiers for imbalanced traffic accidents datasets.

    Science.gov (United States)

    Mujalli, Randa Oqab; López, Griselda; Garach, Laura

    2016-03-01

    Traffic accidents data sets are usually imbalanced, where the number of instances classified under the killed or severe injuries class (minority) is much lower than those classified under the slight injuries class (majority). This, however, supposes a challenging problem for classification algorithms and may cause obtaining a model that well cover the slight injuries instances whereas the killed or severe injuries instances are misclassified frequently. Based on traffic accidents data collected on urban and suburban roads in Jordan for three years (2009-2011); three different data balancing techniques were used: under-sampling which removes some instances of the majority class, oversampling which creates new instances of the minority class and a mix technique that combines both. In addition, different Bayes classifiers were compared for the different imbalanced and balanced data sets: Averaged One-Dependence Estimators, Weightily Average One-Dependence Estimators, and Bayesian networks in order to identify factors that affect the severity of an accident. The results indicated that using the balanced data sets, especially those created using oversampling techniques, with Bayesian networks improved classifying a traffic accident according to its severity and reduced the misclassification of killed and severe injuries instances. On the other hand, the following variables were found to contribute to the occurrence of a killed causality or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. This work, to the knowledge of the authors, is the first that aims at analyzing historical data records for traffic accidents occurring in Jordan and the first to apply balancing techniques to analyze injury severity of traffic accidents. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. A Bayesian classifier for symbol recognition

    OpenAIRE

    Barrat , Sabine; Tabbone , Salvatore; Nourrissier , Patrick

    2007-01-01

    URL : http://www.buyans.com/POL/UploadedFile/134_9977.pdf; International audience; We present in this paper an original adaptation of Bayesian networks to symbol recognition problem. More precisely, a descriptor combination method, which enables to improve significantly the recognition rate compared to the recognition rates obtained by each descriptor, is presented. In this perspective, we use a simple Bayesian classifier, called naive Bayes. In fact, probabilistic graphical models, more spec...

  16. Tribology in machine design

    CERN Document Server

    Stolarski, Tadeusz

    1999-01-01

    ""Tribology in Machine Design is strongly recommended for machine designers, and engineers and scientists interested in tribology. It should be in the engineering library of companies producing mechanical equipment.""Applied Mechanics ReviewTribology in Machine Design explains the role of tribology in the design of machine elements. It shows how algorithms developed from the basic principles of tribology can be used in a range of practical applications within mechanical devices and systems.The computer offers today's designer the possibility of greater stringen

  17. Induction machine handbook

    CERN Document Server

    Boldea, Ion

    2002-01-01

    Often called the workhorse of industry, the advent of power electronics and advances in digital control are transforming the induction motor into the racehorse of industrial motion control. Now, the classic texts on induction machines are nearly three decades old, while more recent books on electric motors lack the necessary depth and detail on induction machines.The Induction Machine Handbook fills industry's long-standing need for a comprehensive treatise embracing the many intricate facets of induction machine analysis and design. Moving gradually from simple to complex and from standard to

  18. Chaotic Boltzmann machines

    Science.gov (United States)

    Suzuki, Hideyuki; Imura, Jun-ichi; Horio, Yoshihiko; Aihara, Kazuyuki

    2013-01-01

    The chaotic Boltzmann machine proposed in this paper is a chaotic pseudo-billiard system that works as a Boltzmann machine. Chaotic Boltzmann machines are shown numerically to have computing abilities comparable to conventional (stochastic) Boltzmann machines. Since no randomness is required, efficient hardware implementation is expected. Moreover, the ferromagnetic phase transition of the Ising model is shown to be characterised by the largest Lyapunov exponent of the proposed system. In general, a method to relate probabilistic models to nonlinear dynamics by derandomising Gibbs sampling is presented. PMID:23558425

  19. Electrical machines & drives

    CERN Document Server

    Hammond, P

    1985-01-01

    Containing approximately 200 problems (100 worked), the text covers a wide range of topics concerning electrical machines, placing particular emphasis upon electrical-machine drive applications. The theory is concisely reviewed and focuses on features common to all machine types. The problems are arranged in order of increasing levels of complexity and discussions of the solutions are included where appropriate to illustrate the engineering implications. This second edition includes an important new chapter on mathematical and computer simulation of machine systems and revised discussions o

  20. Nanocomposites for Machining Tools

    Directory of Open Access Journals (Sweden)

    Daria Sidorenko

    2017-10-01

    Full Text Available Machining tools are used in many areas of production. To a considerable extent, the performance characteristics of the tools determine the quality and cost of obtained products. The main materials used for producing machining tools are steel, cemented carbides, ceramics and superhard materials. A promising way to improve the performance characteristics of these materials is to design new nanocomposites based on them. The application of micromechanical modeling during the elaboration of composite materials for machining tools can reduce the financial and time costs for development of new tools, with enhanced performance. This article reviews the main groups of nanocomposites for machining tools and their performance.

  1. Machine listening intelligence

    Science.gov (United States)

    Cella, C. E.

    2017-05-01

    This manifesto paper will introduce machine listening intelligence, an integrated research framework for acoustic and musical signals modelling, based on signal processing, deep learning and computational musicology.

  2. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2013-01-01

    Written as a tutorial to explore and understand the power of R for machine learning. This practical guide that covers all of the need to know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks.Intended for those who want to learn how to use R's machine learning capabilities and gain insight from your data. Perhaps you already know a bit about machine learning, but have never used R; or

  3. Rotating electrical machines

    CERN Document Server

    Le Doeuff, René

    2013-01-01

    In this book a general matrix-based approach to modeling electrical machines is promulgated. The model uses instantaneous quantities for key variables and enables the user to easily take into account associations between rotating machines and static converters (such as in variable speed drives).   General equations of electromechanical energy conversion are established early in the treatment of the topic and then applied to synchronous, induction and DC machines. The primary characteristics of these machines are established for steady state behavior as well as for variable speed scenarios. I

  4. Feature weighting using particle swarm optimization for learning vector quantization classifier

    Science.gov (United States)

    Dongoran, A.; Rahmadani, S.; Zarlis, M.; Zakarias

    2018-03-01

    This paper discusses and proposes a method of feature weighting in classification assignments on competitive learning artificial neural network LVQ. The weighting feature method is the search for the weight of an attribute using the PSO so as to give effect to the resulting output. This method is then applied to the LVQ-Classifier and tested on the 3 datasets obtained from the UCI Machine Learning repository. Then an accuracy analysis will be generated by two approaches. The first approach using LVQ1, referred to as LVQ-Classifier and the second approach referred to as PSOFW-LVQ, is a proposed model. The result shows that the PSO algorithm is capable of finding attribute weights that increase LVQ-classifier accuracy.

  5. Optimization of short amino acid sequences classifier

    Science.gov (United States)

    Barcz, Aleksy; Szymański, Zbigniew

    This article describes processing methods used for short amino acid sequences classification. The data processed are 9-symbols string representations of amino acid sequences, divided into 49 data sets - each one containing samples labeled as reacting or not with given enzyme. The goal of the classification is to determine for a single enzyme, whether an amino acid sequence would react with it or not. Each data set is processed separately. Feature selection is performed to reduce the number of dimensions for each data set. The method used for feature selection consists of two phases. During the first phase, significant positions are selected using Classification and Regression Trees. Afterwards, symbols appearing at the selected positions are substituted with numeric values of amino acid properties taken from the AAindex database. In the second phase the new set of features is reduced using a correlation-based ranking formula and Gram-Schmidt orthogonalization. Finally, the preprocessed data is used for training LS-SVM classifiers. SPDE, an evolutionary algorithm, is used to obtain optimal hyperparameters for the LS-SVM classifier, such as error penalty parameter C and kernel-specific hyperparameters. A simple score penalty is used to adapt the SPDE algorithm to the task of selecting classifiers with best performance measures values.

  6. Are there intelligent Turing machines?

    OpenAIRE

    Bátfai, Norbert

    2015-01-01

    This paper introduces a new computing model based on the cooperation among Turing machines called orchestrated machines. Like universal Turing machines, orchestrated machines are also designed to simulate Turing machines but they can also modify the original operation of the included Turing machines to create a new layer of some kind of collective behavior. Using this new model we can define some interested notions related to cooperation ability of Turing machines such as the intelligence quo...

  7. Experiments with the Dragon Machine

    International Nuclear Information System (INIS)

    Malenfant, R.E.

    2005-01-01

    The basic characteristics of a self-sustaining chain reaction were demonstrated with the Chicago Pile in 1943, but it was not until early 1945 that sufficient enriched material became available to experimentally verify fast-neutron cross-sections and the kinetic characteristics of a nuclear chain reaction sustained with prompt neutrons alone. However, the demands of wartime and the rapid decline in effort following the cessation of hostilities often resulted in the failure to fully document the experiments or in the loss of documentation as personnel returned to civilian pursuits. When documented, the results were often highly classified. Even when eventually declassified, the data were often not approved for public release until years later.2 Even after declassification and approval for public release, the records are sometimes difficult to find. Through a fortuitous discovery, a set of handwritten notes by ''ORF July 1945'' entitled ''Dragon - Research with a Pulsed Fission Reactor'' was found by William L. Myers in an old storage safe at Pajarito Site of the Los Alamos National Laboratory3. Of course, ORF was identified as Otto R. Frisch. The document was attached to a page in a nondescript spiral bound notebook labeled ''494 Book'' that bore the signatures of Louis Slotin and P. Morrison. The notes also reference an ''Idea LS'' that can only be Louis Slotin. The discovery of the notes led to a search of Laboratory Archives, the negative files of the photo lab, and the Report Library for additional details of the experiments with the Dragon machine that were conducted between January and July 1945. The assembly machine and the experiments were carefully conceived and skillfully executed. The analyses--without the crutch of computers--display real insight into the characteristics of the nuclear chain reaction. The information presented here provides what is believed to be a complete collection of the original documentation of the observations made with the Dragon

  8. Building gene expression profile classifiers with a simple and efficient rejection option in R.

    Science.gov (United States)

    Benso, Alfredo; Di Carlo, Stefano; Politano, Gianfranco; Savino, Alessandro; Hafeezurrehman, Hafeez

    2011-01-01

    The collection of gene expression profiles from DNA microarrays and their analysis with pattern recognition algorithms is a powerful technology applied to several biological problems. Common pattern recognition systems classify samples assigning them to a set of known classes. However, in a clinical diagnostics setup, novel and unknown classes (new pathologies) may appear and one must be able to reject those samples that do not fit the trained model. The problem of implementing a rejection option in a multi-class classifier has not been widely addressed in the statistical literature. Gene expression profiles represent a critical case study since they suffer from the curse of dimensionality problem that negatively reflects on the reliability of both traditional rejection models and also more recent approaches such as one-class classifiers. This paper presents a set of empirical decision rules that can be used to implement a rejection option in a set of multi-class classifiers widely used for the analysis of gene expression profiles. In particular, we focus on the classifiers implemented in the R Language and Environment for Statistical Computing (R for short in the remaining of this paper). The main contribution of the proposed rules is their simplicity, which enables an easy integration with available data analysis environments. Since in the definition of a rejection model tuning of the involved parameters is often a complex and delicate task, in this paper we exploit an evolutionary strategy to automate this process. This allows the final user to maximize the rejection accuracy with minimum manual intervention. This paper shows how the use of simple decision rules can be used to help the use of complex machine learning algorithms in real experimental setups. The proposed approach is almost completely automated and therefore a good candidate for being integrated in data analysis flows in labs where the machine learning expertise required to tune traditional

  9. Using human brain activity to guide machine learning.

    Science.gov (United States)

    Fong, Ruth C; Scheirer, Walter J; Cox, David D

    2018-03-29

    Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of "neurally-weighted" machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain. After training, these neurally-weighted classifiers are able to classify images without requiring any additional neural data. We show that our neural-weighting approach can lead to large performance gains when used with traditional machine vision features, as well as to significant improvements with already high-performing convolutional neural network features. The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.

  10. Component Pin Recognition Using Algorithms Based on Machine Learning

    Science.gov (United States)

    Xiao, Yang; Hu, Hong; Liu, Ze; Xu, Jiangchang

    2018-04-01

    The purpose of machine vision for a plug-in machine is to improve the machine’s stability and accuracy, and recognition of the component pin is an important part of the vision. This paper focuses on component pin recognition using three different techniques. The first technique involves traditional image processing using the core algorithm for binary large object (BLOB) analysis. The second technique uses the histogram of oriented gradients (HOG), to experimentally compare the effect of the support vector machine (SVM) and the adaptive boosting machine (AdaBoost) learning meta-algorithm classifiers. The third technique is the use of an in-depth learning method known as convolution neural network (CNN), which involves identifying the pin by comparing a sample to its training. The main purpose of the research presented in this paper is to increase the knowledge of learning methods used in the plug-in machine industry in order to achieve better results.

  11. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach

    OpenAIRE

    Weng, Wei-Hung; Wagholikar, Kavishwar B.; McCray, Alexa T.; Szolovits, Peter; Chueh, Henry C.

    2017-01-01

    Background The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. Methods We constructed the pipeline using the clinical ...

  12. Exploration of machine learning techniques in predicting multiple sclerosis disease course

    OpenAIRE

    Zhao, Yijun; Healy, Brian C.; Rotstein, Dalia; Guttmann, Charles R. G.; Bakshi, Rohit; Weiner, Howard L.; Brodley, Carla E.; Chitnis, Tanuja

    2017-01-01

    Objective To explore the value of machine learning methods for predicting multiple sclerosis disease course. Methods 1693 CLIMB study patients were classified as increased EDSS?1.5 (worsening) or not (non-worsening) at up to five years after baseline visit. Support vector machines (SVM) were used to build the classifier, and compared to logistic regression (LR) using demographic, clinical and MRI data obtained at years one and two to predict EDSS at five years follow-up. Results Baseline data...

  13. Robust Framework to Combine Diverse Classifiers Assigning Distributed Confidence to Individual Classifiers at Class Level

    Directory of Open Access Journals (Sweden)

    Shehzad Khalid

    2014-01-01

    Full Text Available We have presented a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modeling is presented that generates model of various classes whilst identifying and filtering noisy training data. This noise free data is further used to learn model for other classifiers such as GMM and SVM. A weight learning method is then introduced to learn weights on each class for different classifiers to construct an ensemble. For this purpose, we applied genetic algorithm to search for an optimal weight vector on which classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on variety of real life datasets. It is also compared with existing standard ensemble techniques such as Adaboost, Bagging, and Random Subspace Methods. Experimental results show the superiority of proposed ensemble method as compared to its competitors, especially in the presence of class label noise and imbalance classes.

  14. The Protection of Classified Information: The Legal Framework

    National Research Council Canada - National Science Library

    Elsea, Jennifer K

    2006-01-01

    Recent incidents involving leaks of classified information have heightened interest in the legal framework that governs security classification, access to classified information, and penalties for improper disclosure...

  15. Microsoft Azure machine learning

    CERN Document Server

    Mund, Sumit

    2015-01-01

    The book is intended for those who want to learn how to use Azure Machine Learning. Perhaps you already know a bit about Machine Learning, but have never used ML Studio in Azure; or perhaps you are an absolute newbie. In either case, this book will get you up-and-running quickly.

  16. The Hooey Machine.

    Science.gov (United States)

    Scarnati, James T.; Tice, Craig J.

    1992-01-01

    Describes how students can make and use Hooey Machines to learn how mechanical energy can be transferred from one object to another within a system. The Hooey Machine is made using a pencil, eight thumbtacks, one pushpin, tape, scissors, graph paper, and a plastic lid. (PR)

  17. Nanocomposites for Machining Tools

    DEFF Research Database (Denmark)

    Sidorenko, Daria; Loginov, Pavel; Mishnaevsky, Leon

    2017-01-01

    Machining tools are used in many areas of production. To a considerable extent, the performance characteristics of the tools determine the quality and cost of obtained products. The main materials used for producing machining tools are steel, cemented carbides, ceramics and superhard materials...

  18. A nucleonic weighing machine

    International Nuclear Information System (INIS)

    Anon.

    1978-01-01

    The design and operation of a nucleonic weighing machine fabricated for continuous weighing of material over conveyor belt are described. The machine uses a 40 mCi cesium-137 line source and a 10 litre capacity ionization chamber. It is easy to maintain as there are no moving parts. It can also be easily removed and reinstalled. (M.G.B.)

  19. An asymptotical machine

    Science.gov (United States)

    Cristallini, Achille

    2016-07-01

    A new and intriguing machine may be obtained replacing the moving pulley of a gun tackle with a fixed point in the rope. Its most important feature is the asymptotic efficiency. Here we obtain a satisfactory description of this machine by means of vector calculus and elementary trigonometry. The mathematical model has been compared with experimental data and briefly discussed.

  20. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2015-01-01

    Perhaps you already know a bit about machine learning but have never used R, or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

  1. The deleuzian abstract machines

    DEFF Research Database (Denmark)

    Werner Petersen, Erik

    2005-01-01

    To most people the concept of abstract machines is connected to the name of Alan Turing and the development of the modern computer. The Turing machine is universal, axiomatic and symbolic (E.g. operating on symbols). Inspired by Foucault, Deleuze and Guattari extended the concept of abstract...

  2. Human Machine Learning Symbiosis

    Science.gov (United States)

    Walsh, Kenneth R.; Hoque, Md Tamjidul; Williams, Kim H.

    2017-01-01

    Human Machine Learning Symbiosis is a cooperative system where both the human learner and the machine learner learn from each other to create an effective and efficient learning environment adapted to the needs of the human learner. Such a system can be used in online learning modules so that the modules adapt to each learner's learning state both…

  3. Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients

    Science.gov (United States)

    Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad

    2010-12-01

    There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.

  4. Stability of halophilic proteins: from dipeptide attributes to discrimination classifier.

    Science.gov (United States)

    Zhang, Guangya; Huihua, Ge; Yi, Lin

    2013-02-01

    To investigate the molecular features responsible for protein halophilicity is of great significance for understanding the structure basis of protein halo-stability and would help to develop a practical strategy for designing halophilic proteins. In this work, we have systematically analyzed the dipeptide composition of the halophilic and non-halophilic protein sequences. We observed the halophilic proteins contained more DA, RA, AD, RR, AP, DD, PD, EA, VG and DV at the expense of LK, IL, II, IA, KK, IS, KA, GK, RK and AI. We identified some macromolecular signatures of halo-adaptation, and thought the dipeptide composition might contain more information than amino acid composition. Based on the dipeptide composition, we have developed a machine learning method for classifying halophilic and non-halophilic proteins for the first time. The accuracy of our method for the training dataset was 100.0%, and for the 10-fold cross-validation was 93.1%. We also discussed the influence of some specific dipeptides on prediction accuracy. Copyright © 2012 Elsevier B.V. All rights reserved.

  5. Addressing the Challenge of Defining Valid Proteomic Biomarkers and Classifiers

    LENUS (Irish Health Repository)

    Dakna, Mohammed

    2010-12-10

    Abstract Background The purpose of this manuscript is to provide, based on an extensive analysis of a proteomic data set, suggestions for proper statistical analysis for the discovery of sets of clinically relevant biomarkers. As tractable example we define the measurable proteomic differences between apparently healthy adult males and females. We choose urine as body-fluid of interest and CE-MS, a thoroughly validated platform technology, allowing for routine analysis of a large number of samples. The second urine of the morning was collected from apparently healthy male and female volunteers (aged 21-40) in the course of the routine medical check-up before recruitment at the Hannover Medical School. Results We found that the Wilcoxon-test is best suited for the definition of potential biomarkers. Adjustment for multiple testing is necessary. Sample size estimation can be performed based on a small number of observations via resampling from pilot data. Machine learning algorithms appear ideally suited to generate classifiers. Assessment of any results in an independent test-set is essential. Conclusions Valid proteomic biomarkers for diagnosis and prognosis only can be defined by applying proper statistical data mining procedures. In particular, a justification of the sample size should be part of the study design.

  6. An Informed Framework for Training Classifiers from Social Media

    Directory of Open Access Journals (Sweden)

    Dong Seon Cheng

    2016-04-01

    Full Text Available Extracting information from social media has become a major focus of companies and researchers in recent years. Aside from the study of the social aspects, it has also been found feasible to exploit the collaborative strength of crowds to help solve classical machine learning problems like object recognition. In this work, we focus on the generally underappreciated problem of building effective datasets for training classifiers by automatically assembling data from social media. We detail some of the challenges of this approach and outline a framework that uses expanded search queries to retrieve more qualified data. In particular, we concentrate on collaboratively tagged media on the social platform Flickr, and on the problem of image classification to evaluate our approach. Finally, we describe a novel entropy-based method to incorporate an information-theoretic principle to guide our framework. Experimental validation against well-known public datasets shows the viability of this approach and marks an improvement over the state of the art in terms of simplicity and performance.

  7. Online Feature Selection for Classifying Emphysema in HRCT Images

    Directory of Open Access Journals (Sweden)

    M. Prasad

    2008-06-01

    Full Text Available Feature subset selection, applied as a pre- processing step to machine learning, is valuable in dimensionality reduction, eliminating irrelevant data and improving classifier performance. In the classic formulation of the feature selection problem, it is assumed that all the features are available at the beginning. However, in many real world problems, there are scenarios where not all features are present initially and must be integrated as they become available. In such scenarios, online feature selection provides an efficient way to sort through a large space of features. It is in this context that we introduce online feature selection for the classification of emphysema, a smoking related disease that appears as low attenuation regions in High Resolution Computer Tomography (HRCT images. The technique was successfully evaluated on 61 HRCT scans and compared with different online feature selection approaches, including hill climbing, best first search, grafting, and correlation-based feature selection. The results were also compared against ldensity maskr, a standard approach used for emphysema detection in medical image analysis.

  8. Instance Selection for Classifier Performance Estimation in Meta Learning

    Directory of Open Access Journals (Sweden)

    Marcin Blachnik

    2017-11-01

    Full Text Available Building an accurate prediction model is challenging and requires appropriate model selection. This process is very time consuming but can be accelerated with meta-learning–automatic model recommendation by estimating the performances of given prediction models without training them. Meta-learning utilizes metadata extracted from the dataset to effectively estimate the accuracy of the model in question. To achieve that goal, metadata descriptors must be gathered efficiently and must be informative to allow the precise estimation of prediction accuracy. In this paper, a new type of metadata descriptors is analyzed. These descriptors are based on the compression level obtained from the instance selection methods at the data-preprocessing stage. To verify their suitability, two types of experiments on real-world datasets have been conducted. In the first one, 11 instance selection methods were examined in order to validate the compression–accuracy relation for three classifiers: k-nearest neighbors (kNN, support vector machine (SVM, and random forest. From this analysis, two methods are recommended (instance-based learning type 2 (IB2, and edited nearest neighbor (ENN which are then compared with the state-of-the-art metaset descriptors. The obtained results confirm that the two suggested compression-based meta-features help to predict accuracy of the base model much more accurately than the state-of-the-art solution.

  9. Precision machining commercialization

    International Nuclear Information System (INIS)

    1978-01-01

    To accelerate precision machining development so as to realize more of the potential savings within the next few years of known Department of Defense (DOD) part procurement, the Air Force Materials Laboratory (AFML) is sponsoring the Precision Machining Commercialization Project (PMC). PMC is part of the Tri-Service Precision Machine Tool Program of the DOD Manufacturing Technology Five-Year Plan. The technical resources supporting PMC are provided under sponsorship of the Department of Energy (DOE). The goal of PMC is to minimize precision machining development time and cost risk for interested vendors. PMC will do this by making available the high precision machining technology as developed in two DOE contractor facilities, the Lawrence Livermore Laboratory of the University of California and the Union Carbide Corporation, Nuclear Division, Y-12 Plant, at Oak Ridge, Tennessee

  10. Introduction to machine learning.

    Science.gov (United States)

    Baştanlar, Yalin; Ozuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers make successful predictions using past experiences, has exhibited an impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning such as feature assessment, unsupervised versus supervised learning and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods.

  11. LHC Report: machine development

    CERN Multimedia

    Rogelio Tomás García for the LHC team

    2015-01-01

    Machine development weeks are carefully planned in the LHC operation schedule to optimise and further study the performance of the machine. The first machine development session of Run 2 ended on Saturday, 25 July. Despite various hiccoughs, it allowed the operators to make great strides towards improving the long-term performance of the LHC.   The main goals of this first machine development (MD) week were to determine the minimum beam-spot size at the interaction points given existing optics and collimation constraints; to test new beam instrumentation; to evaluate the effectiveness of performing part of the beam-squeezing process during the energy ramp; and to explore the limits on the number of protons per bunch arising from the electromagnetic interactions with the accelerator environment and the other beam. Unfortunately, a series of events reduced the machine availability for studies to about 50%. The most critical issue was the recurrent trip of a sextupolar corrector circuit –...

  12. Classifying spaces of degenerating polarized Hodge structures

    CERN Document Server

    Kato, Kazuya

    2009-01-01

    In 1970, Phillip Griffiths envisioned that points at infinity could be added to the classifying space D of polarized Hodge structures. In this book, Kazuya Kato and Sampei Usui realize this dream by creating a logarithmic Hodge theory. They use the logarithmic structures begun by Fontaine-Illusie to revive nilpotent orbits as a logarithmic Hodge structure. The book focuses on two principal topics. First, Kato and Usui construct the fine moduli space of polarized logarithmic Hodge structures with additional structures. Even for a Hermitian symmetric domain D, the present theory is a refinem

  13. Gearbox Condition Monitoring Using Advanced Classifiers

    Directory of Open Access Journals (Sweden)

    P. Večeř

    2010-01-01

    Full Text Available New efficient and reliable methods for gearbox diagnostics are needed in automotive industry because of growing demand for production quality. This paper presents the application of two different classifiers for gearbox diagnostics – Kohonen Neural Networks and the Adaptive-Network-based Fuzzy Interface System (ANFIS. Two different practical applications are presented. In the first application, the tested gearboxes are separated into two classes according to their condition indicators. In the second example, ANFIS is applied to label the tested gearboxes with a Quality Index according to the condition indicators. In both applications, the condition indicators were computed from the vibration of the gearbox housing. 

  14. Cubical sets as a classifying topos

    DEFF Research Database (Denmark)

    Spitters, Bas

    Coquand’s cubical set model for homotopy type theory provides the basis for a computational interpretation of the univalence axiom and some higher inductive types, as implemented in the cubical proof assistant. We show that the underlying cube category is the opposite of the Lawvere theory of De...... Morgan algebras. The topos of cubical sets itself classifies the theory of ‘free De Morgan algebras’. This provides us with a topos with an internal ‘interval’. Using this interval we construct a model of type theory following van den Berg and Garner. We are currently investigating the precise relation...

  15. Double Ramp Loss Based Reject Option Classifier

    Science.gov (United States)

    2015-05-22

    of convex (DC) functions. To minimize it, we use DC programming approach [1]. The proposed method has following advantages: (1) the proposed loss LDR ...space constraints. We see that LDR does not put any restriction on ρ for it to be an upper bound of L0−d−1. 2.2 Risk Formulation Using LDR Let S = {(xn...classifier learnt using LDR based approach (C = 100, μ = 1, d = .2). Filled circles and triangles represent the support vectors. 4 Experimental Results We show

  16. A review of machine learning in obesity.

    Science.gov (United States)

    DeGregory, K W; Kuiper, P; DeSilvio, T; Pleuss, J D; Miller, R; Roginski, J W; Fisher, C B; Harness, D; Viswanath, S; Heymsfield, S B; Dungan, I; Thomas, D M

    2018-05-01

    Rich sources of obesity-related data arising from sensors, smartphone apps, electronic medical health records and insurance data can bring new insights for understanding, preventing and treating obesity. For such large datasets, machine learning provides sophisticated and elegant tools to describe, classify and predict obesity-related risks and outcomes. Here, we review machine learning methods that predict and/or classify such as linear and logistic regression, artificial neural networks, deep learning and decision tree analysis. We also review methods that describe and characterize data such as cluster analysis, principal component analysis, network science and topological data analysis. We introduce each method with a high-level overview followed by examples of successful applications. The algorithms were then applied to National Health and Nutrition Examination Survey to demonstrate methodology, utility and outcomes. The strengths and limitations of each method were also evaluated. This summary of machine learning algorithms provides a unique overview of the state of data analysis applied specifically to obesity. © 2018 World Obesity Federation.

  17. Machine learning search for variable stars

    Science.gov (United States)

    Pashchenko, Ilya N.; Sokolovsky, Kirill V.; Gavras, Panagiotis

    2018-04-01

    Photometric variability detection is often considered as a hypothesis testing problem: an object is variable if the null hypothesis that its brightness is constant can be ruled out given the measurements and their uncertainties. The practical applicability of this approach is limited by uncorrected systematic errors. We propose a new variability detection technique sensitive to a wide range of variability types while being robust to outliers and underestimated measurement uncertainties. We consider variability detection as a classification problem that can be approached with machine learning. Logistic Regression (LR), Support Vector Machines (SVM), k Nearest Neighbours (kNN), Neural Nets (NN), Random Forests (RF), and Stochastic Gradient Boosting classifier (SGB) are applied to 18 features (variability indices) quantifying scatter and/or correlation between points in a light curve. We use a subset of Optical Gravitational Lensing Experiment phase two (OGLE-II) Large Magellanic Cloud (LMC) photometry (30 265 light curves) that was searched for variability using traditional methods (168 known variable objects) as the training set and then apply the NN to a new test set of 31 798 OGLE-II LMC light curves. Among 205 candidates selected in the test set, 178 are real variables, while 13 low-amplitude variables are new discoveries. The machine learning classifiers considered are found to be more efficient (select more variables and fewer false candidates) compared to traditional techniques using individual variability indices or their linear combination. The NN, SGB, SVM, and RF show a higher efficiency compared to LR and kNN.

  18. Classifying Coding DNA with Nucleotide Statistics

    Directory of Open Access Journals (Sweden)

    Nicolas Carels

    2009-10-01

    Full Text Available In this report, we compared the success rate of classification of coding sequences (CDS vs. introns by Codon Structure Factor (CSF and by a method that we called Universal Feature Method (UFM. UFM is based on the scoring of purine bias (Rrr and stop codon frequency. We show that the success rate of CDS/intron classification by UFM is higher than by CSF. UFM classifies ORFs as coding or non-coding through a score based on (i the stop codon distribution, (ii the product of purine probabilities in the three positions of nucleotide triplets, (iii the product of Cytosine (C, Guanine (G, and Adenine (A probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, (iv the probabilities of G in 1st and 2nd position of triplets and (v the distance of their GC3 vs. GC2 levels to the regression line of the universal correlation. More than 80% of CDSs (true positives of Homo sapiens (>250 bp, Drosophila melanogaster (>250 bp and Arabidopsis thaliana (>200 bp are successfully classified with a false positive rate lower or equal to 5%. The method releases coding sequences in their coding strand and coding frame, which allows their automatic translation into protein sequences with 95% confidence. The method is a natural consequence of the compositional bias of nucleotides in coding sequences.

  19. STATISTICAL TOOLS FOR CLASSIFYING GALAXY GROUP DYNAMICS

    International Nuclear Information System (INIS)

    Hou, Annie; Parker, Laura C.; Harris, William E.; Wilman, David J.

    2009-01-01

    The dynamical state of galaxy groups at intermediate redshifts can provide information about the growth of structure in the universe. We examine three goodness-of-fit tests, the Anderson-Darling (A-D), Kolmogorov, and χ 2 tests, in order to determine which statistical tool is best able to distinguish between groups that are relaxed and those that are dynamically complex. We perform Monte Carlo simulations of these three tests and show that the χ 2 test is profoundly unreliable for groups with fewer than 30 members. Power studies of the Kolmogorov and A-D tests are conducted to test their robustness for various sample sizes. We then apply these tests to a sample of the second Canadian Network for Observational Cosmology Redshift Survey (CNOC2) galaxy groups and find that the A-D test is far more reliable and powerful at detecting real departures from an underlying Gaussian distribution than the more commonly used χ 2 and Kolmogorov tests. We use this statistic to classify a sample of the CNOC2 groups and find that 34 of 106 groups are inconsistent with an underlying Gaussian velocity distribution, and thus do not appear relaxed. In addition, we compute velocity dispersion profiles (VDPs) for all groups with more than 20 members and compare the overall features of the Gaussian and non-Gaussian groups, finding that the VDPs of the non-Gaussian groups are distinct from those classified as Gaussian.

  20. Mercury⊕: An evidential reasoning image classifier

    Science.gov (United States)

    Peddle, Derek R.

    1995-12-01

    MERCURY⊕ is a multisource evidential reasoning classification software system based on the Dempster-Shafer theory of evidence. The design and implementation of this software package is described for improving the classification and analysis of multisource digital image data necessary for addressing advanced environmental and geoscience applications. In the remote-sensing context, the approach provides a more appropriate framework for classifying modern, multisource, and ancillary data sets which may contain a large number of disparate variables with different statistical properties, scales of measurement, and levels of error which cannot be handled using conventional Bayesian approaches. The software uses a nonparametric, supervised approach to classification, and provides a more objective and flexible interface to the evidential reasoning framework using a frequency-based method for computing support values from training data. The MERCURY⊕ software package has been implemented efficiently in the C programming language, with extensive use made of dynamic memory allocation procedures and compound linked list and hash-table data structures to optimize the storage and retrieval of evidence in a Knowledge Look-up Table. The software is complete with a full user interface and runs under Unix, Ultrix, VAX/VMS, MS-DOS, and Apple Macintosh operating system. An example of classifying alpine land cover and permafrost active layer depth in northern Canada is presented to illustrate the use and application of these ideas.

  1. Identifying saltcedar with hyperspectral data and support vector machines

    Science.gov (United States)

    Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...

  2. Modeling a ground-coupled heat pump system by a support vector machine

    Energy Technology Data Exchange (ETDEWEB)

    Esen, Hikmet; Esen, Mehmet [Department of Mechanical Education, Faculty of Technical Education, Firat University, 23119 Elazig (Turkey); Inalli, Mustafa [Department of Mechanical Engineering, Faculty of Engineering, Firat University, 23279 Elazig (Turkey); Sengur, Abdulkadir [Department of Electronic and Computer Science, Faculty of Technical Education, Firat University, 23119 Elazig (Turkey)

    2008-08-15

    This paper reports on a modeling study of ground coupled heat pump (GCHP) system performance (COP) by using a support vector machine (SVM) method. A GCHP system is a multi-variable system that is hard to model by conventional methods. As regards the SVM, it has a superior capability for generalization, and this capability is independent of the dimensionality of the input data. In this study, a SVM based method was intended to adopt GCHP system for efficient modeling. The Lin-kernel SVM method was quite efficient in modeling purposes and did not require a pre-knowledge about the system. The performance of the proposed methodology was evaluated by using several statistical validation parameters. It is found that the root-mean squared (RMS) value is 0.002722, the coefficient of multiple determinations (R{sup 2}) value is 0.999999, coefficient of variation (cov) value is 0.077295, and mean error function (MEF) value is 0.507437 for the proposed Lin-kernel SVM method. The optimum parameters of the SVM method were determined by using a greedy search algorithm. This search algorithm was effective for obtaining the optimum parameters. The simulation results show that the SVM is a good method for prediction of the COP of the GCHP system. The computation of SVM model is faster compared with other machine learning techniques (artificial neural networks (ANN) and adaptive neuro-fuzzy inference system (ANFIS)); because there are fewer free parameters and only support vectors (only a fraction of all data) are used in the generalization process. (author)

  3. ClearTK 2.0: Design Patterns for Machine Learning in UIMA

    OpenAIRE

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-01-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, r...

  4. Automatic Task Classification via Support Vector Machine and Crowdsourcing

    Directory of Open Access Journals (Sweden)

    Hyungsik Shin

    2018-01-01

    Full Text Available Automatic task classification is a core part of personal assistant systems that are widely used in mobile devices such as smartphones and tablets. Even though many industry leaders are providing their own personal assistant services, their proprietary internals and implementations are not well known to the public. In this work, we show through real implementation and evaluation that automatic task classification can be implemented for mobile devices by using the support vector machine algorithm and crowdsourcing. To train our task classifier, we collected our training data set via crowdsourcing using the Amazon Mechanical Turk platform. Our classifier can classify a short English sentence into one of the thirty-two predefined tasks that are frequently requested while using personal mobile devices. Evaluation results show high prediction accuracy of our classifier ranging from 82% to 99%. By using large amount of crowdsourced data, we also illustrate the relationship between training data size and the prediction accuracy of our task classifier.

  5. Machine Learning and Radiology

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M.

    2012-01-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077

  6. Machine learning and radiology.

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M

    2012-07-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. Copyright © 2012. Published by Elsevier B.V.

  7. Des Machines pour ecrire (Writing Machines).

    Science.gov (United States)

    Mirande, Corinne

    1996-01-01

    A variety of formats for writing news articles and announcements on a variety of common topics (current events, cultural events, sports, surveys, interviews, travelogs, and classified advertising) is presented in fill-in-the-blank form. Suggestions are offered for using these formats to teach French usage at all levels. (MSE)

  8. 36 CFR 1256.46 - National security-classified information.

    Science.gov (United States)

    2010-07-01

    ... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false National security-classified... Restrictions § 1256.46 National security-classified information. In accordance with 5 U.S.C. 552(b)(1), NARA... properly classified under the provisions of the pertinent Executive Order on Classified National Security...

  9. DNA-based machines.

    Science.gov (United States)

    Wang, Fuan; Willner, Bilha; Willner, Itamar

    2014-01-01

    The base sequence in nucleic acids encodes substantial structural and functional information into the biopolymer. This encoded information provides the basis for the tailoring and assembly of DNA machines. A DNA machine is defined as a molecular device that exhibits the following fundamental features. (1) It performs a fuel-driven mechanical process that mimics macroscopic machines. (2) The mechanical process requires an energy input, "fuel." (3) The mechanical operation is accompanied by an energy consumption process that leads to "waste products." (4) The cyclic operation of the DNA devices, involves the use of "fuel" and "anti-fuel" ingredients. A variety of DNA-based machines are described, including the construction of "tweezers," "walkers," "robots," "cranes," "transporters," "springs," "gears," and interlocked cyclic DNA structures acting as reconfigurable catenanes, rotaxanes, and rotors. Different "fuels", such as nucleic acid strands, pH (H⁺/OH⁻), metal ions, and light, are used to trigger the mechanical functions of the DNA devices. The operation of the devices in solution and on surfaces is described, and a variety of optical, electrical, and photoelectrochemical methods to follow the operations of the DNA machines are presented. We further address the possible applications of DNA machines and the future perspectives of molecular DNA devices. These include the application of DNA machines as functional structures for the construction of logic gates and computing, for the programmed organization of metallic nanoparticle structures and the control of plasmonic properties, and for controlling chemical transformations by DNA machines. We further discuss the future applications of DNA machines for intracellular sensing, controlling intracellular metabolic pathways, and the use of the functional nanostructures for drug delivery and medical applications.

  10. An SVM classifier to separate false signals from microcalcifications in digital mammograms

    Energy Technology Data Exchange (ETDEWEB)

    Bazzani, Armando; Bollini, Dante; Brancaccio, Rosa; Campanini, Renato; Riccardi, Alessandro; Romani, Davide [Department of Physics, University of Bologna (Italy); INFN, Bologna (Italy); Lanconelli, Nico [Department of Physics, University of Bologna, and INFN, Bologna (Italy). E-mail: nico.lanconelli@bo.infn.it; Bevilacqua, Alessandro [Department of Electronics, Computer Science and Systems, University of Bologna, and INFN, Bologna (Italy)

    2001-06-01

    In this paper we investigate the feasibility of using an SVM (support vector machine) classifier in our automatic system for the detection of clustered microcalcifications in digital mammograms. SVM is a technique for pattern recognition which relies on the statistical learning theory. It minimizes a function of two terms: the number of misclassified vectors of the training set and a term regarding the generalization classifier capability. We compare the SVM classifier with an MLP (multi-layer perceptron) in the false-positive reduction phase of our detection scheme: a detected signal is considered either microcalcification or false signal, according to the value of a set of its features. The SVM classifier gets slightly better results than the MLP one (Az value of 0.963 against 0.958) in the presence of a high number of training data; the improvement becomes much more evident (Az value of 0.952 against 0.918) in training sets of reduced size. Finally, the setting of the SVM classifier is much easier than the MLP one. (author)

  11. A comparative study of machine learning models for ethnicity classification

    Science.gov (United States)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper endeavours to adopt a machine learning approach to solve the problem of ethnicity recognition. Ethnicity identification is an important vision problem with its use cases being extended to various domains. Despite the multitude of complexity involved, ethnicity identification comes naturally to humans. This meta information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems a sub module to efficiently capture ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines, logistic regression has been documented. Experimental results indicate that the logical classifier provides a much accurate classification than the support vector machine.

  12. Fundamentals of machine design

    CERN Document Server

    Karaszewski, Waldemar

    2011-01-01

    A forum of researchers, educators and engineers involved in various aspects of Machine Design provided the inspiration for this collection of peer-reviewed papers. The resultant dissemination of the latest research results, and the exchange of views concerning the future research directions to be taken in this field will make the work of immense value to all those having an interest in the topics covered. The book reflects the cooperative efforts made in seeking out the best strategies for effecting improvements in the quality and the reliability of machines and machine parts and for extending

  13. Machine Learning for Hackers

    CERN Document Server

    Conway, Drew

    2012-01-01

    If you're an experienced programmer interested in crunching data, this book will get you started with machine learning-a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation. Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyz

  14. Creativity in Machine Learning

    OpenAIRE

    Thoma, Martin

    2016-01-01

    Recent machine learning techniques can be modified to produce creative results. Those results did not exist before; it is not a trivial combination of the data which was fed into the machine learning system. The obtained results come in multiple forms: As images, as text and as audio. This paper gives a high level overview of how they are created and gives some examples. It is meant to be a summary of the current work and give people who are new to machine learning some starting points.

  15. Machine Tool Software

    Science.gov (United States)

    1988-01-01

    A NASA-developed software package has played a part in technical education of students who major in Mechanical Engineering Technology at William Rainey Harper College. Professor Hack has been using (APT) Automatically Programmed Tool Software since 1969 in his CAD/CAM Computer Aided Design and Manufacturing curriculum. Professor Hack teaches the use of APT programming languages for control of metal cutting machines. Machine tool instructions are geometry definitions written in APT Language to constitute a "part program." The part program is processed by the machine tool. CAD/CAM students go from writing a program to cutting steel in the course of a semester.

  16. Comparison of classifiers for decoding sensory and cognitive information from prefrontal neuronal populations.

    Directory of Open Access Journals (Sweden)

    Elaine Astrand

    Full Text Available Decoding neuronal information is important in neuroscience, both as a basic means to understand how neuronal activity is related to cerebral function and as a processing stage in driving neuroprosthetic effectors. Here, we compare the readout performance of six commonly used classifiers at decoding two different variables encoded by the spiking activity of the non-human primate frontal eye fields (FEF: the spatial position of a visual cue, and the instructed orientation of the animal's attention. While the first variable is exogenously driven by the environment, the second variable corresponds to the interpretation of the instruction conveyed by the cue; it is endogenously driven and corresponds to the output of internal cognitive operations performed on the visual attributes of the cue. These two variables were decoded using either a regularized optimal linear estimator in its explicit formulation, an optimal linear artificial neural network estimator, a non-linear artificial neural network estimator, a non-linear naïve Bayesian estimator, a non-linear Reservoir recurrent network classifier or a non-linear Support Vector Machine classifier. Our results suggest that endogenous information such as the orientation of attention can be decoded from the FEF with the same accuracy as exogenous visual information. All classifiers did not behave equally in the face of population size and heterogeneity, the available training and testing trials, the subject's behavior and the temporal structure of the variable of interest. In most situations, the regularized optimal linear estimator and the non-linear Support Vector Machine classifiers outperformed the other tested decoders.

  17. Ant colony optimization algorithm for interpretable Bayesian classifiers combination: application to medical predictions.

    Directory of Open Access Journals (Sweden)

    Salah Bouktif

    Full Text Available Prediction and classification techniques have been well studied by machine learning researchers and developed for several real-word problems. However, the level of acceptance and success of prediction models are still below expectation due to some difficulties such as the low performance of prediction models when they are applied in different environments. Such a problem has been addressed by many researchers, mainly from the machine learning community. A second problem, principally raised by model users in different communities, such as managers, economists, engineers, biologists, and medical practitioners, etc., is the prediction models' interpretability. The latter is the ability of a model to explain its predictions and exhibit the causality relationships between the inputs and the outputs. In the case of classification, a successful way to alleviate the low performance is to use ensemble classiers. It is an intuitive strategy to activate collaboration between different classifiers towards a better performance than individual classier. Unfortunately, ensemble classifiers method do not take into account the interpretability of the final classification outcome. It even worsens the original interpretability of the individual classifiers. In this paper we propose a novel implementation of classifiers combination approach that does not only promote the overall performance but also preserves the interpretability of the resulting model. We propose a solution based on Ant Colony Optimization and tailored for the case of Bayesian classifiers. We validate our proposed solution with case studies from medical domain namely, heart disease and Cardiotography-based predictions, problems where interpretability is critical to make appropriate clinical decisions.The datasets, Prediction Models and software tool together with supplementary materials are available at http://faculty.uaeu.ac.ae/salahb/ACO4BC.htm.

  18. Two channel EEG thought pattern classifier.

    Science.gov (United States)

    Craig, D A; Nguyen, H T; Burchey, H A

    2006-01-01

    This paper presents a real-time electro-encephalogram (EEG) identification system with the goal of achieving hands free control. With two EEG electrodes placed on the scalp of the user, EEG signals are amplified and digitised directly using a ProComp+ encoder and transferred to the host computer through the RS232 interface. Using a real-time multilayer neural network, the actual classification for the control of a powered wheelchair has a very fast response. It can detect changes in the user's thought pattern in 1 second. Using only two EEG electrodes at positions O(1) and C(4) the system can classify three mental commands (forward, left and right) with an accuracy of more than 79 %

  19. Classifying Drivers' Cognitive Load Using EEG Signals.

    Science.gov (United States)

    Barua, Shaibal; Ahmed, Mobyen Uddin; Begum, Shahina

    2017-01-01

    A growing traffic safety issue is the effect of cognitive loading activities on traffic safety and driving performance. To monitor drivers' mental state, understanding cognitive load is important since while driving, performing cognitively loading secondary tasks, for example talking on the phone, can affect the performance in the primary task, i.e. driving. Electroencephalography (EEG) is one of the reliable measures of cognitive load that can detect the changes in instantaneous load and effect of cognitively loading secondary task. In this driving simulator study, 1-back task is carried out while the driver performs three different simulated driving scenarios. This paper presents an EEG based approach to classify a drivers' level of cognitive load using Case-Based Reasoning (CBR). The results show that for each individual scenario as well as using data combined from the different scenarios, CBR based system achieved approximately over 70% of classification accuracy.

  20. Classifying prion and prion-like phenomena.

    Science.gov (United States)

    Harbi, Djamel; Harrison, Paul M

    2014-01-01

    The universe of prion and prion-like phenomena has expanded significantly in the past several years. Here, we overview the challenges in classifying this data informatically, given that terms such as "prion-like", "prion-related" or "prion-forming" do not have a stable meaning in the scientific literature. We examine the spectrum of proteins that have been described in the literature as forming prions, and discuss how "prion" can have a range of meaning, with a strict definition being for demonstration of infection with in vitro-derived recombinant prions. We suggest that although prion/prion-like phenomena can largely be apportioned into a small number of broad groups dependent on the type of transmissibility evidence for them, as new phenomena are discovered in the coming years, a detailed ontological approach might be necessary that allows for subtle definition of different "flavors" of prion / prion-like phenomena.