WorldWideScience

Sample records for learning classification rules

  1. A supervised learning rule for classification of spatiotemporal spike patterns.

    Science.gov (United States)

    Lilin Guo; Zhenzhong Wang; Adjouadi, Malek

    2016-08-01

    This study introduces a novel supervised algorithm for spiking neurons that take into consideration synapse delays and axonal delays associated with weights. It can be utilized for both classification and association and uses several biologically influenced properties, such as axonal and synaptic delays. This algorithm also takes into consideration spike-timing-dependent plasticity as in Remote Supervised Method (ReSuMe). This paper focuses on the classification aspect alone. Spiked neurons trained according to this proposed learning rule are capable of classifying different categories by the associated sequences of precisely timed spikes. Simulation results have shown that the proposed learning method greatly improves classification accuracy when compared to the Spike Pattern Association Neuron (SPAN) and the Tempotron learning rule.

  2. A rule-learning program in high energy physics event classification

    International Nuclear Information System (INIS)

    Clearwater, S.H.; Stern, E.G.

    1991-01-01

    We have applied a rule-learning program to the problem of event classification in high energy physics. The program searches for event classifications, i.e. rules, and effectively allows an exploration of many more possible classifications than is practical by a physicist. The program, RL4, is particularly useful because it can easily explore multi-dimensional rules as well as rules that may seem non-intuitive at first to the physicist. RL4 is also contrasted with other learning programs. (orig.)

  3. The efficiency of the RULES-4 classification learning algorithm in predicting the density of agents

    Directory of Open Access Journals (Sweden)

    Ziad Salem

    2014-12-01

    Full Text Available Learning is the act of obtaining new or modifying existing knowledge, behaviours, skills or preferences. The ability to learn is found in humans, other organisms and some machines. Learning is always based on some sort of observations or data such as examples, direct experience or instruction. This paper presents a classification algorithm to learn the density of agents in an arena based on the measurements of six proximity sensors of a combined actuator sensor units (CASUs. Rules are presented that were induced by the learning algorithm that was trained with data-sets based on the CASU’s sensor data streams collected during a number of experiments with “Bristlebots (agents in the arena (environment”. It was found that a set of rules generated by the learning algorithm is able to predict the number of bristlebots in the arena based on the CASU’s sensor readings with satisfying accuracy.

  4. Optical beam classification using deep learning: a comparison with rule- and feature-based classification

    Science.gov (United States)

    Alom, Md. Zahangir; Awwal, Abdul A. S.; Lowe-Webb, Roger; Taha, Tarek M.

    2017-08-01

    Deep-learning methods are gaining popularity because of their state-of-the-art performance in image classification tasks. In this paper, we explore classification of laser-beam images from the National Ignition Facility (NIF) using a novel deeplearning approach. NIF is the world's largest, most energetic laser. It has nearly 40,000 optics that precisely guide, reflect, amplify, and focus 192 laser beams onto a fusion target. NIF utilizes four petawatt lasers called the Advanced Radiographic Capability (ARC) to produce backlighting X-ray illumination to capture implosion dynamics of NIF experiments with picosecond temporal resolution. In the current operational configuration, four independent short-pulse ARC beams are created and combined in a split-beam configuration in each of two NIF apertures at the entry of the pre-amplifier. The subaperture beams then propagate through the NIF beampath up to the ARC compressor. Each ARC beamlet is separately compressed with a dedicated set of four gratings and recombined as sub-apertures for transport to the parabola vessel, where the beams are focused using parabolic mirrors and pointed to the target. Small angular errors in the compressor gratings can cause the sub-aperture beams to diverge from one another and prevent accurate alignment through the transport section between the compressor and parabolic mirrors. This is an off-normal condition that must be detected and corrected. The goal of the off-normal check is to determine whether the ARC beamlets are sufficiently overlapped into a merged single spot or diverged into two distinct spots. Thus, the objective of the current work is three-fold: developing a simple algorithm to perform off-normal classification, exploring the use of Convolutional Neural Network (CNN) for the same task, and understanding the inter-relationship of the two approaches. The CNN recognition results are compared with other machine-learning approaches, such as Deep Neural Network (DNN) and Support

  5. A SEMI-AUTOMATIC RULE SET BUILDING METHOD FOR URBAN LAND COVER CLASSIFICATION BASED ON MACHINE LEARNING AND HUMAN KNOWLEDGE

    Directory of Open Access Journals (Sweden)

    H. Y. Gu

    2017-09-01

    Full Text Available Classification rule set is important for Land Cover classification, which refers to features and decision rules. The selection of features and decision are based on an iterative trial-and-error approach that is often utilized in GEOBIA, however, it is time-consuming and has a poor versatility. This study has put forward a rule set building method for Land cover classification based on human knowledge and machine learning. The use of machine learning is to build rule sets effectively which will overcome the iterative trial-and-error approach. The use of human knowledge is to solve the shortcomings of existing machine learning method on insufficient usage of prior knowledge, and improve the versatility of rule sets. A two-step workflow has been introduced, firstly, an initial rule is built based on Random Forest and CART decision tree. Secondly, the initial rule is analyzed and validated based on human knowledge, where we use statistical confidence interval to determine its threshold. The test site is located in Potsdam City. We utilised the TOP, DSM and ground truth data. The results show that the method could determine rule set for Land Cover classification semi-automatically, and there are static features for different land cover classes.

  6. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  7. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Science.gov (United States)

    Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

    2012-01-01

    Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  8. Two fast and accurate heuristic RBF learning rules for data classification.

    Science.gov (United States)

    Rouhani, Modjtaba; Javan, Dawood S

    2016-03-01

    This paper presents new Radial Basis Function (RBF) learning methods for classification problems. The proposed methods use some heuristics to determine the spreads, the centers and the number of hidden neurons of network in such a way that the higher efficiency is achieved by fewer numbers of neurons, while the learning algorithm remains fast and simple. To retain network size limited, neurons are added to network recursively until termination condition is met. Each neuron covers some of train data. The termination condition is to cover all training data or to reach the maximum number of neurons. In each step, the center and spread of the new neuron are selected based on maximization of its coverage. Maximization of coverage of the neurons leads to a network with fewer neurons and indeed lower VC dimension and better generalization property. Using power exponential distribution function as the activation function of hidden neurons, and in the light of new learning approaches, it is proved that all data became linearly separable in the space of hidden layer outputs which implies that there exist linear output layer weights with zero training error. The proposed methods are applied to some well-known datasets and the simulation results, compared with SVM and some other leading RBF learning methods, show their satisfactory and comparable performance. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are much useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  10. Classification error of the thresholded independence rule

    DEFF Research Database (Denmark)

    Bak, Britta Anker; Fenger-Grøn, Morten; Jensen, Jens Ledet

    We consider classification in the situation of two groups with normally distributed data in the ‘large p small n’ framework. To counterbalance the high number of variables we consider the thresholded independence rule. An upper bound on the classification error is established which is taylored...

  11. Learning Apache Mahout classification

    CERN Document Server

    Gupta, Ashish

    2015-01-01

    If you are a data scientist who has some experience with the Hadoop ecosystem and machine learning methods and want to try out classification on large datasets using Mahout, this book is ideal for you. Knowledge of Java is essential.

  12. Online learning algorithm for ensemble of decision rules

    KAUST Repository

    Chikalov, Igor; Moshkov, Mikhail; Zielosko, Beata

    2011-01-01

    We describe an online learning algorithm that builds a system of decision rules for a classification problem. Rules are constructed according to the minimum description length principle by a greedy algorithm or using the dynamic programming approach

  13. Machine learning-, rule- and pharmacophore-based classification on the inhibition of P-glycoprotein and NorA.

    Science.gov (United States)

    Ngo, T-D; Tran, T-D; Le, M-T; Thai, K-M

    2016-09-01

    The efflux pumps P-glycoprotein (P-gp) in humans and NorA in Staphylococcus aureus are of great interest for medicinal chemists because of their important roles in multidrug resistance (MDR). The high polyspecificity as well as the unavailability of high-resolution X-ray crystal structures of these transmembrane proteins lead us to combining ligand-based approaches, which in the case of this study were machine learning, perceptual mapping and pharmacophore modelling. For P-gp inhibitory activity, individual models were developed using different machine learning algorithms and subsequently combined into an ensemble model which showed a good discrimination between inhibitors and noninhibitors (acctrain-diverse = 84%; accinternal-test = 92% and accexternal-test = 100%). For ligand promiscuity between P-gp and NorA, perceptual maps and pharmacophore models were generated for the detection of rules and features. Based on these in silico tools, hit compounds for reversing MDR were discovered from the in-house and DrugBank databases through virtual screening in an attempt to restore drug sensitivity in cancer cells and bacteria.

  14. Class Association Rule Pada Metode Associative Classification

    Directory of Open Access Journals (Sweden)

    Eka Karyawati

    2011-11-01

    Full Text Available Frequent patterns (itemsets discovery is an important problem in associative classification rule mining.  Differents approaches have been proposed such as the Apriori-like, Frequent Pattern (FP-growth, and Transaction Data Location (Tid-list Intersection algorithm. This paper focuses on surveying and comparing the state of the art associative classification techniques with regards to the rule generation phase of associative classification algorithms.  This phase includes frequent itemsets discovery and rules mining/extracting methods to generate the set of class association rules (CARs.  There are some techniques proposed to improve the rule generation method.  A technique by utilizing the concepts of discriminative power of itemsets can reduce the size of frequent itemset.  It can prune the useless frequent itemsets. The closed frequent itemset concept can be utilized to compress the rules to be compact rules.  This technique may reduce the size of generated rules.  Other technique is in determining the support threshold value of the itemset. Specifying not single but multiple support threshold values with regard to the class label frequencies can give more appropriate support threshold value.  This technique may generate more accurate rules. Alternative technique to generate rule is utilizing the vertical layout to represent dataset.  This method is very effective because it only needs one scan over dataset, compare with other techniques that need multiple scan over dataset.   However, one problem with these approaches is that the initial set of tid-lists may be too large to fit into main memory. It requires more sophisticated techniques to compress the tid-lists.

  15. The research on business rules classification and specification methods

    OpenAIRE

    Baltrušaitis, Egidijus

    2005-01-01

    The work is based on the research of business rules classification and specification methods. The basics of business rules approach are discussed. The most common business rules classification and modeling methods are analyzed. Business rules modeling techniques and tools for supporting them in the information systems are presented. Basing on the analysis results business rules classification method is proposed. Templates for every business rule type are presented. Business rules structuring ...

  16. Knowledge discovery with classification rules in a cardiovascular dataset.

    Science.gov (United States)

    Podgorelec, Vili; Kokol, Peter; Stiglic, Milojka Molan; Hericko, Marjan; Rozman, Ivan

    2005-12-01

    In this paper we study an evolutionary machine learning approach to data mining and knowledge discovery based on the induction of classification rules. A method for automatic rules induction called AREX using evolutionary induction of decision trees and automatic programming is introduced. The proposed algorithm is applied to a cardiovascular dataset consisting of different groups of attributes which should possibly reveal the presence of some specific cardiovascular problems in young patients. A case study is presented that shows the use of AREX for the classification of patients and for discovering possible new medical knowledge from the dataset. The defined knowledge discovery loop comprises a medical expert's assessment of induced rules to drive the evolution of rule sets towards more appropriate solutions. The final result is the discovery of a possible new medical knowledge in the field of pediatric cardiology.

  17. Biological couplings: Classification and characteristic rules

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    The phenomena that biological functions originate from biological coupling are the important biological foundation of multiple bionics and the significant discoveries in the bionic fields. In this paper, the basic concepts related to biological coupling are introduced from the bionic viewpoint. Constitution, classification and characteristic rules of biological coupling are illuminated, the general modes of biological coupling studies are analyzed, and the prospects of multi-coupling bionics are predicted.

  18. Online learning algorithm for ensemble of decision rules

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    We describe an online learning algorithm that builds a system of decision rules for a classification problem. Rules are constructed according to the minimum description length principle by a greedy algorithm or using the dynamic programming approach. © 2011 Springer-Verlag.

  19. Deep learning for image classification

    Science.gov (United States)

    McCoppin, Ryan; Rizki, Mateen

    2014-06-01

    This paper provides an overview of deep learning and introduces the several subfields of deep learning including a specific tutorial of convolutional neural networks. Traditional methods for learning image features are compared to deep learning techniques. In addition, we present our preliminary classification results, our basic implementation of a convolutional restricted Boltzmann machine on the Mixed National Institute of Standards and Technology database (MNIST), and we explain how to use deep learning networks to assist in our development of a robust gender classification system.

  20. Active Learning for Text Classification

    OpenAIRE

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them, without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...

  1. Deep Learning for ECG Classification

    Science.gov (United States)

    Pyakillya, B.; Kazachenko, N.; Mikhailovsky, N.

    2017-10-01

    The importance of ECG classification is very high now due to many current medical applications where this problem can be stated. Currently, there are many machine learning (ML) solutions which can be used for analyzing and classifying ECG data. However, the main disadvantages of these ML results is use of heuristic hand-crafted or engineered features with shallow feature learning architectures. The problem relies in the possibility not to find most appropriate features which will give high classification accuracy in this ECG problem. One of the proposing solution is to use deep learning architectures where first layers of convolutional neurons behave as feature extractors and in the end some fully-connected (FCN) layers are used for making final decision about ECG classes. In this work the deep learning architecture with 1D convolutional layers and FCN layers for ECG classification is presented and some classification results are showed.

  2. A Novel Texture Classification Procedure by using Association Rules

    Directory of Open Access Journals (Sweden)

    L. Jaba Sheela

    2008-11-01

    Full Text Available Texture can be defined as a local statistical pattern of texture primitives in observer’s domain of interest. Texture classification aims to assign texture labels to unknown textures, according to training samples and classification rules. Association rules have been used in various applications during the past decades. Association rules capture both structural and statistical information, and automatically identify the structures that occur most frequently and relationships that have significant discriminative power. So, association rules can be adapted to capture frequently occurring local structures in textures. This paper describes the usage of association rules for texture classification problem. The performed experimental studies show the effectiveness of the association rules. The overall success rate is about 98%.

  3. Transfer Learning beyond Text Classification

    Science.gov (United States)

    Yang, Qiang

    Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. While much has been done in transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real world applications. In this article, I will illustrate this point through an overview of a broad spectrum of applications of transfer learning that range from collaborative filtering to sensor based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.

  4. Rule-guided human classification of Volunteered Geographic Information

    Science.gov (United States)

    Ali, Ahmed Loai; Falomir, Zoe; Schmid, Falko; Freksa, Christian

    2017-05-01

    During the last decade, web technologies and location sensing devices have evolved generating a form of crowdsourcing known as Volunteered Geographic Information (VGI). VGI acted as a platform of spatial data collection, in particular, when a group of public participants are involved in collaborative mapping activities: they work together to collect, share, and use information about geographic features. VGI exploits participants' local knowledge to produce rich data sources. However, the resulting data inherits problematic data classification. In VGI projects, the challenges of data classification are due to the following: (i) data is likely prone to subjective classification, (ii) remote contributions and flexible contribution mechanisms in most projects, and (iii) the uncertainty of spatial data and non-strict definitions of geographic features. These factors lead to various forms of problematic classification: inconsistent, incomplete, and imprecise data classification. This research addresses classification appropriateness. Whether the classification of an entity is appropriate or inappropriate is related to quantitative and/or qualitative observations. Small differences between observations may be not recognizable particularly for non-expert participants. Hence, in this paper, the problem is tackled by developing a rule-guided classification approach. This approach exploits data mining techniques of Association Classification (AC) to extract descriptive (qualitative) rules of specific geographic features. The rules are extracted based on the investigation of qualitative topological relations between target features and their context. Afterwards, the extracted rules are used to develop a recommendation system able to guide participants to the most appropriate classification. The approach proposes two scenarios to guide participants towards enhancing the quality of data classification. An empirical study is conducted to investigate the classification of grass

  5. CLASSIFICATION OF LEARNING MANAGEMENT SYSTEMS

    Directory of Open Access Journals (Sweden)

    Yu. B. Popova

    2016-01-01

    Full Text Available Using of information technologies and, in particular, learning management systems, increases opportunities of teachers and students in reaching their goals in education. Such systems provide learning content, help organize and monitor training, collect progress statistics and take into account the individual characteristics of each user. Currently, there is a huge inventory of both paid and free systems are physically located both on college servers and in the cloud, offering different features sets of different licensing scheme and the cost. This creates the problem of choosing the best system. This problem is partly due to the lack of comprehensive classification of such systems. Analysis of more than 30 of the most common now automated learning management systems has shown that a classification of such systems should be carried out according to certain criteria, under which the same type of system can be considered. As classification features offered by the author are: cost, functionality, modularity, keeping the customer’s requirements, the integration of content, the physical location of a system, adaptability training. Considering the learning management system within these classifications and taking into account the current trends of their development, it is possible to identify the main requirements to them: functionality, reliability, ease of use, low cost, support for SCORM standard or Tin Can API, modularity and adaptability. According to the requirements at the Software Department of FITR BNTU under the guidance of the author since 2009 take place the development, the use and continuous improvement of their own learning management system.

  6. High Dimensional Classification Using Features Annealed Independence Rules.

    Science.gov (United States)

    Fan, Jianqing; Fan, Yingying

    2008-01-01

    Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as the random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as bad as the random guessing. Thus, it is paramountly important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics are proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.

  7. Comparison of rule induction, decision trees and formal concept analysis approaches for classification

    Science.gov (United States)

    Kotelnikov, E. V.; Milov, V. R.

    2018-05-01

    Rule-based learning algorithms have higher transparency and easiness to interpret in comparison with neural networks and deep learning algorithms. These properties make it possible to effectively use such algorithms to solve descriptive tasks of data mining. The choice of an algorithm depends also on its ability to solve predictive tasks. The article compares the quality of the solution of the problems with binary and multiclass classification based on the experiments with six datasets from the UCI Machine Learning Repository. The authors investigate three algorithms: Ripper (rule induction), C4.5 (decision trees), In-Close (formal concept analysis). The results of the experiments show that In-Close demonstrates the best quality of classification in comparison with Ripper and C4.5, however the latter two generate more compact rule sets.

  8. Biclustering Learning of Trading Rules.

    Science.gov (United States)

    Huang, Qinghua; Wang, Ting; Tao, Dacheng; Li, Xuelong

    2015-10-01

    Technical analysis with numerous indicators and patterns has been regarded as important evidence for making trading decisions in financial markets. However, it is extremely difficult for investors to find useful trading rules based on numerous technical indicators. This paper innovatively proposes the use of biclustering mining to discover effective technical trading patterns that contain a combination of indicators from historical financial data series. This is the first attempt to use biclustering algorithm on trading data. The mined patterns are regarded as trading rules and can be classified as three trading actions (i.e., the buy, the sell, and no-action signals) with respect to the maximum support. A modified K nearest neighborhood ( K -NN) method is applied to classification of trading days in the testing period. The proposed method [called biclustering algorithm and the K nearest neighbor (BIC- K -NN)] was implemented on four historical datasets and the average performance was compared with the conventional buy-and-hold strategy and three previously reported intelligent trading systems. Experimental results demonstrate that the proposed trading system outperforms its counterparts and will be useful for investment in various financial markets.

  9. Supervised Learning for Visual Pattern Classification

    Science.gov (United States)

    Zheng, Nanning; Xue, Jianru

    This chapter presents an overview of the topics and major ideas of supervised learning for visual pattern classification. Two prevalent algorithms, i.e., the support vector machine (SVM) and the boosting algorithm, are briefly introduced. SVMs and boosting algorithms are two hot topics of recent research in supervised learning. SVMs improve the generalization of the learning machine by implementing the rule of structural risk minimization (SRM). It exhibits good generalization even when little training data are available for machine training. The boosting algorithm can boost a weak classifier to a strong classifier by means of the so-called classifier combination. This algorithm provides a general way for producing a classifier with high generalization capability from a great number of weak classifiers.

  10. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications that the competitors. Moreover, the runtime requirements for producing the thematic map was orders of magnitude lower than the respective for the competitors.

  11. FACET CLASSIFICATIONS OF E-LEARNING TOOLS

    Directory of Open Access Journals (Sweden)

    Olena Yu. Balalaieva

    2013-12-01

    Full Text Available The article deals with the classification of e-learning tools based on the facet method, which suggests the separation of the parallel set of objects into independent classification groups; at the same time it is not assumed rigid classification structure and pre-built finite groups classification groups are formed by a combination of values taken from the relevant facets. An attempt to systematize the existing classification of e-learning tools from the standpoint of classification theory is made for the first time. Modern Ukrainian and foreign facet classifications of e-learning tools are described; their positive and negative features compared to classifications based on a hierarchical method are analyzed. The original author's facet classification of e-learning tools is proposed.

  12. Catalogue and classification of technical safety rules for light-water reactors and reprocessing plants

    International Nuclear Information System (INIS)

    Bloser, M.; Fichtner, N.; Neider, R.

    1975-08-01

    This report on the cataloguing and classification of technical rules for land-based light-water reactors and reprocessing plants contains a list of classified rules. The reasons for the classification system used are given and discussed

  13. Using fuzzy association rule mining in cancer classification

    International Nuclear Information System (INIS)

    Mahmoodian, Hamid; Marhaban, M.H.; Abdulrahim, Raha; Rosli, Rozita; Saripan, Iqbal

    2011-01-01

    Full text: The classification of the cancer tumors based on gene expression profiles has been extensively studied in numbers of studies. A wide variety of cancer datasets have been implemented by the various methods of gene selec tion and classification to identify the behavior of the genes in tumors and find the relationships between them and outcome of diseases. Interpretability of the model, which is developed by fuzzy rules and linguistic variables in this study, has been rarely considered. In addition, creating a fuzzy classifier with high performance in classification that uses a subset of significant genes which have been selected by different types of gene selection methods is another goal of this study. A new algorithm has been developed to identify the fuzzy rules and significant genes based on fuzzy association rule mining. At first, different subset of genes which have been selected by different methods, were used to generate primary fuzzy classifiers separately and then proposed algorithm was implemented to mix the genes which have been associated in the primary classifiers and generate a new classifier. The results show that fuzzy classifier can classify the tumors with high performance while presenting the relationships between the genes by linguistic variables

  14. Working memory supports inference learning just like classification learning.

    Science.gov (United States)

    Craig, Stewart; Lewandowsky, Stephan

    2013-08-01

    Recent research has found a positive relationship between people's working memory capacity (WMC) and their speed of category learning. To date, only classification-learning tasks have been considered, in which people learn to assign category labels to objects. It is unknown whether learning to make inferences about category features might also be related to WMC. We report data from a study in which 119 participants undertook classification learning and inference learning, and completed a series of WMC tasks. Working memory capacity was positively related to people's classification and inference learning performance.

  15. Cellular automata rule characterization and classification using texture descriptors

    Science.gov (United States)

    Machicao, Jeaneth; Ribas, Lucas C.; Scabini, Leonardo F. S.; Bruno, Odermir M.

    2018-05-01

    The cellular automata (CA) spatio-temporal patterns have attracted the attention from many researchers since it can provide emergent behavior resulting from the dynamics of each individual cell. In this manuscript, we propose an approach of texture image analysis to characterize and classify CA rules. The proposed method converts the CA spatio-temporal patterns into a gray-scale image. The gray-scale is obtained by creating a binary number based on the 8-connected neighborhood of each dot of the CA spatio-temporal pattern. We demonstrate that this technique enhances the CA rule characterization and allow to use different texture image analysis algorithms. Thus, various texture descriptors were evaluated in a supervised training approach aiming to characterize the CA's global evolution. Our results show the efficiency of the proposed method for the classification of the elementary CA (ECAs), reaching a maximum of 99.57% of accuracy rate according to the Li-Packard scheme (6 classes) and 94.36% for the classification of the 88 rules scheme. Moreover, within the image analysis context, we found a better performance of the method by means of a transformation of the binary states to a gray-scale.

  16. Thermodynamic efficiency of learning a rule in neural networks

    Science.gov (United States)

    Goldt, Sebastian; Seifert, Udo

    2017-11-01

    Biological systems have to build models from their sensory input data that allow them to efficiently process previously unseen inputs. Here, we study a neural network learning a binary classification rule for these inputs from examples provided by a teacher. We analyse the ability of the network to apply the rule to new inputs, that is to generalise from past experience. Using stochastic thermodynamics, we show that the thermodynamic costs of the learning process provide an upper bound on the amount of information that the network is able to learn from its teacher for both batch and online learning. This allows us to introduce a thermodynamic efficiency of learning. We analytically compute the dynamics and the efficiency of a noisy neural network performing online learning in the thermodynamic limit. In particular, we analyse three popular learning algorithms, namely Hebbian, Perceptron and AdaTron learning. Our work extends the methods of stochastic thermodynamics to a new type of learning problem and might form a suitable basis for investigating the thermodynamics of decision-making.

  17. New stopping rules for dendrogram classification in TWINSPAN

    Directory of Open Access Journals (Sweden)

    Omid Esmailzadeh

    2015-12-01

    Full Text Available The aim of this study is to propose a modification of TWINSPAN algorithm with introducing new stopping rules for TWINSPAN. Modified TWINSPAN combines the analysis of heterogeneity of the clusters prior to each division to prevent the imposed divisions of homogeneous clusters and it also solved the limitation of classical TWINSPAN in which the number of clusters increases power of two. For this purpose, ecological groups of Box tree stands in Farim forests were classified with using classical and modified TWINSPAN basis of plant species cover percentage of 60 plots with 400 m2 surface area which were made by releve method (by consideration of indicator stand concept. In this relation, five different heterogeneity measures including Whittaker’s beta diversity and total inertia, Sorensen, Jaccard and Orlo´ci dissimilarity indices which representing diversity and distance indices respectively were involved. Sample plots were also classified from basis of topographical properties using cluster analysis with emphasizing Euclidean distance coefficient and Wards clustering method. Results showed that using of two sets of heterogeneity indices lead to different classification dendrograms. In this relation, results of Whittaker’s beta with total inertia as diversity indices were similar and the other three dissimilarity indices have shown similar behavior. Finally, our results reiterated that modified TWINSPAN did not alter the logic of the TWINSPAN classification, but it increased the flexibility of TWINSPAN dendrogram with changing the hierarchy of divisions in the final classification of ecological groups of Box tree stands in Farim forests.

  18. Document Classification Using Distributed Machine Learning

    OpenAIRE

    Aydin, Galip; Hallac, Ibrahim Riza

    2018-01-01

    In this paper, we investigate the performance and success rates of Na\\"ive Bayes Classification Algorithm for automatic classification of Turkish news into predetermined categories like economy, life, health etc. We use Apache Big Data technologies such as Hadoop, HDFS, Spark and Mahout, and apply these distributed technologies to Machine Learning.

  19. Using Machine Learning for Land Suitability Classification

    African Journals Online (AJOL)

    User

    West African Journal of Applied Ecology, vol. ... evidence for the utility of machine learning methods in land suitability classification especially MCS methods. ... Artificial intelligence tools. ..... Numerical values of index for the various classes.

  20. The statistical mechanics of learning a rule

    International Nuclear Information System (INIS)

    Watkin, T.L.H.; Rau, A.; Biehl, M.

    1993-01-01

    A summary is presented of the statistical mechanical theory of learning a rule with a neural network, a rapidly advancing area which is closely related to other inverse problems frequently encountered by physicists. By emphasizing the relationship between neural networks and strongly interacting physical systems, such as spin glasses, the authors show how learning theory has provided a workshop in which to develop new, exact analytical techniques

  1. Observation versus classification in supervised category learning.

    Science.gov (United States)

    Levering, Kimery R; Kurtz, Kenneth J

    2015-02-01

    The traditional supervised classification paradigm encourages learners to acquire only the knowledge needed to predict category membership (a discriminative approach). An alternative that aligns with important aspects of real-world concept formation is learning with a broader focus to acquire knowledge of the internal structure of each category (a generative approach). Our work addresses the impact of a particular component of the traditional classification task: the guess-and-correct cycle. We compare classification learning to a supervised observational learning task in which learners are shown labeled examples but make no classification response. The goals of this work sit at two levels: (1) testing for differences in the nature of the category representations that arise from two basic learning modes; and (2) evaluating the generative/discriminative continuum as a theoretical tool for understand learning modes and their outcomes. Specifically, we view the guess-and-correct cycle as consistent with a more discriminative approach and therefore expected it to lead to narrower category knowledge. Across two experiments, the observational mode led to greater sensitivity to distributional properties of features and correlations between features. We conclude that a relatively subtle procedural difference in supervised category learning substantially impacts what learners come to know about the categories. The results demonstrate the value of the generative/discriminative continuum as a tool for advancing the psychology of category learning and also provide a valuable constraint for formal models and associated theories.

  2. Comparison of some classification algorithms based on deterministic and nondeterministic decision rules

    KAUST Repository

    Delimata, Paweł

    2010-01-01

    We discuss two, in a sense extreme, kinds of nondeterministic rules in decision tables. The first kind of rules, called as inhibitory rules, are blocking only one decision value (i.e., they have all but one decisions from all possible decisions on their right hand sides). Contrary to this, any rule of the second kind, called as a bounded nondeterministic rule, can have on the right hand side only a few decisions. We show that both kinds of rules can be used for improving the quality of classification. In the paper, two lazy classification algorithms of polynomial time complexity are considered. These algorithms are based on deterministic and inhibitory decision rules, but the direct generation of rules is not required. Instead of this, for any new object the considered algorithms extract from a given decision table efficiently some information about the set of rules. Next, this information is used by a decision-making procedure. The reported results of experiments show that the algorithms based on inhibitory decision rules are often better than those based on deterministic decision rules. We also present an application of bounded nondeterministic rules in construction of rule based classifiers. We include the results of experiments showing that by combining rule based classifiers based on minimal decision rules with bounded nondeterministic rules having confidence close to 1 and sufficiently large support, it is possible to improve the classification quality. © 2010 Springer-Verlag.

  3. Deep learning classification in asteroseismology

    DEFF Research Database (Denmark)

    Hon, Marc; Stello, Dennis; Yu, Jie

    2017-01-01

    In the power spectra of oscillating red giants, there are visually distinct features defining stars ascending the red giant branch from those that have commenced helium core burning. We train a 1D convolutional neural network by supervised learning to automatically learn these visual features from...

  4. Using blocking approach to preserve privacy in classification rules by inserting dummy Transaction

    Directory of Open Access Journals (Sweden)

    Doryaneh Hossien Afshari

    2017-03-01

    Full Text Available The increasing rate of data sharing among organizations could maximize the risk of leaking sensitive knowledge. Trying to solve this problem leads to increase the importance of privacy preserving within the process of data sharing. In this study is focused on privacy preserving in classification rules mining as a technique of data mining. We propose a blocking algorithm to hiding sensitive classification rules. In the solution, rules' hiding occurs as a result of editing a set of transactions which satisfy sensitive classification rules. The proposed approach tries to deceive and block adversaries by inserting some dummy transactions. Finally, the solution has been evaluated and compared with other available solutions. Results show that limiting the number of attributes existing in each sensitive rule will lead to a decrease in both the number of lost rules and the production rate of ghost rules.

  5. Classification Based on Pruning and Double Covered Rule Sets for the Internet of Things Applications

    Science.gov (United States)

    Zhou, Zhongmei; Wang, Weiping

    2014-01-01

    The Internet of things (IOT) is a hot issue in recent years. It accumulates large amounts of data by IOT users, which is a great challenge to mining useful knowledge from IOT. Classification is an effective strategy which can predict the need of users in IOT. However, many traditional rule-based classifiers cannot guarantee that all instances can be covered by at least two classification rules. Thus, these algorithms cannot achieve high accuracy in some datasets. In this paper, we propose a new rule-based classification, CDCR-P (Classification based on the Pruning and Double Covered Rule sets). CDCR-P can induce two different rule sets A and B. Every instance in training set can be covered by at least one rule not only in rule set A, but also in rule set B. In order to improve the quality of rule set B, we take measure to prune the length of rules in rule set B. Our experimental results indicate that, CDCR-P not only is feasible, but also it can achieve high accuracy. PMID:24511304

  6. Classification based on pruning and double covered rule sets for the internet of things applications.

    Science.gov (United States)

    Li, Shasha; Zhou, Zhongmei; Wang, Weiping

    2014-01-01

    The Internet of things (IOT) is a hot issue in recent years. It accumulates large amounts of data by IOT users, which is a great challenge to mining useful knowledge from IOT. Classification is an effective strategy which can predict the need of users in IOT. However, many traditional rule-based classifiers cannot guarantee that all instances can be covered by at least two classification rules. Thus, these algorithms cannot achieve high accuracy in some datasets. In this paper, we propose a new rule-based classification, CDCR-P (Classification based on the Pruning and Double Covered Rule sets). CDCR-P can induce two different rule sets A and B. Every instance in training set can be covered by at least one rule not only in rule set A, but also in rule set B. In order to improve the quality of rule set B, we take measure to prune the length of rules in rule set B. Our experimental results indicate that, CDCR-P not only is feasible, but also it can achieve high accuracy.

  7. Deep learning classification in asteroseismology

    Science.gov (United States)

    Hon, Marc; Stello, Dennis; Yu, Jie

    2017-08-01

    In the power spectra of oscillating red giants, there are visually distinct features defining stars ascending the red giant branch from those that have commenced helium core burning. We train a 1D convolutional neural network by supervised learning to automatically learn these visual features from images of folded oscillation spectra. By training and testing on Kepler red giants, we achieve an accuracy of up to 99 per cent in separating helium-burning red giants from those ascending the red giant branch. The convolutional neural network additionally shows capability in accurately predicting the evolutionary states of 5379 previously unclassified Kepler red giants, by which we now have greatly increased the number of classified stars.

  8. Attribute Learning for SAR Image Classification

    Directory of Open Access Journals (Sweden)

    Chu He

    2017-04-01

    Full Text Available This paper presents a classification approach based on attribute learning for high spatial resolution Synthetic Aperture Radar (SAR images. To explore the representative and discriminative attributes of SAR images, first, an iterative unsupervised algorithm is designed to cluster in the low-level feature space, where the maximum edge response and the ratio of mean-to-variance are included; a cross-validation step is applied to prevent overfitting. Second, the most discriminative clustering centers are sorted out to construct an attribute dictionary. By resorting to the attribute dictionary, a representation vector describing certain categories in the SAR image can be generated, which in turn is used to perform the classifying task. The experiments conducted on TerraSAR-X images indicate that those learned attributes have strong visual semantics, which are characterized by bright and dark spots, stripes, or their combinations. The classification method based on these learned attributes achieves better results.

  9. Galaxy Classification using Machine Learning

    Science.gov (United States)

    Fowler, Lucas; Schawinski, Kevin; Brandt, Ben-Elias; widmer, Nicole

    2017-01-01

    We present our current research into the use of machine learning to classify galaxy imaging data with various convolutional neural network configurations in TensorFlow. We are investigating how five-band Sloan Digital Sky Survey imaging data can be used to train on physical properties such as redshift, star formation rate, mass and morphology. We also investigate the performance of artificially redshifted images in recovering physical properties as image quality degrades.

  10. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  11. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  12. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    Energy Technology Data Exchange (ETDEWEB)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K. [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); McEwen, Jason D., E-mail: dr.michelle.lochner@gmail.com [Mullard Space Science Laboratory, University College London, Surrey RH5 6NT (United Kingdom)

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  13. Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

    OpenAIRE

    Otero, Fernando E.B.; Freitas, Alex A.

    2016-01-01

    The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create a rule in an one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules)-i.e., the ACO search is guided by the quality of a list of rules, instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorith...

  14. Students Learn by Doing: Teaching about Rules of Thumb.

    Science.gov (United States)

    Cude, Brenda J.

    1990-01-01

    Identifies situation in which consumers are likely to substitute rules of thumb for research, reviews rules of thumb often used as substitutes, and identifies teaching activities to help students learn when substitution is appropriate. (JOW)

  15. Learning of Alignment Rules between Concept Hierarchies

    Science.gov (United States)

    Ichise, Ryutaro; Takeda, Hideaki; Honiden, Shinichi

    With the rapid advances of information technology, we are acquiring much information than ever before. As a result, we need tools for organizing this data. Concept hierarchies such as ontologies and information categorizations are powerful and convenient methods for accomplishing this goal, which have gained wide spread acceptance. Although each concept hierarchy is useful, it is difficult to employ multiple concept hierarchies at the same time because it is hard to align their conceptual structures. This paper proposes a rule learning method that inputs information from a source concept hierarchy and finds suitable location for them in a target hierarchy. The key idea is to find the most similar categories in each hierarchy, where similarity is measured by the κ(kappa) statistic that counts instances belonging to both categories. In order to evaluate our method, we conducted experiments using two internet directories: Yahoo! and LYCOS. We map information instances from the source directory into the target directory, and show that our learned rules agree with a human-generated assignment 76% of the time.

  16. Hyperspectral Image Classification Using Discriminative Dictionary Learning

    International Nuclear Information System (INIS)

    Zongze, Y; Hao, S; Kefeng, J; Huanxin, Z

    2014-01-01

    The hyperspectral image (HSI) processing community has witnessed a surge of papers focusing on the utilization of sparse prior for effective HSI classification. In sparse representation based HSI classification, there are two phases: sparse coding with an over-complete dictionary and classification. In this paper, we first apply a novel fisher discriminative dictionary learning method, which capture the relative difference in different classes. The competitive selection strategy ensures that atoms in the resulting over-complete dictionary are the most discriminative. Secondly, motivated by the assumption that spatially adjacent samples are statistically related and even belong to the same materials (same class), we propose a majority voting scheme incorporating contextual information to predict the category label. Experiment results show that the proposed method can effectively strengthen relative discrimination of the constructed dictionary, and incorporating with the majority voting scheme achieve generally an improved prediction performance

  17. E-LEARNING TOOLS: STRUCTURE, CONTENT, CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Yuliya H. Loboda

    2012-05-01

    Full Text Available The article analyses the problems of organization of educational process with use of electronic means of education. Specifies the definition of "electronic learning", their structure and content. Didactic principles are considered, which are the basis of their creation and use. Given the detailed characteristics of e-learning tools for methodological purposes. On the basis of the allocated pedagogical problems of the use of electronic means of education presented and complemented by their classification, namely the means of theoretical and technological training, means of practical training, support tools, and comprehensive facilities.

  18. Rule-based land cover classification from very high-resolution satellite image with multiresolution segmentation

    Science.gov (United States)

    Haque, Md. Enamul; Al-Ramadan, Baqer; Johnson, Brian A.

    2016-07-01

    Multiresolution segmentation and rule-based classification techniques are used to classify objects from very high-resolution satellite images of urban areas. Custom rules are developed using different spectral, geometric, and textural features with five scale parameters, which exploit varying classification accuracy. Principal component analysis is used to select the most important features out of a total of 207 different features. In particular, seven different object types are considered for classification. The overall classification accuracy achieved for the rule-based method is 95.55% and 98.95% for seven and five classes, respectively. Other classifiers that are not using rules perform at 84.17% and 97.3% accuracy for seven and five classes, respectively. The results exploit coarse segmentation for higher scale parameter and fine segmentation for lower scale parameter. The major contribution of this research is the development of rule sets and the identification of major features for satellite image classification where the rule sets are transferable and the parameters are tunable for different types of imagery. Additionally, the individual objectwise classification and principal component analysis help to identify the required object from an arbitrary number of objects within images given ground truth data for the training.

  19. Rule based systems for big data a machine learning approach

    CERN Document Server

    Liu, Han; Cocea, Mihaela

    2016-01-01

    The ideas introduced in this book explore the relationships among rule based systems, machine learning and big data. Rule based systems are seen as a special type of expert systems, which can be built by using expert knowledge or learning from real data. The book focuses on the development and evaluation of rule based systems in terms of accuracy, efficiency and interpretability. In particular, a unified framework for building rule based systems, which consists of the operations of rule generation, rule simplification and rule representation, is presented. Each of these operations is detailed using specific methods or techniques. In addition, this book also presents some ensemble learning frameworks for building ensemble rule based systems.

  20. Discriminative Bayesian Dictionary Learning for Classification.

    Science.gov (United States)

    Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

    2016-12-01

    We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition; and object and scene-category classification using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.

  1. Learning a New Selection Rule in Visual and Frontal Cortex

    NARCIS (Netherlands)

    van der Togt, Chris; Stănişor, Liviu; Pooresmaeili, Arezoo; Albantakis, Larissa; Deco, Gustavo; Roelfsema, Pieter R

    2016-01-01

    How do you make a decision if you do not know the rules of the game? Models of sensory decision-making suggest that choices are slow if evidence is weak, but they may only apply if the subject knows the task rules. Here, we asked how the learning of a new rule influences neuronal activity in the

  2. Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results.

    Science.gov (United States)

    Otero, Fernando E B; Freitas, Alex A

    2016-01-01

    Most ant colony optimization (ACO) algorithms for inducing classification rules use a ACO-based procedure to create a rule in a one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-Miner[Formula: see text] algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules), i.e., the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-Miner[Formula: see text] algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines, and the cAnt-Miner[Formula: see text] producing ordered rules are also presented.

  3. Code-specific learning rules improve action selection by populations of spiking neurons.

    Science.gov (United States)

    Friedrich, Johannes; Urbanczik, Robert; Senn, Walter

    2014-08-01

    Population coding is widely regarded as a key mechanism for achieving reliable behavioral decisions. We previously introduced reinforcement learning for population-based decision making by spiking neurons. Here we generalize population reinforcement learning to spike-based plasticity rules that take account of the postsynaptic neural code. We consider spike/no-spike, spike count and spike latency codes. The multi-valued and continuous-valued features in the postsynaptic code allow for a generalization of binary decision making to multi-valued decision making and continuous-valued action selection. We show that code-specific learning rules speed up learning both for the discrete classification and the continuous regression tasks. The suggested learning rules also speed up with increasing population size as opposed to standard reinforcement learning rules. Continuous action selection is further shown to explain realistic learning speeds in the Morris water maze. Finally, we introduce the concept of action perturbation as opposed to the classical weight- or node-perturbation as an exploration mechanism underlying reinforcement learning. Exploration in the action space greatly increases the speed of learning as compared to exploration in the neuron or weight space.

  4. Classification, (big) data analysis and statistical learning

    CERN Document Server

    Conversano, Claudio; Vichi, Maurizio

    2018-01-01

    This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...

  5. Learning general phonological rules from distributional information: a computational model.

    Science.gov (United States)

    Calamaro, Shira; Jarosz, Gaja

    2015-04-01

    Phonological rules create alternations in the phonetic realizations of related words. These rules must be learned by infants in order to identify the phonological inventory, the morphological structure, and the lexicon of a language. Recent work proposes a computational model for the learning of one kind of phonological alternation, allophony (Peperkamp, Le Calvez, Nadal, & Dupoux, 2006). This paper extends the model to account for learning of a broader set of phonological alternations and the formalization of these alternations as general rules. In Experiment 1, we apply the original model to new data in Dutch and demonstrate its limitations in learning nonallophonic rules. In Experiment 2, we extend the model to allow it to learn general rules for alternations that apply to a class of segments. In Experiment 3, the model is further extended to allow for generalization by context; we argue that this generalization must be constrained by linguistic principles. Copyright © 2014 Cognitive Science Society, Inc.

  6. 7 CFR 1000.43 - General classification rules.

    Science.gov (United States)

    2010-01-01

    ... market administrator shall determine the shrinkage or overage of skim milk and butterfat for each pool... pursuant to § 1000.44, the following rules shall apply: (a) Each month the market administrator shall...) and § 1135.11 of this chapter, the pounds of skim milk and butterfat, respectively, in each class in...

  7. [Guidelines for hygienic classification of learning technologies].

    Science.gov (United States)

    Kuchma, V R; Teksheva, L M; Milushkina, O Iu

    2008-01-01

    Optimization of the educational environment under the present-day conditions has been in progress, by using learning techwares (LTW) without fail. To organize and regulate an academic process in terms of the safety of applied LTW, there is a need for their classification. The currently existing attempts to structure LTW disregard hygienically significant aspects. The task of the present study was to substantiate a LTW safety criterion ensuring a universal approach to working out regulations. This criterion may be the exposure intensity determined by the form of organization of education and its pattern, by the procedure of information presentation, and the age-related peculiarities of a pupil, i.e. by the actual load that is presented by the product of the intensity exposure and its time. The hygienic classification of LTW may be used to evaluate their negative effect in an educational process on the health status of children and adolescents, to regulate hazardous factors and training modes, to design and introduce new learning complexes. The structuring of a LTW system allows one to define possible deleterious actions and the possibilities of preventing this action on the basis of strictly established regulations.

  8. Induction and pruning of classification rules for prediction of microseismic hazards in coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Sikora, M. [Silesian Technical University, Gliwice (Poland)

    2011-06-15

    The paper presents results of application of a rule induction and pruning algorithm for classification of a microseismic hazard state in coal mines. Due to imbalanced distribution of examples describing states 'hazardous' and 'safe', the special algorithm was used for induction and rule pruning. The algorithm selects optimal parameters' values influencing rule induction and pruning based on training and tuning sets. A rule quality measure which decides about a form and classification abilities of rules that are induced is the basic parameter of the algorithm. The specificity and sensitivity of a classifier were used to evaluate its quality. Conducted tests show that the admitted method of rules induction and classifier's quality evaluation enables to get better results of classification of microseismic hazards than by methods currently used in mining practice. Results obtained by the rules-based classifier were also compared with results got by a decision tree induction algorithm and by a neuro-fuzzy system.

  9. An Efficient Inductive Genetic Learning Algorithm for Fuzzy Relational Rules

    Directory of Open Access Journals (Sweden)

    Antonio

    2012-04-01

    Full Text Available Fuzzy modelling research has traditionally focused on certain types of fuzzy rules. However, the use of alternative rule models could improve the ability of fuzzy systems to represent a specific problem. In this proposal, an extended fuzzy rule model, that can include relations between variables in the antecedent of rules is presented. Furthermore, a learning algorithm based on the iterative genetic approach which is able to represent the knowledge using this model is proposed as well. On the other hand, potential relations among initial variables imply an exponential growth in the feasible rule search space. Consequently, two filters for detecting relevant potential relations are added to the learning algorithm. These filters allows to decrease the search space complexity and increase the algorithm efficiency. Finally, we also present an experimental study to demonstrate the benefits of using fuzzy relational rules.

  10. Learning classification models with soft-label information.

    Science.gov (United States)

    Nguyen, Quang; Valizadegan, Hamed; Hauskrecht, Milos

    2014-01-01

    Learning of classification models in medicine often relies on data labeled by a human expert. Since labeling of clinical data may be time-consuming, finding ways of alleviating the labeling costs is critical for our ability to automatically learn such models. In this paper we propose a new machine learning approach that is able to learn improved binary classification models more efficiently by refining the binary class information in the training phase with soft labels that reflect how strongly the human expert feels about the original class labels. Two types of methods that can learn improved binary classification models from soft labels are proposed. The first relies on probabilistic/numeric labels, the other on ordinal categorical labels. We study and demonstrate the benefits of these methods for learning an alerting model for heparin induced thrombocytopenia. The experiments are conducted on the data of 377 patient instances labeled by three different human experts. The methods are compared using the area under the receiver operating characteristic curve (AUC) score. Our AUC results show that the new approach is capable of learning classification models more efficiently compared to traditional learning methods. The improvement in AUC is most remarkable when the number of examples we learn from is small. A new classification learning framework that lets us learn from auxiliary soft-label information provided by a human expert is a promising new direction for learning classification models from expert labels, reducing the time and cost needed to label data.

  11. Learning features for tissue classification with the classification restricted Boltzmann machine

    DEFF Research Database (Denmark)

    van Tulder, Gijs; de Bruijne, Marleen

    2014-01-01

    Performance of automated tissue classification in medical imaging depends on the choice of descriptive features. In this paper, we show how restricted Boltzmann machines (RBMs) can be used to learn features that are especially suited for texture-based tissue classification. We introduce the convo...... outperform conventional RBM-based feature learning, which is unsupervised and uses only a generative learning objective, as well as often-used filter banks. We show that a mixture of generative and discriminative learning can produce filters that give a higher classification accuracy....

  12. Improving Hyperspectral Image Classification Method for Fine Land Use Assessment Application Using Semisupervised Machine Learning

    Directory of Open Access Journals (Sweden)

    Chunyang Wang

    2015-01-01

    Full Text Available Study on land use/cover can reflect changing rules of population, economy, agricultural structure adjustment, policy, and traffic and provide better service for the regional economic development and urban evolution. The study on fine land use/cover assessment using hyperspectral image classification is a focal growing area in many fields. Semisupervised learning method which takes a large number of unlabeled samples and minority labeled samples, improving classification and predicting the accuracy effectively, has been a new research direction. In this paper, we proposed improving fine land use/cover assessment based on semisupervised hyperspectral classification method. The test analysis of study area showed that the advantages of semisupervised classification method could improve the high precision overall classification and objective assessment of land use/cover results.

  13. Classification versus inference learning contrasted with real-world categories.

    Science.gov (United States)

    Jones, Erin L; Ross, Brian H

    2011-07-01

    Categories are learned and used in a variety of ways, but the research focus has been on classification learning. Recent work contrasting classification with inference learning of categories found important later differences in category performance. However, theoretical accounts differ on whether this is due to an inherent difference between the tasks or to the implementation decisions. The inherent-difference explanation argues that inference learners focus on the internal structure of the categories--what each category is like--while classification learners focus on diagnostic information to predict category membership. In two experiments, using real-world categories and controlling for earlier methodological differences, inference learners learned more about what each category was like than did classification learners, as evidenced by higher performance on a novel classification test. These results suggest that there is an inherent difference between learning new categories by classifying an item versus inferring a feature.

  14. Unsupervised feature learning for autonomous rock image classification

    Science.gov (United States)

    Shu, Lei; McIsaac, Kenneth; Osinski, Gordon R.; Francis, Raymond

    2017-09-01

    Autonomous rock image classification can enhance the capability of robots for geological detection and enlarge the scientific returns, both in investigation on Earth and planetary surface exploration on Mars. Since rock textural images are usually inhomogeneous and manually hand-crafting features is not always reliable, we propose an unsupervised feature learning method to autonomously learn the feature representation for rock images. In our tests, rock image classification using the learned features shows that the learned features can outperform manually selected features. Self-taught learning is also proposed to learn the feature representation from a large database of unlabelled rock images of mixed class. The learned features can then be used repeatedly for classification of any subclass. This takes advantage of the large dataset of unlabelled rock images and learns a general feature representation for many kinds of rocks. We show experimental results supporting the feasibility of self-taught learning on rock images.

  15. Rule-based category learning in children: the role of age and executive functioning.

    Directory of Open Access Journals (Sweden)

    Rahel Rabi

    Full Text Available Rule-based category learning was examined in 4-11 year-olds and adults. Participants were asked to learn a set of novel perceptual categories in a classification learning task. Categorization performance improved with age, with younger children showing the strongest rule-based deficit relative to older children and adults. Model-based analyses provided insight regarding the type of strategy being used to solve the categorization task, demonstrating that the use of the task appropriate strategy increased with age. When children and adults who identified the correct categorization rule were compared, the performance deficit was no longer evident. Executive functions were also measured. While both working memory and inhibitory control were related to rule-based categorization and improved with age, working memory specifically was found to marginally mediate the age-related improvements in categorization. When analyses focused only on the sample of children, results showed that working memory ability and inhibitory control were associated with categorization performance and strategy use. The current findings track changes in categorization performance across childhood, demonstrating at which points performance begins to mature and resemble that of adults. Additionally, findings highlight the potential role that working memory and inhibitory control may play in rule-based category learning.

  16. Capacities and overlap indexes with an application in fuzzy rule-based classification systems

    Czech Academy of Sciences Publication Activity Database

    Paternain, D.; Bustince, H.; Pagola, M.; Sussner, P.; Kolesárová, A.; Mesiar, Radko

    2016-01-01

    Roč. 305, č. 1 (2016), s. 70-94 ISSN 0165-0114 Institutional support: RVO:67985556 Keywords : Capacity * Overlap index * Overlap function * Choquet integral * Fuzzy rule-based classification systems Subject RIV: BA - General Mathematics Impact factor: 2.718, year: 2016 http://library.utia.cas.cz/separaty/2016/E/mesiar-0465739.pdf

  17. Analysis of mesenchymal stem cell differentiation in vitro using classification association rule mining.

    Science.gov (United States)

    Wang, Weiqi; Wang, Yanbo Justin; Bañares-Alcántara, René; Coenen, Frans; Cui, Zhanfeng

    2009-12-01

    In this paper, data mining is used to analyze the data on the differentiation of mammalian Mesenchymal Stem Cells (MSCs), aiming at discovering known and hidden rules governing MSC differentiation, following the establishment of a web-based public database containing experimental data on the MSC proliferation and differentiation. To this effect, a web-based public interactive database comprising the key parameters which influence the fate and destiny of mammalian MSCs has been constructed and analyzed using Classification Association Rule Mining (CARM) as a data-mining technique. The results show that the proposed approach is technically feasible and performs well with respect to the accuracy of (classification) prediction. Key rules mined from the constructed MSC database are consistent with experimental observations, indicating the validity of the method developed and the first step in the application of data mining to the study of MSCs.

  18. In the name of legal certainty? Comparison of advance ruling for tariff classification in the European Union, China and Taiwan

    NARCIS (Netherlands)

    Chen, S.

    2016-01-01

    In many jurisdictions, international traders can apply to customs authorities for an advance ruling for tariff classification before they import or export their goods. The advance ruling system for tariff classification is expected to grant more legal certainty to international traders because they

  19. Incremental Learning of Context Free Grammars by Parsing-Based Rule Generation and Rule Set Search

    Science.gov (United States)

    Nakamura, Katsuhiko; Hoshina, Akemi

    This paper discusses recent improvements and extensions in Synapse system for inductive inference of context free grammars (CFGs) from sample strings. Synapse uses incremental learning, rule generation based on bottom-up parsing, and the search for rule sets. The form of production rules in the previous system is extended from Revised Chomsky Normal Form A→βγ to Extended Chomsky Normal Form, which also includes A→B, where each of β and γ is either a terminal or nonterminal symbol. From the result of bottom-up parsing, a rule generation mechanism synthesizes minimum production rules required for parsing positive samples. Instead of inductive CYK algorithm in the previous version of Synapse, the improved version uses a novel rule generation method, called ``bridging,'' which bridges the lacked part of the derivation tree for the positive string. The improved version also employs a novel search strategy, called serial search in addition to minimum rule set search. The synthesis of grammars by the serial search is faster than the minimum set search in most cases. On the other hand, the size of the generated CFGs is generally larger than that by the minimum set search, and the system can find no appropriate grammar for some CFL by the serial search. The paper shows experimental results of incremental learning of several fundamental CFGs and compares the methods of rule generation and search strategies.

  20. Differential impact of relevant and irrelevant dimension primes on rule-based and information-integration category learning.

    Science.gov (United States)

    Grimm, Lisa R; Maddox, W Todd

    2013-11-01

    Research has identified multiple category-learning systems with each being "tuned" for learning categories with different task demands and each governed by different neurobiological systems. Rule-based (RB) classification involves testing verbalizable rules for category membership while information-integration (II) classification requires the implicit learning of stimulus-response mappings. In the first study to directly test rule priming with RB and II category learning, we investigated the influence of the availability of information presented at the beginning of the task. Participants viewed lines that varied in length, orientation, and position on the screen, and were primed to focus on stimulus dimensions that were relevant or irrelevant to the correct classification rule. In Experiment 1, we used an RB category structure, and in Experiment 2, we used an II category structure. Accuracy and model-based analyses suggested that a focus on relevant dimensions improves RB task performance later in learning while a focus on an irrelevant dimension improves II task performance early in learning. © 2013.

  1. Catalogue and classification of technical safety standards, rules and regulations for nuclear power reactors and nuclear fuel cycle facilities

    International Nuclear Information System (INIS)

    Fichtner, N.; Becker, K.; Bashir, M.

    1977-01-01

    The present report is an up-dated version of the report 'Catalogue and Classification of Technical Safety Rules for Light-water Reactors and Reprocessing Plants' edited under code No EUR 5362e, August 1975. Like the first version of the report, it constitutes a catalogue and classification of standards, rules and regulations on land-based nuclear power reactors and fuel cycle facilities. The reasons for the classification system used are given and discussed

  2. Active Learning of Classification Models with Likert-Scale Feedback.

    Science.gov (United States)

    Xue, Yanbing; Hauskrecht, Milos

    2017-01-01

    Annotation of classification data by humans can be a time-consuming and tedious process. Finding ways of reducing the annotation effort is critical for building the classification models in practice and for applying them to a variety of classification tasks. In this paper, we develop a new active learning framework that combines two strategies to reduce the annotation effort. First, it relies on label uncertainty information obtained from the human in terms of the Likert-scale feedback. Second, it uses active learning to annotate examples with the greatest expected change. We propose a Bayesian approach to calculate the expectation and an incremental SVM solver to reduce the time complexity of the solvers. We show the combination of our active learning strategy and the Likert-scale feedback can learn classification models more rapidly and with a smaller number of labeled instances than methods that rely on either Likert-scale labels or active learning alone.

  3. Learning the Rules of the Game

    Science.gov (United States)

    Smith, Donald A.

    2018-03-01

    Games have often been used in the classroom to teach physics ideas and concepts, but there has been less published on games that can be used to teach scientific thinking. D. Maloney and M. Masters describe an activity in which students attempt to infer rules to a game from a history of moves, but the students don't actually play the game. Giving the list of moves allows the instructor to emphasize the important fact that nature usually gives us incomplete data sets, but it does make the activity less immersive. E. Kimmel suggested letting students attempt to figure out the rules to Reversi by playing it, but this game only has two players, which makes it difficult to apply in a classroom setting. Kimmel himself admits the choice of Reversi is somewhat arbitrary. There are games, however, that are designed to make the process of figuring out the rules an integral aspect of play. These games involve more people and require only a deck or two of cards. I present here an activity constructed around the card game Mao, which can be used to help students recognize aspects of scientific thinking. The game is particularly good at illustrating the importance of falsification tests (questions designed to elicit a negative answer) over verification tests (examples that confirm what is already suspected) for illuminating the underlying rules.

  4. Learning the Rules of the Game

    Science.gov (United States)

    Smith, Donald A.

    2018-01-01

    Games have often been used in the classroom to teach physics ideas and concepts, but there has been less published on games that can be used to teach scientific thinking. D. Maloney and M. Masters describe an activity in which students attempt to infer rules to a game from a history of moves, but the students do not actually play the game. Giving…

  5. Seismic waveform classification using deep learning

    Science.gov (United States)

    Kong, Q.; Allen, R. M.

    2017-12-01

    MyShake is a global smartphone seismic network that harnesses the power of crowdsourcing. It has an Artificial Neural Network (ANN) algorithm running on the phone to distinguish earthquake motion from human activities recorded by the accelerometer on board. Once the ANN detects earthquake-like motion, it sends a 5-min chunk of acceleration data back to the server for further analysis. The time-series data collected contains both earthquake data and human activity data that the ANN confused. In this presentation, we will show the Convolutional Neural Network (CNN) we built under the umbrella of supervised learning to find out the earthquake waveform. The waveforms of the recorded motion could treat easily as images, and by taking the advantage of the power of CNN processing the images, we achieved very high successful rate to select the earthquake waveforms out. Since there are many non-earthquake waveforms than the earthquake waveforms, we also built an anomaly detection algorithm using the CNN. Both these two methods can be easily extended to other waveform classification problems.

  6. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    Science.gov (United States)

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  7. A strategy learning model for autonomous agents based on classification

    Directory of Open Access Journals (Sweden)

    Śnieżyński Bartłomiej

    2015-09-01

    Full Text Available In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster that reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking the learning process

  8. Towards a classification of fusion rule algebras in rational conformal field theories

    International Nuclear Information System (INIS)

    Ravanini, F.

    1991-01-01

    We review the main topics concerning Fusion Rule Algebras (FRA) of Rational Conformal Field Theories. After an exposition of their general properties, we examine known results on the complete classification for low number of fields (≤4). We then turn our attention to FRA's generated polynomially by one (real) fundamental field, for which a classification is known. Attempting to generalize this result, we describe some connections between FRA's and Graph Theory. The possibility to get new results on the subject following this ''graph'' approach is briefly discussed. (author)

  9. Performance of the majority voting rule in solving the density classification problem in high dimensions

    International Nuclear Information System (INIS)

    Gomez Soto, Jose Manuel; Fuks, Henryk

    2011-01-01

    The density classification problem (DCP) is one of the most widely studied problems in the theory of cellular automata. After it was shown that the DCP cannot be solved perfectly, the research in this area has been focused on finding better rules that could solve the DCP approximately. In this paper, we argue that the majority voting rule in high dimensions can achieve high performance in solving the DCP, and that its performance increases with dimension. We support this conjecture with arguments based on the mean-field approximation and direct computer simulations. (paper)

  10. Performance of the majority voting rule in solving the density classification problem in high dimensions

    Energy Technology Data Exchange (ETDEWEB)

    Gomez Soto, Jose Manuel [Unidad Academica de Matematicas, Universidad Autonoma de Zacatecas, Calzada Solidaridad entronque Paseo a la Bufa, Zacatecas, Zac. (Mexico); Fuks, Henryk, E-mail: jmgomezgoo@gmail.com, E-mail: hfuks@brocku.ca [Department of Mathematics, Brock University, St. Catharines, ON (Canada)

    2011-11-04

    The density classification problem (DCP) is one of the most widely studied problems in the theory of cellular automata. After it was shown that the DCP cannot be solved perfectly, the research in this area has been focused on finding better rules that could solve the DCP approximately. In this paper, we argue that the majority voting rule in high dimensions can achieve high performance in solving the DCP, and that its performance increases with dimension. We support this conjecture with arguments based on the mean-field approximation and direct computer simulations. (paper)

  11. Delta Learning Rule for the Active Sites Model

    OpenAIRE

    Lingashetty, Krishna Chaithanya

    2010-01-01

    This paper reports the results on methods of comparing the memory retrieval capacity of the Hebbian neural network which implements the B-Matrix approach, by using the Widrow-Hoff rule of learning. We then, extend the recently proposed Active Sites model by developing a delta rule to increase memory capacity. Also, this paper extends the binary neural network to a multi-level (non-binary) neural network.

  12. Polarimetric SAR image classification based on discriminative dictionary learning model

    Science.gov (United States)

    Sang, Cheng Wei; Sun, Hong

    2018-03-01

    Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimension nonlinear mapping problem, the sparse representations based on learning overcomplete dictionary have shown great potential to solve such problem. The overcomplete dictionary plays an important role in PolSAR image classification, however for PolSAR image complex scenes, features shared by different classes will weaken the discrimination of learned dictionary, so as to degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model to enhance the discrimination of dictionary. The learned overcomplete dictionary by the proposed model is more discriminative and very suitable for PolSAR classification.

  13. QUEST : Eliminating online supervised learning for efficient classification algorithms

    NARCIS (Netherlands)

    Zwartjes, Ardjan; Havinga, Paul J.M.; Smit, Gerard J.M.; Hurink, Johann L.

    2016-01-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting

  14. Learning a New Selection Rule in Visual and Frontal Cortex.

    Science.gov (United States)

    van der Togt, Chris; Stănişor, Liviu; Pooresmaeili, Arezoo; Albantakis, Larissa; Deco, Gustavo; Roelfsema, Pieter R

    2016-08-01

    How do you make a decision if you do not know the rules of the game? Models of sensory decision-making suggest that choices are slow if evidence is weak, but they may only apply if the subject knows the task rules. Here, we asked how the learning of a new rule influences neuronal activity in the visual (area V1) and frontal cortex (area FEF) of monkeys. We devised a new icon-selection task. On each day, the monkeys saw 2 new icons (small pictures) and learned which one was relevant. We rewarded eye movements to a saccade target connected to the relevant icon with a curve. Neurons in visual and frontal cortex coded the monkey's choice, because the representation of the selected curve was enhanced. Learning delayed the neuronal selection signals and we uncovered the cause of this delay in V1, where learning to select the relevant icon caused an early suppression of surrounding image elements. These results demonstrate that the learning of a new rule causes a transition from fast and random decisions to a more considerate strategy that takes additional time and they reveal the contribution of visual and frontal cortex to the learning process. © The Author 2016. Published by Oxford University Press.

  15. Deep learning for tumor classification in imaging mass spectrometry.

    Science.gov (United States)

    Behrmann, Jens; Etmann, Christian; Boskamp, Tobias; Casadonte, Rita; Kriegsmann, Jörg; Maaß, Peter

    2018-04-01

    Tumor classification using imaging mass spectrometry (IMS) data has a high potential for future applications in pathology. Due to the complexity and size of the data, automated feature extraction and classification steps are required to fully process the data. Since mass spectra exhibit certain structural similarities to image data, deep learning may offer a promising strategy for classification of IMS data as it has been successfully applied to image classification. Methodologically, we propose an adapted architecture based on deep convolutional networks to handle the characteristics of mass spectrometry data, as well as a strategy to interpret the learned model in the spectral domain based on a sensitivity analysis. The proposed methods are evaluated on two algorithmically challenging tumor classification tasks and compared to a baseline approach. Competitiveness of the proposed methods is shown on both tasks by studying the performance via cross-validation. Moreover, the learned models are analyzed by the proposed sensitivity analysis revealing biologically plausible effects as well as confounding factors of the considered tasks. Thus, this study may serve as a starting point for further development of deep learning approaches in IMS classification tasks. https://gitlab.informatik.uni-bremen.de/digipath/Deep_Learning_for_Tumor_Classification_in_IMS. jbehrmann@uni-bremen.de or christianetmann@uni-bremen.de. Supplementary data are available at Bioinformatics online.

  16. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue now days. Properclassification of text documents requires information retrieval, machine learning and Natural languageprocessing (NLP) techniques. Our aim is to focus on important approaches to automatic textclassification based on machine learning techniques viz. supervised, unsupervised and semi supervised.In this paper we present a review of various text classification approaches under machine learningparadig...

  17. A Weighted Block Dictionary Learning Algorithm for Classification

    OpenAIRE

    Shi, Zhongrong

    2016-01-01

    Discriminative dictionary learning, playing a critical role in sparse representation based classification, has led to state-of-the-art classification results. Among the existing discriminative dictionary learning methods, two different approaches, shared dictionary and class-specific dictionary, which associate each dictionary atom to all classes or a single class, have been studied. The shared dictionary is a compact method but with lack of discriminative information; the class-specific dict...

  18. Genetic classification of populations using supervised learning.

    Directory of Open Access Journals (Sweden)

    Michael Bridges

    2011-05-01

    Full Text Available There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories. This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations. are termed unsupervised. Supervised methods, on the other hand are able to utilise this prior knowledge when it is available.In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods, (neural networks and support vector machines to the classification of three populations (two from Scotland and one from Bulgaria. The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies.

  19. Genetic classification of populations using supervised learning.

    LENUS (Irish Health Repository)

    Bridges, Michael

    2011-01-01

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations. are termed unsupervised. Supervised methods, on the other hand are able to utilise this prior knowledge when it is available.In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods, (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies.

  20. Extracting classification rules from an informatic security incidents repository by genetic programming

    Directory of Open Access Journals (Sweden)

    Carlos Javier Carvajal Montealegre

    2015-04-01

    Full Text Available This paper describes the data mining process to obtain classification rules over an information security incident data collection, explaining in detail the use of genetic programming as a mean to model the incidents behavior and representing such rules as decision trees. The described mining process includes several tasks, such as the GP (Genetic Programming approach evaluation, the individual's representation and the algorithm parameters tuning to upgrade the performance. The paper concludes with the result analysis and the description of the rules obtained, suggesting measures to avoid the occurrence of new informatics attacks. This paper is a part of the thesis work degree: Information Security Incident Analytics by Data Mining for Behavioral Modeling and Pattern Recognition (Carvajal, 2012.

  1. Logic Learning Machine creates explicit and stable rules stratifying neuroblastoma patients.

    Science.gov (United States)

    Cangelosi, Davide; Blengio, Fabiola; Versteeg, Rogier; Eggert, Angelika; Garaventa, Alberto; Gambini, Claudio; Conte, Massimo; Eva, Alessandra; Muselli, Marco; Varesio, Luigi

    2013-01-01

    Neuroblastoma is the most common pediatric solid tumor. About fifty percent of high risk patients die despite treatment making the exploration of new and more effective strategies for improving stratification mandatory. Hypoxia is a condition of low oxygen tension occurring in poorly vascularized areas of the tumor associated with poor prognosis. We had previously defined a robust gene expression signature measuring the hypoxic component of neuroblastoma tumors (NB-hypo) which is a molecular risk factor. We wanted to develop a prognostic classifier of neuroblastoma patients' outcome blending existing knowledge on clinical and molecular risk factors with the prognostic NB-hypo signature. Furthermore, we were interested in classifiers outputting explicit rules that could be easily translated into the clinical setting. Shadow Clustering (SC) technique, which leads to final models called Logic Learning Machine (LLM), exhibits a good accuracy and promises to fulfill the aims of the work. We utilized this algorithm to classify NB-patients on the bases of the following risk factors: Age at diagnosis, INSS stage, MYCN amplification and NB-hypo. The algorithm generated explicit classification rules in good agreement with existing clinical knowledge. Through an iterative procedure we identified and removed from the dataset those examples which caused instability in the rules. This workflow generated a stable classifier very accurate in predicting good and poor outcome patients. The good performance of the classifier was validated in an independent dataset. NB-hypo was an important component of the rules with a strength similar to that of tumor staging. The novelty of our work is to identify stability, explicit rules and blending of molecular and clinical risk factors as the key features to generate classification rules for NB patients to be conveyed to the clinic and to be used to design new therapies. We derived, through LLM, a set of four stable rules identifying a new

  2. Logic Learning Machine creates explicit and stable rules stratifying neuroblastoma patients

    Science.gov (United States)

    2013-01-01

    Background Neuroblastoma is the most common pediatric solid tumor. About fifty percent of high risk patients die despite treatment making the exploration of new and more effective strategies for improving stratification mandatory. Hypoxia is a condition of low oxygen tension occurring in poorly vascularized areas of the tumor associated with poor prognosis. We had previously defined a robust gene expression signature measuring the hypoxic component of neuroblastoma tumors (NB-hypo) which is a molecular risk factor. We wanted to develop a prognostic classifier of neuroblastoma patients' outcome blending existing knowledge on clinical and molecular risk factors with the prognostic NB-hypo signature. Furthermore, we were interested in classifiers outputting explicit rules that could be easily translated into the clinical setting. Results Shadow Clustering (SC) technique, which leads to final models called Logic Learning Machine (LLM), exhibits a good accuracy and promises to fulfill the aims of the work. We utilized this algorithm to classify NB-patients on the bases of the following risk factors: Age at diagnosis, INSS stage, MYCN amplification and NB-hypo. The algorithm generated explicit classification rules in good agreement with existing clinical knowledge. Through an iterative procedure we identified and removed from the dataset those examples which caused instability in the rules. This workflow generated a stable classifier very accurate in predicting good and poor outcome patients. The good performance of the classifier was validated in an independent dataset. NB-hypo was an important component of the rules with a strength similar to that of tumor staging. Conclusions The novelty of our work is to identify stability, explicit rules and blending of molecular and clinical risk factors as the key features to generate classification rules for NB patients to be conveyed to the clinic and to be used to design new therapies. We derived, through LLM, a set of four

  3. Image Classification Workflow Using Machine Learning Methods

    Science.gov (United States)

    Christoffersen, M. S.; Roser, M.; Valadez-Vergara, R.; Fernández-Vega, J. A.; Pierce, S. A.; Arora, R.

    2016-12-01

    Recent increases in the availability and quality of remote sensing datasets have fueled an increasing number of scientifically significant discoveries based on land use classification and land use change analysis. However, much of the software made to work with remote sensing data products, specifically multispectral images, is commercial and often prohibitively expensive. The free to use solutions that are currently available come bundled up as small parts of much larger programs that are very susceptible to bugs and difficult to install and configure. What is needed is a compact, easy to use set of tools to perform land use analysis on multispectral images. To address this need, we have developed software using the Python programming language with the sole function of land use classification and land use change analysis. We chose Python to develop our software because it is relatively readable, has a large body of relevant third party libraries such as GDAL and Spectral Python, and is free to install and use on Windows, Linux, and Macintosh operating systems. In order to test our classification software, we performed a K-means unsupervised classification, Gaussian Maximum Likelihood supervised classification, and a Mahalanobis Distance based supervised classification. The images used for testing were three Landsat rasters of Austin, Texas with a spatial resolution of 60 meters for the years of 1984 and 1999, and 30 meters for the year 2015. The testing dataset was easily downloaded using the Earth Explorer application produced by the USGS. The software should be able to perform classification based on any set of multispectral rasters with little to no modification. Our software makes the ease of land use classification using commercial software available without an expensive license.

  4. Large-Scale Machine Learning for Classification and Search

    Science.gov (United States)

    Liu, Wei

    2012-01-01

    With the rapid development of the Internet, nowadays tremendous amounts of data including images and videos, up to millions or billions, can be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques for the purpose of making classification and nearest…

  5. Joint Feature Selection and Classification for Multilabel Learning.

    Science.gov (United States)

    Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong

    2018-03-01

    Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.

  6. Classification of multiple sclerosis lesions using adaptive dictionary learning.

    Science.gov (United States)

    Deshpande, Hrishikesh; Maurel, Pierre; Barillot, Christian

    2015-12-01

    This paper presents a sparse representation and an adaptive dictionary learning based method for automated classification of multiple sclerosis (MS) lesions in magnetic resonance (MR) images. Manual delineation of MS lesions is a time-consuming task, requiring neuroradiology experts to analyze huge volume of MR data. This, in addition to the high intra- and inter-observer variability necessitates the requirement of automated MS lesion classification methods. Among many image representation models and classification methods that can be used for such purpose, we investigate the use of sparse modeling. In the recent years, sparse representation has evolved as a tool in modeling data using a few basis elements of an over-complete dictionary and has found applications in many image processing tasks including classification. We propose a supervised classification approach by learning dictionaries specific to the lesions and individual healthy brain tissues, which include white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF). The size of the dictionaries learned for each class plays a major role in data representation but it is an even more crucial element in the case of competitive classification. Our approach adapts the size of the dictionary for each class, depending on the complexity of the underlying data. The algorithm is validated using 52 multi-sequence MR images acquired from 13 MS patients. The results demonstrate the effectiveness of our approach in MS lesion classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Group-Based Active Learning of Classification Models.

    Science.gov (United States)

    Luo, Zhipeng; Hauskrecht, Milos

    2017-05-01

    Learning of classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent groups. Our empirical study on 12 UCI data sets demonstrates the advantages and superiority of our approach over both classic instance-based active learning work, as well as existing group-based active-learning methods.

  8. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of the today’s Internet, bringing financial damage to companies and annoying individual users. Spam emails are invading users without their consent and filling their mail boxes. They consume more network capacity as well as time in checking and deleting spam mails. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income to spammers. While most of the users want to do right think to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, they are not yet eradicated. Also when the counter measures are over sensitive, even legitimate emails will be eliminated. Among the approaches developed to stop spam, filtering is the one of the most important technique. Many researches in spam filtering have been centered on the more sophisticated classifier-related issues. In recent days, Machine learning for spam classification is an important research issue. The effectiveness of the proposed work is explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms has also been presented.

  9. Students’ Learning Obstacles and Alternative Solution in Counting Rules Learning Levels Senior High School

    Directory of Open Access Journals (Sweden)

    M A Jatmiko

    2017-12-01

    Full Text Available The counting rules is a topic in mathematics senior high school. In the learning process, teachers often find students who have difficulties in learning this topic. Knowing the characteristics of students' learning difficulties and analyzing the causes is important for the teacher, as an effort in trying to reflect the learning process and as a reference in constructing alternative learning solutions which appropriate to anticipate students’ learning obstacles. This study uses qualitative methods and involves 70 students of class XII as research subjects. The data collection techniques used in this study is diagnostic test instrument about learning difficulties in counting rules, observation, and interview. The data used to know the learning difficulties experienced by students, the causes of learning difficulties, and to develop alternative learning solutions. From the results of data analysis, the results of diagnostic tests researcher found some obstacles faced by students, such as students get confused in describing the definition, students difficulties in understanding the procedure of solving multiplication rules. Based on those problems, researcher analyzed the causes of these difficulties and make hypothetical learning trajectory as an alternative solution in counting rules learning.

  10. Classification of Polarimetric SAR Data Using Dictionary Learning

    DEFF Research Database (Denmark)

    Vestergaard, Jacob Schack; Nielsen, Allan Aasbjerg; Dahl, Anders Lindbjerg

    2012-01-01

    This contribution deals with classification of multilook fully polarimetric synthetic aperture radar (SAR) data by learning a dictionary of crop types present in the Foulum test site. The Foulum test site contains a large number of agricultural fields, as well as lakes, forests, natural vegetation......, grasslands and urban areas, which make it ideally suited for evaluation of classification algorithms. Dictionary learning centers around building a collection of image patches typical for the classification problem at hand. This requires initial manual labeling of the classes present in the data and is thus...... a method for supervised classification. Sparse coding of these image patches aims to maintain a proficient number of typical patches and associated labels. Data is consecutively classified by a nearest neighbor search of the dictionary elements and labeled with probabilities of each class. Each dictionary...

  11. Classification of solar wind with machine learning

    NARCIS (Netherlands)

    E. Camporeale (Enrico); A. Carè (Algo); J.E. Borovsky (Joseph)

    2017-01-01

    htmlabstractWe present a four-category classification algorithm for the solar wind, based on Gaussian Process. The four categories are the ones previously adopted in Xu and Borovsky (2015): ejecta, coronal hole origin plasma, streamer belt origin plasma, and sector reversal origin plasma. The

  12. An Ontology to Support the Classification of Learning Material in an Organizational Learning Environment: An Evaluation

    Science.gov (United States)

    Valaski, Joselaine; Reinehr, Sheila; Malucelli, Andreia

    2017-01-01

    Purpose: The purpose of this research was to evaluate whether ontology integrated in an organizational learning environment may support the automatic learning material classification in a specific knowledge area. Design/methodology/approach: An ontology for recommending learning material was integrated in the organizational learning environment…

  13. Rule Learning in Autism: The Role of Reward Type and Social Context

    OpenAIRE

    Jones, E. J. H.; Webb, S. J.; Estes, A.; Dawson, G.

    2013-01-01

    Learning abstract rules is central to social and cognitive development. Across two experiments, we used Delayed Non-Matching to Sample tasks to characterize the longitudinal development and nature of rule-learning impairments in children with Autism Spectrum Disorder (ASD). Results showed that children with ASD consistently experienced more difficulty learning an abstract rule from a discrete physical reward than children with DD. Rule learning was facilitated by the provision of more concret...

  14. Deep Multi-Task Learning for Tree Genera Classification

    Science.gov (United States)

    Ko, C.; Kang, J.; Sohn, G.

    2018-05-01

    The goal for our paper is to classify tree genera using airborne Light Detection and Ranging (LiDAR) data with Convolution Neural Network (CNN) - Multi-task Network (MTN) implementation. Unlike Single-task Network (STN) where only one task is assigned to the learning outcome, MTN is a deep learning architect for learning a main task (classification of tree genera) with other tasks (in our study, classification of coniferous and deciduous) simultaneously, with shared classification features. The main contribution of this paper is to improve classification accuracy from CNN-STN to CNN-MTN. This is achieved by introducing a concurrence loss (Lcd) to the designed MTN. This term regulates the overall network performance by minimizing the inconsistencies between the two tasks. Results show that we can increase the classification accuracy from 88.7 % to 91.0 % (from STN to MTN). The second goal of this paper is to solve the problem of small training sample size by multiple-view data generation. The motivation of this goal is to address one of the most common problems in implementing deep learning architecture, the insufficient number of training data. We address this problem by simulating training dataset with multiple-view approach. The promising results from this paper are providing a basis for classifying a larger number of dataset and number of classes in the future.

  15. Learning of Rule Ensembles for Multiple Attribute Ranking Problems

    Science.gov (United States)

    Dembczyński, Krzysztof; Kotłowski, Wojciech; Słowiński, Roman; Szeląg, Marcin

    In this paper, we consider the multiple attribute ranking problem from a Machine Learning perspective. We propose two approaches to statistical learning of an ensemble of decision rules from decision examples provided by the Decision Maker in terms of pairwise comparisons of some objects. The first approach consists in learning a preference function defining a binary preference relation for a pair of objects. The result of application of this function on all pairs of objects to be ranked is then exploited using the Net Flow Score procedure, giving a linear ranking of objects. The second approach consists in learning a utility function for single objects. The utility function also gives a linear ranking of objects. In both approaches, the learning is based on the boosting technique. The presented approaches to Preference Learning share good properties of the decision rule preference model and have good performance in the massive-data learning problems. As Preference Learning and Multiple Attribute Decision Aiding share many concepts and methodological issues, in the introduction, we review some aspects bridging these two fields. To illustrate the two approaches proposed in this paper, we solve with them a toy example concerning the ranking of a set of cars evaluated by multiple attributes. Then, we perform a large data experiment on real data sets. The first data set concerns credit rating. Since recent research in the field of Preference Learning is motivated by the increasing role of modeling preferences in recommender systems and information retrieval, we chose two other massive data sets from this area - one comes from movie recommender system MovieLens, and the other concerns ranking of text documents from 20 Newsgroups data set.

  16. Active Metric Learning for Supervised Classification

    OpenAIRE

    Kumaran, Krishnan; Papageorgiou, Dimitri; Chang, Yutong; Li, Minhan; Takáč, Martin

    2018-01-01

    Clustering and classification critically rely on distance metrics that provide meaningful comparisons between data points. We present mixed-integer optimization approaches to find optimal distance metrics that generalize the Mahalanobis metric extensively studied in the literature. Additionally, we generalize and improve upon leading methods by removing reliance on pre-designated "target neighbors," "triplets," and "similarity pairs." Another salient feature of our method is its ability to en...

  17. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Ardjan Zwartjes

    2016-10-01

    Full Text Available In this work, we introduce QUEST (QUantile Estimation after Supervised Training, an adaptive classification algorithm for Wireless Sensor Networks (WSNs that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network life time. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning based algorithms using sampled data. An important issue, however, is the training phase of these learning based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning on the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  18. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms.

    Science.gov (United States)

    Zwartjes, Ardjan; Havinga, Paul J M; Smit, Gerard J M; Hurink, Johann L

    2016-10-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network life time. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning based algorithms using sampled data. An important issue, however, is the training phase of these learning based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning on the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  19. Language Learning Strategies: Classification and Pedagogical Implication

    Directory of Open Access Journals (Sweden)

    Ag. Bambang Setiyadi

    2001-01-01

    Full Text Available Many studies have been conducted to explore language learning strategies (Rubin, 1975, Naiman et . al ., 1978; Fillmore, 1979; O'Malley et . al ., 1985 and 1990; Politzer and Groarty, 1985; Prokop, 1989; Oxford, 1990; and Wenden, 1991. In the current study a total of 79 university students participating in a 3 month English course participated. This study attempted to explore what language learning strategies successful learners used and to what extent the strategies contributed to success in learning English in Indonesia . Factor analyses, accounting for 62.1 %, 56.0 %, 41.1 %, and 43.5 % of the varience of speaking, listening, reading and writing measures in the language learning strategy questionnaire, suggested that the questionnaire constituted three constructs. The three constructs were named metacognitive strategies, deep level cognitive and surface level cognitive strategies. Regression analyses, performed using scales based on these factors revealed significant main effects for the use of the language learning strategies in learning English, constituting 43 % of the varience in the posttest English achievement scores. An analysis of varience of the gain scores of the highest, middle, and the lowest groups of performers suggested a greater use of metacognitive strategies among successful learners and a greater use of surface level cognitive strategies among unsuccessful learners. Implications for the classroom and future research are also discussed.

  20. Symbol manipulation and rule learning in spiking neuronal networks.

    Science.gov (United States)

    Fernando, Chrisantha

    2011-04-21

    It has been claimed that the productivity, systematicity and compositionality of human language and thought necessitate the existence of a physical symbol system (PSS) in the brain. Recent discoveries about temporal coding suggest a novel type of neuronal implementation of a physical symbol system. Furthermore, learning classifier systems provide a plausible algorithmic basis by which symbol re-write rules could be trained to undertake behaviors exhibiting systematicity and compositionality, using a kind of natural selection of re-write rules in the brain, We show how the core operation of a learning classifier system, namely, the replication with variation of symbol re-write rules, can be implemented using spike-time dependent plasticity based supervised learning. As a whole, the aim of this paper is to integrate an algorithmic and an implementation level description of a neuronal symbol system capable of sustaining systematic and compositional behaviors. Previously proposed neuronal implementations of symbolic representations are compared with this new proposal. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. Analysis of powered two-wheeler crashes in Italy by classification trees and rules discovery.

    Science.gov (United States)

    Montella, Alfonso; Aria, Massimo; D'Ambrosio, Antonio; Mauriello, Filomena

    2012-11-01

    Aim of the study was the analysis of powered two-wheeler (PTW) crashes in Italy in order to detect interdependence as well as dissimilarities among crash characteristics and provide insights for the development of safety improvement strategies focused on PTWs. At this aim, data mining techniques were used to analyze the data relative to the 254,575 crashes involving PTWs occurred in Italy in the period 2006-2008. Classification trees analysis and rules discovery were performed. Tree-based methods are non-linear and non-parametric data mining tools for supervised classification and regression problems. They do not require a priori probabilistic knowledge about the phenomena under studying and consider conditional interactions among input data. Rules discovery is the identification of sets of items (i.e., crash patterns) that occur together in a given event (i.e., a crash in our study) more often than they would if they were independent of each other. Thus, the method can detect interdependence among crash characteristics. Due to the large number of patterns considered, both methods suffer from an extreme risk of finding patterns that appear due to chance alone. To overcome this problem, in our study we randomly split the sample data in two data sets and used well-established statistical practices to evaluate the statistical significance of the results. Both the classification trees and the rules discovery were effective in providing meaningful insights about PTW crash characteristics and their interdependencies. Even though in several cases different crash characteristics were highlighted, the results of the two the analysis methods were never contradictory. Furthermore, most of the findings of this study were consistent with the results of previous studies which used different analytical techniques, such as probabilistic models of crash injury severity. Basing on the analysis results, engineering countermeasures and policy initiatives to reduce PTW injuries and

  2. Learning Structural Classification Rules for Web-page Categorization

    NARCIS (Netherlands)

    Stuckenschmidt, Heiner; Hartmann, Jens; Van Harmelen, Frank

    2002-01-01

    Content-related metadata plays an important role in the effort of developing intelligent web applications. One of the most established form of providing content-related metadata is the assignment of web-pages to content categories. We describe the Spectacle system for classifying individual web

  3. Advanced Steel Microstructural Classification by Deep Learning Methods.

    Science.gov (United States)

    Azimi, Seyed Majid; Britz, Dominik; Engstler, Michael; Fritz, Mario; Mücklich, Frank

    2018-02-01

    The inner structure of a material is called microstructure. It stores the genesis of a material and determines all its physical and chemical properties. While microstructural characterization is widely spread and well known, the microstructural classification is mostly done manually by human experts, which gives rise to uncertainties due to subjectivity. Since the microstructure could be a combination of different phases or constituents with complex substructures its automatic classification is very challenging and only a few prior studies exist. Prior works focused on designed and engineered features by experts and classified microstructures separately from the feature extraction step. Recently, Deep Learning methods have shown strong performance in vision applications by learning the features from data together with the classification step. In this work, we propose a Deep Learning method for microstructural classification in the examples of certain microstructural constituents of low carbon steel. This novel method employs pixel-wise segmentation via Fully Convolutional Neural Network (FCNN) accompanied by a max-voting scheme. Our system achieves 93.94% classification accuracy, drastically outperforming the state-of-the-art method of 48.89% accuracy. Beyond the strong performance of our method, this line of research offers a more robust and first of all objective way for the difficult task of steel quality appreciation.

  4. Classification of Strawberry Fruit Shape by Machine Learning

    Science.gov (United States)

    Ishikawa, T.; Hayashi, A.; Nagamatsu, S.; Kyutoku, Y.; Dan, I.; Wada, T.; Oku, K.; Saeki, Y.; Uto, T.; Tanabata, T.; Isobe, S.; Kochi, N.

    2018-05-01

    Shape is one of the most important traits of agricultural products due to its relationships with the quality, quantity, and value of the products. For strawberries, the nine types of fruit shape were defined and classified by humans based on the sampler patterns of the nine types. In this study, we tested the classification of strawberry shapes by machine learning in order to increase the accuracy of the classification, and we introduce the concept of computerization into this field. Four types of descriptors were extracted from the digital images of strawberries: (1) the Measured Values (MVs) including the length of the contour line, the area, the fruit length and width, and the fruit width/length ratio; (2) the Ellipse Similarity Index (ESI); (3) Elliptic Fourier Descriptors (EFDs), and (4) Chain Code Subtraction (CCS). We used these descriptors for the classification test along with the random forest approach, and eight of the nine shape types were classified with combinations of MVs + CCS + EFDs. CCS is a descriptor that adds human knowledge to the chain codes, and it showed higher robustness in classification than the other descriptors. Our results suggest machine learning's high ability to classify fruit shapes accurately. We will attempt to increase the classification accuracy and apply the machine learning methods to other plant species.

  5. Classification of Incomplete Data Based on Evidence Theory and an Extreme Learning Machine in Wireless Sensor Networks.

    Science.gov (United States)

    Zhang, Yang; Liu, Yun; Chao, Han-Chieh; Zhang, Zhenjiang; Zhang, Zhiyuan

    2018-03-30

    In wireless sensor networks, the classification of incomplete data reported by sensor nodes is an open issue because it is difficult to accurately estimate the missing values. In many cases, the misclassification is unacceptable considering that it probably brings catastrophic damages to the data users. In this paper, a novel classification approach of incomplete data is proposed to reduce the misclassification errors. This method uses the regularized extreme learning machine to estimate the potential values of missing data at first, and then it converts the estimations into multiple classification results on the basis of the distance between interval numbers. Finally, an evidential reasoning rule is adopted to fuse these classification results. The final decision is made according to the combined basic belief assignment. The experimental results show that this method has better performance than other traditional classification methods of incomplete data.

  6. Phenotype classification of zebrafish embryos by supervised learning.

    Directory of Open Access Journals (Sweden)

    Nathalie Jeanray

    Full Text Available Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3 days old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  7. Nonlinear programming for classification problems in machine learning

    Science.gov (United States)

    Astorino, Annabella; Fuduli, Antonio; Gaudioso, Manlio

    2016-10-01

    We survey some nonlinear models for classification problems arising in machine learning. In the last years this field has become more and more relevant due to a lot of practical applications, such as text and web classification, object recognition in machine vision, gene expression profile analysis, DNA and protein analysis, medical diagnosis, customer profiling etc. Classification deals with separation of sets by means of appropriate separation surfaces, which is generally obtained by solving a numerical optimization model. While linear separability is the basis of the most popular approach to classification, the Support Vector Machine (SVM), in the recent years using nonlinear separating surfaces has received some attention. The objective of this work is to recall some of such proposals, mainly in terms of the numerical optimization models. In particular we tackle the polyhedral, ellipsoidal, spherical and conical separation approaches and, for some of them, we also consider the semisupervised versions.

  8. Rule learning in autism: the role of reward type and social context.

    Science.gov (United States)

    Jones, E J H; Webb, S J; Estes, A; Dawson, G

    2013-01-01

    Learning abstract rules is central to social and cognitive development. Across two experiments, we used Delayed Non-Matching to Sample tasks to characterize the longitudinal development and nature of rule-learning impairments in children with Autism Spectrum Disorder (ASD). Results showed that children with ASD consistently experienced more difficulty learning an abstract rule from a discrete physical reward than children with DD. Rule learning was facilitated by the provision of more concrete reinforcement, suggesting an underlying difficulty in forming conceptual connections. Learning abstract rules about social stimuli remained challenging through late childhood, indicating the importance of testing executive functions in both social and non-social contexts.

  9. Efficient HIK SVM learning for image classification.

    Science.gov (United States)

    Wu, Jianxin

    2012-10-01

    Histograms are used in almost every aspect of image processing and computer vision, from visual descriptors to image representations. Histogram intersection kernel (HIK) and support vector machine (SVM) classifiers are shown to be very effective in dealing with histograms. This paper presents contributions concerning HIK SVM for image classification. First, we propose intersection coordinate descent (ICD), a deterministic and scalable HIK SVM solver. ICD is much faster than, and has similar accuracies to, general purpose SVM solvers and other fast HIK SVM training methods. We also extend ICD to the efficient training of a broader family of kernels. Second, we show an important empirical observation that ICD is not sensitive to the C parameter in SVM, and we provide some theoretical analyses to explain this observation. ICD achieves high accuracies in many problems, using its default parameters. This is an attractive property for practitioners, because many image processing tasks are too large to choose SVM parameters using cross-validation.

  10. Cooperative Learning for Distributed In-Network Traffic Classification

    Science.gov (United States)

    Joseph, S. B.; Loo, H. R.; Ismail, I.; Andromeda, T.; Marsono, M. N.

    2017-04-01

    Inspired by the concept of autonomic distributed/decentralized network management schemes, we consider the issue of information exchange among distributed network nodes to network performance and promote scalability for in-network monitoring. In this paper, we propose a cooperative learning algorithm for propagation and synchronization of network information among autonomic distributed network nodes for online traffic classification. The results show that network nodes with sharing capability perform better with a higher average accuracy of 89.21% (sharing data) and 88.37% (sharing clusters) compared to 88.06% for nodes without cooperative learning capability. The overall performance indicates that cooperative learning is promising for distributed in-network traffic classification.

  11. Unsupervised Feature Learning for Heart Sounds Classification Using Autoencoder

    Science.gov (United States)

    Hu, Wei; Lv, Jiancheng; Liu, Dongbo; Chen, Yao

    2018-04-01

    Cardiovascular disease seriously threatens the health of many people. It is usually diagnosed during cardiac auscultation, which is a fast and efficient method of cardiovascular disease diagnosis. In recent years, deep learning approach using unsupervised learning has made significant breakthroughs in many fields. However, to our knowledge, deep learning has not yet been used for heart sound classification. In this paper, we first use the average Shannon energy to extract the envelope of the heart sounds, then find the highest point of S1 to extract the cardiac cycle. We convert the time-domain signals of the cardiac cycle into spectrograms and apply principal component analysis whitening to reduce the dimensionality of the spectrogram. Finally, we apply a two-layer autoencoder to extract the features of the spectrogram. The experimental results demonstrate that the features from the autoencoder are suitable for heart sound classification.

  12. AGE GROUP CLASSIFICATION USING MACHINE LEARNING TECHNIQUES

    OpenAIRE

    Arshdeep Singh Syal*1 & Abhinav Gupta2

    2017-01-01

    A human face provides a lot of information that allows another person to identify characteristics such as age, sex, etc. Therefore, the challenge is to develop an age group prediction system using the automatic learning method. The task of estimating the age group of the human from their frontal facial images is very captivating, but also challenging because of the pattern of personalized and non-linear aging that differs from one person to another. This paper examines the problem of predicti...

  13. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2013-01-01

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using deep belief network (DBN). DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer by layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with testing sensory dataset. Health diagnosis using DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications: aircraft engine health diagnosis and electric power transformer health diagnosis are employed to demonstrate the efficacy of the proposed approach

  14. An Efficient Ensemble Learning Method for Gene Microarray Classification

    Directory of Open Access Journals (Sweden)

    Alireza Osareh

    2013-01-01

    Full Text Available The gene microarray analysis and classification have demonstrated an effective way for the effective diagnosis of diseases and cancers. However, it has been also revealed that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using RotBoost ensemble methodology. This method is a combination of Rotation Forest and AdaBoost techniques which in turn preserve both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of the RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by the conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.

  15. Recommendation System Based On Association Rules For Distributed E-Learning Management Systems

    Science.gov (United States)

    Mihai, Gabroveanu

    2015-09-01

    Traditional Learning Management Systems are installed on a single server where learning materials and user data are kept. To increase its performance, the Learning Management System can be installed on multiple servers; learning materials and user data could be distributed across these servers obtaining a Distributed Learning Management System. In this paper is proposed the prototype of a recommendation system based on association rules for Distributed Learning Management System. Information from LMS databases is analyzed using distributed data mining algorithms in order to extract the association rules. Then the extracted rules are used as inference rules to provide personalized recommendations. The quality of provided recommendations is improved because the rules used to make the inferences are more accurate, since these rules aggregate knowledge from all e-Learning systems included in Distributed Learning Management System.

  16. Ichthyoplankton Classification Tool using Generative Adversarial Networks and Transfer Learning

    KAUST Repository

    Aljaafari, Nura

    2018-04-15

    The study and the analysis of marine ecosystems is a significant part of the marine science research. These systems are valuable resources for fisheries, improving water quality and can even be used in drugs production. The investigation of ichthyoplankton inhabiting these ecosystems is also an important research field. Ichthyoplankton are fish in their early stages of life. In this stage, the fish have relatively similar shape and are small in size. The currently used way of identifying them is not optimal. Marine scientists typically study such organisms by sending a team that collects samples from the sea which is then taken to the lab for further investigation. These samples need to be studied by an expert and usually end needing a DNA sequencing. This method is time-consuming and requires a high level of experience. The recent advances in AI have helped to solve and automate several difficult tasks which motivated us to develop a classification tool for ichthyoplankton. We show that using machine learning techniques, such as generative adversarial networks combined with transfer learning solves such a problem with high accuracy. We show that using traditional machine learning algorithms fails to solve it. We also give a general framework for creating a classification tool when the dataset used for training is a limited dataset. We aim to build a user-friendly tool that can be used by any user for the classification task and we aim to give a guide to the researchers so that they can follow in creating a classification tool.

  17. Learning-induced pattern classification in a chaotic neural network

    International Nuclear Information System (INIS)

    Li, Yang; Zhu, Ping; Xie, Xiaoping; He, Guoguang; Aihara, Kazuyuki

    2012-01-01

    In this Letter, we propose a Hebbian learning rule with passive forgetting (HLRPF) for use in a chaotic neural network (CNN). We then define the indices based on the Euclidean distance to investigate the evolution of the weights in a simplified way. Numerical simulations demonstrate that, under suitable external stimulations, the CNN with the proposed HLRPF acts as a fuzzy-like pattern classifier that performs much better than an ordinary CNN. The results imply relationship between learning and recognition. -- Highlights: ► Proposing a Hebbian learning rule with passive forgetting (HLRPF). ► Defining indices to investigate the evolution of the weights simply. ► The chaotic neural network with HLRPF acts as a fuzzy-like pattern classifier. ► The pattern classifier ability of the network is improved much.

  18. Deep learning for EEG-Based preference classification

    Science.gov (United States)

    Teo, Jason; Hou, Chew Lin; Mountstephens, James

    2017-10-01

    Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively identify yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored areas of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically-induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as in computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches which included deep neural networks, we then attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task particularly in a highly challenging dataset with large inter- and intra-subject variability.

  19. Finding Influential Users in Social Media Using Association Rule Learning

    Directory of Open Access Journals (Sweden)

    Fredrik Erlandsson

    2016-04-01

    Full Text Available Influential users play an important role in online social networks since users tend to have an impact on one other. Therefore, the proposed work analyzes users and their behavior in order to identify influential users and predict user participation. Normally, the success of a social media site is dependent on the activity level of the participating users. For both online social networking sites and individual users, it is of interest to find out if a topic will be interesting or not. In this article, we propose association learning to detect relationships between users. In order to verify the findings, several experiments were executed based on social network analysis, in which the most influential users identified from association rule learning were compared to the results from Degree Centrality and Page Rank Centrality. The results clearly indicate that it is possible to identify the most influential users using association rule learning. In addition, the results also indicate a lower execution time compared to state-of-the-art methods.

  20. Learning Word Embeddings with Chi-Square Weights for Healthcare Tweet Classification

    Directory of Open Access Journals (Sweden)

    Sicong Kuang

    2017-08-01

    Full Text Available Twitter is a popular source for the monitoring of healthcare information and public disease. However, there exists much noise in the tweets. Even though appropriate keywords appear in the tweets, they do not guarantee the identification of a truly health-related tweet. Thus, the traditional keyword-based classification task is largely ineffective. Algorithms for word embeddings have proved to be useful in many natural language processing (NLP tasks. We introduce two algorithms based on an existing word embedding learning algorithm: the continuous bag-of-words model (CBOW. We apply the proposed algorithms to the task of recognizing healthcare-related tweets. In the CBOW model, the vector representation of words is learned from their contexts. To simplify the computation, the context is represented by an average of all words inside the context window. However, not all words in the context window contribute equally to the prediction of the target word. Greedily incorporating all the words in the context window will largely limit the contribution of the useful semantic words and bring noisy or irrelevant words into the learning process, while existing word embedding algorithms also try to learn a weighted CBOW model. Their weights are based on existing pre-defined syntactic rules while ignoring the task of the learned embedding. We propose learning weights based on the words’ relative importance in the classification task. Our intuition is that such learned weights place more emphasis on words that have comparatively more to contribute to the later task. We evaluate the embeddings learned from our algorithms on two healthcare-related datasets. The experimental results demonstrate that embeddings learned from the proposed algorithms outperform existing techniques by a relative accuracy improvement of over 9%.

  1. Comparative analysis of expert and machine-learning methods for classification of body cavity effusions in companion animals.

    Science.gov (United States)

    Hotz, Christine S; Templeton, Steven J; Christopher, Mary M

    2005-03-01

    A rule-based expert system using CLIPS programming language was created to classify body cavity effusions as transudates, modified transudates, exudates, chylous, and hemorrhagic effusions. The diagnostic accuracy of the rule-based system was compared with that produced by 2 machine-learning methods: Rosetta, a rough sets algorithm and RIPPER, a rule-induction method. Results of 508 body cavity fluid analyses (canine, feline, equine) obtained from the University of California-Davis Veterinary Medical Teaching Hospital computerized patient database were used to test CLIPS and to test and train RIPPER and Rosetta. The CLIPS system, using 17 rules, achieved an accuracy of 93.5% compared with pathologist consensus diagnoses. Rosetta accurately classified 91% of effusions by using 5,479 rules. RIPPER achieved the greatest accuracy (95.5%) using only 10 rules. When the original rules of the CLIPS application were replaced with those of RIPPER, the accuracy rates were identical. These results suggest that both rule-based expert systems and machine-learning methods hold promise for the preliminary classification of body fluids in the clinical laboratory.

  2. Random ensemble learning for EEG classification.

    Science.gov (United States)

    Hosseini, Mohammad-Parsa; Pompili, Dario; Elisevich, Kost; Soltanian-Zadeh, Hamid

    2018-01-01

    Real-time detection of seizure activity in epilepsy patients is critical in averting seizure activity and improving patients' quality of life. Accurate evaluation, presurgical assessment, seizure prevention, and emergency alerts all depend on the rapid detection of seizure onset. A new method of feature selection and classification for rapid and precise seizure detection is discussed wherein informative components of electroencephalogram (EEG)-derived data are extracted and an automatic method is presented using infinite independent component analysis (I-ICA) to select independent features. The feature space is divided into subspaces via random selection and multichannel support vector machines (SVMs) are used to classify these subspaces. The result of each classifier is then combined by majority voting to establish the final output. In addition, a random subspace ensemble using a combination of SVM, multilayer perceptron (MLP) neural network and an extended k-nearest neighbors (k-NN), called extended nearest neighbor (ENN), is developed for the EEG and electrocorticography (ECoG) big data problem. To evaluate the solution, a benchmark ECoG of eight patients with temporal and extratemporal epilepsy was implemented in a distributed computing framework as a multitier cloud-computing architecture. Using leave-one-out cross-validation, the accuracy, sensitivity, specificity, and both false positive and false negative ratios of the proposed method were found to be 0.97, 0.98, 0.96, 0.04, and 0.02, respectively. Application of the solution to cases under investigation with ECoG has also been effected to demonstrate its utility. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. An Expert System for Diagnosis of Sleep Disorder Using Fuzzy Rule-Based Classification Systems

    Science.gov (United States)

    Septem Riza, Lala; Pradini, Mila; Fitrajaya Rahman, Eka; Rasim

    2017-03-01

    Sleep disorder is an anomaly that could cause problems for someone’ sleeping pattern. Nowadays, it becomes an issue since people are getting busy with their own business and have no time to visit the doctors. Therefore, this research aims to develop a system used for diagnosis of sleep disorder using Fuzzy Rule-Based Classification System (FRBCS). FRBCS is a method based on the fuzzy set concepts. It consists of two steps: (i) constructing a model/knowledge involving rulebase and database, and (ii) prediction over new data. In this case, the knowledge is obtained from experts whereas in the prediction stage, we perform fuzzification, inference, and classification. Then, a platform implementing the method is built with a combination between PHP and the R programming language using the “Shiny” package. To validate the system that has been made, some experiments have been done using data from a psychiatric hospital in West Java, Indonesia. Accuracy of the result and computation time are 84.85% and 0.0133 seconds, respectively.

  4. A hybrid ensemble learning approach to star-galaxy classification

    Science.gov (United States)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2 ), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.

  5. Convolutional neural network with transfer learning for rice type classification

    Science.gov (United States)

    Patel, Vaibhav Amit; Joshi, Manjunath V.

    2018-04-01

    Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine which makes it faster with greater accuracy. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, while using transfer learning in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2- class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data boosts classification accuracy significantly.

  6. Manifold regularized multitask feature learning for multimodality disease classification.

    Science.gov (United States)

    Jie, Biao; Zhang, Daoqiang; Cheng, Bo; Shen, Dinggang

    2015-02-01

    Multimodality based methods have shown great advantages in classification of Alzheimer's disease (AD) and its prodromal stage, that is, mild cognitive impairment (MCI). Recently, multitask feature selection methods are typically used for joint selection of common features across multiple modalities. However, one disadvantage of existing multimodality based methods is that they ignore the useful data distribution information in each modality, which is essential for subsequent classification. Accordingly, in this paper we propose a manifold regularized multitask feature learning method to preserve both the intrinsic relatedness among multiple modalities of data and the data distribution information in each modality. Specifically, we denote the feature learning on each modality as a single task, and use group-sparsity regularizer to capture the intrinsic relatedness among multiple tasks (i.e., modalities) and jointly select the common features from multiple tasks. Furthermore, we introduce a new manifold-based Laplacian regularizer to preserve the data distribution information from each task. Finally, we use the multikernel support vector machine method to fuse multimodality data for eventual classification. Conversely, we also extend our method to the semisupervised setting, where only partial data are labeled. We evaluate our method using the baseline magnetic resonance imaging (MRI), fluorodeoxyglucose positron emission tomography (FDG-PET), and cerebrospinal fluid (CSF) data of subjects from AD neuroimaging initiative database. The experimental results demonstrate that our proposed method can not only achieve improved classification performance, but also help to discover the disease-related brain regions useful for disease diagnosis. © 2014 Wiley Periodicals, Inc.

  7. Machine-learning methods in the classification of water bodies

    Directory of Open Access Journals (Sweden)

    Sołtysiak Marek

    2016-06-01

    Full Text Available Amphibian species have been considered as useful ecological indicators. They are used as indicators of environmental contamination, ecosystem health and habitat quality., Amphibian species are sensitive to changes in the aquatic environment and therefore, may form the basis for the classification of water bodies. Water bodies in which there are a large number of amphibian species are especially valuable even if they are located in urban areas. The automation of the classification process allows for a faster evaluation of the presence of amphibian species in the water bodies. Three machine-learning methods (artificial neural networks, decision trees and the k-nearest neighbours algorithm have been used to classify water bodies in Chorzów – one of 19 cities in the Upper Silesia Agglomeration. In this case, classification is a supervised data mining method consisting of several stages such as building the model, the testing phase and the prediction. Seven natural and anthropogenic features of water bodies (e.g. the type of water body, aquatic plants, the purpose of the water body (destination, position of the water body in relation to any possible buildings, condition of the water body, the degree of littering, the shore type and fishing activities have been taken into account in the classification. The data set used in this study involved information about 71 different water bodies and 9 amphibian species living in them. The results showed that the best average classification accuracy was obtained with the multilayer perceptron neural network.

  8. Learning Supervised Topic Models for Classification and Regression from Crowds

    DEFF Research Database (Denmark)

    Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete

    2017-01-01

    problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages...... annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression...

  9. Genetic learning in rule-based and neural systems

    Science.gov (United States)

    Smith, Robert E.

    1993-01-01

    The design of neural networks and fuzzy systems can involve complex, nonlinear, and ill-conditioned optimization problems. Often, traditional optimization schemes are inadequate or inapplicable for such tasks. Genetic Algorithms (GA's) are a class of optimization procedures whose mechanics are based on those of natural genetics. Mathematical arguments show how GAs bring substantial computational leverage to search problems, without requiring the mathematical characteristics often necessary for traditional optimization schemes (e.g., modality, continuity, availability of derivative information, etc.). GA's have proven effective in a variety of search tasks that arise in neural networks and fuzzy systems. This presentation begins by introducing the mechanism and theoretical underpinnings of GA's. GA's are then related to a class of rule-based machine learning systems called learning classifier systems (LCS's). An LCS implements a low-level production-system that uses a GA as its primary rule discovery mechanism. This presentation illustrates how, despite its rule-based framework, an LCS can be thought of as a competitive neural network. Neural network simulator code for an LCS is presented. In this context, the GA is doing more than optimizing and objective function. It is searching for an ecology of hidden nodes with limited connectivity. The GA attempts to evolve this ecology such that effective neural network performance results. The GA is particularly well adapted to this task, given its naturally-inspired basis. The LCS/neural network analogy extends itself to other, more traditional neural networks. Conclusions to the presentation discuss the implications of using GA's in ecological search problems that arise in neural and fuzzy systems.

  10. Fifty years of computer analysis in chest imaging: rule-based, machine learning, deep learning.

    Science.gov (United States)

    van Ginneken, Bram

    2017-03-01

    Half a century ago, the term "computer-aided diagnosis" (CAD) was introduced in the scientific literature. Pulmonary imaging, with chest radiography and computed tomography, has always been one of the focus areas in this field. In this study, I describe how machine learning became the dominant technology for tackling CAD in the lungs, generally producing better results than do classical rule-based approaches, and how the field is now rapidly changing: in the last few years, we have seen how even better results can be obtained with deep learning. The key differences among rule-based processing, machine learning, and deep learning are summarized and illustrated for various applications of CAD in the chest.

  11. A deep learning pipeline for Indian dance style classification

    Science.gov (United States)

    Dewan, Swati; Agarwal, Shubham; Singh, Navjyoti

    2018-04-01

    In this paper, we address the problem of dance style classification to classify Indian dance or any dance in general. We propose a 3-step deep learning pipeline. First, we extract 14 essential joint locations of the dancer from each video frame, this helps us to derive any body region location within the frame, we use this in the second step which forms the main part of our pipeline. Here, we divide the dancer into regions of important motion in each video frame. We then extract patches centered at these regions. Main discriminative motion is captured in these patches. We stack the features from all such patches of a frame into a single vector and form our hierarchical dance pose descriptor. Finally, in the third step, we build a high level representation of the dance video using the hierarchical descriptors and train it using a Recurrent Neural Network (RNN) for classification. Our novelty also lies in the way we use multiple representations for a single video. This helps us to: (1) Overcome the RNN limitation of learning small sequences over big sequences such as dance; (2) Extract more data from the available dataset for effective deep learning by training multiple representations. Our contributions in this paper are three-folds: (1) We provide a deep learning pipeline for classification of any form of dance; (2) We prove that a segmented representation of a dance video works well with sequence learning techniques for recognition purposes; (3) We extend and refine the ICD dataset and provide a new dataset for evaluation of dance. Our model performs comparable or better in some cases than the state-of-the-art on action recognition benchmarks.

  12. Rule Extraction Based on Extreme Learning Machine and an Improved Ant-Miner Algorithm for Transient Stability Assessment.

    Science.gov (United States)

    Li, Yang; Li, Guoqing; Wang, Zhenhao

    2015-01-01

    In order to overcome the problems of poor understandability of the pattern recognition-based transient stability assessment (PRTSA) methods, a new rule extraction method based on extreme learning machine (ELM) and an improved Ant-miner (IAM) algorithm is presented in this paper. First, the basic principles of ELM and Ant-miner algorithm are respectively introduced. Then, based on the selected optimal feature subset, an example sample set is generated by the trained ELM-based PRTSA model. And finally, a set of classification rules are obtained by IAM algorithm to replace the original ELM network. The novelty of this proposal is that transient stability rules are extracted from an example sample set generated by the trained ELM-based transient stability assessment model by using IAM algorithm. The effectiveness of the proposed method is shown by the application results on the New England 39-bus power system and a practical power system--the southern power system of Hebei province.

  13. Quantum learning: asymptotically optimal classification of qubit states

    International Nuclear Information System (INIS)

    Guta, Madalin; Kotlowski, Wojciech

    2010-01-01

    Pattern recognition is a central topic in learning theory, with numerous applications such as voice and text recognition, image analysis and computer diagnosis. The statistical setup in classification is the following: we are given an i.i.d. training set (X 1 , Y 1 ), ... , (X n , Y n ), where X i represents a feature and Y i in{0, 1} is a label attached to that feature. The underlying joint distribution of (X, Y) is unknown, but we can learn about it from the training set, and we aim at devising low error classifiers f: X→Y used to predict the label of new incoming features. In this paper, we solve a quantum analogue of this problem, namely the classification of two arbitrary unknown mixed qubit states. Given a number of 'training' copies from each of the states, we would like to 'learn' about them by performing a measurement on the training set. The outcome is then used to design measurements for the classification of future systems with unknown labels. We found the asymptotically optimal classification strategy and show that typically it performs strictly better than a plug-in strategy, which consists of estimating the states separately and then discriminating between them using the Helstrom measurement. The figure of merit is given by the excess risk equal to the difference between the probability of error and the probability of error of the optimal measurement for known states. We show that the excess risk scales as n -1 and compute the exact constant of the rate.

  14. Fast Low-Rank Shared Dictionary Learning for Image Classification.

    Science.gov (United States)

    Tiep Huu Vu; Monga, Vishal

    2017-11-01

    Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework by separating the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e., claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Furthermore, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. The said algorithms could also be applied to FDDL and its extensions. The efficiencies of these algorithms are theoretically and experimentally verified by comparing their complexities and running time with those of other well-known dictionary learning methods. Experimental results on widely used image data sets establish the advantages of our method over the state-of-the-art dictionary learning methods.

  15. Machine Learning Based Localization and Classification with Atomic Magnetometers

    Science.gov (United States)

    Deans, Cameron; Griffin, Lewis D.; Marmugi, Luca; Renzoni, Ferruccio

    2018-01-01

    We demonstrate identification of position, material, orientation, and shape of objects imaged by a Rb 85 atomic magnetometer performing electromagnetic induction imaging supported by machine learning. Machine learning maximizes the information extracted from the images created by the magnetometer, demonstrating the use of hidden data. Localization 2.6 times better than the spatial resolution of the imaging system and successful classification up to 97% are obtained. This circumvents the need of solving the inverse problem and demonstrates the extension of machine learning to diffusive systems, such as low-frequency electrodynamics in media. Automated collection of task-relevant information from quantum-based electromagnetic imaging will have a relevant impact from biomedicine to security.

  16. A dictionary learning approach for human sperm heads classification.

    Science.gov (United States)

    Shaker, Fariba; Monadjemi, S Amirhassan; Alirezaie, Javad; Naghsh-Nilchi, Ahmad Reza

    2017-12-01

    To diagnose infertility in men, semen analysis is conducted in which sperm morphology is one of the factors that are evaluated. Since manual assessment of sperm morphology is time-consuming and subjective, automatic classification methods are being developed. Automatic classification of sperm heads is a complicated task due to the intra-class differences and inter-class similarities of class objects. In this research, a Dictionary Learning (DL) technique is utilized to construct a dictionary of sperm head shapes. This dictionary is used to classify the sperm heads into four different classes. Square patches are extracted from the sperm head images. Columnized patches from each class of sperm are used to learn class-specific dictionaries. The patches from a test image are reconstructed using each class-specific dictionary and the overall reconstruction error for each class is used to select the best matching class. Average accuracy, precision, recall, and F-score are used to evaluate the classification method. The method is evaluated using two publicly available datasets of human sperm head shapes. The proposed DL based method achieved an average accuracy of 92.2% on the HuSHeM dataset, and an average recall of 62% on the SCIAN-MorphoSpermGS dataset. The results show a significant improvement compared to a previously published shape-feature-based method. We have achieved high-performance results. In addition, our proposed approach offers a more balanced classifier in which all four classes are recognized with high precision and recall. In this paper, we use a Dictionary Learning approach in classifying human sperm heads. It is shown that the Dictionary Learning method is far more effective in classifying human sperm heads than classifiers using shape-based features. Also, a dataset of human sperm head shapes is introduced to facilitate future research. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Nonparametric, Coupled ,Bayesian ,Dictionary ,and Classifier Learning for Hyperspectral Classification.

    Science.gov (United States)

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  18. Machine Learning Classification of Buildings for Map Generalization

    Directory of Open Access Journals (Sweden)

    Jaeeun Lee

    2017-10-01

    Full Text Available A critical problem in mapping data is the frequent updating of large data sets. To solve this problem, the updating of small-scale data based on large-scale data is very effective. Various map generalization techniques, such as simplification, displacement, typification, elimination, and aggregation, must therefore be applied. In this study, we focused on the elimination and aggregation of the building layer, for which each building in a large scale was classified as “0-eliminated,” “1-retained,” or “2-aggregated.” Machine-learning classification algorithms were then used for classifying the buildings. The data of 1:1000 scale and 1:25,000 scale digital maps obtained from the National Geographic Information Institute were used. We applied to these data various machine-learning classification algorithms, including naive Bayes (NB, decision tree (DT, k-nearest neighbor (k-NN, and support vector machine (SVM. The overall accuracies of each algorithm were satisfactory: DT, 88.96%; k-NN, 88.27%; SVM, 87.57%; and NB, 79.50%. Although elimination is a direct part of the proposed process, generalization operations, such as simplification and aggregation of polygons, must still be performed for buildings classified as retained and aggregated. Thus, these algorithms can be used for building classification and can serve as preparatory steps for building generalization.

  19. Optimizing Multiple Kernel Learning for the Classification of UAV Data

    Directory of Open Access Journals (Sweden)

    Caroline M. Gevaert

    2016-12-01

    Full Text Available Unmanned Aerial Vehicles (UAVs are capable of providing high-quality orthoimagery and 3D information in the form of point clouds at a relatively low cost. Their increasing popularity stresses the necessity of understanding which algorithms are especially suited for processing the data obtained from UAVs. The features that are extracted from the point cloud and imagery have different statistical characteristics and can be considered as heterogeneous, which motivates the use of Multiple Kernel Learning (MKL for classification problems. In this paper, we illustrate the utility of applying MKL for the classification of heterogeneous features obtained from UAV data through a case study of an informal settlement in Kigali, Rwanda. Results indicate that MKL can achieve a classification accuracy of 90.6%, a 5.2% increase over a standard single-kernel Support Vector Machine (SVM. A comparison of seven MKL methods indicates that linearly-weighted kernel combinations based on simple heuristics are competitive with respect to computationally-complex, non-linear kernel combination methods. We further underline the importance of utilizing appropriate feature grouping strategies for MKL, which has not been directly addressed in the literature, and we propose a novel, automated feature grouping method that achieves a high classification accuracy for various MKL methods.

  20. Semi-Supervised Learning for Classification of Protein Sequence Data

    Directory of Open Access Journals (Sweden)

    Brian R. King

    2008-01-01

    Full Text Available Protein sequence data continue to become available at an exponential rate. Annotation of functional and structural attributes of these data lags far behind, with only a small fraction of the data understood and labeled by experimental methods. Classification methods that are based on semi-supervised learning can increase the overall accuracy of classifying partly labeled data in many domains, but very few methods exist that have shown their effect on protein sequence classification. We show how proven methods from text classification can be applied to protein sequence data, as we consider both existing and novel extensions to the basic methods, and demonstrate restrictions and differences that must be considered. We demonstrate comparative results against the transductive support vector machine, and show superior results on the most difficult classification problems. Our results show that large repositories of unlabeled protein sequence data can indeed be used to improve predictive performance, particularly in situations where there are fewer labeled protein sequences available, and/or the data are highly unbalanced in nature.

  1. Quality prediction modeling for multistage manufacturing based on classification and association rule mining

    Directory of Open Access Journals (Sweden)

    Kao Hung-An

    2017-01-01

    Full Text Available For manufacturing enterprises, product quality is a key factor to assess production capability and increase their core competence. To reduce external failure cost, many research and methodology have been introduced in order to improve process yield rate, such as TQC/TQM, Shewhart CycleDeming's 14 Points, etc. Nowadays, impressive progress has been made in process monitoring and industrial data analysis because of the Industry 4.0 trend. Industries start to utilize quality control (QC methodology to lower inspection overhead and internal failure cost. Currently, the focus of QC is mostly in the inspection of single workstation and final product, however, for multistage manufacturing, many factors (like equipment, operators, parameters, etc. can have cumulative and interactive effects to the final quality. When failure occurs, it is difficult to resume the original settings for cause analysis. To address these problems, this research proposes a combination of principal components analysis (PCA with classification and association rule mining algorithms to extract features representing relationship of multiple workstations, predict final product quality, and analyze the root-cause of product defect. The method is demonstrated on a semiconductor data set.

  2. Representation learning with deep extreme learning machines for efficient image set classification

    KAUST Repository

    Uzair, Muhammad

    2016-12-09

    Efficient and accurate representation of a collection of images, that belong to the same class, is a major research challenge for practical image set classification. Existing methods either make prior assumptions about the data structure, or perform heavy computations to learn structure from the data itself. In this paper, we propose an efficient image set representation that does not make any prior assumptions about the structure of the underlying data. We learn the nonlinear structure of image sets with deep extreme learning machines that are very efficient and generalize well even on a limited number of training samples. Extensive experiments on a broad range of public datasets for image set classification show that the proposed algorithm consistently outperforms state-of-the-art image set classification methods both in terms of speed and accuracy.

  3. Representation learning with deep extreme learning machines for efficient image set classification

    KAUST Repository

    Uzair, Muhammad; Shafait, Faisal; Ghanem, Bernard; Mian, Ajmal

    2016-01-01

    Efficient and accurate representation of a collection of images, that belong to the same class, is a major research challenge for practical image set classification. Existing methods either make prior assumptions about the data structure, or perform heavy computations to learn structure from the data itself. In this paper, we propose an efficient image set representation that does not make any prior assumptions about the structure of the underlying data. We learn the nonlinear structure of image sets with deep extreme learning machines that are very efficient and generalize well even on a limited number of training samples. Extensive experiments on a broad range of public datasets for image set classification show that the proposed algorithm consistently outperforms state-of-the-art image set classification methods both in terms of speed and accuracy.

  4. Learning semantic histopathological representation for basal cell carcinoma classification

    Science.gov (United States)

    Gutiérrez, Ricardo; Rueda, Andrea; Romero, Eduardo

    2013-03-01

    Diagnosis of a histopathology glass slide is a complex process that involves accurate recognition of several structures, their function in the tissue and their relation with other structures. The way in which the pathologist represents the image content and the relations between those objects yields a better and accurate diagnoses. Therefore, an appropriate semantic representation of the image content will be useful in several analysis tasks such as cancer classification, tissue retrieval and histopahological image analysis, among others. Nevertheless, to automatically recognize those structures and extract their inner semantic meaning are still very challenging tasks. In this paper we introduce a new semantic representation that allows to describe histopathological concepts suitable for classification. The approach herein identify local concepts using a dictionary learning approach, i.e., the algorithm learns the most representative atoms from a set of random sampled patches, and then models the spatial relations among them by counting the co-occurrence between atoms, while penalizing the spatial distance. The proposed approach was compared with a bag-of-features representation in a tissue classification task. For this purpose, 240 histological microscopical fields of view, 24 per tissue class, were collected. Those images fed a Support Vector Machine classifier per class, using 120 images as train set and the remaining ones for testing, maintaining the same proportion of each concept in the train and test sets. The obtained classification results, averaged from 100 random partitions of training and test sets, shows that our approach is more sensitive in average than the bag-of-features representation in almost 6%.

  5. Solar wind classification from a machine learning perspective

    Science.gov (United States)

    Heidrich-Meisner, V.; Wimmer-Schweingruber, R. F.

    2017-12-01

    It is a very well known fact that the ubiquitous solar wind comes in at least two varieties, the slow solar wind and the coronal hole wind. The simplified view of two solar wind types has been frequently challenged. Existing solar wind categorization schemes rely mainly on different combinations of the solar wind proton speed, the O and C charge state ratios, the Alfvén speed, the expected proton temperature and the specific proton entropy. In available solar wind classification schemes, solar wind from stream interaction regimes is often considered either as coronal hole wind or slow solar wind, although their plasma properties are different compared to "pure" coronal hole or slow solar wind. As shown in Neugebauer et al. (2016), even if only two solar wind types are assumed, available solar wind categorization schemes differ considerably for intermediate solar wind speeds. Thus, the decision boundary between the coronal hole and the slow solar wind is so far not well defined.In this situation, a machine learning approach to solar wind classification can provide an additional perspective.We apply a well-known machine learning method, k-means, to the task of solar wind classification in order to answer the following questions: (1) How many solar wind types can reliably be identified in our data set comprised of ten years of solar wind observations from the Advanced Composition Explorer (ACE)? (2) Which combinations of solar wind parameters are particularly useful for solar wind classification?Potential subtypes of slow solar wind are of particular interest because they can provide hints of respective different source regions or release mechanisms of slow solar wind.

  6. Supervised Learning Applied to Air Traffic Trajectory Classification

    Science.gov (United States)

    Bosson, Christabelle; Nikoleris, Tasos

    2018-01-01

    Given the recent increase of interest in introducing new vehicle types and missions into the National Airspace System, a transition towards a more autonomous air traffic control system is required in order to enable and handle increased density and complexity. This paper presents an exploratory effort of the needed autonomous capabilities by exploring supervised learning techniques in the context of aircraft trajectories. In particular, it focuses on the application of machine learning algorithms and neural network models to a runway recognition trajectory-classification study. It investigates the applicability and effectiveness of various classifiers using datasets containing trajectory records for a month of air traffic. A feature importance and sensitivity analysis are conducted to challenge the chosen time-based datasets and the ten selected features. The study demonstrates that classification accuracy levels of 90% and above can be reached in less than 40 seconds of training for most machine learning classifiers when one track data point, described by the ten selected features at a particular time step, per trajectory is used as input. It also shows that neural network models can achieve similar accuracy levels but at higher training time costs.

  7. Robot Grasp Learning by Demonstration without Predefined Rules

    Directory of Open Access Journals (Sweden)

    César Fernández

    2011-12-01

    Full Text Available A learning-based approach to autonomous robot grasping is presented. Pattern recognition techniques are used to measure the similarity between a set of previously stored example grasps and all the possible candidate grasps for a new object. Two sets of features are defined in order to characterize grasps: point attributes describe the surroundings of a contact point; point-set attributes describe the relationship between the set of n contact points (assuming an n-fingered robot gripper is used. In the experiments performed, the nearest neighbour classifier outperforms other approaches like multilayer perceptrons, radial basis functions or decision trees, in terms of classification accuracy, while computational load is not excessive for a real time application (a grasp is fully synthesized in 0.2 seconds. The results obtained on a synthetic database show that the proposed system is able to imitate the grasping behaviour of the user (e.g. the system learns to grasp a mug by its handle. All the code has been made available for testing purposes.

  8. Deep transfer learning for automatic target classification: MWIR to LWIR

    Science.gov (United States)

    Ding, Zhengming; Nasrabadi, Nasser; Fu, Yun

    2016-05-01

    Publisher's Note: This paper, originally published on 5/12/2016, was replaced with a corrected/revised version on 5/18/2016. If you downloaded the original PDF but are unable to access the revision, please contact SPIE Digital Library Customer Service for assistance. When dealing with sparse or no labeled data in the target domain, transfer learning shows its appealing performance by borrowing the supervised knowledge from external domains. Recently deep structure learning has been exploited in transfer learning due to its attractive power in extracting effective knowledge through multi-layer strategy, so that deep transfer learning is promising to address the cross-domain mismatch. In general, cross-domain disparity can be resulted from the difference between source and target distributions or different modalities, e.g., Midwave IR (MWIR) and Longwave IR (LWIR). In this paper, we propose a Weighted Deep Transfer Learning framework for automatic target classification through a task-driven fashion. Specifically, deep features and classifier parameters are obtained simultaneously for optimal classification performance. In this way, the proposed deep structures can extract more effective features with the guidance of the classifier performance; on the other hand, the classifier performance is further improved since it is optimized on more discriminative features. Furthermore, we build a weighted scheme to couple source and target output by assigning pseudo labels to target data, therefore we can transfer knowledge from source (i.e., MWIR) to target (i.e., LWIR). Experimental results on real databases demonstrate the superiority of the proposed algorithm by comparing with others.

  9. The Costs of Supervised Classification: The Effect of Learning Task on Conceptual Flexibility

    Science.gov (United States)

    Hoffman, Aaron B.; Rehder, Bob

    2010-01-01

    Research has shown that learning a concept via standard supervised classification leads to a focus on diagnostic features, whereas learning by inferring missing features promotes the acquisition of within-category information. Accordingly, we predicted that classification learning would produce a deficit in people's ability to draw "novel…

  10. Ensemble Deep Learning for Biomedical Time Series Classification

    Directory of Open Access Journals (Sweden)

    Lin-peng Jin

    2016-01-01

    Full Text Available Ensemble learning has been proved to improve the generalization ability effectively in both theory and practice. In this paper, we briefly outline the current status of research on it first. Then, a new deep neural network-based ensemble method that integrates filtering views, local views, distorted views, explicit training, implicit training, subview prediction, and Simple Average is proposed for biomedical time series classification. Finally, we validate its effectiveness on the Chinese Cardiovascular Disease Database containing a large number of electrocardiogram recordings. The experimental results show that the proposed method has certain advantages compared to some well-known ensemble methods, such as Bagging and AdaBoost.

  11. Multimodal Task-Driven Dictionary Learning for Image Classification

    Science.gov (United States)

    2015-12-18

    recognition, multi-view face recognition, multi-view action recognition, and multimodal biometric recognition. It is also shown that, compared to the...improvement in several multi-task learning applications such as target classification, biometric recognitions, and multiview face recognition [12], [14], [17...relevant samples from other modalities for a given unimodal query. However, α1 α2 …αS D1 … Index finger Thumb finger … Iris x1 x2 xS D2 DS … … … J o in

  12. Machine Learning Techniques for Stellar Light Curve Classification

    Science.gov (United States)

    Hinners, Trisha A.; Tat, Kevin; Thorp, Rachel

    2018-07-01

    We apply machine learning techniques in an attempt to predict and classify stellar properties from noisy and sparse time-series data. We preprocessed over 94 GB of Kepler light curves from the Mikulski Archive for Space Telescopes (MAST) to classify according to 10 distinct physical properties using both representation learning and feature engineering approaches. Studies using machine learning in the field have been primarily done on simulated data, making our study one of the first to use real light-curve data for machine learning approaches. We tuned our data using previous work with simulated data as a template and achieved mixed results between the two approaches. Representation learning using a long short-term memory recurrent neural network produced no successful predictions, but our work with feature engineering was successful for both classification and regression. In particular, we were able to achieve values for stellar density, stellar radius, and effective temperature with low error (∼2%–4%) and good accuracy (∼75%) for classifying the number of transits for a given star. The results show promise for improvement for both approaches upon using larger data sets with a larger minority class. This work has the potential to provide a foundation for future tools and techniques to aid in the analysis of astrophysical data.

  13. Indexing Density Models for Incremental Learning and Anytime Classification on Data Streams

    DEFF Research Database (Denmark)

    Seidl, Thomas; Assent, Ira; Kranen, Philipp

    2009-01-01

    Classification of streaming data faces three basic challenges: it has to deal with huge amounts of data, the varying time between two stream data items must be used best possible (anytime classification) and additional training data must be incrementally learned (anytime learning) for applying...... to the individual object to be classified) a hierarchy of mixture densities that represent kernel density estimators at successively coarser levels. Our probability density queries together with novel classification improvement strategies provide the necessary information for very effective classification at any...... point of interruption. Moreover, we propose a novel evaluation method for anytime classification using Poisson streams and demonstrate the anytime learning performance of the Bayes tree....

  14. A comparative study of machine learning models for ethnicity classification

    Science.gov (United States)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper endeavours to adopt a machine learning approach to solve the problem of ethnicity recognition. Ethnicity identification is an important vision problem with its use cases being extended to various domains. Despite the multitude of complexity involved, ethnicity identification comes naturally to humans. This meta information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems a sub module to efficiently capture ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines, logistic regression has been documented. Experimental results indicate that the logical classifier provides a much accurate classification than the support vector machine.

  15. Learning Supervised Topic Models for Classification and Regression from Crowds.

    Science.gov (United States)

    Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete; Pereira, Francisco C

    2017-12-01

    The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

  16. A self-learning rule base for command following in dynamical systems

    Science.gov (United States)

    Tsai, Wei K.; Lee, Hon-Mun; Parlos, Alexander

    1992-01-01

    In this paper, a self-learning Rule Base for command following in dynamical systems is presented. The learning is accomplished though reinforcement learning using an associative memory called SAM. The main advantage of SAM is that it is a function approximator with explicit storage of training samples. A learning algorithm patterned after the dynamic programming is proposed. Two artificially created, unstable dynamical systems are used for testing, and the Rule Base was used to generate a feedback control to improve the command following ability of the otherwise uncontrolled systems. The numerical results are very encouraging. The controlled systems exhibit a more stable behavior and a better capability to follow reference commands. The rules resulting from the reinforcement learning are explicitly stored and they can be modified or augmented by human experts. Due to overlapping storage scheme of SAM, the stored rules are similar to fuzzy rules.

  17. Classification of Phishing Email Using Random Forest Machine Learning Technique

    Directory of Open Access Journals (Sweden)

    Andronicus A. Akinyelu

    2014-01-01

    Full Text Available Phishing is one of the major challenges faced by the world of e-commerce today. Thanks to phishing attacks, billions of dollars have been lost by many companies and individuals. In 2012, an online report put the loss due to phishing attack at about $1.5 billion. This global impact of phishing attacks will continue to be on the increase and thus requires more efficient phishing detection techniques to curb the menace. This paper investigates and reports the use of random forest machine learning algorithm in classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer numbers of features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature were extracted and used by the machine learning algorithm with a resulting classification accuracy of 99.7% and low false negative (FN and false positive (FP rates.

  18. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a few number of samples. The application of machine learning on microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on support vector machine recursive feature elimination (SVM-RFE) principle which is improved to solve multiclass classification, called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the result of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the method of the classifier to reduce computational complexity. While ordinary SVM finds single optimum hyperplane, the main objective Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71,4% of the overall test data correctly, using 100 and 1000 genes selected from multiple multiclass SVM-RFE feature selection method. Furthermore, the per class results show that this method could classify data of normal and MD class with 100% accuracy.

  19. Robust Semi-Supervised Manifold Learning Algorithm for Classification

    Directory of Open Access Journals (Sweden)

    Mingxia Chen

    2018-01-01

    Full Text Available In the recent years, manifold learning methods have been widely used in data classification to tackle the curse of dimensionality problem, since they can discover the potential intrinsic low-dimensional structures of the high-dimensional data. Given partially labeled data, the semi-supervised manifold learning algorithms are proposed to predict the labels of the unlabeled points, taking into account label information. However, these semi-supervised manifold learning algorithms are not robust against noisy points, especially when the labeled data contain noise. In this paper, we propose a framework for robust semi-supervised manifold learning (RSSML to address this problem. The noisy levels of the labeled points are firstly predicted, and then a regularization term is constructed to reduce the impact of labeled points containing noise. A new robust semi-supervised optimization model is proposed by adding the regularization term to the traditional semi-supervised optimization model. Numerical experiments are given to show the improvement and efficiency of RSSML on noisy data sets.

  20. Simulation-driven machine learning: Bearing fault classification

    Science.gov (United States)

    Sobie, Cameron; Freitas, Carina; Nicolai, Mike

    2018-01-01

    Increasing the accuracy of mechanical fault detection has the potential to improve system safety and economic performance by minimizing scheduled maintenance and the probability of unexpected system failure. Advances in computational performance have enabled the application of machine learning algorithms across numerous applications including condition monitoring and failure detection. Past applications of machine learning to physical failure have relied explicitly on historical data, which limits the feasibility of this approach to in-service components with extended service histories. Furthermore, recorded failure data is often only valid for the specific circumstances and components for which it was collected. This work directly addresses these challenges for roller bearings with race faults by generating training data using information gained from high resolution simulations of roller bearing dynamics, which is used to train machine learning algorithms that are then validated against four experimental datasets. Several different machine learning methodologies are compared starting from well-established statistical feature-based methods to convolutional neural networks, and a novel application of dynamic time warping (DTW) to bearing fault classification is proposed as a robust, parameter free method for race fault detection.

  1. Analysis of Rules for Islamic Inheritance Law in Indonesia Using Hybrid Rule Based Learning

    Science.gov (United States)

    Khosyi'ah, S.; Irfan, M.; Maylawati, D. S.; Mukhlas, O. S.

    2018-01-01

    Along with the development of human civilization in Indonesia, the changes and reform of Islamic inheritance law so as to conform to the conditions and culture cannot be denied. The distribution of inheritance in Indonesia can be done automatically by storing the rule of Islamic inheritance law in the expert system. In this study, we analyze the knowledge of experts in Islamic inheritance in Indonesia and represent it in the form of rules using rule-based Forward Chaining (FC) and Davis-Putman-Logemann-Loveland (DPLL) algorithms. By hybridizing FC and DPLL algorithms, the rules of Islamic inheritance law in Indonesia are clearly defined and measured. The rules were conceptually validated by some experts in Islamic laws and informatics. The results revealed that generally all rules were ready for use in an expert system.

  2. Concurrence of rule- and similarity-based mechanisms in artificial grammar learning.

    Science.gov (United States)

    Opitz, Bertram; Hofmann, Juliane

    2015-03-01

    A current theoretical debate regards whether rule-based or similarity-based learning prevails during artificial grammar learning (AGL). Although the majority of findings are consistent with a similarity-based account of AGL it has been argued that these results were obtained only after limited exposure to study exemplars, and performance on subsequent grammaticality judgment tests has often been barely above chance level. In three experiments the conditions were investigated under which rule- and similarity-based learning could be applied. Participants were exposed to exemplars of an artificial grammar under different (implicit and explicit) learning instructions. The analysis of receiver operating characteristics (ROC) during a final grammaticality judgment test revealed that explicit but not implicit learning led to rule knowledge. It also demonstrated that this knowledge base is built up gradually while similarity knowledge governed the initial state of learning. Together these results indicate that rule- and similarity-based mechanisms concur during AGL. Moreover, it could be speculated that two different rule processes might operate in parallel; bottom-up learning via gradual rule extraction and top-down learning via rule testing. Crucially, the latter is facilitated by performance feedback that encourages explicit hypothesis testing. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. A comparison of rule-based and machine learning approaches for classifying patient portal messages.

    Science.gov (United States)

    Cronin, Robert M; Fabbri, Daniel; Denny, Joshua C; Rosenbloom, S Trent; Jackson, Gretchen Purcell

    2017-09-01

    Secure messaging through patient portals is an increasingly popular way that consumers interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care. We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag of words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with area under the receiver-operator curve (AUC). Portal messages often contain more than one type of communication. To predict all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers. The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). The best performing classification approach of classifiers for individual communication subtypes was random forests for Logistical-Contact Information (AUC: 0.963). The Jaccard Indices by approach were: basic classifier, Jaccard Index: 0.674; Naïve Bayes, Jaccard Index: 0.799; random forests, Jaccard Index: 0.859; and logistic regression, Jaccard

  4. Functional networks inference from rule-based machine learning models.

    Science.gov (United States)

    Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume

    2016-01-01

    Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables, that often are different/complementary to the similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes, that genes used together within a rule-based machine learning model to classify the samples, might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. The

  5. Comparisons of likelihood and machine learning methods of individual classification

    Science.gov (United States)

    Guinand, B.; Topchy, A.; Page, K.S.; Burnham-Curtis, M. K.; Punch, W.F.; Scribner, K.T.

    2002-01-01

    Classification methods used in machine learning (e.g., artificial neural networks, decision trees, and k-nearest neighbor clustering) are rarely used with population genetic data. We compare different nonparametric machine learning techniques with parametric likelihood estimations commonly employed in population genetics for purposes of assigning individuals to their population of origin (“assignment tests”). Classifier accuracy was compared across simulated data sets representing different levels of population differentiation (low and high FST), number of loci surveyed (5 and 10), and allelic diversity (average of three or eight alleles per locus). Empirical data for the lake trout (Salvelinus namaycush) exhibiting levels of population differentiation comparable to those used in simulations were examined to further evaluate and compare classification methods. Classification error rates associated with artificial neural networks and likelihood estimators were lower for simulated data sets compared to k-nearest neighbor and decision tree classifiers over the entire range of parameters considered. Artificial neural networks only marginally outperformed the likelihood method for simulated data (0–2.8% lower error rates). The relative performance of each machine learning classifier improved relative likelihood estimators for empirical data sets, suggesting an ability to “learn” and utilize properties of empirical genotypic arrays intrinsic to each population. Likelihood-based estimation methods provide a more accessible option for reliable assignment of individuals to the population of origin due to the intricacies in development and evaluation of artificial neural networks. In recent years, characterization of highly polymorphic molecular markers such as mini- and microsatellites and development of novel methods of analysis have enabled researchers to extend investigations of ecological and evolutionary processes below the population level to the level of

  6. Seizure Classification From EEG Signals Using Transfer Learning, Semi-Supervised Learning and TSK Fuzzy System.

    Science.gov (United States)

    Jiang, Yizhang; Wu, Dongrui; Deng, Zhaohong; Qian, Pengjiang; Wang, Jun; Wang, Guanjin; Chung, Fu-Lai; Choi, Kup-Sze; Wang, Shitong

    2017-12-01

    Recognition of epileptic seizures from offline EEG signals is very important in clinical diagnosis of epilepsy. Compared with manual labeling of EEG signals by doctors, machine learning approaches can be faster and more consistent. However, the classification accuracy is usually not satisfactory for two main reasons: the distributions of the data used for training and testing may be different, and the amount of training data may not be enough. In addition, most machine learning approaches generate black-box models that are difficult to interpret. In this paper, we integrate transductive transfer learning, semi-supervised learning and TSK fuzzy system to tackle these three problems. More specifically, we use transfer learning to reduce the discrepancy in data distribution between the training and testing data, employ semi-supervised learning to use the unlabeled testing data to remedy the shortage of training data, and adopt TSK fuzzy system to increase model interpretability. Two learning algorithms are proposed to train the system. Our experimental results show that the proposed approaches can achieve better performance than many state-of-the-art seizure classification algorithms.

  7. Machine learning methods can replace 3D profile method in classification of amyloidogenic hexapeptides

    Directory of Open Access Journals (Sweden)

    Stanislawski Jerzy

    2013-01-01

    Full Text Available Abstract Background Amyloids are proteins capable of forming fibrils. Many of them underlie serious diseases, like Alzheimer disease. The number of amyloid-associated diseases is constantly increasing. Recent studies indicate that amyloidogenic properties can be associated with short segments of aminoacids, which transform the structure when exposed. A few hundreds of such peptides have been experimentally found. Experimental testing of all possible aminoacid combinations is currently not feasible. Instead, they can be predicted by computational methods. 3D profile is a physicochemical-based method that has generated the most numerous dataset - ZipperDB. However, it is computationally very demanding. Here, we show that dataset generation can be accelerated. Two methods to increase the classification efficiency of amyloidogenic candidates are presented and tested: simplified 3D profile generation and machine learning methods. Results We generated a new dataset of hexapeptides, using more economical 3D profile algorithm, which showed very good classification overlap with ZipperDB (93.5%. The new part of our dataset contains 1779 segments, with 204 classified as amyloidogenic. The dataset of 6-residue sequences with their binary classification, based on the energy of the segment, was applied for training machine learning methods. A separate set of sequences from ZipperDB was used as a test set. The most effective methods were Alternating Decision Tree and Multilayer Perceptron. Both methods obtained area under ROC curve of 0.96, accuracy 91%, true positive rate ca. 78%, and true negative rate 95%. A few other machine learning methods also achieved a good performance. The computational time was reduced from 18-20 CPU-hours (full 3D profile to 0.5 CPU-hours (simplified 3D profile to seconds (machine learning. Conclusions We showed that the simplified profile generation method does not introduce an error with regard to the original method, while

  8. Machine learning methods can replace 3D profile method in classification of amyloidogenic hexapeptides.

    Science.gov (United States)

    Stanislawski, Jerzy; Kotulska, Malgorzata; Unold, Olgierd

    2013-01-17

    Amyloids are proteins capable of forming fibrils. Many of them underlie serious diseases, like Alzheimer disease. The number of amyloid-associated diseases is constantly increasing. Recent studies indicate that amyloidogenic properties can be associated with short segments of aminoacids, which transform the structure when exposed. A few hundreds of such peptides have been experimentally found. Experimental testing of all possible aminoacid combinations is currently not feasible. Instead, they can be predicted by computational methods. 3D profile is a physicochemical-based method that has generated the most numerous dataset - ZipperDB. However, it is computationally very demanding. Here, we show that dataset generation can be accelerated. Two methods to increase the classification efficiency of amyloidogenic candidates are presented and tested: simplified 3D profile generation and machine learning methods. We generated a new dataset of hexapeptides, using more economical 3D profile algorithm, which showed very good classification overlap with ZipperDB (93.5%). The new part of our dataset contains 1779 segments, with 204 classified as amyloidogenic. The dataset of 6-residue sequences with their binary classification, based on the energy of the segment, was applied for training machine learning methods. A separate set of sequences from ZipperDB was used as a test set. The most effective methods were Alternating Decision Tree and Multilayer Perceptron. Both methods obtained area under ROC curve of 0.96, accuracy 91%, true positive rate ca. 78%, and true negative rate 95%. A few other machine learning methods also achieved a good performance. The computational time was reduced from 18-20 CPU-hours (full 3D profile) to 0.5 CPU-hours (simplified 3D profile) to seconds (machine learning). We showed that the simplified profile generation method does not introduce an error with regard to the original method, while increasing the computational efficiency. Our new dataset

  9. Criterion learning in rule-based categorization: simulation of neural mechanism and new data.

    Science.gov (United States)

    Helie, Sebastien; Ell, Shawn W; Filoteo, J Vincent; Maddox, W Todd

    2015-04-01

    In perceptual categorization, rule selection consists of selecting one or several stimulus-dimensions to be used to categorize the stimuli (e.g., categorize lines according to their length). Once a rule has been selected, criterion learning consists of defining how stimuli will be grouped using the selected dimension(s) (e.g., if the selected rule is line length, define 'long' and 'short'). Very little is known about the neuroscience of criterion learning, and most existing computational models do not provide a biological mechanism for this process. In this article, we introduce a new model of rule learning called Heterosynaptic Inhibitory Criterion Learning (HICL). HICL includes a biologically-based explanation of criterion learning, and we use new category-learning data to test key aspects of the model. In HICL, rule selective cells in prefrontal cortex modulate stimulus-response associations using pre-synaptic inhibition. Criterion learning is implemented by a new type of heterosynaptic error-driven Hebbian learning at inhibitory synapses that uses feedback to drive cell activation above/below thresholds representing ionic gating mechanisms. The model is used to account for new human categorization data from two experiments showing that: (1) changing rule criterion on a given dimension is easier if irrelevant dimensions are also changing (Experiment 1), and (2) showing that changing the relevant rule dimension and learning a new criterion is more difficult, but also facilitated by a change in the irrelevant dimension (Experiment 2). We conclude with a discussion of some of HICL's implications for future research on rule learning. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS

    Directory of Open Access Journals (Sweden)

    Y. B. Abdullin

    2017-01-01

    Full Text Available Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to effectively solve sentiment classification problem for a given pair of languages. We apply our approach to two corpora of two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method, that achieves 60% accuracy; and a method to learn bilingual embeddings from a large unlabeled corpus using a bilingual word pairs.

  11. Hybrid Collaborative Learning for Classification and Clustering in Sensor Networks

    Science.gov (United States)

    Wagstaff, Kiri L.; Sosnowski, Scott; Lane, Terran

    2012-01-01

    Traditionally, nodes in a sensor network simply collect data and then pass it on to a centralized node that archives, distributes, and possibly analyzes the data. However, analysis at the individual nodes could enable faster detection of anomalies or other interesting events as well as faster responses, such as sending out alerts or increasing the data collection rate. There is an additional opportunity for increased performance if learners at individual nodes can communicate with their neighbors. In previous work, methods were developed by which classification algorithms deployed at sensor nodes can communicate information about event labels to each other, building on prior work with co-training, self-training, and active learning. The idea of collaborative learning was extended to function for clustering algorithms as well, similar to ideas from penta-training and consensus clustering. However, collaboration between these learner types had not been explored. A new protocol was developed by which classifiers and clusterers can share key information about their observations and conclusions as they learn. This is an active collaboration in which learners of either type can query their neighbors for information that they then use to re-train or re-learn the concept they are studying. The protocol also supports broadcasts from the classifiers and clusterers to the rest of the network to announce new discoveries. Classifiers observe an event and assign it a label (type). Clusterers instead group observations into clusters without assigning them a label, and they collaborate in terms of pairwise constraints between two events [same-cluster (mustlink) or different-cluster (cannot-link)]. Fundamentally, these two learner types speak different languages. To bridge this gap, the new communication protocol provides four types of exchanges: hybrid queries for information, hybrid "broadcasts" of learned information, each specified for classifiers-to-clusterers, and clusterers

  12. RULE-BASE METHOD FOR ANALYSIS OF QUALITY E-LEARNING IN HIGHER EDUCATION

    Directory of Open Access Journals (Sweden)

    darsih darsih darsih

    2016-04-01

    Full Text Available ABSTRACT Assessing the quality of e-learning courses to measure the success of e-learning systems in online learning is essential. The system can be used to improve education. The study analyzes the quality of e-learning course on the web site www.kulon.undip.ac.id used a questionnaire with questions based on the variables of ISO 9126. Penilaiann Likert scale was used with a web app. Rule-base reasoning method is used to subject the quality of e-learningyang assessed. A case study conducted in four e-learning courses with 133 sample / respondents as users of the e-learning course. From the obtained results of research conducted both for the value of e-learning from each subject tested. In addition, each e-learning courses have different advantages depending on certain variables. Keywords : E-Learning, Rule-Base, Questionnaire, Likert, Measuring.

  13. Semi-Supervised Active Learning for Sound Classification in Hybrid Learning Environments

    Science.gov (United States)

    Han, Wenjing; Coutinho, Eduardo; Li, Haifeng; Schuller, Björn; Yu, Xiaojie; Zhu, Xuan

    2016-01-01

    Coping with scarcity of labeled data is a common problem in sound classification tasks. Approaches for classifying sounds are commonly based on supervised learning algorithms, which require labeled data which is often scarce and leads to models that do not generalize well. In this paper, we make an efficient combination of confidence-based Active Learning and Self-Training with the aim of minimizing the need for human annotation for sound classification model training. The proposed method pre-processes the instances that are ready for labeling by calculating their classifier confidence scores, and then delivers the candidates with lower scores to human annotators, and those with high scores are automatically labeled by the machine. We demonstrate the feasibility and efficacy of this method in two practical scenarios: pool-based and stream-based processing. Extensive experimental results indicate that our approach requires significantly less labeled instances to reach the same performance in both scenarios compared to Passive Learning, Active Learning and Self-Training. A reduction of 52.2% in human labeled instances is achieved in both of the pool-based and stream-based scenarios on a sound classification task considering 16,930 sound instances. PMID:27627768

  14. Semi-Supervised Active Learning for Sound Classification in Hybrid Learning Environments.

    Science.gov (United States)

    Han, Wenjing; Coutinho, Eduardo; Ruan, Huabin; Li, Haifeng; Schuller, Björn; Yu, Xiaojie; Zhu, Xuan

    2016-01-01

    Coping with scarcity of labeled data is a common problem in sound classification tasks. Approaches for classifying sounds are commonly based on supervised learning algorithms, which require labeled data which is often scarce and leads to models that do not generalize well. In this paper, we make an efficient combination of confidence-based Active Learning and Self-Training with the aim of minimizing the need for human annotation for sound classification model training. The proposed method pre-processes the instances that are ready for labeling by calculating their classifier confidence scores, and then delivers the candidates with lower scores to human annotators, and those with high scores are automatically labeled by the machine. We demonstrate the feasibility and efficacy of this method in two practical scenarios: pool-based and stream-based processing. Extensive experimental results indicate that our approach requires significantly less labeled instances to reach the same performance in both scenarios compared to Passive Learning, Active Learning and Self-Training. A reduction of 52.2% in human labeled instances is achieved in both of the pool-based and stream-based scenarios on a sound classification task considering 16,930 sound instances.

  15. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    Directory of Open Access Journals (Sweden)

    C. V. Subbulakshmi

    2015-01-01

    Full Text Available Medical data classification is a prime data mining problem being discussed about for a decade that has attracted several researchers around the world. Most classifiers are designed so as to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on machine learning paradigm. This paradigm integrates the successful exploration mechanism called self-regulated learning capability of the particle swarm optimization (PSO algorithm with the extreme learning machine (ELM classifier. As a recent off-line learning method, ELM is a single-hidden layer feedforward neural network (FFNN, proved to be an excellent classifier with large number of hidden layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden layer neurons, and it further improves the network generalization performance. The proposed method is experimented on five benchmarked datasets of the UCI Machine Learning Repository for handling medical dataset classification. Simulation results show that the proposed approach is able to achieve good generalization performance, compared to the results of other classifiers.

  16. Learning a Transferable Change Rule from a Recurrent Neural Network for Land Cover Change Detection

    Directory of Open Access Journals (Sweden)

    Haobo Lyu

    2016-06-01

    Full Text Available When exploited in remote sensing analysis, a reliable change rule with transfer ability can detect changes accurately and be applied widely. However, in practice, the complexity of land cover changes makes it difficult to use only one change rule or change feature learned from a given multi-temporal dataset to detect any other new target images without applying other learning processes. In this study, we consider the design of an efficient change rule having transferability to detect both binary and multi-class changes. The proposed method relies on an improved Long Short-Term Memory (LSTM model to acquire and record the change information of long-term sequence remote sensing data. In particular, a core memory cell is utilized to learn the change rule from the information concerning binary changes or multi-class changes. Three gates are utilized to control the input, output and update of the LSTM model for optimization. In addition, the learned rule can be applied to detect changes and transfer the change rule from one learned image to another new target multi-temporal image. In this study, binary experiments, transfer experiments and multi-class change experiments are exploited to demonstrate the superiority of our method. Three contributions of this work can be summarized as follows: (1 the proposed method can learn an effective change rule to provide reliable change information for multi-temporal images; (2 the learned change rule has good transferability for detecting changes in new target images without any extra learning process, and the new target images should have a multi-spectral distribution similar to that of the training images; and (3 to the authors’ best knowledge, this is the first time that deep learning in recurrent neural networks is exploited for change detection. In addition, under the framework of the proposed method, changes can be detected under both binary detection and multi-class change detection.

  17. Using an improved association rules mining optimization algorithm in web-based mobile-learning system

    Science.gov (United States)

    Huang, Yin; Chen, Jianhua; Xiong, Shaojun

    2009-07-01

    Mobile-Learning (M-learning) makes many learners get the advantages of both traditional learning and E-learning. Currently, Web-based Mobile-Learning Systems have created many new ways and defined new relationships between educators and learners. Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rules explosion is a serious problem which causes great concerns, as conventional mining algorithms often produce too many rules for decision makers to digest. Since Web-based Mobile-Learning System collects vast amounts of student profile data, data mining and knowledge discovery techniques can be applied to find interesting relationships between attributes of learners, assessments, the solution strategies adopted by learners and so on. Therefore ,this paper focus on a new data-mining algorithm, combined with the advantages of genetic algorithm and simulated annealing algorithm , called ARGSA(Association rules based on an improved Genetic Simulated Annealing Algorithm), to mine the association rules. This paper first takes advantage of the Parallel Genetic Algorithm and Simulated Algorithm designed specifically for discovering association rules. Moreover, the analysis and experiment are also made to show the proposed method is superior to the Apriori algorithm in this Mobile-Learning system.

  18. Classification of Exacerbation Frequency in the COPDGene Cohort Using Deep Learning with Deep Belief Networks.

    Science.gov (United States)

    Ying, Jun; Dutta, Joyita; Guo, Ning; Hu, Chenhui; Zhou, Dan; Sitek, Arkadiusz; Li, Quanzheng

    2016-12-21

    This study aims to develop an automatic classifier based on deep learning for exacerbation frequency in patients with chronic obstructive pulmonary disease (COPD). A threelayer deep belief network (DBN) with two hidden layers and one visible layer was employed to develop classification models and the models' robustness to exacerbation was analyzed. Subjects from the COPDGene cohort were labeled with exacerbation frequency, defined as the number of exacerbation events per year. 10,300 subjects with 361 features each were included in the analysis. After feature selection and parameter optimization, the proposed classification method achieved an accuracy of 91.99%, using a 10-fold cross validation experiment. The analysis of DBN weights showed that there was a good visual spatial relationship between the underlying critical features of different layers. Our findings show that the most sensitive features obtained from the DBN weights are consistent with the consensus showed by clinical rules and standards for COPD diagnostics. We thus demonstrate that DBN is a competitive tool for exacerbation risk assessment for patients suffering from COPD.

  19. The evolution of social learning rules: payoff-biased and frequency-dependent biased transmission.

    Science.gov (United States)

    Kendal, Jeremy; Giraldeau, Luc-Alain; Laland, Kevin

    2009-09-21

    Humans and other animals do not use social learning indiscriminately, rather, natural selection has favoured the evolution of social learning rules that make selective use of social learning to acquire relevant information in a changing environment. We present a gene-culture coevolutionary analysis of a small selection of such rules (unbiased social learning, payoff-biased social learning and frequency-dependent biased social learning, including conformism and anti-conformism) in a population of asocial learners where the environment is subject to a constant probability of change to a novel state. We define conditions under which each rule evolves to a genetically polymorphic equilibrium. We find that payoff-biased social learning may evolve under high levels of environmental variation if the fitness benefit associated with the acquired behaviour is either high or low but not of intermediate value. In contrast, both conformist and anti-conformist biases can become fixed when environment variation is low, whereupon the mean fitness in the population is higher than for a population of asocial learners. Our examination of the population dynamics reveals stable limit cycles under conformist and anti-conformist biases and some highly complex dynamics including chaos. Anti-conformists can out-compete conformists when conditions favour a low equilibrium frequency of the learned behaviour. We conclude that evolution, punctuated by the repeated successful invasion of different social learning rules, should continuously favour a reduction in the equilibrium frequency of asocial learning, and propose that, among competing social learning rules, the dominant rule will be the one that can persist with the lowest frequency of asocial learning.

  20. RuleML-Based Learning Object Interoperability on the Semantic Web

    Science.gov (United States)

    Biletskiy, Yevgen; Boley, Harold; Ranganathan, Girish R.

    2008-01-01

    Purpose: The present paper aims to describe an approach for building the Semantic Web rules for interoperation between heterogeneous learning objects, namely course outlines from different universities, and one of the rule uses: identifying (in)compatibilities between course descriptions. Design/methodology/approach: As proof of concept, a rule…

  1. Ensemble learning with trees and rules: supervised, semi-supervised, unsupervised

    Science.gov (United States)

    In this article, we propose several new approaches for post processing a large ensemble of conjunctive rules for supervised and semi-supervised learning problems. We show with various examples that for high dimensional regression problems the models constructed by the post processing the rules with ...

  2. On-line learning of non-monotonic rules by simple perceptron

    OpenAIRE

    Inoue, Jun-ichi; Nishimori, Hidetoshi; Kabashima, Yoshiyuki

    1997-01-01

    We study the generalization ability of a simple perceptron which learns unlearnable rules. The rules are presented by a teacher perceptron with a non-monotonic transfer function. The student is trained in the on-line mode. The asymptotic behaviour of the generalization error is estimated under various conditions. Several learning strategies are proposed and improved to obtain the theoretical lower bound of the generalization error.

  3. Adaptive Learning Rule for Hardware-based Deep Neural Networks Using Electronic Synapse Devices

    OpenAIRE

    Lim, Suhwan; Bae, Jong-Ho; Eum, Jai-Ho; Lee, Sungtae; Kim, Chul-Heung; Kwon, Dongseok; Park, Byung-Gook; Lee, Jong-Ho

    2017-01-01

    In this paper, we propose a learning rule based on a back-propagation (BP) algorithm that can be applied to a hardware-based deep neural network (HW-DNN) using electronic devices that exhibit discrete and limited conductance characteristics. This adaptive learning rule, which enables forward, backward propagation, as well as weight updates in hardware, is helpful during the implementation of power-efficient and high-speed deep neural networks. In simulations using a three-layer perceptron net...

  4. Fingerprint classification using a simplified rule-set based on directional patterns and singularity features

    CSIR Research Space (South Africa)

    Dorasamy, K

    2015-07-01

    Full Text Available The use of directional patterns has recently received more attention in fingerprint classification. It provides a global representation of a fingerprint, by dividing it into homogeneous orientation partitions. With this technique, the challenge...

  5. Role of Prefrontal Cortex in Learning and Generalizing Hierarchical Rules in 8-Month-Old Infants.

    Science.gov (United States)

    Werchan, Denise M; Collins, Anne G E; Frank, Michael J; Amso, Dima

    2016-10-05

    Recent research indicates that adults and infants spontaneously create and generalize hierarchical rule sets during incidental learning. Computational models and empirical data suggest that, in adults, this process is supported by circuits linking prefrontal cortex (PFC) with striatum and their modulation by dopamine, but the neural circuits supporting this form of learning in infants are largely unknown. We used near-infrared spectroscopy to record PFC activity in 8-month-old human infants during a simple audiovisual hierarchical-rule-learning task. Behavioral results confirmed that infants adopted hierarchical rule sets to learn and generalize spoken object-label mappings across different speaker contexts. Infants had increased activity over right dorsal lateral PFC when rule sets switched from one trial to the next, a neural marker related to updating rule sets into working memory in the adult literature. Infants' eye blink rate, a possible physiological correlate of striatal dopamine activity, also increased when rule sets switched from one trial to the next. Moreover, the increase in right dorsolateral PFC activity in conjunction with eye blink rate also predicted infants' generalization ability, providing exploratory evidence for frontostriatal involvement during learning. These findings provide evidence that PFC is involved in rudimentary hierarchical rule learning in 8-month-old infants, an ability that was previously thought to emerge later in life in concert with PFC maturation. Hierarchical rule learning is a powerful learning mechanism that allows rules to be selected in a context-appropriate fashion and transferred or reused in novel contexts. Data from computational models and adults suggests that this learning mechanism is supported by dopamine-innervated interactions between prefrontal cortex (PFC) and striatum. Here, we provide evidence that PFC also supports hierarchical rule learning during infancy, challenging the current dogma that PFC is an

  6. ASIST SIG/CR Classification Workshop 2000: Classification for User Support and Learning.

    Science.gov (United States)

    Soergel, Dagobert

    2001-01-01

    Reports on papers presented at the 62nd Annual Meeting of ASIST (American Society for Information Science and Technology) for the Special Interest Group in Classification Research (SIG/CR). Topics include types of knowledge; developing user-oriented classifications, including domain analysis; classification in the user interface; and automatic…

  7. Comparison of some classification algorithms based on deterministic and nondeterministic decision rules

    KAUST Repository

    Delimata, Paweł; Marszał-Paszek, Barbara; Moshkov, Mikhail; Paszek, Piotr; Skowron, Andrzej; Suraj, Zbigniew

    2010-01-01

    the considered algorithms extract from a given decision table efficiently some information about the set of rules. Next, this information is used by a decision-making procedure. The reported results of experiments show that the algorithms based on inhibitory

  8. Application of a kernel-based online learning algorithm to the classification of nodule candidates in computer-aided detection of CT lung nodules

    International Nuclear Information System (INIS)

    Matsumoto, S.; Ohno, Y.; Takenaka, D.; Sugimura, K.; Yamagata, H.

    2007-01-01

    Classification of the nodule candidates in computer-aided detection (CAD) of lung nodules in CT images was addressed by constructing a nonlinear discriminant function using a kernel-based learning algorithm called the kernel recursive least-squares (KRLS) algorithm. Using the nodule candidates derived from the processing by a CAD scheme of 100 CT datasets containing 253 non-calcified nodules or 3 mm or larger as determined by the consensus of two thoracic radiologists, the following trial were carried out 100 times: by randomly selecting 50 datasets for training, a nonlinear discriminant function was obtained using the nodule candidates in the training datasets and tested with the remaining candidates; for comparison, a rule-based classification was tested in a similar manner. At the number of false positives per case of about 5, the nonlinear classification method showed an improved sensitivity of 80% (mean over the 100 trials) compared with 74% of the rule-based method. (orig.)

  9. Learning Expressive Linkage Rules for Entity Matching using Genetic Programming

    OpenAIRE

    Isele, Robert

    2013-01-01

    A central problem in data integration and data cleansing is to identify pairs of entities in data sets that describe the same real-world object. Many existing methods for matching entities rely on explicit linkage rules, which specify how two entities are compared for equivalence. Unfortunately, writing accurate linkage rules by hand is a non-trivial problem that requires detailed knowledge of the involved data sets. Another important issue is the efficient execution of link...

  10. Learning about the internal structure of categories through classification and feature inference.

    Science.gov (United States)

    Jee, Benjamin D; Wiley, Jennifer

    2014-01-01

    Previous research on category learning has found that classification tasks produce representations that are skewed toward diagnostic feature dimensions, whereas feature inference tasks lead to richer representations of within-category structure. Yet, prior studies often measure category knowledge through tasks that involve identifying only the typical features of a category. This neglects an important aspect of a category's internal structure: how typical and atypical features are distributed within a category. The present experiments tested the hypothesis that inference learning results in richer knowledge of internal category structure than classification learning. We introduced several new measures to probe learners' representations of within-category structure. Experiment 1 found that participants in the inference condition learned and used a wider range of feature dimensions than classification learners. Classification learners, however, were more sensitive to the presence of atypical features within categories. Experiment 2 provided converging evidence that classification learners were more likely to incorporate atypical features into their representations. Inference learners were less likely to encode atypical category features, even in a "partial inference" condition that focused learners' attention on the feature dimensions relevant to classification. Overall, these results are contrary to the hypothesis that inference learning produces superior knowledge of within-category structure. Although inference learning promoted representations that included a broad range of category-typical features, classification learning promoted greater sensitivity to the distribution of typical and atypical features within categories.

  11. Classification of CT brain images based on deep learning networks.

    Science.gov (United States)

    Gao, Xiaohong W; Hui, Rui; Tian, Zengmin

    2017-01-01

    While computerised tomography (CT) may have been the first imaging tool to study human brain, it has not yet been implemented into clinical decision making process for diagnosis of Alzheimer's disease (AD). On the other hand, with the nature of being prevalent, inexpensive and non-invasive, CT does present diagnostic features of AD to a great extent. This study explores the significance and impact on the application of the burgeoning deep learning techniques to the task of classification of CT brain images, in particular utilising convolutional neural network (CNN), aiming at providing supplementary information for the early diagnosis of Alzheimer's disease. Towards this end, three categories of CT images (N = 285) are clustered into three groups, which are AD, lesion (e.g. tumour) and normal ageing. In addition, considering the characteristics of this collection with larger thickness along the direction of depth (z) (~3-5 mm), an advanced CNN architecture is established integrating both 2D and 3D CNN networks. The fusion of the two CNN networks is subsequently coordinated based on the average of Softmax scores obtained from both networks consolidating 2D images along spatial axial directions and 3D segmented blocks respectively. As a result, the classification accuracy rates rendered by this elaborated CNN architecture are 85.2%, 80% and 95.3% for classes of AD, lesion and normal respectively with an average of 87.6%. Additionally, this improved CNN network appears to outperform the others when in comparison with 2D version only of CNN network as well as a number of state of the art hand-crafted approaches. As a result, these approaches deliver accuracy rates in percentage of 86.3, 85.6 ± 1.10, 86.3 ± 1.04, 85.2 ± 1.60, 83.1 ± 0.35 for 2D CNN, 2D SIFT, 2D KAZE, 3D SIFT and 3D KAZE respectively. The two major contributions of the paper constitute a new 3-D approach while applying deep learning technique to extract signature information

  12. Quantitative evaluation of variations in rule-based classifications of land cover in urban neighbourhoods using WorldView-2 imagery

    Science.gov (United States)

    Belgiu, Mariana; ǎguţ, Lucian, , Dr; Strobl, Josef

    2014-01-01

    The increasing availability of high resolution imagery has triggered the need for automated image analysis techniques, with reduced human intervention and reproducible analysis procedures. The knowledge gained in the past might be of use to achieving this goal, if systematically organized into libraries which would guide the image analysis procedure. In this study we aimed at evaluating the variability of digital classifications carried out by three experts who were all assigned the same interpretation task. Besides the three classifications performed by independent operators, we developed an additional rule-based classification that relied on the image classifications best practices found in the literature, and used it as a surrogate for libraries of object characteristics. The results showed statistically significant differences among all operators who classified the same reference imagery. The classifications carried out by the experts achieved satisfactory results when transferred to another area for extracting the same classes of interest, without modification of the developed rules.

  13. Strategies for adding adaptive learning mechanisms to rule-based diagnostic expert systems

    Science.gov (United States)

    Stclair, D. C.; Sabharwal, C. L.; Bond, W. E.; Hacke, Keith

    1988-01-01

    Rule-based diagnostic expert systems can be used to perform many of the diagnostic chores necessary in today's complex space systems. These expert systems typically take a set of symptoms as input and produce diagnostic advice as output. The primary objective of such expert systems is to provide accurate and comprehensive advice which can be used to help return the space system in question to nominal operation. The development and maintenance of diagnostic expert systems is time and labor intensive since the services of both knowledge engineer(s) and domain expert(s) are required. The use of adaptive learning mechanisms to increment evaluate and refine rules promises to reduce both time and labor costs associated with such systems. This paper describes the basic adaptive learning mechanisms of strengthening, weakening, generalization, discrimination, and discovery. Next basic strategies are discussed for adding these learning mechanisms to rule-based diagnostic expert systems. These strategies support the incremental evaluation and refinement of rules in the knowledge base by comparing the set of advice given by the expert system (A) with the correct diagnosis (C). Techniques are described for selecting those rules in the in the knowledge base which should participate in adaptive learning. The strategies presented may be used with a wide variety of learning algorithms. Further, these strategies are applicable to a large number of rule-based diagnostic expert systems. They may be used to provide either immediate or deferred updating of the knowledge base.

  14. Effects of the Memorization of Rule Statements on Performance, Retention, and Transfer in a Computer-Based Learning Task.

    Science.gov (United States)

    Towle, Nelson J.

    Research sought to determine whether memorization of rule statements before, during or after instruction in rule application skills would facilitate the acquisition and/or retention of rule-governed behavior as compared to no-rule statement memorization. A computer-assisted instructional (CAI) program required high school students to learn to a…

  15. Rule induction performance in amnestic mild cognitive impairment and Alzheimer's dementia: examining the role of simple and biconditional rule learning processes.

    Science.gov (United States)

    Oosterman, Joukje M; Heringa, Sophie M; Kessels, Roy P C; Biessels, Geert Jan; Koek, Huiberdina L; Maes, Joseph H R; van den Berg, Esther

    2017-04-01

    Rule induction tests such as the Wisconsin Card Sorting Test require executive control processes, but also the learning and memorization of simple stimulus-response rules. In this study, we examined the contribution of diminished learning and memorization of simple rules to complex rule induction test performance in patients with amnestic mild cognitive impairment (aMCI) or Alzheimer's dementia (AD). Twenty-six aMCI patients, 39 AD patients, and 32 control participants were included. A task was used in which the memory load and the complexity of the rules were independently manipulated. This task consisted of three conditions: a simple two-rule learning condition (Condition 1), a simple four-rule learning condition (inducing an increase in memory load, Condition 2), and a complex biconditional four-rule learning condition-inducing an increase in complexity and, hence, executive control load (Condition 3). Performance of AD patients declined disproportionately when the number of simple rules that had to be memorized increased (from Condition 1 to 2). An additional increment in complexity (from Condition 2 to 3) did not, however, disproportionately affect performance of the patients. Performance of the aMCI patients did not differ from that of the control participants. In the patient group, correlation analysis showed that memory performance correlated with Condition 1 performance, whereas executive task performance correlated with Condition 2 performance. These results indicate that the reduced learning and memorization of underlying task rules explains a significant part of the diminished complex rule induction performance commonly reported in AD, although results from the correlation analysis suggest involvement of executive control functions as well. Taken together, these findings suggest that care is needed when interpreting rule induction task performance in terms of executive function deficits in these patients.

  16. Recent Advances in Conotoxin Classification by Using Machine Learning Methods.

    Science.gov (United States)

    Dao, Fu-Ying; Yang, Hui; Su, Zhen-Dong; Yang, Wuritu; Wu, Yun; Hui, Ding; Chen, Wei; Tang, Hua; Lin, Hao

    2017-06-25

    Conotoxins are disulfide-rich small peptides, which are invaluable peptides that target ion channel and neuronal receptors. Conotoxins have been demonstrated as potent pharmaceuticals in the treatment of a series of diseases, such as Alzheimer's disease, Parkinson's disease, and epilepsy. In addition, conotoxins are also ideal molecular templates for the development of new drug lead compounds and play important roles in neurobiological research as well. Thus, the accurate identification of conotoxin types will provide key clues for the biological research and clinical medicine. Generally, conotoxin types are confirmed when their sequence, structure, and function are experimentally validated. However, it is time-consuming and costly to acquire the structure and function information by using biochemical experiments. Therefore, it is important to develop computational tools for efficiently and effectively recognizing conotoxin types based on sequence information. In this work, we reviewed the current progress in computational identification of conotoxins in the following aspects: (i) construction of benchmark dataset; (ii) strategies for extracting sequence features; (iii) feature selection techniques; (iv) machine learning methods for classifying conotoxins; (v) the results obtained by these methods and the published tools; and (vi) future perspectives on conotoxin classification. The paper provides the basis for in-depth study of conotoxins and drug therapy research.

  17. Classification Framework for ICT-Based Learning Technologies for Disabled People

    Science.gov (United States)

    Hersh, Marion

    2017-01-01

    The paper presents the first systematic approach to the classification of inclusive information and communication technologies (ICT)-based learning technologies and ICT-based learning technologies for disabled people which covers both assistive and general learning technologies, is valid for all disabled people and considers the full range of…

  18. Ontology-based concept map learning path reasoning system using SWRL rules

    Energy Technology Data Exchange (ETDEWEB)

    Chu, K.-K.; Lee, C.-I. [National Univ. of Tainan, Taiwan (China). Dept. of Computer Science and Information Learning Technology

    2010-08-13

    Concept maps are graphical representations of knowledge. Concept mapping may reduce students' cognitive load and extend simple memory function. The purpose of this study was on the diagnosis of students' concept map learning abilities and the provision of personally constructive advice dependant on their learning path and progress. Ontology is a useful method with which to represent and store concept map information. Semantic web rule language (SWRL) rules are easy to understand and to use as specific reasoning services. This paper discussed the selection of grade 7 lakes and rivers curriculum for which to devise a concept map learning path reasoning service. The paper defined a concept map e-learning ontology and two SWRL semantic rules, and collected users' concept map learning path data to infer implicit knowledge and to recommend the next learning path for users. It was concluded that the designs devised in this study were feasible and advanced and the ontology kept the domain knowledge preserved. SWRL rules identified an abstraction model for inferred properties. Since they were separate systems, they did not interfere with each other, while ontology or SWRL rules were maintained, ensuring persistent system extensibility and robustness. 15 refs., 1 tab., 8 figs.

  19. Joint Concept Correlation and Feature-Concept Relevance Learning for Multilabel Classification.

    Science.gov (United States)

    Zhao, Xiaowei; Ma, Zhigang; Li, Zhi; Li, Zhihui

    2018-02-01

    In recent years, multilabel classification has attracted significant attention in multimedia annotation. However, most of the multilabel classification methods focus only on the inherent correlations existing among multiple labels and concepts and ignore the relevance between features and the target concepts. To obtain more robust multilabel classification results, we propose a new multilabel classification method aiming to capture the correlations among multiple concepts by leveraging hypergraph that is proved to be beneficial for relational learning. Moreover, we consider mining feature-concept relevance, which is often overlooked by many multilabel learning algorithms. To better show the feature-concept relevance, we impose a sparsity constraint on the proposed method. We compare the proposed method with several other multilabel classification methods and evaluate the classification performance by mean average precision on several data sets. The experimental results show that the proposed method outperforms the state-of-the-art methods.

  20. Oil palm fresh fruit bunch ripeness classification based on rule- based expert system of ROI image processing technique results

    International Nuclear Information System (INIS)

    Alfatni, M S M; Shariff, A R M; Marhaban, M H; Shafie, S B; Saaed, O M B; Abdullah, M Z; BAmiruddin, M D

    2014-01-01

    There is a processing need for a fast, easy and accurate classification system for oil palm fruit ripeness. Such a system will be invaluable to farmers and plantation managers who need to sell their oil palm fresh fruit bunch (FFB) for the mill as this will avoid disputes. In this paper,a new approach was developed under the name of expert rules-based systembased on the image processing techniques results of thethree different oil palm FFB region of interests (ROIs), namely; ROI1 (300x300 pixels), ROI2 (50x50 pixels) and ROI3 (100x100 pixels). The results show that the best rule-based ROIs for statistical colour feature extraction with k-nearest neighbors (KNN) classifier at 94% were chosen as well as the ROIs that indicated results higher than the rule-based outcome, such as the ROIs of statistical colour feature extraction with artificial neural network (ANN) classifier at 94%, were selected for further FFB ripeness inspection system

  1. Towards a complete rule-based classification approach for flat fingerprints

    CSIR Research Space (South Africa)

    Webb, L

    2014-12-01

    Full Text Available fingerprints. This work implements an algorithm which includes new rules to account for more instances of flat fingerprints with missing singular points, specifically when the delta of a Right Loop or Left Loop fingerprint is not captured, when one of the loops...

  2. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    Science.gov (United States)

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105

  3. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnacus (1707-1778), who is…

  4. Learning Display Rules: The Socialization of Emotion Expression in Infancy.

    Science.gov (United States)

    Malatesta, Carol Zander; Haviland, Jeannette M.

    1982-01-01

    Develops a methodology for studying emotion socialization and examines the synchrony of mother and infant expressions to determine whether "instruction" in display rules is underway in early infancy and what the short-term effects of such instruction on infant expression might be. Sixty dyads were videotaped during play and reunion after brief…

  5. Monetary Policy Rules, Learning and Stability: a Survey of the Recent Literature (In French)

    OpenAIRE

    Martin ZUMPE (GREThA UMR CNRS 5113)

    2010-01-01

    This paper presents the literature about econometric learning and its impact on the performances of monetary policy rules in the framework of the new canonical macroeconomic model. Rational expectations which are a building block of the original model can thus be replaced by expectations based on estimation algorithms. The permanent updating of these estimations can be interpreted as a learning proces of the model’s agents. This learning proces induces additional dynamics into the model. The ...

  6. Automatic Classification Using Supervised Learning in a Medical Document Filtering Application.

    Science.gov (United States)

    Mostafa, J.; Lam, W.

    2000-01-01

    Presents a multilevel model of the information filtering process that permits document classification. Evaluates a document classification approach based on a supervised learning algorithm, measures the accuracy of the algorithm in a neural network that was trained to classify medical documents on cell biology, and discusses filtering…

  7. Poster abstract: A machine learning approach for vehicle classification using passive infrared and ultrasonic sensors

    KAUST Repository

    Warriach, Ehsan Ullah; Claudel, Christian G.

    2013-01-01

    This article describes the implementation of four different machine learning techniques for vehicle classification in a dual ultrasonic/passive infrared traffic flow sensors. Using k-NN, Naive Bayes, SVM and KNN-SVM algorithms, we show that KNN

  8. Joint learning and weighting of visual vocabulary for bag-of-feature based tissue classification

    KAUST Repository

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2013-01-01

    their power in this field. Two important issues of bag-of-feature strategy for tissue classification are investigated in this paper: the visual vocabulary learning and weighting, which are always considered independently in traditional methods by neglecting

  9. CLASS-PAIR-GUIDED MULTIPLE KERNEL LEARNING OF INTEGRATING HETEROGENEOUS FEATURES FOR CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Q. Wang

    2017-10-01

    Full Text Available In recent years, many studies on remote sensing image classification have shown that using multiple features from different data sources can effectively improve the classification accuracy. As a very powerful means of learning, multiple kernel learning (MKL can conveniently be embedded in a variety of characteristics. The conventional combined kernel learned by MKL can be regarded as the compromise of all basic kernels for all classes in classification. It is the best of the whole, but not optimal for each specific class. For this problem, this paper proposes a class-pair-guided MKL method to integrate the heterogeneous features (HFs from multispectral image (MSI and light detection and ranging (LiDAR data. In particular, the one-against-one strategy is adopted, which converts multiclass classification problem to a plurality of two-class classification problem. Then, we select the best kernel from pre-constructed basic kernels set for each class-pair by kernel alignment (KA in the process of classification. The advantage of the proposed method is that only the best kernel for the classification of any two classes can be retained, which leads to greatly enhanced discriminability. Experiments are conducted on two real data sets, and the experimental results show that the proposed method achieves the best performance in terms of classification accuracies in integrating the HFs for classification when compared with several state-of-the-art algorithms.

  10. Rule Extraction Based on Extreme Learning Machine and an Improved Ant-Miner Algorithm for Transient Stability Assessment.

    Directory of Open Access Journals (Sweden)

    Yang Li

    Full Text Available In order to overcome the problems of poor understandability of the pattern recognition-based transient stability assessment (PRTSA methods, a new rule extraction method based on extreme learning machine (ELM and an improved Ant-miner (IAM algorithm is presented in this paper. First, the basic principles of ELM and Ant-miner algorithm are respectively introduced. Then, based on the selected optimal feature subset, an example sample set is generated by the trained ELM-based PRTSA model. And finally, a set of classification rules are obtained by IAM algorithm to replace the original ELM network. The novelty of this proposal is that transient stability rules are extracted from an example sample set generated by the trained ELM-based transient stability assessment model by using IAM algorithm. The effectiveness of the proposed method is shown by the application results on the New England 39-bus power system and a practical power system--the southern power system of Hebei province.

  11. Supervised machine learning and active learning in classification of radiology reports.

    Science.gov (United States)

    Nguyen, Dung H M; Patrick, Jon D

    2014-01-01

    This paper presents an automated system for classifying the results of imaging examinations (CT, MRI, positron emission tomography) into reportable and non-reportable cancer cases. This system is part of an industrial-strength processing pipeline built to extract content from radiology reports for use in the Victorian Cancer Registry. In addition to traditional supervised learning methods such as conditional random fields and support vector machines, active learning (AL) approaches were investigated to optimize training production and further improve classification performance. The project involved two pilot sites in Victoria, Australia (Lake Imaging (Ballarat) and Peter MacCallum Cancer Centre (Melbourne)) and, in collaboration with the NSW Central Registry, one pilot site at Westmead Hospital (Sydney). The reportability classifier performance achieved 98.25% sensitivity and 96.14% specificity on the cancer registry's held-out test set. Up to 92% of training data needed for supervised machine learning can be saved by AL. AL is a promising method for optimizing the supervised training production used in classification of radiology reports. When an AL strategy is applied during the data selection process, the cost of manual classification can be reduced significantly. The most important practical application of the reportability classifier is that it can dramatically reduce human effort in identifying relevant reports from the large imaging pool for further investigation of cancer. The classifier is built on a large real-world dataset and can achieve high performance in filtering relevant reports to support cancer registries. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  12. A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series

    OpenAIRE

    Chambon, Stanislas; Galtier, Mathieu; Arnal, Pierrick; Wainrib, Gilles; Gramfort, Alexandre

    2017-01-01

    Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns to each 30s of signal a sleep stage, based on the visual inspection of signals such as electroencephalograms (EEG), electrooculograms (EOG), electrocardiograms (ECG) and electromyograms (EMG). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or...

  13. Advances in Patient Classification for Traditional Chinese Medicine: A Machine Learning Perspective

    Directory of Open Access Journals (Sweden)

    Changbo Zhao

    2015-01-01

    data analyzed by different computational methods, we present the overview for four subfields of TCM diagnosis, respectively. For each subfield, we design a rectangular reference list with applications in the horizontal direction and machine learning algorithms in the longitudinal direction. According to the current development of objective TCM diagnosis for patient classification, a discussion of the research issues around machine learning techniques with applications to TCM diagnosis is given to facilitate the further research for TCM patient classification.

  14. Perspectives on Machine Learning for Classification of Schizotypy Using fMRI Data

    DEFF Research Database (Denmark)

    Madsen, Kristoffer Hougaard; Krohne, Laerke G; Cai, Xin-Lu

    2018-01-01

    Functional magnetic resonance imaging is capable of estimating functional activation and connectivity in the human brain, and lately there has been increased interest in the use of these functional modalities combined with machine learning for identification of psychiatric traits. While...... the use of machine learning schizotypy research. To this end, we describe common data processing steps while commenting on best practices and procedures. First, we introduce the important role of schizotypy to motivate the importance of reliable classification, and summarize existing machine learning....... We provide more detailed descriptions and software as supplementary material. Finally, we present current challenges in machine learning for classification of schizotypy and comment on future trends and perspectives....

  15. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,”“categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  16. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Full Text Available BackgroundCancer monitoring and prevention relies on the critical aspect of timely notification of cancer cases. However, the abstraction and classification of cancer from the free-text of pathology reports and other relevant documents, such as death certificates, exist as complex and time-consuming activities.AimsIn this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated.Method A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The numerous features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes.ResultsDeath certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM classifier achieved best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032 and false negative rate (0.0297 while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and entails the most robust classifier with a variance of 0.001141, half the variance of the other classifiers.ConclusionThe selection of features significantly produced the most influences on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema created a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with

  17. CIN classification and prediction using machine learning methods

    Science.gov (United States)

    Chirkina, Anastasia; Medvedeva, Marina; Komotskiy, Evgeny

    2017-06-01

    The aim of this paper is a comparison of the existing classification algorithms with different parameters, and selection those ones, which allows solving the problem of primary diagnosis of cervical intraepithelial neoplasia (CIN), as it characterizes the condition of the body in the precancerous stage. The paper describes a feature selection process, as well as selection of the best models for a multiclass classification.

  18. I - Multivariate Classification and Machine Learning in HEP

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Traditional multivariate methods for classification (Stochastic Gradient Boosted Decision Trees and Multi-Layer Perceptrons) are explained in theory and practise using examples from HEP. General aspects of multivariate classification are discussed, in particular different regularisation techniques. Afterwards, data-driven techniques are introduced and compared to MC-based methods.

  19. Bimodal emotion congruency is critical to preverbal infants' abstract rule learning.

    Science.gov (United States)

    Tsui, Angeline Sin Mei; Ma, Yuen Ki; Ho, Anna; Chow, Hiu Mei; Tseng, Chia-huei

    2016-05-01

    Extracting general rules from specific examples is important, as we must face the same challenge displayed in various formats. Previous studies have found that bimodal presentation of grammar-like rules (e.g. ABA) enhanced 5-month-olds' capacity to acquire a rule that infants failed to learn when the rule was presented with visual presentation of the shapes alone (circle-triangle-circle) or auditory presentation of the syllables (la-ba-la) alone. However, the mechanisms and constraints for this bimodal learning facilitation are still unknown. In this study, we used audio-visual relation congruency between bimodal stimulation to disentangle possible facilitation sources. We exposed 8- to 10-month-old infants to an AAB sequence consisting of visual faces with affective expressions and/or auditory voices conveying emotions. Our results showed that infants were able to distinguish the learned AAB rule from other novel rules under bimodal stimulation when the affects in audio and visual stimuli were congruently paired (Experiments 1A and 2A). Infants failed to acquire the same rule when audio-visual stimuli were incongruently matched (Experiment 2B) and when only the visual (Experiment 1B) or the audio (Experiment 1C) stimuli were presented. Our results highlight that bimodal facilitation in infant rule learning is not only dependent on better statistical probability and redundant sensory information, but also the relational congruency of audio-visual information. A video abstract of this article can be viewed at https://m.youtube.com/watch?v=KYTyjH1k9RQ. © 2015 John Wiley & Sons Ltd.

  20. A learning rule for very simple universal approximators consisting of a single layer of perceptrons.

    Science.gov (United States)

    Auer, Peter; Burgsteiner, Harald; Maass, Wolfgang

    2008-06-01

    One may argue that the simplest type of neural networks beyond a single perceptron is an array of several perceptrons in parallel. In spite of their simplicity, such circuits can compute any Boolean function if one views the majority of the binary perceptron outputs as the binary output of the parallel perceptron, and they are universal approximators for arbitrary continuous functions with values in [0,1] if one views the fraction of perceptrons that output 1 as the analog output of the parallel perceptron. Note that in contrast to the familiar model of a "multi-layer perceptron" the parallel perceptron that we consider here has just binary values as outputs of gates on the hidden layer. For a long time one has thought that there exists no competitive learning algorithm for these extremely simple neural networks, which also came to be known as committee machines. It is commonly assumed that one has to replace the hard threshold gates on the hidden layer by sigmoidal gates (or RBF-gates) and that one has to tune the weights on at least two successive layers in order to achieve satisfactory learning results for any class of neural networks that yield universal approximators. We show that this assumption is not true, by exhibiting a simple learning algorithm for parallel perceptrons - the parallel delta rule (p-delta rule). In contrast to backprop for multi-layer perceptrons, the p-delta rule only has to tune a single layer of weights, and it does not require the computation and communication of analog values with high precision. Reduced communication also distinguishes our new learning rule from other learning rules for parallel perceptrons such as MADALINE. Obviously these features make the p-delta rule attractive as a biologically more realistic alternative to backprop in biological neural circuits, but also for implementations in special purpose hardware. We show that the p-delta rule also implements gradient descent-with regard to a suitable error measure

  1. More than one kind of inference: re-examining what's learned in feature inference and classification.

    Science.gov (United States)

    Sweller, Naomi; Hayes, Brett K

    2010-08-01

    Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.

  2. CBM Resources/reserves classification and evaluation based on PRMS rules

    Science.gov (United States)

    Fa, Guifang; Yuan, Ruie; Wang, Zuoqian; Lan, Jun; Zhao, Jian; Xia, Mingjun; Cai, Dechao; Yi, Yanjing

    2018-02-01

    This paper introduces a set of definitions and classification requirements for coalbed methane (CBM) resources/reserves, based on Petroleum Resources Management System (PRMS). The basic CBM classification criterions of 1P, 2P, 3P and contingent resources are put forward from the following aspects: ownership, project maturity, drilling requirements, testing requirements, economic requirements, infrastructure and market, timing of production and development, and so on. The volumetric method is used to evaluate the OGIP, with focuses on analyses of key parameters and principles of the parameter selection, such as net thickness, ash and water content, coal rank and composition, coal density, cleat volume and saturation and absorbed gas content etc. A dynamic method is used to assess the reserves and recovery efficiency. Since the differences in rock and fluid properties, displacement mechanism, completion and operating practices and wellbore type resulted in different production curve characteristics, the factors affecting production behavior, the dewatering period, pressure build-up and interference effects were analyzed. The conclusion and results that the paper achieved can be used as important references for reasonable assessment of CBM resources/reserves.

  3. Learning and Retention through Predictive Inference and Classification

    Science.gov (United States)

    Sakamoto, Yasuaki; Love, Bradley C.

    2010-01-01

    Work in category learning addresses how humans acquire knowledge and, thus, should inform classroom practices. In two experiments, we apply and evaluate intuitions garnered from laboratory-based research in category learning to learning tasks situated in an educational context. In Experiment 1, learning through predictive inference and…

  4. Learning soil classification with the Kayapó indians

    OpenAIRE

    Cooper,Miguel; Teramoto,Edson Roberto; Vidal-Torrado,Pablo; Sparovek,Gerd

    2005-01-01

    The Kayapó Xicrin do Cateté (Xicrin) indigenous reserve is located within the Amazon forest in Pará (Brazil). The Xicrins have developed a soil classification system that is incorporated in their language and culture. The etymology of their classification system and its logical structure makes it similar and comparable with modern soil classification. The etymology of the Xicrin's language is based on the junction of radicals to form words for different soil names. The name of the soil is for...

  5. KDIGO 2012 Clinical Practice Guideline CKD classification rules out creatinine clearance 24 hour urine collection?

    Science.gov (United States)

    Ognibene, A; Grandi, G; Lorubbio, M; Rapi, S; Salvadori, B; Terreni, A; Veroni, F

    2016-01-01

    The recent guideline for the evaluation and management of Chronic Kidney Disease recommends assessing GFR employing equations based on serum creatinine; despite this, creatinine clearance 24-hour urine collection is used routinely in many settings. In this study we compared the classification assessed from CrCl (creatinine clearance 24h urine collection) and e-GFR calculated with CKD-EPI or MDRD formulas. In this retrospective study we analyze consecutive laboratory data: creatinine clearance 24h urine collection, serum creatinine and demographic data such as sex and age from 15,777 patients >18 years of age collected from 2011 to 2013 in our laboratory at Careggi Hospital. The results were then compared to the estimated GFR calculated with the equations according to the recent treatment guidelines. Consecutive and retrospective laboratory data (creatinine clearance 24h urine collection, serum creatinine and, demographic data such as sex and age) from 15,777 patients >18 years of age seen at Careggi Hospital were collected. Comparison between e-GFR calculated with CKD-EPI or MDRD formulas and GFR according CrCl determinations and bias [95% CI] were 11.34 [-47,4/70.1] and 11.4 [-50.2/73] respectively. The concordance for 18/65 years aged group when compared with e-GFR classification between MDRD vs CKDEPI, MDRD vs CrCl and CKD-EPI vs CrCl were 0.78, 0.34, and 0.41 respectively, while in the 65/110years aged group the concordance Kappas were 0.84, 0.38, and 0.36 respectively. The use of CrCl provides a different classification than the estimation of GFR using a prediction equation. The CrCl is unreliable when it is necessary to identify CKD subjects with decrease of GFR of 5ml/min/1.73m(2)/year. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  6. Learning and innovative elements of strategy adoption rules expand cooperative network topologies.

    Science.gov (United States)

    Wang, Shijun; Szalay, Máté S; Zhang, Changshui; Csermely, Peter

    2008-04-09

    Cooperation plays a key role in the evolution of complex systems. However, the level of cooperation extensively varies with the topology of agent networks in the widely used models of repeated games. Here we show that cooperation remains rather stable by applying the reinforcement learning strategy adoption rule, Q-learning on a variety of random, regular, small-word, scale-free and modular network models in repeated, multi-agent Prisoner's Dilemma and Hawk-Dove games. Furthermore, we found that using the above model systems other long-term learning strategy adoption rules also promote cooperation, while introducing a low level of noise (as a model of innovation) to the strategy adoption rules makes the level of cooperation less dependent on the actual network topology. Our results demonstrate that long-term learning and random elements in the strategy adoption rules, when acting together, extend the range of network topologies enabling the development of cooperation at a wider range of costs and temptations. These results suggest that a balanced duo of learning and innovation may help to preserve cooperation during the re-organization of real-world networks, and may play a prominent role in the evolution of self-organizing, complex systems.

  7. Negation handling in sentiment classification using rule-based adapted from Indonesian language syntactic for Indonesian text in Twitter

    Science.gov (United States)

    Amalia, Rizkiana; Arif Bijaksana, Moch; Darmantoro, Dhinta

    2018-03-01

    The presence of the word negation is able to change the polarity of the text if it is not handled properly it will affect the performance of the sentiment classification. Negation words in Indonesian are ‘tidak’, ‘bukan’, ‘belum’ and ‘jangan’. Also, there is a conjunction word that able to reverse the actual values, as the word ‘tetapi’, or ‘tapi’. Unigram has shortcomings in dealing with the existence of negation because it treats negation word and the negated words as separate words. A general approach for negation handling in English text gives the tag ‘NEG_’ for following words after negation until the first punctuation. But this may gives the tag to un-negated, and this approach does not handle negation and conjunction in one sentences. The rule-based method to determine what words negated by adapting the rules of Indonesian language syntactic of negation to determine the scope of negation was proposed in this study. With adapting syntactic rules and tagging “NEG_” using SVM classifier with RBF kernel has better performance results than the other experiments. Considering the average F1-score value, the performance of this proposed method can be improved against baseline equal to 1.79% (baseline without negation handling) and 5% (baseline with existing negation handling) for a dataset that all tweets contain negation words. And also for the second dataset that has the various number of negation words in document tweet. It can be improved against baseline at 2.69% (without negation handling) and 3.17% (with existing negation handling).

  8. A Machine Learning Approach to Discover Rules for Expressive Performance Actions in Jazz Guitar Music

    Science.gov (United States)

    Giraldo, Sergio I.; Ramirez, Rafael

    2016-01-01

    Expert musicians introduce expression in their performances by manipulating sound properties such as timing, energy, pitch, and timbre. Here, we present a data driven computational approach to induce expressive performance rule models for note duration, onset, energy, and ornamentation transformations in jazz guitar music. We extract high-level features from a set of 16 commercial audio recordings (and corresponding music scores) of jazz guitarist Grant Green in order to characterize the expression in the pieces. We apply machine learning techniques to the resulting features to learn expressive performance rule models. We (1) quantitatively evaluate the accuracy of the induced models, (2) analyse the relative importance of the considered musical features, (3) discuss some of the learnt expressive performance rules in the context of previous work, and (4) assess their generailty. The accuracies of the induced predictive models is significantly above base-line levels indicating that the audio performances and the musical features extracted contain sufficient information to automatically learn informative expressive performance patterns. Feature analysis shows that the most important musical features for predicting expressive transformations are note duration, pitch, metrical strength, phrase position, Narmour structure, and tempo and key of the piece. Similarities and differences between the induced expressive rules and the rules reported in the literature were found. Differences may be due to the fact that most previously studied performance data has consisted of classical music recordings. Finally, the rules' performer specificity/generality is assessed by applying the induced rules to performances of the same pieces performed by two other professional jazz guitar players. Results show a consistency in the ornamentation patterns between Grant Green and the other two musicians, which may be interpreted as a good indicator for generality of the ornamentation rules

  9. A Machine Learning Approach to Discover Rules for Expressive Performance Actions in Jazz Guitar Music.

    Science.gov (United States)

    Giraldo, Sergio I; Ramirez, Rafael

    2016-01-01

    Expert musicians introduce expression in their performances by manipulating sound properties such as timing, energy, pitch, and timbre. Here, we present a data driven computational approach to induce expressive performance rule models for note duration, onset, energy, and ornamentation transformations in jazz guitar music. We extract high-level features from a set of 16 commercial audio recordings (and corresponding music scores) of jazz guitarist Grant Green in order to characterize the expression in the pieces. We apply machine learning techniques to the resulting features to learn expressive performance rule models. We (1) quantitatively evaluate the accuracy of the induced models, (2) analyse the relative importance of the considered musical features, (3) discuss some of the learnt expressive performance rules in the context of previous work, and (4) assess their generailty. The accuracies of the induced predictive models is significantly above base-line levels indicating that the audio performances and the musical features extracted contain sufficient information to automatically learn informative expressive performance patterns. Feature analysis shows that the most important musical features for predicting expressive transformations are note duration, pitch, metrical strength, phrase position, Narmour structure, and tempo and key of the piece. Similarities and differences between the induced expressive rules and the rules reported in the literature were found. Differences may be due to the fact that most previously studied performance data has consisted of classical music recordings. Finally, the rules' performer specificity/generality is assessed by applying the induced rules to performances of the same pieces performed by two other professional jazz guitar players. Results show a consistency in the ornamentation patterns between Grant Green and the other two musicians, which may be interpreted as a good indicator for generality of the ornamentation rules.

  10. Chicago Classification of Esophageal Motility Disorders: Lessons Learned

    NARCIS (Netherlands)

    Rohof, W. O. A.; Bredenoord, A. J.

    2017-01-01

    High-resolution manometry (HRM) is increasingly performed worldwide, to study esophageal motility. The Chicago classification is subsequently applied to interpret the manometric findings and facilitate a diagnosis of esophageal motility disorders. This review will discuss new insights regarding the

  11. A learning rule that explains how rewards teach attention

    NARCIS (Netherlands)

    Rombouts, Jaldert O.; Bohte, Sander M.; Martinez-Trujillo, Julio; Roelfsema, Pieter R.

    2015-01-01

    Many theories propose that top-down attentional signals control processing in sensory cortices by modulating neural activity. But who controls the controller? Here we investigate how a biologically plausible neural reinforcement learning scheme can create higher order representations and top-down

  12. A learning rule that explains how rewards teach attention

    NARCIS (Netherlands)

    J.O. Rombouts (Jaldert); S.M. Bohte (Sander); J. Martinez-Trujillo; P.R. Roelfsema

    2015-01-01

    htmlabstractMany theories propose that top-down attentional signals control processing in sensory cortices by modulating neural activity. But who controls the controller? Here we investigate how a biologically plausible neural reinforcement learning scheme can create higher order representations and

  13. Perceptual learning rules based on reinforcers and attention

    NARCIS (Netherlands)

    Roelfsema, Pieter R.; van Ooyen, Arjen; Watanabe, Takeo

    2010-01-01

    How does the brain learn those visual features that are relevant for behavior? In this article, we focus on two factors that guide plasticity of visual representations. First, reinforcers cause the global release of diffusive neuromodulatory signals that gate plasticity. Second, attentional feedback

  14. Comparative study of deep learning methods for one-shot image classification (abstract)

    NARCIS (Netherlands)

    van den Bogaert, J.; Mohseni, H.; Khodier, M.; Stoyanov, Y.; Mocanu, D.C.; Menkovski, V.

    2017-01-01

    Training deep learning models for images classification requires large amount of labeled data to overcome the challenges of overfitting and underfitting. Usually, in many practical applications, these labeled data are not available. In an attempt to solve this problem, the one-shot learning paradigm

  15. Test-Enhanced Learning of Natural Concepts: Effects on Recognition Memory, Classification, and Metacognition

    Science.gov (United States)

    Jacoby, Larry L.; Wahlheim, Christopher N.; Coane, Jennifer H.

    2010-01-01

    Three experiments examined testing effects on learning of natural concepts and metacognitive assessments of such learning. Results revealed that testing enhanced recognition memory and classification accuracy for studied and novel exemplars of bird families on immediate and delayed tests. These effects depended on the balance of study and test…

  16. Classification and learning using genetic algorithms applications in Bioinformatics and Web Intelligence

    CERN Document Server

    Bandyopadhyay, Sanghamitra

    2007-01-01

    This book provides a unified framework that describes how genetic learning can be used to design pattern recognition and learning systems. It examines how a search technique, the genetic algorithm, can be used for pattern classification mainly through approximating decision boundaries. Coverage also demonstrates the effectiveness of the genetic classifiers vis-à-vis several widely used classifiers, including neural networks.

  17. Category learning strategies in younger and older adults: Rule abstraction and memorization.

    Science.gov (United States)

    Wahlheim, Christopher N; McDaniel, Mark A; Little, Jeri L

    2016-06-01

    Despite the fundamental role of category learning in cognition, few studies have examined how this ability differs between younger and older adults. The present experiment examined possible age differences in category learning strategies and their effects on learning. Participants were trained on a category determined by a disjunctive rule applied to relational features. The utilization of rule- and exemplar-based strategies was indexed by self-reports and transfer performance. Based on self-reported strategies, the frequencies of rule- and exemplar-based learners were not significantly different between age groups, but there was a significantly higher frequency of intermediate learners (i.e., learners not identifying with a reliance on either rule- or exemplar-based strategies) in the older than younger adult group. Training performance was higher for younger than older adults regardless of the strategy utilized, showing that older adults were impaired in their ability to learn the correct rule or to remember exemplar-label associations. Transfer performance converged with strategy reports in showing higher fidelity category representations for younger adults. Younger adults with high working memory capacity were more likely to use an exemplar-based strategy, and older adults with high working memory capacity showed better training performance. Age groups did not differ in their self-reported memory beliefs, and these beliefs did not predict training strategies or performance. Overall, the present results contradict earlier findings that older adults prefer rule- to exemplar-based learning strategies, presumably to compensate for memory deficits. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  18. Integrating Interview Methodology to Analyze Inter-Institutional Comparisons of Service-Learning within the Carnegie Community Engagement Classification Framework

    Science.gov (United States)

    Plante, Jarrad D.; Cox, Thomas D.

    2016-01-01

    Service-learning has a longstanding history in higher education in and includes three main tenets: academic learning, meaningful community service, and civic learning. The Carnegie Foundation for the Advancement of Teaching created an elective classification system called the Carnegie Community Engagement Classification for higher education…

  19. EEG Eye State Identification Using Incremental Attribute Learning with Time-Series Classification

    Directory of Open Access Journals (Sweden)

    Ting Wang

    2014-01-01

    Full Text Available Eye state identification is a kind of common time-series classification problem which is also a hot spot in recent research. Electroencephalography (EEG is widely used in eye state classification to detect human's cognition state. Previous research has validated the feasibility of machine learning and statistical approaches for EEG eye state classification. This paper aims to propose a novel approach for EEG eye state identification using incremental attribute learning (IAL based on neural networks. IAL is a novel machine learning strategy which gradually imports and trains features one by one. Previous studies have verified that such an approach is applicable for solving a number of pattern recognition problems. However, in these previous works, little research on IAL focused on its application to time-series problems. Therefore, it is still unknown whether IAL can be employed to cope with time-series problems like EEG eye state classification. Experimental results in this study demonstrates that, with proper feature extraction and feature ordering, IAL can not only efficiently cope with time-series classification problems, but also exhibit better classification performance in terms of classification error rates in comparison with conventional and some other approaches.

  20. Learning Dispatching Rules for Scheduling: A Synergistic View Comprising Decision Trees, Tabu Search and Simulation

    Directory of Open Access Journals (Sweden)

    Atif Shahzad

    2016-02-01

    Full Text Available A promising approach for an effective shop scheduling that synergizes the benefits of the combinatorial optimization, supervised learning and discrete-event simulation is presented. Though dispatching rules are in widely used by shop scheduling practitioners, only ordinary performance rules are known; hence, dynamic generation of dispatching rules is desired to make them more effective in changing shop conditions. Meta-heuristics are able to perform quite well and carry more knowledge of the problem domain, however at the cost of prohibitive computational effort in real-time. The primary purpose of this research lies in an offline extraction of this domain knowledge using decision trees to generate simple if-then rules that subsequently act as dispatching rules for scheduling in an online manner. We used similarity index to identify parametric and structural similarity in problem instances in order to implicitly support the learning algorithm for effective rule generation and quality index for relative ranking of the dispatching decisions. Maximum lateness is used as the scheduling objective in a job shop scheduling environment.

  1. Classification of PolSAR Images Using Multilayer Autoencoders and a Self-Paced Learning Approach

    Directory of Open Access Journals (Sweden)

    Wenshuai Chen

    2018-01-01

    Full Text Available In this paper, a novel polarimetric synthetic aperture radar (PolSAR image classification method based on multilayer autoencoders and self-paced learning (SPL is proposed. The multilayer autoencoders network is used to learn the features, which convert raw data into more abstract expressions. Then, softmax regression is applied to produce the predicted probability distributions over all the classes of each pixel. When we optimize the multilayer autoencoders network, self-paced learning is used to accelerate the learning convergence and achieve a stronger generalization capability. Under this learning paradigm, the network learns the easier samples first and gradually involves more difficult samples in the training process. The proposed method achieves the overall classification accuracies of 94.73%, 94.82% and 78.12% on the Flevoland dataset from AIRSAR, Flevoland dataset from RADARSAT-2 and Yellow River delta dataset, respectively. Such results are comparable with other state-of-the-art methods.

  2. Logical-Rule Models of Classification Response Times: A Synthesis of Mental-Architecture, Random-Walk, and Decision-Bound Approaches

    Science.gov (United States)

    Fific, Mario; Little, Daniel R.; Nosofsky, Robert M.

    2010-01-01

    We formalize and provide tests of a set of logical-rule models for predicting perceptual classification response times (RTs) and choice probabilities. The models are developed by synthesizing mental-architecture, random-walk, and decision-bound approaches. According to the models, people make independent decisions about the locations of stimuli…

  3. Deep learning application: rubbish classification with aid of an android device

    Science.gov (United States)

    Liu, Sijiang; Jiang, Bo; Zhan, Jie

    2017-06-01

    Deep learning is a very hot topic currently in pattern recognition and artificial intelligence researches. Aiming at the practical problem that people usually don't know correct classifications some rubbish should belong to, based on the powerful image classification ability of the deep learning method, we have designed a prototype system to help users to classify kinds of rubbish. Firstly the CaffeNet Model was adopted for our classification network training on the ImageNet dataset, and the trained network was deployed on a web server. Secondly an android app was developed for users to capture images of unclassified rubbish, upload images to the web server for analyzing backstage and retrieve the feedback, so that users can obtain the classification guide by an android device conveniently. Tests on our prototype system of rubbish classification show that: an image of one single type of rubbish with origin shape can be better used to judge its classification, while an image containing kinds of rubbish or rubbish with changed shape may fail to help users to decide rubbish's classification. However, the system still shows promising auxiliary function for rubbish classification if the network training strategy can be optimized further.

  4. Fuzzy OLAP association rules mining-based modular reinforcement learning approach for multiagent systems.

    Science.gov (United States)

    Kaya, Mehmet; Alhajj, Reda

    2005-04-01

    Multiagent systems and data mining have recently attracted considerable attention in the field of computing. Reinforcement learning is the most commonly used learning process for multiagent systems. However, it still has some drawbacks, including modeling other learning agents present in the domain as part of the state of the environment, and some states are experienced much less than others, or some state-action pairs are never visited during the learning phase. Further, before completing the learning process, an agent cannot exhibit a certain behavior in some states that may be experienced sufficiently. In this study, we propose a novel multiagent learning approach to handle these problems. Our approach is based on utilizing the mining process for modular cooperative learning systems. It incorporates fuzziness and online analytical processing (OLAP) based mining to effectively process the information reported by agents. First, we describe a fuzzy data cube OLAP architecture which facilitates effective storage and processing of the state information reported by agents. This way, the action of the other agent, not even in the visual environment. of the agent under consideration, can simply be predicted by extracting online association rules, a well-known data mining technique, from the constructed data cube. Second, we present a new action selection model, which is also based on association rules mining. Finally, we generalize not sufficiently experienced states, by mining multilevel association rules from the proposed fuzzy data cube. Experimental results obtained on two different versions of a well-known pursuit domain show the robustness and effectiveness of the proposed fuzzy OLAP mining based modular learning approach. Finally, we tested the scalability of the approach presented in this paper and compared it with our previous work on modular-fuzzy Q-learning and ordinary Q-learning.

  5. Learning "Rules" of Practice within the Context of the Practicum Triad: A Case Study of Learning to Teach

    Science.gov (United States)

    Chalies, Sebastien; Escalie, Guillaume; Stefano, Bertone; Clarke, Anthony

    2012-01-01

    This case study sought to determine the professional development circumstances in which a preservice teacher learned rules of practice (Wittgenstein, 1996) on practicum while interacting with a cooperating teacher and university supervisor. Borrowing from a theoretical conceptualization of teacher professional development based on the postulates…

  6. Fusion of deep learning architectures, multilayer feedforward networks and learning vector quantizers for deep classification learning

    NARCIS (Netherlands)

    Villmann, T.; Biehl, M.; Villmann, A.; Saralajew, S.

    2017-01-01

    The advantage of prototype based learning vector quantizers are the intuitive and simple model adaptation as well as the easy interpretability of the prototypes as class representatives for the class distribution to be learned. Although they frequently yield competitive performance and show robust

  7. Learning soil classification with the Kayapó indians

    Directory of Open Access Journals (Sweden)

    Cooper Miguel

    2005-01-01

    Full Text Available The Kayapó Xicrin do Cateté (Xicrin indigenous reserve is located within the Amazon forest in Pará (Brazil. The Xicrins have developed a soil classification system that is incorporated in their language and culture. The etymology of their classification system and its logical structure makes it similar and comparable with modern soil classification. The etymology of the Xicrin's language is based on the junction of radicals to form words for different soil names. The name of the soil is formed by the main noun radical "puka", to which adjectives referring to soil morphological attributes are added. Modern classification systems are also based on similar morphological variables, and analytical support for defining boundaries of chemical or physical soil attributes are important only in lower hierarchical levels. Soil scientists have developed a soil classification system that is sensitive for the restrictions and potentialities the soil will show for modern agriculture. The Xicrins classify soils for what is important for their life style, i.e. a harmonic and friendly life with the resources they gain from the forest.

  8. Ontology-based classification of remote sensing images using spectral rules

    Science.gov (United States)

    Andrés, Samuel; Arvor, Damien; Mougenot, Isabelle; Libourel, Thérèse; Durieux, Laurent

    2017-05-01

    Earth Observation data is of great interest for a wide spectrum of scientific domain applications. An enhanced access to remote sensing images for "domain" experts thus represents a great advance since it allows users to interpret remote sensing images based on their domain expert knowledge. However, such an advantage can also turn into a major limitation if this knowledge is not formalized, and thus is difficult for it to be shared with and understood by other users. In this context, knowledge representation techniques such as ontologies should play a major role in the future of remote sensing applications. We implemented an ontology-based prototype to automatically classify Landsat images based on explicit spectral rules. The ontology is designed in a very modular way in order to achieve a generic and versatile representation of concepts we think of utmost importance in remote sensing. The prototype was tested on four subsets of Landsat images and the results confirmed the potential of ontologies to formalize expert knowledge and classify remote sensing images.

  9. Optimal monetary policy rules: the problem of stability under heterogeneous learning

    Czech Academy of Sciences Publication Activity Database

    Bogomolova, Anna; Kolyuzhnov, Dmitri

    -, č. 379 (2008), s. 1-34 ISSN 1211-3298 R&D Projects: GA MŠk LC542 Institutional research plan: CEZ:AV0Z70850503 Keywords : monetary policy rules * New Keynesian model * adaptive learning Subject RIV: AH - Economics http://www.cerge-ei.cz/pdf/wp/Wp379.pdf

  10. The Aptitude-Treatment Interaction Effects on the Learning of Grammar Rules

    Science.gov (United States)

    Hwu, Fenfang; Sun, Shuyan

    2012-01-01

    The present study investigates the interaction between two types of explicit instructional approaches, deduction and explicit-induction, and the level of foreign language aptitude in the learning of grammar rules. Results indicate that on the whole the two equally explicit instructional approaches did not differentially affect learning…

  11. Beyond Motivation - History as a method for the learning of meta-discursive rules in mathematics

    DEFF Research Database (Denmark)

    Kjeldsen, Tinne Hoff; Blomhøj, Morten

    2012-01-01

    In this paper, we argue that history might have a profound role to play for learning mathematics by providing a self-evident (if not indispensable) strategy for revealing meta-discursive rules in mathematics and turning them into explicit objects of reflection for students. Our argument is based...

  12. Effect of e-learning program on risk assessment and pressure ulcer classification - A randomized study.

    Science.gov (United States)

    Bredesen, Ida Marie; Bjøro, Karen; Gunningberg, Lena; Hofoss, Dag

    2016-05-01

    Pressure ulcers (PUs) are a problem in health care. Staff competency is paramount to PU prevention. Education is essential to increase skills in pressure ulcer classification and risk assessment. Currently, no pressure ulcer learning programs are available in Norwegian. Develop and test an e-learning program for assessment of pressure ulcer risk and pressure ulcer classification. Forty-four nurses working in acute care hospital wards or nursing homes participated and were assigned randomly into two groups: an e-learning program group (intervention) and a traditional classroom lecture group (control). Data was collected immediately before and after training, and again after three months. The study was conducted at one nursing home and two hospitals between May and December 2012. Accuracy of risk assessment (five patient cases) and pressure ulcer classification (40 photos [normal skin, pressure ulcer categories I-IV] split in two sets) were measured by comparing nurse evaluations in each of the two groups to a pre-established standard based on ratings by experts in pressure ulcer classification and risk assessment. Inter-rater reliability was measured by exact percent agreement and multi-rater Fleiss kappa. A Mann-Whitney U test was used for continuous sum score variables. An e-learning program did not improve Braden subscale scoring. For pressure ulcer classification, however, the intervention group scored significantly higher than the control group on several of the categories in post-test immediately after training. However, after three months there were no significant differences in classification skills between the groups. An e-learning program appears to have a greater effect on the accuracy of pressure ulcer classification than classroom teaching in the short term. For proficiency in Braden scoring, no significant effect of educational methods on learning results was detected. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. A Machine Learning Approach to Discover Rules for Expressive Performance Actions in Jazz Guitar Music

    Directory of Open Access Journals (Sweden)

    Sergio Ivan Giraldo

    2016-12-01

    Full Text Available Expert musicians introduce expression in their performances by manipulating sound properties such as timing, energy, pitch, and timbre. Here, we present a data driven computational approach to induce expressive performance rule models for note duration, onset, energy, and ornamentation transformations in jazz guitar music. We extract high-level features from a set of 16 commercial audio recordings (and corresponding music scores of jazz guitarist Grant Green in order to characterize the expression in the pieces. We apply machine learning techniques to the resulting features to learn expressive performance rule models. We (1 quantitatively evaluate the accuracy of the induced models, (2 analyse the relative importance of the considered musical features, (3 discuss some of the learnt expressive performance rules in the context of previous work, and (4 assess their generailty. The accuracies of the induced predictive models is significantly above base-line levels indicating that the audio performances and the musical features extracted contain sufficient information to automatically learn informative expressive performance patterns. Feature analysis shows that the most important musical features for predicting expressive transformations are note duration, pitch, metrical strength, phrase position, Narmour structure, and tempo and key of the piece. Similarities and differences between the induced expressive rules and the rules reported in the literature were found. Differences may be due to the fact that most previously studied performance data has consisted of classical music recordings. Finally, the rules’ performer specificity/generality is assessed by applying the induced rules to performances of the same pieces performed by two other professional jazz guitar players. Results show a consistency in the ornamentation patterns between Grant Green and the other two musicians, which may be interpreted as a good indicator for generality of the

  14. Online Learning for Classification of Alzheimer Disease based on Cortical Thickness and Hippocampal Shape Analysis.

    Science.gov (United States)

    Lee, Ga-Young; Kim, Jeonghun; Kim, Ju Han; Kim, Kiwoong; Seong, Joon-Kyung

    2014-01-01

    Mobile healthcare applications are becoming a growing trend. Also, the prevalence of dementia in modern society is showing a steady growing trend. Among degenerative brain diseases that cause dementia, Alzheimer disease (AD) is the most common. The purpose of this study was to identify AD patients using magnetic resonance imaging in the mobile environment. We propose an incremental classification for mobile healthcare systems. Our classification method is based on incremental learning for AD diagnosis and AD prediction using the cortical thickness data and hippocampus shape. We constructed a classifier based on principal component analysis and linear discriminant analysis. We performed initial learning and mobile subject classification. Initial learning is the group learning part in our server. Our smartphone agent implements the mobile classification and shows various results. With use of cortical thickness data analysis alone, the discrimination accuracy was 87.33% (sensitivity 96.49% and specificity 64.33%). When cortical thickness data and hippocampal shape were analyzed together, the achieved accuracy was 87.52% (sensitivity 96.79% and specificity 63.24%). In this paper, we presented a classification method based on online learning for AD diagnosis by employing both cortical thickness data and hippocampal shape analysis data. Our method was implemented on smartphone devices and discriminated AD patients for normal group.

  15. Advanced Machine Learning for Classification, Regression, and Generation in Jet Physics

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    There is a deep connection between machine learning and jet physics - after all, jets are defined by unsupervised learning algorithms. Jet physics has been a driving force for studying modern machine learning in high energy physics. Domain specific challenges require new techniques to make full use of the algorithms. A key focus is on understanding how and what the algorithms learn. Modern machine learning techniques for jet physics are demonstrated for classification, regression, and generation. In addition to providing powerful baseline performance, we show how to train complex models directly on data and to generate sparse stacked images with non-uniform granularity.

  16. Clinical classification in low back pain: best-evidence diagnostic rules based on systematic reviews.

    Science.gov (United States)

    Petersen, Tom; Laslett, Mark; Juhl, Carsten

    2017-05-12

    Clinical examination findings are used in primary care to give an initial diagnosis to patients with low back pain and related leg symptoms. The purpose of this study was to develop best evidence Clinical Diagnostic Rules (CDR] for the identification of the most common patho-anatomical disorders in the lumbar spine; i.e. intervertebral discs, sacroiliac joints, facet joints, bone, muscles, nerve roots, muscles, peripheral nerve tissue, and central nervous system sensitization. A sensitive electronic search strategy using MEDLINE, EMBASE and CINAHL databases was combined with hand searching and citation tracking to identify eligible studies. Criteria for inclusion were: persons with low back pain with or without related leg symptoms, history or physical examination findings suitable for use in primary care, comparison with acceptable reference standards, and statistical reporting permitting calculation of diagnostic value. Quality assessments were made independently by two reviewers using the Quality Assessment of Diagnostic Accuracy Studies tool. Clinical examination findings that were investigated by at least two studies were included and results that met our predefined threshold of positive likelihood ratio ≥ 2 or negative likelihood ratio ≤ 0.5 were considered for the CDR. Sixty-four studies satisfied our eligible criteria. We were able to construct promising CDRs for symptomatic intervertebral disc, sacroiliac joint, spondylolisthesis, disc herniation with nerve root involvement, and spinal stenosis. Single clinical test appear not to be as useful as clusters of tests that are more closely in line with clinical decision making. This is the first comprehensive systematic review of diagnostic accuracy studies that evaluate clinical examination findings for their ability to identify the most common patho-anatomical disorders in the lumbar spine. In some diagnostic categories we have sufficient evidence to recommend a CDR. In others, we have only

  17. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification.

    Science.gov (United States)

    Bing, Lu; Wang, Wei

    2017-01-01

    We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  18. Rough Sets as a Knowledge Discovery and Classification Tool for the Diagnosis of Students with Learning Disabilities

    Directory of Open Access Journals (Sweden)

    Yu-Chi Lin

    2011-02-01

    Full Text Available Due to the implicit characteristics of learning disabilities (LDs, the diagnosis of students with learning disabilities has long been a difficult issue. Artificial intelligence techniques like artificial neural network (ANN and support vector machine (SVM have been applied to the LD diagnosis problem with satisfactory outcomes. However, special education teachers or professionals tend to be skeptical to these kinds of black-box predictors. In this study, we adopt the rough set theory (RST, which can not only perform as a classifier, but may also produce meaningful explanations or rules, to the LD diagnosis application. Our experiments indicate that the RST approach is competitive as a tool for feature selection, and it performs better in term of prediction accuracy than other rulebased algorithms such as decision tree and ripper algorithms. We also propose to mix samples collected from sources with different LD diagnosis procedure and criteria. By pre-processing these mixed samples with simple and readily available clustering algorithms, we are able to improve the quality and support of rules generated by the RST. Overall, our study shows that the rough set approach, as a classification and knowledge discovery tool, may have great potential in playing an essential role in LD diagnosis.

  19. Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

    Directory of Open Access Journals (Sweden)

    Tsatsoulis Costas

    2010-05-01

    Full Text Available Abstract Background There is increasing evidence that gene location and surrounding genes influence the functionality of genes in the eukaryotic genome. Knowing the Gene Ontology Slim terms associated with a gene gives us insight into a gene's functionality by informing us how its gene product behaves in a cellular context using three different ontologies: molecular function, biological process, and cellular component. In this study, we analyzed if we could classify a gene in Saccharomyces cerevisiae to its correct Gene Ontology Slim term using information about its location in the genome and information from its nearest-neighbouring genes using classification learning. Results We performed experiments to establish that the MultiBoostAB algorithm using the J48 classifier could correctly classify Gene Ontology Slim terms of a gene given information regarding the gene's location and information from its nearest-neighbouring genes for training. Different neighbourhood sizes were examined to determine how many nearest neighbours should be included around each gene to provide better classification rules. Our results show that by just incorporating neighbour information from each gene's two-nearest neighbours, the percentage of correctly classified genes to their correct Gene Ontology Slim term for each ontology reaches over 80% with high accuracy (reflected in F-measures over 0.80 of the classification rules produced. Conclusions We confirmed that in classifying genes to their correct Gene Ontology Slim term, the inclusion of neighbour information from those genes is beneficial. Knowing the location of a gene and the Gene Ontology Slim information from neighbouring genes gives us insight into that gene's functionality. This benefit is seen by just including information from a gene's two-nearest neighbouring genes.

  20. Poster abstract: A machine learning approach for vehicle classification using passive infrared and ultrasonic sensors

    KAUST Repository

    Warriach, Ehsan Ullah

    2013-01-01

    This article describes the implementation of four different machine learning techniques for vehicle classification in a dual ultrasonic/passive infrared traffic flow sensors. Using k-NN, Naive Bayes, SVM and KNN-SVM algorithms, we show that KNN-SVM significantly outperforms other algorithms in terms of classification accuracy. We also show that some of these algorithms could run in real time on the prototype system. Copyright © 2013 ACM.

  1. Exploiting monotonicity constraints for active learning in ordinal classification

    NARCIS (Netherlands)

    Soons, Pieter; Feelders, Adrianus

    2014-01-01

    We consider ordinal classification and instance ranking problems where each attribute is known to have an increasing or decreasing relation with the class label or rank. For example, it stands to reason that the number of query terms occurring in a document has a positive influence on its relevance

  2. Ichthyoplankton Classification Tool using Generative Adversarial Networks and Transfer Learning

    KAUST Repository

    Aljaafari, Nura

    2018-01-01

    . This method is time-consuming and requires a high level of experience. The recent advances in AI have helped to solve and automate several difficult tasks which motivated us to develop a classification tool for ichthyoplankton. We show that using machine

  3. Objects Classification by Learning-Based Visual Saliency Model and Convolutional Neural Network.

    Science.gov (United States)

    Li, Na; Zhao, Xinbo; Yang, Yongjia; Zou, Xiaochun

    2016-01-01

    Humans can easily classify different kinds of objects whereas it is quite difficult for computers. As a hot and difficult problem, objects classification has been receiving extensive interests with broad prospects. Inspired by neuroscience, deep learning concept is proposed. Convolutional neural network (CNN) as one of the methods of deep learning can be used to solve classification problem. But most of deep learning methods, including CNN, all ignore the human visual information processing mechanism when a person is classifying objects. Therefore, in this paper, inspiring the completed processing that humans classify different kinds of objects, we bring forth a new classification method which combines visual attention model and CNN. Firstly, we use the visual attention model to simulate the processing of human visual selection mechanism. Secondly, we use CNN to simulate the processing of how humans select features and extract the local features of those selected areas. Finally, not only does our classification method depend on those local features, but also it adds the human semantic features to classify objects. Our classification method has apparently advantages in biology. Experimental results demonstrated that our method made the efficiency of classification improve significantly.

  4. Advances in Patient Classification for Traditional Chinese Medicine: A Machine Learning Perspective

    Science.gov (United States)

    Zhao, Changbo; Li, Guo-Zheng; Wang, Chengjun; Niu, Jinling

    2015-01-01

    As a complementary and alternative medicine in medical field, traditional Chinese medicine (TCM) has drawn great attention in the domestic field and overseas. In practice, TCM provides a quite distinct methodology to patient diagnosis and treatment compared to western medicine (WM). Syndrome (ZHENG or pattern) is differentiated by a set of symptoms and signs examined from an individual by four main diagnostic methods: inspection, auscultation and olfaction, interrogation, and palpation which reflects the pathological and physiological changes of disease occurrence and development. Patient classification is to divide patients into several classes based on different criteria. In this paper, from the machine learning perspective, a survey on patient classification issue will be summarized on three major aspects of TCM: sign classification, syndrome differentiation, and disease classification. With the consideration of different diagnostic data analyzed by different computational methods, we present the overview for four subfields of TCM diagnosis, respectively. For each subfield, we design a rectangular reference list with applications in the horizontal direction and machine learning algorithms in the longitudinal direction. According to the current development of objective TCM diagnosis for patient classification, a discussion of the research issues around machine learning techniques with applications to TCM diagnosis is given to facilitate the further research for TCM patient classification. PMID:26246834

  5. Advances in Patient Classification for Traditional Chinese Medicine: A Machine Learning Perspective.

    Science.gov (United States)

    Zhao, Changbo; Li, Guo-Zheng; Wang, Chengjun; Niu, Jinling

    2015-01-01

    As a complementary and alternative medicine in medical field, traditional Chinese medicine (TCM) has drawn great attention in the domestic field and overseas. In practice, TCM provides a quite distinct methodology to patient diagnosis and treatment compared to western medicine (WM). Syndrome (ZHENG or pattern) is differentiated by a set of symptoms and signs examined from an individual by four main diagnostic methods: inspection, auscultation and olfaction, interrogation, and palpation which reflects the pathological and physiological changes of disease occurrence and development. Patient classification is to divide patients into several classes based on different criteria. In this paper, from the machine learning perspective, a survey on patient classification issue will be summarized on three major aspects of TCM: sign classification, syndrome differentiation, and disease classification. With the consideration of different diagnostic data analyzed by different computational methods, we present the overview for four subfields of TCM diagnosis, respectively. For each subfield, we design a rectangular reference list with applications in the horizontal direction and machine learning algorithms in the longitudinal direction. According to the current development of objective TCM diagnosis for patient classification, a discussion of the research issues around machine learning techniques with applications to TCM diagnosis is given to facilitate the further research for TCM patient classification.

  6. Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification.

    Science.gov (United States)

    Wen, Zaidao; Hou, Biao; Jiao, Licheng

    2017-05-03

    Linear synthesis model based dictionary learning framework has achieved remarkable performances in image classification in the last decade. Behaved as a generative feature model, it however suffers from some intrinsic deficiencies. In this paper, we propose a novel parametric nonlinear analysis cosparse model (NACM) with which a unique feature vector will be much more efficiently extracted. Additionally, we derive a deep insight to demonstrate that NACM is capable of simultaneously learning the task adapted feature transformation and regularization to encode our preferences, domain prior knowledge and task oriented supervised information into the features. The proposed NACM is devoted to the classification task as a discriminative feature model and yield a novel discriminative nonlinear analysis operator learning framework (DNAOL). The theoretical analysis and experimental performances clearly demonstrate that DNAOL will not only achieve the better or at least competitive classification accuracies than the state-of-the-art algorithms but it can also dramatically reduce the time complexities in both training and testing phases.

  7. Optimized extreme learning machine for urban land cover classification using hyperspectral imagery

    Science.gov (United States)

    Su, Hongjun; Tian, Shufang; Cai, Yue; Sheng, Yehua; Chen, Chen; Najafian, Maryam

    2017-12-01

    This work presents a new urban land cover classification framework using the firefly algorithm (FA) optimized extreme learning machine (ELM). FA is adopted to optimize the regularization coefficient C and Gaussian kernel σ for kernel ELM. Additionally, effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three sets of hyperspectral databases were recorded using different sensors, namely HYDICE, HyMap, and AVIRIS. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and reduces computational cost significantly.

  8. A method for classification of network traffic based on C5.0 Machine Learning Algorithm

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Riaz, M. Tahir; Pedersen, Jens Myrup

    2012-01-01

    current network traffic. To overcome the drawbacks of existing methods for traffic classification, usage of C5.0 Machine Learning Algorithm (MLA) was proposed. On the basis of statistical traffic information received from volunteers and C5.0 algorithm we constructed a boosted classifier, which was shown...... and classification, an algorithm for recognizing flow direction and the C5.0 itself. Classified applications include Skype, FTP, torrent, web browser traffic, web radio, interactive gaming and SSH. We performed subsequent tries using different sets of parameters and both training and classification options...

  9. Multi-Site Diagnostic Classification of Schizophrenia Using Discriminant Deep Learning with Functional Connectivity MRI

    Directory of Open Access Journals (Sweden)

    Ling-Li Zeng

    2018-04-01

    Full Text Available Background: A lack of a sufficiently large sample at single sites causes poor generalizability in automatic diagnosis classification of heterogeneous psychiatric disorders such as schizophrenia based on brain imaging scans. Advanced deep learning methods may be capable of learning subtle hidden patterns from high dimensional imaging data, overcome potential site-related variation, and achieve reproducible cross-site classification. However, deep learning-based cross-site transfer classification, despite less imaging site-specificity and more generalizability of diagnostic models, has not been investigated in schizophrenia. Methods: A large multi-site functional MRI sample (n = 734, including 357 schizophrenic patients from seven imaging resources was collected, and a deep discriminant autoencoder network, aimed at learning imaging site-shared functional connectivity features, was developed to discriminate schizophrenic individuals from healthy controls. Findings: Accuracies of approximately 85·0% and 81·0% were obtained in multi-site pooling classification and leave-site-out transfer classification, respectively. The learned functional connectivity features revealed dysregulation of the cortical-striatal-cerebellar circuit in schizophrenia, and the most discriminating functional connections were primarily located within and across the default, salience, and control networks. Interpretation: The findings imply that dysfunctional integration of the cortical-striatal-cerebellar circuit across the default, salience, and control networks may play an important role in the “disconnectivity” model underlying the pathophysiology of schizophrenia. The proposed discriminant deep learning method may be capable of learning reliable connectome patterns and help in understanding the pathophysiology and achieving accurate prediction of schizophrenia across multiple independent imaging sites. Keywords: Schizophrenia, Deep learning, Connectome, f

  10. Topic categorisation of statements in suicide notes with integrated rules and machine learning.

    Science.gov (United States)

    Kovačević, Aleksandar; Dehghan, Azad; Keane, John A; Nenadic, Goran

    2012-01-01

    We describe and evaluate an automated approach used as part of the i2b2 2011 challenge to identify and categorise statements in suicide notes into one of 15 topics, including Love, Guilt, Thankfulness, Hopelessness and Instructions. The approach combines a set of lexico-syntactic rules with a set of models derived by machine learning from a training dataset. The machine learning models rely on named entities, lexical, lexico-semantic and presentation features, as well as the rules that are applicable to a given statement. On a testing set of 300 suicide notes, the approach showed the overall best micro F-measure of up to 53.36%. The best precision achieved was 67.17% when only rules are used, whereas best recall of 50.57% was with integrated rules and machine learning. While some topics (eg, Sorrow, Anger, Blame) prove challenging, the performance for relatively frequent (eg, Love) and well-scoped categories (eg, Thankfulness) was comparatively higher (precision between 68% and 79%), suggesting that automated text mining approaches can be effective in topic categorisation of suicide notes.

  11. Multi-level discriminative dictionary learning with application to large scale image classification.

    Science.gov (United States)

    Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

    2015-10-01

    The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.

  12. A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem

    Directory of Open Access Journals (Sweden)

    Zekić-Sušac Marijana

    2014-09-01

    Full Text Available Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour on the same dataset in order to compare their efficiency in the sense of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess computing sensitivity and specificity of each model. Results: The artificial neural network model based on multilayer perceptron yielded a higher classification rate than the models produced by other methods. The pairwise t-test showed a statistical significance between the artificial neural network and the k-nearest neighbour model, while the difference among other methods was not statistically significant. Conclusions: Tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.

  13. Do pre-trained deep learning models improve computer-aided classification of digital mammograms?

    Science.gov (United States)

    Aboutalib, Sarah S.; Mohamed, Aly A.; Zuley, Margarita L.; Berg, Wendie A.; Luo, Yahong; Wu, Shandong

    2018-02-01

    Digital mammography screening is an important exam for the early detection of breast cancer and reduction in mortality. False positives leading to high recall rates, however, results in unnecessary negative consequences to patients and health care systems. In order to better aid radiologists, computer-aided tools can be utilized to improve distinction between image classifications and thus potentially reduce false recalls. The emergence of deep learning has shown promising results in the area of biomedical imaging data analysis. This study aimed to investigate deep learning and transfer learning methods that can improve digital mammography classification performance. In particular, we evaluated the effect of pre-training deep learning models with other imaging datasets in order to boost classification performance on a digital mammography dataset. Two types of datasets were used for pre-training: (1) a digitized film mammography dataset, and (2) a very large non-medical imaging dataset. By using either of these datasets to pre-train the network initially, and then fine-tuning with the digital mammography dataset, we found an increase in overall classification performance in comparison to a model without pre-training, with the very large non-medical dataset performing the best in improving the classification accuracy.

  14. Medication Impairs Probabilistic Classification Learning in Parkinson's Disease

    Science.gov (United States)

    Jahanshahi, Marjan; Wilkinson, Leonora; Gahir, Harpreet; Dharminda, Angeline; Lagnado, David A.

    2010-01-01

    In Parkinson's disease (PD), it is possible that tonic increase of dopamine associated with levodopa medication overshadows phasic release of dopamine, which is essential for learning. Thus while the motor symptoms of PD are improved with levodopa medication, learning would be disrupted. To test this hypothesis, we investigated the effect of…

  15. Classification of hydration status using electrocardiogram and machine learning

    Science.gov (United States)

    Kaveh, Anthony; Chung, Wayne

    2013-10-01

    The electrocardiogram (ECG) has been used extensively in clinical practice for decades to non-invasively characterize the health of heart tissue; however, these techniques are limited to time domain features. We propose a machine classification system using support vector machines (SVM) that uses temporal and spectral information to classify health state beyond cardiac arrhythmias. Our method uses single lead ECG to classify volume depletion (or dehydration) without the lengthy and costly blood analysis tests traditionally used for detecting dehydration status. Our method builds on established clinical ECG criteria for identifying electrolyte imbalances and lends to automated, computationally efficient implementation. The method was tested on the MIT-BIH PhysioNet database to validate this purely computational method for expedient disease-state classification. The results show high sensitivity, supporting use as a cost- and time-effective screening tool.

  16. Multi-Site Diagnostic Classification of Schizophrenia Using Discriminant Deep Learning with Functional Connectivity MRI.

    Science.gov (United States)

    Zeng, Ling-Li; Wang, Huaning; Hu, Panpan; Yang, Bo; Pu, Weidan; Shen, Hui; Chen, Xingui; Liu, Zhening; Yin, Hong; Tan, Qingrong; Wang, Kai; Hu, Dewen

    2018-04-01

    A lack of a sufficiently large sample at single sites causes poor generalizability in automatic diagnosis classification of heterogeneous psychiatric disorders such as schizophrenia based on brain imaging scans. Advanced deep learning methods may be capable of learning subtle hidden patterns from high dimensional imaging data, overcome potential site-related variation, and achieve reproducible cross-site classification. However, deep learning-based cross-site transfer classification, despite less imaging site-specificity and more generalizability of diagnostic models, has not been investigated in schizophrenia. A large multi-site functional MRI sample (n = 734, including 357 schizophrenic patients from seven imaging resources) was collected, and a deep discriminant autoencoder network, aimed at learning imaging site-shared functional connectivity features, was developed to discriminate schizophrenic individuals from healthy controls. Accuracies of approximately 85·0% and 81·0% were obtained in multi-site pooling classification and leave-site-out transfer classification, respectively. The learned functional connectivity features revealed dysregulation of the cortical-striatal-cerebellar circuit in schizophrenia, and the most discriminating functional connections were primarily located within and across the default, salience, and control networks. The findings imply that dysfunctional integration of the cortical-striatal-cerebellar circuit across the default, salience, and control networks may play an important role in the "disconnectivity" model underlying the pathophysiology of schizophrenia. The proposed discriminant deep learning method may be capable of learning reliable connectome patterns and help in understanding the pathophysiology and achieving accurate prediction of schizophrenia across multiple independent imaging sites. Copyright © 2018 German Center for Neurodegenerative Diseases (DZNE). Published by Elsevier B.V. All rights reserved.

  17. ANALYTiC: An Active Learning System for Trajectory Classification.

    Science.gov (United States)

    Soares Junior, Amilcar; Renso, Chiara; Matwin, Stan

    2017-01-01

    The increasing availability and use of positioning devices has resulted in large volumes of trajectory data. However, semantic annotations for such data are typically added by domain experts, which is a time-consuming task. Machine-learning algorithms can help infer semantic annotations from trajectory data by learning from sets of labeled data. Specifically, active learning approaches can minimize the set of trajectories to be annotated while preserving good performance measures. The ANALYTiC web-based interactive tool visually guides users through this annotation process.

  18. Visualizing histopathologic deep learning classification and anomaly detection using nonlinear feature space dimensionality reduction.

    Science.gov (United States)

    Faust, Kevin; Xie, Quin; Han, Dominick; Goyle, Kartikay; Volynskaya, Zoya; Djuric, Ugljesa; Diamandis, Phedias

    2018-05-16

    There is growing interest in utilizing artificial intelligence, and particularly deep learning, for computer vision in histopathology. While accumulating studies highlight expert-level performance of convolutional neural networks (CNNs) on focused classification tasks, most studies rely on probability distribution scores with empirically defined cutoff values based on post-hoc analysis. More generalizable tools that allow humans to visualize histology-based deep learning inferences and decision making are scarce. Here, we leverage t-distributed Stochastic Neighbor Embedding (t-SNE) to reduce dimensionality and depict how CNNs organize histomorphologic information. Unique to our workflow, we develop a quantitative and transparent approach to visualizing classification decisions prior to softmax compression. By discretizing the relationships between classes on the t-SNE plot, we show we can super-impose randomly sampled regions of test images and use their distribution to render statistically-driven classifications. Therefore, in addition to providing intuitive outputs for human review, this visual approach can carry out automated and objective multi-class classifications similar to more traditional and less-transparent categorical probability distribution scores. Importantly, this novel classification approach is driven by a priori statistically defined cutoffs. It therefore serves as a generalizable classification and anomaly detection tool less reliant on post-hoc tuning. Routine incorporation of this convenient approach for quantitative visualization and error reduction in histopathology aims to accelerate early adoption of CNNs into generalized real-world applications where unanticipated and previously untrained classes are often encountered.

  19. An Active Learning Framework for Hyperspectral Image Classification Using Hierarchical Segmentation

    Science.gov (United States)

    Zhang, Zhou; Pasolli, Edoardo; Crawford, Melba M.; Tilton, James C.

    2015-01-01

    Augmenting spectral data with spatial information for image classification has recently gained significant attention, as classification accuracy can often be improved by extracting spatial information from neighboring pixels. In this paper, we propose a new framework in which active learning (AL) and hierarchical segmentation (HSeg) are combined for spectral-spatial classification of hyperspectral images. The spatial information is extracted from a best segmentation obtained by pruning the HSeg tree using a new supervised strategy. The best segmentation is updated at each iteration of the AL process, thus taking advantage of informative labeled samples provided by the user. The proposed strategy incorporates spatial information in two ways: 1) concatenating the extracted spatial features and the original spectral features into a stacked vector and 2) extending the training set using a self-learning-based semi-supervised learning (SSL) approach. Finally, the two strategies are combined within an AL framework. The proposed framework is validated with two benchmark hyperspectral datasets. Higher classification accuracies are obtained by the proposed framework with respect to five other state-of-the-art spectral-spatial classification approaches. Moreover, the effectiveness of the proposed pruning strategy is also demonstrated relative to the approaches based on a fixed segmentation.

  20. Experimental analysis of the performance of machine learning algorithms in the classification of navigation accident records

    Directory of Open Access Journals (Sweden)

    REIS, M V. S. de A.

    2017-06-01

    Full Text Available This paper aims to evaluate the use of machine learning techniques in a database of marine accidents. We analyzed and evaluated the main causes and types of marine accidents in the Northern Fluminense region. For this, machine learning techniques were used. The study showed that the modeling can be done in a satisfactory manner using different configurations of classification algorithms, varying the activation functions and training parameters. The SMO (Sequential Minimal Optimization algorithm showed the best performance result.

  1. Compensatory Processing During Rule-Based Category Learning in Older Adults

    Science.gov (United States)

    Bharani, Krishna L.; Paller, Ken A.; Reber, Paul J.; Weintraub, Sandra; Yanar, Jorge; Morrison, Robert G.

    2016-01-01

    Healthy older adults typically perform worse than younger adults at rule-based category learning, but better than patients with Alzheimer's or Parkinson's disease. To further investigate aging's effect on rule-based category learning, we monitored event-related potentials (ERPs) while younger and neuropsychologically typical older adults performed a visual category-learning task with a rule-based category structure and trial-by-trial feedback. Using these procedures, we previously identified ERPs sensitive to categorization strategy and accuracy in young participants. In addition, previous studies have demonstrated the importance of neural processing in the prefrontal cortex and the medial temporal lobe for this task. In this study, older adults showed lower accuracy and longer response times than younger adults, but there were two distinct subgroups of older adults. One subgroup showed near-chance performance throughout the procedure, never categorizing accurately. The other subgroup reached asymptotic accuracy that was equivalent to that in younger adults, although they categorized more slowly. These two subgroups were further distinguished via ERPs. Consistent with the compensation theory of cognitive aging, older adults who successfully learned showed larger frontal ERPs when compared with younger adults. Recruitment of prefrontal resources may have improved performance while slowing response times. Additionally, correlations of feedback-locked P300 amplitudes with category-learning accuracy differentiated successful younger and older adults. Overall, the results suggest that the ability to adapt one's behavior in response to feedback during learning varies across older individuals, and that the failure of some to adapt their behavior may reflect inadequate engagement of prefrontal cortex. PMID:26422522

  2. Learning to recognise : A study on one-class classification and active learning

    NARCIS (Netherlands)

    Juszczak, P.

    2006-01-01

    The thesis treats classification problems which are undersampled or where there exist an unbalance between classes in the sampling. The thesis is divided into three parts. The first two parts treat the problem of one-class classification. In the one-class classification problem, it is assumed that

  3. Active learning for clinical text classification: is it better than random sampling?

    Science.gov (United States)

    Figueroa, Rosa L; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P

    2012-01-01

    Objective This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Design Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Measurements Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. Results The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. Conclusion For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty. PMID:22707743

  4. Alexnet Feature Extraction and Multi-Kernel Learning for Objectoriented Classification

    Science.gov (United States)

    Ding, L.; Li, H.; Hu, C.; Zhang, W.; Wang, S.

    2018-04-01

    In view of the fact that the deep convolutional neural network has stronger ability of feature learning and feature expression, an exploratory research is done on feature extraction and classification for high resolution remote sensing images. Taking the Google image with 0.3 meter spatial resolution in Ludian area of Yunnan Province as an example, the image segmentation object was taken as the basic unit, and the pre-trained AlexNet deep convolution neural network model was used for feature extraction. And the spectral features, AlexNet features and GLCM texture features are combined with multi-kernel learning and SVM classifier, finally the classification results were compared and analyzed. The results show that the deep convolution neural network can extract more accurate remote sensing image features, and significantly improve the overall accuracy of classification, and provide a reference value for earthquake disaster investigation and remote sensing disaster evaluation.

  5. Classification of EEG signals using a genetic-based machine learning classifier.

    Science.gov (United States)

    Skinner, B T; Nguyen, H T; Liu, D K

    2007-01-01

    This paper investigates the efficacy of the genetic-based learning classifier system XCS, for the classification of noisy, artefact-inclusive human electroencephalogram (EEG) signals represented using large condition strings (108bits). EEG signals from three participants were recorded while they performed four mental tasks designed to elicit hemispheric responses. Autoregressive (AR) models and Fast Fourier Transform (FFT) methods were used to form feature vectors with which mental tasks can be discriminated. XCS achieved a maximum classification accuracy of 99.3% and a best average of 88.9%. The relative classification performance of XCS was then compared against four non-evolutionary classifier systems originating from different learning techniques. The experimental results will be used as part of our larger research effort investigating the feasibility of using EEG signals as an interface to allow paralysed persons to control a powered wheelchair or other devices.

  6. ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECTORIENTED CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    L. Ding

    2018-04-01

    Full Text Available In view of the fact that the deep convolutional neural network has stronger ability of feature learning and feature expression, an exploratory research is done on feature extraction and classification for high resolution remote sensing images. Taking the Google image with 0.3 meter spatial resolution in Ludian area of Yunnan Province as an example, the image segmentation object was taken as the basic unit, and the pre-trained AlexNet deep convolution neural network model was used for feature extraction. And the spectral features, AlexNet features and GLCM texture features are combined with multi-kernel learning and SVM classifier, finally the classification results were compared and analyzed. The results show that the deep convolution neural network can extract more accurate remote sensing image features, and significantly improve the overall accuracy of classification, and provide a reference value for earthquake disaster investigation and remote sensing disaster evaluation.

  7. Learning object-to-class kernels for scene classification.

    Science.gov (United States)

    Zhang, Lei; Zhen, Xiantong; Shao, Ling

    2014-08-01

    High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meanings. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernalize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene-15, MIT Indoor, and Caltech-101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve the state-of-the-art performance.

  8. Naïve and Robust: Class-Conditional Independence in Human Classification Learning

    Science.gov (United States)

    Jarecki, Jana B.; Meder, Björn; Nelson, Jonathan D.

    2018-01-01

    Humans excel in categorization. Yet from a computational standpoint, learning a novel probabilistic classification task involves severe computational challenges. The present paper investigates one way to address these challenges: assuming class-conditional independence of features. This feature independence assumption simplifies the inference…

  9. A novel deep learning approach for classification of EEG motor imagery signals.

    Science.gov (United States)

    Tabar, Yousef Rezaei; Halici, Ugur

    2017-02-01

    Signal classification is an important issue in brain computer interface (BCI) systems. Deep learning approaches have been used successfully in many recent studies to learn features and classify different types of data. However, the number of studies that employ these approaches on BCI applications is very limited. In this study we aim to use deep learning methods to improve classification performance of EEG motor imagery signals. In this study we investigate convolutional neural networks (CNN) and stacked autoencoders (SAE) to classify EEG Motor Imagery signals. A new form of input is introduced to combine time, frequency and location information extracted from EEG signal and it is used in CNN having one 1D convolutional and one max-pooling layers. We also proposed a new deep network by combining CNN and SAE. In this network, the features that are extracted in CNN are classified through the deep network SAE. The classification performance obtained by the proposed method on BCI competition IV dataset 2b in terms of kappa value is 0.547. Our approach yields 9% improvement over the winner algorithm of the competition. Our results show that deep learning methods provide better classification performance compared to other state of art approaches. These methods can be applied successfully to BCI systems where the amount of data is large due to daily recording.

  10. Deep learning classification in asteroseismology using an improved neural network

    DEFF Research Database (Denmark)

    Hon, Marc; Stello, Dennis; Yu, Jie

    2018-01-01

    Deep learning in the form of 1D convolutional neural networks have previously been shown to be capable of efficiently classifying the evolutionary state of oscillating red giants into red giant branch stars and helium-core burning stars by recognizing visual features in their asteroseismic...... frequency spectra. We elaborate further on the deep learning method by developing an improved convolutional neural network classifier. To make our method useful for current and future space missions such as K2, TESS, and PLATO, we train classifiers that are able to classify the evolutionary states of lower...

  11. Using Machine Learning Methods Jointly to Find Better Set of Rules in Data Mining

    Directory of Open Access Journals (Sweden)

    SUG Hyontai

    2017-01-01

    Full Text Available Rough set-based data mining algorithms are one of widely accepted machine learning technologies because of their strong mathematical background and capability of finding optimal rules based on given data sets only without room for prejudiced views to be inserted on the data. But, because the algorithms find rules very precisely, we may confront with the overfitting problem. On the other hand, association rule algorithms find rules of association, where the association resides between sets of items in database. The algorithms find itemsets that occur more than given minimum support, so that they can find the itemsets practically in reasonable time even for very large databases by supplying the minimum support appropriately. In order to overcome the problem of the overfitting problem in rough set-based algorithms, first we find large itemsets, after that we select attributes that cover the large itemsets. By using the selected attributes only, we may find better set of rules based on rough set theory. Results from experiments support our suggested method.

  12. Machine learning algorithms for mode-of-action classification in toxicity assessment.

    Science.gov (United States)

    Zhang, Yile; Wong, Yau Shu; Deng, Jian; Anton, Cristina; Gabos, Stephan; Zhang, Weiping; Huang, Dorothy Yu; Jin, Can

    2016-01-01

    Real Time Cell Analysis (RTCA) technology is used to monitor cellular changes continuously over the entire exposure period. Combining with different testing concentrations, the profiles have potential in probing the mode of action (MOA) of the testing substances. In this paper, we present machine learning approaches for MOA assessment. Computational tools based on artificial neural network (ANN) and support vector machine (SVM) are developed to analyze the time-concentration response curves (TCRCs) of human cell lines responding to tested chemicals. The techniques are capable of learning data from given TCRCs with known MOA information and then making MOA classification for the unknown toxicity. A novel data processing step based on wavelet transform is introduced to extract important features from the original TCRC data. From the dose response curves, time interval leading to higher classification success rate can be selected as input to enhance the performance of the machine learning algorithm. This is particularly helpful when handling cases with limited and imbalanced data. The validation of the proposed method is demonstrated by the supervised learning algorithm applied to the exposure data of HepG2 cell line to 63 chemicals with 11 concentrations in each test case. Classification success rate in the range of 85 to 95 % are obtained using SVM for MOA classification with two clusters to cases up to four clusters. Wavelet transform is capable of capturing important features of TCRCs for MOA classification. The proposed SVM scheme incorporated with wavelet transform has a great potential for large scale MOA classification and high-through output chemical screening.

  13. Accurate classification of brain gliomas by discriminate dictionary learning based on projective dictionary pair learning of proton magnetic resonance spectra.

    Science.gov (United States)

    Adebileje, Sikiru Afolabi; Ghasemi, Keyvan; Aiyelabegan, Hammed Tanimowo; Saligheh Rad, Hamidreza

    2017-04-01

    Proton magnetic resonance spectroscopy is a powerful noninvasive technique that complements the structural images of cMRI, which aids biomedical and clinical researches, by identifying and visualizing the compositions of various metabolites within the tissues of interest. However, accurate classification of proton magnetic resonance spectroscopy is still a challenging issue in clinics due to low signal-to-noise ratio, overlapping peaks of metabolites, and the presence of background macromolecules. This paper evaluates the performance of a discriminate dictionary learning classifiers based on projective dictionary pair learning method for brain gliomas proton magnetic resonance spectroscopy spectra classification task, and the result were compared with the sub-dictionary learning methods. The proton magnetic resonance spectroscopy data contain a total of 150 spectra (74 healthy, 23 grade II, 23 grade III, and 30 grade IV) from two databases. The datasets from both databases were first coupled together, followed by column normalization. The Kennard-Stone algorithm was used to split the datasets into its training and test sets. Performance comparison based on the overall accuracy, sensitivity, specificity, and precision was conducted. Based on the overall accuracy of our classification scheme, the dictionary pair learning method was found to outperform the sub-dictionary learning methods 97.78% compared with 68.89%, respectively. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  14. Real-Time Barcode Detection and Classification Using Deep Learning

    DEFF Research Database (Denmark)

    Hansen, Daniel Kold; Nasrollahi, Kamal; Rasmussen, Christoffer Bøgelund

    2017-01-01

    Barcodes, in their different forms, can be found on almost any packages available in the market. Detecting and then decoding of barcodes have therefore great applications. We describe how to adapt the state-of-the- art deep learning-based detector of You Only Look Once (YOLO) for the purpose...

  15. II - Multivariate Classification and Machine Learning in HEP

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    A summary of the history of deep-learning is given and the difference to traditional artificial neural networks is discussed. Advanced methods like convoluted neural networks, recurrent neural networks and unsupervised training are introduced. Interesting examples from this emerging field outside HEP are presented. Possible applications in HEP are discussed.

  16. Classification of carcinogenic and mutagenic properties using machine learning method

    DEFF Research Database (Denmark)

    Moorthy, N. S.Hari Narayana; Kumar, Surendra; Poongavanam, Vasanthanathan

    2017-01-01

    An accurate calculation of carcinogenicity of chemicals became a serious challenge for the health assessment authority around the globe because of not only increased cost for experiments but also various ethical issues exist using animal models. In this study, we provide machine learning...

  17. Networking and distance learning for teachers: A classification of possibilities

    NARCIS (Netherlands)

    Collis, Betty

    1995-01-01

    Computer based communication technologies, or what could be more conveniently called networking, are bringing new possibilities into teacher education in many different ways. As with distance education more generally they can facilitate flexibility in time and place of learning, but the range of

  18. Incremental concept learning with few training examples and hierarchical classification

    NARCIS (Netherlands)

    Bouma, H.; Eendebak, P.T.; Schutte, K.; Azzopardi, G.; Burghouts, G.J.

    2015-01-01

    Object recognition and localization are important to automatically interpret video and allow better querying on its content. We propose a method for object localization that learns incrementally and addresses four key aspects. Firstly, we show that for certain applications, recognition is feasible

  19. Machine learning versus knowledge based classification of legal texts

    NARCIS (Netherlands)

    de Maat, E.; Krabben, K.; Winkels, R.; Winkels, R.G.F.

    2010-01-01

    This paper presents results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to the results of a pattern-based classifier. Overall, the ML classifier performs as accurate (>90%) as the pattern based one, but

  20. Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy.

    Science.gov (United States)

    Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh

    2015-04-01

    With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences. © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.

  1. Computational modeling of spiking neural network with learning rules from STDP and intrinsic plasticity

    Science.gov (United States)

    Li, Xiumin; Wang, Wei; Xue, Fangzheng; Song, Yongduan

    2018-02-01

    Recently there has been continuously increasing interest in building up computational models of spiking neural networks (SNN), such as the Liquid State Machine (LSM). The biologically inspired self-organized neural networks with neural plasticity can enhance the capability of computational performance, with the characteristic features of dynamical memory and recurrent connection cycles which distinguish them from the more widely used feedforward neural networks. Despite a variety of computational models for brain-like learning and information processing have been proposed, the modeling of self-organized neural networks with multi-neural plasticity is still an important open challenge. The main difficulties lie in the interplay among different forms of neural plasticity rules and understanding how structures and dynamics of neural networks shape the computational performance. In this paper, we propose a novel approach to develop the models of LSM with a biologically inspired self-organizing network based on two neural plasticity learning rules. The connectivity among excitatory neurons is adapted by spike-timing-dependent plasticity (STDP) learning; meanwhile, the degrees of neuronal excitability are regulated to maintain a moderate average activity level by another learning rule: intrinsic plasticity (IP). Our study shows that LSM with STDP+IP performs better than LSM with a random SNN or SNN obtained by STDP alone. The noticeable improvement with the proposed method is due to the better reflected competition among different neurons in the developed SNN model, as well as the more effectively encoded and processed relevant dynamic information with its learning and self-organizing mechanism. This result gives insights to the optimization of computational models of spiking neural networks with neural plasticity.

  2. Study on a pattern classification method of soil quality based on simplified learning sample dataset

    Science.gov (United States)

    Zhang, Jiahua; Liu, S.; Hu, Y.; Tian, Y.

    2011-01-01

    Based on the massive soil information in current soil quality grade evaluation, this paper constructed an intelligent classification approach of soil quality grade depending on classical sampling techniques and disordered multiclassification Logistic regression model. As a case study to determine the learning sample capacity under certain confidence level and estimation accuracy, and use c-means algorithm to automatically extract the simplified learning sample dataset from the cultivated soil quality grade evaluation database for the study area, Long chuan county in Guangdong province, a disordered Logistic classifier model was then built and the calculation analysis steps of soil quality grade intelligent classification were given. The result indicated that the soil quality grade can be effectively learned and predicted by the extracted simplified dataset through this method, which changed the traditional method for soil quality grade evaluation. ?? 2011 IEEE.

  3. Lessons learned from the Maintenance Rule implementation at Northeast Utilities operating plants

    International Nuclear Information System (INIS)

    Hastings, K.B.; Khalil, Y.F.; Johnson, W.

    1996-01-01

    The Maintenance Rule as described in 10CFR50.65 requires holders of all operating nuclear power plants to monitor the performance of structures, systems, and components (SSCs) against licensee-established performance criteria. The Industry with the assistance of the Nuclear Energy Institute (NEI) developed a guideline, which includes all parts of the Maintenance Rule, to establish these performance criteria while incorporating safety and reliability of the operating plants. The NUMARC 93-01 Guideline introduced the term ''Risk Significant'' to categorize subsets of the SSCs which would require increased focus, from a Maintenance Rule perspective, in setting their performance criteria. Northeast Utilities Company (NU) operates five nuclear plants three at Millstone Station in Waterford, Connecticut; the Connecticut Yankee plant in Haddam Neck, Connecticut; and the Seabrook Station in Seabrook, New Hampshire. NU started the implementation process of the Maintenance Rule program at its five operating plants since early 1994, and have identified a population of risk significant SSCs at each plant. Recently, Northeast Utilities' Maintenance Rule Team re-examined the initial risk significant determinations to further refine these populations, and to establish consistencies among its operating units. As a result of the re-examination process, a number of inconsistencies and areas for improvement have been identified. The lessons learned provide valuable insights to consider in the future as one implements more risk based initiatives such as Graded QA and Risk-Based ISI and IST. This paper discusses the risk significance criteria, how Northeast Utilities utilized NUMARC 93-01 Guideline to determine the risk significant SSCs for its operating plants, and lessons learned. The results provided here do not include the Seabrook Station

  4. Deep Phenotyping: Deep Learning For Temporal Phenotype/Genotype Classification

    OpenAIRE

    Najafi, Mohammad; Namin, Sarah; Esmaeilzadeh, Mohammad; Brown, Tim; Borevitz, Justin

    2017-01-01

    High resolution and high throughput, genotype to phenotype studies in plants are underway to accelerate breeding of climate ready crops. Complex developmental phenotypes are observed by imaging a variety of accessions in different environment conditions, however extracting the genetically heritable traits is challenging. In the recent years, deep learning techniques and in particular Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs) and Long-Short Term Memories (LSTMs), h...

  5. A Three-Threshold Learning Rule Approaches the Maximal Capacity of Recurrent Neural Networks.

    Directory of Open Access Journals (Sweden)

    Alireza Alemi

    2015-08-01

    Full Text Available Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model simplicity and the locality of the synaptic update rules come at the cost of a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns to be memorized are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the

  6. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim at using a new texton based texture classification method in the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Different from conventional computer-aided diagnosis (CAD) pulmonary emphysema classification methods, in this paper, firstly, the dictionary of texton is learned via applying sparse representation(SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture presentations. Finally, classification is performed by using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is comparably higher than state of the art method based on the basic rotation invariant local binary pattern histograms and the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.

  7. Classification of amyloid status using machine learning with histograms of oriented 3D gradients

    Directory of Open Access Journals (Sweden)

    Liam Cattell

    2016-01-01

    Full Text Available Brain amyloid burden may be quantitatively assessed from positron emission tomography imaging using standardised uptake value ratios. Using these ratios as an adjunct to visual image assessment has been shown to improve inter-reader reliability, however, the amyloid positivity threshold is dependent on the tracer and specific image regions used to calculate the uptake ratio. To address this problem, we propose a machine learning approach to amyloid status classification, which is independent of tracer and does not require a specific set of regions of interest. Our method extracts feature vectors from amyloid images, which are based on histograms of oriented three-dimensional gradients. We optimised our method on 133 18F-florbetapir brain volumes, and applied it to a separate test set of 131 volumes. Using the same parameter settings, we then applied our method to 209 11C-PiB images and 128 18F-florbetaben images. We compared our method to classification results achieved using two other methods: standardised uptake value ratios and a machine learning method based on voxel intensities. Our method resulted in the largest mean distances between the subjects and the classification boundary, suggesting that it is less likely to make low-confidence classification decisions. Moreover, our method obtained the highest classification accuracy for all three tracers, and consistently achieved above 96% accuracy.

  8. A Proposed Functional Abilities Classification Tool for Developmental Disorders Affecting Learning and Behaviour

    Directory of Open Access Journals (Sweden)

    Benjamin Klein

    2018-02-01

    Full Text Available Children with developmental disorders affecting learning and behaviour (DDALB (e.g., attention, social communication, language, and learning disabilities, etc. require individualized support across multiple environments to promote participation, quality of life, and developmental outcomes. Support to enhance participation is based largely on individual profiles of functioning (e.g., communication, cognitive, social skills, executive functioning, etc., which are highly heterogeneous within medical diagnoses. Currently educators, clinicians, and parents encounter widespread difficulties in meeting children’s needs as there is lack of universal classification of functioning and disability for use in school environments. Objective: a practical tool for functional classification broadly applicable for children with DDALB could facilitate the collaboration, identification of points of entry of support, individual program planning, and reassessment in a transparent, equitable process based on functional need and context. We propose such a tool, the Functional Abilities Classification Tool (FACT based on the concepts of the ICF (International Classification of Functioning, Disability and Health. FACT is intended to provide ability and participation classification that is complementary to medical diagnosis. For children presenting with difficulties, the proposed tool initially classifies participation over several environments. Then, functional abilities are classified and personal factors and environment are described. Points of entry for support are identified given an analysis of functional ability profile, personal factors, environmental features, and pattern of participation. Conclusion: case examples, use of the tool and implications for children, agencies, and the system are described.

  9. Brake fault diagnosis using Clonal Selection Classification Algorithm (CSCA – A statistical learning approach

    Directory of Open Access Journals (Sweden)

    R. Jegadeeshwaran

    2015-03-01

    Full Text Available In automobile, brake system is an essential part responsible for control of the vehicle. Any failure in the brake system impacts the vehicle's motion. It will generate frequent catastrophic effects on the vehicle cum passenger's safety. Thus the brake system plays a vital role in an automobile and hence condition monitoring of the brake system is essential. Vibration based condition monitoring using machine learning techniques are gaining momentum. This study is one such attempt to perform the condition monitoring of a hydraulic brake system through vibration analysis. In this research, the performance of a Clonal Selection Classification Algorithm (CSCA for brake fault diagnosis has been reported. A hydraulic brake system test rig was fabricated. Under good and faulty conditions of a brake system, the vibration signals were acquired using a piezoelectric transducer. The statistical parameters were extracted from the vibration signal. The best feature set was identified for classification using attribute evaluator. The selected features were then classified using CSCA. The classification accuracy of such artificial intelligence technique has been compared with other machine learning approaches and discussed. The Clonal Selection Classification Algorithm performs better and gives the maximum classification accuracy (96% for the fault diagnosis of a hydraulic brake system.

  10. Mixing Languages during Learning? Testing the One Subject-One Language Rule.

    Directory of Open Access Journals (Sweden)

    Eneko Antón

    Full Text Available In bilingual communities, mixing languages is avoided in formal schooling: even if two languages are used on a daily basis for teaching, only one language is used to teach each given academic subject. This tenet known as the one subject-one language rule avoids mixing languages in formal schooling because it may hinder learning. The aim of this study was to test the scientific ground of this assumption by investigating the consequences of acquiring new concepts using a method in which two languages are mixed as compared to a purely monolingual method. Native balanced bilingual speakers of Basque and Spanish-adults (Experiment 1 and children (Experiment 2-learnt new concepts by associating two different features to novel objects. Half of the participants completed the learning process in a multilingual context (one feature was described in Basque and the other one in Spanish; while the other half completed the learning phase in a purely monolingual context (both features were described in Spanish. Different measures of learning were taken, as well as direct and indirect indicators of concept consolidation. We found no evidence in favor of the non-mixing method when comparing the results of two groups in either experiment, and thus failed to give scientific support for the educational premise of the one subject-one language rule.

  11. Mixing Languages during Learning? Testing the One Subject—One Language Rule

    Science.gov (United States)

    2015-01-01

    In bilingual communities, mixing languages is avoided in formal schooling: even if two languages are used on a daily basis for teaching, only one language is used to teach each given academic subject. This tenet known as the one subject-one language rule avoids mixing languages in formal schooling because it may hinder learning. The aim of this study was to test the scientific ground of this assumption by investigating the consequences of acquiring new concepts using a method in which two languages are mixed as compared to a purely monolingual method. Native balanced bilingual speakers of Basque and Spanish—adults (Experiment 1) and children (Experiment 2)—learnt new concepts by associating two different features to novel objects. Half of the participants completed the learning process in a multilingual context (one feature was described in Basque and the other one in Spanish); while the other half completed the learning phase in a purely monolingual context (both features were described in Spanish). Different measures of learning were taken, as well as direct and indirect indicators of concept consolidation. We found no evidence in favor of the non-mixing method when comparing the results of two groups in either experiment, and thus failed to give scientific support for the educational premise of the one subject—one language rule. PMID:26107624

  12. Criterial noise effects on rule-based category learning: the impact of delayed feedback.

    Science.gov (United States)

    Ell, Shawn W; Ing, A David; Maddox, W Todd

    2009-08-01

    Variability in the representation of the decision criterion is assumed in many category-learning models, yet few studies have directly examined its impact. On each trial, criterial noise should result in drift in the criterion and will negatively impact categorization accuracy, particularly in rule-based categorization tasks, where learning depends on the maintenance and manipulation of decision criteria. In three experiments, we tested this hypothesis and examined the impact of working memory on slowing the drift rate. In Experiment 1, we examined the effect of drift by inserting a 5-sec delay between the categorization response and the delivery of corrective feedback, and working memory demand was manipulated by varying the number of decision criteria to be learned. Delayed feedback adversely affected performance, but only when working memory demand was high. In Experiment 2, we built on a classic finding in the absolute identification literature and demonstrated that distributing the criteria across multiple dimensions decreases the impact of drift during the delay. In Experiment 3, we confirmed that the effect of drift during the delay is moderated by working memory. These results provide important insights into the interplay between criterial noise and working memory, as well as providing important constraints for models of rule-based category learning.

  13. Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification

    OpenAIRE

    Zhang, Chenrui; Peng, Yuxin

    2018-01-01

    Video representation learning is a vital problem for classification task. Recently, a promising unsupervised paradigm termed self-supervised learning has emerged, which explores inherent supervisory signals implied in massive data for feature learning via solving auxiliary tasks. However, existing methods in this regard suffer from two limitations when extended to video classification. First, they focus only on a single task, whereas ignoring complementarity among different task-specific feat...

  14. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    Energy Technology Data Exchange (ETDEWEB)

    Möller, A. [Research School of Astronomy and Astrophysics, Australian National University, Canberra, ACT 2611 (Australia); Ruhlmann-Kleider, V.; Leloup, C.; Neveu, J.; Palanque-Delabrouille, N.; Rich, J. [Irfu, SPP, CEA Saclay, F-91191 Gif sur Yvette Cedex (France); Carlberg, R. [Department of Astronomy and Astrophysics, University of Toronto, 50 St. George Street, Toronto, ON M5S 3H8 (Canada); Lidman, C. [Australian Astronomical Observatory, North Ryde, NSW 2113 (Australia); Pritchet, C., E-mail: anais.moller@anu.edu.au, E-mail: vanina.ruhlmann-kleider@cea.fr, E-mail: clement.leloup@cea.fr, E-mail: jneveu@lal.in2p3.fr, E-mail: nathalie.palanque-delabrouille@cea.fr, E-mail: james.rich@cea.fr, E-mail: raymond.carlberg@utoronto.ca, E-mail: chris.lidman@aao.gov.au, E-mail: pritchet@uvic.ca [Department of Physics and Astronomy, University of Victoria, P.O. Box 3055, Victoria, BC V8W 3P6 (Canada)

    2016-12-01

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high-redshifts (0.2 < z < 1.1). Our method consists of two stages: feature extraction (obtaining the SN redshift from photometry and estimating light-curve shape parameters) and machine learning classification. We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia samples. Using the Area Under the Curve (AUC) metric, where perfect classification is given by 1, we find that our best-performing classifier (Extreme Gradient Boosting Decision Tree) has an AUC of 0.98.We show that it is possible to obtain a large photometrically selected type Ia SN sample with an estimated contamination of less than 5%. When applied to data from the first three years of SNLS, we obtain 529 events. We investigate the differences between classifying simulated SNe, and real SN survey data. In particular, we find that applying a thorough set of selection cuts to the SN sample is essential for good classification. This work demonstrates for the first time the feasibility of machine learning classification in a high- z SN survey with application to real SN data.

  15. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    International Nuclear Information System (INIS)

    Möller, A.; Ruhlmann-Kleider, V.; Leloup, C.; Neveu, J.; Palanque-Delabrouille, N.; Rich, J.; Carlberg, R.; Lidman, C.; Pritchet, C.

    2016-01-01

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high-redshifts (0.2 < z < 1.1). Our method consists of two stages: feature extraction (obtaining the SN redshift from photometry and estimating light-curve shape parameters) and machine learning classification. We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia samples. Using the Area Under the Curve (AUC) metric, where perfect classification is given by 1, we find that our best-performing classifier (Extreme Gradient Boosting Decision Tree) has an AUC of 0.98.We show that it is possible to obtain a large photometrically selected type Ia SN sample with an estimated contamination of less than 5%. When applied to data from the first three years of SNLS, we obtain 529 events. We investigate the differences between classifying simulated SNe, and real SN survey data. In particular, we find that applying a thorough set of selection cuts to the SN sample is essential for good classification. This work demonstrates for the first time the feasibility of machine learning classification in a high- z SN survey with application to real SN data.

  16. Generative Adversarial Networks-Based Semi-Supervised Learning for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Zhi He

    2017-10-01

    Full Text Available Classification of hyperspectral image (HSI is an important research topic in the remote sensing community. Significant efforts (e.g., deep learning have been concentrated on this task. However, it is still an open issue to classify the high-dimensional HSI with a limited number of training samples. In this paper, we propose a semi-supervised HSI classification method inspired by the generative adversarial networks (GANs. Unlike the supervised methods, the proposed HSI classification method is semi-supervised, which can make full use of the limited labeled samples as well as the sufficient unlabeled samples. Core ideas of the proposed method are twofold. First, the three-dimensional bilateral filter (3DBF is adopted to extract the spectral-spatial features by naturally treating the HSI as a volumetric dataset. The spatial information is integrated into the extracted features by 3DBF, which is propitious to the subsequent classification step. Second, GANs are trained on the spectral-spatial features for semi-supervised learning. A GAN contains two neural networks (i.e., generator and discriminator trained in opposition to one another. The semi-supervised learning is achieved by adding samples from the generator to the features and increasing the dimension of the classifier output. Experimental results obtained on three benchmark HSI datasets have confirmed the effectiveness of the proposed method , especially with a limited number of labeled samples.

  17. An Improved Brain-Inspired Emotional Learning Algorithm for Fast Classification

    Directory of Open Access Journals (Sweden)

    Ying Mei

    2017-06-01

    Full Text Available Classification is an important task of machine intelligence in the field of information. The artificial neural network (ANN is widely used for classification. However, the traditional ANN shows slow training speed, and it is hard to meet the real-time requirement for large-scale applications. In this paper, an improved brain-inspired emotional learning (BEL algorithm is proposed for fast classification. The BEL algorithm was put forward to mimic the high speed of the emotional learning mechanism in mammalian brain, which has the superior features of fast learning and low computational complexity. To improve the accuracy of BEL in classification, the genetic algorithm (GA is adopted for optimally tuning the weights and biases of amygdala and orbitofrontal cortex in the BEL neural network. The combinational algorithm named as GA-BEL has been tested on eight University of California at Irvine (UCI datasets and two well-known databases (Japanese Female Facial Expression, Cohn–Kanade. The comparisons of experiments indicate that the proposed GA-BEL is more accurate than the original BEL algorithm, and it is much faster than the traditional algorithm.

  18. Classification of ECG beats using deep belief network and active learning.

    Science.gov (United States)

    G, Sayantan; T, Kien P; V, Kadambari K

    2018-04-12

    A new semi-supervised approach based on deep learning and active learning for classification of electrocardiogram signals (ECG) is proposed. The objective of the proposed work is to model a scientific method for classification of cardiac irregularities using electrocardiogram beats. The model follows the Association for the Advancement of medical instrumentation (AAMI) standards and consists of three phases. In phase I, feature representation of ECG is learnt using Gaussian-Bernoulli deep belief network followed by a linear support vector machine (SVM) training in the consecutive phase. It yields three deep models which are based on AAMI-defined classes, namely N, V, S, and F. In the last phase, a query generator is introduced to interact with the expert to label few beats to improve accuracy and sensitivity. The proposed approach depicts significant improvement in accuracy with minimal queries posed to the expert and fast online training as tested on the MIT-BIH Arrhythmia Database and the MIT-BIH Supra-ventricular Arrhythmia Database (SVDB). With 100 queries labeled by the expert in phase III, the method achieves an accuracy of 99.5% in "S" versus all classifications (SVEB) and 99.4% accuracy in "V " versus all classifications (VEB) on MIT-BIH Arrhythmia Database. In a similar manner, it is attributed that an accuracy of 97.5% for SVEB and 98.6% for VEB on SVDB database is achieved respectively. Graphical Abstract Reply- Deep belief network augmented by active learning for efficient prediction of arrhythmia.

  19. Joint learning and weighting of visual vocabulary for bag-of-feature based tissue classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2013-12-01

    Automated classification of tissue types of Region of Interest (ROI) in medical images has been an important application in Computer-Aided Diagnosis (CAD). Recently, bag-of-feature methods which treat each ROI as a set of local features have shown their power in this field. Two important issues of bag-of-feature strategy for tissue classification are investigated in this paper: the visual vocabulary learning and weighting, which are always considered independently in traditional methods by neglecting the inner relationship between the visual words and their weights. To overcome this problem, we develop a novel algorithm, Joint-ViVo, which learns the vocabulary and visual word weights jointly. A unified objective function based on large margin is defined for learning of both visual vocabulary and visual word weights, and optimized alternately in the iterative algorithm. We test our algorithm on three tissue classification tasks: classifying breast tissue density in mammograms, classifying lung tissue in High-Resolution Computed Tomography (HRCT) images, and identifying brain tissue type in Magnetic Resonance Imaging (MRI). The results show that Joint-ViVo outperforms the state-of-art methods on tissue classification problems. © 2013 Elsevier Ltd.

  20. Manifold regularized multitask learning for semi-supervised multilabel image classification.

    Science.gov (United States)

    Luo, Yong; Tao, Dacheng; Geng, Bo; Xu, Chao; Maybank, Stephen J

    2013-02-01

    It is a significant challenge to classify images with multiple labels by using only a small number of labeled samples. One option is to learn a binary classifier for each label and use manifold regularization to improve the classification performance by exploring the underlying geometric structure of the data distribution. However, such an approach does not perform well in practice when images from multiple concepts are represented by high-dimensional visual features. Thus, manifold regularization is insufficient to control the model complexity. In this paper, we propose a manifold regularized multitask learning (MRMTL) algorithm. MRMTL learns a discriminative subspace shared by multiple classification tasks by exploiting the common structure of these tasks. It effectively controls the model complexity because different tasks limit one another's search volume, and the manifold regularization ensures that the functions in the shared hypothesis space are smooth along the data manifold. We conduct extensive experiments, on the PASCAL VOC'07 dataset with 20 classes and the MIR dataset with 38 classes, by comparing MRMTL with popular image classification algorithms. The results suggest that MRMTL is effective for image classification.

  1. Out-of-Sample Generalizations for Supervised Manifold Learning for Classification.

    Science.gov (United States)

    Vural, Elif; Guillemot, Christine

    2016-03-01

    Supervised manifold learning methods for data classification map high-dimensional data samples to a lower dimensional domain in a structure-preserving way while increasing the separation between different classes. Most manifold learning methods compute the embedding only of the initially available data; however, the generalization of the embedding to novel points, i.e., the out-of-sample extension problem, becomes especially important in classification applications. In this paper, we propose a semi-supervised method for building an interpolation function that provides an out-of-sample extension for general supervised manifold learning algorithms studied in the context of classification. The proposed algorithm computes a radial basis function interpolator that minimizes an objective function consisting of the total embedding error of unlabeled test samples, defined as their distance to the embeddings of the manifolds of their own class, as well as a regularization term that controls the smoothness of the interpolation function in a direction-dependent way. The class labels of test data and the interpolation function parameters are estimated jointly with an iterative process. Experimental results on face and object images demonstrate the potential of the proposed out-of-sample extension algorithm for the classification of manifold-modeled data sets.

  2. Supervised learning for the automated transcription of spacer classification from spoligotype films

    Directory of Open Access Journals (Sweden)

    Abernethy Neil

    2009-08-01

    Full Text Available Abstract Background Molecular genotyping of bacteria has revolutionized the study of tuberculosis epidemiology, yet these established laboratory techniques typically require subjective and laborious interpretation by trained professionals. In the context of a Tuberculosis Case Contact study in The Gambia we used a reverse hybridization laboratory assay called spoligotype analysis. To facilitate processing of spoligotype images we have developed tools and algorithms to automate the classification and transcription of these data directly to a database while allowing for manual editing. Results Features extracted from each of the 1849 spots on a spoligo film were classified using two supervised learning algorithms. A graphical user interface allows manual editing of the classification, before export to a database. The application was tested on ten films of differing quality and the results of the best classifier were compared to expert manual classification, giving a median correct classification rate of 98.1% (inter quartile range: 97.1% to 99.2%, with an automated processing time of less than 1 minute per film. Conclusion The software implementation offers considerable time savings over manual processing whilst allowing expert editing of the automated classification. The automatic upload of the classification to a database reduces the chances of transcription errors.

  3. Energy-efficient algorithm for classification of states of wireless sensor network using machine learning methods

    Science.gov (United States)

    Yuldashev, M. N.; Vlasov, A. I.; Novikov, A. N.

    2018-05-01

    This paper focuses on the development of an energy-efficient algorithm for classification of states of a wireless sensor network using machine learning methods. The proposed algorithm reduces energy consumption by: 1) elimination of monitoring of parameters that do not affect the state of the sensor network, 2) reduction of communication sessions over the network (the data are transmitted only if their values can affect the state of the sensor network). The studies of the proposed algorithm have shown that at classification accuracy close to 100%, the number of communication sessions can be reduced by 80%.

  4. Data Processing And Machine Learning Methods For Multi-Modal Operator State Classification Systems

    Science.gov (United States)

    Hearn, Tristan A.

    2015-01-01

    This document is intended as an introduction to a set of common signal processing learning methods that may be used in the software portion of a functional crew state monitoring system. This includes overviews of both the theory of the methods involved, as well as examples of implementation. Practical considerations are discussed for implementing modular, flexible, and scalable processing and classification software for a multi-modal, multi-channel monitoring system. Example source code is also given for all of the discussed processing and classification methods.

  5. Neuromodulated Spike-Timing-Dependent Plasticity and Theory of Three-Factor Learning Rules

    Directory of Open Access Journals (Sweden)

    Wulfram eGerstner

    2016-01-01

    Full Text Available Classical Hebbian learning puts the emphasis on joint pre- and postsynaptic activity, but neglects the potential role of neuromodulators. Since neuromodulators convey information about novelty or reward, the influence of neuromodulatorson synaptic plasticity is useful not just for action learning in classical conditioning, but also to decide 'when' to create new memories in response to a flow of sensory stimuli.In this review, we focus on timing requirements for pre- and postsynaptic activity in conjunction with one or several phasic neuromodulatory signals. While the emphasis of the text is on conceptual models and mathematical theories, we also discusssome experimental evidence for neuromodulation of Spike-Timing-Dependent Plasticity.We highlight the importance of synaptic mechanisms in bridging the temporal gap between sensory stimulation and neuromodulatory signals, and develop a framework for a class of neo-Hebbian three-factor learning rules that depend on presynaptic activity, postsynaptic variables as well as the influence of neuromodulators.

  6. A novel application of deep learning for single-lead ECG classification.

    Science.gov (United States)

    Mathews, Sherin M; Kambhamettu, Chandra; Barner, Kenneth E

    2018-06-04

    Detecting and classifying cardiac arrhythmias is critical to the diagnosis of patients with cardiac abnormalities. In this paper, a novel approach based on deep learning methodology is proposed for the classification of single-lead electrocardiogram (ECG) signals. We demonstrate the application of the Restricted Boltzmann Machine (RBM) and deep belief networks (DBN) for ECG classification following detection of ventricular and supraventricular heartbeats using single-lead ECG. The effectiveness of this proposed algorithm is illustrated using real ECG signals from the widely-used MIT-BIH database. Simulation results demonstrate that with a suitable choice of parameters, RBM and DBN can achieve high average recognition accuracies of ventricular ectopic beats (93.63%) and of supraventricular ectopic beats (95.57%) at a low sampling rate of 114 Hz. Experimental results indicate that classifiers built into this deep learning-based framework achieved state-of-the art performance models at lower sampling rates and simple features when compared to traditional methods. Further, employing features extracted at a sampling rate of 114 Hz when combined with deep learning provided enough discriminatory power for the classification task. This performance is comparable to that of traditional methods and uses a much lower sampling rate and simpler features. Thus, our proposed deep neural network algorithm demonstrates that deep learning-based methods offer accurate ECG classification and could potentially be extended to other physiological signal classifications, such as those in arterial blood pressure (ABP), nerve conduction (EMG), and heart rate variability (HRV) studies. Copyright © 2018. Published by Elsevier Ltd.

  7. A machine learning approach for the classification of metallic glasses

    Science.gov (United States)

    Gossett, Eric; Perim, Eric; Toher, Cormac; Lee, Dongwoo; Zhang, Haitao; Liu, Jingbei; Zhao, Shaofan; Schroers, Jan; Vlassak, Joost; Curtarolo, Stefano

    Metallic glasses possess an extensive set of mechanical properties along with plastic-like processability. As a result, they are a promising material in many industrial applications. However, the successful synthesis of novel metallic glasses requires trial and error, costing both time and resources. Therefore, we propose a high-throughput approach that combines an extensive set of experimental measurements with advanced machine learning techniques. This allows us to classify metallic glasses and predict the full phase diagrams for a given alloy system. Thus this method provides a means to identify potential glass-formers and opens up the possibility for accelerating and reducing the cost of the design of new metallic glasses.

  8. Pulmonary emphysema classification based on an improved texton learning model by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2013-03-01

    In this paper, we present a texture classification method based on texton learned via sparse representation (SR) with new feature histogram maps in the classification of emphysema. First, an overcomplete dictionary of textons is learned via KSVD learning on every class image patches in the training dataset. In this stage, high-pass filter is introduced to exclude patches in smooth area to speed up the dictionary learning process. Second, 3D joint-SR coefficients and intensity histograms of the test images are used for characterizing regions of interest (ROIs) instead of conventional feature histograms constructed from SR coefficients of the test images over the dictionary. Classification is then performed using a classifier with distance as a histogram dissimilarity measure. Four hundreds and seventy annotated ROIs extracted from 14 test subjects, including 6 paraseptal emphysema (PSE) subjects, 5 centrilobular emphysema (CLE) subjects and 3 panlobular emphysema (PLE) subjects, are used to evaluate the effectiveness and robustness of the proposed method. The proposed method is tested on 167 PSE, 240 CLE and 63 PLE ROIs consisting of mild, moderate and severe pulmonary emphysema. The accuracy of the proposed system is around 74%, 88% and 89% for PSE, CLE and PLE, respectively.

  9. Deep learning architectures for multi-label classification of intelligent health risk prediction.

    Science.gov (United States)

    Maxwell, Andrew; Li, Runzhi; Yang, Bei; Weng, Heng; Ou, Aihua; Hong, Huixiao; Zhou, Zhaoxian; Gong, Ping; Zhang, Chaoyang

    2017-12-28

    Multi-label classification of data remains to be a challenging problem. Because of the complexity of the data, it is sometimes difficult to infer information about classes that are not mutually exclusive. For medical data, patients could have symptoms of multiple different diseases at the same time and it is important to develop tools that help to identify problems early. Intelligent health risk prediction models built with deep learning architectures offer a powerful tool for physicians to identify patterns in patient data that indicate risks associated with certain types of chronic diseases. Physical examination records of 110,300 anonymous patients were used to predict diabetes, hypertension, fatty liver, a combination of these three chronic diseases, and the absence of disease (8 classes in total). The dataset was split into training (90%) and testing (10%) sub-datasets. Ten-fold cross validation was used to evaluate prediction accuracy with metrics such as precision, recall, and F-score. Deep Learning (DL) architectures were compared with standard and state-of-the-art multi-label classification methods. Preliminary results suggest that Deep Neural Networks (DNN), a DL architecture, when applied to multi-label classification of chronic diseases, produced accuracy that was comparable to that of common methods such as Support Vector Machines. We have implemented DNNs to handle both problem transformation and algorithm adaption type multi-label methods and compare both to see which is preferable. Deep Learning architectures have the potential of inferring more information about the patterns of physical examination data than common classification methods. The advanced techniques of Deep Learning can be used to identify the significance of different features from physical examination data as well as to learn the contributions of each feature that impact a patient's risk for chronic diseases. However, accurate prediction of chronic disease risks remains a challenging

  10. Machine-learning phenotypic classification of bicuspid aortopathy.

    Science.gov (United States)

    Wojnarski, Charles M; Roselli, Eric E; Idrees, Jay J; Zhu, Yuanjia; Carnes, Theresa A; Lowry, Ashley M; Collier, Patrick H; Griffin, Brian; Ehrlinger, John; Blackstone, Eugene H; Svensson, Lars G; Lytle, Bruce W

    2018-02-01

    Bicuspid aortic valves (BAV) are associated with incompletely characterized aortopathy. Our objectives were to identify distinct patterns of aortopathy using machine-learning methods and characterize their association with valve morphology and patient characteristics. We analyzed preoperative 3-dimensional computed tomography reconstructions for 656 patients with BAV undergoing ascending aorta surgery between January 2002 and January 2014. Unsupervised partitioning around medoids was used to cluster aortic dimensions. Group differences were identified using polytomous random forest analysis. Three distinct aneurysm phenotypes were identified: root (n = 83; 13%), with predominant dilatation at sinuses of Valsalva; ascending (n = 364; 55%), with supracoronary enlargement rarely extending past the brachiocephalic artery; and arch (n = 209; 32%), with aortic arch dilatation. The arch phenotype had the greatest association with right-noncoronary cusp fusion: 29%, versus 13% for ascending and 15% for root phenotypes (P < .0001). Severe valve regurgitation was most prevalent in root phenotype (57%), followed by ascending (34%) and arch phenotypes (25%; P < .0001). Aortic stenosis was most prevalent in arch phenotype (62%), followed by ascending (50%) and root phenotypes (28%; P < .0001). Patient age increased as the extent of aneurysm became more distal (root, 49 years; ascending, 53 years; arch, 57 years; P < .0001), and root phenotype was associated with greater male predominance compared with ascending and arch phenotypes (94%, 76%, and 70%, respectively; P < .0001). Phenotypes were visually recognizable with 94% accuracy. Three distinct phenotypes of bicuspid valve-associated aortopathy were identified using machine-learning methodology. Patient characteristics and valvular dysfunction vary by phenotype, suggesting that the location of aortic pathology may be related to the underlying pathophysiology of this disease. Copyright © 2017 The American

  11. Coupled dimensionality reduction and classification for supervised and semi-supervised multilabel learning.

    Science.gov (United States)

    Gönen, Mehmet

    2014-03-01

    Coupled training of dimensionality reduction and classification is proposed previously to improve the prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of hamming loss, average AUC, macro F 1 , and micro F 1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding intrinsic subspace dimensionality and semi-supervised learning tasks.

  12. Applying cognitive developmental psychology to middle school physics learning: The rule assessment method

    Science.gov (United States)

    Hallinen, Nicole R.; Chi, Min; Chin, Doris B.; Prempeh, Joe; Blair, Kristen P.; Schwartz, Daniel L.

    2013-01-01

    Cognitive developmental psychology often describes children's growing qualitative understanding of the physical world. Physics educators may be able to use the relevant methods to advantage for characterizing changes in students' qualitative reasoning. Siegler developed the "rule assessment" method for characterizing levels of qualitative understanding for two factor situations (e.g., volume and mass for density). The method assigns children to rule levels that correspond to the degree they notice and coordinate the two factors. Here, we provide a brief tutorial plus a demonstration of how we have used this method to evaluate instructional outcomes with middle-school students who learned about torque, projectile motion, and collisions using different instructional methods with simulations.

  13. Visual perceptual learning by operant conditioning training follows rules of contingency

    Science.gov (United States)

    Kim, Dongho; Seitz, Aaron R; Watanabe, Takeo

    2015-01-01

    Visual perceptual learning (VPL) can occur as a result of a repetitive stimulus-reward pairing in the absence of any task. This suggests that rules that guide Conditioning, such as stimulus-reward contingency (e.g. that stimulus predicts the likelihood of reward), may also guide the formation of VPL. To address this question, we trained subjects with an operant conditioning task in which there were contingencies between the response to one of three orientations and the presence of reward. Results showed that VPL only occurred for positive contingencies, but not for neutral or negative contingencies. These results suggest that the formation of VPL is influenced by similar rules that guide the process of Conditioning. PMID:26028984

  14. Visual perceptual learning by operant conditioning training follows rules of contingency.

    Science.gov (United States)

    Kim, Dongho; Seitz, Aaron R; Watanabe, Takeo

    2015-01-01

    Visual perceptual learning (VPL) can occur as a result of a repetitive stimulus-reward pairing in the absence of any task. This suggests that rules that guide Conditioning, such as stimulus-reward contingency (e.g. that stimulus predicts the likelihood of reward), may also guide the formation of VPL. To address this question, we trained subjects with an operant conditioning task in which there were contingencies between the response to one of three orientations and the presence of reward. Results showed that VPL only occurred for positive contingencies, but not for neutral or negative contingencies. These results suggest that the formation of VPL is influenced by similar rules that guide the process of Conditioning.

  15. In Vivo Pattern Classification of Ingestive Behavior in Ruminants Using FBG Sensors and Machine Learning

    Directory of Open Access Journals (Sweden)

    Vinicius Pegorini

    2015-11-01

    Full Text Available Pattern classification of ingestive behavior in grazing animals has extreme importance in studies related to animal nutrition, growth and health. In this paper, a system to classify chewing patterns of ruminants in in vivo experiments is developed. The proposal is based on data collected by optical fiber Bragg grating sensors (FBG that are processed by machine learning techniques. The FBG sensors measure the biomechanical strain during jaw movements, and a decision tree is responsible for the classification of the associated chewing pattern. In this study, patterns associated with food intake of dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior were monitored: rumination and idleness. Experimental results show that the proposed approach for pattern classification is capable of differentiating the five patterns involved in the chewing process with an overall accuracy of 94%.

  16. Polarimetric SAR Image Classification Using Multiple-feature Fusion and Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Sun Xun

    2016-12-01

    Full Text Available In this paper, we propose a supervised classification algorithm for Polarimetric Synthetic Aperture Radar (PolSAR images using multiple-feature fusion and ensemble learning. First, we extract different polarimetric features, including extended polarimetric feature space, Hoekman, Huynen, H/alpha/A, and fourcomponent scattering features of PolSAR images. Next, we randomly select two types of features each time from all feature sets to guarantee the reliability and diversity of later ensembles and use a support vector machine as the basic classifier for predicting classification results. Finally, we concatenate all prediction probabilities of basic classifiers as the final feature representation and employ the random forest method to obtain final classification results. Experimental results at the pixel and region levels show the effectiveness of the proposed algorithm.

  17. In Vivo Pattern Classification of Ingestive Behavior in Ruminants Using FBG Sensors and Machine Learning.

    Science.gov (United States)

    Pegorini, Vinicius; Karam, Leandro Zen; Pitta, Christiano Santos Rocha; Cardoso, Rafael; da Silva, Jean Carlos Cardozo; Kalinowski, Hypolito José; Ribeiro, Richardson; Bertotti, Fábio Luiz; Assmann, Tangriani Simioni

    2015-11-11

    Pattern classification of ingestive behavior in grazing animals has extreme importance in studies related to animal nutrition, growth and health. In this paper, a system to classify chewing patterns of ruminants in in vivo experiments is developed. The proposal is based on data collected by optical fiber Bragg grating sensors (FBG) that are processed by machine learning techniques. The FBG sensors measure the biomechanical strain during jaw movements, and a decision tree is responsible for the classification of the associated chewing pattern. In this study, patterns associated with food intake of dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior were monitored: rumination and idleness. Experimental results show that the proposed approach for pattern classification is capable of differentiating the five patterns involved in the chewing process with an overall accuracy of 94%.

  18. Automatic plankton image classification combining multiple view features via multiple kernel learning.

    Science.gov (United States)

    Zheng, Haiyong; Wang, Ruchen; Yu, Zhibin; Wang, Nan; Gu, Zhaorui; Zheng, Bing

    2017-12-28

    Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of marine food chain. As the fundamental components of marine ecosystems, plankton is very sensitive to environment changes, and the study of plankton abundance and distribution is crucial, in order to understand environment changes and protect marine ecosystems. This study was carried out to develop an extensive applicable plankton classification system with high accuracy for the increasing number of various imaging devices. Literature shows that most plankton image classification systems were limited to only one specific imaging device and a relatively narrow taxonomic scope. The real practical system for automatic plankton classification is even non-existent and this study is partly to fill this gap. Inspired by the analysis of literature and development of technology, we focused on the requirements of practical application and proposed an automatic system for plankton image classification combining multiple view features via multiple kernel learning (MKL). For one thing, in order to describe the biomorphic characteristics of plankton more completely and comprehensively, we combined general features with robust features, especially by adding features like Inner-Distance Shape Context for morphological representation. For another, we divided all the features into different types from multiple views and feed them to multiple classifiers instead of only one by combining different kernel matrices computed from different types of features optimally via multiple kernel learning. Moreover, we also applied feature selection method to choose the optimal feature subsets from redundant features for satisfying different datasets from different imaging devices. We implemented our proposed classification system on three different datasets across more than 20 categories from phytoplankton to zooplankton. The experimental results validated that our system

  19. Image Classification of Ribbed Smoked Sheet using Learning Vector Quantization

    Science.gov (United States)

    Rahmat, R. F.; Pulungan, A. F.; Faza, S.; Budiarto, R.

    2017-01-01

    Natural rubber is an important export commodity in Indonesia, which can be a major contributor to national economic development. One type of rubber used as rubber material exports is Ribbed Smoked Sheet (RSS). The quantity of RSS exports depends on the quality of RSS. RSS rubber quality has been assigned in SNI 06-001-1987 and the International Standards of Quality and Packing for Natural Rubber Grades (The Green Book). The determination of RSS quality is also known as the sorting process. In the rubber factones, the sorting process is still done manually by looking and detecting at the levels of air bubbles on the surface of the rubber sheet by naked eyes so that the result is subjective and not so good. Therefore, a method is required to classify RSS rubber automatically and precisely. We propose some image processing techniques for the pre-processing, zoning method for feature extraction and Learning Vector Quantization (LVQ) method for classifying RSS rubber into two grades, namely RSS1 and RSS3. We used 120 RSS images as training dataset and 60 RSS images as testing dataset. The result shows that our proposed method can give 89% of accuracy and the best perform epoch is in the fifteenth epoch.

  20. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  1. On Internet Traffic Classification: A Two-Phased Machine Learning Approach

    Directory of Open Access Journals (Sweden)

    Taimur Bakhshi

    2016-01-01

    Full Text Available Traffic classification utilizing flow measurement enables operators to perform essential network management. Flow accounting methods such as NetFlow are, however, considered inadequate for classification requiring additional packet-level information, host behaviour analysis, and specialized hardware limiting their practical adoption. This paper aims to overcome these challenges by proposing two-phased machine learning classification mechanism with NetFlow as input. The individual flow classes are derived per application through k-means and are further used to train a C5.0 decision tree classifier. As part of validation, the initial unsupervised phase used flow records of fifteen popular Internet applications that were collected and independently subjected to k-means clustering to determine unique flow classes generated per application. The derived flow classes were afterwards used to train and test a supervised C5.0 based decision tree. The resulting classifier reported an average accuracy of 92.37% on approximately 3.4 million test cases increasing to 96.67% with adaptive boosting. The classifier specificity factor which accounted for differentiating content specific from supplementary flows ranged between 98.37% and 99.57%. Furthermore, the computational performance and accuracy of the proposed methodology in comparison with similar machine learning techniques lead us to recommend its extension to other applications in achieving highly granular real-time traffic classification.

  2. Time series classification using k-Nearest neighbours, Multilayer Perceptron and Learning Vector Quantization algorithms

    Directory of Open Access Journals (Sweden)

    Jiří Fejfar

    2012-01-01

    Full Text Available We are presenting results comparison of three artificial intelligence algorithms in a classification of time series derived from musical excerpts in this paper. Algorithms were chosen to represent different principles of classification – statistic approach, neural networks and competitive learning. The first algorithm is a classical k-Nearest neighbours algorithm, the second algorithm is Multilayer Perceptron (MPL, an example of artificial neural network and the third one is a Learning Vector Quantization (LVQ algorithm representing supervised counterpart to unsupervised Self Organizing Map (SOM.After our own former experiments with unlabelled data we moved forward to the data labels utilization, which generally led to a better accuracy of classification results. As we need huge data set of labelled time series (a priori knowledge of correct class which each time series instance belongs to, we used, with a good experience in former studies, musical excerpts as a source of real-world time series. We are using standard deviation of the sound signal as a descriptor of a musical excerpts volume level.We are describing principle of each algorithm as well as its implementation briefly, giving links for further research. Classification results of each algorithm are presented in a confusion matrix showing numbers of misclassifications and allowing to evaluate overall accuracy of the algorithm. Results are compared and particular misclassifications are discussed for each algorithm. Finally the best solution is chosen and further research goals are given.

  3. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning

    Directory of Open Access Journals (Sweden)

    Victoria Plaza-Leiva

    2017-03-01

    Full Text Available Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM, Gaussian processes (GP, and Gaussian mixture models (GMM. A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl. Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.

  4. [Severity classification of chronic obstructive pulmonary disease based on deep learning].

    Science.gov (United States)

    Ying, Jun; Yang, Ceyuan; Li, Quanzheng; Xue, Wanguo; Li, Tanshi; Cao, Wenzhe

    2017-12-01

    In this paper, a deep learning method has been raised to build an automatic classification algorithm of severity of chronic obstructive pulmonary disease. Large sample clinical data as input feature were analyzed for their weights in classification. Through feature selection, model training, parameter optimization and model testing, a classification prediction model based on deep belief network was built to predict severity classification criteria raised by the Global Initiative for Chronic Obstructive Lung Disease (GOLD). We get accuracy over 90% in prediction for two different standardized versions of severity criteria raised in 2007 and 2011 respectively. Moreover, we also got the contribution ranking of different input features through analyzing the model coefficient matrix and confirmed that there was a certain degree of agreement between the more contributive input features and the clinical diagnostic knowledge. The validity of the deep belief network model was proved by this result. This study provides an effective solution for the application of deep learning method in automatic diagnostic decision making.

  5. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification

    Directory of Open Access Journals (Sweden)

    Lu Bing

    2017-01-01

    Full Text Available We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL. After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM. Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  6. Representation Learning for Class C G Protein-Coupled Receptors Classification

    Directory of Open Access Journals (Sweden)

    Raúl Cruz-Barbosa

    2018-03-01

    Full Text Available G protein-coupled receptors (GPCRs are integral cell membrane proteins of relevance for pharmacology. The complete tertiary structure including both extracellular and transmembrane domains has not been determined for any member of class C GPCRs. An alternative way to work on GPCR structural models is the investigation of their functionality through the analysis of their primary structure. For this, sequence representation is a key factor for the GPCRs’ classification context, where usually, feature engineering is carried out. In this paper, we propose the use of representation learning to acquire the features that best represent the class C GPCR sequences and at the same time to obtain a model for classification automatically. Deep learning methods in conjunction with amino acid physicochemical property indices are then used for this purpose. Experimental results assessed by the classification accuracy, Matthews’ correlation coefficient and the balanced error rate show that using a hydrophobicity index and a restricted Boltzmann machine (RBM can achieve performance results (accuracy of 92.9% similar to those reported in the literature. As a second proposal, we combine two or more physicochemical property indices instead of only one as the input for a deep architecture in order to add information from the sequences. Experimental results show that using three hydrophobicity-related index combinations helps to improve the classification performance (accuracy of 94.1% of an RBM better than those reported in the literature for class C GPCRs without using feature selection methods.

  7. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning.

    Science.gov (United States)

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-03-15

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.

  8. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy is a standard method for screening polyps by highly trained physicians. Miss-detected polyps in colonoscopy are potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches named hand-craft feature method and convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand craft feature extraction and support vector machine (SVM) method is adopted for classification. For CNN approach, three convolution and pooling based deep learning framework is used for classification purpose. The proposed framework is evaluated using three public polyp databases. From the experimental results, we have shown that the CNN based deep learning framework shows better classification performance than the hand-craft feature based methods. It achieves over 90% of classification accuracy, sensitivity, specificity and precision.

  9. Genetic Programming for the Generation of Crisp and Fuzzy Rule Bases in Classification and Diagnosis of Medical Data

    DEFF Research Database (Denmark)

    Dounias, George; Tsakonas, Athanasios; Jantzen, Jan

    2002-01-01

    This paper demonstrates two methodologies for the construction of rule-based systems in medical decision making. The first approach consists of a method combining genetic programming and heuristic hierarchical rule-base construction. The second model is composed by a strongly-typed genetic...

  10. SVM and PCA Based Learning Feature Classification Approaches for E-Learning System

    Science.gov (United States)

    Khamparia, Aditya; Pandey, Babita

    2018-01-01

    E-learning and online education has made great improvements in the recent past. It has shifted the teaching paradigm from conventional classroom learning to dynamic web based learning. Due to this, a dynamic learning material has been delivered to learners, instead ofstatic content, according to their skills, needs and preferences. In this…

  11. Fine-grained leukocyte classification with deep residual learning for microscopic images.

    Science.gov (United States)

    Qin, Feiwei; Gao, Nannan; Peng, Yong; Wu, Zizhao; Shen, Shuying; Grudtsin, Artur

    2018-08-01

    Leukocyte classification and cytometry have wide applications in medical domain, previous researches usually exploit machine learning techniques to classify leukocytes automatically. However, constrained by the past development of machine learning techniques, for example, extracting distinctive features from raw microscopic images are difficult, the widely used SVM classifier only has relative few parameters to tune, these methods cannot efficiently handle fine-grained classification cases when the white blood cells have up to 40 categories. Based on deep learning theory, a systematic study is conducted on finer leukocyte classification in this paper. A deep residual neural network based leukocyte classifier is constructed at first, which can imitate the domain expert's cell recognition process, and extract salient features robustly and automatically. Then the deep neural network classifier's topology is adjusted according to the prior knowledge of white blood cell test. After that the microscopic image dataset with almost one hundred thousand labeled leukocytes belonging to 40 categories is built, and combined training strategies are adopted to make the designed classifier has good generalization ability. The proposed deep residual neural network based classifier was tested on microscopic image dataset with 40 leukocyte categories. It achieves top-1 accuracy of 77.80%, top-5 accuracy of 98.75% during the training procedure. The average accuracy on the test set is nearly 76.84%. This paper presents a fine-grained leukocyte classification method for microscopic images, based on deep residual learning theory and medical domain knowledge. Experimental results validate the feasibility and effectiveness of our approach. Extended experiments support that the fine-grained leukocyte classifier could be used in real medical applications, assist doctors in diagnosing diseases, reduce human power significantly. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. A Comparative Classification of Wheat Grains for Artificial Neural Network and Extreme Learning Machine

    OpenAIRE

    ASLAN, Muhammet Fatih; SABANCI, Kadir; YİĞİT, Enes; KAYABAŞI, Ahmet; TOKTAŞ, Abdurrahim; DUYSAK, Hüseyin

    2018-01-01

    In this study, classification of two types of wheat grainsinto bread and durum was carried out. The species of wheat grains in thisdataset are bread and durum and these species have equal samples in the datasetas 100 instances. Seven features, including width, height, area, perimeter,roundness, width and perimeter/area were extracted from each wheat grains. Classificationwas separately conducted by Artificial Neural Network (ANN) and Extreme Learning Machine (ELM)artificial intelligence techn...

  13. A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem

    OpenAIRE

    Zekić-Sušac, Marijana; Pfeifer, Sanja; Šarlija, Nataša

    2014-01-01

    Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART ...

  14. In Vivo Pattern Classification of Ingestive Behavior in Ruminants Using FBG Sensors and Machine Learning

    OpenAIRE

    Pegorini, Vinicius; Karam, Leandro Zen; Pitta, Christiano Santos Rocha; Cardoso, Rafael; da Silva, Jean Carlos Cardozo; Kalinowski, Hypolito Jos?; Ribeiro, Richardson; Bertotti, F?bio Luiz; Assmann, Tangriani Simioni

    2015-01-01

    Pattern classification of ingestive behavior in grazing animals has extreme importance in studies related to animal nutrition, growth and health. In this paper, a system to classify chewing patterns of ruminants in in vivo experiments is developed. The proposal is based on data collected by optical fiber Bragg grating sensors (FBG) that are processed by machine learning techniques. The FBG sensors measure the biomechanical strain during jaw movements, and a decision tree is responsible for th...

  15. Ship Detection and Classification on Optical Remote Sensing Images Using Deep Learning

    Directory of Open Access Journals (Sweden)

    Liu Ying

    2017-01-01

    Full Text Available Ship detection and classification is critical for national maritime security and national defense. Although some SAR (Synthetic Aperture Radar image-based ship detection approaches have been proposed and used, they are not able to satisfy the requirement of real-world applications as the number of SAR sensors is limited, the resolution is low, and the revisit cycle is long. As massive optical remote sensing images of high resolution are available, ship detection and classification on theses images is becoming a promising technique, and has attracted great attention on applications including maritime security and traffic control. Some digital image processing methods have been proposed to detect ships in optical remote sensing images, but most of them face difficulty in terms of accuracy, performance and complexity. Recently, an autoencoder-based deep neural network with extreme learning machine was proposed, but it cannot meet the requirement of real-world applications as it only works with simple and small-scaled data sets. Therefore, in this paper, we propose a novel ship detection and classification approach which utilizes deep convolutional neural network (CNN as the ship classifier. The performance of our proposed ship detection and classification approach was evaluated on a set of images downloaded from Google Earth at the resolution 0.5m. 99% detection accuracy and 95% classification accuracy were achieved. In model training, 75× speedup is achieved on 1 Nvidia Titanx GPU.

  16. Combining extreme learning machines using support vector machines for breast tissue classification.

    Science.gov (United States)

    Daliri, Mohammad Reza

    2015-01-01

    In this paper, we present a new approach for breast tissue classification using the features derived from electrical impedance spectroscopy. This method is composed of a feature extraction method, feature selection phase and a classification step. The feature extraction phase derives the features from the electrical impedance spectra. The extracted features consist of the impedivity at zero frequency (I0), the phase angle at 500 KHz, the high-frequency slope of phase angle, the impedance distance between spectral ends, the area under spectrum, the normalised area, the maximum of the spectrum, the distance between impedivity at I0 and the real part of the maximum frequency point and the length of the spectral curve. The system uses the information theoretic criterion as a strategy for feature selection and the combining extreme learning machines (ELMs) for the classification phase. The results of several ELMs are combined using the support vector machines classifier, and the result of classification is reported as a measure of the performance of the system. The results indicate that the proposed system achieves high accuracy in classification of breast tissues using the electrical impedance spectroscopy.

  17. Building an asynchronous web-based tool for machine learning classification.

    Science.gov (United States)

    Weber, Griffin; Vinterbo, Staal; Ohno-Machado, Lucila

    2002-01-01

    Various unsupervised and supervised learning methods including support vector machines, classification trees, linear discriminant analysis and nearest neighbor classifiers have been used to classify high-throughput gene expression data. Simpler and more widely accepted statistical tools have not yet been used for this purpose, hence proper comparisons between classification methods have not been conducted. We developed free software that implements logistic regression with stepwise variable selection as a quick and simple method for initial exploration of important genetic markers in disease classification. To implement the algorithm and allow our collaborators in remote locations to evaluate and compare its results against those of other methods, we developed a user-friendly asynchronous web-based application with a minimal amount of programming using free, downloadable software tools. With this program, we show that classification using logistic regression can perform as well as other more sophisticated algorithms, and it has the advantages of being easy to interpret and reproduce. By making the tool freely and easily available, we hope to promote the comparison of classification methods. In addition, we believe our web application can be used as a model for other bioinformatics laboratories that need to develop web-based analysis tools in a short amount of time and on a limited budget.

  18. Classification of autism spectrum disorder using supervised learning of brain connectivity measures extracted from synchrostates

    Science.gov (United States)

    Jamal, Wasifa; Das, Saptarshi; Oprescu, Ioana-Anastasia; Maharatna, Koushik; Apicella, Fabio; Sicca, Federico

    2014-08-01

    Objective. The paper investigates the presence of autism using the functional brain connectivity measures derived from electro-encephalogram (EEG) of children during face perception tasks. Approach. Phase synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase synchronized states or synchrostates temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) for analyzing their EEG while processing fearful, happy and neutral faces. The minimal and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we here explored the discriminant analysis and support vector machine both with polynomial kernels for the classification task. Main results. The leave one out cross-validation of the classification algorithm gives 94.7% accuracy as the best performance with corresponding sensitivity and specificity values as 85.7% and 100% respectively. Significance. The proposed method gives high classification accuracies and outperforms other contemporary research results. The effectiveness of the proposed method for classification of autistic and typical children suggests the possibility of using it on a larger population to validate it for clinical practice.

  19. Classification of Suicide Attempts through a Machine Learning Algorithm Based on Multiple Systemic Psychiatric Scales

    Directory of Open Access Journals (Sweden)

    Jihoon Oh

    2017-09-01

    Full Text Available Classification and prediction of suicide attempts in high-risk groups is important for preventing suicide. The purpose of this study was to investigate whether the information from multiple clinical scales has classification power for identifying actual suicide attempts. Patients with depression and anxiety disorders (N = 573 were included, and each participant completed 31 self-report psychiatric scales and questionnaires about their history of suicide attempts. We then trained an artificial neural network classifier with 41 variables (31 psychiatric scales and 10 sociodemographic elements and ranked the contribution of each variable for the classification of suicide attempts. To evaluate the clinical applicability of our model, we measured classification performance with top-ranked predictors. Our model had an overall accuracy of 93.7% in 1-month, 90.8% in 1-year, and 87.4% in lifetime suicide attempts detection. The area under the receiver operating characteristic curve (AUROC was the highest for 1-month suicide attempts detection (0.93, followed by lifetime (0.89, and 1-year detection (0.87. Among all variables, the Emotion Regulation Questionnaire had the highest contribution, and the positive and negative characteristics of the scales similarly contributed to classification performance. Performance on suicide attempts classification was largely maintained when we only used the top five ranked variables for training (AUROC; 1-month, 0.75, 1-year, 0.85, lifetime suicide attempts detection, 0.87. Our findings indicate that information from self-report clinical scales can be useful for the classification of suicide attempts. Based on the reliable performance of the top five predictors alone, this machine learning approach could help clinicians identify high-risk patients in clinical settings.

  20. Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification.

    Science.gov (United States)

    Diamant, Idit; Klang, Eyal; Amitai, Michal; Konen, Eli; Goldberger, Jacob; Greenspan, Hayit

    2017-06-01

    We present a novel variant of the bag-of-visual-words (BoVW) method for automated medical image classification. Our approach improves the BoVW model by learning a task-driven dictionary of the most relevant visual words per task using a mutual information-based criterion. Additionally, we generate relevance maps to visualize and localize the decision of the automatic classification algorithm. These maps demonstrate how the algorithm works and show the spatial layout of the most relevant words. We applied our algorithm to three different tasks: chest x-ray pathology identification (of four pathologies: cardiomegaly, enlarged mediastinum, right consolidation, and left consolidation), liver lesion classification into four categories in computed tomography (CT) images and benign/malignant clusters of microcalcifications (MCs) classification in breast mammograms. Validation was conducted on three datasets: 443 chest x-rays, 118 portal phase CT images of liver lesions, and 260 mammography MCs. The proposed method improves the classical BoVW method for all tested applications. For chest x-ray, area under curve of 0.876 was obtained for enlarged mediastinum identification compared to 0.855 using classical BoVW (with p-value 0.01). For MC classification, a significant improvement of 4% was achieved using our new approach (with p-value = 0.03). For liver lesion classification, an improvement of 6% in sensitivity and 2% in specificity were obtained (with p-value 0.001). We demonstrated that classification based on informative selected set of words results in significant improvement. Our new BoVW approach shows promising results in clinically important domains. Additionally, it can discover relevant parts of images for the task at hand without explicit annotations for training data. This can provide computer-aided support for medical experts in challenging image analysis tasks.

  1. Classification of Suicide Attempts through a Machine Learning Algorithm Based on Multiple Systemic Psychiatric Scales.

    Science.gov (United States)

    Oh, Jihoon; Yun, Kyongsik; Hwang, Ji-Hyun; Chae, Jeong-Ho

    2017-01-01

    Classification and prediction of suicide attempts in high-risk groups is important for preventing suicide. The purpose of this study was to investigate whether the information from multiple clinical scales has classification power for identifying actual suicide attempts. Patients with depression and anxiety disorders ( N  = 573) were included, and each participant completed 31 self-report psychiatric scales and questionnaires about their history of suicide attempts. We then trained an artificial neural network classifier with 41 variables (31 psychiatric scales and 10 sociodemographic elements) and ranked the contribution of each variable for the classification of suicide attempts. To evaluate the clinical applicability of our model, we measured classification performance with top-ranked predictors. Our model had an overall accuracy of 93.7% in 1-month, 90.8% in 1-year, and 87.4% in lifetime suicide attempts detection. The area under the receiver operating characteristic curve (AUROC) was the highest for 1-month suicide attempts detection (0.93), followed by lifetime (0.89), and 1-year detection (0.87). Among all variables, the Emotion Regulation Questionnaire had the highest contribution, and the positive and negative characteristics of the scales similarly contributed to classification performance. Performance on suicide attempts classification was largely maintained when we only used the top five ranked variables for training (AUROC; 1-month, 0.75, 1-year, 0.85, lifetime suicide attempts detection, 0.87). Our findings indicate that information from self-report clinical scales can be useful for the classification of suicide attempts. Based on the reliable performance of the top five predictors alone, this machine learning approach could help clinicians identify high-risk patients in clinical settings.

  2. Statistical Discriminability Estimation for Pattern Classification Based on Neural Incremental Attribute Learning

    DEFF Research Database (Denmark)

    Wang, Ting; Guan, Sheng-Uei; Puthusserypady, Sadasivan

    2014-01-01

    Feature ordering is a significant data preprocessing method in Incremental Attribute Learning (IAL), a novel machine learning approach which gradually trains features according to a given order. Previous research has shown that, similar to feature selection, feature ordering is also important based...... estimation. Moreover, a criterion that summarizes all the produced values of AD is employed with a GA (Genetic Algorithm)-based approach to obtain the optimum feature ordering for classification problems based on neural networks by means of IAL. Compared with the feature ordering obtained by other approaches...

  3. Classification of large-sized hyperspectral imagery using fast machine learning algorithms

    Science.gov (United States)

    Xia, Junshi; Yokoya, Naoto; Iwasaki, Akira

    2017-07-01

    We present a framework of fast machine learning algorithms in the context of large-sized hyperspectral images classification from the theoretical to a practical viewpoint. In particular, we assess the performance of random forest (RF), rotation forest (RoF), and extreme learning machine (ELM) and the ensembles of RF and ELM. These classifiers are applied to two large-sized hyperspectral images and compared to the support vector machines. To give the quantitative analysis, we pay attention to comparing these methods when working with high input dimensions and a limited/sufficient training set. Moreover, other important issues such as the computational cost and robustness against the noise are also discussed.

  4. Multiple kernel learning using single stage function approximation for binary classification problems

    Science.gov (United States)

    Shiju, S.; Sumitra, S.

    2017-12-01

    In this paper, the multiple kernel learning (MKL) is formulated as a supervised classification problem. We dealt with binary classification data and hence the data modelling problem involves the computation of two decision boundaries of which one related with that of kernel learning and the other with that of input data. In our approach, they are found with the aid of a single cost function by constructing a global reproducing kernel Hilbert space (RKHS) as the direct sum of the RKHSs corresponding to the decision boundaries of kernel learning and input data and searching that function from the global RKHS, which can be represented as the direct sum of the decision boundaries under consideration. In our experimental analysis, the proposed model had shown superior performance in comparison with that of existing two stage function approximation formulation of MKL, where the decision functions of kernel learning and input data are found separately using two different cost functions. This is due to the fact that single stage representation helps the knowledge transfer between the computation procedures for finding the decision boundaries of kernel learning and input data, which inturn boosts the generalisation capacity of the model.

  5. Deep learning decision fusion for the classification of urban remote sensing data

    Science.gov (United States)

    Abdi, Ghasem; Samadzadegan, Farhad; Reinartz, Peter

    2018-01-01

    Multisensor data fusion is one of the most common and popular remote sensing data classification topics by considering a robust and complete description about the objects of interest. Furthermore, deep feature extraction has recently attracted significant interest and has become a hot research topic in the geoscience and remote sensing research community. A deep learning decision fusion approach is presented to perform multisensor urban remote sensing data classification. After deep features are extracted by utilizing joint spectral-spatial information, a soft-decision made classifier is applied to train high-level feature representations and to fine-tune the deep learning framework. Next, a decision-level fusion classifies objects of interest by the joint use of sensors. Finally, a context-aware object-based postprocessing is used to enhance the classification results. A series of comparative experiments are conducted on the widely used dataset of 2014 IEEE GRSS data fusion contest. The obtained results illustrate the considerable advantages of the proposed deep learning decision fusion over the traditional classifiers.

  6. Geometrical Modification of Learning Vector Quantization Method for Solving Classification Problems

    Directory of Open Access Journals (Sweden)

    Korhan GÜNEL

    2016-09-01

    Full Text Available In this paper, a geometrical scheme is presented to show how to overcome an encountered problem arising from the use of generalized delta learning rule within competitive learning model. It is introduced a theoretical methodology for describing the quantization of data via rotating prototype vectors on hyper-spheres.The proposed learning algorithm is tested and verified on different multidimensional datasets including a binary class dataset and two multiclass datasets from the UCI repository, and a multiclass dataset constructed by us. The proposed method is compared with some baseline learning vector quantization variants in literature for all domains. Large number of experiments verify the performance of our proposed algorithm with acceptable accuracy and macro f1 scores.

  7. The classification problem in machine learning: an overview with study cases in emotion recognition and music-speech differentiation

    OpenAIRE

    Rodríguez Cadavid, Santiago

    2015-01-01

    This work addresses the well-known classification problem in machine learning -- The goal of this study is to approach the reader to the methodological aspects of the feature extraction, feature selection and classifier performance through simple and understandable theoretical aspects and two study cases -- Finally, a very good classification performance was obtained for the emotion recognition from speech

  8. Lessons learned from early implementation of the maintenance rule at nine nuclear power plants

    International Nuclear Information System (INIS)

    Petrone, C.D.; Correia, R.P.; Black, S.C.

    1995-06-01

    This report summarizes the lessons learned from the nine pilot site visits that were performed to review early implementation of the maintenance rule using the draft NRC Maintenance Inspection Procedure. Licensees followed NUMARC 93-01, ''Industry Guideline for Monitoring the Effectiveness of Maintenance at Nuclear Power Plants.'' In general, the licensees were thorough in determining which structures, systems, and components (SSCS) were within the scope of the maintenance rule at each site. The use of an expert panel was an appropriate and practical method of determining which SSCs are risk significant. When setting goals, all licensees considered safety but many licensees did not consider operating experience throughout the industry. Although required to do so, licensees were not monitoring at the system or train level the performance or condition for some systems used in standby service but not significant to risk. Most licensees had not established adequate monitoring of structures under the rule. Licensees established reasonable plans for doing periodic evaluations, balancing unavailability and reliability, and assessing the effect of taking equipment out of service for maintenance. However, these plans were not evaluated because they had not been fully implemented at the time of the site visits

  9. Trading Rules on Stock Markets Using Genetic Network Programming with Reinforcement Learning and Importance Index

    Science.gov (United States)

    Mabu, Shingo; Hirasawa, Kotaro; Furuzuki, Takayuki

    Genetic Network Programming (GNP) is an evolutionary computation which represents its solutions using graph structures. Since GNP can create quite compact programs and has an implicit memory function, it has been clarified that GNP works well especially in dynamic environments. In addition, a study on creating trading rules on stock markets using GNP with Importance Index (GNP-IMX) has been done. IMX is a new element which is a criterion for decision making. In this paper, we combined GNP-IMX with Actor-Critic (GNP-IMX&AC) and create trading rules on stock markets. Evolution-based methods evolve their programs after enough period of time because they must calculate fitness values, however reinforcement learning can change programs during the period, therefore the trading rules can be created efficiently. In the simulation, the proposed method is trained using the stock prices of 10 brands in 2002 and 2003. Then the generalization ability is tested using the stock prices in 2004. The simulation results show that the proposed method can obtain larger profits than GNP-IMX without AC and Buy&Hold.

  10. Learning the rules of the rock-paper-scissors game: chimpanzees versus children.

    Science.gov (United States)

    Gao, Jie; Su, Yanjie; Tomonaga, Masaki; Matsuzawa, Tetsuro

    2018-01-01

    The present study aimed to investigate whether chimpanzees (Pan troglodytes) could learn a transverse pattern by being trained in the rules of the rock-paper-scissors game in which "paper" beats "rock," "rock" beats "scissors," and "scissors" beats "paper." Additionally, this study compared the learning processes between chimpanzees and children. Seven chimpanzees were tested using a computer-controlled task. They were trained to choose the stronger of two options according to the game rules. The chimpanzees first engaged in the paper-rock sessions until they reached the learning criterion. Subsequently, they engaged in the rock-scissors and scissors-paper sessions, before progressing to sessions with all three pairs mixed. Five of the seven chimpanzees completed training after a mean of 307 sessions, which indicates that they learned the circular pattern. The chimpanzees required more scissors-paper sessions (14.29 ± 6.89), the third learnt pair, than paper-rock (1.71 ± 0.18) and rock-scissors (3.14 ± 0.70) sessions, suggesting they had difficulty finalizing the circularity. The chimpanzees then received generalization tests using new stimuli, which they learned quickly. A similar procedure was performed with children (35-71 months, n = 38) who needed the same number of trials for all three pairs during single-paired sessions. Their accuracy during the mixed-pair sessions improved with age and was better than chance from 50 months of age, which indicates that the ability to solve the transverse patterning problem might develop at around 4 years of age. The present findings show that chimpanzees were able to learn the task but had difficulties with circularity, whereas children learned the task more easily and developed the relevant ability at approximately 4 years of age. Furthermore, the chimpanzees' performance during the mixed-pair sessions was similar to that of 4-year-old children during the corresponding stage of training.

  11. An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification.

    Science.gov (United States)

    Gómez, David; Rojas, Alfonso

    2016-01-01

    A sizable amount of research has been done to improve the mechanisms for knowledge extraction such as machine learning classification or regression. Quite unintuitively, the no free lunch (NFL) theorem states that all optimization problem strategies perform equally well when averaged over all possible problems. This fact seems to clash with the effort put forth toward better algorithms. This letter explores empirically the effect of the NFL theorem on some popular machine learning classification techniques over real-world data sets.

  12. Research into Financial Position of Listed Companies following Classification via Extreme Learning Machine Based upon DE Optimization

    Directory of Open Access Journals (Sweden)

    Fu Yu

    2016-01-01

    Full Text Available By means of the model of extreme learning machine based upon DE optimization, this article particularly centers on the optimization thinking of such a model as well as its application effect in the field of listed company’s financial position classification. It proves that the improved extreme learning machine algorithm based upon DE optimization eclipses the traditional extreme learning machine algorithm following comparison. Meanwhile, this article also intends to introduce certain research thinking concerning extreme learning machine into the economics classification area so as to fulfill the purpose of computerizing the speedy but effective evaluation of massive financial statements of listed companies pertain to different classes

  13. Classification and authentication of unknown water samples using machine learning algorithms.

    Science.gov (United States)

    Kundu, Palash K; Panchariya, P C; Kundu, Madhusree

    2011-07-01

    This paper proposes the development of water sample classification and authentication, in real life which is based on machine learning algorithms. The proposed techniques used experimental measurements from a pulse voltametry method which is based on an electronic tongue (E-tongue) instrumentation system with silver and platinum electrodes. E-tongue include arrays of solid state ion sensors, transducers even of different types, data collectors and data analysis tools, all oriented to the classification of liquid samples and authentication of unknown liquid samples. The time series signal and the corresponding raw data represent the measurement from a multi-sensor system. The E-tongue system, implemented in a laboratory environment for 6 numbers of different ISI (Bureau of Indian standard) certified water samples (Aquafina, Bisleri, Kingfisher, Oasis, Dolphin, and McDowell) was the data source for developing two types of machine learning algorithms like classification and regression. A water data set consisting of 6 numbers of sample classes containing 4402 numbers of features were considered. A PCA (principal component analysis) based classification and authentication tool was developed in this study as the machine learning component of the E-tongue system. A proposed partial least squares (PLS) based classifier, which was dedicated as well; to authenticate a specific category of water sample evolved out as an integral part of the E-tongue instrumentation system. The developed PCA and PLS based E-tongue system emancipated an overall encouraging authentication percentage accuracy with their excellent performances for the aforesaid categories of water samples. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.

  14. A Deep Learning Architecture for Temporal Sleep Stage Classification Using Multivariate and Multimodal Time Series.

    Science.gov (United States)

    Chambon, Stanislas; Galtier, Mathieu N; Arnal, Pierrick J; Wainrib, Gilles; Gramfort, Alexandre

    2018-04-01

    Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns to each 30 s of the signal of a sleep stage, based on the visual inspection of signals such as electroencephalograms (EEGs), electrooculograms (EOGs), electrocardiograms, and electromyograms (EMGs). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or extracting handcrafted features, that exploits all multivariate and multimodal polysomnography (PSG) signals (EEG, EMG, and EOG), and that can exploit the temporal context of each 30-s window of data. For each modality, the first layer learns linear spatial filters that exploit the array of sensors to increase the signal-to-noise ratio, and the last layer feeds the learnt representation to a softmax classifier. Our model is compared to alternative automatic approaches based on convolutional networks or decisions trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields the state-of-the-art performance. Our study reveals a number of insights on the spatiotemporal distribution of the signal of interest: a good tradeoff for optimal classification performance measured with balanced accuracy is to use 6 EEG with 2 EOG (left and right) and 3 EMG chin channels. Also exploiting 1 min of data before and after each data segment offers the strongest improvement when a limited number of channels are available. As sleep experts, our system exploits the multivariate and multimodal nature of PSG signals in order to deliver the state-of-the-art classification performance with a small computational cost.

  15. CACNA1C gene regulates behavioral strategies in operant rule learning.

    Science.gov (United States)

    Koppe, Georgia; Mallien, Anne Stephanie; Berger, Stefan; Bartsch, Dusan; Gass, Peter; Vollmayr, Barbara; Durstewitz, Daniel

    2017-06-01

    Behavioral experiments are usually designed to tap into a specific cognitive function, but animals may solve a given task through a variety of different and individual behavioral strategies, some of them not foreseen by the experimenter. Animal learning may therefore be seen more as the process of selecting among, and adapting, potential behavioral policies, rather than mere strengthening of associative links. Calcium influx through high-voltage-gated Ca2+ channels is central to synaptic plasticity, and altered expression of Cav1.2 channels and the CACNA1C gene have been associated with severe learning deficits and psychiatric disorders. Given this, we were interested in how specifically a selective functional ablation of the Cacna1c gene would modulate the learning process. Using a detailed, individual-level analysis of learning on an operant cue discrimination task in terms of behavioral strategies, combined with Bayesian selection among computational models estimated from the empirical data, we show that a Cacna1c knockout does not impair learning in general but has a much more specific effect: the majority of Cacna1c knockout mice still managed to increase reward feedback across trials but did so by adapting an outcome-based strategy, while the majority of matched controls adopted the experimentally intended cue-association rule. Our results thus point to a quite specific role of a single gene in learning and highlight that much more mechanistic insight could be gained by examining response patterns in terms of a larger repertoire of potential behavioral strategies. The results may also have clinical implications for treating psychiatric disorders.

  16. Bidirectional-Convolutional LSTM Based Spectral-Spatial Feature Learning for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Qingshan Liu

    2017-12-01

    Full Text Available This paper proposes a novel deep learning framework named bidirectional-convolutional long short term memory (Bi-CLSTM network to automatically learn the spectral-spatial features from hyperspectral images (HSIs. In the network, the issue of spectral feature extraction is considered as a sequence learning problem, and a recurrent connection operator across the spectral domain is used to address it. Meanwhile, inspired from the widely used convolutional neural network (CNN, a convolution operator across the spatial domain is incorporated into the network to extract the spatial feature. In addition, to sufficiently capture the spectral information, a bidirectional recurrent connection is proposed. In the classification phase, the learned features are concatenated into a vector and fed to a Softmax classifier via a fully-connected operator. To validate the effectiveness of the proposed Bi-CLSTM framework, we compare it with six state-of-the-art methods, including the popular 3D-CNN model, on three widely used HSIs (i.e., Indian Pines, Pavia University, and Kennedy Space Center. The obtained results show that Bi-CLSTM can improve the classification performance by almost 1.5 % as compared to 3D-CNN.

  17. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations.

    Science.gov (United States)

    Torkzaban, Bahareh; Kayvanjoo, Amir Hossein; Ardalan, Arman; Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is overwhelmingly turning into a bottleneck for the effectiveness of large biological data. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined approach of data analysis using microsatellite marker data from our previous studies of olive populations using machine learning algorithms. Herein, 267 olive accessions of various origins including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two '4-targeted' and '16-targeted' experiments. A strategy of assaying different machine based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed microsatellite markers with the highest differentiating capacity and proved efficiency for our method of clustering olive accessions to reflect upon their regions of origin. A distinguished highlight of this study was the discovery of the best combination of markers for better differentiating of populations via machine learning models, which can be exploited to distinguish among other biological populations.

  18. Collaborative Working e-Learning Environments Supported by Rule-Based e-Tutor

    Directory of Open Access Journals (Sweden)

    Salaheddin Odeh

    2007-10-01

    Full Text Available Collaborative working environments for distance education sets a goal of convenience and an adaptation into our technologically advanced societies. To achieve this revolutionary new way of learning, environments must allow the different participants to communicate and coordinate with each other in a productive manner. Productivity and efficiency is obtained through synchronized communication between the different coordinating partners, which means that multiple users can execute an experiment simultaneously. Within this process, coordination can be accomplished by voice communication and chat tools. In recent times, multi-user environments have been successfully applied in many applications such as air traffic control systems, team-oriented military systems, chat text tools, and multi-player games. Thus, understanding the ideas and the techniques behind these systems can be of great significance regarding the contribution of newer ideas to collaborative working e-learning environments. However, many problems still exist in distance learning and tele-education, such as not finding the proper assistance while performing the remote experiment. Therefore, the students become overwhelmed and the experiment will fail. In this paper, we are going to discuss a solution that enables students to obtain an automated help by either a human tutor or a rule-based e-tutor (embedded rule-based system for the purpose of student support in complex remote experimentative environments. The technical implementation of the system can be realized by using the powerful Microsoft .NET, which offers a complete integrated developmental environment (IDE with a wide collection of products and technologies. Once the system is developed, groups of students are independently able to coordinate and to execute the experiment at any time and from any place, organizing the work between them positively.

  19. Prototype-based Models for the Supervised Learning of Classification Schemes

    Science.gov (United States)

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2017-06-01

    An introduction is given to the use of prototype-based models in supervised machine learning. The main concept of the framework is to represent previously observed data in terms of so-called prototypes, which reflect typical properties of the data. Together with a suitable, discriminative distance or dissimilarity measure, prototypes can be used for the classification of complex, possibly high-dimensional data. We illustrate the framework in terms of the popular Learning Vector Quantization (LVQ). Most frequently, standard Euclidean distance is employed as a distance measure. We discuss how LVQ can be equipped with more general dissimilarites. Moreover, we introduce relevance learning as a tool for the data-driven optimization of parameterized distances.

  20. Classification of older adults with/without a fall history using machine learning methods.

    Science.gov (United States)

    Lin Zhang; Ou Ma; Fabre, Jennifer M; Wood, Robert H; Garcia, Stephanie U; Ivey, Kayla M; McCann, Evan D

    2015-01-01

    Falling is a serious problem in an aged society such that assessment of the risk of falls for individuals is imperative for the research and practice of falls prevention. This paper introduces an application of several machine learning methods for training a classifier which is capable of classifying individual older adults into a high risk group and a low risk group (distinguished by whether or not the members of the group have a recent history of falls). Using a 3D motion capture system, significant gait features related to falls risk are extracted. By training these features, classification hypotheses are obtained based on machine learning techniques (K Nearest-neighbour, Naive Bayes, Logistic Regression, Neural Network, and Support Vector Machine). Training and test accuracies with sensitivity and specificity of each of these techniques are assessed. The feature adjustment and tuning of the machine learning algorithms are discussed. The outcome of the study will benefit the prediction and prevention of falls.

  1. Computer-aided classification of lung nodules on computed tomography images via deep learning technique

    Directory of Open Access Journals (Sweden)

    Hua KL

    2015-08-01

    Full Text Available Kai-Lung Hua,1 Che-Hao Hsu,1 Shintami Chusnul Hidayati,1 Wen-Huang Cheng,2 Yu-Jen Chen3 1Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2Research Center for Information Technology Innovation, Academia Sinica, 3Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan Abstract: Lung cancer has a poor prognosis when not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scan is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD scheme requires several image processing and pattern recognition steps to accomplish a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning of classification performance in a conventional CAD scheme is very complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of an automatic exploitation feature and tuning of performance in a seamless fashion. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods could achieve better discriminative results and hold promise in the CAD application domain. Keywords: nodule classification, deep learning, deep belief network, convolutional neural network

  2. Fast Gaussian kernel learning for classification tasks based on specially structured global optimization.

    Science.gov (United States)

    Zhong, Shangping; Chen, Tianshun; He, Fengying; Niu, Yuzhen

    2014-09-01

    For a practical pattern classification task solved by kernel methods, the computing time is mainly spent on kernel learning (or training). However, the current kernel learning approaches are based on local optimization techniques, and hard to have good time performances, especially for large datasets. Thus the existing algorithms cannot be easily extended to large-scale tasks. In this paper, we present a fast Gaussian kernel learning method by solving a specially structured global optimization (SSGO) problem. We optimize the Gaussian kernel function by using the formulated kernel target alignment criterion, which is a difference of increasing (d.i.) functions. Through using a power-transformation based convexification method, the objective criterion can be represented as a difference of convex (d.c.) functions with a fixed power-transformation parameter. And the objective programming problem can then be converted to a SSGO problem: globally minimizing a concave function over a convex set. The SSGO problem is classical and has good solvability. Thus, to find the global optimal solution efficiently, we can adopt the improved Hoffman's outer approximation method, which need not repeat the searching procedure with different starting points to locate the best local minimum. Also, the proposed method can be proven to converge to the global solution for any classification task. We evaluate the proposed method on twenty benchmark datasets, and compare it with four other Gaussian kernel learning methods. Experimental results show that the proposed method stably achieves both good time-efficiency performance and good classification performance. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Lesion classification using clinical and visual data fusion by multiple kernel learning

    Science.gov (United States)

    Kisilev, Pavel; Hashoul, Sharbell; Walach, Eugene; Tzadok, Asaf

    2014-03-01

    To overcome operator dependency and to increase diagnosis accuracy in breast ultrasound (US), a lot of effort has been devoted to developing computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Unfortunately, the efficacy of such CAD systems is limited since they rely on correct automatic lesions detection and localization, and on robustness of features computed based on the detected areas. In this paper we propose a new approach to boost the performance of a Machine Learning based CAD system, by combining visual and clinical data from patient files. We compute a set of visual features from breast ultrasound images, and construct the textual descriptor of patients by extracting relevant keywords from patients' clinical data files. We then use the Multiple Kernel Learning (MKL) framework to train SVM based classifier to discriminate between benign and malignant cases. We investigate different types of data fusion methods, namely, early, late, and intermediate (MKL-based) fusion. Our database consists of 408 patient cases, each containing US images, textual description of complaints and symptoms filled by physicians, and confirmed diagnoses. We show experimentally that the proposed MKL-based approach is superior to other classification methods. Even though the clinical data is very sparse and noisy, its MKL-based fusion with visual features yields significant improvement of the classification accuracy, as compared to the image features only based classifier.

  4. Multi-Modal Curriculum Learning for Semi-Supervised Image Classification.

    Science.gov (United States)

    Gong, Chen; Tao, Dacheng; Maybank, Stephen J; Liu, Wei; Kang, Guoliang; Yang, Jie

    2016-07-01

    Semi-supervised image classification aims to classify a large quantity of unlabeled images by typically harnessing scarce labeled images. Existing semi-supervised methods often suffer from inadequate classification accuracy when encountering difficult yet critical images, such as outliers, because they treat all unlabeled images equally and conduct classifications in an imperfectly ordered sequence. In this paper, we employ the curriculum learning methodology by investigating the difficulty of classifying every unlabeled image. The reliability and the discriminability of these unlabeled images are particularly investigated for evaluating their difficulty. As a result, an optimized image sequence is generated during the iterative propagations, and the unlabeled images are logically classified from simple to difficult. Furthermore, since images are usually characterized by multiple visual feature descriptors, we associate each kind of features with a teacher, and design a multi-modal curriculum learning (MMCL) strategy to integrate the information from different feature modalities. In each propagation, each teacher analyzes the difficulties of the currently unlabeled images from its own modality viewpoint. A consensus is subsequently reached among all the teachers, determining the currently simplest images (i.e., a curriculum), which are to be reliably classified by the multi-modal learner. This well-organized propagation process leveraging multiple teachers and one learner enables our MMCL to outperform five state-of-the-art methods on eight popular image data sets.

  5. Linear Subpixel Learning Algorithm for Land Cover Classification from WELD using High Performance Computing

    Science.gov (United States)

    Ganguly, S.; Kumar, U.; Nemani, R. R.; Kalia, S.; Michaelis, A.

    2017-12-01

    In this work, we use a Fully Constrained Least Squares Subpixel Learning Algorithm to unmix global WELD (Web Enabled Landsat Data) to obtain fractions or abundances of substrate (S), vegetation (V) and dark objects (D) classes. Because of the sheer nature of data and compute needs, we leveraged the NASA Earth Exchange (NEX) high performance computing architecture to optimize and scale our algorithm for large-scale processing. Subsequently, the S-V-D abundance maps were characterized into 4 classes namely, forest, farmland, water and urban areas (with NPP-VIIRS - national polar orbiting partnership visible infrared imaging radiometer suite nighttime lights data) over California, USA using Random Forest classifier. Validation of these land cover maps with NLCD (National Land Cover Database) 2011 products and NAFD (North American Forest Dynamics) static forest cover maps showed that an overall classification accuracy of over 91% was achieved, which is a 6% improvement in unmixing based classification relative to per-pixel based classification. As such, abundance maps continue to offer an useful alternative to high-spatial resolution data derived classification maps for forest inventory analysis, multi-class mapping for eco-climatic models and applications, fast multi-temporal trend analysis and for societal and policy-relevant applications needed at the watershed scale.

  6. Hyperspectral Image Enhancement and Mixture Deep-Learning Classification of Corneal Epithelium Injuries.

    Science.gov (United States)

    Noor, Siti Salwa Md; Michael, Kaleena; Marshall, Stephen; Ren, Jinchang

    2017-11-16

    In our preliminary study, the reflectance signatures obtained from hyperspectral imaging (HSI) of normal and abnormal corneal epithelium tissues of porcine show similar morphology with subtle differences. Here we present image enhancement algorithms that can be used to improve the interpretability of data into clinically relevant information to facilitate diagnostics. A total of 25 corneal epithelium images without the application of eye staining were used. Three image feature extraction approaches were applied for image classification: (i) image feature classification from histogram using a support vector machine with a Gaussian radial basis function (SVM-GRBF); (ii) physical image feature classification using deep-learning Convolutional Neural Networks (CNNs) only; and (iii) the combined classification of CNNs and SVM-Linear. The performance results indicate that our chosen image features from the histogram and length-scale parameter were able to classify with up to 100% accuracy; particularly, at CNNs and CNNs-SVM, by employing 80% of the data sample for training and 20% for testing. Thus, in the assessment of corneal epithelium injuries, HSI has high potential as a method that could surpass current technologies regarding speed, objectivity, and reliability.

  7. Deep-learning-based classification of FDG-PET data for Alzheimer's disease categories

    Science.gov (United States)

    Singh, Shibani; Srivastava, Anant; Mi, Liang; Caselli, Richard J.; Chen, Kewei; Goradia, Dhruman; Reiman, Eric M.; Wang, Yalin

    2017-11-01

    Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even on presymptomatic Alzheimer's disease (AD) patients. PET scans provide functional information that is unique and unavailable using other types of imaging. However, the computational efficacy of FDG-PET data alone, for the classification of various Alzheimers Diagnostic categories, has not been well studied. This motivates us to correctly discriminate various AD Diagnostic categories using FDG-PET data. Deep learning has improved state-of-the-art classification accuracies in the areas of speech, signal, image, video, text mining and recognition. We propose novel methods that involve probabilistic principal component analysis on max-pooled data and mean-pooled data for dimensionality reduction, and multilayer feed forward neural network which performs binary classification. Our experimental dataset consists of baseline data of subjects including 186 cognitively unimpaired (CU) subjects, 336 mild cognitive impairment (MCI) subjects with 158 Late MCI and 178 Early MCI, and 146 AD patients from Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We measured F1-measure, precision, recall, negative and positive predictive values with a 10-fold cross validation scheme. Our results indicate that our designed classifiers achieve competitive results while max pooling achieves better classification performance compared to mean-pooled features. Our deep model based research may advance FDG-PET analysis by demonstrating their potential as an effective imaging biomarker of AD.

  8. Image-based deep learning for classification of noise transients in gravitational wave detectors

    Science.gov (United States)

    Razzano, Massimiliano; Cuoco, Elena

    2018-05-01

    The detection of gravitational waves has inaugurated the era of gravitational astronomy and opened new avenues for the multimessenger study of cosmic sources. Thanks to their sensitivity, the Advanced LIGO and Advanced Virgo interferometers will probe a much larger volume of space and expand the capability of discovering new gravitational wave emitters. The characterization of these detectors is a primary task in order to recognize the main sources of noise and optimize the sensitivity of interferometers. Glitches are transient noise events that can impact the data quality of the interferometers and their classification is an important task for detector characterization. Deep learning techniques are a promising tool for the recognition and classification of glitches. We present a classification pipeline that exploits convolutional neural networks to classify glitches starting from their time-frequency evolution represented as images. We evaluated the classification accuracy on simulated glitches, showing that the proposed algorithm can automatically classify glitches on very fast timescales and with high accuracy, thus providing a promising tool for online detector characterization.

  9. Automated classification of tropical shrub species: a hybrid of leaf shape and machine learning approach.

    Science.gov (United States)

    Murat, Miraemiliana; Chang, Siow-Wee; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai

    2017-01-01

    Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson's coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99

  10. Evaluation of Semi-supervised Learning for Classification of Protein Crystallization Imagery.

    Science.gov (United States)

    Sigdel, Madhav; Dinç, İmren; Dinç, Semih; Sigdel, Madhu S; Pusey, Marc L; Aygün, Ramazan S

    2014-03-01

    In this paper, we investigate the performance of two wrapper methods for semi-supervised learning algorithms for classification of protein crystallization images with limited labeled images. Firstly, we evaluate the performance of semi-supervised approach using self-training with naïve Bayesian (NB) and sequential minimum optimization (SMO) as the base classifiers. The confidence values returned by these classifiers are used to select high confident predictions to be used for self-training. Secondly, we analyze the performance of Yet Another Two Stage Idea (YATSI) semi-supervised learning using NB, SMO, multilayer perceptron (MLP), J48 and random forest (RF) classifiers. These results are compared with the basic supervised learning using the same training sets. We perform our experiments on a dataset consisting of 2250 protein crystallization images for different proportions of training and test data. Our results indicate that NB and SMO using both self-training and YATSI semi-supervised approaches improve accuracies with respect to supervised learning. On the other hand, MLP, J48 and RF perform better using basic supervised learning. Overall, random forest classifier yields the best accuracy with supervised learning for our dataset.

  11. Classification of Learning Styles in Virtual Learning Environment Using J48 Decision Tree

    Science.gov (United States)

    Maaliw, Renato R. III; Ballera, Melvin A.

    2017-01-01

    The usage of data mining has dramatically increased over the past few years and the education sector is leveraging this field in order to analyze and gain intuitive knowledge in terms of the vast accumulated data within its confines. The primary objective of this study is to compare the results of different classification techniques such as Naïve…

  12. Active learning strategies for the deduplication of electronic patient data using classification trees.

    Science.gov (United States)

    Sariyar, M; Borg, A; Pommerening, K

    2012-10-01

    Supervised record linkage methods often require a clerical review to gain informative training data. Active learning means to actively prompt the user to label data with special characteristics in order to minimise the review costs. We conducted an empirical evaluation to investigate whether a simple active learning strategy using binary comparison patterns is sufficient or if string metrics together with a more sophisticated algorithm are necessary to achieve high accuracies with a small training set. Based on medical registry data with different numbers of attributes, we used active learning to acquire training sets for classification trees, which were then used to classify the remaining data. Active learning for binary patterns means that every distinct comparison pattern represents a stratum from which one item is sampled. Active learning for patterns consisting of the Levenshtein string metric values uses an iterative process where the most informative and representative examples are added to the training set. In this context, we extended the active learning strategy by Sarawagi and Bhamidipaty (2002). On the original data set, active learning based on binary comparison patterns leads to the best results. When dropping four or six attributes, using string metrics leads to better results. In both cases, not more than 200 manually reviewed training examples are necessary. In record linkage applications where only forename, name and birthday are available as attributes, we suggest the sophisticated active learning strategy based on string metrics in order to achieve highly accurate results. We recommend the simple strategy if more attributes are available, as in our study. In both cases, active learning significantly reduces the amount of manual involvement in training data selection compared to usual record linkage settings. Copyright © 2012 Elsevier Inc. All rights reserved.

  13. Accurate Classification of Protein Subcellular Localization from High-Throughput Microscopy Images Using Deep Learning

    Directory of Open Access Journals (Sweden)

    Tanel Pärnamaa

    2017-05-01

    Full Text Available High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment where a fluorescently-tagged protein resides, a task relatively simple for an experienced human, but difficult to automate on a computer. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving per cell localization classification accuracy of 91%, and per protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date, and demonstrate the usefulness of deep learning for high-throughput microscopy.

  14. Musical Instrument Classification Based on Nonlinear Recurrence Analysis and Supervised Learning

    Directory of Open Access Journals (Sweden)

    R.Rui

    2013-04-01

    Full Text Available In this paper, the phase space reconstruction of time series produced by different instruments is discussed based on the nonlinear dynamic theory. The dense ratio, a novel quantitative recurrence parameter, is proposed to describe the difference of wind instruments, stringed instruments and keyboard instruments in the phase space by analyzing the recursive property of every instrument. Furthermore, a novel supervised learning algorithm for automatic classification of individual musical instrument signals is addressed deriving from the idea of supervised non-negative matrix factorization (NMF algorithm. In our approach, the orthogonal basis matrix could be obtained without updating the matrix iteratively, which NMF is unable to do. The experimental results indicate that the accuracy of the proposed method is improved by 3% comparing with the conventional features in the individual instrument classification.

  15. Accurate Classification of Protein Subcellular Localization from High-Throughput Microscopy Images Using Deep Learning.

    Science.gov (United States)

    Pärnamaa, Tanel; Parts, Leopold

    2017-05-05

    High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment where a fluorescently-tagged protein resides, a task relatively simple for an experienced human, but difficult to automate on a computer. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving per cell localization classification accuracy of 91%, and per protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date, and demonstrate the usefulness of deep learning for high-throughput microscopy. Copyright © 2017 Parnamaa and Parts.

  16. Image Classification, Deep Learning and Convolutional Neural Networks : A Comparative Study of Machine Learning Frameworks

    OpenAIRE

    Airola, Rasmus; Hager, Kristoffer

    2017-01-01

    The use of machine learning and specifically neural networks is a growing trend in software development, and has grown immensely in the last couple of years in the light of an increasing need to handle big data and large information flows. Machine learning has a broad area of application, such as human-computer interaction, predicting stock prices, real-time translation, and self driving vehicles. Large companies such as Microsoft and Google have already implemented machine learning in some o...

  17. LTD windows of the STDP learning rule and synaptic connections having a large transmission delay enable robust sequence learning amid background noise.

    Science.gov (United States)

    Hayashi, Hatsuo; Igarashi, Jun

    2009-06-01

    Spike-timing-dependent synaptic plasticity (STDP) is a simple and effective learning rule for sequence learning. However, synapses being subject to STDP rules are readily influenced in noisy circumstances because synaptic conductances are modified by pre- and postsynaptic spikes elicited within a few tens of milliseconds, regardless of whether those spikes convey information or not. Noisy firing existing everywhere in the brain may induce irrelevant enhancement of synaptic connections through STDP rules and would result in uncertain memory encoding and obscure memory patterns. We will here show that the LTD windows of the STDP rules enable robust sequence learning amid background noise in cooperation with a large signal transmission delay between neurons and a theta rhythm, using a network model of the entorhinal cortex layer II with entorhinal-hippocampal loop connections. The important element of the present model for robust sequence learning amid background noise is the symmetric STDP rule having LTD windows on both sides of the LTP window, in addition to the loop connections having a large signal transmission delay and the theta rhythm pacing activities of stellate cells. Above all, the LTD window in the range of positive spike-timing is important to prevent influences of noise with the progress of sequence learning.

  18. Neuropsychological Test Selection for Cognitive Impairment Classification: A Machine Learning Approach

    Science.gov (United States)

    Williams, Jennifer A.; Schmitter-Edgecombe, Maureen; Cook, Diane J.

    2016-01-01

    Introduction Reducing the amount of testing required to accurately detect cognitive impairment is clinically relevant. The aim of this research was to determine the fewest number of clinical measures required to accurately classify participants as healthy older adult, mild cognitive impairment (MCI) or dementia using a suite of classification techniques. Methods Two variable selection machine learning models (i.e., naive Bayes, decision tree), a logistic regression, and two participant datasets (i.e., clinical diagnosis, clinical dementia rating; CDR) were explored. Participants classified using clinical diagnosis criteria included 52 individuals with dementia, 97 with MCI, and 161 cognitively healthy older adults. Participants classified using CDR included 154 individuals CDR = 0, 93 individuals with CDR = 0.5, and 25 individuals with CDR = 1.0+. Twenty-seven demographic, psychological, and neuropsychological variables were available for variable selection. Results No significant difference was observed between naive Bayes, decision tree, and logistic regression models for classification of both clinical diagnosis and CDR datasets. Participant classification (70.0 – 99.1%), geometric mean (60.9 – 98.1%), sensitivity (44.2 – 100%), and specificity (52.7 – 100%) were generally satisfactory. Unsurprisingly, the MCI/CDR = 0.5 participant group was the most challenging to classify. Through variable selection only 2 – 9 variables were required for classification and varied between datasets in a clinically meaningful way. Conclusions The current study results reveal that machine learning techniques can accurately classifying cognitive impairment and reduce the number of measures required for diagnosis. PMID:26332171

  19. Classification of follicular lymphoma images: a holistic approach with symbol-based machine learning methods.

    Science.gov (United States)

    Zorman, Milan; Sánchez de la Rosa, José Luis; Dinevski, Dejan

    2011-12-01

    It is not very often to see a symbol-based machine learning approach to be used for the purpose of image classification and recognition. In this paper we will present such an approach, which we first used on the follicular lymphoma images. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system. Lymphoma is differentiated by the type of cell that multiplies and how the cancer presents itself. It is very important to get an exact diagnosis regarding lymphoma and to determine the treatments that will be most effective for the patient's condition. Our work was focused on the identification of lymphomas by finding follicles in microscopy images provided by the Laboratory of Pathology in the University Hospital of Tenerife, Spain. We divided our work in two stages: in the first stage we did image pre-processing and feature extraction, and in the second stage we used different symbolic machine learning approaches for pixel classification. Symbolic machine learning approaches are often neglected when looking for image analysis tools. They are not only known for a very appropriate knowledge representation, but also claimed to lack computational power. The results we got are very promising and show that symbolic approaches can be successful in image analysis applications.

  20. A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification.

    Science.gov (United States)

    Peikari, Mohammad; Salama, Sherine; Nofech-Mozes, Sharon; Martel, Anne L

    2018-05-08

    Completely labeled pathology datasets are often challenging and time-consuming to obtain. Semi-supervised learning (SSL) methods are able to learn from fewer labeled data points with the help of a large number of unlabeled data points. In this paper, we investigated the possibility of using clustering analysis to identify the underlying structure of the data space for SSL. A cluster-then-label method was proposed to identify high-density regions in the data space which were then used to help a supervised SVM in finding the decision boundary. We have compared our method with other supervised and semi-supervised state-of-the-art techniques using two different classification tasks applied to breast pathology datasets. We found that compared with other state-of-the-art supervised and semi-supervised methods, our SSL method is able to improve classification performance when a limited number of labeled data instances are made available. We also showed that it is important to examine the underlying distribution of the data space before applying SSL techniques to ensure semi-supervised learning assumptions are not violated by the data.

  1. Galaxy And Mass Assembly: automatic morphological classification of galaxies using statistical learning

    Science.gov (United States)

    Sreejith, Sreevarsha; Pereverzyev, Sergiy, Jr.; Kelvin, Lee S.; Marleau, Francine R.; Haltmeier, Markus; Ebner, Judith; Bland-Hawthorn, Joss; Driver, Simon P.; Graham, Alister W.; Holwerda, Benne W.; Hopkins, Andrew M.; Liske, Jochen; Loveday, Jon; Moffett, Amanda J.; Pimbblet, Kevin A.; Taylor, Edward N.; Wang, Lingyu; Wright, Angus H.

    2018-03-01

    We apply four statistical learning methods to a sample of 7941 galaxies (z test the feasibility of using automated algorithms to classify galaxies. Using 10 features measured for each galaxy (sizes, colours, shape parameters, and stellar mass), we apply the techniques of Support Vector Machines, Classification Trees, Classification Trees with Random Forest (CTRF) and Neural Networks, and returning True Prediction Ratios (TPRs) of 75.8 per cent, 69.0 per cent, 76.2 per cent, and 76.0 per cent, respectively. Those occasions whereby all four algorithms agree with each other yet disagree with the visual classification (`unanimous disagreement') serves as a potential indicator of human error in classification, occurring in ˜ 9 per cent of ellipticals, ˜ 9 per cent of little blue spheroids, ˜ 14 per cent of early-type spirals, ˜ 21 per cent of intermediate-type spirals, and ˜ 4 per cent of late-type spirals and irregulars. We observe that the choice of parameters rather than that of algorithms is more crucial in determining classification accuracy. Due to its simplicity in formulation and implementation, we recommend the CTRF algorithm for classifying future galaxy data sets. Adopting the CTRF algorithm, the TPRs of the five galaxy types are : E, 70.1 per cent; LBS, 75.6 per cent; S0-Sa, 63.6 per cent; Sab-Scd, 56.4 per cent, and Sd-Irr, 88.9 per cent. Further, we train a binary classifier using this CTRF algorithm that divides galaxies into spheroid-dominated (E, LBS, and S0-Sa) and disc-dominated (Sab-Scd and Sd-Irr), achieving an overall accuracy of 89.8 per cent. This translates into an accuracy of 84.9 per cent for spheroid-dominated systems and 92.5 per cent for disc-dominated systems.

  2. CF-integrals: A new family of pre-aggregation functions with application to fuzzy rule-based classification systems

    Czech Academy of Sciences Publication Activity Database

    Lucca, G.; Sanz, A.; Dimuro, G. P.; Bedregal, B.; Bustince, H.; Mesiar, Radko

    2018-01-01

    Roč. 435, č. 1 (2018), s. 94-110 ISSN 0020-0255 Institutional support: RVO:67985556 Keywords : CF-integral * classification problems * pre-aggregation Subject RIV: BA - General Mathematics Impact factor: 4.832, year: 2016 http://library.utia.cas.cz/separaty/2018/E/mesiar-0485363.pdf

  3. Collective Machine Learning: Team Learning and Classification in Multi-Agent Systems

    Science.gov (United States)

    Gifford, Christopher M.

    2009-01-01

    This dissertation focuses on the collaboration of multiple heterogeneous, intelligent agents (hardware or software) which collaborate to learn a task and are capable of sharing knowledge. The concept of collaborative learning in multi-agent and multi-robot systems is largely under studied, and represents an area where further research is needed to…

  4. Using methods from the data mining and machine learning literature for disease classification and prediction: A case study examining classification of heart failure sub-types

    Science.gov (United States)

    Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.

    2014-01-01

    Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592

  5. Increasing the thermal stability of cellulase C using rules learned from thermophilic proteins: a pilot study.

    Science.gov (United States)

    Németh, Attila; Kamondi, Szilárd; Szilágyi, András; Magyar, Csaba; Kovári, Zoltán; Závodszky, Péter

    2002-05-02

    Some structural features underlying the increased thermostability of enzymes from thermophilic organisms relative to their homologues from mesophiles are known from earlier studies. We used cellulase C from Clostridium thermocellum to test whether thermostability can be increased by mutations designed using rules learned from thermophilic proteins. Cellulase C has a TIM barrel fold with an additional helical subdomain. We designed and produced a number of mutants with the aim to increase its thermostability. Five mutants were designed to create new electrostatic interactions. They all retained catalytic activity but exhibited decreased thermostability relative to the wild-type enzyme. Here, the stabilizing contributions are obviously smaller than the destabilization caused by the introduction of the new side chains. In another mutant, the small helical subdomain was deleted. This mutant lost activity but its melting point was only 3 degrees C lower than that of the wild-type enzyme, which suggests that the subdomain is an independent folding unit and is important for catalytic function. A double mutant was designed to introduce a new disulfide bridge into the enzyme. This mutant is active and has an increased stability (deltaT(m)=3 degrees C, delta(deltaG(u))=1.73 kcal/mol) relative to the wild-type enzyme. Reduction of the disulfide bridge results in destabilization and an altered thermal denaturation behavior. We conclude that rules learned from thermophilic proteins cannot be used in a straightforward way to increase the thermostability of a protein. Creating a crosslink such as a disulfide bond is a relatively sure-fire method but the stabilization may be smaller than calculated due to coupled destabilizing effects.

  6. Harnessing user generated multimedia content in the creation of collaborative classification structures and retrieval learning games

    Science.gov (United States)

    Borchert, Otto Jerome

    This paper describes a software tool to assist groups of people in the classification and identification of real world objects called the Classification, Identification, and Retrieval-based Collaborative Learning Environment (CIRCLE). A thorough literature review identified current pedagogical theories that were synthesized into a series of five tasks: gathering, elaboration, classification, identification, and reinforcement through game play. This approach is detailed as part of an included peer reviewed paper. Motivation is increased through the use of formative and summative gamification; getting points completing important portions of the tasks and playing retrieval learning based games, respectively, which is also included as a peer-reviewed conference proceedings paper. Collaboration is integrated into the experience through specific tasks and communication mediums. Implementation focused on a REST-based client-server architecture. The client is a series of web-based interfaces to complete each of the tasks, support formal classroom interaction through faculty accounts and student tracking, and a module for peers to help each other. The server, developed using an in-house JavaMOO platform, stores relevant project data and serves data through a series of messages implemented as a JavaScript Object Notation Application Programming Interface (JSON API). Through a series of two beta tests and two experiments, it was discovered the second, elaboration, task requires considerable support. While students were able to properly suggest experiments and make observations, the subtask involving cleaning the data for use in CIRCLE required extra support. When supplied with more structured data, students were enthusiastic about the classification and identification tasks, showing marked improvement in usability scores and in open ended survey responses. CIRCLE tracks a variety of educationally relevant variables, facilitating support for instructors and researchers. Future

  7. Application of Machine Learning to Proteomics Data: Classification and Biomarker Identification in Postgenomics Biology

    Science.gov (United States)

    Swan, Anna Louise; Mobasheri, Ali; Allaway, David; Liddell, Susan

    2013-01-01

    Abstract Mass spectrometry is an analytical technique for the characterization of biological samples and is increasingly used in omics studies because of its targeted, nontargeted, and high throughput abilities. However, due to the large datasets generated, it requires informatics approaches such as machine learning techniques to analyze and interpret relevant data. Machine learning can be applied to MS-derived proteomics data in two ways. First, directly to mass spectral peaks and second, to proteins identified by sequence database searching, although relative protein quantification is required for the latter. Machine learning has been applied to mass spectrometry data from different biological disciplines, particularly for various cancers. The aims of such investigations have been to identify biomarkers and to aid in diagnosis, prognosis, and treatment of specific diseases. This review describes how machine learning has been applied to proteomics tandem mass spectrometry data. This includes how it can be used to identify proteins suitable for use as biomarkers of disease and for classification of samples into disease or treatment groups, which may be applicable for diagnostics. It also includes the challenges faced by such investigations, such as prediction of proteins present, protein quantification, planning for the use of machine learning, and small sample sizes. PMID:24116388

  8. Research into Financial Position of Listed Companies following Classification via Extreme Learning Machine Based upon DE Optimization

    OpenAIRE

    Fu Yu; Mu Jiong; Duan Xu Liang

    2016-01-01

    By means of the model of extreme learning machine based upon DE optimization, this article particularly centers on the optimization thinking of such a model as well as its application effect in the field of listed company’s financial position classification. It proves that the improved extreme learning machine algorithm based upon DE optimization eclipses the traditional extreme learning machine algorithm following comparison. Meanwhile, this article also intends to introduce certain research...

  9. Transcranial infrared laser stimulation improves rule-based, but not information-integration, category learning in humans.

    Science.gov (United States)

    Blanco, Nathaniel J; Saucedo, Celeste L; Gonzalez-Lima, F

    2017-03-01

    This is the first randomized, controlled study comparing the cognitive effects of transcranial laser stimulation on category learning tasks. Transcranial infrared laser stimulation is a new non-invasive form of brain stimulation that shows promise for wide-ranging experimental and neuropsychological applications. It involves using infrared laser to enhance cerebral oxygenation and energy metabolism through upregulation of the respiratory enzyme cytochrome oxidase, the primary infrared photon acceptor in cells. Previous research found that transcranial infrared laser stimulation aimed at the prefrontal cortex can improve sustained attention, short-term memory, and executive function. In this study, we directly investigated the influence of transcranial infrared laser stimulation on two neurobiologically dissociable systems of category learning: a prefrontal cortex mediated reflective system that learns categories using explicit rules, and a striatally mediated reflexive learning system that forms gradual stimulus-response associations. Participants (n=118) received either active infrared laser to the lateral prefrontal cortex or sham (placebo) stimulation, and then learned one of two category structures-a rule-based structure optimally learned by the reflective system, or an information-integration structure optimally learned by the reflexive system. We found that prefrontal rule-based learning was substantially improved following transcranial infrared laser stimulation as compared to placebo (treatment X block interaction: F(1, 298)=5.117, p=0.024), while information-integration learning did not show significant group differences (treatment X block interaction: F(1, 288)=1.633, p=0.202). These results highlight the exciting potential of transcranial infrared laser stimulation for cognitive enhancement and provide insight into the neurobiological underpinnings of category learning. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Comparison of Heuristics for Inhibitory Rule Optimization

    KAUST Repository

    Alsolami, Fawaz

    2014-09-13

    Knowledge representation and extraction are very important tasks in data mining. In this work, we proposed a variety of rule-based greedy algorithms that able to obtain knowledge contained in a given dataset as a series of inhibitory rules containing an expression “attribute ≠ value” on the right-hand side. The main goal of this paper is to determine based on rule characteristics, rule length and coverage, whether the proposed rule heuristics are statistically significantly different or not; if so, we aim to identify the best performing rule heuristics for minimization of rule length and maximization of rule coverage. Friedman test with Nemenyi post-hoc are used to compare the greedy algorithms statistically against each other for length and coverage. The experiments are carried out on real datasets from UCI Machine Learning Repository. For leading heuristics, the constructed rules are compared with optimal ones obtained based on dynamic programming approach. The results seem to be promising for the best heuristics: the average relative difference between length (coverage) of constructed and optimal rules is at most 2.27% (7%, respectively). Furthermore, the quality of classifiers based on sets of inhibitory rules constructed by the considered heuristics are compared against each other, and the results show that the three best heuristics from the point of view classification accuracy coincides with the three well-performed heuristics from the point of view of rule length minimization.

  11. Machine learning in infrared object classification - an all-sky selection of YSO candidates

    Science.gov (United States)

    Marton, Gabor; Zahorecz, Sarolta; Toth, L. Viktor; Magnus McGehee, Peregrine; Kun, Maria

    2015-08-01

    Object classification is a fundamental and challenging problem in the era of big data. I will discuss up-to-date methods and their application to classify infrared point sources.We analysed the ALLWISE catalogue, the most recent public source catalogue of the Wide-field Infrared Survey Explorer (WISE) to compile a reliable list of Young Stellar Object (YSO) candidates. We tested and compared classical and up-to-date statistical methods as well, to discriminate source types like extragalactic objects, evolved stars, main sequence stars, objects related to the interstellar medium and YSO candidates by using their mid-IR WISE properties and associated near-IR 2MASS data.In the particular classification problem the Support Vector Machines (SVM), a class of supervised learning algorithm turned out to be the best tool. As a result we classify Class I and II YSOs with >90% accuracy while the fraction of contaminating extragalactic objects remains well below 1%, based on the number of known objects listed in the SIMBAD and VizieR databases. We compare our results to other classification schemes from the literature and show that the SVM outperforms methods that apply linear cuts on the colour-colour and colour-magnitude space. Our homogenous YSO candidate catalog can serve as an excellent pathfinder for future detailed observations of individual objects and a starting point of statistical studies that aim to add pieces to the big picture of star formation theory.

  12. Machine learning algorithms for meteorological event classification in the coastal area using in-situ data

    Science.gov (United States)

    Sokolov, Anton; Gengembre, Cyril; Dmitriev, Egor; Delbarre, Hervé

    2017-04-01

    The problem is considered of classification of local atmospheric meteorological events in the coastal area such as sea breezes, fogs and storms. The in-situ meteorological data as wind speed and direction, temperature, humidity and turbulence are used as predictors. Local atmospheric events of 2013-2014 were analysed manually to train classification algorithms in the coastal area of English Channel in Dunkirk (France). Then, ultrasonic anemometer data and LIDAR wind profiler data were used as predictors. A few algorithms were applied to determine meteorological events by local data such as a decision tree, the nearest neighbour classifier, a support vector machine. The comparison of classification algorithms was carried out, the most important predictors for each event type were determined. It was shown that in more than 80 percent of the cases machine learning algorithms detect the meteorological class correctly. We expect that this methodology could be applied also to classify events by climatological in-situ data or by modelling data. It allows estimating frequencies of each event in perspective of climate change.

  13. Micro-Doppler Based Classification of Human Aquatic Activities via Transfer Learning of Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Jinhee Park

    2016-11-01

    Full Text Available Accurate classification of human aquatic activities using radar has a variety of potential applications such as rescue operations and border patrols. Nevertheless, the classification of activities on water using radar has not been extensively studied, unlike the case on dry ground, due to its unique challenge. Namely, not only is the radar cross section of a human on water small, but the micro-Doppler signatures are much noisier due to water drops and waves. In this paper, we first investigate whether discriminative signatures could be obtained for activities on water through a simulation study. Then, we show how we can effectively achieve high classification accuracy by applying deep convolutional neural networks (DCNN directly to the spectrogram of real measurement data. From the five-fold cross-validation on our dataset, which consists of five aquatic activities, we report that the conventional feature-based scheme only achieves an accuracy of 45.1%. In contrast, the DCNN trained using only the collected data attains 66.7%, and the transfer learned DCNN, which takes a DCNN pre-trained on a RGB image dataset and fine-tunes the parameters using the collected data, achieves a much higher 80.3%, which is a significant performance boost.

  14. Feature Selection for Motor Imagery EEG Classification Based on Firefly Algorithm and Learning Automata

    Directory of Open Access Journals (Sweden)

    Aiming Liu

    2017-11-01

    Full Text Available Motor Imagery (MI electroencephalography (EEG is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP and local characteristic-scale decomposition (LCD algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA classifier. Both the fourth brain–computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain–computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain–computer interface systems.

  15. Seeing is believing: video classification for computed tomographic colonography using multiple-instance learning.

    Science.gov (United States)

    Wang, Shijun; McKenna, Matthew T; Nguyen, Tan B; Burns, Joseph E; Petrick, Nicholas; Sahiner, Berkman; Summers, Ronald M

    2012-05-01

    In this paper, we present development and testing results for a novel colonic polyp classification method for use as part of a computed tomographic colonography (CTC) computer-aided detection (CAD) system. Inspired by the interpretative methodology of radiologists using 3-D fly-through mode in CTC reading, we have developed an algorithm which utilizes sequences of images (referred to here as videos) for classification of CAD marks. For each CAD mark, we created a video composed of a series of intraluminal, volume-rendered images visualizing the detection from multiple viewpoints. We then framed the video classification question as a multiple-instance learning (MIL) problem. Since a positive (negative) bag may contain negative (positive) instances, which in our case depends on the viewing angles and camera distance to the target, we developed a novel MIL paradigm to accommodate this class of problems. We solved the new MIL problem by maximizing a L2-norm soft margin using semidefinite programming, which can optimize relevant parameters automatically. We tested our method by analyzing a CTC data set obtained from 50 patients from three medical centers. Our proposed method showed significantly better performance compared with several traditional MIL methods.

  16. Gene selection for microarray data classification via subspace learning and manifold regularization.

    Science.gov (United States)

    Tang, Chang; Cao, Lijuan; Zheng, Xiao; Wang, Minhui

    2017-12-19

    With the rapid development of DNA microarray technology, large amount of genomic data has been generated. Classification of these microarray data is a challenge task since gene expression data are often with thousands of genes but a small number of samples. In this paper, an effective gene selection method is proposed to select the best subset of genes for microarray data with the irrelevant and redundant genes removed. Compared with original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high dimensional microarray data into a lower dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator of different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better when compared with other state-of-the-art methods in terms of microarray data classification. Graphical Abstract The graphical abstract of this work.

  17. Feature Selection for Motor Imagery EEG Classification Based on Firefly Algorithm and Learning Automata.

    Science.gov (United States)

    Liu, Aiming; Chen, Kun; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi

    2017-11-08

    Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain-computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain-computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain-computer interface systems.

  18. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Directory of Open Access Journals (Sweden)

    Dawyndt Peter

    2010-01-01

    Full Text Available Abstract Background Machine learning techniques have shown to improve bacterial species classification based on fatty acid methyl ester (FAME data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure allows to easily evaluate and visualize the

  19. From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

    Science.gov (United States)

    Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard

    2010-01-30

    Machine learning techniques have shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure allows to easily evaluate and visualize the resolution of FAME data for the discrimination of bacterial

  20. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Science.gov (United States)

    2010-01-01

    Background Machine learning techniques have shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure allows to easily evaluate and visualize the resolution of FAME data for

  1. Classification of two steroids, prostanozol and methasterone, as Schedule III anabolic steroids under the Controlled Substance Act. Final rule.

    Science.gov (United States)

    2012-07-30

    With the issuance of this Final Rule, the Administrator of the DEA classifies the following two steroids as "anabolic steroids'' under the Controlled Substances Act (CSA): prostanozol (17[beta]-hydroxy-5[alpha]-androstano[3,2-c]pyrazole) and methasterone (2[alpha],17[alpha]-dimethyl-5[alpha]-androstan-17[beta]-ol-3-one). These steroids and their salts, esters, and ethers are Schedule III controlled substances subject to the regulatory control provisions of the CSA.

  2. Rule-bases construction through self-learning for a table-based Sugeno-Takagi fuzzy logic control system

    Directory of Open Access Journals (Sweden)

    C. Boldisor

    2009-12-01

    Full Text Available A self-learning based methodology for building the rule-base of a fuzzy logic controller (FLC is presented and verified, aiming to engage intelligent characteristics to a fuzzy logic control systems. The methodology is a simplified version of those presented in today literature. Some aspects are intentionally ignored since it rarely appears in control system engineering and a SISO process is considered here. The fuzzy inference system obtained is a table-based Sugeno-Takagi type. System’s desired performance is defined by a reference model and rules are extracted from recorded data, after the correct control actions are learned. The presented algorithm is tested in constructing the rule-base of a fuzzy controller for a DC drive application. System’s performances and method’s viability are analyzed.

  3. Feature learning and change feature classification based on deep learning for ternary change detection in SAR images

    Science.gov (United States)

    Gong, Maoguo; Yang, Hailun; Zhang, Puzhao

    2017-07-01

    Ternary change detection aims to detect changes and group the changes into positive change and negative change. It is of great significance in the joint interpretation of spatial-temporal synthetic aperture radar images. In this study, sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve ternary change detection problem without any supervison. Firstly, sparse autoencoder is used to transform log-ratio difference image into a suitable feature space for extracting key changes and suppressing outliers and noise. And then the learned features are clustered into three classes, which are taken as the pseudo labels for training a CNN model as change feature classifier. The reliable training samples for CNN are selected from the feature maps learned by sparse autoencoder with certain selection rules. Having training samples and the corresponding pseudo labels, the CNN model can be trained by using back propagation with stochastic gradient descent. During its training procedure, CNN is driven to learn the concept of change, and more powerful model is established to distinguish different types of changes. Unlike the traditional methods, the proposed framework integrates the merits of sparse autoencoder and CNN to learn more robust difference representations and the concept of change for ternary change detection. Experimental results on real datasets validate the effectiveness and superiority of the proposed framework.

  4. Rule-Governed and Contingency-Shaped Behavior of Learning-Disabled, Hyperactive, and Nonselected Elementary School Children.

    Science.gov (United States)

    Metzger, Mary Ann; Freund, Lisa

    The major purpose of this study was to describe the rule-governed and contingency-shaped behavior of learning-disabled, hyperactive, and nonselected elementary school children working on a computer-managed task. Hypotheses tested were (1) that the children would differ in the degree to which either instructions or external contingencies controlled…

  5. Robust automated classification of first-motion polarities for focal mechanism determination with machine learning

    Science.gov (United States)

    Ross, Z. E.; Meier, M. A.; Hauksson, E.

    2017-12-01

    Accurate first-motion polarities are essential for determining earthquake focal mechanisms, but are difficult to measure automatically because of picking errors and signal to noise issues. Here we develop an algorithm for reliable automated classification of first-motion polarities using machine learning algorithms. A classifier is designed to identify whether the first-motion polarity is up, down, or undefined by examining the waveform data directly. We first improve the accuracy of automatic P-wave onset picks by maximizing a weighted signal/noise ratio for a suite of candidate picks around the automatic pick. We then use the waveform amplitudes before and after the optimized pick as features for the classification. We demonstrate the method's potential by training and testing the classifier on tens of thousands of hand-made first-motion picks by the Southern California Seismic Network. The classifier assigned the same polarity as chosen by an analyst in more than 94% of the records. We show that the method is generalizable to a variety of learning algorithms, including neural networks and random forest classifiers. The method is suitable for automated processing of large seismic waveform datasets, and can potentially be used in real-time applications, e.g. for improving the source characterizations of earthquake early warning algorithms.

  6. Classification without labels: learning from mixed samples in high energy physics

    Science.gov (United States)

    Metodiev, Eric M.; Nachman, Benjamin; Thaler, Jesse

    2017-10-01

    Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.

  7. Topological data analyses and machine learning for detection, classification and characterization of atmospheric rivers

    Science.gov (United States)

    Muszynski, G.; Kashinath, K.; Wehner, M. F.; Prabhat, M.; Kurlin, V.

    2017-12-01

    We investigate novel approaches to detecting, classifying and characterizing extreme weather events, such as atmospheric rivers (ARs), in large high-dimensional climate datasets. ARs are narrow filaments of concentrated water vapour in the atmosphere that bring much of the precipitation in many mid-latitude regions. The precipitation associated with ARs is also responsible for major flooding events in many coastal regions of the world, including the west coast of the United States and western Europe. In this study we combine ideas from Topological Data Analysis (TDA) with Machine Learning (ML) for detecting, classifying and characterizing extreme weather events, like ARs. TDA is a new field that sits at the interface between topology and computer science, that studies "shape" - hidden topological structure - in raw data. It has been applied successfully in many areas of applied sciences, including complex networks, signal processing and image recognition. Using TDA we provide ARs with a shape characteristic as a new feature descriptor for the task of AR classification. In particular, we track the change in topology in precipitable water (integrated water vapour) fields using the Union-Find algorithm. We use the generated feature descriptors with ML classifiers to establish reliability and classification performance of our approach. We utilize the parallel toolkit for extreme climate events analysis (TECA: Petascale Pattern Recognition for Climate Science, Prabhat et al., Computer Analysis of Images and Patterns, 2015) for comparison (it is assumed that events identified by TECA is ground truth). Preliminary results indicate that our approach brings new insight into the study of ARs and provides quantitative information about the relevance of topological feature descriptors in analyses of a large climate datasets. We illustrate this method on climate model output and NCEP reanalysis datasets. Further, our method outperforms existing methods on detection and

  8. Machine learning and dyslexia: Classification of individual structural neuro-imaging scans of students with and without dyslexia

    Directory of Open Access Journals (Sweden)

    P. Tamboer

    2016-01-01

    In a second and independent sample of 876 young adults of a general population, the trained classifier of the first sample was tested, resulting in a classification performance of 59% (p = 0.07; d-prime = 0.65. This decline in classification performance resulted from a large percentage of false alarms. This study provided support for the use of machine learning in anatomical brain imaging.

  9. A Cognitive Skill Classification Based On Multi Objective Optimization Using Learning Vector Quantization for Serious Games

    Directory of Open Access Journals (Sweden)

    Moh. Aries Syufagi

    2011-12-01

    Full Text Available Nowadays, serious games and game technology are poised to transform the way of educating and training students at all levels. However, pedagogical value in games do not help novice students learn, too many memorizing and reduce learning process due to no information of player’s ability. To asses the cognitive level of player ability, we propose a Cognitive Skill Game (CSG. CSG improves this cognitive concept to monitor how players interact with the game. This game employs Learning Vector Quantization (LVQ for optimizing the cognitive skill input classification of the player. CSG is using teacher’s data to obtain the neuron vector of cognitive skill pattern supervise. Three clusters multi objective target will be classified as; trial and error, carefully and, expert cognitive skill. In the game play experiments using 33 respondent players demonstrates that 61% of players have high trial and error cognitive skill, 21% have high carefully cognitive skill, and 18% have high expert cognitive skill. CSG may provide information to game engine when a player needs help or when wanting a formidable challenge. The game engine will provide the appropriate tasks according to players’ ability. CSG will help balance the emotions of players, so players do not get bored and frustrated. Players have a high interest to finish the game if the player is emotionally stable. Interests in the players strongly support the procedural learning in a serious game.

  10. Application of deep learning to the classification of images from colposcopy.

    Science.gov (United States)

    Sato, Masakazu; Horie, Koji; Hara, Aki; Miyamoto, Yuichiro; Kurihara, Kazuko; Tomio, Kensuke; Yokota, Harushige

    2018-03-01

    The objective of the present study was to investigate whether deep learning could be applied successfully to the classification of images from colposcopy. For this purpose, a total of 158 patients who underwent conization were enrolled, and medical records and data from the gynecological oncology database were retrospectively reviewed. Deep learning was performed with the Keras neural network and TensorFlow libraries. Using preoperative images from colposcopy as the input data and deep learning technology, the patients were classified into three groups [severe dysplasia, carcinoma in situ (CIS) and invasive cancer (IC)]. A total of 485 images were obtained for the analysis, of which 142 images were of severe dysplasia (2.9 images/patient), 257 were of CIS (3.3 images/patient), and 86 were of IC (4.1 images/patient). Of these, 233 images were captured with a green filter, and the remaining 252 were captured without a green filter. Following the application of L2 regularization, L1 regularization, dropout and data augmentation, the accuracy of the validation dataset was ~50%. Although the present study is preliminary, the results indicated that deep learning may be applied to classify colposcopy images.

  11. Deep learning based classification of breast tumors with shear-wave elastography.

    Science.gov (United States)

    Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong

    2016-12-01

    This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Effects of neonatal inferior prefrontal and medial temporal lesions on learning the rule for delayed nonmatching-to-sample.

    Science.gov (United States)

    Málková, L; Bachevalier, J; Webster, M; Mishkin, M

    2000-01-01

    The ability of rhesus monkeys to master the rule for delayed nonmatching-to-sample (DNMS) has a protracted ontogenetic development, reaching adult levels of proficiency around 4 to 5 years of age (Bachevalier, 1990). To test the possibility that this slow development could be due, at least in part, to immaturity of the prefrontal component of a temporo-prefrontal circuit important for DNMS rule learning (Kowalska, Bachevalier, & Mishkin, 1991; Weinstein, Saunders, & Mishkin, 1988), monkeys with neonatal lesions of the inferior prefrontal convexity were compared on DNMS with both normal controls and animals given neonatal lesions of the medial temporal lobe. Consistent with our previous results (Bachevalier & Mishkin, 1994; Málková, Mishkin, & Bachevalier, 1995), the neonatal medial temporal lesions led to marked impairment in rule learning (as well as in recognition memory with long delays and list lengths) at both 3 months and 2 years of age. By contrast, the neonatal inferior convexity lesions yielded no impairment in rule-learning at 3 months and only a mild impairment at 2 years, a finding that also contrasts sharply with the marked effects of the same lesion made in adulthood. This pattern of sparing closely resembles the one found earlier after neonatal lesions to the cortical visual area TE (Bachevalier & Mishkin, 1994; Málková et al., 1995). The functional sparing at 3 months probably reflects the fact that the temporo-prefrontal circuit is nonfunctional at this early age, resulting in a total dependency on medial temporal contributions to rule learning. With further development, however, this circuit begins to provide a supplementary route for learning.

  13. A comparative study of deep learning models for medical image classification

    Science.gov (United States)

    Dutta, Suvajit; Manideep, B. C. S.; Rai, Shalva; Vijayarajan, V.

    2017-11-01

    Deep Learning(DL) techniques are conquering over the prevailing traditional approaches of neural network, when it comes to the huge amount of dataset, applications requiring complex functions demanding increase accuracy with lower time complexities. Neurosciences has already exploited DL techniques, thus portrayed itself as an inspirational source for researchers exploring the domain of Machine learning. DL enthusiasts cover the areas of vision, speech recognition, motion planning and NLP as well, moving back and forth among fields. This concerns with building models that can successfully solve variety of tasks requiring intelligence and distributed representation. The accessibility to faster CPUs, introduction of GPUs-performing complex vector and matrix computations, supported agile connectivity to network. Enhanced software infrastructures for distributed computing worked in strengthening the thought that made researchers suffice DL methodologies. The paper emphases on the following DL procedures to traditional approaches which are performed manually for classifying medical images. The medical images are used for the study Diabetic Retinopathy(DR) and computed tomography (CT) emphysema data. Both DR and CT data diagnosis is difficult task for normal image classification methods. The initial work was carried out with basic image processing along with K-means clustering for identification of image severity levels. After determining image severity levels ANN has been applied on the data to get the basic classification result, then it is compared with the result of DNNs (Deep Neural Networks), which performed efficiently because of its multiple hidden layer features basically which increases accuracy factors, but the problem of vanishing gradient in DNNs made to consider Convolution Neural Networks (CNNs) as well for better results. The CNNs are found to be providing better outcomes when compared to other learning models aimed at classification of images. CNNs are

  14. Concussion classification via deep learning using whole-brain white matter fiber strains

    Science.gov (United States)

    Cai, Yunliang; Wu, Shaoju; Zhao, Wei; Li, Zhigang; Wu, Zheyang

    2018-01-01

    Developing an accurate and reliable injury predictor is central to the biomechanical studies of traumatic brain injury. State-of-the-art efforts continue to rely on empirical, scalar metrics based on kinematics or model-estimated tissue responses explicitly pre-defined in a specific brain region of interest. They could suffer from loss of information. A single training dataset has also been used to evaluate performance but without cross-validation. In this study, we developed a deep learning approach for concussion classification using implicit features of the entire voxel-wise white matter fiber strains. Using reconstructed American National Football League (NFL) injury cases, leave-one-out cross-validation was employed to objectively compare injury prediction performances against two baseline machine learning classifiers (support vector machine (SVM) and random forest (RF)) and four scalar metrics via univariate logistic regression (Brain Injury Criterion (BrIC), cumulative strain damage measure of the whole brain (CSDM-WB) and the corpus callosum (CSDM-CC), and peak fiber strain in the CC). Feature-based machine learning classifiers including deep learning, SVM, and RF consistently outperformed all scalar injury metrics across all performance categories (e.g., leave-one-out accuracy of 0.828–0.862 vs. 0.690–0.776, and .632+ error of 0.148–0.176 vs. 0.207–0.292). Further, deep learning achieved the best cross-validation accuracy, sensitivity, AUC, and .632+ error. These findings demonstrate the superior performances of deep learning in concussion prediction and suggest its promise for future applications in biomechanical investigations of traumatic brain injury. PMID:29795640

  15. Concussion classification via deep learning using whole-brain white matter fiber strains.

    Science.gov (United States)

    Cai, Yunliang; Wu, Shaoju; Zhao, Wei; Li, Zhigang; Wu, Zheyang; Ji, Songbai

    2018-01-01

    Developing an accurate and reliable injury predictor is central to the biomechanical studies of traumatic brain injury. State-of-the-art efforts continue to rely on empirical, scalar metrics based on kinematics or model-estimated tissue responses explicitly pre-defined in a specific brain region of interest. They could suffer from loss of information. A single training dataset has also been used to evaluate performance but without cross-validation. In this study, we developed a deep learning approach for concussion classification using implicit features of the entire voxel-wise white matter fiber strains. Using reconstructed American National Football League (NFL) injury cases, leave-one-out cross-validation was employed to objectively compare injury prediction performances against two baseline machine learning classifiers (support vector machine (SVM) and random forest (RF)) and four scalar metrics via univariate logistic regression (Brain Injury Criterion (BrIC), cumulative strain damage measure of the whole brain (CSDM-WB) and the corpus callosum (CSDM-CC), and peak fiber strain in the CC). Feature-based machine learning classifiers including deep learning, SVM, and RF consistently outperformed all scalar injury metrics across all performance categories (e.g., leave-one-out accuracy of 0.828-0.862 vs. 0.690-0.776, and .632+ error of 0.148-0.176 vs. 0.207-0.292). Further, deep learning achieved the best cross-validation accuracy, sensitivity, AUC, and .632+ error. These findings demonstrate the superior performances of deep learning in concussion prediction and suggest its promise for future applications in biomechanical investigations of traumatic brain injury.

  16. Evaluation of a rule-based method for epidemiological document classification towards the automation of systematic reviews.

    Science.gov (United States)

    Karystianis, George; Thayer, Kristina; Wolfe, Mary; Tsafnat, Guy

    2017-06-01

    Most data extraction efforts in epidemiology are focused on obtaining targeted information from clinical trials. In contrast, limited research has been conducted on the identification of information from observational studies, a major source for human evidence in many fields, including environmental health. The recognition of key epidemiological information (e.g., exposures) through text mining techniques can assist in the automation of systematic reviews and other evidence summaries. We designed and applied a knowledge-driven, rule-based approach to identify targeted information (study design, participant population, exposure, outcome, confounding factors, and the country where the study was conducted) from abstracts of epidemiological studies included in several systematic reviews of environmental health exposures. The rules were based on common syntactical patterns observed in text and are thus not specific to any systematic review. To validate the general applicability of our approach, we compared the data extracted using our approach versus hand curation for 35 epidemiological study abstracts manually selected for inclusion in two systematic reviews. The returned F-score, precision, and recall ranged from 70% to 98%, 81% to 100%, and 54% to 97%, respectively. The highest precision was observed for exposure, outcome and population (100%) while recall was best for exposure and study design with 97% and 89%, respectively. The lowest recall was observed for the population (54%), which also had the lowest F-score (70%). The generated performance of our text-mining approach demonstrated encouraging results for the identification of targeted information from observational epidemiological study abstracts related to environmental exposures. We have demonstrated that rules based on generic syntactic patterns in one corpus can be applied to other observational study design by simple interchanging the dictionaries aiming to identify certain characteristics (i.e., outcomes

  17. Improved Neural Signal Classification in a Rapid Serial Visual Presentation Task Using Active Learning.

    Science.gov (United States)

    Marathe, Amar R; Lawhern, Vernon J; Wu, Dongrui; Slayback, David; Lance, Brent J

    2016-03-01

    The application space for brain-computer interface (BCI) technologies is rapidly expanding with improvements in technology. However, most real-time BCIs require extensive individualized calibration prior to use, and systems often have to be recalibrated to account for changes in the neural signals due to a variety of factors including changes in human state, the surrounding environment, and task conditions. Novel approaches to reduce calibration time or effort will dramatically improve the usability of BCI systems. Active Learning (AL) is an iterative semi-supervised learning technique for learning in situations in which data may be abundant, but labels for the data are difficult or expensive to obtain. In this paper, we apply AL to a simulated BCI system for target identification using data from a rapid serial visual presentation (RSVP) paradigm to minimize the amount of training samples needed to initially calibrate a neural classifier. Our results show AL can produce similar overall classification accuracy with significantly less labeled data (in some cases less than 20%) when compared to alternative calibration approaches. In fact, AL classification performance matches performance of 10-fold cross-validation (CV) in over 70% of subjects when training with less than 50% of the data. To our knowledge, this is the first work to demonstrate the use of AL for offline electroencephalography (EEG) calibration in a simulated BCI paradigm. While AL itself is not often amenable for use in real-time systems, this work opens the door to alternative AL-like systems that are more amenable for BCI applications and thus enables future efforts for developing highly adaptive BCI systems.

  18. Deep-learning derived features for lung nodule classification with limited datasets

    Science.gov (United States)

    Thammasorn, P.; Wu, W.; Pierce, L. A.; Pipavath, S. N.; Lampe, P. D.; Houghton, A. M.; Haynor, D. R.; Chaovalitwongse, W. A.; Kinahan, P. E.

    2018-02-01

    Only a few percent of indeterminate nodules found in lung CT images are cancer. However, enabling earlier diagnosis is important to avoid invasive procedures or long-time surveillance to those benign nodules. We are evaluating a classification framework using radiomics features derived with a machine learning approach from a small data set of indeterminate CT lung nodule images. We used a retrospective analysis of 194 cases with pulmonary nodules in the CT images with or without contrast enhancement from lung cancer screening clinics. The nodules were contoured by a radiologist and texture features of the lesion were calculated. In addition, sematic features describing shape were categorized. We also explored a Multiband network, a feature derivation path that uses a modified convolutional neural network (CNN) with a Triplet Network. This was trained to create discriminative feature representations useful for variable-sized nodule classification. The diagnostic accuracy was evaluated for multiple machine learning algorithms using texture, shape, and CNN features. In the CT contrast-enhanced group, the texture or semantic shape features yielded an overall diagnostic accuracy of 80%. Use of a standard deep learning network in the framework for feature derivation yielded features that substantially underperformed compared to texture and/or semantic features. However, the proposed Multiband approach of feature derivation produced results similar in diagnostic accuracy to the texture and semantic features. While the Multiband feature derivation approach did not outperform the texture and/or semantic features, its equivalent performance indicates promise for future improvements to increase diagnostic accuracy. Importantly, the Multiband approach adapts readily to different size lesions without interpolation, and performed well with relatively small amount of training data.

  19. Automated classification of tropical shrub species: a hybrid of leaf shape and machine learning approach

    Directory of Open Access Journals (Sweden)

    Miraemiliana Murat

    2017-09-01

    Full Text Available Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM, Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD, Histogram of Oriented Gradients (HOG, Hu invariant moments (Hu and Zernike moments (ZM. Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN, random forest (RF, support vector machine (SVM, k-nearest neighbour (k-NN, linear discriminant analysis (LDA and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM. In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS and Pearson’s coefficient correlation (PCC. The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia

  20. Machine learning in APOGEE. Unsupervised spectral classification with K-means

    Science.gov (United States)

    Garcia-Dias, Rafael; Allende Prieto, Carlos; Sánchez Almeida, Jorge; Ordovás-Pascual, Ignacio

    2018-05-01

    Context. The volume of data generated by astronomical surveys is growing rapidly. Traditional analysis techniques in spectroscopy either demand intensive human interaction or are computationally expensive. In this scenario, machine learning, and unsupervised clustering algorithms in particular, offer interesting alternatives. The Apache Point Observatory Galactic Evolution Experiment (APOGEE) offers a vast data set of near-infrared stellar spectra, which is perfect for testing such alternatives. Aims: Our research applies an unsupervised classification scheme based on K-means to the massive APOGEE data set. We explore whether the data are amenable to classification into discrete classes. Methods: We apply the K-means algorithm to 153 847 high resolution spectra (R ≈ 22 500). We discuss the main virtues and weaknesses of the algorithm, as well as our choice of parameters. Results: We show that a classification based on normalised spectra captures the variations in stellar atmospheric parameters, chemical abundances, and rotational velocity, among other factors. The algorithm is able to separate the bulge and halo populations, and distinguish dwarfs, sub-giants, RC, and RGB stars. However, a discrete classification in flux space does not result in a neat organisation in the parameters' space. Furthermore, the lack of obvious groups in flux space causes the results to be fairly sensitive to the initialisation, and disrupts the efficiency of commonly-used methods to select the optimal number of clusters. Our classification is publicly available, including extensive online material associated with the APOGEE Data Release 12 (DR12). Conclusions: Our description of the APOGEE database can help greatly with the identification of specific types of targets for various applications. We find a lack of obvious groups in flux space, and identify limitations of the K-means algorithm in dealing with this kind of data. Full Tables B.1-B.4 are only available at the CDS via

  1. Rule-Based Classification of MMPI Data of Patients with Mental Disorders: Experiments with Basic and Extended Profiles

    Directory of Open Access Journals (Sweden)

    Jerzy Gomula

    2011-10-01

    Full Text Available In the research presented in the paper, we test the classification potential for different configurations of the MMPI data used for nosological diagnosis of patients with mental disorders. Originally, each patient is described by an MMPI profile consisting of a set of values of thirteen attributes (the so-called scales. A profile can be extended by different specialized indexes defined in the professional domain literature. Each created index has been proposed by practitioners and has a proper meaning from the clinical point of view. Adding such indexes leads to the extension of the attribute space for cases to be classified. In the paper, we present results of experiments with basic (original and extended profiles.

  2. Alteration of a motor learning rule under mirror-reversal transformation does not depend on the amplitude of visual error.

    Science.gov (United States)

    Kasuga, Shoko; Kurata, Makiko; Liu, Meigen; Ushiba, Junichi

    2015-05-01

    Human's sophisticated motor learning system paradoxically interferes with motor performance when visual information is mirror-reversed (MR), because normal movement error correction further aggravates the error. This error-increasing mechanism makes performing even a simple reaching task difficult, but is overcome by alterations in the error correction rule during the trials. To isolate factors that trigger learners to change the error correction rule, we manipulated the gain of visual angular errors when participants made arm-reaching movements with mirror-reversed visual feedback, and compared the rule alteration timing between groups with normal or reduced gain. Trial-by-trial changes in the visual angular error was tracked to explain the timing of the change in the error correction rule. Under both gain conditions, visual angular errors increased under the MR transformation, and suddenly decreased after 3-5 trials with increase. The increase became degressive at different amplitude between the two groups, nearly proportional to the visual gain. The findings suggest that the alteration of the error-correction rule is not dependent on the amplitude of visual angular errors, and possibly determined by the number of trials over which the errors increased or statistical property of the environment. The current results encourage future intensive studies focusing on the exact rule-change mechanism. Copyright © 2014 Elsevier Ireland Ltd and the Japan Neuroscience Society. All rights reserved.

  3. Machine learning for radioxenon event classification for the Comprehensive Nuclear-Test-Ban Treaty

    Energy Technology Data Exchange (ETDEWEB)

    Stocki, Trevor J., E-mail: trevor_stocki@hc-sc.gc.c [Radiation Protection Bureau, 775 Brookfield Road, A.L. 6302D1, Ottawa, ON, K1A 1C1 (Canada); Li, Guichong; Japkowicz, Nathalie [School of Information Technology and Engineering, University of Ottawa, 800 King Edward Avenue, Ottawa, ON, K1N 6N5 (Canada); Ungar, R. Kurt [Radiation Protection Bureau, 775 Brookfield Road, A.L. 6302D1, Ottawa, ON, K1A 1C1 (Canada)

    2010-01-15

    A method of weapon detection for the Comprehensive nuclear-Test-Ban-Treaty (CTBT) consists of monitoring the amount of radioxenon in the atmosphere by measuring and sampling the activity concentration of {sup 131m}Xe, {sup 133}Xe, {sup 133m}Xe, and {sup 135}Xe by radionuclide monitoring. Several explosion samples were simulated based on real data since the measured data of this type is quite rare. These data sets consisted of different circumstances of a nuclear explosion, and are used as training data sets to establish an effective classification model employing state-of-the-art technologies in machine learning. A study was conducted involving classic induction algorithms in machine learning including Naive Bayes, Neural Networks, Decision Trees, k-Nearest Neighbors, and Support Vector Machines, that revealed that they can successfully be used in this practical application. In particular, our studies show that many induction algorithms in machine learning outperform a simple linear discriminator when a signal is found in a high radioxenon background environment.

  4. A Cognitive Skill Classification Based on Multi Objective Optimization Using Learning Vector Quantization for Serious Games

    Directory of Open Access Journals (Sweden)

    Moh. Aries Syufagi

    2013-09-01

    Full Text Available Nowadays, serious games and game technology are poised to transform the way of educating and training students at all levels. However, pedagogical value in games do not help novice students learn, too many memorizing and reduce learning process due to no information of player’s ability. To asses the cognitive level of player ability, we propose a Cognitive Skill Game (CSG. CSG improves this cognitive concept to monitor how players interact with the game. This game employs Learning Vector Quantization (LVQ for optimizing the cognitive skill input classification of the player. CSG is using teacher’s data to obtain the neuron vector of cognitive skill pattern supervise. Three clusters multi objective XE "multi objective"  target will be classified as; trial and error, carefully and, expert cognitive skill. In the game play experiments employ 33 respondent players demonstrates that 61% of players have high trial and error, 21% have high carefully, and 18% have high expert cognitive skill. CSG may provide information to game engine when a player needs help or when wanting a formidable challenge. The game engine will provide the appropriate tasks according to players’ ability. CSG will help balance the emotions of players, so players do not get bored and frustrated. 

  5. Machine learning for radioxenon event classification for the Comprehensive Nuclear-Test-Ban Treaty

    International Nuclear Information System (INIS)

    Stocki, Trevor J.; Li, Guichong; Japkowicz, Nathalie; Ungar, R. Kurt

    2010-01-01

    A method of weapon detection for the Comprehensive nuclear-Test-Ban-Treaty (CTBT) consists of monitoring the amount of radioxenon in the atmosphere by measuring and sampling the activity concentration of 131m Xe, 133 Xe, 133m Xe, and 135 Xe by radionuclide monitoring. Several explosion samples were simulated based on real data since the measured data of this type is quite rare. These data sets consisted of different circumstances of a nuclear explosion, and are used as training data sets to establish an effective classification model employing state-of-the-art technologies in machine learning. A study was conducted involving classic induction algorithms in machine learning including Naive Bayes, Neural Networks, Decision Trees, k-Nearest Neighbors, and Support Vector Machines, that revealed that they can successfully be used in this practical application. In particular, our studies show that many induction algorithms in machine learning outperform a simple linear discriminator when a signal is found in a high radioxenon background environment.

  6. Conduction Delay Learning Model for Unsupervised and Supervised Classification of Spatio-Temporal Spike Patterns.

    Science.gov (United States)

    Matsubara, Takashi

    2017-01-01

    Precise spike timing is considered to play a fundamental role in communications and signal processing in biological neural networks. Understanding the mechanism of spike timing adjustment would deepen our understanding of biological systems and enable advanced engineering applications such as efficient computational architectures. However, the biological mechanisms that adjust and maintain spike timing remain unclear. Existing algorithms adopt a supervised approach, which adjusts the axonal conduction delay and synaptic efficacy until the spike timings approximate the desired timings. This study proposes a spike timing-dependent learning model that adjusts the axonal conduction delay and synaptic efficacy in both unsupervised and supervised manners. The proposed learning algorithm approximates the Expectation-Maximization algorithm, and classifies the input data encoded into spatio-temporal spike patterns. Even in the supervised classification, the algorithm requires no external spikes indicating the desired spike timings unlike existing algorithms. Furthermore, because the algorithm is consistent with biological models and hypotheses found in existing biological studies, it could capture the mechanism underlying biological delay learning.

  7. Effects of Memorization of Rule Statements on Acquisition and Retention of Rule-Governed Behavior in a Computer-Based Learning Task.

    Science.gov (United States)

    Towle, Nelson J.

    One hundred and twenty-four high school students were randomly assigned to four groups: 33 subjects memorized the rule statement before, 29 subjects memorized the rule statement during, and 30 subjects memorized the rule statement after instruction in rule application skills. Thirty-two subjects were not required to memorize rule statements.…

  8. Identifying Human Phenotype Terms by Combining Machine Learning and Validation Rules

    Directory of Open Access Journals (Sweden)

    Manuel Lobo

    2017-01-01

    Full Text Available Named-Entity Recognition is commonly used to identify biological entities such as proteins, genes, and chemical compounds found in scientific articles. The Human Phenotype Ontology (HPO is an ontology that provides a standardized vocabulary for phenotypic abnormalities found in human diseases. This article presents the Identifying Human Phenotypes (IHP system, tuned to recognize HPO entities in unstructured text. IHP uses Stanford CoreNLP for text processing and applies Conditional Random Fields trained with a rich feature set, which includes linguistic, orthographic, morphologic, lexical, and context features created for the machine learning-based classifier. However, the main novelty of IHP is its validation step based on a set of carefully crafted manual rules, such as the negative connotation analysis, that combined with a dictionary can filter incorrectly identified entities, find missed entities, and combine adjacent entities. The performance of IHP was evaluated using the recently published HPO Gold Standardized Corpora (GSC, where the system Bio-LarK CR obtained the best F-measure of 0.56. IHP achieved an F-measure of 0.65 on the GSC. Due to inconsistencies found in the GSC, an extended version of the GSC was created, adding 881 entities and modifying 4 entities. IHP achieved an F-measure of 0.863 on the new GSC.

  9. Perceptron learning rule derived from spike-frequency adaptation and spike-time-dependent plasticity.

    Science.gov (United States)

    D'Souza, Prashanth; Liu, Shih-Chii; Hahnloser, Richard H R

    2010-03-09

    It is widely believed that sensory and motor processing in the brain is based on simple computational primitives rooted in cellular and synaptic physiology. However, many gaps remain in our understanding of the connections between neural computations and biophysical properties of neurons. Here, we show that synaptic spike-time-dependent plasticity (STDP) combined with spike-frequency adaptation (SFA) in a single neuron together approximate the well-known perceptron learning rule. Our calculations and integrate-and-fire simulations reveal that delayed inputs to a neuron endowed with STDP and SFA precisely instruct neural responses to earlier arriving inputs. We demonstrate this mechanism on a developmental example of auditory map formation guided by visual inputs, as observed in the external nucleus of the inferior colliculus (ICX) of barn owls. The interplay of SFA and STDP in model ICX neurons precisely transfers the tuning curve from the visual modality onto the auditory modality, demonstrating a useful computation for multimodal and sensory-guided processing.

  10. Automatic white blood cell classification using pre-trained deep learning models: ResNet and Inception

    Science.gov (United States)

    Habibzadeh, Mehdi; Jannesari, Mahboobeh; Rezaei, Zahra; Baharvand, Hossein; Totonchi, Mehdi

    2018-04-01

    This works gives an account of evaluation of white blood cell differential counts via computer aided diagnosis (CAD) system and hematology rules. Leukocytes, also called white blood cells (WBCs) play main role of the immune system. Leukocyte is responsible for phagocytosis and immunity and therefore in defense against infection involving the fatal diseases incidence and mortality related issues. Admittedly, microscopic examination of blood samples is a time consuming, expensive and error-prone task. A manual diagnosis would search for specific Leukocytes and number abnormalities in the blood slides while complete blood count (CBC) examination is performed. Complications may arise from the large number of varying samples including different types of Leukocytes, related sub-types and concentration in blood, which makes the analysis prone to human error. This process can be automated by computerized techniques which are more reliable and economical. In essence, we seek to determine a fast, accurate mechanism for classification and gather information about distribution of white blood evidences which may help to diagnose the degree of any abnormalities during CBC test. In this work, we consider the problem of pre-processing and supervised classification of white blood cells into their four primary types including Neutrophils, Eosinophils, Lymphocytes, and Monocytes using a consecutive proposed deep learning framework. For first step, this research proposes three consecutive pre-processing calculations namely are color distortion; bounding box distortion (crop) and image flipping mirroring. In second phase, white blood cell recognition performed with hierarchy topological feature extraction using Inception and ResNet architectures. Finally, the results obtained from the preliminary analysis of cell classification with (11200) training samples and 1244 white blood cells evaluation data set are presented in confusion matrices and interpreted using accuracy rate, and false

  11. The Hybrid of Classification Tree and Extreme Learning Machine for Permeability Prediction in Oil Reservoir

    KAUST Repository

    Prasetyo Utomo, Chandra

    2011-06-01

    Permeability is an important parameter connected with oil reservoir. Predicting the permeability could save millions of dollars. Unfortunately, petroleum engineers have faced numerous challenges arriving at cost-efficient predictions. Much work has been carried out to solve this problem. The main challenge is to handle the high range of permeability in each reservoir. For about a hundred year, mathematicians and engineers have tried to deliver best prediction models. However, none of them have produced satisfying results. In the last two decades, artificial intelligence models have been used. The current best prediction model in permeability prediction is extreme learning machine (ELM). It produces fairly good results but a clear explanation of the model is hard to come by because it is so complex. The aim of this research is to propose a way out of this complexity through the design of a hybrid intelligent model. In this proposal, the system combines classification and regression models to predict the permeability value. These are based on the well logs data. In order to handle the high range of the permeability value, a classification tree is utilized. A benefit of this innovation is that the tree represents knowledge in a clear and succinct fashion and thereby avoids the complexity of all previous models. Finally, it is important to note that the ELM is used as a final predictor. Results demonstrate that this proposed hybrid model performs better when compared with support vector machines (SVM) and ELM in term of correlation coefficient. Moreover, the classification tree model potentially leads to better communication among petroleum engineers concerning this important process and has wider implications for oil reservoir management efficiency.

  12. Combining machine learning and ontological data handling for multi-source classification of nature conservation areas

    Science.gov (United States)

    Moran, Niklas; Nieland, Simon; Tintrup gen. Suntrup, Gregor; Kleinschmit, Birgit

    2017-02-01

    Manual field surveys for nature conservation management are expensive and time-consuming and could be supplemented and streamlined by using Remote Sensing (RS). RS is critical to meet requirements of existing laws such as the EU Habitats Directive (HabDir) and more importantly to meet future challenges. The full potential of RS has yet to be harnessed as different nomenclatures and procedures hinder interoperability, comparison and provenance. Therefore, automated tools are needed to use RS data to produce comparable, empirical data outputs that lend themselves to data discovery and provenance. These issues are addressed by a novel, semi-automatic ontology-based classification method that uses machine learning algorithms and Web Ontology Language (OWL) ontologies that yields traceable, interoperable and observation-based classification outputs. The method was tested on European Union Nature Information System (EUNIS) grasslands in Rheinland-Palatinate, Germany. The developed methodology is a first step in developing observation-based ontologies in the field of nature conservation. The tests show promising results for the determination of the grassland indicators wetness and alkalinity with an overall accuracy of 85% for alkalinity and 76% for wetness.

  13. Machine-learning-based Brokers for Real-time Classification of the LSST Alert Stream

    Science.gov (United States)

    Narayan, Gautham; Zaidi, Tayeb; Soraisam, Monika D.; Wang, Zhe; Lochner, Michelle; Matheson, Thomas; Saha, Abhijit; Yang, Shuo; Zhao, Zhenge; Kececioglu, John; Scheidegger, Carlos; Snodgrass, Richard T.; Axelrod, Tim; Jenness, Tim; Maier, Robert S.; Ridgway, Stephen T.; Seaman, Robert L.; Evans, Eric Michael; Singh, Navdeep; Taylor, Clark; Toeniskoetter, Jackson; Welch, Eric; Zhu, Songzhe; The ANTARES Collaboration

    2018-05-01

    The unprecedented volume and rate of transient events that will be discovered by the Large Synoptic Survey Telescope (LSST) demand that the astronomical community update its follow-up paradigm. Alert-brokers—automated software system to sift through, characterize, annotate, and prioritize events for follow-up—will be critical tools for managing alert streams in the LSST era. The Arizona-NOAO Temporal Analysis and Response to Events System (ANTARES) is one such broker. In this work, we develop a machine learning pipeline to characterize and classify variable and transient sources only using the available multiband optical photometry. We describe three illustrative stages of the pipeline, serving the three goals of early, intermediate, and retrospective classification of alerts. The first takes the form of variable versus transient categorization, the second a multiclass typing of the combined variable and transient data set, and the third a purity-driven subtyping of a transient class. Although several similar algorithms have proven themselves in simulations, we validate their performance on real observations for the first time. We quantitatively evaluate our pipeline on sparse, unevenly sampled, heteroskedastic data from various existing observational campaigns, and demonstrate very competitive classification performance. We describe our progress toward adapting the pipeline developed in this work into a real-time broker working on live alert streams from time-domain surveys.

  14. Optimizing Neuropsychological Assessments for Cognitive, Behavioral, and Functional Impairment Classification: A Machine Learning Study

    Directory of Open Access Journals (Sweden)

    Petronilla Battista

    2017-01-01

    Full Text Available Subjects with Alzheimer’s disease (AD show loss of cognitive functions and change in behavioral and functional state affecting the quality of their daily life and that of their families and caregivers. A neuropsychological assessment plays a crucial role in detecting such changes from normal conditions. However, despite the existence of clinical measures that are used to classify and diagnose AD, a large amount of subjectivity continues to exist. Our aim was to assess the potential of machine learning in quantifying this process and optimizing or even reducing the amount of neuropsychological tests used to classify AD patients, also at an early stage of impairment. We investigated the role of twelve state-of-the-art neuropsychological tests in the automatic classification of subjects with none, mild, or severe impairment as measured by the clinical dementia rating (CDR. Data were obtained from the ADNI database. In the groups of measures used as features, we included measures of both cognitive domains and subdomains. Our findings show that some tests are more frequently best predictors for the automatic classification, namely, LM, ADAS-Cog, AVLT, and FAQ, with a major role of the ADAS-Cog measures of delayed and immediate memory and the FAQ measure of financial competency.

  15. New Dandelion Algorithm Optimizes Extreme Learning Machine for Biomedical Classification Problems

    Directory of Open Access Journals (Sweden)

    Xiguang Li

    2017-01-01

    Full Text Available Inspired by the behavior of dandelion sowing, a new novel swarm intelligence algorithm, namely, dandelion algorithm (DA, is proposed for global optimization of complex functions in this paper. In DA, the dandelion population will be divided into two subpopulations, and different subpopulations will undergo different sowing behaviors. Moreover, another sowing method is designed to jump out of local optimum. In order to demonstrate the validation of DA, we compare the proposed algorithm with other existing algorithms, including bat algorithm, particle swarm optimization, and enhanced fireworks algorithm. Simulations show that the proposed algorithm seems much superior to other algorithms. At the same time, the proposed algorithm can be applied to optimize extreme learning machine (ELM for biomedical classification problems, and the effect is considerable. At last, we use different fusion methods to form different fusion classifiers, and the fusion classifiers can achieve higher accuracy and better stability to some extent.

  16. Applying a Machine Learning Technique to Classification of Japanese Pressure Patterns

    Directory of Open Access Journals (Sweden)

    H Kimura

    2009-04-01

    Full Text Available In climate research, pressure patterns are often very important. When a climatologists need to know the days of a specific pressure pattern, for example "low pressure in Western areas of Japan and high pressure in Eastern areas of Japan (Japanese winter-type weather," they have to visually check a huge number of surface weather charts. To overcome this problem, we propose an automatic classification system using a support vector machine (SVM, which is a machine-learning method. We attempted to classify pressure patterns into two classes: "winter type" and "non-winter type". For both training datasets and test datasets, we used the JRA-25 dataset from 1981 to 2000. An experimental evaluation showed that our method obtained a greater than 0.8 F-measure. We noted that variations in results were based on differences in training datasets.

  17. Cancer cell detection and classification using transformation invariant template learning methods

    International Nuclear Information System (INIS)

    Talware, Rajendra; Abhyankar, Aditya

    2011-01-01

    In traditional cancer cell detection, pathologists examine biopsies to make diagnostic assessments, largely based on cell morphology and tissue distribution. The process of image acquisition is very much subjective and the pattern undergoes unknown or random transformations during data acquisition (e.g. variation in illumination, orientation, translation and perspective) results in high degree of variability. Transformed Component Analysis (TCA) incorporates a discrete, hidden variable that accounts for transformations and uses the Expectation Maximization (EM) algorithm to jointly extract components and normalize for transformations. Further the TEMPLAR framework developed takes advantage of hierarchical pattern models and adds probabilistic modeling for local transformations. Pattern classification is based on Expectation Maximization algorithm and General Likelihood Ratio Tests (GLRT). Performance of TEMPLAR is certainly improved by defining area of interest on slide a priori. Performance can be further enhanced by making the kernel function adaptive during learning. (author)

  18. A Just-in-Time Learning based Monitoring and Classification Method for Hyper/Hypocalcemia Diagnosis.

    Science.gov (United States)

    Peng, Xin; Tang, Yang; He, Wangli; Du, Wenli; Qian, Feng

    2017-01-20

    This study focuses on the classification and pathological status monitoring of hyper/hypo-calcemia in the calcium regulatory system. By utilizing the Independent Component Analysis (ICA) mixture model, samples from healthy patients are collected, diagnosed, and subsequently classified according to their underlying behaviors, characteristics, and mechanisms. Then, a Just-in-Time Learning (JITL) has been employed in order to estimate the diseased status dynamically. In terms of JITL, for the purpose of the construction of an appropriate similarity index to identify relevant datasets, a novel similarity index based on the ICA mixture model is proposed in this paper to improve online model quality. The validity and effectiveness of the proposed approach have been demonstrated by applying it to the calcium regulatory system under various hypocalcemic and hypercalcemic diseased conditions.

  19. Deep learning for hybrid EEG-fNIRS brain–computer interface: application to motor imagery classification

    Science.gov (United States)

    Chiarelli, Antonio Maria; Croce, Pierpaolo; Merla, Arcangelo; Zappasodi, Filippo

    2018-06-01

    Objective. Brain–computer interface (BCI) refers to procedures that link the central nervous system to a device. BCI was historically performed using electroencephalography (EEG). In the last years, encouraging results were obtained by combining EEG with other neuroimaging technologies, such as functional near infrared spectroscopy (fNIRS). A crucial step of BCI is brain state classification from recorded signal features. Deep artificial neural networks (DNNs) recently reached unprecedented complex classification outcomes. These performances were achieved through increased computational power, efficient learning algorithms, valuable activation functions, and restricted or back-fed neurons connections. By expecting significant overall BCI performances, we investigated the capabilities of combining EEG and fNIRS recordings with state-of-the-art deep learning procedures. Approach. We performed a guided left and right hand motor imagery task on 15 subjects with a fixed classification response time of 1 s and overall experiment length of 10 min. Left versus right classification accuracy of a DNN in the multi-modal recording modality was estimated and it was compared to standalone EEG and fNIRS and other classifiers. Main results. At a group level we obtained significant increase in performance when considering multi-modal recordings and DNN classifier with synergistic effect. Significance. BCI performances can be significantly improved by employing multi-modal recordings that provide electrical and hemodynamic brain activity information, in combination with advanced non-linear deep learning classification procedures.

  20. Classification of suicide attempters in schizophrenia using sociocultural and clinical features: A machine learning approach.

    Science.gov (United States)

    Hettige, Nuwan C; Nguyen, Thai Binh; Yuan, Chen; Rajakulendran, Thanara; Baddour, Jermeen; Bhagwat, Nikhil; Bani-Fatemi, Ali; Voineskos, Aristotle N; Mallar Chakravarty, M; De Luca, Vincenzo

    2017-07-01

    Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability of a patient with schizophrenia for attempting suicide. We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms using a regularized regression, random forest, elastic net and support vector machine models with sociocultural and clinical variables as features to train the models. All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. Support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated and accuracy of 65% and an AUC of 0.71. Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Habituation: a non-associative learning rule design for spiking neurons and an autonomous mobile robots implementation

    International Nuclear Information System (INIS)

    Cyr, André; Boukadoum, Mounir

    2013-01-01

    This paper presents a novel bio-inspired habituation function for robots under control by an artificial spiking neural network. This non-associative learning rule is modelled at the synaptic level and validated through robotic behaviours in reaction to different stimuli patterns in a dynamical virtual 3D world. Habituation is minimally represented to show an attenuated response after exposure to and perception of persistent external stimuli. Based on current neurosciences research, the originality of this rule includes modulated response to variable frequencies of the captured stimuli. Filtering out repetitive data from the natural habituation mechanism has been demonstrated to be a key factor in the attention phenomenon, and inserting such a rule operating at multiple temporal dimensions of stimuli increases a robot's adaptive behaviours by ignoring broader contextual irrelevant information. (paper)

  2. Habituation: a non-associative learning rule design for spiking neurons and an autonomous mobile robots implementation.

    Science.gov (United States)

    Cyr, André; Boukadoum, Mounir

    2013-03-01

    This paper presents a novel bio-inspired habituation function for robots under control by an artificial spiking neural network. This non-associative learning rule is modelled at the synaptic level and validated through robotic behaviours in reaction to different stimuli patterns in a dynamical virtual 3D world. Habituation is minimally represented to show an attenuated response after exposure to and perception of persistent external stimuli. Based on current neurosciences research, the originality of this rule includes modulated response to variable frequencies of the captured stimuli. Filtering out repetitive data from the natural habituation mechanism has been demonstrated to be a key factor in the attention phenomenon, and inserting such a rule operating at multiple temporal dimensions of stimuli increases a robot's adaptive behaviours by ignoring broader contextual irrelevant information.

  3. Global Optimization Ensemble Model for Classification Methods

    Science.gov (United States)

    Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab

    2014-01-01

    Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382

  4. Global Optimization Ensemble Model for Classification Methods

    Directory of Open Access Journals (Sweden)

    Hina Anwar

    2014-01-01

    Full Text Available Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity.

  5. Classification and unsupervised clustering of LIGO data with Deep Transfer Learning

    Science.gov (United States)

    George, Daniel; Shen, Hongyu; Huerta, E. A.

    2018-05-01

    Gravitational wave detection requires a detailed understanding of the response of the LIGO and Virgo detectors to true signals in the presence of environmental and instrumental noise. Of particular interest is the study of anomalous non-Gaussian transients, such as glitches, since their occurrence rate in LIGO and Virgo data can obscure or even mimic true gravitational wave signals. Therefore, successfully identifying and excising these anomalies from gravitational wave data is of utmost importance for the detection and characterization of true signals and for the accurate computation of their significance. To facilitate this work, we present the first application of deep learning combined with transfer learning to show that knowledge from pretrained models for real-world object recognition can be transferred for classifying spectrograms of glitches. To showcase this new method, we use a data set of twenty-two classes of glitches, curated and labeled by the Gravity Spy project using data collected during LIGO's first discovery campaign. We demonstrate that our Deep Transfer Learning method enables an optimal use of very deep convolutional neural networks for glitch classification given small and unbalanced training data sets, significantly reduces the training time, and achieves state-of-the-art accuracy above 98.8%, lowering the previous error rate by over 60%. More importantly, once trained via transfer learning on the known classes, we show that our neural networks can be truncated and used as feature extractors for unsupervised clustering to automatically group together new unknown classes of glitches and anomalous signals. This novel capability is of paramount importance to identify and remove new types of glitches which will occur as the LIGO/Virgo detectors gradually attain design sensitivity.

  6. Multi-channel EEG-based sleep stage classification with joint collaborative representation and multiple kernel learning.

    Science.gov (United States)

    Shi, Jun; Liu, Xiao; Li, Yan; Zhang, Qi; Li, Yingjie; Ying, Shihui

    2015-10-30

    Electroencephalography (EEG) based sleep staging is commonly used in clinical routine. Feature extraction and representation plays a crucial role in EEG-based automatic classification of sleep stages. Sparse representation (SR) is a state-of-the-art unsupervised feature learning method suitable for EEG feature representation. Collaborative representation (CR) is an effective data coding method used as a classifier. Here we use CR as a data representation method to learn features from the EEG signal. A joint collaboration model is established to develop a multi-view learning algorithm, and generate joint CR (JCR) codes to fuse and represent multi-channel EEG signals. A two-stage multi-view learning-based sleep staging framework is then constructed, in which JCR and joint sparse representation (JSR) algorithms first fuse and learning the feature representation from multi-channel EEG signals, respectively. Multi-view JCR and JSR features are then integrated and sleep stages recognized by a multiple kernel extreme learning machine (MK-ELM) algorithm with grid search. The proposed two-stage multi-view learning algorithm achieves superior performance for sleep staging. With a K-means clustering based dictionary, the mean classification accuracy, sensitivity and specificity are 81.10 ± 0.15%, 71.42 ± 0.66% and 94.57 ± 0.07%, respectively; while with the dictionary learned using the submodular optimization method, they are 80.29 ± 0.22%, 71.26 ± 0.78% and 94.38 ± 0.10%, respectively. The two-stage multi-view learning based sleep staging framework outperforms all other classification methods compared in this work, while JCR is superior to JSR. The proposed multi-view learning framework has the potential for sleep staging based on multi-channel or multi-modality polysomnography signals. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. Logical-rules and the classification of integral dimensions: Individual differences in the processing of arbitrary dimensions

    Directory of Open Access Journals (Sweden)

    Anthea G. Blunden

    2015-01-01

    Full Text Available A variety of converging operations demonstrate key differences between separable dimensions, which can be analyzed independently, and integral dimensions, which are processed in a non-analytic fashion. A recent investigation of response time distributions, applying a set of logical rule-based models, demonstrated that integral dimensions are pooled into a single coactive processing channel, in contrast to separable dimensions, which are processed in multiple, independent processing channels. This paper examines the claim that arbitrary dimensions created by factorially morphing four faces are processed in an integral manner (i.e., coactively. In two experiments, sixteen participants completed a categorization task in which either upright or inverted morph stimuli were classified in a speeded fashion. Analyses focused on contrasting different assumptions about the psychological representation of the stimuli, perceptual and decisional separability, and the processing architecture. We report consistent individual differences which demonstrate a mixture of some observers who demonstrate coactive processing with other observers who process the dimensions in a parallel self-terminating manner.

  8. A Classification Model and an Open E-Learning System Based on Intuitionistic Fuzzy Sets for Instructional Design Concepts

    Science.gov (United States)

    Güyer, Tolga; Aydogdu, Seyhmus

    2016-01-01

    This study suggests a classification model and an e-learning system based on this model for all instructional theories, approaches, models, strategies, methods, and technics being used in the process of instructional design that constitutes a direct or indirect resource for educational technology based on the theory of intuitionistic fuzzy sets…

  9. BENCHMARK OF MACHINE LEARNING METHODS FOR CLASSIFICATION OF A SENTINEL-2 IMAGE

    Directory of Open Access Journals (Sweden)

    F. Pirotti

    2016-06-01

    Full Text Available Thanks to mainly ESA and USGS, a large bulk of free images of the Earth is readily available nowadays. One of the main goals of remote sensing is to label images according to a set of semantic categories, i.e. image classification. This is a very challenging issue since land cover of a specific class may present a large spatial and spectral variability and objects may appear at different scales and orientations. In this study, we report the results of benchmarking 9 machine learning algorithms tested for accuracy and speed in training and classification of land-cover classes in a Sentinel-2 dataset. The following machine learning methods (MLM have been tested: linear discriminant analysis, k-nearest neighbour, random forests, support vector machines, multi layered perceptron, multi layered perceptron ensemble, ctree, boosting, logarithmic regression. The validation is carried out using a control dataset which consists of an independent classification in 11 land-cover classes of an area about 60 km2, obtained by manual visual interpretation of high resolution images (20 cm ground sampling distance by experts. In this study five out of the eleven classes are used since the others have too few samples (pixels for testing and validating subsets. The classes used are the following: (i urban (ii sowable areas (iii water (iv tree plantations (v grasslands. Validation is carried out using three different approaches: (i using pixels from the training dataset (train, (ii using pixels from the training dataset and applying cross-validation with the k-fold method (kfold and (iii using all pixels from the control dataset. Five accuracy indices are calculated for the comparison between the values predicted with each model and control values over three sets of data: the training dataset (train, the whole control dataset (full and with k-fold cross-validation (kfold with ten folds. Results from validation of predictions of the whole dataset (full show the

  10. Deep learning based classification for head and neck cancer detection with hyperspectral imaging in an animal model

    Science.gov (United States)

    Ma, Ling; Lu, Guolan; Wang, Dongsheng; Wang, Xu; Chen, Zhuo Georgia; Muller, Susan; Chen, Amy; Fei, Baowei

    2017-03-01

    Hyperspectral imaging (HSI) is an emerging imaging modality that can provide a noninvasive tool for cancer detection and image-guided surgery. HSI acquires high-resolution images at hundreds of spectral bands, providing big data to differentiating different types of tissue. We proposed a deep learning based method for the detection of head and neck cancer with hyperspectral images. Since the deep learning algorithm can learn the feature hierarchically, the learned features are more discriminative and concise than the handcrafted features. In this study, we adopt convolutional neural networks (CNN) to learn the deep feature of pixels for classifying each pixel into tumor or normal tissue. We evaluated our proposed classification method on the dataset containing hyperspectral images from 12 tumor-bearing mice. Experimental results show that our method achieved an average accuracy of 91.36%. The preliminary study demonstrated that our deep learning method can be applied to hyperspectral images for detecting head and neck tumors in animal models.

  11. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    Science.gov (United States)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. Today, it becomes clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” is born to address the problem of automatically determining the polarity (Positive, negative, neutral,…) of textual opinions. People expressing themselves in a particular domain often use specific domain language expressions, thus, building a classifier, which performs well in different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain for sentiment classification when using machine learning techniques. In our study three popular machine learning techniques: Support Vector Machines (SVM), Naive Bayes and K nearest neighbors(KNN) were applied on datasets collected from different domains. Experimental results show that Support Vector Machines outperforms other classifiers in all domains, since it achieved at least 74.75% accuracy with a standard deviation of 4,08.

  12. A machine learning approach to galaxy-LSS classification - I. Imprints on halo merger trees

    Science.gov (United States)

    Hui, Jianan; Aragon, Miguel; Cui, Xinping; Flegal, James M.

    2018-04-01

    The cosmic web plays a major role in the formation and evolution of galaxies and defines, to a large extent, their properties. However, the relation between galaxies and environment is still not well understood. Here, we present a machine learning approach to study imprints of environmental effects on the mass assembly of haloes. We present a galaxy-LSS machine learning classifier based on galaxy properties sensitive to the environment. We then use the classifier to assess the relevance of each property. Correlations between galaxy properties and their cosmic environment can be used to predict galaxy membership to void/wall or filament/cluster with an accuracy of 93 per cent. Our study unveils environmental information encoded in properties of haloes not normally considered directly dependent on the cosmic environment such as merger history and complexity. Understanding the physical mechanism by which the cosmic web is imprinted in a halo can lead to significant improvements in galaxy formation models. This is accomplished by extracting features from galaxy properties and merger trees, computing feature scores for each feature and then applying support vector machine (SVM) to different feature sets. To this end, we have discovered that the shape and depth of the merger tree, formation time, and density of the galaxy are strongly associated with the cosmic environment. We describe a significant improvement in the original classification algorithm by performing LU decomposition of the distance matrix computed by the feature vectors and then using the output of the decomposition as input vectors for SVM.

  13. Classification of amyotrophic lateral sclerosis disease based on convolutional neural network and reinforcement sample learning algorithm.

    Science.gov (United States)

    Sengur, Abdulkadir; Akbulut, Yaman; Guo, Yanhui; Bajaj, Varun

    2017-12-01

    Electromyogram (EMG) signals contain useful information of the neuromuscular diseases like amyotrophic lateral sclerosis (ALS). ALS is a well-known brain disease, which can progressively degenerate the motor neurons. In this paper, we propose a deep learning based method for efficient classification of ALS and normal EMG signals. Spectrogram, continuous wavelet transform (CWT), and smoothed pseudo Wigner-Ville distribution (SPWVD) have been employed for time-frequency (T-F) representation of EMG signals. A convolutional neural network is employed to classify these features. In it, Two convolution layers, two pooling layer, a fully connected layer and a lost function layer is considered in CNN architecture. The CNN architecture is trained with the reinforcement sample learning strategy. The efficiency of the proposed implementation is tested on publicly available EMG dataset. The dataset contains 89 ALS and 133 normal EMG signals with 24 kHz sampling frequency. Experimental results show 96.80% accuracy. The obtained results are also compared with other methods, which show the superiority of the proposed method.

  14. Decision rule classifiers for multi-label decision tables

    KAUST Repository

    Alsolami, Fawaz

    2014-01-01

    Recently, multi-label classification problem has received significant attention in the research community. This paper is devoted to study the effect of the considered rule heuristic parameters on the generalization error. The results of experiments for decision tables from UCI Machine Learning Repository and KEEL Repository show that rule heuristics taking into account both coverage and uncertainty perform better than the strategies taking into account a single criterion. © 2014 Springer International Publishing.

  15. Learning Classification Models of Cognitive Conditions from Subtle Behaviors in the Digital Clock Drawing Test.

    Science.gov (United States)

    Souillard-Mandar, William; Davis, Randall; Rudin, Cynthia; Au, Rhoda; Libon, David J; Swenson, Rodney; Price, Catherine C; Lamar, Melissa; Penney, Dana L

    2016-03-01

    The Clock Drawing Test - a simple pencil and paper test - has been used for more than 50 years as a screening tool to differentiate normal individuals from those with cognitive impairment, and has proven useful in helping to diagnose cognitive dysfunction associated with neurological disorders such as Alzheimer's disease, Parkinson's disease, and other dementias and conditions. We have been administering the test using a digitizing ballpoint pen that reports its position with considerable spatial and temporal precision, making available far more detailed data about the subject's performance. Using pen stroke data from these drawings categorized by our software, we designed and computed a large collection of features, then explored the tradeoffs in performance and interpretability in classifiers built using a number of different subsets of these features and a variety of different machine learning techniques. We used traditional machine learning methods to build prediction models that achieve high accuracy. We operationalized widely used manual scoring systems so that we could use them as benchmarks for our models. We worked with clinicians to define guidelines for model interpretability, and constructed sparse linear models and rule lists designed to be as easy to use as scoring systems currently used by clinicians, but more accurate. While our models will require additional testing for validation, they offer the possibility of substantial improvement in detecting cognitive impairment earlier than currently possible, a development with considerable potential impact in practice.

  16. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    Science.gov (United States)

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  17. ACTIVE LEARNING TO OVERCOME SAMPLE SELECTION BIAS: APPLICATION TO PHOTOMETRIC VARIABLE STAR CLASSIFICATION

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Berian James, J. [Astronomy Department, University of California, Berkeley, CA 94720-7450 (United States); Brink, Henrik [Dark Cosmology Centre, Juliane Maries Vej 30, 2100 Copenhagen O (Denmark); Long, James P.; Rice, John, E-mail: jwrichar@stat.berkeley.edu [Statistics Department, University of California, Berkeley, CA 94720-7450 (United States)

    2012-01-10

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL-where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up-is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.

  18. Supervised learning classification models for prediction of plant virus encoded RNA silencing suppressors.

    Directory of Open Access Journals (Sweden)

    Zeenia Jagga

    Full Text Available Viral encoded RNA silencing suppressor proteins interfere with the host RNA silencing machinery, facilitating viral infection by evading host immunity. In plant hosts, the viral proteins have several basic science implications and biotechnology applications. However in silico identification of these proteins is limited by their high sequence diversity. In this study we developed supervised learning based classification models for plant viral RNA silencing suppressor proteins in plant viruses. We developed four classifiers based on supervised learning algorithms: J48, Random Forest, LibSVM and Naïve Bayes algorithms, with enriched model learning by correlation based feature selection. Structural and physicochemical features calculated for experimentally verified primary protein sequences were used to train the classifiers. The training features include amino acid composition; auto correlation coefficients; composition, transition, and distribution of various physicochemical properties; and pseudo amino acid composition. Performance analysis of predictive models based on 10 fold cross-validation and independent data testing revealed that the Random Forest based model was the best and achieved 86.11% overall accuracy and 86.22% balanced accuracy with a remarkably high area under the Receivers Operating Characteristic curve of 0.95 to predict viral RNA silencing suppressor proteins. The prediction models for plant viral RNA silencing suppressors can potentially aid identification of novel viral RNA silencing suppressors, which will provide valuable insights into the mechanism of RNA silencing and could be further explored as potential targets for designing novel antiviral therapeutics. Also, the key subset of identified optimal features may help in determining compositional patterns in the viral proteins which are important determinants for RNA silencing suppressor activities. The best prediction model developed in the study is available as a

  19. ACTIVE LEARNING TO OVERCOME SAMPLE SELECTION BIAS: APPLICATION TO PHOTOMETRIC VARIABLE STAR CLASSIFICATION

    International Nuclear Information System (INIS)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Berian James, J.; Brink, Henrik; Long, James P.; Rice, John

    2012-01-01

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.

  20. Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

    Science.gov (United States)

    Richards, Joseph W.; Starr, Dan L.; Brink, Henrik; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; James, J. Berian; Long, James P.; Rice, John

    2012-01-01

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.