WorldWideScience

Sample records for supervised classification technique

  1. Sentiment Analysis of Twitter tweets using supervised classification technique

    Directory of Open Access Journals (Sweden)

    Pranav Waykar

    2016-05-01

    Full Text Available Making use of social media for analyzing the perceptions of the masses over a product, event or a person has gained momentum in recent times. Out of a wide array of social networks, we chose Twitter for our analysis, as the opinions expressed there are concise and bear a distinctive polarity. Here, we collect the most recent tweets on the user's area of interest and analyze them. The extracted tweets are then segregated as positive, negative and neutral. We do the classification in the following manner: we collect the tweets using the Twitter API; we then process the collected tweets to convert all letters to lowercase, eliminate special characters, etc., which makes the classification more efficient; finally, the processed tweets are classified using a supervised classification technique. We make use of a Naive Bayes classifier to segregate the tweets as positive, negative and neutral. We use a set of sample tweets to train the classifier. The percentage of tweets in each category is then computed and the result is represented graphically. The result can be used further to gain an insight into the views of the people using Twitter about a particular topic that is being searched by the user. It can help corporate houses devise strategies on the basis of the popularity of their product among the masses. It may also help consumers make informed choices based on the general sentiment expressed by Twitter users about a product.
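
    A minimal sketch of the kind of pipeline this abstract describes, using scikit-learn's CountVectorizer and MultinomialNB rather than any code released by the authors; the tweets, labels and preprocessing rule are illustrative placeholders, not data from the study.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def preprocess(tweet: str) -> str:
    """Lowercase the tweet and drop special characters, as described above."""
    return re.sub(r"[^a-z0-9\s@#]", "", tweet.lower())


# Placeholder training tweets with their polarity labels.
train_tweets = ["I love this phone", "terrible battery life :(",
                "the parcel arrived today", "great camera, awful screen"]
train_labels = ["positive", "negative", "neutral", "positive"]

model = make_pipeline(CountVectorizer(preprocessor=preprocess), MultinomialNB())
model.fit(train_tweets, train_labels)

# Classify newly collected tweets and report the share of each polarity.
new_tweets = ["battery is awful", "really love the camera", "delivered on time"]
predictions = model.predict(new_tweets)
for label in ("positive", "negative", "neutral"):
    share = (predictions == label).mean() * 100
    print(f"{label}: {share:.0f}%")
```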

  2. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam emails invade users' mailboxes without their consent and fill them up. They consume network capacity as well as time spent checking and deleting spam mails. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income to spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated, and when the countermeasures are overly sensitive, even legitimate emails are eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. In recent years, machine learning for spam classification has become an important research issue. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail and evaluates their effectiveness. A comparative analysis among the algorithms has also been presented.

  3. Supervised Ensemble Classification of Kepler Variable Stars

    CERN Document Server

    Bass, Gideon

    2016-01-01

    Variable star analysis and classification is an important task in the understanding of stellar features and processes. While historically classifications have been done manually by highly skilled experts, the recent and rapid expansion in the quantity and quality of data has demanded new techniques, most notably automatic classification through supervised machine learning. We present an expansion of existing work in the field by analyzing variable stars in the Kepler field using an ensemble approach, combining multiple characterization and classification techniques to produce improved classification rates. Classifications for each of the roughly 150,000 stars observed by Kepler are produced, separating the stars into one of 14 variable star classes.

  4. Classification of damage in structural systems using time series analysis and supervised and unsupervised pattern recognition techniques

    Science.gov (United States)

    Omenzetter, Piotr; de Lautour, Oliver R.

    2010-04-01

    Developed for studying long, periodic records of various measured quantities, time series analysis methods are inherently suited and offer interesting possibilities for Structural Health Monitoring (SHM) applications. However, their use in SHM can still be regarded as an emerging application and deserves more studies. In this research, Autoregressive (AR) models were used to fit experimental acceleration time histories from two experimental structural systems, a 3-storey bookshelf-type laboratory structure and the ASCE Phase II SHM Benchmark Structure, in healthy and several damaged states. The coefficients of the AR models were chosen as damage sensitive features. Preliminary visual inspection of the large, multidimensional sets of AR coefficients to check the presence of clusters corresponding to different damage severities was achieved using Sammon mapping - an efficient nonlinear data compression technique. Systematic classification of damage into states based on the analysis of the AR coefficients was achieved using two supervised classification techniques: Nearest Neighbor Classification (NNC) and Learning Vector Quantization (LVQ), and one unsupervised technique: Self-organizing Maps (SOM). This paper discusses the performance of AR coefficients as damage sensitive features and compares the efficiency of the three classification techniques using experimental data.
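
    The following is a hedged sketch of the feature pipeline described above: estimate AR coefficients for each acceleration record by least squares and feed them to a nearest neighbour classifier. The synthetic signals, AR order and two damage states are assumptions for illustration, not the paper's experimental data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def ar_coefficients(signal: np.ndarray, order: int = 4) -> np.ndarray:
    """Least-squares estimate of AR(order) coefficients for one record."""
    X = np.column_stack([signal[i:len(signal) - order + i] for i in range(order)])
    y = signal[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs


rng = np.random.default_rng(0)
features, labels = [], []
for state in (0, 1):                      # 0 = healthy, 1 = damaged (illustrative)
    a1 = 0.6 if state == 0 else 0.3       # damage shifts the AR dynamics
    for _ in range(20):
        e = rng.normal(size=500)
        x = np.zeros(500)
        for t in range(2, 500):           # simulate an AR(2)-like acceleration record
            x[t] = a1 * x[t - 1] - 0.2 * x[t - 2] + e[t]
        features.append(ar_coefficients(x))
        labels.append(state)

# Nearest neighbour classification of damage state from the AR coefficients.
clf = KNeighborsClassifier(n_neighbors=3).fit(features[:-5], labels[:-5])
print(clf.predict(features[-5:]), labels[-5:])
```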

  5. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    Science.gov (United States)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. Today, it is becoming clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” has emerged to address the problem of automatically determining the polarity (positive, negative, neutral, …) of textual opinions. People expressing themselves in a particular domain often use specific domain language expressions, thus, building a classifier which performs well in different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain for sentiment classification when using machine learning techniques. In our study three popular machine learning techniques: Support Vector Machines (SVM), Naive Bayes and K-nearest neighbors (KNN) were applied on datasets collected from different domains. Experimental results show that Support Vector Machines outperforms the other classifiers in all domains, since it achieved at least 74.75% accuracy with a standard deviation of 4.08.

  6. Projected estimators for robust semi-supervised classification

    DEFF Research Database (Denmark)

    Krijthe, Jesse H.; Loog, Marco

    2017-01-01

    For semi-supervised techniques to be applied safely in practice we at least want methods to outperform their supervised counterparts. We study this question for classification using the well-known quadratic surrogate loss function. Unlike other approaches to semi-supervised learning, the procedure proposed in this work does not rely on assumptions that are not intrinsic to the classifier at hand. Using a projection of the supervised estimate onto a set of constraints imposed by the unlabeled data, we find we can safely improve over the supervised solution in terms of this quadratic loss. More specifically, we prove that, measured on the labeled and unlabeled training data, this semi-supervised procedure never gives a lower quadratic loss than the supervised alternative. To our knowledge this is the first approach that offers such strong, albeit conservative, guarantees for improvement over...

  7. A New Method for Solving Supervised Data Classification Problems

    Directory of Open Access Journals (Sweden)

    Parvaneh Shabanzadeh

    2014-01-01

    Full Text Available Supervised data classification is one of the techniques used to extract nontrivial information from data. Classification is a widely used technique in various fields, including data mining, industry, medicine, science, and law. This paper considers a new algorithm for supervised data classification problems associated with the cluster analysis. The mathematical formulations for this algorithm are based on nonsmooth, nonconvex optimization. A new algorithm for solving this optimization problem is utilized. The new algorithm uses a derivative-free technique, with robustness and efficiency. To improve classification performance and efficiency in generating classification model, a new feature selection algorithm based on techniques of convex programming is suggested. Proposed methods are tested on real-world datasets. Results of numerical experiments have been presented which demonstrate the effectiveness of the proposed algorithms.

  8. Supervised Classification of Agricultural Land Cover Using a Modified k-NN Technique (MNN) and Landsat Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    Karsten Schulz

    2009-11-01

    Full Text Available Nearest neighbor techniques are commonly used in remote sensing, pattern recognition and statistics to classify objects into a predefined number of categories based on a given set of predictors. These techniques are especially useful for highly nonlinear relationships between the variables. In most studies the distance measure is adopted a priori. In contrast we propose a general procedure to find an adaptive metric that combines a local variance reducing technique and a linear embedding of the observation space into an appropriate Euclidean space. To illustrate the application of this technique, two agricultural land cover classifications using mono-temporal and multi-temporal Landsat scenes are presented. The results of the study, compared with standard approaches used in remote sensing such as maximum likelihood (ML) or k-Nearest Neighbor (k-NN), indicate substantial improvement with regard to the overall accuracy and the cardinality of the calibration data set. Also, using MNN in a soft/fuzzy classification framework proved to be a very useful tool for deriving critical areas that need some further attention and investment concerning additional calibration data.

  9. Genetic classification of populations using supervised learning.

    LENUS (Irish Health Repository)

    Bridges, Michael

    2011-01-01

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies.

  10. 7 CFR 27.80 - Fees; classification, Micronaire, and supervision.

    Science.gov (United States)

    2010-01-01

    Fragmentary excerpt from 7 CFR § 27.80 (2010), "Fees; classification, Micronaire, and supervision" (subpart on Classification and Micronaire): fees for services rendered, covering classification and Micronaire determination results certified on cotton class certificates, and supervision.

  11. Document Classification Using Expectation Maximization with Semi Supervised Learning

    CERN Document Server

    Nigam, Bhawna; Salve, Sonal; Vamney, Swati

    2011-01-01

    As the amount of online documents increases, the demand for document classification to aid the analysis and management of documents is increasing. Text is cheap, but information, in the form of knowing what classes a document belongs to, is expensive. The main purpose of this paper is to explain the expectation maximization technique of data mining to classify documents and to learn how to improve the accuracy while using a semi-supervised approach. The expectation maximization algorithm is applied with both supervised and semi-supervised approaches. It is found that the semi-supervised approach is more accurate and effective. The main advantage of the semi-supervised approach is the dynamic generation of new classes. The algorithm first trains a classifier using the labeled documents and probabilistically classifies the unlabeled documents. The car dataset used for evaluation was collected from the UCI repository, with some changes made by the authors.
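
    Below is a hedged sketch of the EM-with-unlabeled-documents idea summarized above (in the spirit of the abstract, not the authors' implementation): fit Naive Bayes on the labeled documents, then alternate between soft-labeling the unlabeled documents (E-step) and refitting on the weighted union (M-step). The documents and class ids are made-up placeholders.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Placeholder labeled and unlabeled documents.
labeled_docs = ["cheap sedan for sale", "fast sports coupe", "family minivan deal"]
labeled_y = np.array([0, 1, 0])
unlabeled_docs = ["coupe with racing trim", "discount sedan offer"]

vec = CountVectorizer()
X_lab = vec.fit_transform(labeled_docs).toarray()
X_unl = vec.transform(unlabeled_docs).toarray()

nb = MultinomialNB().fit(X_lab, labeled_y)        # initial supervised model
classes = nb.classes_

for _ in range(10):                               # EM iterations
    probs = nb.predict_proba(X_unl)               # E-step: soft labels
    # M-step: refit on the labeled docs plus one copy of each unlabeled doc
    # per class, weighted by its posterior probability for that class.
    X_all = np.vstack([X_lab] + [X_unl] * len(classes))
    y_all = np.concatenate(
        [labeled_y] + [np.full(len(unlabeled_docs), c) for c in classes])
    w_all = np.concatenate(
        [np.ones(len(labeled_y))] + [probs[:, i] for i in range(len(classes))])
    nb = MultinomialNB().fit(X_all, y_all, sample_weight=w_all)

print(nb.predict(X_unl))                          # final labels for the unlabeled docs
```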

  12. Supervised Classification Performance of Multispectral Images

    CERN Document Server

    Perumal, K

    2010-01-01

    Nowadays government and private agencies use remote sensing imagery for a wide range of applications, from military applications to farm development. The images may be panchromatic, multispectral, hyperspectral or even ultraspectral, amounting to terabytes of data. Remote sensing image classification is one of the most significant applications of remote sensing. A number of image classification algorithms have demonstrated good precision in classifying remote sensing data. But, of late, due to the increasing spatiotemporal dimensions of the remote sensing data, traditional classification algorithms have exposed weaknesses, necessitating further research in the field of remote sensing image classification. So an efficient classifier is needed to classify the remote sensing images to extract information. We experiment with both supervised and unsupervised classification. Here we compare the different classification methods and their performances. It is found that the Mahalanobis classifier performed the best in our...

  13. Generative supervised classification using Dirichlet process priors.

    Science.gov (United States)

    Davy, Manuel; Tourneret, Jean-Yves

    2010-10-01

    Choosing the appropriate parameter prior distributions associated to a given Bayesian model is a challenging problem. Conjugate priors can be selected for simplicity motivations. However, conjugate priors can be too restrictive to accurately model the available prior information. This paper studies a new generative supervised classifier which assumes that the parameter prior distributions conditioned on each class are mixtures of Dirichlet processes. The motivation for using mixtures of Dirichlet processes is their known ability to model accurately a large class of probability distributions. A Monte Carlo method allowing one to sample according to the resulting class-conditional posterior distributions is then studied. The parameters appearing in the class-conditional densities can then be estimated using these generated samples (following Bayesian learning). The proposed supervised classifier is applied to the classification of altimetric waveforms backscattered from different surfaces (oceans, ices, forests, and deserts). This classification is a first step before developing tools allowing for the extraction of useful geophysical information from altimetric waveforms backscattered from nonoceanic surfaces.

  14. Semi-supervised Learning for Photometric Supernova Classification

    CERN Document Server

    Richards, Joseph W; Freeman, Peter E; Schafer, Chad M; Poznanski, Dovi

    2011-01-01

    We present a semi-supervised method for photometric supernova typing. Our approach is to first use the nonlinear dimension reduction technique diffusion map to detect structure in a database of supernova light curves and subsequently employ random forest classification on a spectroscopically confirmed training set to learn a model that can predict the type of each newly observed supernova. We demonstrate that this is an effective method for supernova typing. As supernova numbers increase, our semi-supervised method efficiently utilizes this information to improve classification, a property not enjoyed by template based methods. Applied to supernova data simulated by Kessler et al. (2010b) to mimic those of the Dark Energy Survey, our methods achieve (cross-validated) 96% Type Ia purity and 86% Type Ia efficiency on the spectroscopic sample, but only 56% Type Ia purity and 48% efficiency on the photometric sample due to their spectroscopic followup strategy. To improve the performance on the photometric sample...

  15. [RVM supervised feature extraction and Seyfert spectra classification].

    Science.gov (United States)

    Li, Xiang-Ru; Hu, Zhan-Yi; Zhao, Yong-Heng; Li, Xiao-Ming

    2009-06-01

    With recent technological advances in wide field survey astronomy and the implementation of several large-scale astronomical survey proposals (e.g. SDSS, 2dF and LAMOST), celestial spectra are becoming very abundant and rich. Therefore, research on automated classification methods based on celestial spectra has been attracting more and more attention in recent years. Feature extraction is a fundamental problem in automated spectral classification, which not only influences the difficulty and complexity of the problem, but also determines the performance of the designed classifying system. The available methods of feature extraction for spectra classification are usually unsupervised, e.g. principal components analysis (PCA), wavelet transform (WT), artificial neural networks (ANN) and Rough Set theory. These methods extract features not by their capability to classify spectra, but by some kind of power to approximate the original celestial spectra. Therefore, the features extracted by these methods are usually not the best ones for classification. In the present work, the authors point out the necessity of investigating supervised feature extraction by analyzing the characteristics of spectra classification research in the available literature and the limitations of unsupervised feature extraction methods. The authors also study supervised feature extraction based on the relevance vector machine (RVM) and its application in Seyfert spectra classification. RVM is a recently introduced method based on Bayesian methodology, automatic relevance determination (ARD), regularization techniques and a hierarchical prior structure. With this method, one can easily fuse the information in the training data with prior knowledge and beliefs about the problem. RVM can effectively extract features and reduce the data based on classifying capability. Extensive experiments show its superior performance in dimensional reduction and feature extraction for Seyfert

  16. Automatic age and gender classification using supervised appearance model

    Science.gov (United States)

    Bukar, Ali Maina; Ugail, Hassan; Connah, David

    2016-11-01

    Age and gender classification are two important problems that recently gained popularity in the research community, due to their wide range of applications. Research has shown that both age and gender information are encoded in the face shape and texture, hence the active appearance model (AAM), a statistical model that captures shape and texture variations, has been one of the most widely used feature extraction techniques for the aforementioned problems. However, AAM suffers from some drawbacks, especially when used for classification. This is primarily because principal component analysis (PCA), which is at the core of the model, works in an unsupervised manner, i.e., PCA dimensionality reduction does not take into account how the predictor variables relate to the response (class labels). Rather, it explores only the underlying structure of the predictor variables, thus, it is no surprise if PCA discards valuable parts of the data that represent discriminatory features. Toward this end, we propose a supervised appearance model (sAM) that improves on AAM by replacing PCA with partial least-squares regression. This feature extraction technique is then used for the problems of age and gender classification. Our experiments show that sAM has better predictive power than the conventional AAM.
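
    A small illustration of the core idea, under the assumption that the supervised dimensionality reduction step can be mimicked with scikit-learn's PLSRegression: compare classification on PCA scores (label-agnostic) with classification on PLS scores (label-aware). The data is synthetic, standing in for AAM shape/texture parameters, and the downstream logistic regression classifier is an arbitrary choice for the sketch.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for AAM shape/texture parameters and binary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 50))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Unsupervised reduction: PCA ignores the labels.
pca = PCA(n_components=5).fit(X_tr)
clf_pca = LogisticRegression().fit(pca.transform(X_tr), y_tr)

# Supervised reduction: PLS components are chosen to covary with the labels.
pls = PLSRegression(n_components=5).fit(X_tr, y_tr)
clf_pls = LogisticRegression().fit(pls.transform(X_tr), y_tr)

print("PCA features:", clf_pca.score(pca.transform(X_te), y_te))
print("PLS features:", clf_pls.score(pls.transform(X_te), y_te))
```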

  17. Supervision That Improves Teaching: Strategies and Techniques.

    Science.gov (United States)

    Sullivan, Susan; Glanz, Jeffrey

    This book offers a plan for improved classroom practice through the supervisory process. It includes hands-on practices for developing a personalized supervision strategy, research-based and empirically tested strategies, field-tested tools and techniques for qualitative and quantitative observation, a comprehensive resource of traditional and…

  18. Comparison of Supervised and Unsupervised Learning Algorithms for Pattern Classification

    Directory of Open Access Journals (Sweden)

    R. Sathya

    2013-02-01

    Full Text Available This paper presents a comparative account of unsupervised and supervised learning models and their pattern classification evaluations as applied to the higher education scenario. Classification plays a vital role in machine based learning algorithms and in the present study, we found that, though the error back-propagation learning algorithm provided by the supervised learning model is very efficient for a number of non-linear real-time problems, the KSOM of the unsupervised learning model offers an efficient solution and classification in the present study.

  19. Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data

    OpenAIRE

    Kurth, Thorsten; Zhang, Jian; Satish, Nadathur; Mitliagkas, Ioannis; Racah, Evan; Patwary, Mostofa Ali; Malas, Tareq; Sundaram, Narayanan; Bhimji, Wahid; Smorkalov, Mikhail; Deslippe, Jack; Shiryaev, Mikhail; Sridharan, Srinivas; Prabhat; Dubey, Pradeep

    2017-01-01

    This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation obtains ~2 TFLOP/s on a single Cori Phase-II Xeon-Phi node. We use a hybrid strategy employin...

  20. Semi-supervised SVM for individual tree crown species classification

    Science.gov (United States)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.
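
    The ITC-S3VM itself is not available in standard libraries; as a generic, hedged illustration of semi-supervised SVM classification, the sketch below wraps an SVC in scikit-learn's SelfTrainingClassifier, marking unlabeled samples with -1. The synthetic data and the 80% unlabeled fraction are arbitrary choices for demonstration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
y_semi = y.copy()
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) < 0.8          # hide 80% of the labels
y_semi[unlabeled] = -1                        # -1 marks unlabeled samples

base_svm = SVC(probability=True, gamma="scale")
model = SelfTrainingClassifier(base_svm, threshold=0.8).fit(X, y_semi)

print("accuracy on the hidden labels:",
      (model.predict(X[unlabeled]) == y[unlabeled]).mean())
```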

  1. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Ardjan Zwartjes

    2016-10-01

    Full Text Available In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network lifetime. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning based algorithms using sampled data. An important issue, however, is the training phase of these learning based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning on the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  2. Enhanced manifold regularization for semi-supervised classification.

    Science.gov (United States)

    Gan, Haitao; Luo, Zhizeng; Fan, Yingle; Sang, Nong

    2016-06-01

    Manifold regularization (MR) has become one of the most widely used approaches in the semi-supervised learning field. It has shown superiority by exploiting the local manifold structure of both labeled and unlabeled data. The manifold structure is modeled by constructing a Laplacian graph and then incorporated in learning through a smoothness regularization term. Hence the labels of labeled and unlabeled data vary smoothly along the geodesics on the manifold. However, MR has ignored the discriminative ability of the labeled and unlabeled data. To address the problem, we propose an enhanced MR framework for semi-supervised classification in which the local discriminative information of the labeled and unlabeled data is explicitly exploited. To make full use of labeled data, we firstly employ a semi-supervised clustering method to discover the underlying data space structure of the whole dataset. Then we construct a local discrimination graph to model the discriminative information of labeled and unlabeled data according to the discovered intrinsic structure. Therefore, the data points that may be from different clusters, though similar on the manifold, are enforced far away from each other. Finally, the discrimination graph is incorporated into the MR framework. In particular, we utilize semi-supervised fuzzy c-means and Laplacian regularized Kernel minimum squared error for semi-supervised clustering and classification, respectively. Experimental results on several benchmark datasets and face recognition demonstrate the effectiveness of our proposed method.

  3. Benchmarking protein classification algorithms via supervised cross-validation.

    Science.gov (United States)

    Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

    2008-04-24

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. A combination of supervised and

  4. Supervised Classification Methods for Seismic Phase Identification

    Science.gov (United States)

    Schneider, Jeff; Given, Jeff; Le Bras, Ronan; Fisseha, Misrak

    2010-05-01

    The Comprehensive Nuclear Test Ban Treaty Organization (CTBTO) is tasked with monitoring compliance with the CTBT. The organization is installing the International Monitoring System (IMS), a global network of seismic, hydroacoustic, infrasound, and radionuclide sensor stations. The International Data Centre (IDC) receives the data from seismic stations either in real time or on request. These data are first processed on a station per station basis. This initial step yields discrete detections which are then assembled on a network basis (with the addition of hydroacoustic and infrasound data) to produce automatic and analyst reviewed bulletins containing seismic, hydroacoustic, and infrasound detections. The initial station processing step includes the identification of seismic and acoustic phases which are given a label. Subsequent network processing relies on this preliminary labeling, and as a consequence, the accuracy and reliability of automatic and reviewed bulletins also depend on this initial step. A very large ground truth database containing massive amounts of detections with analyst-reviewed labels is available to improve on the current operational system using machine learning methods. An initial study using a limited amount of data was conducted during the ISS09 project of the CTBTO. Several classification methods were tested: decision tree with bagging; logistic regression; neural networks trained with back-propagation; Bayesian networks as generative class models; naive Bayes classification; support vector machines. The initial assessment was that the phase identification process could be improved by at least 13% over the current operational system and that the method obtaining the best results was the decision tree with bagging. We present the results of a study using a much larger learning dataset and preliminary implementation results.
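
    A brief, hedged sketch of the best-performing method reported above, a decision tree with bagging, applied to made-up stand-in features; an operational system would instead use station-level detection attributes with analyst-reviewed phase labels.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic detection attributes with four stand-in phase labels.
X, y = make_classification(n_samples=2000, n_features=15, n_informative=8,
                           n_classes=4, random_state=0)

bagged_tree = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                                random_state=0)
scores = cross_val_score(bagged_tree, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```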

  5. A review of supervised object-based land-cover image classification

    Science.gov (United States)

    Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue

    2017-08-01

    Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial

  6. Supervised and unsupervised classification - The case of IRAS point sources

    Science.gov (United States)

    Adorf, Hans-Martin; Meurs, E. J. A.

    Progress is reported on a project which aims at mapping the extragalactic sky in order to derive the large scale distribution of luminous matter. The approach consists in selecting from the IRAS Point Source Catalog a set of galaxies which is as clean and as complete as possible. The decision and discrimination problems involved lend themselves to a treatment using methods from multivariate statistics, in particular statistical pattern recognition. Two different approaches, one based on supervised Bayesian classification, the other on unsupervised data-driven classification, are presented and some preliminary results are reported.

  7. Phenotype classification of zebrafish embryos by supervised learning.

    Directory of Open Access Journals (Sweden)

    Nathalie Jeanray

    Full Text Available Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3 days old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  8. Phenotype classification of zebrafish embryos by supervised learning.

    Science.gov (United States)

    Jeanray, Nathalie; Marée, Raphaël; Pruvot, Benoist; Stern, Olivier; Geurts, Pierre; Wehenkel, Louis; Muller, Marc

    2015-01-01

    Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3 days old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  9. Quintic spline smooth semi-supervised support vector classification machine

    Institute of Scientific and Technical Information of China (English)

    Xiaodan Zhang; Jinggai Ma; Aihua Li; Ang Li

    2015-01-01

    A semi-supervised vector machine is a relatively new learning method using both labeled and unlabeled data in classification. Since the objective function of the model for an unconstrained semi-supervised vector machine is not smooth, many fast optimization algorithms cannot be applied to solve the model. In order to overcome the difficulty of dealing with non-smooth objective functions, new methods that can solve the semi-supervised vector machine with the desired classification accuracy are in great demand. A quintic spline function with three-times differentiability at the origin is constructed by a general three-moment method, which can be used to approximate the symmetric hinge loss function. The approximation accuracy of the quintic spline function is estimated. Moreover, a quintic spline smooth semi-supervised support vector machine is obtained and the convergence accuracy of the smooth model to the non-smooth one is analyzed. Three experiments are performed to test the efficiency of the model. The experimental results show that the new model outperforms other smooth models in terms of classification performance. Furthermore, the new model is not sensitive to the increasing number of labeled samples, which means that the new model is more efficient.

  10. Random forest automated supervised classification of Hipparcos periodic variable stars

    CERN Document Server

    Dubath, P; Süveges, M; Blomme, J; López, M; Sarro, L M; De Ridder, J; Cuypers, J; Guy, L; Lecoeur, I; Nienartowicz, K; Jan, A; Beck, M; Mowlavi, N; De Cat, P; Lebzelter, T; Eyer, L

    2011-01-01

    We present an evaluation of the performance of an automated classification of the Hipparcos periodic variable stars into 26 types. The sub-sample with the most reliable variability types available in the literature is used to train supervised algorithms to characterize the type dependencies on a number of attributes. The most useful attributes evaluated with the random forest methodology include, in decreasing order of importance, the period, the amplitude, the V-I colour index, the absolute magnitude, the residual around the folded light-curve model, the magnitude distribution skewness and the amplitude of the second harmonic of the Fourier series model relative to that of the fundamental frequency. Random forests and a multi-stage scheme involving Bayesian network and Gaussian mixture methods lead to statistically equivalent results. In standard 10-fold cross-validation experiments, the rate of correct classification is between 90 and 100%, depending on the variability type. The main mis-classification case...

  11. Supervised and Unsupervised Classification for Pattern Recognition Purposes

    Directory of Open Access Journals (Sweden)

    Catalina COCIANU

    2006-01-01

    Full Text Available A cluster analysis task has to identify the grouping trends of data, to decide on the sound clusters as well as to validate somehow the resulting structure. The identification of the grouping tendency existing in a data collection assumes the selection of a framework stated in terms of a mathematical model allowing one to express the similarity degree between couples of particular objects, with quasi-metrics expressing the similarity between an object and a cluster and between clusters, respectively. In supervised classification, we are provided with a collection of preclassified patterns, and the problem is to label a newly encountered pattern. Typically, the given training patterns are used to learn the descriptions of classes, which in turn are used to label a new pattern. The final section of the paper presents a new methodology for supervised learning based on PCA. The classes are represented in the measurement/feature space by continuous repartitions

  12. Artificial neural network classification using a minimal training set - Comparison to conventional supervised classification

    Science.gov (United States)

    Hepner, George F.; Logan, Thomas; Ritter, Niles; Bryant, Nevin

    1990-01-01

    Recent research has shown an artificial neural network (ANN) to be capable of pattern recognition and the classification of image data. This paper examines the potential for the application of neural network computing to satellite image processing. A second objective is to provide a preliminary comparison of conventional and ANN classification. An artificial neural network can be trained to do land-cover classification of satellite imagery using selected sites representative of each class, in a manner similar to conventional supervised classification. One of the major problems associated with recognition and classification of patterns from remotely sensed data is the time and cost of developing a set of training sites. This research compares the use of an ANN back-propagation classification procedure with a conventional supervised maximum likelihood classification procedure using a minimal training set. When using a minimal training set, the neural network is able to provide a land-cover classification superior to the classification derived from the conventional classification procedure. This research is the foundation for developing application parameters for further prototyping of software and hardware implementations for artificial neural networks in satellite image and geographic information processing.
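
    The sketch below illustrates the comparison on synthetic data, assuming the conventional maximum likelihood classifier can be approximated by quadratic discriminant analysis and the ANN by a small multilayer perceptron; the minimal training set size and class structure are arbitrary choices, not the study's imagery.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for per-pixel spectral features and land-cover classes.
X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)
# Keep only a deliberately minimal training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=80, stratify=y, random_state=0)

ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X_train, y_train)
ml = QuadraticDiscriminantAnalysis().fit(X_train, y_train)  # Gaussian ML classifier

print("ANN accuracy:        ", ann.score(X_test, y_test))
print("Gaussian ML accuracy:", ml.score(X_test, y_test))
```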

  13. Automatic Building Detection based on Supervised Classification using High Resolution Google Earth Images

    OpenAIRE

    Ghaffarian, S.

    2014-01-01

    This paper presents a novel approach to detect buildings by automating the training area collection stage for supervised classification. The method is based on the fact that a 3D building structure should cast a shadow under suitable imaging conditions. Therefore, the methodology begins with the detection and masking out of the shadow areas using the luminance component of the LAB color space, which indicates the lightness of the image, and a novel double thresholding technique. Furth...

  14. Semi Supervised Weighted K-Means Clustering for Multi Class Data Classification

    Directory of Open Access Journals (Sweden)

    Vijaya Geeta Dharmavaram

    2013-01-01

    Full Text Available Supervised learning techniques require a large number of labeled examples to train a classifier model. Research on semi-supervised learning is motivated by the availability of unlabeled examples in abundance, even in domains with a limited number of labeled examples. In such domains a semi-supervised classifier uses the results of clustering for classifier development, since clustering does not rely only on labeled examples as it groups the objects based on their similarities. In this paper, the authors propose a new algorithm for semi-supervised classification, namely Semi Supervised Weighted K-Means (SSWKM). In this algorithm, the authors suggest the use of a weighted Euclidean distance metric, designed for the purpose of clustering, to estimate the proximity between a pair of points, and use it for building a semi-supervised classifier. The authors propose a new approach for estimating the weights of features by appropriately adopting the results of multiple discriminant analysis. The proposed method was tested on benchmark datasets from the UCI repository with varying percentages of labeled examples and found to be consistent and promising.
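
    The following is a loose, hedged sketch of the idea behind SSWKM rather than the authors' exact algorithm: derive per-feature weights from a discriminant analysis of the labeled examples, apply them as a weighted Euclidean metric by rescaling the features, cluster with k-means, and label each cluster by majority vote of its labeled members. The data and the 10% labeled fraction are synthetic assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
rng = np.random.default_rng(0)
labeled = rng.random(len(y)) < 0.1                # only ~10% of labels available

# Per-feature weights from a discriminant analysis of the labeled examples.
lda = LinearDiscriminantAnalysis().fit(X[labeled], y[labeled])
weights = np.abs(lda.coef_).sum(axis=0)
weights /= weights.sum()

# Rescaling by sqrt(w) makes ordinary k-means use the weighted Euclidean metric.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X * np.sqrt(weights))

# Label each cluster by majority vote of its labeled members.
cluster_to_class = {
    c: np.bincount(y[labeled][km.labels_[labeled] == c], minlength=3).argmax()
    for c in range(3)}
pred = np.array([cluster_to_class[c] for c in km.labels_])
print("agreement with the true labels:", (pred == y).mean())
```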

  15. Weakly supervised histopathology cancer image segmentation and classification.

    Science.gov (United States)

    Xu, Yan; Zhu, Jun-Yan; Chang, Eric I-Chao; Lai, Maode; Tu, Zhuowen

    2014-04-01

    Labeling a histopathology image as having cancerous regions or not is a critical task in cancer diagnosis; it is also clinically important to segment the cancer tissues and cluster them into various classes. Existing supervised approaches for image classification and segmentation require detailed manual annotations for the cancer pixels, which are time-consuming to obtain. In this paper, we propose a new learning method, multiple clustered instance learning (MCIL) (along the line of weakly supervised learning) for histopathology image segmentation. The proposed MCIL method simultaneously performs image-level classification (cancer vs. non-cancer image), medical image segmentation (cancer vs. non-cancer tissue), and patch-level clustering (different classes). We embed the clustering concept into the multiple instance learning (MIL) setting and derive a principled solution to performing the above three tasks in an integrated framework. In addition, we introduce contextual constraints as a prior for MCIL, which further reduces the ambiguity in MIL. Experimental results on histopathology colon cancer images and cytology images demonstrate the great advantage of MCIL over the competing methods.

  16. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    Science.gov (United States)

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classification, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%), with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free-ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
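
    A hedged sketch of the model comparison described above: summarize fixed-length accelerometer windows with simple per-axis features and score random forest and k-nearest neighbour classifiers by cross-validation. The signals, window features and three behavior classes are synthetic stand-ins, not the eagle data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)


def window_features(window: np.ndarray) -> np.ndarray:
    """Per-axis mean, std, range and mean absolute sample-to-sample change."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0),
                           np.ptp(window, axis=0),
                           np.abs(np.diff(window, axis=0)).mean(axis=0)])


# Synthetic 1 s windows at 140 Hz for three stand-in behaviors (soar, flap, sit).
X, y = [], []
for label, freq in enumerate([1.0, 4.0, 0.0]):
    for _ in range(100):
        t = np.linspace(0, 1, 140)
        accel = np.sin(2 * np.pi * freq * t)[:, None] * rng.uniform(0.5, 2.0, size=3)
        accel += rng.normal(scale=0.3, size=(140, 3))
        X.append(window_features(accel))
        y.append(label)

for name, clf in [("RF ", RandomForestClassifier(random_state=0)),
                  ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```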

  17. Semi-Supervised Learning for Classification of Protein Sequence Data

    Directory of Open Access Journals (Sweden)

    Brian R. King

    2008-01-01

    Full Text Available Protein sequence data continue to become available at an exponential rate. Annotation of functional and structural attributes of these data lags far behind, with only a small fraction of the data understood and labeled by experimental methods. Classification methods that are based on semi-supervised learning can increase the overall accuracy of classifying partly labeled data in many domains, but very few methods exist that have shown their effect on protein sequence classification. We show how proven methods from text classification can be applied to protein sequence data, as we consider both existing and novel extensions to the basic methods, and demonstrate restrictions and differences that must be considered. We demonstrate comparative results against the transductive support vector machine, and show superior results on the most difficult classification problems. Our results show that large repositories of unlabeled protein sequence data can indeed be used to improve predictive performance, particularly in situations where there are fewer labeled protein sequences available, and/or the data are highly unbalanced in nature.

  18. A new semi-supervised classification strategy combining active learning and spectral unmixing of hyperspectral data

    Science.gov (United States)

    Sun, Yanli; Zhang, Xia; Plaza, Antonio; Li, Jun; Dópido, Inmaculada; Liu, Yi

    2016-10-01

    Hyperspectral remote sensing allows for the detailed analysis of the surface of the Earth by providing high-dimensional images with hundreds of spectral bands. Hyperspectral image classification plays a significant role in hyperspectral image analysis and has been a very active research area in the last few years. In the context of hyperspectral image classification, supervised techniques (which have achieved wide acceptance) must address a difficult task due to the unbalance between the high dimensionality of the data and the limited availability of labeled training samples in real analysis scenarios. While the collection of labeled samples is generally difficult, expensive, and time-consuming, unlabeled samples can be generated in a much easier way. Semi-supervised learning offers an effective solution that can take advantage of both unlabeled and a small amount of labeled samples. Spectral unmixing is another widely used technique in hyperspectral image analysis, developed to retrieve pure spectral components and determine their abundance fractions in mixed pixels. In this work, we propose a method to perform semi-supervised hyperspectral image classification by combining the information retrieved with spectral unmixing and classification. Two kinds of samples that are highly mixed in nature are automatically selected, aiming at finding the most informative unlabeled samples. One kind is given by the samples minimizing the distance between the first two most probable classes by calculating the difference between the two highest abundances. Another kind is given by the samples minimizing the distance between the most probable class and the least probable class, obtained by calculating the difference between the highest and lowest abundances. The effectiveness of the proposed method is evaluated using a real hyperspectral data set collected by the airborne visible infrared imaging spectrometer (AVIRIS) over the Indian Pines region in Northwestern Indiana. In the
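
    As a simplified analogue of the sample-selection rules described above (using a probabilistic classifier's outputs in place of unmixing abundances), the sketch below ranks unlabeled samples by the gap between their two highest class probabilities and by the gap between their highest and lowest class probabilities; the synthetic data and logistic regression classifier are arbitrary stand-ins.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
labeled = np.arange(50)                         # small initial labeled set
unlabeled = np.arange(50, 500)

clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
proba = clf.predict_proba(X[unlabeled])
sorted_p = np.sort(proba, axis=1)

margin = sorted_p[:, -1] - sorted_p[:, -2]      # two most probable classes
spread = sorted_p[:, -1] - sorted_p[:, 0]       # most vs least probable class

# Most "mixed" unlabeled samples under each rule: candidates for semi-supervision.
rule1 = unlabeled[np.argsort(margin)[:20]]
rule2 = unlabeled[np.argsort(spread)[:20]]
print(rule1[:5], rule2[:5])
```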

  19. Random forest automated supervised classification of Hipparcos periodic variable stars

    Science.gov (United States)

    Dubath, P.; Rimoldini, L.; Süveges, M.; Blomme, J.; López, M.; Sarro, L. M.; De Ridder, J.; Cuypers, J.; Guy, L.; Lecoeur, I.; Nienartowicz, K.; Jan, A.; Beck, M.; Mowlavi, N.; De Cat, P.; Lebzelter, T.; Eyer, L.

    2011-07-01

    We present an evaluation of the performance of an automated classification of the Hipparcos periodic variable stars into 26 types. The sub-sample with the most reliable variability types available in the literature is used to train supervised algorithms to characterize the type dependencies on a number of attributes. The most useful attributes evaluated with the random forest methodology include, in decreasing order of importance, the period, the amplitude, the V-I colour index, the absolute magnitude, the residual around the folded light-curve model, the magnitude distribution skewness and the amplitude of the second harmonic of the Fourier series model relative to that of the fundamental frequency. Random forests and a multi-stage scheme involving Bayesian network and Gaussian mixture methods lead to statistically equivalent results. In standard 10-fold cross-validation (CV) experiments, the rate of correct classification is between 90 and 100 per cent, depending on the variability type. The main mis-classification cases, up to a rate of about 10 per cent, arise due to confusion between SPB and ACV blue variables and between eclipsing binaries, ellipsoidal variables and other variability types. Our training set and the predicted types for the other Hipparcos periodic stars are available online.
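
    A hedged sketch of the random forest classification and attribute ranking described above, on synthetic data; the feature names mirror the attributes listed in the abstract, but the labels are made-up stand-ins rather than the 26 Hipparcos variability types.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["period", "amplitude", "V-I colour", "absolute magnitude",
                 "model residual", "skewness", "H2/H1 amplitude ratio"]
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, len(feature_names)))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # stand-in variability types

rf = RandomForestClassifier(n_estimators=300, oob_score=True, random_state=0)
rf.fit(X, y)

# Rank the attributes by importance, as done for the Hipparcos attributes.
for name, importance in sorted(zip(feature_names, rf.feature_importances_),
                               key=lambda pair: -pair[1]):
    print(f"{name:22s} {importance:.3f}")
print("out-of-bag accuracy:", rf.oob_score_)
```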

  20. TV-SVM: Total Variation Support Vector Machine for Semi-Supervised Data Classification

    OpenAIRE

    Bresson, Xavier; Zhang, Ruiliang

    2012-01-01

    We introduce semi-supervised data classification algorithms based on total variation (TV), Reproducing Kernel Hilbert Space (RKHS), support vector machines (SVM), the Cheeger cut, and labeled and unlabeled data points. We design binary and multi-class semi-supervised classification algorithms. We compare the TV-based classification algorithms with the related Laplacian-based algorithms, and show that TV classification performs significantly better when the number of labeled data points is small.

  1. Detection and Evaluation of Cheating on College Exams using Supervised Classification

    Directory of Open Access Journals (Sweden)

    Elmano Ramalho CAVALCANTI

    2012-10-01

    Full Text Available Text mining has been used for various purposes, such as document classification and extraction of domain-specific information from text. In this paper we present a study in which text mining methodology and algorithms were properly employed for academic dishonesty (cheating) detection and evaluation on open-ended college exams, based on document classification techniques. Firstly, we propose two classification models for cheating detection by using a decision tree supervised algorithm. Then, both classifiers are compared against the result produced by a domain expert. The results point out that one of the classifiers achieved excellent quality in detecting and evaluating cheating in exams, making possible its use in real school and college environments.

  2. Supervised neural networks for the classification of structures.

    Science.gov (United States)

    Sperduti, A; Starita, A

    1997-01-01

    Standard neural networks and statistical methods are usually believed to be inadequate when dealing with complex structures because of their feature-based approach. In fact, feature-based approaches usually fail to give satisfactory solutions because of the sensitivity of the approach to the a priori selection of the features, and their incapacity to represent specific information on the relationships among the components of the structures. However, we show that neural networks can, in fact, represent and classify structured patterns. The key idea underpinning our approach is the use of the so-called "generalized recursive neuron", which is essentially a generalization to structures of a recurrent neuron. By using generalized recursive neurons, all the supervised networks developed for the classification of sequences, such as backpropagation through time networks, real-time recurrent networks, simple recurrent networks, recurrent cascade correlation networks, and neural trees can, on the whole, be generalized to structures. The results obtained by some of the above networks (with generalized recursive neurons) on the classification of logic terms are presented.

  3. Semi-Supervised Classification based on Gaussian Mixture Model for remote imagery

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Semi-Supervised Classification (SSC), which makes use of both labeled and unlabeled data to determine classification borders in feature space, has great advantages in extracting classification information from mass data. In this paper, a novel SSC method based on the Gaussian Mixture Model (GMM) is proposed, in which each class's feature space is described by one GMM. Experiments show the proposed method can achieve high classification accuracy with a small amount of labeled data. However, for the same accuracy, supervised classification methods such as Support Vector Machine, Object Oriented Classification, etc. should be provided with much more labeled data.

  4. Establishing a Supervised Classification of Global Blue Carbon Mangrove Ecosystems

    Science.gov (United States)

    Baltezar, P.

    2016-12-01

    Understanding change in mangroves over time will aid forest management systems working to protect them from overexploitation. Mangroves are one of the most carbon-dense terrestrial ecosystems on the planet and are therefore a high priority for sustainable forest management. Although they represent 1% of terrestrial cover, they could account for about 10% of global carbon emissions. The foundation of this analysis uses remote sensing to establish a supervised classification of mangrove forests for discrete regions in the Zambezi Delta of Mozambique and the Rufiji Delta of Tanzania. Open-source mapping platforms provided a dynamic space for analyzing satellite imagery in the Google Earth Engine (GEE) coding environment. C-Band Synthetic Aperture Radar data from Sentinel 1 was used in the model as a mask by optimizing SAR parameters. Exclusion metrics identified within Global Land Surface Temperature data from MODIS and the Shuttle Radar Topography Mission were used to accentuate mangrove features. Variance was accounted for in exclusion metrics by statistically calculating thresholds for radar, thermal, and elevation data. Optical imagery from the Landsat 8 archive aided a quality mosaic in extracting the highest spectral index values most appropriate for vegetative mapping. The enhanced radar, thermal, and digital elevation imagery were then incorporated into the quality mosaic. Training sites were selected from Google Earth imagery and used in the classification, with a resulting output of four mangrove cover map models for each site. The model was assessed for accuracy by observing the differences between the mangrove classification models and the reference maps. Although the model was over-predicting mangroves in non-mangrove regions, it was more accurately classifying mangrove regions established by the references. Future refinements will expand the model with an objective degree of accuracy.

  5. Automatic Building Detection based on Supervised Classification using High Resolution Google Earth Images

    Science.gov (United States)

    Ghaffarian, S.; Ghaffarian, S.

    2014-08-01

    This paper presents a novel approach to detecting buildings by automating the training-area collection stage for supervised classification. The method is based on the fact that a 3D building structure should cast a shadow under suitable imaging conditions. Therefore, the methodology begins with detecting and masking out the shadow areas using the luminance component of the LAB color space, which indicates the lightness of the image, and a novel double thresholding technique. Further, the training areas for supervised classification are selected by automatically determining a buffer zone on each building whose shadow is detected, using the shadow shape and the sun illumination direction. Thereafter, by calculating the statistical values of each buffer zone collected from the building areas, the Improved Parallelepiped Supervised Classification is executed to detect the buildings. Standard deviation thresholding is applied to the Parallelepiped classification method to improve its accuracy. Finally, simple morphological operations are conducted to remove noise and increase the accuracy of the results. The experiments were performed on a set of high resolution Google Earth images. The performance of the proposed approach was assessed by comparing its results with reference data using well-known quality measurements (Precision, Recall and F1-score) to evaluate the pixel-based and object-based performances of the proposed approach. Evaluation of the results illustrates that buildings detected from dense and suburban districts with diverse characteristics and color combinations using our proposed method have 88.4 % and 85.3 % overall pixel-based and object-based precision performances, respectively.

  6. Classification of Autism Spectrum Disorder Using Supervised Learning of Brain Connectivity Measures Extracted from Synchrostates

    CERN Document Server

    Jamal, Wasifa; Oprescu, Ioana-Anastasia; Maharatna, Koushik; Apicella, Fabio; Sicca, Federico

    2014-01-01

    Objective. The paper investigates the presence of autism using the functional brain connectivity measures derived from electro-encephalogram (EEG) of children during face perception tasks. Approach. Phase synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase synchronized states or synchrostates temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) for analyzing their EEG while processing fearful, happy and neutral faces. The minimal and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we here explored the discriminant analysis and support vector machine both with polynomial kernels for the classification task. Main results. The leave ...

  7. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    In this paper, a new supervised clustering and classification method is proposed. First, the application of discriminant partial least squares (DPLS) for the selection of a minimum number of key genes is applied on a gene expression microarray data set. Second, supervised hierarchical clustering ...

  8. Classification of autism spectrum disorder using supervised learning of brain connectivity measures extracted from synchrostates

    Science.gov (United States)

    Jamal, Wasifa; Das, Saptarshi; Oprescu, Ioana-Anastasia; Maharatna, Koushik; Apicella, Fabio; Sicca, Federico

    2014-08-01

    Objective. The paper investigates the presence of autism using the functional brain connectivity measures derived from electro-encephalogram (EEG) of children during face perception tasks. Approach. Phase synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase synchronized states or synchrostates temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) for analyzing their EEG while processing fearful, happy and neutral faces. The minimal and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we here explored the discriminant analysis and support vector machine both with polynomial kernels for the classification task. Main results. The leave one out cross-validation of the classification algorithm gives 94.7% accuracy as the best performance with corresponding sensitivity and specificity values as 85.7% and 100% respectively. Significance. The proposed method gives high classification accuracies and outperforms other contemporary research results. The effectiveness of the proposed method for classification of autistic and typical children suggests the possibility of using it on a larger population to validate it for clinical practice.
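
    The following sketch illustrates the evaluation protocol described above (leave-one-out cross-validation of a polynomial-kernel SVM with accuracy, sensitivity and specificity); it assumes scikit-learn, and the connectivity features and group labels are synthetic placeholders rather than the EEG-derived data of the study.

# Illustrative sketch only: leave-one-out cross-validation of an SVM with a
# polynomial kernel, reporting accuracy, sensitivity and specificity.
# X and y stand in for connectivity features and group labels (ASD = 1).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(1)
X = rng.normal(size=(24, 10))            # 24 subjects, 10 connectivity features
y = np.array([0] * 12 + [1] * 12)        # 12 typical, 12 ASD (placeholders)

clf = SVC(kernel="poly", degree=2, C=1.0)
y_pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())

tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
print("accuracy   :", (tp + tn) / len(y))
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))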

  9. MULTI-LABEL ASRS DATASET CLASSIFICATION USING SEMI-SUPERVISED SUBSPACE CLUSTERING

    Data.gov (United States)

    National Aeronautics and Space Administration — MULTI-LABEL ASRS DATASET CLASSIFICATION USING SEMI-SUPERVISED SUBSPACE CLUSTERING MOHAMMAD SALIM AHMED, LATIFUR KHAN, NIKUNJ OZA, AND MANDAVA RAJESWARI Abstract....

  10. Techniques & Processes of Teacher Education & Supervision for TESOL

    Institute of Scientific and Technical Information of China (English)

    Zhu Chen; Chen Shi

    2011-01-01

    This essay gives brief rationales on techniques and processes of teacher education and supervision by designing an in-service teacher education course and analyzing one of its sessions. The essay also discusses the feasibility of these teacher educat

  11. Urban Classification Techniques Using the Fusion of LiDAR and Spectral Data

    Science.gov (United States)

    2012-09-01

    probably be made to improve final results. Subject terms: fusion, multi-source, hyperspectral, multispectral, LiDAR, urban classification. ... Landsat Thematic Mapper for 1986, 1991, 1998, and 2002. A hybrid supervised-unsupervised classification technique was developed that clustered the ... The multispectral spectral resolution is not as high as that of a hyperspectral sensor. Using hyperspectral data, finer classifications could

  12. Modeling electroencephalography waveforms with semi-supervised deep belief nets: fast classification and anomaly measurement

    Science.gov (United States)

    Wulsin, D. F.; Gupta, J. R.; Mani, R.; Blanco, J. A.; Litt, B.

    2011-06-01

    Clinical electroencephalography (EEG) records vast amounts of complex human data yet is still reviewed primarily by human readers. Deep belief nets (DBNs) are a relatively new type of multi-layer neural network commonly tested on two-dimensional image data but are rarely applied to time-series data such as EEG. We apply DBNs in a semi-supervised paradigm to model EEG waveforms for classification and anomaly detection. DBN performance was comparable to standard classifiers on our EEG dataset, and classification time was found to be 1.7-103.7 times faster than the other high-performing classifiers. We demonstrate how the unsupervised step of DBN learning produces an autoencoder that can naturally be used in anomaly measurement. We compare the use of raw, unprocessed data—a rarity in automated physiological waveform analysis—with hand-chosen features and find that raw data produce comparable classification and better anomaly measurement performance. These results indicate that DBNs and raw data inputs may be more effective for online automated EEG waveform recognition than other common techniques.
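
    The reconstruction-error idea behind the anomaly measurement can be sketched with a plain multilayer-perceptron autoencoder instead of a DBN; this is a simplification for illustration only, and the "EEG segments" below are random stand-ins.

# Minimal sketch of the anomaly-measurement idea: train an autoencoder on
# "normal" waveforms and use reconstruction error as an anomaly score.
# A plain MLP autoencoder is used here rather than a DBN, purely for illustration.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
normal = rng.normal(size=(500, 64))                    # stand-ins for EEG segments
anomalous = normal[:10] + rng.normal(3, 1, size=(10, 64))

ae = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
ae.fit(normal, normal)                                 # learn to reconstruct the input

def anomaly_score(x):
    return np.mean((ae.predict(x) - x) ** 2, axis=1)   # mean squared reconstruction error

print("median score, normal   :", np.median(anomaly_score(normal)))
print("median score, anomalous:", np.median(anomaly_score(anomalous)))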

  13. Improving Landsat and IRS Image Classification: Evaluation of Unsupervised and Supervised Classification through Band Ratios and DEM in a Mountainous Landscape in Nepal

    Directory of Open Access Journals (Sweden)

    Krishna Bahadur K.C.

    2009-12-01

    Full Text Available Modification of the original bands and integration of ancillary data in digital image classification has been shown to improve land use land cover classification accuracy. There are not many studies demonstrating such techniques in the context of the mountains of Nepal. The objective of this study was to explore and evaluate the use of modified bands and ancillary data in Landsat and IRS image classification, and to produce a land use land cover map of the Galaudu watershed of Nepal. Classification of land use was explored using supervised and unsupervised classification for 12 feature sets containing the Landsat MSS, TM and IRS original bands, ratios, the normalized difference vegetation index, principal components and a digital elevation model. Overall, the supervised classification method produced higher accuracy than the unsupervised approach. The result from the combination of band ratios 4/3, 5/4 and 5/7 ranked the highest in terms of accuracy (82.86%), while the combination of bands 2, 3 and 4 ranked the lowest (45.29%). Inclusion of the DEM as a component band shows promising results.
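
    A minimal sketch of the feature-set construction is given below: band ratios and NDVI are derived from the original bands and stacked with a DEM into a pixel-by-feature matrix ready for a classifier. Band indices follow the common TM convention (band 3 = red, band 4 = near-infrared), and all arrays are synthetic.

# Hedged sketch: derive band ratios and NDVI from the original bands and stack
# them with a DEM before classification.  All arrays are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(3)
bands = {i: rng.uniform(0.01, 1.0, size=(100, 100)) for i in range(1, 8)}
dem = rng.uniform(200, 3000, size=(100, 100))

ratio_43 = bands[4] / bands[3]
ratio_54 = bands[5] / bands[4]
ratio_57 = bands[5] / bands[7]
ndvi = (bands[4] - bands[3]) / (bands[4] + bands[3])

# Feature set analogous to the best-performing ratio combination, plus the DEM
features = np.stack([ratio_43, ratio_54, ratio_57, ndvi, dem], axis=-1)
X = features.reshape(-1, features.shape[-1])   # pixels x features, ready for a classifier
print(X.shape)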

  14. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine

    Science.gov (United States)

    Gao, Fei; Mei, Jingyuan; Sun, Jinping; Wang, Jun; Yang, Erfu; Hussain, Amir

    2015-01-01

    For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes could not adequately address this problem due to the lack of a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on an incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computational complexity is reduced effectively. In addition, a detailed analysis is also carried out for the possible appearance of new labeled samples in the learning process. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment. PMID:26275294

  15. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine.

    Directory of Open Access Journals (Sweden)

    Fei Gao

    Full Text Available For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes could not adequately address this problem due to the lack of a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on an incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a "soft-start" approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computational complexity is reduced effectively. In addition, a detailed analysis is also carried out for the possible appearance of new labeled samples in the learning process. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment.

  16. Polarimetric SAR Image Supervised Classification Method Integrating Eigenvalues

    Directory of Open Access Journals (Sweden)

    Xing Yanxiao

    2016-04-01

    Full Text Available Since classification methods based on H/α space have the drawback of yielding poor classification results for terrains with similar scattering features, in this study, we propose a polarimetric Synthetic Aperture Radar (SAR) image classification method based on eigenvalues. First, we extract eigenvalues and fit their distribution with an adaptive Gaussian mixture model. Then, using the naive Bayesian classifier, we obtain preliminary classification results. The distribution of eigenvalues in two kinds of terrains may be similar, leading to incorrect classification in the preliminary step. Therefore, we calculate the similarity of every terrain pair, and add them to the similarity table if their similarity is greater than a given threshold. We then apply the Wishart distance-based KNN classifier to these similar pairs to obtain further classification results. We used the proposed method on both airborne and spaceborne SAR datasets, and the results show that our method can overcome the shortcoming of the H/α-based unsupervised classification method in its use of eigenvalues, and produces results comparable with those of the Support Vector Machine (SVM)-based classification method.
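
    The first stage (mixture modeling of eigenvalue features followed by a Bayes decision) might look roughly like the sketch below; the number of classes, the mixture size and the data are placeholder assumptions, and the adaptive model selection and Wishart refinement steps of the paper are not reproduced.

# Rough sketch of the first stage: fit a Gaussian mixture to the eigenvalue
# features of each terrain class and assign pixels to the class with the
# highest (prior-weighted) likelihood.  Data and class count are placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
n_classes = 3
train = [rng.normal(loc=c, scale=0.5, size=(200, 3)) for c in range(n_classes)]
test = rng.normal(loc=1.0, scale=1.0, size=(50, 3))

models, priors = [], []
for samples in train:
    gmm = GaussianMixture(n_components=2, random_state=0).fit(samples)
    models.append(gmm)
    priors.append(len(samples))
priors = np.array(priors) / sum(priors)

# Naive-Bayes-style decision: log-likelihood under each class model + log prior
log_post = np.stack([m.score_samples(test) + np.log(p)
                     for m, p in zip(models, priors)], axis=1)
labels = np.argmax(log_post, axis=1)
print(np.bincount(labels, minlength=n_classes))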

  17. Use of Sub-Aperture Decomposition for Supervised PolSAR Classification in Urban Area

    Directory of Open Access Journals (Sweden)

    Lei Deng

    2015-01-01

    Full Text Available A novel approach is proposed for classifying polarimetric SAR (PolSAR) data by integrating polarimetric decomposition, sub-aperture decomposition and a decision tree algorithm. It is composed of three key steps: sub-aperture decomposition, feature extraction and combination, and decision tree classification. Feature extraction and combination is the main contribution to the innovation of the proposed method. Firstly, the full-resolution PolSAR image and its two sub-aperture images are decomposed to obtain the scattering entropy, average scattering angle and anisotropy, respectively. Then, the difference information between the two sub-aperture images is extracted, and combined with the target decomposition features from the full-resolution images to form the classification feature set. Finally, the C5.0 decision tree algorithm is used to classify the PolSAR image. A comparison between the proposed method and the commonly-used Wishart supervised classification was made to verify the improvement of the proposed method on the classification. The overall accuracy using the proposed method was 88.39%, much higher than that using the Wishart supervised classification, which exhibited an overall accuracy of 69.82%. The Kappa Coefficient was 0.83, whereas that using the Wishart supervised classification was 0.56. The results indicate that the proposed method performed better than Wishart supervised classification for landscape classification in urban areas using PolSAR data. Further investigation was carried out on the contribution of difference information to PolSAR classification. It was found that the sub-aperture decomposition effectively improved the classification accuracy of forest, buildings and grassland in high-density urban areas. Compared with the support vector machine (SVM) and QUEST classifiers, the C5.0 decision tree classifier performs more efficiently in terms of time consumption, feature selection and the construction of decision rules.

  18. A Supervised Classification Algorithm for Note Onset Detection

    Directory of Open Access Journals (Sweden)

    Douglas Eck

    2007-01-01

    Full Text Available This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
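
    The peak-picking step based on a moving average could be sketched as follows, assuming a per-frame onset probability has already been produced by the classifier; the window length and threshold margin are arbitrary values chosen for illustration.

# Illustrative peak-picking step: a frame is kept as an onset if it is a local
# maximum and exceeds a moving average of the onset probability by a margin.
# Window length and margin are arbitrary choices for this sketch.
import numpy as np

def pick_onsets(onset_prob, window=10, margin=0.1):
    kernel = np.ones(window) / window
    moving_avg = np.convolve(onset_prob, kernel, mode="same")
    onsets = []
    for t in range(1, len(onset_prob) - 1):
        is_peak = onset_prob[t] >= onset_prob[t - 1] and onset_prob[t] >= onset_prob[t + 1]
        if is_peak and onset_prob[t] > moving_avg[t] + margin:
            onsets.append(t)
    return onsets

frame_probs = np.clip(np.random.default_rng(5).normal(0.2, 0.2, 500), 0, 1)
print(pick_onsets(frame_probs)[:10])   # frame indices flagged as onsets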

  19. Supervised, Multivariate, Whole-brain Reduction Did Not Help to Achieve High Classification Performance in Schizophrenia Research

    Directory of Open Access Journals (Sweden)

    Eva Janousova

    2016-08-01

    Full Text Available We examined how penalized linear discriminant analysis with resampling, which is a supervised, multivariate, whole-brain reduction technique, can help schizophrenia diagnostics and research. In an experiment with magnetic resonance brain images of 52 first-episode schizophrenia patients and 52 healthy controls, this method allowed us to select brain areas relevant to schizophrenia, such as the left prefrontal cortex, the anterior cingulum, the right anterior insula, the thalamus and the hippocampus. Nevertheless, the classification performance based on such reduced data was not significantly better than the classification of data reduced by mass univariate selection using a t-test or unsupervised multivariate reduction using principal component analysis. Moreover, we found no important influence of the type of imaging features, namely local deformations or grey matter volumes, or of the classification method, specifically linear discriminant analysis or linear support vector machines, on the classification results. However, we ascertained a significant effect of the cross-validation setting on classification performance, as classification results were overestimated even though the resampling was performed during the selection of brain imaging features. Therefore, it is critically important to perform cross-validation in all steps of the analysis (not only during classification) when there is no external validation set, to avoid optimistically biasing the results of classification studies.

  20. Towards designing an email classification system using multi-view based semi-supervised learning

    NARCIS (Netherlands)

    Li, Wenjuan; Meng, Weizhi; Tan, Zhiyuan; Xiang, Yang

    2014-01-01

    The goal of email classification is to classify user emails into spam and legitimate ones. Many supervised learning algorithms have been invented in this domain to accomplish the task, and these algorithms require a large number of labeled training data. However, data labeling is a labor intensive t

  1. Out-of-Sample Generalizations for Supervised Manifold Learning for Classification

    Science.gov (United States)

    Vural, Elif; Guillemot, Christine

    2016-03-01

    Supervised manifold learning methods for data classification map data samples residing in a high-dimensional ambient space to a lower-dimensional domain in a structure-preserving way, while enhancing the separation between different classes in the learned embedding. Most nonlinear supervised manifold learning methods compute the embedding of the manifolds only at the initially available training points, while the generalization of the embedding to novel points, known as the out-of-sample extension problem in manifold learning, becomes especially important in classification applications. In this work, we propose a semi-supervised method for building an interpolation function that provides an out-of-sample extension for general supervised manifold learning algorithms studied in the context of classification. The proposed algorithm computes a radial basis function (RBF) interpolator that minimizes an objective function consisting of the total embedding error of unlabeled test samples, defined as their distance to the embeddings of the manifolds of their own class, as well as a regularization term that controls the smoothness of the interpolation function in a direction-dependent way. The class labels of test data and the interpolation function parameters are estimated jointly with a progressive procedure. Experimental results on face and object images demonstrate the potential of the proposed out-of-sample extension algorithm for the classification of manifold-modeled data sets.
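
    A heavily simplified sketch of the out-of-sample idea is shown below: an RBF interpolator is fitted from training points in the ambient space to their low-dimensional embeddings and then applied to novel points. It uses SciPy's RBFInterpolator and omits the direction-dependent regularization and joint label estimation of the proposed method; the embedding itself is a random stand-in.

# Very simplified sketch of the out-of-sample extension: map new ambient-space
# points into a previously learned low-dimensional embedding with an RBF
# interpolator.  The training embedding here is a random placeholder.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(6)
X_train = rng.normal(size=(100, 20))   # high-dimensional training data
Y_train = rng.normal(size=(100, 2))    # stand-in for a learned 2-D embedding

interp = RBFInterpolator(X_train, Y_train, kernel="thin_plate_spline", smoothing=1e-3)
X_test = rng.normal(size=(5, 20))
print(interp(X_test))                  # embeddings of novel (out-of-sample) points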

  2. Benchmarking protein classification algorithms via supervised cross-validation

    NARCIS (Netherlands)

    Kertész-Farkas, A.; Dhir, S.; Sonego, P.; Pacurar, M.; Netoteia, S.; Nijveen, H.; Kuzniar, A.; Leunissen, J.A.M.; Kocsor, A.; Pongor, S.

    2008-01-01

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-o

  3. Supervised Classification: The Naive Beyesian Returns to the Old Bailey

    Directory of Open Access Journals (Sweden)

    Vilja Hulden

    2014-12-01

    Full Text Available A few years back, William Turkel wrote a series of blog posts called A Naive Bayesian in the Old Bailey, which showed how one could use machine learning to extract interesting documents out of a digital archive. This tutorial is a kind of an update on that blog essay, with roughly the same data but a slightly different version of the machine learner. The idea is to show why machine learning methods are of interest to historians, as well as to present a step-by-step implementation of a supervised machine learner. This learner is then applied to the Old Bailey digital archive, which contains several centuries’ worth of transcripts of trials held at the Old Bailey in London. We will be using Python for the implementation.
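
    In the spirit of that tutorial, a minimal supervised naive Bayes text classifier in Python (here with scikit-learn rather than a from-scratch implementation) might look like the sketch below; the trial snippets and labels are invented stand-ins, not the Old Bailey data.

# Minimal sketch of a supervised naive Bayes text classifier; documents and
# labels are invented stand-ins rather than Old Bailey transcripts.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_docs = ["stolen watch and silver spoons", "assault upon the prosecutor",
              "forged bank note uttered", "highway robbery with violence"]
train_labels = ["theft", "violence", "deception", "violence"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_docs)      # bag-of-words counts

clf = MultinomialNB()
clf.fit(X, train_labels)

new_doc = ["a silver watch feloniously stolen"]
print(clf.predict(vectorizer.transform(new_doc)))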

  4. Supervised classification of solar features using prior information

    Directory of Open Access Journals (Sweden)

    De Visscher Ruben

    2015-01-01

    Full Text Available Context: The Sun as seen by Extreme Ultraviolet (EUV) telescopes exhibits a variety of large-scale structures. Of particular interest for space-weather applications is the extraction of active regions (AR) and coronal holes (CH). The next generation of GOES-R satellites will provide continuous monitoring of the solar corona in six EUV bandpasses that are similar to the ones provided by the SDO-AIA EUV telescope since May 2010. Supervised segmentations of EUV images that are consistent with manual segmentations by, for example, space-weather forecasters help in extracting useful information from the raw data. Aims: We present a supervised segmentation method that is based on the Maximum A Posteriori rule. Our method allows integrating manually segmented images as well as other types of information. It is applied to SDO-AIA images to segment them into AR, CH, and the remaining Quiet Sun (QS) part. Methods: A Bayesian classifier is applied to training masks provided by the user. The noise structure in EUV images is non-trivial, and this suggests the use of a non-parametric kernel density estimator to fit the intensity distribution within each class. Under the Naive Bayes assumption we can add information such as the latitude distribution and total coverage of each class in a consistent manner. This information can be prescribed by an expert or estimated with an Expectation-Maximization algorithm. Results: The segmentation masks are in line with the training masks given as input and show consistency over time. Introduction of additional information besides pixel intensity improves the quality of the final segmentation. Conclusions: Such a tool can aid in building automated segmentations that are consistent with some 'ground truth' defined by the users.
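
    The core decision rule (a kernel density estimate of the intensity distribution per class combined with prior information under the maximum a posteriori rule) can be sketched as follows; the intensities, class priors and single-feature setup are illustrative assumptions, not the SDO-AIA pipeline.

# Hedged sketch of the classification rule: fit a KDE to the intensity
# distribution of each class (AR, CH, QS) from training masks and label each
# pixel by the maximum a posteriori rule with user-supplied priors.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(7)
train = {"AR": rng.normal(8, 1, 2000),     # bright active-region pixels (synthetic)
         "QS": rng.normal(5, 1, 5000),     # quiet Sun
         "CH": rng.normal(2, 1, 1000)}     # dark coronal holes
priors = {"AR": 0.05, "QS": 0.90, "CH": 0.05}   # e.g. expected fractional coverage

kdes = {name: gaussian_kde(values) for name, values in train.items()}

def classify(pixels):
    names = list(kdes)
    log_post = np.stack([np.log(kdes[n](pixels) + 1e-12) + np.log(priors[n])
                         for n in names])
    return [names[i] for i in np.argmax(log_post, axis=0)]

print(classify(np.array([1.5, 5.2, 9.0])))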

  5. A Novel Semi-Supervised Electronic Nose Learning Technique: M-Training

    Directory of Open Access Journals (Sweden)

    Pengfei Jia

    2016-03-01

    Full Text Available When an electronic nose (E-nose) is used to distinguish different kinds of gases, the label information of the target gas could be lost due to some fault of the operators or some other reason, although this is not expected. Another fact is that the cost of getting labeled samples is usually higher than for unlabeled ones. In most cases, the classification accuracy of an E-nose trained using labeled samples is higher than that of an E-nose trained by unlabeled ones, so gases without label information should not be used to train an E-nose; however, this wastes resources and can even delay the progress of research. In this work a novel multi-class semi-supervised learning technique called M-training is proposed to train E-noses with both labeled and unlabeled samples. We employ M-training to train the E-nose which is used to distinguish three indoor pollutant gases (benzene, toluene and formaldehyde). Data processing results prove that the classification accuracy of an E-nose trained by semi-supervised techniques (tri-training and M-training) is higher than that of an E-nose trained only with labeled samples, and the performance of M-training is better than that of tri-training because more base classifiers can be employed by M-training.

  6. Research of Plant-Leaves Classification Algorithm Based on Supervised LLE

    Directory of Open Access Journals (Sweden)

    Yan Qing

    2013-06-01

    Full Text Available A new supervised LLE method based on the Fisher projection is proposed in this paper and combined with a new classification algorithm based on manifold learning to realize the recognition of plant leaves. Firstly, the method utilizes the Fisher projection distance to replace the sample's geodesic distance, and a new supervised LLE algorithm is obtained. Then, a classification algorithm that uses the manifold reconstruction error to distinguish the sample classification directly is adopted. This algorithm can better utilize the category information and effectively improve the recognition rate. At the same time, it has the advantage of easy parameter estimation. The experimental results based on real-world plant leaf databases show that its average recognition accuracy was up to 95.17%.

  7. Semi-supervised Learning for Classification of Polarimetric SAR Images Based on SVM-Wishart

    Directory of Open Access Journals (Sweden)

    Hua Wen-qiang

    2015-02-01

    Full Text Available In this study, we propose a new semi-supervised classification method for Polarimetric SAR (PolSAR) images, aiming to handle the issue that the training set is small. First, considering the scattering characteristics of PolSAR data, this method extracts multiple scattering features using a target decomposition approach. Then, a semi-supervised learning model is established based on a co-training framework and the Support Vector Machine (SVM). Both labeled and unlabeled data are utilized in this model to obtain high classification accuracy. Third, a recovery scheme based on the Wishart classifier is proposed to improve the classification performance. From the experiments conducted in this study, it is evident that the proposed method performs more effectively than other traditional methods when the training set is small.

  8. SEMI-SUPERVISED RADIO TRANSMITTER CLASSIFICATION BASED ON ELASTIC SPARSITY REGULARIZED SVM

    Institute of Scientific and Technical Information of China (English)

    Hu Guyu; Gong Yong; Chen Yande; Pan Zhisong; Deng Zhantao

    2012-01-01

    Non-collaborative radio transmitter recognition is a significant but challenging issue, since it is hard or costly to obtain labeled training data samples. In order to make effective use of the unlabeled samples, which can be obtained much more easily, a novel semi-supervised classification method named Elastic Sparsity Regularized Support Vector Machine (ESRSVM) is proposed for radio transmitter classification. ESRSVM first constructs an elastic-net graph over data samples to capture the robust and natural discriminating information and then incorporates this information into the manifold learning framework through an elastic sparsity regularization term. Experimental results on 10 GMSK modulated Automatic Identification System radios and 15 FM walkie-talkie radios show that ESRSVM achieves obviously better performance than KNN and SVM, which use only labeled samples for classification, and also outperforms the semi-supervised classifier LapSVM based on manifold regularization.

  9. Spoken Document Retrieval Leveraging Unsupervised and Supervised Topic Modeling Techniques

    Science.gov (United States)

    Chen, Kuan-Yu; Wang, Hsin-Min; Chen, Berlin

    This paper describes the application of two attractive categories of topic modeling techniques to the problem of spoken document retrieval (SDR), viz. document topic model (DTM) and word topic model (WTM). Apart from using the conventional unsupervised training strategy, we explore a supervised training strategy for estimating these topic models, imagining a scenario that user query logs along with click-through information of relevant documents can be utilized to build an SDR system. This attempt has the potential to associate relevant documents with queries even if they do not share any of the query words, thereby improving on retrieval quality over the baseline system. Likewise, we also study a novel use of pseudo-supervised training to associate relevant documents with queries through a pseudo-feedback procedure. Moreover, in order to lessen SDR performance degradation caused by imperfect speech recognition, we investigate leveraging different levels of index features for topic modeling, including words, syllable-level units, and their combination. We provide a series of experiments conducted on the TDT (TDT-2 and TDT-3) Chinese SDR collections. The empirical results show that the methods deduced from our proposed modeling framework are very effective when compared with a few existing retrieval approaches.

  10. Recent advances on techniques and theories of feedforward networks with supervised learning

    Science.gov (United States)

    Xu, Lei; Klasa, Stan

    1992-07-01

    The rediscovery and popularization of the back propagation training technique for multilayer perceptrons as well as the invention of the Boltzmann Machine learning algorithm have given a new boost to the study of supervised learning networks. In recent years, besides the widely spread applications and the various further improvements of the classical back propagation technique, many new supervised learning models, techniques and theories have also been proposed in a vast number of publications. This paper tries to give a rather systematic review of the recent advances in supervised learning techniques and theories for static feedforward networks. We summarize a great number of developments into five aspects: (1) Various improvements and variants made on the classical back propagation techniques for multilayer (static) perceptron nets, for speeding up training, avoiding local minima, increasing the generalization ability, as well as for many other interesting purposes. (2) A number of other learning methods for training multilayer (static) perceptrons, such as derivative estimation by perturbation, direct weight update by perturbation, genetic algorithms, recursive least squares estimation and the extended Kalman filter, linear programming, the policy of fixing one layer while updating another, constructing networks by converting decision tree classifiers, and others. (3) Various other feedforward models which are also able to implement function approximation, probability density estimation and classification, including various models of basis function expansion (e.g., radial basis functions, restricted Coulomb energy, multivariate adaptive regression splines, trigonometric and polynomial bases, projection pursuit, basis function trees, and many others), and several other supervised learning models. (4) Models with complex structures, e.g., modular architecture, hierarchy architecture, and others. (5) A number of theoretical issues involving the universal

  11. Extending self-organizing maps for supervised classification of remotely sensed data

    Institute of Scientific and Technical Information of China (English)

    CHEN Yongliang

    2009-01-01

    An extended self-organizing map for supervised classification is proposed in this paper. Unlike other traditional SOMs, the model has an input layer, a Kohonen layer, and an output layer. The number of neurons in the input layer depends on the dimensionality of the input patterns. The number of neurons in the output layer equals the number of desired classes. The number of neurons in the Kohonen layer may range from a few to several thousand, depending on the complexity of the classification problem and the required classification precision. Each training sample is expressed by a pair of vectors: an input vector and a class codebook vector. When a training sample is input into the model, Kohonen's competitive learning rule is applied to select the winning neuron from the Kohonen layer; the weight coefficients connecting all the neurons in the input layer with both the winning neuron and its neighbors in the Kohonen layer are modified to be closer to the input vector, and those connecting all the neurons around the winning neuron within a certain diameter in the Kohonen layer with all the neurons in the output layer are adjusted to be closer to the class codebook vector. If the number of training samples is sufficiently large and the learning epochs iterate enough times, the model will be able to serve as a supervised classifier. The model has been tentatively applied to the supervised classification of multispectral remotely sensed data. The author compared the performances of the extended SOM and BPN in remotely sensed data classification. The investigation demonstrates that the extended SOM is feasible for supervised classification.

  12. Improving TMD classification using the Delphi technique.

    Science.gov (United States)

    John, M T

    2010-10-01

    The classification of temporomandibular disorders (TMD) is still controversial. Consensus methods such as the Delphi technique, a method that polls experts' anonymous opinions in an iterative process with controlled feedback and statistical aggregation of the group response, could be valuable for improving this challenging area. The article illustrates the application of the Delphi technique for deciding whether the terms myalgia or myofascial pain should be used in a TMD classification system and discusses the technique's potential for TMD classification in general. In three Delphi rounds, 14 TMD experts from the Division of TMD and Orofacial Pain of the University of Minnesota reached a consensus about which TMD diagnoses should be included in a TMD classification system. They preferred the term myofascial pain over myalgia. The Delphi technique has the potential to provide answers to complex questions in TMD classification, e.g., TMD nomenclature and the range and scope of conditions included in a future TMD classification system.

  13. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    Directory of Open Access Journals (Sweden)

    Deborah Galpert

    2015-01-01

    Full Text Available Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness, considering low ortholog ratios in relation to the possible pairwise comparisons between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.
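
    A single-machine sketch of the imbalance-handling idea is shown below: the minority ortholog class is randomly oversampled before a linear SVM is fitted. This substitutes scikit-learn for the MapReduce/Spark implementation used in the paper, and the gene-pair features are synthetic.

# Simplified, single-machine sketch of random oversampling of the minority
# (ortholog) class followed by a linear SVM; features are placeholders.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(8)
X_major = rng.normal(0, 1, size=(5000, 6))    # non-ortholog gene pairs
X_minor = rng.normal(2, 1, size=(50, 6))      # ortholog gene pairs (rare class)

# Randomly oversample the minority class up to the majority size
idx = rng.integers(0, len(X_minor), size=len(X_major))
X = np.vstack([X_major, X_minor[idx]])
y = np.concatenate([np.zeros(len(X_major)), np.ones(len(X_major))])

clf = LinearSVC(C=1.0).fit(X, y)
print("recall on the original minority samples:",
      clf.score(X_minor, np.ones(len(X_minor))))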

  14. Using Supervised Learning Techniques for Diagnosis of Dynamic Systems

    Science.gov (United States)

    2002-05-04

    classification systems [11]. Neural network techniques have recently been applied in diverse fields, such as medicine [12] or power supply [13]. Machine ... partially financed by the Comisión Interministerial de Ciencia y Tecnología (DP12000-0666-C02-02) ... and the

  15. Summarizing Relational Data Using Semi-Supervised Genetic Algorithm-Based Clustering Techniques

    Directory of Open Access Journals (Sweden)

    Rayner Alfred

    2010-01-01

    Full Text Available Problem statement: In solving a classification problem in relational data mining, traditional methods, for example, the C4.5 and its variants, usually require data transformations from datasets stored in multiple tables into a single table. Unfortunately, we may lose some information when we join tables with a high degree of one-to-many association. Therefore, data transformation becomes tedious trial-and-error work and the classification result is often not very promising, especially when the number of tables and the degree of one-to-many association are large. Approach: We proposed a genetic semi-supervised clustering technique as a means of aggregating data stored in multiple tables to facilitate the task of solving a classification problem in a relational database. This algorithm is suitable for classification of datasets with a high degree of one-to-many associations. It can be used in two ways. One is user-controlled clustering, where the user may control the result of clustering by varying the compactness of the spherical cluster. The other is automatic clustering, where a non-overlap clustering strategy is applied. In this study, we use the latter method to dynamically cluster multiple instances, as a means of aggregating them, and illustrate the effectiveness of this method using the semi-supervised genetic algorithm-based clustering technique. Results: It was shown in the experimental results that using the reciprocal of the Davies-Bouldin Index for cluster dispersion and the reciprocal of the Gini Index for cluster purity, as the fitness function in the Genetic Algorithm (GA), finds solutions with much greater accuracy. The results obtained in this study showed that automatic clustering (seeding), by optimizing the cluster dispersion or cluster purity alone using GA, provides one with good results compared to the traditional k-means clustering. However, the best result can be achieved by optimizing the combination values of both the cluster

  16. Gene classification using parameter-free semi-supervised manifold learning.

    Science.gov (United States)

    Huang, Hong; Feng, Hailiang

    2012-01-01

    A new manifold learning method, called parameter-free semi-supervised local Fisher discriminant analysis (pSELF), is proposed to map the gene expression data into a low-dimensional space for tumor classification. Motivated by the fact that semi-supervised and parameter-free are two desirable and promising characteristics for dimension reduction, a new difference-based optimization objective function with unlabeled samples has been designed. The proposed method preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The semi-supervised method has an analytic form of the globally optimal solution, which can be computed efficiently by eigen decomposition. Experimental results on synthetic data and SRBCT, DLBCL, and Brain Tumor gene expression data sets demonstrate the effectiveness of the proposed method.

  17. Chemometric classification techniques as a tool for solving problems in analytical chemistry.

    Science.gov (United States)

    Bevilacqua, Marta; Nescatelli, Riccardo; Bucci, Remo; Magrì, Andrea D; Magrì, Antonio L; Marini, Federico

    2014-01-01

    Supervised pattern recognition (classification) techniques, i.e., the family of chemometric methods whose aim is the prediction of a qualitative response on a set of samples, represent a very important assortment of tools for solving problems in several areas of applied analytical chemistry. This paper describes the theory behind the chemometric classification techniques most frequently used in analytical chemistry together with some examples of their application to real-world problems.

  18. Supervised pixel classification for segmenting geographic atrophy in fundus autofluorescence images

    Science.gov (United States)

    Hu, Zhihong; Medioni, Gerard G.; Hernandez, Matthias; Sadda, SriniVas R.

    2014-03-01

    Age-related macular degeneration (AMD) is the leading cause of blindness in people over the age of 65. Geographic atrophy (GA) is a manifestation of the advanced or late stage of AMD, which may result in severe vision loss and blindness. Techniques to rapidly and precisely detect and quantify GA lesions would appear to be of great value in advancing the understanding of the pathogenesis of GA and the management of GA progression. The purpose of this study is to develop an automated supervised pixel classification approach for segmenting GA, including uni-focal and multi-focal patches, in fundus autofluorescence (FAF) images. The image features include region-wise intensity (mean and variance) measures, gray level co-occurrence matrix measures (angular second moment, entropy, and inverse difference moment), and Gaussian filter banks. A k-nearest-neighbor (k-NN) pixel classifier is applied to obtain a GA probability map, representing the likelihood that the image pixel belongs to GA. A voting binary iterative hole filling filter is then applied to fill in the small holes. Sixteen randomly chosen FAF images were obtained from sixteen subjects with GA. The algorithm-defined GA regions are compared with manual delineation performed by certified graders. Two-fold cross-validation is applied for the evaluation of the classification performance. The mean Dice similarity coefficients (DSC) between the algorithm- and manually-defined GA regions are 0.84 +/- 0.06 for one test and 0.83 +/- 0.07 for the other test, and the area correlations between them are 0.99 (p < 0.05) and 0.94 (p < 0.05) respectively.
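
    The pixel-classification and evaluation steps might be sketched as follows: a k-NN classifier turns precomputed per-pixel features into a GA probability map, and the thresholded mask is compared with a manual delineation via the Dice similarity coefficient. Features, masks and the value of k are illustrative assumptions.

# Sketch of k-NN pixel classification and Dice evaluation; features, masks and
# the choice of k are synthetic stand-ins, not the FAF pipeline of the study.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(9)
X_train = rng.normal(size=(1000, 8))          # intensity/GLCM/filter-bank features
y_train = rng.integers(0, 2, size=1000)       # 1 = GA pixel, 0 = background
X_test = rng.normal(size=(400, 8))

knn = KNeighborsClassifier(n_neighbors=15).fit(X_train, y_train)
prob_map = knn.predict_proba(X_test)[:, 1]    # GA probability per pixel
pred_mask = prob_map > 0.5

def dice(a, b):
    return 2 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

manual_mask = rng.integers(0, 2, size=400).astype(bool)
print("DSC:", dice(pred_mask, manual_mask))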

  19. Musical Instrument Classification Based on Nonlinear Recurrence Analysis and Supervised Learning

    Directory of Open Access Journals (Sweden)

    R.Rui

    2013-04-01

    Full Text Available In this paper, the phase space reconstruction of time series produced by different instruments is discussed based on nonlinear dynamic theory. The dense ratio, a novel quantitative recurrence parameter, is proposed to describe the differences among wind instruments, stringed instruments and keyboard instruments in the phase space by analyzing the recurrence properties of each instrument. Furthermore, a novel supervised learning algorithm for automatic classification of individual musical instrument signals is presented, derived from the idea of the supervised non-negative matrix factorization (NMF) algorithm. In our approach, the orthogonal basis matrix could be obtained without updating the matrix iteratively, which NMF is unable to do. The experimental results indicate that the accuracy of the proposed method is improved by 3% compared with the conventional features in individual instrument classification.

  20. Classification and Weakly Supervised Pain Localization using Multiple Segment Representation

    Science.gov (United States)

    Sikka, Karan; Dhall, Abhinav; Bartlett, Marian Stewart

    2014-01-01

    Automatic pain recognition from videos is a vital clinical application and, owing to its spontaneous nature, poses interesting challenges to automatic facial expression recognition (AFER) research. Previous pain vs no-pain systems have highlighted two major challenges: (1) ground truth is provided for the sequence, but the presence or absence of the target expression for a given frame is unknown, and (2) the time point and the duration of the pain expression event(s) in each video are unknown. To address these issues we propose a novel framework (referred to as MS-MIL) where each sequence is represented as a bag containing multiple segments, and multiple instance learning (MIL) is employed to handle this weakly labeled data in the form of sequence level ground-truth. These segments are generated via multiple clustering of a sequence or running a multi-scale temporal scanning window, and are represented using a state-of-the-art Bag of Words (BoW) representation. This work extends the idea of detecting facial expressions through ‘concept frames’ to ‘concept segments’ and argues through extensive experiments that algorithms such as MIL are needed to reap the benefits of such representation. The key advantages of our approach are: (1) joint detection and localization of painful frames using only sequence-level ground-truth, (2) incorporation of temporal dynamics by representing the data not as individual frames but as segments, and (3) extraction of multiple segments, which is well suited to signals with uncertain temporal location and duration in the video. Extensive experiments on UNBC-McMaster Shoulder Pain dataset highlight the effectiveness of the approach by achieving competitive results on both tasks of pain classification and localization in videos. We also empirically evaluate the contributions of different components of MS-MIL. The paper also includes the visualization of discriminative facial patches, important for pain detection, as discovered by

  1. Classification and Weakly Supervised Pain Localization using Multiple Segment Representation.

    Science.gov (United States)

    Sikka, Karan; Dhall, Abhinav; Bartlett, Marian Stewart

    2014-10-01

    Automatic pain recognition from videos is a vital clinical application and, owing to its spontaneous nature, poses interesting challenges to automatic facial expression recognition (AFER) research. Previous pain vs no-pain systems have highlighted two major challenges: (1) ground truth is provided for the sequence, but the presence or absence of the target expression for a given frame is unknown, and (2) the time point and the duration of the pain expression event(s) in each video are unknown. To address these issues we propose a novel framework (referred to as MS-MIL) where each sequence is represented as a bag containing multiple segments, and multiple instance learning (MIL) is employed to handle this weakly labeled data in the form of sequence level ground-truth. These segments are generated via multiple clustering of a sequence or running a multi-scale temporal scanning window, and are represented using a state-of-the-art Bag of Words (BoW) representation. This work extends the idea of detecting facial expressions through 'concept frames' to 'concept segments' and argues through extensive experiments that algorithms such as MIL are needed to reap the benefits of such representation. The key advantages of our approach are: (1) joint detection and localization of painful frames using only sequence-level ground-truth, (2) incorporation of temporal dynamics by representing the data not as individual frames but as segments, and (3) extraction of multiple segments, which is well suited to signals with uncertain temporal location and duration in the video. Extensive experiments on UNBC-McMaster Shoulder Pain dataset highlight the effectiveness of the approach by achieving competitive results on both tasks of pain classification and localization in videos. We also empirically evaluate the contributions of different components of MS-MIL. The paper also includes the visualization of discriminative facial patches, important for pain detection, as discovered by our

  2. Synthesis of supervised classification algorithm using intelligent and statistical tools

    Directory of Open Access Journals (Sweden)

    Ali Douik

    2009-09-01

    Full Text Available A fundamental task in detecting foreground objects in both static and dynamic scenes is to choose the best color-space representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football match. Per-pixel segmentation concerns many applications, and the results reveal that the method is robust in detecting objects, even in the presence of strong shadows and highlights. On the other hand, to refine the playing strategy in sports such as football, handball, volleyball and rugby, the coach needs as much technical and tactical information as possible about the on-going game and the players. We propose in this paper a range of algorithms for resolving many problems that appear in the automated process of team identification, where each player is assigned to his corresponding team based on visual data. The developed system was tested on a match of the Tunisian national competition. This work is relevant to many future computer vision studies, as detailed in this paper.

  3. Synthesis of supervised classification algorithm using intelligent and statistical tools

    CERN Document Server

    Douik, Ali

    2009-01-01

    A fundamental task in detecting foreground objects in both static and dynamic scenes is to choose the best color-space representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football match. Per-pixel segmentation concerns many applications, and the results reveal that the method is robust in detecting objects, even in the presence of strong shadows and highlights. On the other hand, to refine the playing strategy in sports such as football, handball, volleyball, rugby..., the coach needs as much technical and tactical information as possible about the on-going game and the players. We propose in this paper a range of algorithms for resolving many problems that appear in the automated process of team identification, where each player is assigned to his corresponding team based on visual data. The developed system was tested on a match of the Tunisian national c...

  4. A Novel Approach to Developing a Supervised Spatial Decision Support System for Image Classification: A Study of Paddy Rice Investigation

    Directory of Open Access Journals (Sweden)

    Shih-Hsun Chang

    2014-01-01

    Full Text Available Paddy rice area estimation via remote sensing techniques has been well established in recent years. Texture information and vegetation indicators are widely used to improve the classification accuracy of satellite images. Accordingly, this study employs texture information and vegetation indicators as ancillary information for classifying paddy rice through remote sensing images. In the first stage, the images are acquired using remote sensing techniques and ancillary information is employed to increase the accuracy of classification. In the second stage, we construct an efficient supervised classifier, which is used to evaluate the ancillary information. In the third stage, linear discriminant analysis (LDA) is introduced. LDA is a well-known method for classifying images into various categories. Also, the particle swarm optimization (PSO) algorithm is employed to optimize the LDA classification outcomes and increase classification performance. In the fourth stage, we discuss the strategy of selecting different window sizes and analyze particle numbers and iteration numbers with the corresponding accuracy. Accordingly, a rational strategy for the combination of ancillary information is introduced. Afterwards, the PSO algorithm improves the accuracy rate from 82.26% to 89.31%. The improved accuracy results in a much lower salt-and-pepper effect in the thematic map.

  5. HYBRID INTERNET TRAFFIC CLASSIFICATION TECHNIQUE

    Institute of Scientific and Technical Information of China (English)

    Li Jun; Zhang Shunyi; Lu Yanqing; Yan Junrong

    2009-01-01

    Accurate and real-time classification of network traffic is significant to network operation and management such as QoS differentiation, traffic shaping and security surveillance. However, with many newly emerged P2P applications using dynamic port numbers, masquerading techniques, and payload encryption to avoid detection, traditional classification approaches turn out to be ineffective. In this paper, we present a layered hybrid system to classify current Internet traffic, motivated by the variety of network activities and their traffic classification requirements. The proposed method achieves fast and accurate traffic classification with low overheads and is robust enough to accommodate both known and unknown/encrypted applications. Furthermore, it is feasible to use in the context of real-time traffic classification. Our experimental results show the distinct advantages of the proposed classification system compared with the one-step Machine Learning (ML) approach.

  6. A novel Neuro-fuzzy classification technique for data mining

    Directory of Open Access Journals (Sweden)

    Soumadip Ghosh

    2014-11-01

    Full Text Available In our study, we propose a novel Neuro-fuzzy classification technique for data mining. The inputs to the Neuro-fuzzy classification system are fuzzified by applying a generalized bell-shaped membership function. The proposed method utilizes a fuzzification matrix in which the input patterns are associated with a degree of membership to different classes. Based on the degree of membership, a pattern is attributed to a specific category or class. We applied our method to ten benchmark data sets from the UCI machine learning repository for classification. Our objective was to analyze the proposed method and to compare its performance with two powerful supervised classification algorithms, the Radial Basis Function Neural Network (RBFNN) and the Adaptive Neuro-fuzzy Inference System (ANFIS). We assessed the performance of these classification methods in terms of different performance measures such as accuracy, root-mean-square error, kappa statistic, true positive rate, false positive rate, precision, recall, and f-measure. In every aspect the proposed method proved to be superior to the RBFNN and ANFIS algorithms.
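
    The fuzzification step can be pictured with a short sketch of the generalized bell-shaped membership function, f(x; a, b, c) = 1 / (1 + |(x - c)/a|^(2b)); the parameter values and linguistic labels below are illustrative assumptions, not those of the paper.

```python
# Sketch of fuzzifying one input feature with a generalized bell-shaped
# membership function, producing a row of a fuzzification matrix.
# The (a, b, c) parameters per class are arbitrary illustrations.
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a|**(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

x = np.array([0.2, 0.5, 0.9])                       # three input values
params = {"low": (0.25, 2.0, 0.0),
          "medium": (0.25, 2.0, 0.5),
          "high": (0.25, 2.0, 1.0)}
for label, (a, b, c) in params.items():
    print(label, np.round(gbell(x, a, b, c), 3))    # degrees of membership per class
```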

  7. Soft supervised self-organizing mapping (3SOM) for improving land cover classification with MODIS time-series

    Science.gov (United States)

    Lawawirojwong, Siam

    Classification of remote sensing data has long been a fundamental technique for studying vegetation and land cover. Furthermore, land use and land cover maps are a basic need for environmental science. These maps are important for crop system monitoring and are also valuable resources for decision makers. Therefore, an up-to-date and highly accurate land cover map with detailed and timely information is required for the global environmental change research community to support natural resource management, environmental protection, and policy making. However, there appear to be a number of limitations associated with data utilization such as weather conditions, data availability, cost, and the time needed for acquiring and processing large numbers of images. Additionally, improving the classification accuracy and reducing the classification time have long been the goals of remote sensing research and they still require further study. To manage these challenges, the primary goal of this research is to improve classification algorithms that utilize MODIS-EVI time-series images. A supervised self-organizing map (SSOM) and a soft supervised self-organizing map (3SOM) are modified and improved to increase classification efficiency and accuracy. To accomplish the main goal, the performance of the proposed methods is investigated using synthetic and real landscape data derived from MODIS-EVI time-series images. Two study areas are selected based on differences in land cover characteristics: one in Thailand and one in the Midwestern U.S. The results indicate that time-series imagery is a potentially useful input dataset for land cover classification. Moreover, the SSOM with time-series data significantly outperforms the conventional classification techniques of the Gaussian maximum likelihood classifier (GMLC) and backpropagation neural network (BPNN). In addition, the 3SOM employed as a soft classifier delivers a more accurate classification than the SSOM applied as

  8. Supervised Classification of Benthic Reflectance in Shallow Subtropical Waters Using a Generalized Pixel-Based Classifier across a Time Series

    Directory of Open Access Journals (Sweden)

    Tara Blakey

    2015-04-01

    Full Text Available We tested a supervised classification approach with Landsat 5 Thematic Mapper (TM) data for time-series mapping of seagrass in a subtropical lagoon. Seagrass meadows are an integral link between marine and inland ecosystems and are at risk from upstream processes such as runoff and erosion. Despite the prevalence of image-specific approaches, the classification accuracies we achieved show that pixel-based spectral classes may be generalized and applied to a time series of images that were not included in the classifier training. We employed in-situ data on seagrass abundance from 2007 to 2011 to train and validate a classification model. We created depth-invariant bands from TM bands 1, 2, and 3 to correct for variations in water column depth prior to building the classification model. In-situ data showed mean total seagrass cover remained relatively stable over the study area and period, with seagrass cover generally denser in the west than the east. Our approach achieved mapping accuracies (67% and 76% for two validation years) comparable with those attained using spectral libraries, but was simpler to implement. We produced a series of annual maps illustrating inter-annual variability in seagrass occurrence. Accuracies may be improved in future work by better addressing the spatial mismatch between pixel size of remotely sensed data and footprint of field data and by employing atmospheric correction techniques that normalize reflectances across images.

  9. Multi-Modal Curriculum Learning for Semi-Supervised Image Classification.

    Science.gov (United States)

    Gong, Chen; Tao, Dacheng; Maybank, Stephen J; Liu, Wei; Kang, Guoliang; Yang, Jie

    2016-07-01

    Semi-supervised image classification aims to classify a large quantity of unlabeled images by typically harnessing scarce labeled images. Existing semi-supervised methods often suffer from inadequate classification accuracy when encountering difficult yet critical images, such as outliers, because they treat all unlabeled images equally and conduct classifications in an imperfectly ordered sequence. In this paper, we employ the curriculum learning methodology by investigating the difficulty of classifying every unlabeled image. The reliability and the discriminability of these unlabeled images are particularly investigated for evaluating their difficulty. As a result, an optimized image sequence is generated during the iterative propagations, and the unlabeled images are logically classified from simple to difficult. Furthermore, since images are usually characterized by multiple visual feature descriptors, we associate each kind of features with a teacher, and design a multi-modal curriculum learning (MMCL) strategy to integrate the information from different feature modalities. In each propagation, each teacher analyzes the difficulties of the currently unlabeled images from its own modality viewpoint. A consensus is subsequently reached among all the teachers, determining the currently simplest images (i.e., a curriculum), which are to be reliably classified by the multi-modal learner. This well-organized propagation process leveraging multiple teachers and one learner enables our MMCL to outperform five state-of-the-art methods on eight popular image data sets.

  10. Supervised Cross-Modal Factor Analysis for Multiple Modal Data Classification

    KAUST Repository

    Wang, Jingbin

    2015-10-09

    In this paper we study the problem of learning from multiple modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two different modalities of data to a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data to a shared data space by factor analysis, and then train a class label predictor in the shared space to use the class label information. The factor analysis parameter and the predictor parameter are learned jointly by solving one single objective function. With this objective function, we minimize the distance between the projections of the image and text of the same document, and the classification error of the projection measured by a hinge loss function. The objective function is optimized by an alternate optimization strategy in an iterative algorithm. Experiments on two different multi-modal document data sets show the advantage of the proposed algorithm over other CFA methods.

  11. Supervised pixel classification using a feature space derived from an artificial visual system

    Science.gov (United States)

    Baxter, Lisa C.; Coggins, James M.

    1991-01-01

    Image segmentation involves labelling pixels according to their membership in image regions. This requires an understanding of what a region is. Using supervised pixel classification, the paper investigates how groups of pixels labelled manually according to perceived image semantics map onto the feature space created by an Artificial Visual System. The multiscale structure of regions is investigated, and it is shown that pixels form clusters based on their geometric roles in the image intensity function, not on image semantics. A tentative abstract definition of a 'region' is proposed based on this behavior.

  12. Supervised Self-Organizing Classification of Superresolution ISAR Images: An Anechoic Chamber Experiment

    Directory of Open Access Journals (Sweden)

    Radoi Emanuel

    2006-01-01

    Full Text Available The problem of the automatic classification of superresolution ISAR images is addressed in the paper. We describe an anechoic chamber experiment involving ten-scale-reduced aircraft models. The radar images of these targets are reconstructed using the MUSIC-2D (multiple signal classification) method coupled with two additional processing steps: phase unwrapping and symmetry enhancement. A feature vector is then proposed including Fourier descriptors and moment invariants, which are calculated from the target shape and the scattering center distribution extracted from each reconstructed image. The classification is finally performed by a new self-organizing neural network called SART (supervised ART), which is compared to two standard classifiers, MLP (multilayer perceptron) and fuzzy KNN (k nearest neighbors). While the classification accuracy is similar, SART is shown to outperform the two other classifiers in terms of training speed and classification speed, especially for large databases. It is also easier to use since it does not require any input parameter related to its structure.

  13. Supervised Self-Organizing Classification of Superresolution ISAR Images: An Anechoic Chamber Experiment

    Science.gov (United States)

    Radoi, Emanuel; Quinquis, André; Totir, Felix

    2006-12-01

    The problem of the automatic classification of superresolution ISAR images is addressed in the paper. We describe an anechoic chamber experiment involving ten-scale-reduced aircraft models. The radar images of these targets are reconstructed using MUSIC-2D (multiple signal classification) method coupled with two additional processing steps: phase unwrapping and symmetry enhancement. A feature vector is then proposed including Fourier descriptors and moment invariants, which are calculated from the target shape and the scattering center distribution extracted from each reconstructed image. The classification is finally performed by a new self-organizing neural network called SART (supervised ART), which is compared to two standard classifiers, MLP (multilayer perceptron) and fuzzy KNN (k nearest neighbors). While the classification accuracy is similar, SART is shown to outperform the two other classifiers in terms of training speed and classification speed, especially for large databases. It is also easier to use since it does not require any input parameter related to its structure.

  14. Using Motivational Interviewing Techniques to Address Parallel Process in Supervision

    Science.gov (United States)

    Giordano, Amanda; Clarke, Philip; Borders, L. DiAnne

    2013-01-01

    Supervision offers a distinct opportunity to experience the interconnection of counselor-client and counselor-supervisor interactions. One product of this network of interactions is parallel process, a phenomenon by which counselors unconsciously identify with their clients and subsequently present to their supervisors in a similar fashion…

  15. Using Clinical Supervision Techniques with Student Art Teachers.

    Science.gov (United States)

    Susi, Frank D.

    1992-01-01

    Contends that the student teaching experience and the cooperating teacher are the most significant aspects of the teacher education process. Describes the features and the implementation of clinical supervision in art education. Concludes that cooperating teachers also benefit as a result of their experiences with student teachers. (CFR)

  16. Facial nerve image enhancement from CBCT using supervised learning technique.

    Science.gov (United States)

    Ping Lu; Barazzetti, Livia; Chandran, Vimal; Gavaghan, Kate; Weber, Stefan; Gerber, Nicolas; Reyes, Mauricio

    2015-08-01

    Facial nerve segmentation plays an important role in surgical planning of cochlear implantation. Clinically available CBCT images are used for surgical planning. However, their relatively low resolution renders the identification of the facial nerve difficult. In this work, we present a supervised learning approach to enhance facial nerve image information from CBCT. A supervised learning approach based on a multi-output random forest was employed to learn the mapping between CBCT and micro-CT images. Evaluation was performed qualitatively and quantitatively by using the predicted image as input for a previously published dedicated facial nerve segmentation and cochlear implantation surgical planning software, OtoPlan. Results show the potential of the proposed approach to improve facial nerve image quality as imaged by CBCT and to leverage its segmentation using OtoPlan.
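
    A minimal sketch of the central idea, learning a mapping from CBCT patch features to micro-CT intensities with a multi-output random forest, is given below; the patch sizes, feature extraction and data are placeholders, not the authors' pipeline.

```python
# Illustrative multi-output random forest mapping CBCT patch features to
# micro-CT patch intensities. Data are random placeholders; a real pipeline
# would extract co-registered CBCT/micro-CT patches.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_patches = 500
cbct_features = rng.normal(size=(n_patches, 27))   # e.g. a flattened 3x3x3 CBCT patch
microct_targets = cbct_features @ rng.normal(size=(27, 8)) \
    + 0.1 * rng.normal(size=(n_patches, 8))        # e.g. central micro-CT voxel values

X_tr, X_te, Y_tr, Y_te = train_test_split(cbct_features, microct_targets, random_state=0)
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_tr, Y_tr)                             # multi-output regression is supported natively
print("R^2 on held-out patches:", round(float(forest.score(X_te, Y_te)), 3))
```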

  17. Generation of a Supervised Classification Algorithm for Time-Series Variable Stars with an Application to the LINEAR Dataset

    CERN Document Server

    Johnston, Kyle B

    2016-01-01

    With the advent of digital astronomy, new benefits and new problems have been presented to the modern day astronomer. While data can be captured in a more efficient and accurate manner using digital means, the efficiency of data retrieval has led to an overload of scientific data for processing and storage. This paper will focus on the construction and application of a supervised pattern classification algorithm for the identification of variable stars. Given the reduction of a survey of stars into a standard feature space, the problem of using prior patterns to identify new observed patterns can be reduced to time-tested classification methodologies and algorithms. Such supervised methods, so called because the user trains the algorithms prior to application using patterns with known classes or labels, provide a means to probabilistically determine the estimated class type of new observations. This paper will demonstrate the construction and application of a supervised classification algorithm on variable sta...

  18. Enhancing Accuracy of Plant Leaf Classification Techniques

    Directory of Open Access Journals (Sweden)

    C. S. Sumathi

    2014-03-01

    Full Text Available Plants have become an important source of energy, and are a fundamental piece in the puzzle to solve the problem of global warming. Living beings also depend on plants for their food, hence it is of great importance to know about the plants growing around us and to preserve them. Automatic plant leaf classification is widely researched. This paper investigates the efficiency of MLP learning algorithms for plant leaf classification. Incremental back propagation, Levenberg–Marquardt and batch propagation learning algorithms are investigated. Plant leaf images are examined using three different Multi-Layer Perceptron (MLP) modelling techniques. Back propagation performed in batch manner increases the accuracy of plant leaf classification. Results reveal that batch training is faster and more accurate than MLP with incremental training and Levenberg–Marquardt based learning for plant leaf classification. Various levels of semi-batch training, used on 9 species with 15 samples each (a total of 135 instances), show a roughly linear increase in classification accuracy.
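
    The contrast between batch and incremental training can be sketched as follows; scikit-learn offers no Levenberg-Marquardt solver, so only a full-batch solver (lbfgs) and incremental updates (sgd with partial_fit) are compared, on synthetic stand-ins for the leaf features.

```python
# Sketch contrasting full-batch and incremental MLP training on a synthetic
# "leaf feature" set (135 instances, mirroring the sample count above).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=135, n_features=12, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

batch_mlp = MLPClassifier(hidden_layer_sizes=(20,), solver="lbfgs",
                          max_iter=2000, random_state=0).fit(X_tr, y_tr)

incr_mlp = MLPClassifier(hidden_layer_sizes=(20,), solver="sgd",
                         learning_rate_init=0.01, random_state=0)
for epoch in range(200):                      # repeated incremental passes
    incr_mlp.partial_fit(X_tr, y_tr, classes=np.unique(y))

print("batch accuracy      :", round(float(batch_mlp.score(X_te, y_te)), 3))
print("incremental accuracy:", round(float(incr_mlp.score(X_te, y_te)), 3))
```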

  19. Supervised Classification of Polarimetric SAR Imagery Using Temporal and Contextual Information

    Science.gov (United States)

    Dargahi, A.; Maghsoudi, Y.; Abkar, A. A.

    2013-09-01

    Using the context as a source of ancillary information in the classification process provides a powerful tool to obtain better class discrimination. Modelling context using Markov Random Fields (MRFs) and combining this with a Bayesian approach, a context-based supervised classification method is proposed. In this framework, to make full use of the statistical a priori knowledge of the data, the spatial relation of the neighbouring pixels was used. The proposed context-based algorithm combines a Gaussian-based Wishart distribution of PolSAR images with temporal and contextual information. This combination was done through Bayes decision theory: the class-conditional probability density function and the prior probability are modelled by the Wishart distribution and the MRF model. Given the complexity and similarity of classes, in order to enhance the class separation, two PolSAR images from two different seasons (leaf-on and leaf-off) were used simultaneously. According to the achieved results, the maximum improvement in the overall classification accuracy of WMRF (combining Wishart and MRF) over the Wishart classifier was obtained when the leaf-on image was used. The highest accuracy was obtained when using the combined datasets. In this case, the overall accuracies of the Wishart and WMRF methods were 72.66% and 78.95% respectively.

  20. SUPERVISED CLASSIFICATION OF POLARIMETRIC SAR IMAGERY USING TEMPORAL AND CONTEXTUAL INFORMATION

    Directory of Open Access Journals (Sweden)

    A. Dargahi

    2013-09-01

    Full Text Available Using the context as a source of ancillary information in the classification process provides a powerful tool to obtain better class discrimination. Modelling context using Markov Random Fields (MRFs) and combining this with a Bayesian approach, a context-based supervised classification method is proposed. In this framework, to make full use of the statistical a priori knowledge of the data, the spatial relation of the neighbouring pixels was used. The proposed context-based algorithm combines a Gaussian-based Wishart distribution of PolSAR images with temporal and contextual information. This combination was done through Bayes decision theory: the class-conditional probability density function and the prior probability are modelled by the Wishart distribution and the MRF model. Given the complexity and similarity of classes, in order to enhance the class separation, two PolSAR images from two different seasons (leaf-on and leaf-off) were used simultaneously. According to the achieved results, the maximum improvement in the overall classification accuracy of WMRF (combining Wishart and MRF) over the Wishart classifier was obtained when the leaf-on image was used. The highest accuracy was obtained when using the combined datasets. In this case, the overall accuracies of the Wishart and WMRF methods were 72.66% and 78.95% respectively.

  1. Semi-supervised vibration-based classification and condition monitoring of compressors

    Science.gov (United States)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.

  2. Semi-Supervised Projective Non-Negative Matrix Factorization for Cancer Classification.

    Directory of Open Access Journals (Sweden)

    Xiang Zhang

    Full Text Available Advances in DNA microarray technologies have made gene expression profiles a significant candidate in identifying different types of cancers. Traditional learning-based cancer identification methods utilize labeled samples to train a classifier, but they are inconvenient for practical application because labels are quite expensive in the clinical cancer research community. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) to learn an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from concatenated labeled and unlabeled samples and indicates classes by the positions of the maximum entries of their coefficients. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples in the learned subspace, it can learn more representative subspaces and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. The experimental results of cancer classification for two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms the representative methods.

  3. Semi-Supervised Learning Techniques in AO Applications: A Novel Approach To Drift Counteraction

    Science.gov (United States)

    De Vito, S.; Fattoruso, G.; Pardo, M.; Tortorella, F.; Di Francia, G.

    2011-11-01

    In this work we proposed and tested the use of SSL techniques in the AO domain. The SSL characteristics have been exploited to reduce the need for costly supervised samples and the effects of the time-dependent drift of state-of-the-art statistical learning approaches. For this purpose, a one-year-long atmospheric pollution dataset recorded in the field has been used. The semi-supervised approach benefited from the use of updated unlabeled samples, adapting its knowledge to the slowly changing drift effects. We expect that semi-supervised learning can provide significant advantages to the performance of sensor fusion subsystems in artificial olfaction, exhibiting an interesting drift counteraction effect.

  4. Weakly Supervised Segmentation-Aided Classification of Urban Scenes from 3d LIDAR Point Clouds

    Science.gov (United States)

    Guinard, S.; Landrieu, L.

    2017-05-01

    We consider the problem of the semantic classification of 3D LiDAR point clouds obtained from urban scenes when the training set is limited. We propose a non-parametric segmentation model for urban scenes composed of anthropic objects of simple shapes, partitioning the scene into geometrically homogeneous segments whose size is determined by the local complexity. This segmentation can be integrated into a conditional random field classifier (CRF) in order to capture the high-level structure of the scene. For each cluster, this allows us to aggregate the noisy predictions of a weakly-supervised classifier to produce a higher confidence data term. We demonstrate the improvement provided by our method on two publicly available large-scale data sets.

  5. Improved Gait Classification with Different Smoothing Techniques

    Directory of Open Access Journals (Sweden)

    Hu Ng

    2011-01-01

    Full Text Available Gait as a biometric has received great attention nowadays as it can offer human identification at a distance without any contact with the feature-capturing device. This is motivated by the increasing number of synchronised closed-circuit television (CCTV) cameras which have been installed in many major towns in order to monitor and prevent crime by identifying the criminal or suspect. This paper presents a method to improve gait classification results by applying smoothing techniques to the extracted gait features. The proposed approach consists of three parts: extraction of human gait features from an enhanced human silhouette, a smoothing process on the extracted gait features, and classification by fuzzy k-nearest neighbours (KNN). The extracted gait features are the height, width, crotch height, step-size of the human silhouette and joint trajectories. To improve the recognition rate, two of these extracted gait features are smoothed before the classification process in order to alleviate the effect of outliers. The proposed approach has been applied to a dataset of nine subjects walking bidirectionally on an indoor pathway with twelve different covariate factors. From the experimental results, it can be concluded that the proposed approach is effective for gait classification.
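
    The general idea, smoothing noisy gait feature sequences before nearest-neighbour classification, can be sketched as below; a plain moving average and a crisp KNN stand in for the paper's smoothing techniques and fuzzy KNN, and all data are synthetic.

```python
# Sketch: smooth noisy per-frame gait signals (e.g. step-size) with a moving
# average, then classify subjects with k-nearest neighbours.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def moving_average(seq, window=5):
    return np.convolve(seq, np.ones(window) / window, mode="same")

rng = np.random.default_rng(2)
n_subjects, seq_len = 9, 120
X_raw, y = [], []
for subject in range(n_subjects):
    for _ in range(12):                                # twelve covariate conditions
        base = np.sin(np.linspace(0, 4 * np.pi, seq_len)) * (1 + 0.05 * subject)
        X_raw.append(base + 0.3 * rng.normal(size=seq_len))
        y.append(subject)
X_raw, y = np.array(X_raw), np.array(y)
X_smooth = np.array([moving_average(seq) for seq in X_raw])

knn = KNeighborsClassifier(n_neighbors=3)
print("raw features     :", round(float(cross_val_score(knn, X_raw, y, cv=4).mean()), 3))
print("smoothed features:", round(float(cross_val_score(knn, X_smooth, y, cv=4).mean()), 3))
```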

  6. Cortex transform and its application for supervised texture classification of digital images

    Science.gov (United States)

    Bashar, M. K.; Ohnishi, Noboru; Shevgaonkar, R. K.

    2002-02-01

    This paper proposes a localized multi-channel filtering approach to image texture analysis based on the cortical behavior of the Human Visual System (HVS). In our approach, a 2D Gaussian function in the frequency domain, called the cortex filter, is used to model the band-pass nature of simple cells in the HVS. A block-based iterative method is presented. In each pass, a square block of data is captured and cortex filters at various directions and radial bands are applied to filter out the available texture information in that block. Such decomposition results in a set of band-pass images from a single input image, and we call it the Cortex Transform (CT). We use the filter responses in each pass to compute the representative texture features, i.e., the average filtered energies. The procedure is repeated for the subsequent blocks of data until the whole image is scanned. The various energy values calculated above are stored in different arrays or files and are regarded as feature images. The obtained feature images are then combined with a minimum distance classifier for supervised texture classification. We demonstrate the algorithm on real-world and synthetic images from various sources. Confusion matrix analysis shows a high average overall classification accuracy (97.01%) for our CT-based approach, in comparison with that (71.27%) of the popular gray level co-occurrence matrix (GLCM) approach.
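
    A rough sketch of the idea, Gaussian band-pass filtering in the frequency domain, average filtered energy as features, and a nearest-class-mean (minimum distance) rule, is shown below; the filter parameterization and the data are illustrative assumptions, not the paper's cortex filters.

```python
# Sketch: frequency-domain Gaussian band-pass filter bank, blockwise average
# energies as texture features, minimum-distance (nearest class mean) labels.
import numpy as np

def gaussian_bandpass(shape, radius, theta, sigma_r=0.1, sigma_t=0.4):
    """2D Gaussian selective in radial frequency and orientation (illustrative)."""
    h, w = shape
    fy, fx = np.meshgrid(np.fft.fftfreq(h), np.fft.fftfreq(w), indexing="ij")
    r = np.sqrt(fx ** 2 + fy ** 2)
    dtheta = np.angle(np.exp(1j * (np.arctan2(fy, fx) - theta)))   # wrapped angle difference
    return np.exp(-(r - radius) ** 2 / (2 * sigma_r ** 2)) * np.exp(-dtheta ** 2 / (2 * sigma_t ** 2))

def block_energy_features(block, radii=(0.1, 0.25), thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    spectrum = np.fft.fft2(block)
    feats = []
    for radius in radii:
        for theta in thetas:
            response = np.fft.ifft2(spectrum * gaussian_bandpass(block.shape, radius, theta))
            feats.append(np.mean(np.abs(response) ** 2))           # average filtered energy
    return np.array(feats)

# Two synthetic textures; a test block is assigned to the nearest class mean.
rng = np.random.default_rng(3)
x = np.linspace(0, 8 * np.pi, 64)
tex_a = np.sin(np.add.outer(x, 0 * x)) + 0.2 * rng.normal(size=(64, 64))   # varies along rows
tex_b = np.sin(np.add.outer(0 * x, x)) + 0.2 * rng.normal(size=(64, 64))   # varies along columns
means = {"A": block_energy_features(tex_a), "B": block_energy_features(tex_b)}
test = np.sin(np.add.outer(x, 0 * x)) + 0.2 * rng.normal(size=(64, 64))
dists = {k: np.linalg.norm(block_energy_features(test) - m) for k, m in means.items()}
print("assigned class:", min(dists, key=dists.get))                # should report "A"
```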

  7. Supervised Classification Processes for the Characterization of Heritage Elements, Case Study: Cuenca-Ecuador

    Science.gov (United States)

    Briones, J. C.; Heras, V.; Abril, C.; Sinchi, E.

    2017-08-01

    The proper control of built heritage entails many challenges related to the complexity of heritage elements and the extent of the area to be managed, for which the available resources must be used efficiently. In this scenario, the preventive conservation approach, based on the concept that prevention is better than cure, emerges as a strategy to avoid the progressive and imminent loss of monuments and heritage sites. Regular monitoring is a key tool for the timely identification of changes in heritage assets. This research demonstrates that a supervised learning model (Support Vector Machines - SVM) is an ideal tool to support the monitoring process by detecting visible elements in aerial images such as roof structures, vegetation and pavements. The linear, Gaussian and polynomial kernel functions were tested; the linear function provided better results than the other functions. It is important to mention that, due to the high level of segmentation generated by the classification procedure, it was necessary to apply a generalization process through a morphological opening operation, which simplified the over-classification of the monitored elements.
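
    The two steps described above, kernel comparison for the SVM classifier and generalization of the over-segmented output by a morphological opening, might look roughly like the following sketch on placeholder data.

```python
# Hedged sketch: compare SVM kernels on labelled samples, then clean a noisy
# binary class mask with a morphological opening. Data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from scipy.ndimage import binary_opening

X, y = make_classification(n_samples=400, n_features=10, n_informative=5, random_state=0)
for kernel in ("linear", "rbf", "poly"):
    score = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    print(f"{kernel:>6} kernel accuracy: {score:.3f}")

rng = np.random.default_rng(4)
mask = np.zeros((40, 40), dtype=bool)
mask[10:30, 10:30] = True                       # a roof-like region
mask |= rng.random((40, 40)) > 0.97             # isolated pixels from over-classification
cleaned = binary_opening(mask, structure=np.ones((3, 3)))
print("isolated pixels removed:", int(mask.sum() - cleaned.sum()))
```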

  8. A comparison of supervised, unsupervised and synthetic land use classification methods in the north of Iran

    NARCIS (Netherlands)

    Mohammady, M.; Moradi, H.R.; Zeinivand, H.; Temme, A.J.A.M.

    2015-01-01

    Land use classification is often the first step in land use studies and thus forms the basis for many earth science studies. In this paper, we focus on low-cost techniques for combining Landsat images with geographic information system approaches to create a land use map. In the Golestan region of I

  9. Semi-Supervised Bayesian Classification of Materials with Impact-Echo Signals

    Directory of Open Access Journals (Sweden)

    Jorge Igual

    2015-05-01

    Full Text Available The detection and identification of internal defects in a material require the use of some technology that translates the hidden interior damages into observable signals with different signature-defect correspondences. We apply impact-echo techniques for this purpose. The materials are classified according to their defective status (homogeneous, one defect or multiple defects) and kind of defect (hole or crack, passing through or not). Every specimen is impacted by a hammer, and the spectrum of the propagated wave is recorded. This spectrum is the input data to a Bayesian classifier that is based on the modeling of the conditional probabilities with a mixture of Gaussians. The parameters of the Gaussian mixtures and the class probabilities are estimated using an extended expectation-maximization algorithm. The advantage of our proposal is that it is flexible, since it obtains good results for a wide range of models even under little supervision; e.g., it obtains a harmonic average of precision and recall value of 92.38% given only a 10% supervision ratio. We test the method with real specimens made of aluminum alloy. The results show that the algorithm works very well. This technique could be applied in many industrial problems, such as the optimization of the marble cutting process.
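
    The classification rule can be sketched as follows: each class-conditional density is a Gaussian mixture and spectra are labelled by the Bayes rule. The sketch uses fully labelled synthetic data and plain EM; the paper's extended, partially supervised EM is not reproduced.

```python
# Sketch: per-class Gaussian mixtures as class-conditional densities, Bayes
# rule for classification. Class names follow the defect categories above,
# but the spectra are synthetic placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
classes = ("homogeneous", "one defect", "multiple defects")
train = {c: rng.normal(loc=i, scale=1.0, size=(80, 6)) for i, c in enumerate(classes)}

models, priors = {}, {}
n_total = sum(len(X) for X in train.values())
for c, X in train.items():
    models[c] = GaussianMixture(n_components=2, random_state=0).fit(X)
    priors[c] = len(X) / n_total

def classify(spectrum):
    # log p(c | x) is proportional to log p(x | c) + log p(c)
    scores = {c: models[c].score_samples(spectrum[None, :])[0] + np.log(priors[c])
              for c in classes}
    return max(scores, key=scores.get)

print(classify(rng.normal(loc=2.0, scale=1.0, size=6)))   # most likely "multiple defects"
```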

  10. Semi-supervised Bayesian classification of materials with impact-echo signals.

    Science.gov (United States)

    Igual, Jorge; Salazar, Addisson; Safont, Gonzalo; Vergara, Luis

    2015-05-19

    The detection and identification of internal defects in a material require the use of some technology that translates the hidden interior damages into observable signals with different signature-defect correspondences. We apply impact-echo techniques for this purpose. The materials are classified according to their defective status (homogeneous, one defect or multiple defects) and kind of defect (hole or crack, passing through or not). Every specimen is impacted by a hammer, and the spectrum of the propagated wave is recorded. This spectrum is the input data to a Bayesian classifier that is based on the modeling of the conditional probabilities with a mixture of Gaussians. The parameters of the Gaussian mixtures and the class probabilities are estimated using an extended expectation-maximization algorithm. The advantage of our proposal is that it is flexible, since it obtains good results for a wide range of models even under little supervision; e.g., it obtains a harmonic average of precision and recall value of 92.38% given only a 10% supervision ratio. We test the method with real specimens made of aluminum alloy. The results show that the algorithm works very well. This technique could be applied in many industrial problems, such as the optimization of the marble cutting process.

  11. A multi-label, semi-supervised classification approach applied to personality prediction in social media.

    Science.gov (United States)

    Lima, Ana Carolina E S; de Castro, Leandro Nunes

    2014-10-01

    Social media allow web users to create and share content pertaining to different subjects, exposing their activities, opinions, feelings and thoughts. In this context, online social media has attracted the interest of data scientists seeking to understand behaviours and trends, whilst collecting statistics for social sites. One potential application for these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature, in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. Also, the proposed approach extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model and allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems and solved by means of a semi-supervised learning approach, due to the difficulty in annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naïve Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict the personality of Tweets taken from three datasets available in the literature, and resulted in an approximately 83% accurate prediction, with some of the personality traits presenting better individual classification rates than others.
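
    The problem transformation described above, five binary problems, one per Big Five trait, can be sketched as a simple binary-relevance loop; the meta-attributes and labels are random placeholders and the semi-supervised use of unlabelled users is not reproduced.

```python
# Sketch of binary relevance for the Big Five traits: one binary classifier
# per trait, trained on (hypothetical) meta-attributes extracted from a
# user's group of texts.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]
rng = np.random.default_rng(6)
X = rng.normal(size=(300, 15))                    # meta-attributes per user (placeholder)
Y = rng.integers(0, 2, size=(300, len(TRAITS)))   # one binary label per trait

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.3, random_state=0)
for j, trait in enumerate(TRAITS):                # binary relevance: one model per trait
    clf = GaussianNB().fit(X_tr, Y_tr[:, j])
    print(f"{trait:>17}: accuracy {clf.score(X_te, Y_te[:, j]):.3f}")
```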

  12. Supervised learning classification models for prediction of plant virus encoded RNA silencing suppressors.

    Directory of Open Access Journals (Sweden)

    Zeenia Jagga

    Full Text Available Viral encoded RNA silencing suppressor proteins interfere with the host RNA silencing machinery, facilitating viral infection by evading host immunity. In plant hosts, the viral proteins have several basic science implications and biotechnology applications. However, in silico identification of these proteins is limited by their high sequence diversity. In this study we developed supervised learning based classification models for plant viral RNA silencing suppressor proteins in plant viruses. We developed four classifiers based on supervised learning algorithms: J48, Random Forest, LibSVM and Naïve Bayes algorithms, with enriched model learning by correlation-based feature selection. Structural and physicochemical features calculated for experimentally verified primary protein sequences were used to train the classifiers. The training features include amino acid composition; auto correlation coefficients; composition, transition, and distribution of various physicochemical properties; and pseudo amino acid composition. Performance analysis of predictive models based on 10-fold cross-validation and independent data testing revealed that the Random Forest based model was the best and achieved 86.11% overall accuracy and 86.22% balanced accuracy with a remarkably high area under the Receiver Operating Characteristic curve of 0.95 to predict viral RNA silencing suppressor proteins. The prediction models for plant viral RNA silencing suppressors can potentially aid identification of novel viral RNA silencing suppressors, which will provide valuable insights into the mechanism of RNA silencing and could be further explored as potential targets for designing novel antiviral therapeutics. Also, the key subset of identified optimal features may help in determining compositional patterns in the viral proteins which are important determinants for RNA silencing suppressor activities. The best prediction model developed in the study is available as a

  13. Gaia eclipsing binary and multiple systems. Supervised classification and self-organizing maps

    Science.gov (United States)

    Süveges, M.; Barblan, F.; Lecoeur-Taïbi, I.; Prša, A.; Holl, B.; Eyer, L.; Kochoska, A.; Mowlavi, N.; Rimoldini, L.

    2017-07-01

    Context. Large surveys producing tera- and petabyte-scale databases require machine-learning and knowledge discovery methods to deal with the overwhelming quantity of data and the difficulties of extracting concise, meaningful information with reliable assessment of its uncertainty. This study investigates the potential of a few machine-learning methods for the automated analysis of eclipsing binaries in the data of such surveys. Aims: We aim to aid the extraction of samples of eclipsing binaries from such databases and to provide basic information about the objects. We intend to estimate class labels according to two different, well-known classification systems, one based on the light curve morphology (EA/EB/EW classes) and the other based on the physical characteristics of the binary system (system morphology classes; detached through overcontact systems). Furthermore, we explore low-dimensional surfaces along which the light curves of eclipsing binaries are concentrated, and consider their use in the characterization of the binary systems and in the exploration of biases of the full unknown Gaia data with respect to the training sets. Methods: We have explored the performance of principal component analysis (PCA), linear discriminant analysis (LDA), Random Forest classification and self-organizing maps (SOM) for the above aims. We pre-processed the photometric time series by combining a double Gaussian profile fit and a constrained smoothing spline, in order to de-noise and interpolate the observed light curves. We achieved further denoising, and selected the most important variability elements from the light curves using PCA. Supervised classification was performed using Random Forest and LDA based on the PC decomposition, while SOM gives a continuous 2-dimensional manifold of the light curves arranged by a few important features. We estimated the uncertainty of the supervised methods due to the specific finite training set using ensembles of models constructed

  14. Cloud detection in all-sky images via multi-scale neighborhood features and multiple supervised learning techniques

    Science.gov (United States)

    Cheng, Hsu-Yung; Lin, Chih-Lung

    2017-01-01

    Cloud detection is important for providing necessary information such as cloud cover in many applications. Existing cloud detection methods include red-to-blue ratio thresholding and other classification-based techniques. In this paper, we propose to perform cloud detection using supervised learning techniques with multi-resolution features. One of the major contributions of this work is that the features are extracted from local image patches with different sizes to include local structure and multi-resolution information. The cloud models are learned through the training process. We consider classifiers including random forest, support vector machine, and Bayesian classifier. To take advantage of the clues provided by multiple classifiers and various levels of patch sizes, we employ a voting scheme to combine the results to further increase the detection accuracy. In the experiments, we have shown that the proposed method can distinguish cloud and non-cloud pixels more accurately compared with existing works.
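
    The voting step can be sketched with scikit-learn's ensemble voting over the three classifier families named above; the patch features are synthetic stand-ins and the multi-resolution extraction is not shown.

```python
# Sketch: combine a random forest, an SVM and a naive Bayes classifier by
# (soft) voting on cloud vs. non-cloud patch features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=8,
                           random_state=0)       # 1 = cloud, 0 = non-cloud (placeholder)
voter = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("svm", SVC(probability=True, random_state=0)),
                ("nb", GaussianNB())],
    voting="soft")
print(f"voting accuracy: {cross_val_score(voter, X, y, cv=5).mean():.3f}")
```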

  15. Efficient Plant Supervision Strategy Using NN Based Techniques

    Science.gov (United States)

    Garcia, Ramon Ferreiro; Rolle, Jose Luis Calvo; Castelo, Francisco Javier Perez

    Most non-linear type-one and type-two control systems suffer from a lack of detectability when model-based techniques are applied to FDI (fault detection and isolation) tasks. In general, all types of processes also suffer from a lack of detectability due to the ambiguity in discriminating between the process, sensors and actuators in order to isolate any given fault. This work deals with a strategy to detect and isolate faults which combines massive neural-network-based functional approximation procedures with recursive rule-based techniques applied to a parity space approach.

  16. Generating a Spanish Affective Dictionary with Supervised Learning Techniques

    Science.gov (United States)

    Bermudez-Gonzalez, Daniel; Miranda-Jiménez, Sabino; García-Moreno, Raúl-Ulises; Calderón-Nepamuceno, Dora

    2016-01-01

    Nowadays, machine learning techniques are being used in several Natural Language Processing (NLP) tasks such as Opinion Mining (OM). OM is used to analyse and determine the affective orientation of texts. Usually, OM approaches use affective dictionaries in order to conduct sentiment analysis. These lexicons are labeled manually with affective…

  17. Automated Classification and Correlation of Drill Cores using High-Resolution Hyperspectral Images and Supervised Pattern Classification Algorithms. Applications to Paleoseismology

    Science.gov (United States)

    Ragona, D. E.; Minster, B.; Rockwell, T.; Jasso, H.

    2006-12-01

    The standard methodology to describe, classify and correlate geologic materials in the field or lab relies on physical inspection of samples, sometimes with the assistance of conventional analytical techniques (e.g. XRD, microscopy, particle size analysis). This is commonly both time-consuming and inherently subjective. Many geological materials share identical visible properties (e.g. fine grained materials, alteration minerals) and therefore cannot be mapped using the human eye alone. Recent investigations have shown that ground-based hyperspectral imaging provides an effective method to study and digitally store stratigraphic and structural data from cores or field exposures. Neural networks and Naive Bayesian classifiers supply a variety of well-established techniques for pattern recognition, especially for data examples with high-dimensionality inputs and outputs. In this poster, we present a new methodology for automatic mapping of sedimentary stratigraphy in the lab (drill cores, samples) or the field (outcrops, exposures) using short wave infrared (SWIR) hyperspectral images and these two supervised classification algorithms. High-spatial/spectral resolution data from large sediment samples (drill cores) from a paleoseismic excavation site were collected using a portable hyperspectral scanner with 245 continuous channels measured across the 960 to 2404 nm spectral range. The data were corrected for geometric and radiometric distortions and pre-processed to obtain reflectance at each pixel of the images. We built an example set using hundreds of reflectance spectra collected from the sediment core images. The examples were grouped into eight classes corresponding to materials found in the samples. We constructed two additional example sets by computing the 2-norm normalization and the derivative of the smoothed original reflectance examples. Each example set was divided into four subsets: training, training test, verification and validation. A multi

  18. Material classification and automatic content enrichment of images using supervised learning and knowledge bases

    Science.gov (United States)

    Mallepudi, Sri Abhishikth; Calix, Ricardo A.; Knapp, Gerald M.

    2011-02-01

    In recent years there has been a rapid increase in the size of video and image databases. Effective searching and retrieving of images from these databases is a significant current research area. In particular, there is a growing interest in query capabilities based on semantic image features such as objects, locations, and materials, known as content-based image retrieval. This study investigated mechanisms for identifying materials present in an image. These capabilities provide additional information impacting conditional probabilities about images (e.g. objects made of steel are more likely to be buildings). These capabilities are useful in Building Information Modeling (BIM) and in automatic enrichment of images. I2T methodologies are a way to enrich an image by generating text descriptions based on image analysis. In this work, a learning model is trained to detect certain materials in images. To train the model, an image dataset was constructed containing single material images of bricks, cloth, grass, sand, stones, and wood. For generalization purposes, an additional set of 50 images containing multiple materials (some not used in training) was constructed. Two different supervised learning classification models were investigated: a single multi-class SVM classifier, and multiple binary SVM classifiers (one per material). Image features included Gabor filter parameters for texture, and color histogram data for RGB components. All classification accuracy scores using the SVM-based method were above 85%. The second model helped in gathering more information from the images since it assigned multiple classes to the images. A framework for the I2T methodology is presented.

  19. Semi-supervised classification of emotional pictures based on feature combination

    Science.gov (United States)

    Li, Shuo; Zhang, Yu-Jin

    2011-02-01

    Can the abundant emotions reflected in pictures be classified automatically by computer? Previous research considered only the visual features extracted from images, which have limited capability to reveal the various emotions. In addition, the training database utilized by previous methods is a subset of the International Affective Picture System (IAPS) that has a relatively small scale, which negatively affects the discriminative power of emotion classifiers. To solve the above problems, this paper proposes a novel and practical emotional picture classification approach, using a semi-supervised learning scheme with both visual feature and keyword tag information. Besides the IAPS, with both emotion labels and keyword tags, as part of the training dataset, nearly 2000 pictures with only keyword tags downloaded from the website Flickr form an auxiliary training dataset. The visual feature of the latent emotional semantic factors is extracted by a probabilistic Latent Semantic Analysis (pLSA) model, while the text feature is described by binary vectors over the tag vocabulary. A first Linear Programming Boost (LPBoost) classifier, trained on the samples from the IAPS, combines the above two features and aims to label the other training samples from the internet. A second SVM classifier, trained on all training images using only the visual feature, then classifies the test images. In the experiments, the categorization performance of our approach is better than that of the latest methods.

  20. Automated segmentation of geographic atrophy in fundus autofluorescence images using supervised pixel classification.

    Science.gov (United States)

    Hu, Zhihong; Medioni, Gerard G; Hernandez, Matthias; Sadda, Srinivas R

    2015-01-01

    Geographic atrophy (GA) is a manifestation of the advanced or late stage of age-related macular degeneration (AMD). AMD is the leading cause of blindness in people over the age of 65 in the western world. The purpose of this study is to develop a fully automated supervised pixel classification approach for segmenting GA, including uni- and multifocal patches, in fundus autofluorescence (FAF) images. The image features include region-wise intensity measures, gray-level co-occurrence matrix measures, and Gaussian filter banks. A k-nearest-neighbor pixel classifier is applied to obtain a GA probability map, representing the likelihood that an image pixel belongs to GA. Sixteen randomly chosen FAF images were obtained from 16 subjects with GA. The algorithm-defined GA regions are compared with manual delineation performed by a certified image reading center grader. Eight-fold cross-validation is applied to evaluate the algorithm performance. The mean overlap ratio (OR), area correlation (Pearson's [Formula: see text]), accuracy (ACC), true positive rate (TPR), specificity (SPC), positive predictive value (PPV), and false discovery rate (FDR) between the algorithm- and manually defined GA regions are [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text], respectively.
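
    A much-simplified sketch of supervised pixel classification for GA follows: per-pixel features (intensity plus a small Gaussian filter bank, with the GLCM measures omitted) feed a k-nearest-neighbour classifier that yields a probability map; the image below is synthetic, not FAF data.

```python
# Sketch: k-NN pixel classification producing a GA-like probability map from
# simple per-pixel features on a synthetic image.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(7)
image = rng.normal(0.4, 0.05, size=(64, 64))
truth = np.zeros((64, 64), dtype=int)
truth[20:45, 15:40] = 1                            # a bright "atrophic" patch
image[truth == 1] += 0.3

def pixel_features(img):
    maps = [img] + [gaussian_filter(img, s) for s in (1, 2, 4)]   # small filter bank
    return np.stack([m.ravel() for m in maps], axis=1)

X, y = pixel_features(image), truth.ravel()
train_idx = rng.choice(X.shape[0], size=1500, replace=False)      # sparse training pixels
knn = KNeighborsClassifier(n_neighbors=15).fit(X[train_idx], y[train_idx])

prob_map = knn.predict_proba(X)[:, 1].reshape(image.shape)        # per-pixel GA probability
seg = prob_map > 0.5
overlap = np.logical_and(seg, truth == 1).sum() / np.logical_or(seg, truth == 1).sum()
print("overlap ratio with ground truth:", round(float(overlap), 3))
```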

  1. Applying Supervised Opinion Mining Techniques on Online User Reviews

    Directory of Open Access Journals (Sweden)

    Ion SMEUREANU

    2012-01-01

    Full Text Available In recent years, the spectacular development of web technologies has led to an enormous quantity of user-generated information in online systems. This large amount of information on web platforms makes them viable for use as data sources in applications based on opinion mining and sentiment analysis. The paper proposes an algorithm for detecting sentiments in movie user reviews, based on a naive Bayes classifier. We analyze the opinion mining domain, the techniques used in sentiment analysis and their applicability. We implemented the proposed algorithm, tested its performance, and suggested directions for development.
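
    In the spirit of the approach above, a naive Bayes sentiment classifier over bag-of-words counts can be sketched in a few lines; the tiny inline corpus is purely illustrative, and a real system would train on a labelled review dataset.

```python
# Minimal naive Bayes sentiment sketch: bag-of-words counts + multinomial NB.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_reviews = [
    "a wonderful movie with brilliant acting",
    "great plot and a moving performance",
    "terrible pacing and a boring script",
    "awful film, a complete waste of time",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_reviews, train_labels)
print(model.predict(["a brilliant performance despite the boring script"]))
```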

  2. DIAGNOSIS OF DIABETES USING CLASSIFICATION MINING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Aiswarya Iyer

    2015-01-01

    Full Text Available Diabetes has affected over 246 million people worldwide, with a majority of them being women. According to the WHO report, by 2025 this number is expected to rise to over 380 million. The disease has been named the fifth deadliest disease in the United States, with no imminent cure in sight. With the rise of information technology and its continued advent into the medical and healthcare sector, the cases of diabetes as well as their symptoms are well documented. This paper aims at finding solutions to diagnose the disease by analyzing the patterns found in the data through classification analysis, employing Decision Tree and Naïve Bayes algorithms. The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients.
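
    The two algorithms named above can be compared with a short sketch; since the diabetes records themselves are not reproduced here, a bundled scikit-learn medical dataset stands in for the patient attributes.

```python
# Illustrative Decision Tree vs. Naive Bayes comparison on a placeholder
# tabular medical dataset (not the diabetes data used in the paper).
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for name, clf in [("Decision Tree", DecisionTreeClassifier(random_state=0)),
                  ("Naive Bayes", GaussianNB())]:
    acc = cross_val_score(clf, X, y, cv=10).mean()
    print(f"{name:>13}: 10-fold CV accuracy {acc:.3f}")
```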

  3. School Counselors' Use of Solution-Focused Tenets and Techniques in School-Based Site Supervision

    Science.gov (United States)

    Cigrand, Dawnette L.; Wood, Susannah M.; Duys, David

    2014-01-01

    The tenets and techniques of solution-focused (SF) theory have potential for application to school counseling site supervision; however, research on the use of these practices in site supervision is needed. This study examined the extent to which school counseling site supervisors integrated SF tenets and techniques into their supervisory…

  4. Generation of a supervised classification algorithm for time-series variable stars with an application to the LINEAR dataset

    Science.gov (United States)

    Johnston, K. B.; Oluseyi, H. M.

    2017-04-01

    With the advent of digital astronomy, new benefits and new problems have been presented to the modern day astronomer. While data can be captured in a more efficient and accurate manner using digital means, the efficiency of data retrieval has led to an overload of scientific data for processing and storage. This paper will focus on the construction and application of a supervised pattern classification algorithm for the identification of variable stars. Given the reduction of a survey of stars into a standard feature space, the problem of using prior patterns to identify new observed patterns can be reduced to time-tested classification methodologies and algorithms. Such supervised methods, so called because the user trains the algorithms prior to application using patterns with known classes or labels, provide a means to probabilistically determine the estimated class type of new observations. This paper will demonstrate the construction and application of a supervised classification algorithm on variable star data. The classifier is applied to a set of 192,744 LINEAR data points. Of the original samples, 34,451 unique stars were classified with high confidence (high level of probability of being the true class).

  5. Optimal Subset Selection of Time-Series MODIS Images and Sample Data Transfer with Random Forests for Supervised Classification Modelling.

    Science.gov (United States)

    Zhou, Fuqun; Zhang, Aining

    2016-10-25

    Nowadays, various time-series Earth Observation data with multiple bands are freely available, such as Moderate Resolution Imaging Spectroradiometer (MODIS) datasets, including 8-day composites from NASA and 10-day composites from the Canada Centre for Remote Sensing (CCRS). It is challenging to use these time-series MODIS datasets efficiently for long-term environmental monitoring due to their vast volume and information redundancy. This challenge will be greater when Sentinel 2-3 data become available. Another challenge that researchers face is the lack of in-situ data for supervised modelling, especially for time-series data analysis. In this study, we attempt to tackle these two important issues with a case study of land cover mapping using CCRS 10-day MODIS composites with the help of two Random Forests features: variable importance and outlier identification. The variable importance feature is used to analyze and select optimal subsets of time-series MODIS imagery for efficient land cover mapping, and the outlier identification feature is utilized for transferring sample data available from one year to an adjacent year for supervised classification modelling. The results of the case study of agricultural land cover classification at a regional scale show that using only about half of the variables we can achieve land cover classification accuracy close to that generated using the full dataset. The proposed simple but effective solution of sample transferring could make supervised modelling possible for applications lacking sample data.
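
    The variable-importance idea can be sketched as below: rank the time-series variables with a random forest, keep roughly the top half, and compare accuracy against the full set; synthetic data stand in for the 10-day composites and field samples.

```python
# Sketch: random forest variable importance used to pick an "optimal" subset
# (top half of variables), then accuracy compared with the full variable set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=36, n_informative=10,
                           random_state=0)        # e.g. 36 composite dates as variables
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

order = np.argsort(rf.feature_importances_)[::-1]
top_half = order[: X.shape[1] // 2]               # keep the 18 most important variables

full = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0), X, y, cv=5).mean()
half = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0), X[:, top_half], y, cv=5).mean()
print(f"full set accuracy: {full:.3f}   top-half accuracy: {half:.3f}")
```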

  6. Mapping of riparian invasive species with supervised classification of Unmanned Aerial System (UAS) imagery

    Science.gov (United States)

    Michez, Adrien; Piégay, Hervé; Jonathan, Lisein; Claessens, Hugues; Lejeune, Philippe

    2016-02-01

    Riparian zones are key landscape features, representing the interface between terrestrial and aquatic ecosystems. Although they have been influenced by human activities for centuries, their degradation has increased during the 20th century. Concomitant with (or as a consequence of) these disturbances, the invasion of exotic species has increased throughout the world's riparian zones. In our study, we propose an easily reproducible methodological framework to map three riparian invasive taxa using Unmanned Aerial Systems (UAS) imagery: Impatiens glandulifera Royle, Heracleum mantegazzianum Sommier and Levier, and Japanese knotweed (Fallopia sachalinensis (F. Schmidt Petrop.), Fallopia japonica (Houtt.) and hybrids). Based on visible and near-infrared UAS orthophotos, we derived simple spectral and texture image metrics computed at various scales of image segmentation (10, 30, 45, 60) using eCognition software. Supervised classification based on the random forests algorithm was used to identify the most relevant variable (or combination of variables) derived from UAS imagery for mapping riparian invasive plant species. The models were built using 20% of the dataset, the rest of the dataset being used as a test set (80%). Except for H. mantegazzianum, the best results in terms of global accuracy were achieved with the finest scale of analysis (segmentation scale parameter = 10). The best overall accuracy values reached 72%, 68%, and 97% for I. glandulifera, Japanese knotweed, and H. mantegazzianum respectively. In terms of selected metrics, simple spectral metrics (layer mean/camera brightness) were the most used. Our results also confirm the added value of texture metrics (GLCM derivatives) for mapping riparian invasive species. The results obtained for I. glandulifera and Japanese knotweed do not reach sufficient accuracies for operational applications. However, the results achieved for H. mantegazzianum are encouraging. The high accuracy values combined to

  7. A new tool for supervised classification of satellite images available on web servers: Google Maps as a case study

    Science.gov (United States)

    García-Flores, Agustín.; Paz-Gallardo, Abel; Plaza, Antonio; Li, Jun

    2016-10-01

    This paper describes a new web platform dedicated to the classification of satellite images called Hypergim. The current implementation of this platform enables users to perform classification of satellite images from any part of the world thanks to the worldwide maps provided by Google Maps. To perform this classification, Hypergim uses unsupervised algorithms such as ISODATA and K-means. Here, we present an extension of the original platform in which we adapt Hypergim to use supervised algorithms to improve the classification results. This involves a significant modification of the user interface, providing the user with a way to obtain samples of classes present in the images to use in the training phase of the classification process. Another main goal of this development is to improve the runtime of the image classification process. To achieve this goal, we use a parallel implementation of the Random Forest classification algorithm. This implementation is a modification of the well-known CURFIL software package. The use of this type of algorithm for image classification is widespread today thanks to its precision and ease of training. The actual implementation of Random Forest was developed using the CUDA platform, which enables us to exploit the potential of several models of NVIDIA graphics processing units, using them to execute general-purpose computing tasks such as image classification algorithms. As well as CUDA, we use other parallel libraries, such as Intel Boost, taking advantage of the multithreading capabilities of modern CPUs. To ensure the best possible results, the platform is deployed in a cluster of commodity graphics processing units (GPUs), so that multiple users can use the tool concurrently. The experimental results indicate that this new algorithm substantially outperforms the previous unsupervised algorithms implemented in Hypergim, in both runtime and precision of the actual classification of the images.
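
    The platform's speed-up comes from a CUDA Random Forest derived from CURFIL; the sketch below is only a CPU-parallel analogue using scikit-learn's n_jobs option, included to illustrate the idea of training trees in parallel (synthetic data, illustrative sizes).

      import time
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier

      # Synthetic multi-class pixels; sizes are illustrative only.
      X, y = make_classification(n_samples=20000, n_features=20, n_informative=10,
                                 n_classes=4, random_state=0)

      for jobs in (1, -1):          # single core versus all available cores
          t0 = time.perf_counter()
          RandomForestClassifier(n_estimators=100, n_jobs=jobs, random_state=0).fit(X, y)
          print(f"n_jobs={jobs:>2}: {time.perf_counter() - t0:.1f} s")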

  8. SAR Ice Image Classification Using Parallelepiped Classifier Based on Gram-Schmidt Spectral Technique

    Directory of Open Access Journals (Sweden)

    A. Vanitha

    2013-05-01

    Full Text Available Synthetic Aperture Radar (SAR) is a special type of imaging radar that involves advanced technology and complex data processing to obtain detailed images of the lake surface. Lake ice typically reflects more of the radar energy emitted by the sensor than the surrounding area, which makes it easy to distinguish between the water and the ice surface. In this research work, SAR images are used for ice classification based on supervised and unsupervised classification algorithms. In the pre-processing stage, Hue Saturation Value (HSV) and Gram–Schmidt spectral sharpening techniques are applied for sharpening and resampling to attain a high-resolution pixel size. Based on the performance evaluation metrics, Gram–Schmidt spectral sharpening performs better than HSV sharpening between the boundaries. In the classification stage, the Gram–Schmidt spectral technique based sharpened SAR images are used as the input for classification using the parallelepiped and ISODATA classifiers. The performances of the classifiers are evaluated with overall accuracy and the kappa coefficient. From the experimental results, ice is classified from water more accurately by the parallelepiped supervised classification algorithm.
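
    For readers unfamiliar with the parallelepiped rule, the following minimal sketch (not the authors' implementation) builds per-class min/max boxes from training pixels and assigns each pixel to the first box that contains it; toy two-class data stands in for the sharpened SAR bands.

      import numpy as np

      def fit_boxes(X, y):
          # One (min, max) box per class, spanned by its training pixels.
          return {c: (X[y == c].min(axis=0), X[y == c].max(axis=0)) for c in np.unique(y)}

      def classify(X, boxes):
          out = np.full(len(X), -1)                     # -1 = unclassified
          for c, (lo, hi) in boxes.items():
              inside = np.all((X >= lo) & (X <= hi), axis=1)
              out[(out == -1) & inside] = c
          return out

      rng = np.random.default_rng(0)
      ice = rng.normal(0.8, 0.05, (200, 3))             # bright "ice" pixels (toy data)
      water = rng.normal(0.2, 0.05, (200, 3))           # darker "water" pixels
      X = np.vstack([ice, water])
      y = np.array([1] * 200 + [0] * 200)
      print("overall accuracy:", (classify(X, fit_boxes(X, y)) == y).mean())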

  9. A comparison of classification techniques for glacier change detection using multispectral images

    OpenAIRE

    Rahul Nijhawan; Pradeep Garg; Praveen Thakur

    2016-01-01

    The main aim of this paper is to compare the classification accuracies of glacier change detection achieved by the following classifiers: a sub-pixel classification algorithm, indices-based supervised classification and an object-based algorithm using Landsat imageries. It was observed that the shadow effect was not removed in the sub-pixel based classification, whereas it was removed by the indices method. Further, the accuracy was improved by object-based classification. The objective of the paper is to analyse different classific...

  10. An Implementation Of Network Traffic Classification Technique Based On K-Medoids

    Directory of Open Access Journals (Sweden)

    Dheeraj Basant Shukla

    2014-04-01

    Full Text Available Classification of network traffic is extensively required for many network management tasks such as flow prioritization, traffic shaping/policing, and diagnostic monitoring. Many approaches have evolved for this purpose. Classical approaches such as port-number or payload analysis methods have their own limitations. For example, some applications use dynamic port numbers and encryption techniques, making these techniques ineffective. To overcome these limitations, machine learning approaches were proposed. But these approaches also have problems: the need for labeled instances in supervised learning and tedious manual work in unsupervised learning. Our aim was to implement an approach for classification of network traffic on semi-supervised data which overcomes the shortcomings of the other two approaches. In this approach, flow (instance) statistics are used to classify the traffic. These flow statistics, containing a few labeled and many unlabeled instances, constitute a training data set which was used for the training (learning) of the classifier. We then used two processes: clustering (using K-Medoids), which divides the training data into different groups, and classification, in which the groups are labeled. To build the model we used the MATLAB tool. To test the built model we used the KDD CUP 99 intrusion detection data set, which includes both attack data and normal data.
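
    A hedged sketch of the cluster-then-label idea follows; k-means stands in for K-Medoids (which is not part of core scikit-learn), and synthetic blobs stand in for flow statistics, so this illustrates the scheme rather than the authors' MATLAB implementation.

      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.datasets import make_blobs

      # Synthetic "flow statistics" with 4 traffic classes; only 60 flows carry labels.
      X, y_true = make_blobs(n_samples=3000, centers=4, n_features=6, random_state=0)
      rng = np.random.default_rng(0)
      labeled = np.zeros(len(X), dtype=bool)
      labeled[rng.choice(len(X), 60, replace=False)] = True

      km = KMeans(n_clusters=12, n_init=10, random_state=0).fit(X)   # k-means in place of K-Medoids
      cluster_label = {}
      for c in range(km.n_clusters):
          members = (km.labels_ == c) & labeled
          if members.any():                              # label the cluster by majority vote
              cluster_label[c] = np.bincount(y_true[members]).argmax()

      y_pred = np.array([cluster_label.get(c, -1) for c in km.labels_])
      mask = y_pred != -1
      print("flows labelled:", mask.mean().round(2),
            "accuracy on labelled flows:", (y_pred[mask] == y_true[mask]).mean().round(2))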

  11. Managing complex processing of medical image sequences by program supervision techniques

    Science.gov (United States)

    Crubezy, Monica; Aubry, Florent; Moisan, Sabine; Chameroy, Virginie; Thonnat, Monique; Di Paola, Robert

    1997-05-01

    Our objective is to offer clinicians wider access to evolving medical image processing (MIP) techniques, which are crucial to improve assessment and quantification of physiological processes but difficult to handle for non-specialists in MIP. Based on artificial intelligence techniques, our approach consists of the development of a knowledge-based program supervision system automating the management of MIP libraries. It comprises a library of programs, a knowledge base capturing the expertise about programs and data, and a supervision engine. It selects, organizes and executes the appropriate MIP programs given a goal to achieve and a data set, with dynamic feedback based on the results obtained. It also advises users in the development of new procedures chaining MIP programs. We have experimented with the approach in an application of factor analysis of medical image sequences as a means of predicting the response of osteosarcoma to chemotherapy, with both MRI and NM dynamic image sequences. As a result, our program supervision system frees clinical end-users from performing tasks outside their competence, permitting them to concentrate on clinical issues. Therefore our approach enables a better exploitation of the possibilities offered by MIP and higher quality results, both in terms of robustness and reliability.

  12. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark also differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are considered as classes, and we use a classification algorithm to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  13. A Novel Pre-Processing Technique for Original Feature Matrix of Electronic Nose Based on Supervised Locality Preserving Projections

    Directory of Open Access Journals (Sweden)

    Pengfei Jia

    2016-06-01

    Full Text Available An electronic nose (E-nose) consisting of 14 metal oxide gas sensors and one electronic chemical gas sensor has been constructed to identify four different classes of wound infection. However, the classification results of the E-nose are not ideal if the original feature matrix containing the maximum steady-state response value of the sensors is processed by the classifier directly, so a novel pre-processing technique based on supervised locality preserving projections (SLPP) is proposed in this paper to process the original feature matrix before it is put into the classifier, in order to improve the performance of the E-nose. SLPP is good at finding and keeping the nonlinear structure of data; furthermore, it can provide an explicit mapping expression, which traditional manifold learning methods cannot. Additionally, we employ effective optimization methods to tune the parameters of SLPP and the classifier. Experimental results show that the classification accuracy of a support vector machine (SVM) combined with data pre-processed by SLPP outperforms the other considered methods. All results make it clear that SLPP performs better in processing the original feature matrix of the E-nose.

  14. A Generalized Image Scene Decomposition-Based System for Supervised Classification of Very High Resolution Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    ZhiYong Lv

    2016-09-01

    Full Text Available Very high resolution (VHR) remote sensing images are widely used for land cover classification. However, to the best of our knowledge, few approaches have been shown to improve classification accuracies through image scene decomposition. In this paper, a simple yet powerful observational scene scale decomposition (OSSD)-based system is proposed for the classification of VHR images. Different from traditional methods, the OSSD-based system aims to improve classification performance by decomposing the complexity of an image's content. First, an image scene is divided into sub-image blocks through segmentation to decompose the image content. Subsequently, each sub-image block is classified separately, or each block is first processed through an image filter or a spectral–spatial feature extraction method, and each processed segment is then taken as the feature input of a classifier. Finally, the classified sub-maps are fused together for accuracy evaluation. The effectiveness of our proposed approach was investigated through experiments performed on different images with different supervised classifiers, namely, support vector machine, k-nearest neighbor, naive Bayes classifier, and maximum likelihood classifier. Compared with the accuracy achieved without OSSD processing, the accuracy of each classifier improved significantly, and our proposed approach shows outstanding performance in terms of classification accuracy.
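
    The decomposition idea can be illustrated with a small sketch: split a synthetic two-band image into blocks, train and apply a classifier per block, then mosaic the classified sub-maps; the block size, band count and SVM choice below are all illustrative assumptions, not the authors' settings.

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      H, W, B = 120, 120, 2                                          # image height, width, bands
      truth = (np.add.outer(np.arange(H), np.arange(W)) // 60) % 2   # 2-class ground truth
      img = rng.normal(truth[..., None], 0.4, size=(H, W, B))        # noisy 2-band image

      pred = np.zeros((H, W), dtype=int)
      block = 60
      for i in range(0, H, block):
          for j in range(0, W, block):
              Xb = img[i:i + block, j:j + block].reshape(-1, B)
              yb = truth[i:i + block, j:j + block].ravel()
              train = rng.choice(len(yb), 200, replace=False)        # "training" pixels per block
              clf = SVC(kernel="rbf").fit(Xb[train], yb[train])
              pred[i:i + block, j:j + block] = clf.predict(Xb).reshape(block, block)

      print("fused overall accuracy:", (pred == truth).mean())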

  15. Geographical provenancing of purple grape juices from different farming systems by proton transfer reaction mass spectrometry using supervised statistical techniques

    NARCIS (Netherlands)

    Granato, Daniel; Koot, Alex; Ruth, van S.M.

    2015-01-01

    BACKGROUND: Organic, biodynamic and conventional purple grape juices (PGJ; n = 79) produced in Brazil and Europe were characterized by volatile organic compounds (m/z 20-160) measured by proton transfer reaction mass spectrometry (PTR-MS), and classification models were built using supervised sta...

  16. Improvements on coronal hole detection in SDO/AIA images using supervised classification

    Science.gov (United States)

    Reiss, Martin A.; Hofmeister, Stefan J.; De Visscher, Ruben; Temmer, Manuela; Veronig, Astrid M.; Delouille, Véronique; Mampaey, Benjamin; Ahammer, Helmut

    2015-07-01

    We demonstrate the use of machine learning algorithms in combination with segmentation techniques in order to distinguish coronal holes and filaments in SDO/AIA EUV images of the Sun. Based on two coronal hole detection techniques (intensity-based thresholding, SPoCA), we prepared datasets of manually labeled coronal hole and filament channel regions present on the Sun during the time range 2011-2013. By mapping the extracted regions from EUV observations onto HMI line-of-sight magnetograms we also include their magnetic characteristics. We computed shape measures from the segmented binary maps as well as first order and second order texture statistics from the segmented regions in the EUV images and magnetograms. These attributes were used in data mining investigations to identify the best-performing rule to differentiate between coronal holes and filament channels. We applied several classifiers, namely Support Vector Machine (SVM), Linear Support Vector Machine, Decision Tree, and Random Forest, and found that all classification rules achieve good results in general, with linear SVM providing the best performance (with a true skill statistic of ≈ 0.90). Additional information from magnetic field data systematically improves the performance across all four classifiers for the SPoCA detection. Since the calculation is inexpensive in computing time, this approach is well suited for applications on real-time data. This study demonstrates how a machine learning approach may help improve upon an unsupervised feature extraction method.
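
    As an illustration of the final classification and scoring step, the sketch below trains a linear SVM on synthetic feature vectors and computes the true skill statistic from the confusion matrix; the features only stand in for the shape, texture and magnetic attributes used in the study.

      from sklearn.datasets import make_classification
      from sklearn.model_selection import train_test_split
      from sklearn.svm import LinearSVC
      from sklearn.metrics import confusion_matrix

      # Synthetic "coronal hole vs filament channel" feature vectors.
      X, y = make_classification(n_samples=1000, n_features=15, n_informative=8, random_state=1)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

      clf = LinearSVC(C=1.0, dual=False).fit(X_tr, y_tr)
      tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
      tss = tp / (tp + fn) - fp / (fp + tn)      # hit rate minus false alarm rate
      print("true skill statistic:", round(tss, 2))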

  17. Semi-supervised hyperspectral classification from a small number of training samples using a co-training approach

    Science.gov (United States)

    Romaszewski, Michał; Głomb, Przemysław; Cholewa, Michał

    2016-11-01

    We present a novel semi-supervised algorithm for classification of hyperspectral data from remote sensors. Our method is inspired by the Tracking-Learning-Detection (TLD) framework, originally applied for tracking objects in a video stream. TLD introduced the co-training approach called P-N learning, making use of two independent 'experts' (or learners) that scored samples in different feature spaces. In a similar fashion, we formulated the hyperspectral classification task as a co-training problem that can be solved with the P-N learning scheme. Our method uses both spatial and spectral features of the data, extending a small set of initial labelled samples during the process of region growing. We show that this approach is stable and achieves very good accuracy even for small training sets. We analyse the algorithm's performance on several publicly available hyperspectral data sets.
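
    A minimal co-training loop in this spirit is sketched below, assuming two disjoint feature views and Gaussian naive Bayes learners; it illustrates the general co-training idea rather than the authors' TLD-based P-N algorithm, and every name and size is an assumption.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.naive_bayes import GaussianNB

      X, y = make_classification(n_samples=2000, n_features=20, n_informative=10, random_state=2)
      views = [X[:, :10], X[:, 10:]]                    # "spectral" and "spatial" feature views
      rng = np.random.default_rng(2)
      seed = rng.choice(len(y), 40, replace=False)      # small initial labelled set
      y_work = np.full(len(y), -1)                      # -1 marks unlabelled samples
      y_work[seed] = y[seed]

      for _ in range(10):                               # a few co-training rounds
          for v in views:                               # each view adds labels to the shared pool
              known = y_work != -1
              unknown = np.flatnonzero(~known)
              if unknown.size == 0:
                  break
              clf = GaussianNB().fit(v[known], y_work[known])
              proba = clf.predict_proba(v[unknown])
              pick = np.argsort(proba.max(axis=1))[-20:]        # 20 most confident samples
              y_work[unknown[pick]] = clf.classes_[proba[pick].argmax(axis=1)]

      final = GaussianNB().fit(X[y_work != -1], y_work[y_work != -1])
      print("accuracy on all samples:", (final.predict(X) == y).mean().round(3))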

  18. Supervised Classification in the Presence of Misclassified Training Data: A Monte Carlo Simulation Study in the Three Group Case

    Directory of Open Access Journals (Sweden)

    Jocelyn E Bolin

    2014-02-01

    Full Text Available Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three-group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as the misclassification percentage increases, with random forests demonstrating the highest accuracy across conditions.
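
    The simulation design can be mimicked in a few lines: flip a growing fraction of the training labels and record the resulting test accuracy of a random forest; the group sizes, separations and noise model below are arbitrary illustrative choices, not the study's conditions.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      X, y = make_classification(n_samples=3000, n_features=10, n_informative=6,
                                 n_classes=3, n_clusters_per_class=1, random_state=3)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)
      rng = np.random.default_rng(3)

      for noise in (0.0, 0.1, 0.2, 0.3):
          y_noisy = y_tr.copy()
          flip = rng.random(len(y_tr)) < noise
          y_noisy[flip] = rng.integers(0, 3, flip.sum())     # random (possibly unchanged) label
          clf = RandomForestClassifier(n_estimators=200, random_state=3).fit(X_tr, y_noisy)
          print(f"{noise:.0%} of training labels perturbed -> test accuracy {clf.score(X_te, y_te):.3f}")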

  19. Supervised classification of aerial imagery and multi-source data fusion for flood assessment

    Science.gov (United States)

    Sava, E.; Harding, L.; Cervone, G.

    2015-12-01

    Floods are among the most devastating natural hazards, and the ability to produce an accurate and timely flood assessment before, during, and after an event is critical for their mitigation and response. Remote sensing technologies have become the de-facto approach for observing the Earth and its environment. However, satellite remote sensing data are not always available. For these reasons, it is crucial to develop new techniques in order to produce flood assessments during and after an event. Recent advancements in data fusion techniques of remote sensing with near real time heterogeneous datasets have allowed emergency responders to more efficiently extract increasingly precise and relevant knowledge from the available information. This research presents a fusion technique using satellite remote sensing imagery coupled with non-authoritative data such as Civil Air Patrol (CAP) imagery and tweets. A new computational methodology is proposed based on machine learning algorithms to automatically identify water pixels in CAP imagery. Specifically, wavelet transformations are paired with multiple classifiers, run in parallel, to build models discriminating water and non-water regions. The learned classification models are first tested against a set of control cases, and then used to automatically classify each image separately. A measure of uncertainty is computed for each pixel in an image, proportional to the number of models classifying the pixel as water. Geo-tagged tweets are continuously harvested, stored in a MongoDB database, and queried in real time. They are fused with CAP classified data, and with satellite remote sensing derived flood extent results, to produce comprehensive flood assessment maps. The final maps are then compared with FEMA generated flood extents to assess their accuracy. The proposed methodology is applied to two test cases, corresponding to the 2013 floods in Boulder, CO, and the 2015 floods in Texas.
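
    The voting and uncertainty idea can be sketched as follows: several classifiers trained on the same labelled pixels each vote water/non-water, and per-pixel disagreement serves as the uncertainty measure; the wavelet features are omitted here, and synthetic pixel features stand in for CAP imagery.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split

      # Synthetic pixel features; half the pixels play the role of the unlabelled image.
      X, y = make_classification(n_samples=4000, n_features=8, n_informative=5, random_state=4)
      X_tr, X_px, y_tr, _ = train_test_split(X, y, test_size=0.5, random_state=4)

      models = [RandomForestClassifier(n_estimators=200, random_state=4),
                LogisticRegression(max_iter=1000),
                SVC(kernel="rbf")]
      votes = np.column_stack([m.fit(X_tr, y_tr).predict(X_px) for m in models])

      water_fraction = votes.mean(axis=1)        # share of models calling the pixel "water"
      flood_map = water_fraction >= 0.5
      uncertainty = 1 - np.abs(2 * water_fraction - 1)
      print("water pixels:", int(flood_map.sum()), "mean uncertainty:", uncertainty.mean().round(2))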

  20. Improving supervised classification accuracy using non-rigid multimodal image registration: detecting prostate cancer

    Science.gov (United States)

    Chappelow, Jonathan; Viswanath, Satish; Monaco, James; Rosen, Mark; Tomaszewski, John; Feldman, Michael; Madabhushi, Anant

    2008-03-01

    Computer-aided diagnosis (CAD) systems for the detection of cancer in medical images require precise labeling of training data. For magnetic resonance (MR) imaging (MRI) of the prostate, training labels define the spatial extent of prostate cancer (CaP); the most common source for these labels is expert segmentations. When ancillary data such as whole mount histology (WMH) sections, which provide the gold standard for cancer ground truth, are available, the manual labeling of CaP can be improved by referencing WMH. However, manual segmentation is error prone, time consuming and not reproducible. Therefore, we present the use of multimodal image registration to automatically and accurately transcribe CaP from histology onto MRI following alignment of the two modalities, in order to improve the quality of training data and hence classifier performance. We quantitatively demonstrate the superiority of this registration-based methodology by comparing its results to the manual CaP annotation of expert radiologists. Five supervised CAD classifiers were trained using the labels for CaP extent on MRI obtained by the expert and 4 different registration techniques. Two of the registration methods were affine schemes: one based on maximization of mutual information (MI) and the other a method that we previously developed, Combined Feature Ensemble Mutual Information (COFEMI), which incorporates high-order statistical features for robust multimodal registration. Two non-rigid schemes were obtained by following the two affine registration methods with an elastic deformation step using thin-plate splines (TPS). In the absence of definitive ground truth for CaP extent on MRI, classifier accuracy was evaluated against 7 ground truth surrogates obtained by different combinations of the expert and registration segmentations. For 26 multimodal MRI-WMH image pairs, all four registration methods produced a higher area under the receiver operating characteristic curve compared to that ...

  1. Performance of some supervised and unsupervised multivariate techniques for grouping authentic and unauthentic Viagra and Cialis

    Directory of Open Access Journals (Sweden)

    Michel J. Anzanello

    2014-09-01

    Full Text Available A typical application of multivariate techniques in forensic analysis consists of discriminating between authentic and unauthentic samples of seized drugs, in addition to finding similar properties in the unauthentic samples. In this paper, the performance of several methods belonging to two different classes of multivariate techniques, supervised and unsupervised, was compared. The supervised techniques (ST) are the k-Nearest Neighbor (KNN), Support Vector Machine (SVM), Probabilistic Neural Networks (PNN) and Linear Discriminant Analysis (LDA); the unsupervised techniques (UT) are k-Means cluster analysis and Fuzzy C-Means (FCM). The methods are applied to Fourier Transform Infrared Spectroscopy (FTIR) data from authentic and unauthentic Cialis and Viagra. The FTIR data are also transformed by Principal Component Analysis (PCA) and kernel functions aimed at improving the grouping performance. The ST proved to be a more reasonable choice when the analysis is conducted on the original data, while the UT led to better results when applied to transformed data.

  2. Search techniques in intelligent classification systems

    CERN Document Server

    Savchenko, Andrey V

    2016-01-01

    A unified methodology for categorizing various complex objects is presented in this book. Through probability theory, novel asymptotically minimax criteria suitable for practical applications in imaging and data analysis are examined, including special cases such as the Jensen-Shannon divergence and the probabilistic neural network. An optimal approximate nearest neighbor search algorithm, which allows faster classification of databases, is featured. Rough set theory, sequential analysis and granular computing are used to improve the performance of the hierarchical classifiers. Practical examples in face identification (including deep neural networks), isolated command recognition in a voice control system and classification of visemes captured by the Kinect depth camera are included. This approach creates fast and accurate search procedures by using exact probability densities of applied dissimilarity measures. This book can be used as a guide for independent study and as supplementary material for a technicall...

  3. Supervised machine learning on a network scale: application to seismic event classification and detection

    Science.gov (United States)

    Reynen, Andrew; Audet, Pascal

    2017-09-01

    A new method using a machine learning technique is applied to event classification and detection at seismic networks. This method is applicable to a variety of network sizes and settings. The algorithm makes use of a small catalogue of known observations across the entire network. Two attributes, the polarization and frequency content, are used as inputs to regression. These attributes are extracted at predicted arrival times for P and S waves using only an approximate velocity model, as attributes are calculated over large time spans. This method of waveform characterization is shown to be able to distinguish between blasts and earthquakes with 99 per cent accuracy using a network of 13 stations located in Southern California. The combination of machine learning with generalized waveform features is further applied to event detection in Oklahoma, United States. The event detection algorithm makes use of a pair of unique seismic phases to locate events, with a precision directly related to the sampling rate of the generalized waveform features. Over a week of data from 30 stations in Oklahoma, United States is used to automatically detect 25 times more events than the catalogue of the local geological survey, with a false detection rate of less than 2 per cent. This method provides a highly confident way of detecting and locating events. Furthermore, a large number of seismic events can be automatically detected with low false alarm, allowing for a larger automatic event catalogue with a high degree of trust.

  4. Analysis On Classification Techniques In Mammographic Mass Data Set

    OpenAIRE

    K.K.Kavitha; Dr.A.Kangaiammal

    2015-01-01

    Data mining, the extraction of hidden information from large databases, is used to predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Data-mining classification techniques deal with determining the group to which each data instance belongs. They can deal with a wide variety of data, so that large amounts of data can be involved in processing. This paper deals with analysis of various data mining classification techniques such a...

  5. A Review on Subjectivity Analysis through Text Classification Using Mining Techniques

    Directory of Open Access Journals (Sweden)

    Ashwin Shinde

    2017-03-01

    Full Text Available The increased use of the web for expressing one's opinion has resulted in a growing amount of subjective content available on the Web. These contents can often be categorized as social content like movie or product reviews, customer feedback, blogs, communication exchanges in discussion forums, etc. Accurate recognition of subjective or sentimental web content has a number of benefits. Understanding the sentiments of human masses towards different entities and products enables better services for contextual advertisements, recommendation systems and analysis of market trends. The objective of this paper is to analyze various sentiment-based classification techniques which can be utilized for quick estimation of the subjective content of political reviews based on politicians' speeches. The paper elaborately discusses the supervised machine learning algorithm Naïve Bayes classification and compares its overall accuracy, precision and recall values.
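
    A hedged sketch of the Naive Bayes step discussed in the review is shown below, using bag-of-words counts and a multinomial model on a tiny hand-made review set (purely illustrative), and reporting precision and recall alongside accuracy.

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.pipeline import make_pipeline
      from sklearn.metrics import classification_report

      # Toy "political review" snippets; a real study would use a labelled corpus.
      train_texts = ["great speech, inspiring vision", "honest and clear plans",
                     "empty promises and lies", "a dull, evasive address"]
      train_labels = ["pos", "pos", "neg", "neg"]
      test_texts = ["inspiring and honest", "evasive promises"]
      test_labels = ["pos", "neg"]

      model = make_pipeline(CountVectorizer(), MultinomialNB())
      model.fit(train_texts, train_labels)
      print(classification_report(test_labels, model.predict(test_texts)))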

  6. A Semi-supervised Heat Kernel Pagerank MBO Algorithm for Data Classification

    Science.gov (United States)

    2016-07-01

    ...closed-form expression for the class of each node is derived. Moreover, the authors of [50] describe a semi-supervised method for classifying data using ... manifold smoothing and image denoising. In addition to image processing, methods involving spectral graph theory [17,56], based on a graphical setting ... pagerank, and Section 3 presents a model using heat kernel pagerank directly as a classifier. Section 4 formulates the new algorithm as well as provides ...

  7. Surface Electromyography Signal Processing and Classification Techniques

    Directory of Open Access Journals (Sweden)

    Tae G. Chang

    2013-09-01

    Full Text Available Electromyography (EMG) signals are becoming increasingly important in many applications, including clinical/biomedical, prosthesis or rehabilitation devices, human machine interactions, and more. However, noisy EMG signals are the major hurdles to be overcome in order to achieve improved performance in the above applications. Detection, processing and classification analysis in electromyography (EMG) is very desirable because it allows a more standardized and precise evaluation of the neurophysiological, rehabilitational and assistive technological findings. This paper reviews two prominent areas; first: the pre-processing method for eliminating possible artifacts via appropriate preparation at the time of recording EMG signals, and second: a brief explanation of the different methods for processing and classifying EMG signals. This study then compares the numerous methods of analyzing EMG signals, in terms of their performance. The crux of this paper is to review the most recent developments and research studies related to the issues mentioned above.

  8. Surface Electromyography Signal Processing and Classification Techniques

    Science.gov (United States)

    Chowdhury, Rubana H.; Reaz, Mamun B. I.; Ali, Mohd Alauddin Bin Mohd; Bakar, Ashrif A. A.; Chellappan, Kalaivani; Chang, Tae. G.

    2013-01-01

    Electromyography (EMG) signals are becoming increasingly important in many applications, including clinical/biomedical, prosthesis or rehabilitation devices, human machine interactions, and more. However, noisy EMG signals are the major hurdles to be overcome in order to achieve improved performance in the above applications. Detection, processing and classification analysis in electromyography (EMG) is very desirable because it allows a more standardized and precise evaluation of the neurophysiological, rehabilitational and assistive technological findings. This paper reviews two prominent areas; first: the pre-processing method for eliminating possible artifacts via appropriate preparation at the time of recording EMG signals, and second: a brief explanation of the different methods for processing and classifying EMG signals. This study then compares the numerous methods of analyzing EMG signals, in terms of their performance. The crux of this paper is to review the most recent developments and research studies related to the issues mentioned above. PMID:24048337

  9. Surface electromyography signal processing and classification techniques.

    Science.gov (United States)

    Chowdhury, Rubana H; Reaz, Mamun B I; Ali, Mohd Alauddin Bin Mohd; Bakar, Ashrif A A; Chellappan, K; Chang, T G

    2013-09-17

    Electromyography (EMG) signals are becoming increasingly important in many applications, including clinical/biomedical, prosthesis or rehabilitation devices, human machine interactions, and more. However, noisy EMG signals are the major hurdles to be overcome in order to achieve improved performance in the above applications. Detection, processing and classification analysis in electromyography (EMG) is very desirable because it allows a more standardized and precise evaluation of the neurophysiological, rehabilitational and assistive technological findings. This paper reviews two prominent areas; first: the pre-processing method for eliminating possible artifacts via appropriate preparation at the time of recording EMG signals, and second: a brief explanation of the different methods for processing and classifying EMG signals. This study then compares the numerous methods of analyzing EMG signals, in terms of their performance. The crux of this paper is to review the most recent developments and research studies related to the issues mentioned above.
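
    The two areas the review covers can be illustrated together: band-pass filtering as a simple pre-processing step, followed by window features (RMS, zero crossings) and an SVM; the synthetic signals, filter band and feature set below are assumptions for illustration, not the review's protocol.

      import numpy as np
      from scipy.signal import butter, filtfilt
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split

      fs = 1000.0                                    # assumed sampling rate in Hz
      b, a = butter(4, [20, 450], btype="bandpass", fs=fs)
      rng = np.random.default_rng(8)

      def emg_window(amplitude, n=500):              # one synthetic 0.5 s EMG window
          return filtfilt(b, a, amplitude * rng.standard_normal(n))

      feats, labels = [], []
      for label, amp in ((0, 1.0), (1, 2.5)):        # "rest" vs "contraction" windows
          for _ in range(150):
              w = emg_window(amp)
              rms = np.sqrt(np.mean(w ** 2))
              zero_crossings = np.mean(np.abs(np.diff(np.sign(w)))) / 2
              feats.append([rms, zero_crossings])
              labels.append(label)

      X_tr, X_te, y_tr, y_te = train_test_split(np.array(feats), np.array(labels), random_state=8)
      print("window classification accuracy:", SVC().fit(X_tr, y_tr).score(X_te, y_te))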

  10. Fine Grained Sentiment Classification of Customer Reviews Using Computational Intelligent Technique

    Directory of Open Access Journals (Sweden)

    C Priyanka

    2015-08-01

    Full Text Available Online reviews are now popularly used for judging the quality of a product or service and influence users' decision making while selecting a product or service. Due to the innumerable customer reviews on the web, it is difficult to summarize them, which requires a faster opinion mining system to classify the reviews. Many researchers have explored various supervised and unsupervised machine learning techniques for binary classification of reviews. Compared to these techniques, fuzzy logic can provide a straightforward and comparatively faster way to model the fuzziness existing between the sentiment polarity classes due to the ambiguity present in most natural languages. However, fuzzy logic techniques are less explored in this domain. Hence in this paper, a fuzzy logic model based on the well-known sentiment lexicon SentiWordNet has been proposed for fine-grained classification of reviews into weak positive, moderate positive, strong positive, weak negative, moderate negative and strong negative classes. Experiments have been conducted on datasets containing reviews of electronic products, namely smart phones, LED TVs and laptops, and show fine-grained classification accuracy approximately in the range of 74% to 77%.

  11. EEG source space analysis of the supervised factor analytic approach for the classification of multi-directional arm movement

    Science.gov (United States)

    Shenoy Handiru, Vikram; Vinod, A. P.; Guan, Cuntai

    2017-08-01

    Objective. In electroencephalography (EEG)-based brain-computer interface (BCI) systems for motor control tasks the conventional practice is to decode motor intentions by using scalp EEG. However, scalp EEG only reveals certain limited information about the complex tasks of movement with a higher degree of freedom. Therefore, our objective is to investigate the effectiveness of source-space EEG in extracting relevant features that discriminate arm movement in multiple directions. Approach. We have proposed a novel feature extraction algorithm based on supervised factor analysis that models the data from source-space EEG. To this end, we computed the features from the source dipoles confined to Brodmann areas of interest (BA4a, BA4p and BA6). Further, we embedded class-wise labels of multi-direction (multi-class) source-space EEG to an unsupervised factor analysis to make it into a supervised learning method. Main Results. Our approach provided an average decoding accuracy of 71% for the classification of hand movement in four orthogonal directions, that is significantly higher (>10%) than the classification accuracy obtained using state-of-the-art spatial pattern features in sensor space. Also, the group analysis on the spectral characteristics of source-space EEG indicates that the slow cortical potentials from a set of cortical source dipoles reveal discriminative information regarding the movement parameter, direction. Significance. This study presents evidence that low-frequency components in the source space play an important role in movement kinematics, and thus it may lead to new strategies for BCI-based neurorehabilitation.

  12. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    Science.gov (United States)

    Möller, A.; Ruhlmann-Kleider, V.; Leloup, C.; Neveu, J.; Palanque-Delabrouille, N.; Rich, J.; Carlberg, R.; Lidman, C.; Pritchet, C.

    2016-12-01

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high redshifts (0.2 < z). We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia samples. Using the Area Under the Curve (AUC) metric, where perfect classification is given by 1, we find that our best-performing classifier (Extreme Gradient Boosting Decision Tree) has an AUC of 0.98. We show that it is possible to obtain a large photometrically selected type Ia SN sample with an estimated contamination of less than 5%. When applied to data from the first three years of SNLS, we obtain 529 events. We investigate the differences between classifying simulated SNe, and real SN survey data. In particular, we find that applying a thorough set of selection cuts to the SN sample is essential for good classification. This work demonstrates for the first time the feasibility of machine learning classification in a high-z SN survey with application to real SN data.

  13. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    CERN Document Server

    Möller, A; Leloup, C; Neveu, J; Palanque-Delabrouille, N; Rich, J; Carlberg, R; Lidman, C; Pritchet, C

    2016-01-01

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high redshifts ($0.2 < z$) for classification. We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia sa...

  14. Entropy-based generation of supervised neural networks for classification of structured patterns.

    Science.gov (United States)

    Tsai, Hsien-Leing; Lee, Shie-Jue

    2004-03-01

    Sperduti and Starita proposed a new type of neural network which consists of generalized recursive neurons for classification of structures. In this paper, we propose an entropy-based approach for constructing such neural networks for classification of acyclic structured patterns. Given a classification problem, the architecture, i.e., the number of hidden layers and the number of neurons in each hidden layer, and all the values of the link weights associated with the corresponding neural network are automatically determined. Experimental results have shown that the networks constructed by our method can have a better performance, with respect to network size, learning speed, or recognition accuracy, than the networks obtained by other methods.

  15. A Disaster Document Classification Technique Using Domain Specific Ontologies

    Directory of Open Access Journals (Sweden)

    Qazi Mudassar Ilyas

    2015-12-01

    Full Text Available Manual data collection and entry is one of the bottlenecks in conventional disaster management information systems. Time is a critical factor in emergency situations, and timely data collection and processing may help save lives. An effective disaster management system needs to collect data from the World Wide Web automatically. A prerequisite for the data collection process is a document classification mechanism to classify a particular document into different categories. Ontologies are formal bodies of knowledge used to capture machine-understandable semantics of a domain of interest and have been used successfully to support document classification in various domains. This paper presents an ontology-based document classification technique for automatic data collection in a disaster management system. A general ontology of disasters is used that contains the description of several natural and man-made disasters. The proposed technique augments the conventional classification measures with ontological knowledge to improve the precision of classification. A preliminary implementation of the proposed technique shows promising results, with up to 10% overall improvement in precision when compared with conventional classification methods.

  16. Detection and Evaluation of Cheating on College Exams Using Supervised Classification

    Science.gov (United States)

    Cavalcanti, Elmano Ramalho; Pires, Carlos Eduardo; Cavalcanti, Elmano Pontes; Pires, Vládia Freire

    2012-01-01

    Text mining has been used for various purposes, such as document classification and extraction of domain-specific information from text. In this paper we present a study in which text mining methodology and algorithms were properly employed for academic dishonesty (cheating) detection and evaluation on open-ended college exams, based on document…

  17. Manifold regularized multitask learning for semi-supervised multilabel image classification.

    Science.gov (United States)

    Luo, Yong; Tao, Dacheng; Geng, Bo; Xu, Chao; Maybank, Stephen J

    2013-02-01

    It is a significant challenge to classify images with multiple labels by using only a small number of labeled samples. One option is to learn a binary classifier for each label and use manifold regularization to improve the classification performance by exploring the underlying geometric structure of the data distribution. However, such an approach does not perform well in practice when images from multiple concepts are represented by high-dimensional visual features. Thus, manifold regularization is insufficient to control the model complexity. In this paper, we propose a manifold regularized multitask learning (MRMTL) algorithm. MRMTL learns a discriminative subspace shared by multiple classification tasks by exploiting the common structure of these tasks. It effectively controls the model complexity because different tasks limit one another's search volume, and the manifold regularization ensures that the functions in the shared hypothesis space are smooth along the data manifold. We conduct extensive experiments, on the PASCAL VOC'07 dataset with 20 classes and the MIR dataset with 38 classes, by comparing MRMTL with popular image classification algorithms. The results suggest that MRMTL is effective for image classification.

  18. Determination of Land Cover/land Use Using SPOT 7 Data with Supervised Classification Methods

    Science.gov (United States)

    Bektas Balcik, F.; Karakacan Kuzucu, A.

    2016-10-01

    Land use/land cover (LULC) classification is a key research field in remote sensing. With recent developments of high-spatial-resolution sensors, Earth-observation technology offers a viable solution for land use/land cover identification and management in the rural parts of cities. There is a strong need to produce accurate, reliable, and up-to-date land use/land cover maps for sustainable monitoring and management. In this study, SPOT 7 imagery was used to test the potential of the data for land cover/land use mapping. Catalca, the selected region, is located in the northwest of Istanbul, Turkey, and is mostly covered with agricultural fields and forest lands. The potential of two classification algorithms, maximum likelihood and support vector machine, was tested, and accuracy assessment of the land cover maps was performed through the error matrix and Kappa statistics. The results indicated that both of the selected classifiers were highly useful (over 83% accuracy) in the mapping of land use/cover in the study region. The support vector machine classification approach slightly outperformed the maximum likelihood classification in both overall accuracy and Kappa statistics.
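
    The comparison can be sketched with scikit-learn, where quadratic discriminant analysis plays the role of a Gaussian maximum-likelihood classifier against an SVM, scored by overall accuracy and Cohen's kappa; the synthetic band values only stand in for SPOT 7 pixels, and all parameters are illustrative.

      from sklearn.datasets import make_classification
      from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import accuracy_score, cohen_kappa_score

      # Synthetic 4-band pixels with 5 land cover classes.
      X, y = make_classification(n_samples=3000, n_features=4, n_informative=4,
                                 n_redundant=0, n_classes=5, n_clusters_per_class=1,
                                 random_state=5)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=5)

      for name, clf in [("maximum likelihood (QDA)", QuadraticDiscriminantAnalysis()),
                        ("support vector machine  ", SVC(kernel="rbf", C=10))]:
          y_hat = clf.fit(X_tr, y_tr).predict(X_te)
          print(name, "OA:", round(accuracy_score(y_te, y_hat), 3),
                "kappa:", round(cohen_kappa_score(y_te, y_hat), 3))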

  19. EMD-Based Temporal and Spectral Features for the Classification of EEG Signals Using Supervised Learning.

    Science.gov (United States)

    Riaz, Farhan; Hassan, Ali; Rehman, Saad; Niazi, Imran Khan; Dremstrup, Kim

    2016-01-01

    This paper presents a novel method for feature extraction from electroencephalogram (EEG) signals using empirical mode decomposition (EMD). Its use is motivated by the fact that the EMD gives an effective time-frequency analysis of nonstationary signals. The intrinsic mode functions (IMF) obtained as a result of EMD give the decomposition of a signal according to its frequency components. We present the use of up to third-order temporal moments, and spectral features including the spectral centroid, coefficient of variation and spectral skew of the IMFs, for feature extraction from EEG signals. These features are physiologically relevant given that normal EEG signals have different temporal and spectral centroids, dispersions and symmetries when compared with pathological EEG signals. The calculated features are fed into a standard support vector machine (SVM) for classification purposes. The performance of the proposed method is studied on a publicly available dataset which is designed to handle various classification problems including the identification of epilepsy patients and detection of seizures. Experiments show that good classification results are obtained using the proposed methodology for the classification of EEG signals. Our proposed method also compares favorably to other state-of-the-art feature extraction methods.

  20. Classification of assembly techniques for micro products

    DEFF Research Database (Denmark)

    Hansen, Hans Nørgaard; Tosello, Guido; Gegeckaite, Asta

    2005-01-01

    Industrial production of micro products to be introduced in the market has to be reliable, fast, carried out at a reasonable price and in an acceptable quantity. One of the crucial steps in the process chain related to micro product manufacture is the assembly phase. Here components are handled...... of components and level of integration are made. This paper describes a systematic characterization of micro assembly methods. This methodology offers the opportunity of a cross comparison among different techniques to gain a choosing principle of the favourable micro assembly technology in a specific case...

  2. A comparison of classification techniques for glacier change detection using multispectral images

    Directory of Open Access Journals (Sweden)

    Rahul Nijhawan

    2016-09-01

    Full Text Available The main aim of this paper is to compare the classification accuracies of glacier change detection achieved by the following classifiers: a sub-pixel classification algorithm, indices-based supervised classification and an object-based algorithm using Landsat imageries. It was observed that the shadow effect was not removed in the sub-pixel based classification, whereas it was removed by the indices method. Further, the accuracy was improved by object-based classification. The objective of the paper is to analyse different classification algorithms and interpret which one gives the best results in mountainous regions. The study showed that the object-based method was best in mountainous regions, as optimum results were obtained in the shadow-covered regions.

  3. Classification models for clear cell renal carcinoma stage progression, based on tumor RNAseq expression trained supervised machine learning algorithms.

    Science.gov (United States)

    Jagga, Zeenia; Gupta, Dinesh

    2014-01-01

    Clear-cell Renal Cell Carcinoma (ccRCC) is the most prevalent, chemotherapy-resistant and lethal adult kidney cancer. There is a need for novel diagnostic and prognostic biomarkers for ccRCC, due to its heterogeneous molecular profiles and asymptomatic early stage. This study aims to develop classification models to distinguish early stage and late stage ccRCC based on gene expression profiles. We employed supervised learning algorithms (J48, Random Forest, SMO and Naïve Bayes), with model learning enriched by fast correlation-based feature selection, to develop classification models trained on sequencing-based gene expression data from RNAseq experiments obtained from The Cancer Genome Atlas. The different models developed in the study were evaluated on the basis of 10-fold cross-validation and independent dataset testing. The Random Forest based prediction model performed best amongst the models developed in the study, with a sensitivity of 89%, accuracy of 77% and area under the Receiver Operating Characteristic curve of 0.8. We anticipate that the prioritized subset of 62 genes and the prediction models developed in this study will aid experimental oncologists in expediting understanding of the molecular mechanisms of stage progression and discovery of prognostic factors for ccRCC tumors.
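
    A hedged sketch of the modelling recipe follows: filter-based feature selection feeding a Random Forest evaluated with 10-fold cross-validated ROC AUC; SelectKBest stands in for the fast correlation-based filter used in the study, and random data stands in for the TCGA expression matrix.

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.feature_selection import SelectKBest, f_classif
      from sklearn.pipeline import make_pipeline
      from sklearn.model_selection import cross_val_score, StratifiedKFold

      # Random stand-in for an RNAseq expression matrix: 300 tumours, 2000 "genes".
      X, y = make_classification(n_samples=300, n_features=2000, n_informative=40,
                                 weights=[0.6, 0.4], random_state=6)

      model = make_pipeline(SelectKBest(f_classif, k=62),        # 62 genes, echoing the record
                            RandomForestClassifier(n_estimators=500, random_state=6))
      auc = cross_val_score(model, X, y, scoring="roc_auc",
                            cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=6))
      print("mean 10-fold ROC AUC:", auc.mean().round(2))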

  4. Analysis On Classification Techniques In Mammographic Mass Data Set

    Directory of Open Access Journals (Sweden)

    Mrs. K. K. Kavitha

    2015-07-01

    Full Text Available Data mining, the extraction of hidden information from large databases, is used to predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Data-mining classification techniques deal with determining the group to which each data instance belongs. They can deal with a wide variety of data, so that large amounts of data can be involved in processing. This paper deals with analysis of various data mining classification techniques, such as Decision Tree Induction, Naïve Bayes and k-Nearest Neighbour (KNN) classifiers, in the mammographic mass dataset.

  5. Semi-automatic supervised classification of minerals from x-ray mapping images

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Flesche, Harald; Larsen, Rasmus

    1998-01-01

    spectroscopy (EDS) in a scanning electron microscope (SEM). Extensions to traditional multivariate statistical methods are applied to perform the classification. Training sets are grown from one or a few seed points by a method that ensures spatial and spectral closeness of observations. Spectral closeness...... to a small area in order to allow for the estimation of a variance-covariance matrix. This expansion is controlled by upper limits for the spatial and Euclidean spectral distances from the seed point. Second, after this initial expansion the growing of the training set is controlled by an upper limit...... training, a standard quadratic classifier is applied. The performance for each parameter setting is measured by the overall misclassification rate on an independently generated validation set. The classification method is presently used as a routine petrographical analysis method at Norsk Hydro Research...

  6. Restructuring supervision and reconfiguration of skill mix in community pharmacy: Classification of perceived safety and risk.

    Science.gov (United States)

    Bradley, Fay; Willis, Sarah C; Noyce, Peter R; Schafheutle, Ellen I

    2016-01-01

    Broadening the range of services provided through community pharmacy increases workloads for pharmacists that could be alleviated by reconfiguring roles within the pharmacy team. To examine pharmacists' and pharmacy technicians (PTs)' perceptions of how safe it would be for support staff to undertake a range of pharmacy activities during a pharmacist's absence. Views on supervision, support staff roles, competency and responsibility were also sought. Informed by nominal group discussions, a questionnaire was developed and distributed to a random sample of 1500 pharmacists and 1500 PTs registered in England. Whilst focused on community pharmacy practice, hospital pharmacy respondents were included, as more advanced skill mix models may provide valuable insights. Respondents were asked to rank a list of 22 pharmacy activities in terms of perceived risk and safety of these activities being performed by support staff during a pharmacist's absence. Descriptive and comparative statistic analyses were conducted. Six-hundred-and-forty-two pharmacists (43.2%) and 854 PTs (57.3%) responded; the majority worked in community pharmacy. Dependent on agreement levels with perceived safety, from community pharmacists and PTs, and hospital pharmacists and PTs, the 22 activities were grouped into 'safe' (n = 7), 'borderline' (n = 9) and 'unsafe' (n = 6). Activities such as assembly and labeling were considered 'safe,' clinical activities were considered 'unsafe.' There were clear differences between pharmacists and PTs, and sectors (community pharmacy vs. hospital). Community pharmacists were most cautious (particularly mobile and portfolio pharmacists) about which activities they felt support staff could safely perform; PTs in both sectors felt significantly more confident performing particularly technical activities than pharmacists. This paper presents novel empirical evidence informing the categorization of pharmacy activities into 'safe,' 'borderline' or 'unsafe

  7. Assessment of Heart Disease using Fuzzy Classification Techniques

    Directory of Open Access Journals (Sweden)

    Horia F. Pop

    2001-01-01

    Full Text Available In this paper we discuss the classification results of cardiac patients with ischemic cardiopathy, valvular heart disease, and arterial hypertension, based on 19 characteristics (descriptors) including ECHO data, effort testings, and age and weight. To this end we have used different fuzzy clustering algorithms, namely hierarchical fuzzy clustering, hierarchical and horizontal fuzzy characteristics clustering, and a new clustering technique, fuzzy hierarchical cross-classification. The characteristics clustering techniques produce fuzzy partitions of the characteristics involved and, thus, are useful tools for studying the similarities between different characteristics and for essential characteristics selection. The cross-classification algorithm produces not only a fuzzy partition of the cardiac patients analyzed, but also a fuzzy partition of their considered characteristics. In this way it is possible to identify which characteristics are responsible for the similarities or dissimilarities observed between different groups of patients.

  8. Significance of Classification Techniques in Prediction of Learning Disabilities

    CERN Document Server

    David, Julie M; Balakrishnan, Kannan

    2010-01-01

    The aim of this study is to show the importance of two classification techniques, viz. decision trees and clustering, in the prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD-affected child. In this paper, the J48 algorithm is used for constructing the decision tree and the K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.
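
    Both techniques are easy to demonstrate: the sketch below fits a small decision tree (a CART stand-in for J48) whose rules can be printed, and k-means clusters over the same toy symptom attributes; the data and the three-cluster choice are illustrative assumptions only.

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier, export_text
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(7)
      X = rng.integers(0, 2, size=(200, 5)).astype(float)        # 5 binary signs/symptoms
      y = (X[:, 0] + X[:, 2] >= 2).astype(int)                   # toy "LD present" rule

      tree = DecisionTreeClassifier(max_depth=3, random_state=7).fit(X, y)
      print(export_text(tree, feature_names=[f"symptom_{i}" for i in range(5)]))

      clusters = KMeans(n_clusters=3, n_init=10, random_state=7).fit_predict(X)
      for c in range(3):                                         # mean symptom profile per cluster
          print("cluster", c, ":", X[clusters == c].mean(axis=0).round(2))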

  9. Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting

    KAUST Repository

    Fernandes, José Antonio

    2013-02-01

    A multi-species approach to fisheries management requires taking into account the interactions between species in order to improve recruitment forecasting of the fish species. Recent advances in Bayesian networks direct the learning of models with several interrelated variables to be forecasted simultaneously. These models are known as multi-dimensional Bayesian network classifiers (MDBNs). Pre-processing steps are critical for the posterior learning of the model in these kinds of domains. Therefore, in the present study, a set of 'state-of-the-art' uni-dimensional pre-processing methods, within the categories of missing data imputation, feature discretization and feature subset selection, are adapted to be used with MDBNs. A framework that includes the proposed multi-dimensional supervised pre-processing methods, coupled with a MDBN classifier, is tested with synthetic datasets and the real domain of fish recruitment forecasting. The correct forecasting of three fish species (anchovy, sardine and hake) simultaneously is doubled (from 17.3% to 29.5%) using the multi-dimensional approach in comparison to mono-species models. The probability assessments also show high improvement, reducing the average error (estimated by means of the Brier score) from 0.35 to 0.27. Finally, these differences are superior to those obtained by forecasting the species in pairs. © 2012 Elsevier Ltd.

  10. Detection of facilities in satellite imagery using semi-supervised image classification and auxiliary contextual observables

    Energy Technology Data Exchange (ETDEWEB)

    Harvey, Neal R [Los Alamos National Laboratory; Ruggiero, Christy E [Los Alamos National Laboratory; Pawley, Norma H [Los Alamos National Laboratory; Brumby, Steven P [Los Alamos National Laboratory; Macdonald, Brian [Los Alamos National Laboratory; Balick, Lee [Los Alamos National Laboratory; Oyer, Alden [Los Alamos National Laboratory

    2009-01-01

    Detecting complex targets, such as facilities, in commercially available satellite imagery is a difficult problem that human analysts try to solve by applying world knowledge. Often there are known observables that can be extracted by pixel-level feature detectors that can assist in the facility detection process. Individually, each of these observables is not sufficient for an accurate and reliable detection, but in combination, these auxiliary observables may provide sufficient context for detection by a machine learning algorithm. We describe an approach for automatic detection of facilities that uses an automated feature extraction algorithm to extract auxiliary observables, and a semi-supervised assisted target recognition algorithm to then identify facilities of interest. We illustrate the approach using an example of finding schools in Quickbird image data of Albuquerque, New Mexico. We use Los Alamos National Laboratory's Genie Pro automated feature extraction algorithm to find a set of auxiliary features that should be useful in the search for schools, such as parking lots, large buildings, sports fields and residential areas and then combine these features using Genie Pro's assisted target recognition algorithm to learn a classifier that finds schools in the image data.

  11. Review of feed forward neural network classification preprocessing techniques

    Science.gov (United States)

    Asadi, Roya; Kareem, Sameem Abdul

    2014-06-01

    The key strength of artificial Feed Forward Neural Network (FFNN) classification models is that they learn from the input data through their weights. Data preprocessing and pre-training are the contributing factors in developing efficient techniques for low training time and high classification accuracy. In this study, we investigate and review the powerful preprocessing functions of FFNN models. Currently, weights are initialized at random, which is the main source of problems. Multilayer auto-encoder networks, the most recent technique, are, like other related techniques, unable to solve these problems. Weight Linear Analysis (WLA) is a combination of data pre-processing and pre-training that generates real weights through the use of normalized input values. By using WLA, the FFNN model increases classification accuracy and reduces training time to a single epoch, without any training cycles, computation of the gradient of the mean square error function, or updating of the weights. The results of the comparison and evaluation show that WLA is a powerful technique in the FFNN classification area.

  12. Automated supervised classification of variable stars II. Application to the OGLE database

    CERN Document Server

    Sarro, L M; López, M; Aerts, C

    2008-01-01

    We aim to extend and test the classifiers presented in a previous work against an independent dataset. We complement the assessment of the validity of the classifiers by applying them to the set of OGLE light curves treated as variable objects of unknown class. The results are compared to published classification results based on the so-called extractor methods. Two complementary analyses are carried out in parallel. In both cases, the original time series of OGLE observations of the Galactic bulge and Magellanic Clouds are processed in order to identify and characterize the frequency components. In the first approach, the classifiers are applied to the data and the results analyzed in terms of systematic errors and differences between the definition samples in the training set and in the extractor rules. In the second approach, the original classifiers are extended with colour information and, again, applied to OGLE light curves. We have constructed a classification system that can process huge amounts of tim...

  13. A Study of Clinical Supervision Techniques and Training in Substance Abuse Treatment

    Science.gov (United States)

    West, Paul L.; Hamm, Terri

    2012-01-01

    Data from 57 clinical supervisors in licensed substance abuse treatment programs indicate that 28% had completed formal graduate course work in clinical supervision and 33% were professionally licensed or certified. Findings raise concerns about the scope and quality of clinical supervision available to substance abuse counselors. (Contains 3…

  14. A Novel Split and Merge Technique for Hypertext Classification

    Science.gov (United States)

    Saha, Suman; Murthy, C. A.; Pal, Sankar K.

    As the web grows at an increasing speed, hypertext classification is becoming a necessity. While the literature on text categorization is quite mature, the issue of utilizing hypertext structure and hyperlinks has been relatively unexplored. In this paper, we introduce a novel split and merge technique for classification of hypertext documents. The splitting process is performed at the feature level by representing the hypertext features in a tensor space model. We exploit the local structure and neighborhood recommendation encapsulated in this representation model. The merging process is performed on multiple classifications obtained from the split representation. A meta level decision system is formed by obtaining predictions of base level classifiers trained on different components of the tensor and the actual category of the hypertext document. These individual predictions for each component of the tensor are subsequently combined into a final prediction using rough set based ensemble classifiers. Experimental classification results obtained using our method are marginally better than those of other existing hypertext classification techniques.

  15. Automated classification of female facial beauty by image analysis and supervised learning

    Science.gov (United States)

    Gunes, Hatice; Piccardi, Massimo; Jan, Tony

    2004-01-01

    The fact that perception of facial beauty may be a universal concept has long been debated amongst psychologists and anthropologists. In this paper, we performed experiments to evaluate the extent of beauty universality by asking a number of diverse human referees to grade the same collection of female facial images. The results obtained show that the different individuals gave similar votes, thus well supporting the concept of beauty universality. We then trained an automated classifier using the human votes as the ground truth and used it to classify an independent test set of facial images. The high accuracy achieved shows that this classifier can be used as a general, automated tool for objective classification of female facial beauty. Potential applications exist in the entertainment industry and plastic surgery.

  16. Comparison of three Statistical Classification Techniques for Maser Identification

    CERN Document Server

    Manning, Ellen M; Ellingsen, Simon P; Breen, Shari L; Chen, Xi; Humphries, Melissa

    2016-01-01

    We applied three statistical classification techniques - linear discriminant analysis (LDA), logistic regression and random forests - to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the ease, or otherwise, with which the results of each classification technique can be interpreted. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods logistic regression and LDA performed best, for the largest dataset the non-parametric method of random forests performed with comparable accuracy to parametric techniques, rather than any significant improvement. This suggests that at least for the specific examples investigated here accuracy of the predictions obtained ...

  17. Application of supervised machine learning algorithms for the classification of regulatory RNA riboswitches.

    Science.gov (United States)

    Singh, Swadha; Singh, Raghvendra

    2016-04-03

    Riboswitches, the small structured RNA elements, were discovered about a decade ago. It has been the subject of intense interest to identify riboswitches, understand their mechanisms of action and use them in genetic engineering. The accumulation of genome and transcriptome sequence data and comparative genomics provide unprecedented opportunities to identify riboswitches in the genome. In the present study, we have evaluated the following six machine learning algorithms for their efficiency in classifying riboswitches: J48, BayesNet, Naïve Bayes, Multilayer Perceptron, sequential minimal optimization and hidden Markov model (HMM). To determine the most effective classifier, the algorithms were compared on the statistical measures of specificity, sensitivity, accuracy, F-measure and receiver operating characteristic (ROC) plot analysis. The Multilayer Perceptron classifier achieved the best performance, with the highest specificity, sensitivity, F-score and accuracy, and with the largest area under the ROC curve, whereas HMM was the poorest performer. At present, the available tools for the prediction and classification of riboswitches are based on the covariance model, support vector machine and HMM. The present study identifies the Multilayer Perceptron as a better classifier for genome-wide riboswitch searches.
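
    A hedged sketch of this kind of comparison is given below, with scikit-learn models standing in for the Weka-style algorithms named above (DecisionTreeClassifier for J48, GaussianNB for Naïve Bayes, MLPClassifier for the Multilayer Perceptron) and synthetic features in place of riboswitch descriptors; the metrics mirror those used in the study.

```python
# Hedged sketch comparing classifiers on the metrics used in the record
# (sensitivity, specificity, F-measure, ROC AUC). scikit-learn models stand in
# for the Weka algorithms; features are synthetic, not riboswitch descriptors.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

X, y = make_classification(n_samples=600, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "decision tree (J48-like)": DecisionTreeClassifier(random_state=1),
    "naive Bayes": GaussianNB(),
    "multilayer perceptron": MLPClassifier(max_iter=1000, random_state=1),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: sens={sens:.2f} spec={spec:.2f} "
          f"F1={f1_score(y_te, pred):.2f} AUC={auc:.2f}")
```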

  18. Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification.

    Science.gov (United States)

    Soares, João V B; Leandro, Jorge J G; Cesar Júnior, Roberto M; Jelinek, Herbert F; Cree, Michael J

    2006-09-01

    We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or nonvessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and two-dimensional Gabor wavelet transform responses taken at multiple scales. The Gabor wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding a fast classification, while being able to model complex decision surfaces. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on the publicly available DRIVE (Staal et al., 2004) and STARE (Hoover et al., 2000) databases of manually labeled images. On the DRIVE database, it achieves an area under the receiver operating characteristic curve of 0.9614, slightly superior to that presented by state-of-the-art approaches. We are making our implementation available as open source MATLAB scripts for researchers interested in implementation details, evaluation, or development of methods.
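
    The sketch below illustrates the pipeline in miniature: per-pixel features built from intensity plus multi-scale Gabor responses, and a Bayesian rule with class-conditional Gaussian-mixture likelihoods. It assumes scikit-image and scikit-learn, and uses a tiny synthetic image with fake "vessel" rows instead of the DRIVE/STARE data.

```python
# Minimal sketch: per-pixel features from multi-scale Gabor responses plus
# intensity, and a Bayesian classifier with class-conditional Gaussian-mixture
# likelihoods. Image and "vessel" labels are synthetic stand-ins.
import numpy as np
from skimage.filters import gabor
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
img = rng.normal(0.5, 0.05, size=(64, 64))
img[::8, :] += 0.4                                  # fake "vessels": bright rows
labels = np.zeros_like(img, dtype=int)
labels[::8, :] = 1

feats = [img]
for freq in (0.1, 0.2, 0.4):                        # multiple scales/frequencies
    real, imag = gabor(img, frequency=freq, theta=np.pi / 2)
    feats.append(np.sqrt(real ** 2 + imag ** 2))    # Gabor magnitude response
X = np.stack([f.ravel() for f in feats], axis=1)
y = labels.ravel()

# Class-conditional likelihoods as Gaussian mixtures, combined with class priors.
gmms = {c: GaussianMixture(n_components=2, random_state=0).fit(X[y == c]) for c in (0, 1)}
priors = {c: np.mean(y == c) for c in (0, 1)}
scores = np.stack([gmms[c].score_samples(X) + np.log(priors[c]) for c in (0, 1)], axis=1)
pred = scores.argmax(axis=1)
print("pixel-level training accuracy:", np.mean(pred == y).round(3))
```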

  19. Comparative Analysis of Automatic Vehicle Classification Techniques: A Survey

    Directory of Open Access Journals (Sweden)

    Kanwal Yousaf

    2012-09-01

    Full Text Available Vehicle classification has emerged as a significant field of study because of its importance in a variety of applications such as surveillance, security systems, traffic congestion avoidance and accident prevention. So far numerous algorithms have been implemented for classifying vehicles, each following a different procedure for detecting vehicles in videos. By evaluating some of the commonly used techniques, we highlight the most beneficial methodology for classifying vehicles. In this paper we describe the working of several video-based vehicle classification algorithms and compare these algorithms on the basis of different performance metrics such as classifiers, classification methodology or principles, and vehicle detection ratio. After comparing these parameters we conclude that the Hybrid Dynamic Bayesian Network (HDBN) classification algorithm is far better than the other algorithms due to its nature of estimating the simplest features of vehicles from different videos. HDBN detects vehicles by following the important stages of feature extraction, selection and classification. It extracts the rear view information of vehicles rather than other information such as the distance between the wheels and the height of the wheels.

  20. Comparison of Three Statistical Classification Techniques for Maser Identification

    Science.gov (United States)

    Manning, Ellen M.; Holland, Barbara R.; Ellingsen, Simon P.; Breen, Shari L.; Chen, Xi; Humphries, Melissa

    2016-04-01

    We applied three statistical classification techniques-linear discriminant analysis (LDA), logistic regression, and random forests-to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the interpretability of the results of each classification technique. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods logistic regression and LDA performed best, for the largest dataset the non-parametric method of random forests performed with comparable accuracy to parametric techniques, rather than any significant improvement. This suggests that at least for the specific examples investigated here accuracy of the predictions obtained is not being limited by the use of parametric models. We also found that for LDA, transformation of the data to match a normal distribution led to a significant improvement in accuracy. The different classification techniques had significant overlap in their predictions; further astronomical observations will enable the accuracy of these predictions to be tested.
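
    A minimal sketch of such a three-way comparison, assuming scikit-learn and a synthetic dataset in place of the maser survey catalogues, might look like this:

```python
# Hedged comparison sketch of the three classifiers named in the record, on a
# synthetic dataset rather than the maser survey catalogues.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=12, n_informative=6, random_state=0)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)     # 5-fold cross-validated accuracy
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```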

  1. A survey of supervised machine learning models for mobile-phone based pathogen identification and classification

    Science.gov (United States)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan

    2017-03-01

    Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time consuming, and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited-settings.

  2. Supervised novelty detection in brain tissue classification with an application to white matter hyperintensities

    Science.gov (United States)

    Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.

    2016-03-01

    Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from abnormal behaviour that occurs in the case of a novelty. This is evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determining whether our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest points. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >= 4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings, predominantly vessels, the cerebral falx, brain mask errors, and choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
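
    The measure described above can be sketched as follows, assuming scikit-learn's NearestNeighbors and synthetic feature vectors in place of brain-MR voxels; only the mean-distance variant and a single trained class are shown.

```python
# Sketch of the novelty measure: model "normal" behaviour as the mean distance
# to the k nearest training points, then flag test samples whose Z-score
# against that reference distribution is >= 4. Data are synthetic.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(1000, 3))            # trained (normal tissue) samples
test_normal = rng.normal(0, 1, size=(200, 3))
test_novel = rng.normal(6, 1, size=(20, 3))         # untrained "pathology"
test = np.vstack([test_normal, test_novel])

k = 10
nn = NearestNeighbors(n_neighbors=k).fit(train)

ref_d, _ = nn.kneighbors()                          # train vs. train, self excluded
ref = ref_d.mean(axis=1)                            # distribution of normal behaviour
test_d, _ = nn.kneighbors(test)
z = (test_d.mean(axis=1) - ref.mean()) / ref.std()  # Z-score per test sample
print("flagged as novelty (Z >= 4):", int((z >= 4).sum()), "of", len(test))
```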

  3. Improvements on coronal hole detection in SDO/AIA images using supervised classification

    CERN Document Server

    Reiss, Martin A; De Visscher, Ruben; Temmer, Manuela; Veronig, Astrid M; Delouille, Véronique; Mampaey, Benjamin; Ahammer, Helmut

    2015-01-01

    We demonstrate the use of machine learning algorithms in combination with segmentation techniques in order to distinguish coronal holes and filaments in SDO/AIA EUV images of the Sun. Based on two coronal hole detection techniques (intensity-based thresholding, SPoCA), we prepared data sets of manually labeled coronal hole and filament channel regions present on the Sun during the time range 2011 - 2013. By mapping the extracted regions from EUV observations onto HMI line-of-sight magnetograms we also include their magnetic characteristics. We computed shape measures from the segmented binary maps as well as first order and second order texture statistics from the segmented regions in the EUV images and magnetograms. These attributes were used for data mining investigations to identify the most performant rule to differentiate between coronal holes and filament channels. We applied several classifiers, namely Support Vector Machine, Linear Support Vector Machine, Decision Tree, and Random Forest and found tha...

  4. Advances in projection of climate change impacts using supervised nonlinear dimensionality reduction techniques

    Science.gov (United States)

    Sarhadi, Ali; Burn, Donald H.; Yang, Ge; Ghodsi, Ali

    2017-02-01

    One of the main challenges in climate change studies is accurate projection of the global warming impacts on the probabilistic behaviour of hydro-climate processes. Due to the complexity of climate-associated processes, identification of predictor variables from high dimensional atmospheric variables is considered a key factor for improvement of climate change projections in statistical downscaling approaches. For this purpose, the present paper adopts a new approach of supervised dimensionality reduction, called "Supervised Principal Component Analysis (Supervised PCA)", for regression-based statistical downscaling. This method is a generalization of PCA, extracting a sequence of principal components of the atmospheric variables that have maximal dependence on the response hydro-climate variable. To capture the nonlinear variability between hydro-climatic response variables and projectors, a kernelized version of Supervised PCA is also applied for nonlinear dimensionality reduction. The effectiveness of the Supervised PCA methods, in comparison with some state-of-the-art algorithms for dimensionality reduction, is evaluated in the statistical downscaling of precipitation at a specific site using two soft computing nonlinear machine learning methods, Support Vector Regression and Relevance Vector Machine. The results demonstrate that the Supervised PCA methods yield a significant improvement in terms of performance accuracy.
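
    As a hedged illustration, the sketch below implements one common formulation of Supervised PCA (the dependence-maximising eigenproblem of Barshan et al., 2011) on synthetic data; it is not the authors' downscaling code, and the kernelized variant is omitted.

```python
# Hedged sketch of supervised PCA: projection directions are the top
# eigenvectors of X^T H L H X, where L is a kernel over the response variable
# and H is the centring matrix. Synthetic predictors and response only.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 300, 15, 2
X = rng.normal(size=(n, d))                            # "atmospheric predictors"
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=n)   # "hydro-climate response"

H = np.eye(n) - np.ones((n, n)) / n                    # centring matrix
L = np.outer(y, y)                                     # linear kernel on the response
Q = X.T @ H @ L @ H @ X                                # dependence-weighted scatter

eigvals, eigvecs = np.linalg.eigh(Q)
U = eigvecs[:, np.argsort(eigvals)[::-1][:k]]          # top-k directions
Z = X @ U                                              # supervised principal components
print("correlation of first supervised PC with response:",
      round(np.corrcoef(Z[:, 0], y)[0, 1], 3))
```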

  5. Kernel-based machine learning techniques for infrasound signal classification

    Science.gov (United States)

    Tuma, Matthias; Igel, Christian; Mialle, Pierrick

    2014-05-01

    Infrasound monitoring is one of four remote sensing technologies continuously employed by the CTBTO Preparatory Commission. The CTBTO's infrasound network is designed to monitor the Earth for potential evidence of atmospheric or shallow underground nuclear explosions. Upon completion, it will comprise 60 infrasound array stations distributed around the globe, of which 47 were certified in January 2014. Three stages can be identified in CTBTO infrasound data processing: automated processing at the level of single array stations, automated processing at the level of the overall global network, and interactive review by human analysts. At station level, the cross correlation-based PMCC algorithm is used for initial detection of coherent wavefronts. It produces estimates for trace velocity and azimuth of incoming wavefronts, as well as other descriptive features characterizing a signal. Detected arrivals are then categorized into potentially treaty-relevant versus noise-type signals by a rule-based expert system. This corresponds to a binary classification task at the level of station processing. In addition, incoming signals may be grouped according to their travel path in the atmosphere. The present work investigates automatic classification of infrasound arrivals by kernel-based pattern recognition methods. It aims to explore the potential of state-of-the-art machine learning methods vis-a-vis the current rule-based and task-tailored expert system. To this purpose, we first address the compilation of a representative, labeled reference benchmark dataset as a prerequisite for both classifier training and evaluation. Data representation is based on features extracted by the CTBTO's PMCC algorithm. As classifiers, we employ support vector machines (SVMs) in a supervised learning setting. Different SVM kernel functions are used and adapted through different hyperparameter optimization routines. The resulting performance is compared to several baseline classifiers. All

  6. Supervised Learning Approach for Spam Classification Analysis using Data Mining Tools

    Directory of Open Access Journals (Sweden)

    R. Deepa Lakshmi

    2010-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Among the approaches developed to stop spam, filtering is one of the most important techniques. Many researches in spam filtering have been centered on the more sophisticated classifier-related issues. In recent days, machine learning for spam classification is an important research issue. This paper explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms has also been presented.

  8. A systematic comparison of different object-based classification techniques using high spatial resolution imagery in agricultural environments

    Science.gov (United States)

    Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk

    2016-07-01

    Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to lower accuracy due to broken objects at fine scales. In contrast to some previous studies, the RF classifier yielded the best results and the k-nearest neighbor classifier the worst results, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of the training sample analyses indicated that RF and AdaBoost.M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.

  9. Preliminary hard and soft bottom seafloor substrate map derived from a supervised classification of bathymetry derived from multispectral World View-2 satellite imagery of Ni'ihau Island, Territory of Main Hawaiian Islands, USA

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Preliminary hard and soft seafloor substrate map derived from a supervised classification from multispectral World View-2 satellite imagery of Ni'ihau Island,...

  10. Supervised Transfer Sparse Coding

    KAUST Repository

    Al-Shedivat, Maruan

    2014-07-27

    A combination of the sparse coding and transfer learning techniques was shown to be accurate and robust in classification tasks where training and testing objects have a shared feature space but are sampled from different underlying distributions, i.e., belong to different domains. The key assumption in such case is that in spite of the domain disparity, samples from different domains share some common hidden factors. Previous methods often assumed that all the objects in the target domain are unlabeled, and thus the training set solely comprised objects from the source domain. However, in real world applications, the target domain often has some labeled objects, or one can always manually label a small number of them. In this paper, we explore such possibility and show how a small number of labeled data in the target domain can significantly leverage classification accuracy of the state-of-the-art transfer sparse coding methods. We further propose a unified framework named supervised transfer sparse coding (STSC) which simultaneously optimizes sparse representation, domain transfer and classification. Experimental results on three applications demonstrate that a little manual labeling and then learning the model in a supervised fashion can significantly improve classification accuracy.

  11. Automatic Cataract Hardness Classification Ex Vivo by Ultrasound Techniques.

    Science.gov (United States)

    Caixinha, Miguel; Santos, Mário; Santos, Jaime

    2016-04-01

    To demonstrate the feasibility of a new methodology for cataract hardness characterization and automatic classification using ultrasound techniques, different cataract degrees were induced in 210 porcine lenses. A 25-MHz ultrasound transducer was used to obtain acoustical parameters (velocity and attenuation) and backscattering signals. B-Scan and parametric Nakagami images were constructed. Ninety-seven parameters were extracted and subjected to a Principal Component Analysis. Bayes, K-Nearest-Neighbours, Fisher Linear Discriminant and Support Vector Machine (SVM) classifiers were used to automatically classify the different cataract severities. Statistically significant increases with cataract formation were found for velocity, attenuation, mean brightness intensity of the B-Scan images and mean Nakagami m parameter, supporting the feasibility of this methodology for cataract hardness characterization and automatic classification.

  12. Driver's situation awareness during supervision of automated control - comparison between SART and SAGAT measurement techniques

    NARCIS (Netherlands)

    Beukel, van den A.P.; Voort, van der M.C.; Risser, R.

    2014-01-01

    Systems that enable automated driving are being introduced on the market. When using this technology, drivers need interfaces that support them in supervising the automated control. Assessment of the Situation Awareness (SA) which drivers are able to gain while using such interfaces,

  13. Sprint conditioning of junior soccer players: effects of training intensity and technique supervision.

    Directory of Open Access Journals (Sweden)

    Thomas Haugen

    Full Text Available The aims of the present study were to compare the effects of 1) training at 90 and 100% sprint velocity and 2) supervised versus unsupervised sprint training on soccer-specific physical performance in junior soccer players. Young, male soccer players (17 ± 1 yr, 71 ± 10 kg, 180 ± 6 cm) were randomly assigned to four different treatment conditions over a 7-week intervention period. A control group (CON, n = 9) completed regular soccer training according to their teams' original training plans. Three training groups performed a weekly repeated-sprint training session in addition to their regular soccer training sessions performed at A) 100% intensity without supervision (100UNSUP, n = 13), B) 90% of maximal sprint velocity with supervision (90SUP, n = 10) or C) 90% of maximal sprint velocity without supervision (90UNSUP, n = 13). Repetitions x distance for the sprint-training sessions were 15 x 20 m for 100UNSUP and 30 x 20 m for 90SUP and 90UNSUP. Single-sprint performance (best time from 15 x 20 m sprints), repeated-sprint performance (mean time over 15 x 20 m sprints), countermovement jump and Yo-Yo Intermittent Recovery Level 1 (Yo-Yo IR1) were assessed during pre-training and post-training tests. No significant differences in performance outcomes were observed across groups. 90SUP improved Yo-Yo IR1 by a moderate margin compared to controls, while all other effect magnitudes were trivial or small. In conclusion, neither weekly sprint training at 90 or 100% velocity, nor supervised sprint training enhanced soccer-specific physical performance in junior soccer players.

  14. Sprint conditioning of junior soccer players: effects of training intensity and technique supervision.

    Science.gov (United States)

    Haugen, Thomas; Tønnessen, Espen; Øksenholt, Øyvind; Haugen, Fredrik Lie; Paulsen, Gøran; Enoksen, Eystein; Seiler, Stephen

    2015-01-01

    The aims of the present study were to compare the effects of 1) training at 90 and 100% sprint velocity and 2) supervised versus unsupervised sprint training on soccer-specific physical performance in junior soccer players. Young, male soccer players (17 ± 1 yr, 71 ± 10 kg, 180 ± 6 cm) were randomly assigned to four different treatment conditions over a 7-week intervention period. A control group (CON, n = 9) completed regular soccer training according to their teams' original training plans. Three training groups performed a weekly repeated-sprint training session in addition to their regular soccer training sessions performed at A) 100% intensity without supervision (100UNSUP, n = 13), B) 90% of maximal sprint velocity with supervision (90SUP, n = 10) or C) 90% of maximal sprint velocity without supervision (90UNSUP, n=13). Repetitions x distance for the sprint-training sessions were 15 x 20 m for 100UNSUP and 30 x 20 m for 90SUP and 90UNSUP. Single-sprint performance (best time from 15 x 20 m sprints), repeated-sprint performance (mean time over 15 x 20 m sprints), countermovement jump and Yo-Yo Intermittent Recovery Level 1 (Yo-Yo IR1) were assessed during pre-training and post-training tests. No significant differences in performance outcomes were observed across groups. 90SUP improved Yo-Yo IR1 by a moderate margin compared to controls, while all other effect magnitudes were trivial or small. In conclusion, neither weekly sprint training at 90 or 100% velocity, nor supervised sprint training enhanced soccer-specific physical performance in junior soccer players.

  15. Improving Crop Classification Techniques Using Optical Remote Sensing Imagery, High-Resolution Agriculture Resource Inventory Shapefiles and Decision Trees

    Science.gov (United States)

    Melnychuk, A. L.; Berg, A. A.; Sweeney, S.

    2010-12-01

    Recognition of the anthropogenic effects of land use management practices on bodies of water is important for remediating and preventing eutrophication. In the case of Lake Simcoe, Ontario, the main surrounding land use is agriculture. To better manage the nutrient flow into the lake, knowledge of the management of the agricultural land is important. For this basin, a comprehensive agricultural resource inventory is required for assessment of policy and for input into water quality management and assessment tools. Supervised decision tree classification schemes, used in many previous applications, have yielded reliable classifications in agricultural land-use systems. However, when using these classification techniques the user is confronted with numerous data sources. In this study we use a large inventory of optical satellite image products (Landsat, AWiFS, SPOT and MODIS) and ancillary data sources (temporal MODIS-NDVI product signatures, digital elevation models and soil maps) at various spatial and temporal resolutions in a decision tree classification scheme. The sensitivity of the classification accuracy to the various products is assessed to identify optimal data sources for classifying crop systems.

  16. Color Image Classification and Retrieval using Image mining Techniques

    Directory of Open Access Journals (Sweden)

    Dr. V. Mohan

    2010-05-01

    Full Text Available Mining image data is one of the essential tasks in the present scenario. Image data plays a vital role in every aspect of different systems, such as business for marketing, hospitals for surgery, engineering for construction, and the Web for publication. Another area in image mining systems is Content-Based Image Retrieval (CBIR). CBIR systems perform retrieval based on similarity defined in terms of extracted features with more objectiveness. However, the features of the query image alone will not be a sufficient constraint for retrieving images. Hence, a new technique, Color Image Classification and Retrieval using image mining techniques, is proposed for improving user interaction with image retrieval systems by fully exploiting the similarity information.

  17. Review and classification of variability analysis techniques with clinical applications.

    Science.gov (United States)

    Bravi, Andrea; Longtin, André; Seely, Andrew J E

    2011-10-10

    Analysis of patterns of variation of time-series, termed variability analysis, represents a rapidly evolving discipline with increasing applications in different fields of science. In medicine and in particular critical care, efforts have focussed on evaluating the clinical utility of variability. However, the growth and complexity of techniques applicable to this field have made interpretation and understanding of variability more challenging. Our objective is to provide an updated review of variability analysis techniques suitable for clinical applications. We review more than 70 variability techniques, providing for each technique a brief description of the underlying theory and assumptions, together with a summary of clinical applications. We propose a revised classification for the domains of variability techniques, which include statistical, geometric, energetic, informational, and invariant. We discuss the process of calculation, often necessitating a mathematical transform of the time-series. Our aims are to summarize a broad literature, to promote a shared vocabulary that would improve the exchange of ideas, and to facilitate the comparison of results between different studies. We conclude with challenges for the evolving science of variability analysis.

  18. Interactive exploration of uncertainty in fuzzy classifications by isosurface visualization of class clusters

    NARCIS (Netherlands)

    Lucieer, A.; Veen, L.E.

    2009-01-01

    Uncertainty and vagueness are important concepts when dealing with transition zones between vegetation communities or land-cover classes. In this study, classification uncertainty is quantified by applying a supervised fuzzy classification algorithm. New visualization techniques are proposed and pre

  19. Artificial intelligence techniques for embryo and oocyte classification.

    Science.gov (United States)

    Manna, Claudio; Nanni, Loris; Lumini, Alessandra; Pappalardo, Sebastiana

    2013-01-01

    One of the most relevant aspects in assisted reproduction technology is the possibility of characterizing and identifying the most viable oocytes or embryos. In most cases, embryologists select them by visual examination and their evaluation is totally subjective. Recently, due to the rapid growth in the capacity to extract texture descriptors from a given image, a growing interest has been shown in the use of artificial intelligence methods for embryo or oocyte scoring/selection in IVF programmes. This work concentrates the efforts on the possible prediction of the quality of embryos and oocytes in order to improve the performance of assisted reproduction technology, starting from their images. The artificial intelligence system proposed in this work is based on a set of Levenberg-Marquardt neural networks trained using textural descriptors (the local binary patterns). The proposed system was tested on two data sets of 269 oocytes and 269 corresponding embryos from 104 women and compared with other machine learning methods already proposed in the past for similar classification problems. Although the results are only preliminary, they show an interesting classification performance. This technique may be of particular interest in those countries where legislation restricts embryo selection.
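
    The texture-plus-network idea can be sketched as follows; scikit-learn's MLPClassifier (which does not offer Levenberg-Marquardt training) stands in for the networks used in the study, and the images and viability labels are synthetic placeholders.

```python
# Sketch: local binary pattern (LBP) histograms as texture descriptors and a
# small neural network as the scorer. Images and labels are synthetic, and the
# optimiser differs from the Levenberg-Marquardt networks of the study.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def lbp_histogram(image, P=8, R=1):
    img8 = (np.clip(image, 0, 1) * 255).astype(np.uint8)
    codes = local_binary_pattern(img8, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

# Synthetic "oocyte" images: class 1 gets extra high-frequency texture.
images = rng.normal(0.5, 0.1, size=(200, 32, 32))
labels = rng.integers(0, 2, size=200)
images[labels == 1] += 0.2 * rng.normal(size=(int((labels == 1).sum()), 32, 32))

X = np.array([lbp_histogram(im) for im in images])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", round(clf.score(X_te, y_te), 3))
```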

  20. Comparison of three arterial pulse waveform classification techniques.

    Science.gov (United States)

    Allen, J; Murray, A

    1996-01-01

    Peripheral pulse waveforms can become stretched and damped with increasing severity of peripheral vascular disease (PVD) and hence could provide valuable diagnostic information. This study compares the diagnostic performance of 3 established classification techniques (a linear discriminant classifier, a k-nearest neighbour classifier, and an artificial neural network) for the detection of lower limb arterial disease from pulse waveforms obtained using photoelectric plethysmography (PPG). Pulse waveforms and pre- and post-exercise Doppler ultrasound ankle to brachial pressure indices (ABPI) were obtained from patients attending a vascular measurement laboratory. A single PPG pulse from each big toe was recorded direct to computer, pre-processed, and then used as classifier input data. The correct classifier outputs were the corresponding ABPI diagnostic classification. Pulse and ABPI measurements from 100 legs were used as training data for each classifier, and the computed classifications for pulses from a further 266 legs were then compared with their ABPI diagnoses. The diagnostic accuracy of the artificial neural network (80%) was higher than for the optimized k-nearest neighbour classifier (k = 27, accuracy 76%) and the linear discriminant classifier (71%). The Kappa measure of agreement which excludes chance was highest for the artificial neural network (57%) and significantly higher than that of the linear discriminant classifier (Kappa 40%, p < 0.05). The value of Kappa for the optimized k-nearest neighbour classifier (k = 27) was intermediate at 47%. This study has shown that classifiers can be taught to discriminate between small, and perhaps subtle, differences in features. We have demonstrated that artificial neural networks can be used to classify arterial pulse waveforms, and can perform better overall than k-nearest neighbour or linear discriminant classifiers for this application.
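
    A hedged sketch of this comparison, with synthetic features in place of the PPG waveforms and ABPI labels, and Cohen's kappa as the chance-corrected agreement measure:

```python
# Sketch: linear discriminant, k-nearest-neighbour and neural-network
# classifiers scored by accuracy and by Cohen's kappa. Data are synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

X, y = make_classification(n_samples=400, n_features=30, n_informative=8, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=2)

models = {
    "linear discriminant": LinearDiscriminantAnalysis(),
    "k-nearest neighbour (k=27)": KNeighborsClassifier(n_neighbors=27),
    "neural network": MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=2),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(f"{name}: accuracy={accuracy_score(y_te, pred):.2f} "
          f"kappa={cohen_kappa_score(y_te, pred):.2f}")
```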

  1. Classification of Phishing Email Using Random Forest Machine Learning Technique

    Directory of Open Access Journals (Sweden)

    Andronicus A. Akinyelu

    2014-01-01

    Full Text Available Phishing is one of the major challenges faced by the world of e-commerce today. Thanks to phishing attacks, billions of dollars have been lost by many companies and individuals. In 2012, an online report put the loss due to phishing attacks at about $1.5 billion. This global impact of phishing attacks will continue to be on the increase and thus requires more efficient phishing detection techniques to curb the menace. This paper investigates and reports the use of the random forest machine learning algorithm in classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer numbers of features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) were extracted and used by the machine learning algorithm, with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.

  2. Supervised Mineral Classification with Semi-automatic Training and Validation Set Generation in Scanning Electron Microscope Energy Dispersive Spectroscopy Images of Thin Sections

    DEFF Research Database (Denmark)

    Flesche, Harald; Nielsen, Allan Aasbjerg; Larsen, Rasmus

    2000-01-01

    This paper addresses the problem of classifying minerals common in siliciclastic and carbonate rocks. Twelve chemical elements are mapped from thin sections by energy dispersive spectroscopy in a scanning electron microscope (SEM). Extensions to traditional multivariate statistical methods...... are applied to perform the classification. First, training and validation sets are grown from one or a few seed points by a method that ensures spatial and spectral closeness of observations. Spectral closeness is obtained by excluding observations that have high Mahalanobis distances to the training class......–Matusita distance and the posterior probability of a class mean being classified as another class. Fourth, the actual classification is carried out based on four supervised classifiers all assuming multinormal distributions: simple quadratic, a contextual quadratic, and two hierarchical quadratic classifiers...

  3. Incremental Image Classification Method Based on Semi-Supervised Learning

    Institute of Scientific and Technical Information of China (English)

    梁鹏; 黎绍发; 覃姜维; 罗剑高

    2012-01-01

    In order to use large numbers of unlabeled images effectively, an image classification method is proposed based on semi-supervised learning. The proposed method bridges a large amount of unlabeled images and a limited number of labeled images by exploiting their common topics. The classification accuracy is improved by using the must-link and cannot-link constraints of the labeled images. Experimental results on Caltech-101 and a 7-class image dataset demonstrate that the proposed method improves classification accuracy by about 10%. Furthermore, because present semi-supervised image classification methods lack incremental learning ability, an incremental implementation of the method is also proposed. Compared with the non-incremental learning model in the literature, the incremental learning method improves computation efficiency by nearly 90%.

  4. A Combined Texture-principal Component Image Classification Technique For Landslide Identification Using Airborne Multispectral Imagery

    Science.gov (United States)

    Whitworth, M.; Giles, D.; Murphy, W.

    The Jurassic strata of the Cotswolds escarpment of southern central United Kingdom are associated with extensive mass movement activity, including mudslide systems, rotational and translational landslides. These mass movements can pose a significant engineering risk and have been the focus of research into the use of remote sensing techniques as a tool for landslide identification and delineation on clay slopes. The study has utilised a field site on the Cotswold escarpment above the village of Broadway, Worcestershire, UK. Geomorphological investigation was initially undertaken at the site in order to establish ground control on landslides and other landforms present at the site. Subsequent to this, Airborne Thematic Mapper (ATM) imagery and colour stereo photography were acquired by the UK Natural Environment Research Council (NERC) for further analysis and interpretation. This paper describes the textural enhancement of the airborne imagery undertaken using both mean Euclidean distance (MEUC) and grey level co-occurrence matrix entropy (GLCM) together with a combined texture-principal component based supervised image classification that was adopted as the method for landslide identification. The study highlights the importance of image texture for discriminating mass movements within multispectral imagery and demonstrates that by adopting a combined texture-principal component image classification we have been able to achieve classification accuracy of 84 % with a Kappa statistic of 0.838 for landslide classes. This paper also highlights the potential problems that can be encountered when using high-resolution multispectral imagery, such as the presence of dense variable woodland present within the image, and presents a solution using principal component analysis.
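
    A minimal sketch of the texture-plus-principal-component idea is given below. It assumes scikit-image >= 0.19 (graycomatrix) and scikit-learn, computes GLCM entropy per patch, stacks it with fake spectral band values, and classifies the principal components with a generic supervised classifier (an SVM here, not necessarily the classifier used in the study).

```python
# Sketch: per-patch grey level co-occurrence matrix (GLCM) entropy as a texture
# band, PCA over the stacked bands, and a supervised classifier on the
# components. "Imagery" and "landslide" labels are synthetic stand-ins.
import numpy as np
from skimage.feature import graycomatrix
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def glcm_entropy(patch):
    glcm = graycomatrix(patch, distances=[1], angles=[0], levels=16,
                        symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]
    return float(-np.sum(p[p > 0] * np.log2(p[p > 0])))

# 300 synthetic 16x16 patches; class-1 ("landslide") patches span more grey
# levels, so their GLCM entropy is higher; band means also shift slightly.
labels = rng.integers(0, 2, size=300)
patches = [rng.integers(0, 8 + 8 * lab, size=(16, 16), dtype=np.uint8) for lab in labels]
spectral = rng.normal(size=(300, 6)) + labels[:, None] * 0.5
texture = np.array([[glcm_entropy(p)] for p in patches])
X = np.hstack([spectral, texture])

X = PCA(n_components=4).fit_transform(X)              # principal components as inputs
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
clf = SVC().fit(X_tr, y_tr)
print("held-out accuracy:", round(clf.score(X_te, y_te), 3))
```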

  5. Chemometric techniques in oil classification from oil spill fingerprinting.

    Science.gov (United States)

    Ismail, Azimah; Toriman, Mohd Ekhwan; Juahir, Hafizan; Kassim, Azlina Md; Zain, Sharifuddin Md; Ahmad, Wan Kamaruzaman Wan; Wong, Kok Fah; Retnam, Ananthy; Zali, Munirah Abdul; Mokhtar, Mazlin; Yusri, Mohd Ayub

    2016-10-15

    Extended use of GC-FID and GC-MS in oil spill fingerprinting and matching is of significant importance for the classification of oil from spill sources collected from various areas of Peninsular Malaysia and Sabah (East Malaysia). Oil spill fingerprinting from GC-FID and GC-MS coupled with chemometric techniques (discriminant analysis and principal component analysis) is used as a diagnostic tool to classify the types of oil polluting the water. Clustering and discrimination of oil spill compounds in the water from the actual sites of oil spill events are divided into four groups, viz. diesel, Heavy Fuel Oil (HFO), Mixture Oil containing Light Fuel Oil (MOLFO) and Waste Oil (WO), according to the similarity of their intrinsic chemical properties. Principal component analysis (PCA) demonstrates that diesel, HFO, MOLFO and WO are types of oil or oil products from complex oil mixtures with a total variance of 85.34% and are identified with various anthropogenic activities related to either intentional releasing of oil or accidental discharge of oil into the environment. Our results show that the use of chemometric techniques is significant in providing independent validation for classifying the types of spilled oil in the investigation of oil spill pollution in Malaysia. This, in consequence, would result in cost and time savings in identification of the oil spill sources.

  6. Kollegial supervision

    DEFF Research Database (Denmark)

    Andersen, Ole Dibbern; Petersson, Erling

    The publication examines how collegial supervision can be organised in an educational institution.

  7. Water resources climate change projections using supervised nonlinear and multivariate soft computing techniques

    Science.gov (United States)

    Sarhadi, Ali; Burn, Donald H.; Johnson, Fiona; Mehrotra, Raj; Sharma, Ashish

    2016-05-01

    Accurate projection of global warming on the probabilistic behavior of hydro-climate variables is one of the main challenges in climate change impact assessment studies. Due to the complexity of climate-associated processes, different sources of uncertainty influence the projected behavior of hydro-climate variables in regression-based statistical downscaling procedures. The current study presents a comprehensive methodology to improve the predictive power of the procedure to provide improved projections. It does this by minimizing the uncertainty sources arising from the high-dimensionality of atmospheric predictors, the complex and nonlinear relationships between hydro-climate predictands and atmospheric predictors, as well as the biases that exist in climate model simulations. To address the impact of the high dimensional feature spaces, a supervised nonlinear dimensionality reduction algorithm is presented that is able to capture the nonlinear variability among projectors through extracting a sequence of principal components that have maximal dependency with the target hydro-climate variables. Two soft-computing nonlinear machine-learning methods, Support Vector Regression (SVR) and Relevance Vector Machine (RVM), are engaged to capture the nonlinear relationships between predictand and atmospheric predictors. To correct the spatial and temporal biases over multiple time scales in the GCM predictands, the Multivariate Recursive Nesting Bias Correction (MRNBC) approach is used. The results demonstrate that this combined approach significantly improves the downscaling procedure in terms of precipitation projection.

  8. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-08-31

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM were also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
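
    The recommended configuration (unigram TF-IDF features plus a linear SVM, tuned by grid search) can be sketched as below; the four placeholder sentences stand in for the OSHA narratives and the label set is reduced to two causes.

```python
# Sketch of the recommended setup: unigram TF-IDF features and a linear SVM,
# with a small grid search over the regularization parameter. The short
# "narratives" below are placeholders for the OSHA accident reports.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

narratives = [
    "worker fell from scaffolding and fractured leg",
    "employee struck by falling object on site",
    "worker fell through unguarded floor opening",
    "labourer struck by reversing excavator",
] * 10
labels = (["fall", "struck by"] * 2) * 10

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 1))),   # unigram tokenization
    ("svm", LinearSVC()),
])
grid = GridSearchCV(pipeline, {"svm__C": [0.1, 1, 10]}, cv=5)
grid.fit(narratives, labels)
print("best C:", grid.best_params_, "cv accuracy:", round(grid.best_score_, 3))
```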

  9. An experiment in multispectral, multitemporal crop classification using relaxation techniques

    Science.gov (United States)

    Davis, L. S.; Wang, C.-Y.; Xie, H.-C.

    1983-01-01

    The paper describes the result of an experimental study concerning the use of probabilistic relaxation for improving pixel classification rates. Two LACIE sites were used in the study and in both cases, relaxation resulted in a marked improvement in classification rates.

  10. Analysed potential of big data and supervised machine learning techniques in effectively forecasting travel times from fused data

    Directory of Open Access Journals (Sweden)

    Ivana Šemanjski

    2015-12-01

    Full Text Available Travel time forecasting is an interesting topic for many ITS services. Increased availability of data collection sensors increases the availability of predictor variables but also highlights the heavy processing issues related to this big data availability. In this paper we aimed to analyse the potential of big data and supervised machine learning techniques in effectively forecasting travel times. For this purpose we used fused data from three data sources (Global Positioning System vehicle tracks, road network infrastructure data and meteorological data) and four machine learning techniques (k-nearest neighbours, support vector machines, boosting trees and random forest). To evaluate the forecasting results we compared them between different road classes in terms of absolute values, measured in minutes, and the mean squared percentage error. For the road classes with high average speeds and long road segments, the machine learning techniques forecasted travel times with small relative errors, while for the road classes with small average speeds and segment lengths this was a more demanding task. All three data sources proved to have a high impact on travel time forecast accuracy, and the best results (taking into account all road classes) were achieved for the k-nearest neighbours and random forest techniques.
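
    A hedged sketch of this comparison, with synthetic fused features in place of the GPS/road/weather data and scikit-learn's GradientBoostingRegressor standing in for "boosting trees":

```python
# Sketch: four regressors forecasting travel time from fused predictors,
# scored by mean squared percentage error (MSPE). All data are synthetic.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 6))                        # GPS / road / weather features
y = 10 + 2 * X[:, 0] - X[:, 1] + rng.normal(0, 0.5, size=800)   # travel time (min)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "k-nearest neighbours": KNeighborsRegressor(),
    "support vector machine": SVR(),
    "boosting trees": GradientBoostingRegressor(random_state=0),
    "random forest": RandomForestRegressor(random_state=0),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    mspe = np.mean(((y_te - pred) / y_te) ** 2) * 100
    print(f"{name}: MSPE = {mspe:.2f}%")
```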

  11. The Comparative Study of Remote Sensing Image Supervised Classification Methods Based on ENVI

    Institute of Scientific and Technical Information of China (English)

    闫琰; 董秀兰; 李燕

    2011-01-01

    Given the wide application of supervised classification in remote sensing image classification, this paper describes four commonly used supervised classification methods provided by ENVI. The same TM image was classified with each of the four methods, the classification results were compared, and the differences in classification accuracy between the four methods were analyzed.

  12. Feature-space transformation improves supervised segmentation across scanners

    DEFF Research Database (Denmark)

    van Opbroek, Annegreet; Achterberg, Hakim C.; de Bruijne, Marleen

    2015-01-01

    Image-segmentation techniques based on supervised classification generally perform well on the condition that training and test samples have the same feature distribution. However, if training and test images are acquired with different scanners or scanning parameters, their feature distributions...

  13. Using Psychodrama Techniques to Promote Counselor Identity Development in Group Supervision

    Science.gov (United States)

    Scholl, Mark B.; Smith-Adcock, Sondra

    2007-01-01

    The authors briefly introduce the concepts, techniques, and theory of identity development associated with J. L. Moreno's (1946, 1969, 1993) Psychodrama. Based upon Loganbill, Hardy, and Delworth's (1982) model, counselor identity development is conceptualized as consisting of seven developmental themes or vectors (e.g., issues of awareness and…

  14. Classification techniques in quantitative comparative research : a meta-comparison

    OpenAIRE

    Nijkamp, P.; Rietveld, P.; Spierdijk, L.

    1999-01-01

    This paper emphasizes the importance of quantitative comparative research in the social sciences. For that purpose a great variety of modern classification methods is available. The paper aims to give a selective overview of the major classes of these methods and highlights their advantages and limitations.

  15. Comparison of multivariate preprocessing techniques as applied to electronic tongue based pattern classification for black tea

    Energy Technology Data Exchange (ETDEWEB)

    Palit, Mousumi [Department of Electronics and Telecommunication Engineering, Central Calcutta Polytechnic, Kolkata 700014 (India); Tudu, Bipan, E-mail: bt@iee.jusl.ac.in [Department of Instrumentation and Electronics Engineering, Jadavpur University, Kolkata 700098 (India); Bhattacharyya, Nabarun [Centre for Development of Advanced Computing, Kolkata 700091 (India); Dutta, Ankur; Dutta, Pallab Kumar [Department of Instrumentation and Electronics Engineering, Jadavpur University, Kolkata 700098 (India); Jana, Arun [Centre for Development of Advanced Computing, Kolkata 700091 (India); Bandyopadhyay, Rajib [Department of Instrumentation and Electronics Engineering, Jadavpur University, Kolkata 700098 (India); Chatterjee, Anutosh [Department of Electronics and Communication Engineering, Heritage Institute of Technology, Kolkata 700107 (India)

    2010-08-18

    In an electronic tongue, preprocessing on raw data precedes pattern analysis, and the choice of the appropriate preprocessing technique is crucial for the performance of the pattern classifier. While attempting to classify different grades of black tea using a voltammetric electronic tongue, different preprocessing techniques have been explored and a comparison of their performances is presented in this paper. The preprocessing techniques are compared first by a quantitative measurement of separability followed by principal component analysis; and then two different supervised pattern recognition models based on neural networks are used to evaluate the performance of the preprocessing techniques.

  16. Comparison of multivariate preprocessing techniques as applied to electronic tongue based pattern classification for black tea.

    Science.gov (United States)

    Palit, Mousumi; Tudu, Bipan; Bhattacharyya, Nabarun; Dutta, Ankur; Dutta, Pallab Kumar; Jana, Arun; Bandyopadhyay, Rajib; Chatterjee, Anutosh

    2010-08-18

    In an electronic tongue, preprocessing on raw data precedes pattern analysis, and the choice of the appropriate preprocessing technique is crucial for the performance of the pattern classifier. While attempting to classify different grades of black tea using a voltammetric electronic tongue, different preprocessing techniques have been explored and a comparison of their performances is presented in this paper. The preprocessing techniques are compared first by a quantitative measurement of separability followed by principal component analysis; and then two different supervised pattern recognition models based on neural networks are used to evaluate the performance of the preprocessing techniques.

  17. Grapevine Yield and Leaf Area Estimation Using Supervised Classification Methodology on RGB Images Taken under Field Conditions

    Science.gov (United States)

    Diago, Maria-Paz; Correa, Christian; Millán, Borja; Barreiro, Pilar; Valero, Constantino; Tardaguila, Javier

    2012-01-01

    The aim of this research was to implement a methodology through the generation of a supervised classifier based on the Mahalanobis distance to characterize the grapevine canopy and assess leaf area and yield using RGB images. The method automatically processes sets of images and calculates the areas (number of pixels) corresponding to seven different classes (Grapes, Wood, Background, and four classes of Leaf of increasing leaf age). Each class is initialized by the user, who selects a set of representative pixels for every class in order to induce the clustering around them. The proposed methodology was evaluated with 70 grapevine (V. vinifera L. cv. Tempranillo) images, acquired in a commercial vineyard located in La Rioja (Spain) after several defoliation and de-fruiting events on 10 vines, with a conventional RGB camera and no artificial illumination. The segmentation results showed a performance of 92% for leaves and 98% for clusters, and allowed the grapevine’s leaf area and yield to be assessed with R2 values of 0.81 (p < 0.001) and 0.73 (p = 0.002), respectively. This methodology, which operates with a simple image acquisition setup and guarantees the right number and kind of pixel classes, has been shown to be suitable and robust enough to provide valuable information for vineyard management. PMID:23235443
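    A minimal sketch of the kind of Mahalanobis-distance pixel classifier described above is given below; the seed pixels and the two classes shown are hypothetical, and the real method works with seven classes initialized interactively by the user.

      import numpy as np

      def train_mahalanobis(samples_per_class):
          """samples_per_class: dict class_name -> (n, 3) array of RGB training pixels."""
          model = {}
          for name, pixels in samples_per_class.items():
              mean = pixels.mean(axis=0)
              cov_inv = np.linalg.inv(np.cov(pixels, rowvar=False) + 1e-6 * np.eye(3))
              model[name] = (mean, cov_inv)
          return model

      def classify(pixels, model):
          """Assign each RGB pixel to the class with the smallest Mahalanobis distance."""
          names = list(model)
          dists = []
          for name in names:
              mean, cov_inv = model[name]
              d = pixels - mean
              dists.append(np.einsum("ij,jk,ik->i", d, cov_inv, d))   # squared Mahalanobis distance
          return np.array(names)[np.argmin(np.stack(dists, axis=1), axis=1)]

      rng = np.random.default_rng(1)
      # Hypothetical user-selected seed pixels for two of the seven classes named above.
      model = train_mahalanobis({
          "Grapes": rng.normal([60, 30, 80], 10, (200, 3)),
          "Leaf":   rng.normal([40, 120, 40], 10, (200, 3)),
      })
      print(classify(rng.normal([42, 118, 38], 10, (5, 3)), model))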

  18. Two Linear Unmixing Algorithms to Recognize Targets Using Supervised Classification and Orthogonal Rotation in Airborne Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Michael Zheludev

    2012-02-01

    Full Text Available The goal of the paper is to detect pixels that contain targets of known spectra. The target can be present at sub-pixel or above-pixel size. Pixels without targets are classified as background pixels. Each pixel is treated via the content of its neighborhood. A pixel whose spectrum is different from its neighborhood is classified as a “suspicious point”. In each suspicious point there is a mix of target(s) and background. The main objective in a supervised detection (also called “target detection”) is to search for a specific given spectral material (target) in hyperspectral imaging (HSI), where the spectral signature of the target is known a priori from laboratory measurements. In addition, the fractional abundance of the target is computed. To achieve this we present two linear unmixing algorithms that recognize targets with known (given) spectral signatures. The CLUN is based on automatic feature extraction from the target’s spectrum. These features separate the target from the background. The ROTU algorithm is based on embedding the spectra space into a special space by random orthogonal transformation and on the statistical properties of the embedded result. Experimental results demonstrate that the targets’ locations were extracted correctly and that these algorithms are robust and efficient.

  19. Tree Species Abundance Predictions in a Tropical Agricultural Landscape with a Supervised Classification Model and Imbalanced Data

    Directory of Open Access Journals (Sweden)

    Sarah J. Graves

    2016-02-01

    Full Text Available Mapping species through classification of imaging spectroscopy data is facilitating research to understand tree species distributions at increasingly greater spatial scales. Classification requires a dataset of field observations matched to the image, which will often reflect natural species distributions, resulting in an imbalanced dataset with many samples for common species and few samples for less common species. Despite the high prevalence of imbalanced datasets in multiclass species predictions, the effect on species prediction accuracy and landscape species abundance has not yet been quantified. First, we trained and assessed the accuracy of a support vector machine (SVM) model with a highly imbalanced dataset of 20 tropical species and one mixed-species class of 24 species identified in a hyperspectral image mosaic (350–2500 nm) of Panamanian farmland and secondary forest fragments. The model, with an overall accuracy of 62% ± 2.3% and F-score of 59% ± 2.7%, was applied to the full image mosaic (23,000 ha) at a 2-m resolution to produce a species prediction map, which suggested that this tropical agricultural landscape is more diverse than what has been presented in field-based studies. Second, we quantified the effect of class imbalance on model accuracy. Model assessment showed a trend where species with more samples were consistently over-predicted while species with fewer samples were under-predicted. Standardizing sample size reduced model accuracy, but also reduced the level of species over- and under-prediction. This study advances operational species mapping of diverse tropical landscapes by detailing the effect of imbalanced data on classification accuracy and providing estimates of tree species abundance in an agricultural landscape. Species maps using data and methods presented here can be used in landscape analyses of species distributions to understand human or environmental effects, in addition to focusing conservation
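    The sketch below illustrates the imbalance effect discussed above on a synthetic multi-class problem; a class-weighted SVM stands in for the sample-size standardization used in the study, and the data are not the hyperspectral samples of the paper.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import f1_score

      # Synthetic imbalanced multi-class data standing in for the hyperspectral pixel samples.
      X, y = make_classification(n_samples=3000, n_features=30, n_informative=15,
                                 n_classes=5, weights=[0.6, 0.2, 0.1, 0.06, 0.04],
                                 random_state=0)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

      # Per-class F-scores: rare classes tend to be under-predicted by the plain SVM,
      # while re-weighting (a stand-in for standardizing sample sizes) evens things out.
      for name, clf in [("imbalanced SVM", SVC(kernel="rbf")),
                        ("class-weighted SVM", SVC(kernel="rbf", class_weight="balanced"))]:
          clf.fit(X_tr, y_tr)
          print(name, np.round(f1_score(y_te, clf.predict(X_te), average=None), 2))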

  20. Classification of JERS-1 Image Mosaic of Central Africa Using A Supervised Multiscale Classifier of Texture Features

    Science.gov (United States)

    Saatchi, Sassan; DeGrandi, Franco; Simard, Marc; Podest, Erika

    1999-01-01

    In this paper, a multiscale approach is introduced to classify the Japanese Earth Resources Satellite-1 (JERS-1) mosaic image over the Central African rainforest. A series of texture maps are generated from the 100 m mosaic image at various scales. Using a quadtree model and relating classes at each scale by a Markovian relationship, the multiscale images are classified from coarse to finer scale. The results are verified at various scales and the evolution of the classification is monitored by calculating the error at each stage.

  1. Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition

    Science.gov (United States)

    2014-03-27

    Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition. Thesis.

  2. SPEECH/MUSIC CLASSIFICATION USING WAVELET BASED FEATURE EXTRACTION TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Thiruvengatanadhan Ramalingam

    2014-01-01

    Full Text Available With the rapid growth in the volume of audio data, audio classification serves as a fundamental step for multimedia information retrieval, and speech/music classification is one of its most important problems. In this work a speech/music discrimination system is developed which utilizes the Discrete Wavelet Transform (DWT) as the acoustic feature. Multi-resolution analysis is a significant statistical way to extract features from the input signal, and in this study a method is deployed to model the extracted wavelet features. Support Vector Machines (SVM) are based on the principle of structural risk minimization. SVM is applied to classify audio into the classes speech and music by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function using maximum likelihood decision methods. The system shows significant results with an accuracy of 94.5%.
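    A rough sketch of the DWT-plus-SVM stage described above is shown below, using PyWavelets for the wavelet decomposition; the toy tone and noise signals standing in for music and speech, and the sub-band energy features, are assumptions of this illustration.

      import numpy as np
      import pywt
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      def dwt_features(signal, wavelet="db4", level=5):
          """Energy of each DWT sub-band, a common way to summarize wavelet coefficients."""
          coeffs = pywt.wavedec(signal, wavelet, level=level)
          return np.array([np.sum(c ** 2) / len(c) for c in coeffs])

      rng = np.random.default_rng(0)
      t = np.linspace(0, 1, 4096)
      # Toy surrogates: "music" as a mix of steady tones, "speech" as bursty band-limited noise.
      music  = [np.sin(2 * np.pi * f * t) + 0.5 * np.sin(2 * np.pi * 2 * f * t)
                for f in rng.uniform(200, 800, 40)]
      speech = [rng.normal(0, 1, t.size) * (rng.random(t.size) > 0.7) for _ in range(40)]

      X = np.array([dwt_features(s) for s in music + speech])
      y = np.array([0] * 40 + [1] * 40)
      print(cross_val_score(SVC(), X, y, cv=5).mean())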

  3. Comparing techniques for vegetation classification using multi- and hyperspectral images and ancillary environmental data

    NARCIS (Netherlands)

    Sluiter, R; Pebesma, E.J.

    2010-01-01

    This paper evaluates the predictive power of innovative and more conventional statistical classification techniques. We use Landsat 7 Enhanced Thematic Mapper Plus (ETM+), Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and airborne imaging spectrometer (HyMap) images

  4. Classification of the Regional Ionospheric Disturbance Based on Machine Learning Techniques

    Science.gov (United States)

    Terzi, Merve Begum; Arikan, Orhan; Karatay, Secil; Arikan, Feza; Gulyaeva, Tamara

    2016-08-01

    In this study, Total Electron Content (TEC) estimated from GPS receivers is used to model the regional and local variability that differs from global activity, along with solar and geomagnetic indices. For the automated classification of regional disturbances, a classification technique based on a robust machine learning method that has found widespread use, the Support Vector Machine (SVM), is proposed. The performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. By applying the developed classification technique to Global Ionospheric Map (GIM) TEC data, which is provided by the NASA Jet Propulsion Laboratory (JPL), it is shown that SVM can be a suitable learning method to detect anomalies in TEC variations.

  5. Handwritten Character Classification using the Hotspot Feature Extraction Technique

    NARCIS (Netherlands)

    Surinta, Olarik; Schomaker, Lambertus; Wiering, Marco

    2012-01-01

    Feature extraction techniques can be important in character recognition, because they can enhance the efficacy of recognition in comparison to featureless or pixel-based approaches. This study aims to investigate the novel feature extraction technique called the hotspot technique in order to use it

  6. An Improved Image Mining Technique For Brain Tumour Classification Using Efficient Classifier

    OpenAIRE

    Rajendran, P.; M.Madheswaran

    2010-01-01

    An improved image mining technique for brain tumor classification using pruned association rule with MARI algorithm is presented in this paper. The method proposed makes use of the association rule mining technique to classify the CT scan brain images into three categories, namely normal, benign and malignant. It combines the low-level features extracted from images and high level knowledge from specialists. The developed algorithm can assist the physicians for efficient classification with multiple ...

  7. A new texture and shape based technique for improving meningioma classification.

    Science.gov (United States)

    Fatima, Kiran; Arooj, Arshia; Majeed, Hammad

    2014-11-01

    Over the past decade, computer-aided diagnosis has grown rapidly due to the availability of patient data, sophisticated image acquisition tools and advancement in image processing and machine learning algorithms. Meningiomas are tumors of the brain and spinal cord. They account for 20% of all brain tumors. Meningioma subtype classification involves the classification of benign meningioma into four major subtypes: meningothelial, fibroblastic, transitional, and psammomatous. Under the microscope, the histology images of these four subtypes show a variety of textural and structural characteristics. High intraclass and low interclass variabilities in meningioma subtypes make it an extremely complex classification problem. A number of techniques have been proposed for meningioma subtype classification with varying performances on different subtypes. Most of these techniques employed wavelet packet transforms for textural feature extraction and analysis of meningioma histology images. In this article, a hybrid classification technique based on texture and shape characteristics is proposed for the classification of meningioma subtypes. Meningothelial and fibroblastic subtypes are classified on the basis of nuclei shapes, while grey-level co-occurrence matrix textural features are used to train a multilayer perceptron for the classification of transitional and psammomatous subtypes. On the whole, an average classification accuracy of 92.50% is achieved through the proposed hybrid classifier, which to the best of our knowledge is the highest. © 2014 Wiley Periodicals, Inc.

  8. A-Survey of Feature Extraction and Classification Techniques in OCR Systems

    Directory of Open Access Journals (Sweden)

    Rohit Verma

    2012-11-01

    Full Text Available This paper describes a set of feature extraction and classification techniques, which play a very important role in the recognition of characters. Feature extraction provides methods with the help of which characters can be identified uniquely and with a high degree of accuracy. Feature extraction helps to find the shape contained in the pattern. Although a number of techniques are available for feature extraction and classification, the choice of an excellent technique decides the degree of accuracy of recognition. A lot of research has been done in this field and new techniques of extraction and classification have been developed. The objective of this paper is to review these techniques so that the set of these techniques can be appreciated.

  9. Classification of traditional Chinese pork bacon based on physicochemical properties and chemometric techniques.

    Science.gov (United States)

    Guo, Xin; Huang, Feng; Zhang, Hong; Zhang, Chunjiang; Hu, Honghai; Chen, Wenbo

    2016-07-01

    Sixty-seven pork bacon samples from Hunan, Sichuan, Guangdong, Jiangxi, and Yunnan Provinces in China were analyzed to understand their geographical properties. Classification was performed by determining their physicochemical properties through chemometric techniques, including variance analysis, principal component analysis (PCA), and discriminant analysis (DA). Results showed that certain differences existed in terms of nine physicochemical determinations in traditional Chinese pork bacon. PCA revealed the distinction among Hunan, Sichuan, and Guangdong style bacon. Meanwhile, seven key physicochemical determination criteria were identified in line with DA and could be reasonably applied to the classification of traditional Chinese pork bacon. Furthermore, the ratio of overall correct classification was 97.76% and that of cross-validation was 91.76%. These findings indicated that chemometric techniques, together with several physicochemical determinations, were effective for the classification of traditional Chinese pork bacon with geographical features. Our study provides a theoretical reference for the classification of traditional Chinese pork bacon.

  10. A supervised learning approach for taxonomic classification of core-photosystem-II genes and transcripts in the marine environment

    Directory of Open Access Journals (Sweden)

    Polz Martin F

    2009-05-01

    Full Text Available Abstract Background Cyanobacteria of the genera Synechococcus and Prochlorococcus play a key role in marine photosynthesis, which contributes to the global carbon cycle and to the world oxygen supply. Recently, genes encoding the photosystem II reaction center (psbA and psbD) were found in cyanophage genomes. This phenomenon suggested that the horizontal transfer of these genes may be involved in increasing phage fitness. To date, a very small percentage of marine bacteria and phages has been cultured. Thus, mapping genomic data extracted directly from the environment to its taxonomic origin is necessary for a better understanding of phage-host relationships and dynamics. Results To achieve an accurate and rapid taxonomic classification, we employed a computational approach combining a multi-class Support Vector Machine (SVM) with a codon usage position specific scoring matrix (cuPSSM). Our method has been applied successfully to classify core-photosystem-II gene fragments, including partial sequences coming directly from the ocean, into seven different taxonomic classes. Applying the method on a large set of DNA and RNA psbA clones from the Mediterranean Sea, we studied the distribution of cyanobacterial psbA genes and transcripts in their natural environment. Using our approach, we were able to simultaneously examine taxonomic and ecological distributions in the marine environment. Conclusion The ability to accurately classify the origin of individual genes and transcripts coming directly from the environment is of great importance in studying marine ecology. The classification method presented in this paper could be applied further to classify other genes amplified from the environment, for which training data is available.

  11. Classification of German white wines with certified brand of origin by multielement quantitation and pattern recognition techniques.

    Science.gov (United States)

    Castiñeira Gómez, Maria del Mar; Feldmann, Ingo; Jakubowski, Norbert; Andersson, Jan T

    2004-05-19

    A procedure is proposed for the determination of the authenticity of white wines from four German wine-growing regions (Baden, Rheingau, Rheinhessen, and Pfalz) based on their content of some major, trace, and ultratrace elements. One hundred and twenty-seven white wine samples possessing a certificate of origin, all of the 2000 vintage, were analyzed. The concentrations of 13 elements (Li, B, Mg, Ca, V, Mn, Co, Fe, Zn, Rb, Sr, Cs, and Pb) were determined in wine diluted 1:20 by sector field inductively coupled plasma mass spectrometry (SF-ICP-MS). Indium was routinely used as an internal standard. Supervised pattern recognition techniques such as discriminant analysis and classification trees were applied for the interpretation of the data. A quadratic discriminant analysis (QDA) allowed the four regions to be discriminated with 83% accuracy when using only eight variables (Li, B, Mg, Fe, Zn, Sr, Cs, and Pb), and the prediction ability for classifying new samples was 76%. With a second method, a decision tree, the classification of samples coming from the four regions could be performed with an accuracy of 84% when only four elements were used: Li (very low in samples from Baden), Zn (abnormally low in the samples from the Rheingau), and Mg and Sr (both important for the differentiation between Pfalz and Rheinhessen samples). For this method, the prediction ability was only 74% in the identification of unknown samples. The robustness of the QDA model was not good enough, and therefore the tree is recommended instead for the classification of new wine samples from these areas of German wine production.
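    The two chemometric routes mentioned above (a QDA on eight elements and a compact decision tree on four) can be sketched as follows; the synthetic element concentrations and the cross-validation set-up are illustrative and not the paper's data.

      import numpy as np
      import pandas as pd
      from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      regions = ["Baden", "Rheingau", "Rheinhessen", "Pfalz"]
      elements = ["Li", "B", "Mg", "Fe", "Zn", "Sr", "Cs", "Pb"]
      # Synthetic element concentrations with region-specific means (arbitrary values).
      X = np.vstack([rng.normal(loc=10 + 3 * i, scale=2, size=(32, len(elements)))
                     for i in range(len(regions))])
      y = np.repeat(regions, 32)
      data = pd.DataFrame(X, columns=elements)

      qda  = QuadraticDiscriminantAnalysis()
      tree = DecisionTreeClassifier(max_depth=3, random_state=0)   # compact decision tree
      print("QDA :", cross_val_score(qda, data[elements], y, cv=5).mean())
      print("Tree:", cross_val_score(tree, data[["Li", "Zn", "Mg", "Sr"]], y, cv=5).mean())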

  12. Improving SVDD classification performance on hyperspectral images via correlation based ensemble technique

    Science.gov (United States)

    Uslu, Faruk Sukru; Binol, Hamidullah; Ilarslan, Mustafa; Bal, Abdullah

    2017-02-01

    Support Vector Data Description (SVDD) is a nonparametric and powerful method for target detection and classification. The SVDD constructs a minimum hypersphere enclosing the target objects as much as possible. It has the advantages of sparsity, good generalization and the use of kernel machines. In many studies, different methods have been offered in order to improve the performance of the SVDD. In this paper, we present ensemble methods to improve the classification performance of the SVDD on remotely sensed hyperspectral imagery (HSI) data. Among various ensemble approaches we selected the bagging technique, training on different combinations of the data set. As a novel weighting technique, we propose a correlation-based assignment of weight coefficients, in which the correlation between the bagged classifiers is calculated to derive the coefficients of the weighted combination. To verify the improvement in performance, two hyperspectral images are processed for classification purposes. The obtained results show that the ensemble SVDD is significantly better than the conventional SVDD in terms of classification accuracy.
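    The sketch below illustrates a correlation-weighted bagging ensemble in the spirit of the approach above; scikit-learn's OneClassSVM stands in for the SVDD (a closely related boundary method), and the exact weighting rule and the synthetic spectra are assumptions of this illustration.

      import numpy as np
      from sklearn.svm import OneClassSVM

      rng = np.random.default_rng(0)
      # Synthetic "target" spectra (one class) and mixed test pixels.
      target = rng.normal(0, 1, (200, 20))
      test   = np.vstack([rng.normal(0, 1, (50, 20)),      # target pixels
                          rng.normal(3, 1, (50, 20))])     # background pixels
      truth  = np.array([1] * 50 + [-1] * 50)

      # Bagging: each detector is fit on a bootstrap sample of the target class.
      detectors, train_scores = [], []
      for _ in range(10):
          boot = target[rng.integers(0, len(target), len(target))]
          det = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(boot)
          detectors.append(det)
          train_scores.append(det.decision_function(target))

      # Correlation-based weights: detectors highly correlated with the rest get less weight
      # (an illustrative weighting rule, not necessarily the one used in the paper).
      corr = np.corrcoef(np.array(train_scores))
      weights = 1.0 / (1e-6 + np.abs(corr).mean(axis=1))
      weights /= weights.sum()

      combined = sum(w * d.decision_function(test) for w, d in zip(weights, detectors))
      print("ensemble accuracy:", np.mean(np.sign(combined) == truth))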

  13. Automated authorship attribution using advanced signal classification techniques.

    Directory of Open Access Journals (Sweden)

    Maryam Ebrahimpour

    Full Text Available In this paper, we develop two automated authorship attribution schemes, one based on Multiple Discriminant Analysis (MDA) and the other based on a Support Vector Machine (SVM). The classification features we exploit are based on word frequencies in the text. We adopt an approach of preprocessing each text by stripping it of all characters except a-z and space. This is in order to increase the portability of the software to different types of texts. We test the methodology on a corpus of undisputed English texts, and use leave-one-out cross validation to demonstrate classification accuracies in excess of 90%. We further test our methods on the Federalist Papers, which have a partly disputed authorship and a fair degree of scholarly consensus. And finally, we apply our methodology to the question of the authorship of the Letter to the Hebrews by comparing it against a number of original Greek texts of known authorship. These tests identify where some of the limitations lie, motivating a number of open questions for future work. An open source implementation of our methodology is freely available for use at https://github.com/matthewberryman/author-detection.
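    The preprocessing and word-frequency classification described above can be sketched as follows; the tiny placeholder corpus and the linear-kernel SVM settings are assumptions, not the authors' implementation.

      import re
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.svm import SVC
      from sklearn.model_selection import LeaveOneOut, cross_val_score
      from sklearn.pipeline import make_pipeline

      def preprocess(text):
          """Keep only a-z and space, as described above, to stay portable across text types."""
          return re.sub(r"[^a-z ]", "", text.lower())

      # Tiny placeholder corpus; the study used undisputed English texts of known authorship.
      texts   = ["It is a truth universally acknowledged that a single man ...",
                 "Call me Ishmael Some years ago never mind how long precisely ...",
                 "It was the best of times it was the worst of times ..."] * 10
      authors = ["austen", "melville", "dickens"] * 10

      pipe = make_pipeline(CountVectorizer(preprocessor=preprocess), SVC(kernel="linear"))
      # Leave-one-out cross validation, as in the abstract.
      print(cross_val_score(pipe, texts, authors, cv=LeaveOneOut()).mean())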

  14. Motivation techniques for supervision

    Science.gov (United States)

    Gray, N. D.

    1974-01-01

    Guide has been published which deals with various aspects of employee motivation. Training methods are designed to improve communication between supervisors and subordinates, to create feeling of achievement and recognition for every employee, and to retain personnel confidence in spite of some negative motivators. End result of training is reduction or prevention of errors.

  15. Classification

    Data.gov (United States)

    National Aeronautics and Space Administration — A supervised learning task involves constructing a mapping from an input data space (normally described by several features) to an output space. A set of training...

  16. Techniques of EMG signal analysis: detection, processing, classification and applications

    Science.gov (United States)

    Hussain, M.S.; Mohd-Yasin, F.

    2006-01-01

    Electromyography (EMG) signals can be used for clinical/biomedical applications, Evolvable Hardware Chip (EHW) development, and modern human computer interaction. EMG signals acquired from muscles require advanced methods for detection, decomposition, processing, and classification. The purpose of this paper is to illustrate the various methodologies and algorithms for EMG signal analysis to provide efficient and effective ways of understanding the signal and its nature. We further point up some of the hardware implementations using EMG focusing on applications related to prosthetic hand control, grasp recognition, and human computer interaction. A comparison study is also given to show performance of various EMG signal analysis methods. This paper provides researchers a good understanding of EMG signal and its analysis procedures. This knowledge will help them develop more powerful, flexible, and efficient applications. PMID:16799694

  17. Three-Class EEG-Based Motor Imagery Classification Using Phase-Space Reconstruction Technique

    Science.gov (United States)

    Djemal, Ridha; Bazyed, Ayad G.; Belwafi, Kais; Gannouni, Sofien; Kaaniche, Walid

    2016-01-01

    Over the last few decades, brain signals have been significantly exploited for brain-computer interface (BCI) applications. In this paper, we study the extraction of features using event-related desynchronization/synchronization techniques to improve the classification accuracy for three-class motor imagery (MI) BCI. The classification approach is based on combining the features of the phase and amplitude of the brain signals using fast Fourier transform (FFT) and autoregressive (AR) modeling of the reconstructed phase space, as well as the modification of the BCI parameters (trial length, trial frequency band, classification method). Utilizing sequential forward floating selection (SFFS) and multi-class linear discriminant analysis (LDA), we obtained superior classification results compared with those in the literature, with classification accuracies of 86.06% and 93% for two BCI competition datasets. PMID:27563927

  18. Classification of alarm processing techniques and human performance issues

    Energy Technology Data Exchange (ETDEWEB)

    Kim, I.S.; O' Hara, J.M.

    1993-01-01

    Human factors reviews indicate that conventional alarm systems based on the one sensor, one alarm approach, have many human engineering deficiencies, a paramount example being too many alarms during major disturbances. As an effort to resolve these deficiencies, various alarm processing systems have been developed using different techniques. To ensure their contribution to operational safety, the impacts of those systems on operating crew performance should be carefully evaluated. This paper briefly reviews some of the human factors research issues associated with alarm processing techniques and then discusses a framework with which to classify the techniques. The dimensions of this framework can be used to explore the effects of alarm processing systems on human performance.

  19. Classification of alarm processing techniques and human performance issues

    Energy Technology Data Exchange (ETDEWEB)

    Kim, I.S.; O'Hara, J.M.

    1993-05-01

    Human factors reviews indicate that conventional alarm systems based on the one sensor, one alarm approach, have many human engineering deficiencies, a paramount example being too many alarms during major disturbances. As an effort to resolve these deficiencies, various alarm processing systems have been developed using different techniques. To ensure their contribution to operational safety, the impacts of those systems on operating crew performance should be carefully evaluated. This paper briefly reviews some of the human factors research issues associated with alarm processing techniques and then discusses a framework with which to classify the techniques. The dimensions of this framework can be used to explore the effects of alarm processing systems on human performance.

  20. The confusion technique untangled: its theoretical rationale and preliminary classification.

    Science.gov (United States)

    Otani, A

    1989-01-01

    This article examines the historical development of Milton H. Erickson's theoretical approach to hypnosis using confusion. Review of the literature suggests that the Confusion Technique, in principle, consists of a two-stage "confusion-restructuring" process. The article also attempts to categorize several examples of confusion suggestions by seven linguistic characteristics: (1) antonyms, (2) homonyms, (3) synonyms, (4) elaboration, (5) interruption, (6) echoing, and (7) uncommon words. The Confusion Technique is an important yet little studied strategy developed by Erickson. More work is urged to investigate its nature and properties.

  1. IMPLICATION OF CLASSIFICATION TECHNIQUES IN PREDICTING STUDENT’S RECITAL

    Directory of Open Access Journals (Sweden)

    S. Anupama Kumar

    2011-09-01

    Full Text Available Educational data mining is used to study the data available in the educational field and bring out the hidden knowledge from it. Classification methods like decision trees, rule mining and Bayesian networks can be applied to educational data to predict students' behavior, performance in examinations, etc. This prediction can help tutors to identify weak students and help them to score better marks. The C4.5 decision tree algorithm is applied to students' internal assessment data to predict their performance in the final exam. The outcome of the decision tree predicted the number of students who were likely to pass or fail. The result was given to the tutor and steps were taken to improve the performance of the students who were predicted to fail. After the declaration of the results of the final examination, the marks obtained by the students were fed into the system and the results were analyzed. The comparative analysis of the results shows that the prediction helped the weaker students to improve and brought about an improvement in the results. The algorithm was also analyzed by duplicating the same data, and the duplication produced little change in the predicted student outcomes. To analyze the accuracy of the algorithm, it was compared with the ID3 algorithm and found to be more efficient in terms of accurately predicting the outcome of the student and the time taken to derive the tree.
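    A minimal sketch of the prediction step is given below; scikit-learn's decision tree with the entropy criterion is used as a C4.5-style stand-in, and the internal-assessment marks and the pass/fail rule are hypothetical.

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier, export_text
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import accuracy_score

      rng = np.random.default_rng(0)
      # Hypothetical internal-assessment marks for 300 students (3 tests + attendance %).
      X = rng.uniform(0, 100, size=(300, 4))
      y = np.where(X.mean(axis=1) + rng.normal(0, 8, 300) > 50, "pass", "fail")

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
      # criterion="entropy" gives an information-gain tree in the spirit of ID3/C4.5.
      tree = DecisionTreeClassifier(criterion="entropy", max_depth=3).fit(X_tr, y_tr)

      print(accuracy_score(y_te, tree.predict(X_te)))
      print(export_text(tree, feature_names=["test1", "test2", "test3", "attendance"]))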

  2. Web Video Mining: Metadata Predictive Analysis using Classification Techniques

    Directory of Open Access Journals (Sweden)

    Siddu P. Algur

    2016-02-01

    Full Text Available Nowadays, data engineering is becoming an emerging trend for discovering knowledge from web audiovisual data such as YouTube videos, Yahoo Screen and Facebook videos. Different categories of web video are being shared on such social websites and are being used by billions of users all over the world. The uploaded web videos will have different kinds of metadata as attribute information of the video data. The metadata attributes define the contents and features/characteristics of the web videos conceptually. Hence, accomplishing web video mining by extracting features of web videos in terms of metadata is a challenging task. In this work, effective attempts are made to classify and predict the metadata features of web videos such as the length of the web videos, the number of comments, rating information and view counts, using data mining algorithms such as the decision tree J48 and naive Bayesian algorithms as part of web video mining. The results of the decision tree J48 and naive Bayesian classification models are analyzed and compared as a step in the process of knowledge discovery from web videos.

  3. Classification of protein profiles using fuzzy clustering techniques

    DEFF Research Database (Denmark)

    Karemore, Gopal; Mullick, Jhinuk B.; Sujatha, R.

    2010-01-01

    …day-to-day variation, artifacts due to experimental strategies, and inherent uncertainty in the pumping procedure, which are very common during HPLC-LIF experiments. Under these circumstances we demonstrate how a fuzzy clustering algorithm like Gath-Geva followed by Sammon mapping outperforms PCA mapping in classifying various cancers from healthy spectra, with classification rates up to 95% compared to 60%. The methods are validated using various clustering indexes and show promising improvement in developing optical pathology like HPLC-LIF for early detection of various…

  4. Biomechanical Classification of Judo Throwing Techniques (Nage Waza)

    CERN Document Server

    Sacripanti, Attilio

    2008-01-01

    In this paper the classical mechanical point of view is applied to classify all the movements known as judo throwing techniques; the application of Newtonian physical methods and principles makes it possible to rationalize the whole matter and to group under only two very simple and clear principles all the movements that were previously grouped in many generic and rather unclear ways.

  5. Modern Multivariate Statistical Techniques Regression, Classification, and Manifold Learning

    CERN Document Server

    Izenman, Alan Julian

    2006-01-01

    Describes the advances in computation and data storage that led to the introduction of many statistical tools for high-dimensional data analysis. Focusing on multivariate analysis, this book discusses nonlinear methods as well as linear methods. It presents an integrated mixture of classical and modern multivariate statistical techniques.

  6. Application of Musical Information Retrieval (MIR) Techniques to Seismic Facies Classification. Examples in Hydrocarbon Exploration

    Directory of Open Access Journals (Sweden)

    Paolo Dell’Aversana

    2016-12-01

    Full Text Available In this paper, we introduce a novel approach for automatic pattern recognition and classification of geophysical data based on digital music technology. We import and apply in the geophysical domain the same approaches commonly used for Musical Information Retrieval (MIR). After accurate conversion from geophysical formats (example: SEG-Y) to musical formats (example: Musical Instrument Digital Interface, or briefly MIDI), we extract musical features from the converted data. These can be single-valued attributes, such as pitch and sound intensity, or multi-valued attributes, such as pitch histograms and melodic, harmonic and rhythmic paths. Using a real data set, we show that these musical features can be diagnostic for seismic facies classification in a complex exploration area. They can be complementary with respect to “conventional” seismic attributes. Using a supervised machine learning approach based on the k-Nearest Neighbors algorithm and on Artificial Neural Networks, we classify three gas-bearing channels. The good performance of our classification approach is confirmed by borehole data available in the same area.

  7. Spectral Biomimetic Technique for Wood Classification Inspired by Human Echolocation

    Directory of Open Access Journals (Sweden)

    Juan Antonio Martínez Rojas

    2012-01-01

    Full Text Available Palatal clicks are of particular interest for human echolocation. Moreover, these sounds are suitable for other acoustic applications due to their regular mathematical properties and reproducibility. Simple and nondestructive techniques, bioinspired by synthesized pulses whose form reproduces the best features of palatal clicks, can be developed. The use of synthetic palatal pulses also allows detailed studies of the real possibilities of acoustic human echolocation without the problems associated with subjective individual differences. These techniques are being applied to the study of wood. As an example, a comparison of the performance of both natural and synthetic human echolocation in identifying three different species of wood is presented. The results show that human echolocation has a vast potential.

  8. A Study on Approaches of Classification Supervision on Property Insurance Companies with the Analytic Hierarchy Model

    Institute of Scientific and Technical Information of China (English)

    王智鑫; 罗军; 龙胤

    2012-01-01

    According to the Interim Measures for Classification Supervision of Insurance Company Branches issued by the China Insurance Regulatory Commission and the Measures for Classification Supervision of Shaanxi Provincial Insurance Companies (Consultative Draft) issued by the Shaanxi Bureau of the CIRC, this paper uses 2010 data from secondary-level property insurance institutions in Shaanxi Province, analyzes and evaluates the relevant indicators with an analytic hierarchy model, and compares the results with those of the Shaanxi Bureau's draft measures, so as to establish an effective and scientific approach to the classification supervision of provincial branches of property insurance companies, actively explore new methods and ideas for their classification, and provide a scientific methodological basis and theoretical reference for establishing and improving the classification supervision system for secondary institutions of property insurance companies.

  9. Effectiveness of the bucco-lingual technique within a school-based supervised toothbrushing program on preventing caries: a randomized controlled trial

    Directory of Open Access Journals (Sweden)

    Frazão Paulo

    2011-03-01

    Full Text Available Abstract Background Supervised toothbrushing programs using fluoride dentifrice have reduced caries increment. However, there is no information about the effectiveness of the professional cross-brushing technique within a community intervention. The aim was to assess whether the bucco-lingual technique can increase the effectiveness of a school-based supervised toothbrushing program on preventing caries. Methods A randomized double-blinded controlled community intervention trial, to be analyzed at an individual level, was conducted in a Brazilian low-income fluoridated area. Six preschools were randomly assigned to the test and control groups and 284 five-year-old children presenting at least one permanent molar with an emerged/sound occlusal surface participated. In the control group, oral health education and dental plaque dyeing followed by toothbrushing with fluoride dentifrice, supervised directly by a dental assistant, was carried out four times per year. On the remaining school days the children brushed their teeth under the indirect supervision of the teachers. In the test group, children also underwent professional cross-brushing on the surfaces of the first permanent molars, rendered by a specially trained dental assistant five times per year. Enamel and dentin caries were recorded on the buccal, occlusal and lingual surfaces of permanent molars during an 18-month follow-up. Exposure time of surfaces was calculated and the incidence density ratio was estimated using a Poisson regression model. Results A difference of 21.6 lesions per 1,000 children between control and test groups was observed. Among boys, whose caries risk was higher compared to girls, incidence density was 50% lower in the test group (p = 0.016). Conclusion The modified program was effective among the boys. It is reasonable to project a relevant effect over a longer period, suggesting a substantial reduction of dental care needs in a broader population. Trial registration ISRCTN18548869.

  10. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in earlier days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip, which stimulates progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on k-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique which combines the discrete wavelet transform (DWT) and the moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. The proposed approach can serve as an automated system for the classification of cancer and can be applied by doctors in real cases, which serves as a boon to the medical community. This work further reduces the misclassification of cancers, which is highly undesirable in cancer detection.
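    The sketch below approximates the feature-extraction and ensemble stages described above: PyWavelets supplies the DWT, a moving-window mean stands in for the MWT selection step, and a soft-voting ensemble of KNN, naive Bayes and SVM stands in for the hybrid classifier. The synthetic expression matrix is illustrative only.

      import numpy as np
      import pywt
      from sklearn.ensemble import VotingClassifier
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.naive_bayes import GaussianNB
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      # Synthetic "microarray": 80 samples x 1024 genes, two classes with a shifted block.
      X = rng.normal(0, 1, (80, 1024))
      y = np.array([0] * 40 + [1] * 40)
      X[y == 1, 100:140] += 1.5

      def dwt_window_features(x, wavelet="db4", level=4, window=8):
          """DWT approximation coefficients summarized by a moving-window mean (a stand-in
          for the DWT + moving-window selection step described above)."""
          approx = pywt.wavedec(x, wavelet, level=level)[0]
          usable = len(approx) - len(approx) % window
          return approx[:usable].reshape(-1, window).mean(axis=1)

      F = np.array([dwt_window_features(row) for row in X])
      ensemble = VotingClassifier([("knn", KNeighborsClassifier(3)),
                                   ("nb", GaussianNB()),
                                   ("svm", SVC(probability=True))], voting="soft")
      print(cross_val_score(ensemble, F, y, cv=5).mean())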

  11. REMOTE SENSING IMAGE CLASSIFICATION WITH GIS DATA BASED ON SPATIAL DATA MINING TECHNIQUES

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Data mining techniques are used to discover knowledge from a GIS database in order to improve remote sensing image classification. Two learning granularities are proposed for inductive learning from spatial data: one is spatial object granularity, the other is pixel granularity. We also present an approach to combine inductive learning with conventional image classification methods, which selects the class probability of Bayes classification as learning attributes. A land use classification experiment is performed in the Beijing area using a SPOT multi-spectral image and GIS data. Rules about spatial distribution patterns and shape features are discovered by the C5.0 inductive learning algorithm and the image is then reclassified by deductive reasoning. Compared with the result produced only by Bayes classification, the overall accuracy increased by 11% and the accuracy of some classes, such as garden and forest, increased by about 30%. The results indicate that inductive learning can resolve spectral confusion to a great extent. Combining the Bayes method with inductive learning not only improves classification accuracy greatly, but also extends the classification by subdividing some classes with the discovered knowledge.
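    The combination idea above, feeding Bayes class probabilities (together with GIS attributes) into an inductive learner, can be sketched as follows; the synthetic spectral and GIS variables and the decision tree used in place of C5.0 are assumptions of this illustration.

      import numpy as np
      from sklearn.naive_bayes import GaussianNB
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import accuracy_score

      rng = np.random.default_rng(0)
      n = 1000
      spectral = rng.normal(0, 1, (n, 4))                     # stand-in for SPOT bands
      gis      = rng.normal(0, 1, (n, 2))                     # stand-in for GIS attributes
      y = (spectral[:, 0] + 0.8 * gis[:, 0] + rng.normal(0, 0.5, n) > 0).astype(int)

      X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(spectral, y, gis, random_state=0)

      bayes = GaussianNB().fit(X_tr, y_tr)
      # Class probabilities from the Bayes classifier become learning attributes,
      # alongside the GIS variables, for an inductive (tree) learner.
      tree = DecisionTreeClassifier(max_depth=4, random_state=0)
      tree.fit(np.hstack([bayes.predict_proba(X_tr), g_tr]), y_tr)

      print("Bayes only:", accuracy_score(y_te, bayes.predict(X_te)))
      print("Combined  :", accuracy_score(y_te, tree.predict(np.hstack([bayes.predict_proba(X_te), g_te]))))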

  12. Using Data Mining Techniques to Build a Classification Model for Predicting Employees Performance

    Directory of Open Access Journals (Sweden)

    Qasem A. Al-Radaideh

    2012-02-01

    Full Text Available Human capital is of high concern for companies' management, whose main interest is in hiring highly qualified personnel who are expected to perform well. Recently, there has been a growing interest in the data mining area, where the objective is the discovery of knowledge that is correct and of high benefit for users. In this paper, data mining techniques were utilized to build a classification model to predict the performance of employees. To build the classification model the CRISP-DM data mining methodology was adopted. The decision tree was the main data mining tool used to build the classification model, from which several classification rules were generated. To validate the generated model, several experiments were conducted using real data collected from several companies. The model is intended to be used for predicting new applicants' performance.

  13. Authentication of Galician (N.W. Spain) quality brand potatoes using metal analysis. Classical pattern recognition techniques versus a new vector quantization-based classification procedure.

    Science.gov (United States)

    Peña, R M; García, S; Iglesias, R; Barro, S; Herrero, C

    2001-12-01

    The objective of this work was to develop a classification system in order to confirm the authenticity of Galician potatoes with a Certified Brand of Origin and Quality (CBOQ) and to differentiate them from other potatoes that do not have this quality brand. Elemental analysis (K, Na, Rb, Li, Zn, Fe, Mn, Cu, Mg and Ca) of potatoes was performed by atomic spectroscopy on 307 samples belonging to two categories, CBOQ and Non-CBOQ potatoes. The 307 x 10 data set was evaluated employing multivariate chemometric techniques, such as cluster analysis and principal component analysis, in order to perform a preliminary study of the data structure. Different classification systems for the two categories on the basis of the chemical data were obtained by applying several commonly used supervised pattern recognition procedures [such as linear discriminant analysis, K-nearest neighbours (KNN), soft independent modelling of class analogy and multilayer feed-forward neural networks]. In spite of the fact that some of these classification methods produced satisfactory results, the particular data distribution in the 10-dimensional space led to the proposal of a new vector quantization-based classification procedure (VQBCP). The results achieved with this new approach (percentages of recognition and prediction abilities > 97%) were better than those attained by KNN and compare advantageously with those provided by LDA (linear discriminant analysis), SIMCA (soft independent modelling of class analogy) and MLF-ANN (multilayer feed-forward neural networks). The new VQBCP demonstrated good performance by carrying out adequate classifications in a data set in which the classes are subgrouped. The metal profiles of potatoes provided sufficient information to enable classification criteria to be developed for classifying samples on the basis of their origin and brand.

  14. Clinical supervision.

    Science.gov (United States)

    Goorapah, D

    1997-05-01

    The introduction of clinical supervision to a wider sphere of nursing is being considered from a professional and organizational point of view. Positive views are being expressed about adopting this concept, although there are indications to suggest that there are also strong reservations. This paper examines the potential for its success amidst the scepticism that exists. One important question raised is whether clinical supervision will replace or run alongside other support systems.

  15. An Improved Image Mining Technique For Brain Tumour Classification Using Efficient Classifier

    Directory of Open Access Journals (Sweden)

    P. Rajendran

    2009-12-01

    Full Text Available An improved image mining technique for brain tumor classification using a pruned association rule with the MARI algorithm is presented in this paper. The proposed method makes use of the association rule mining technique to classify CT scan brain images into three categories, namely normal, benign and malignant. It combines the low-level features extracted from images and high-level knowledge from specialists. The developed algorithm can assist physicians in efficient classification with multiple keywords per image to improve the accuracy. The experimental result on a pre-diagnosed database of brain images showed 96% sensitivity and 93% accuracy, respectively. Keywords: Data mining; Image mining; Association rule mining; Medical imaging; Medical image diagnosis; Classification

  16. A Classification Technique for Microarray Gene Expression Data using PSO-FLANN

    Directory of Open Access Journals (Sweden)

    Jayashree Dev

    2012-09-01

    Full Text Available Despite an increased global effort to end breast cancer, it continues to be one of the most common causes of cancer death in women. This problem reminds us that new therapeutic approaches are desperately needed to improve patient survival rates. This requires proper diagnosis of the disease and classification of the tumor type based on genomic information, according to which proper treatment can be provided to the patient. A number of classification techniques exist to classify tumor types. In this paper we focus on three different classification techniques: BPN, FLANN and PSO-FLANN, and find that the integrated approach of the Functional Link Artificial Neural Network (FLANN) and Particle Swarm Optimization (PSO) can better predict the disease compared to the other methods.

  17. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  18. Semi-supervised classification of remote sensing images based on a probabilistic topic model

    Institute of Scientific and Technical Information of China (English)

    易文斌; 冒亚明; 慎利

    2013-01-01

    Land cover is at the center of the interaction between the natural environment and human activities, and land cover information is mainly obtained through the classification of remote sensing images, so image classification is one of the most basic issues of remote sensing image analysis. Building on probabilistic-topic-model-based clustering analysis of high-resolution remote sensing images, the generative model, a typical method in semi-supervised learning, is analyzed, and a classification method based on a probabilistic topic model and semi-supervised learning (SS-LDA) is formulated in this paper. The process of the SS-LDA model used in text recognition applications is adapted, and a basic image classification workflow for high-resolution remote sensing images is constructed. Experiments demonstrate that, compared with traditional unsupervised classification and supervised classification algorithms, the SS-LDA algorithm obtains more accurate image classification results.

  19. Clustering technique-based least square support vector machine for EEG signal classification.

    Science.gov (United States)

    Siuly; Li, Yan; Wen, Peng Paul

    2011-12-01

    This paper presents a new approach called clustering technique-based least square support vector machine (CT-LS-SVM) for the classification of EEG signals. Decision making is performed in two stages. In the first stage, clustering technique (CT) has been used to extract representative features of EEG data. In the second stage, least square support vector machine (LS-SVM) is applied to the extracted features to classify two-class EEG signals. To demonstrate the effectiveness of the proposed method, several experiments have been conducted on three publicly available benchmark databases, one for epileptic EEG data, one for mental imagery tasks EEG data and another one for motor imagery EEG data. Our proposed approach achieves an average sensitivity, specificity and classification accuracy of 94.92%, 93.44% and 94.18%, respectively, for the epileptic EEG data; 83.98%, 84.37% and 84.17% respectively, for the motor imagery EEG data; and 64.61%, 58.77% and 61.69%, respectively, for the mental imagery tasks EEG data. The performance of the CT-LS-SVM algorithm is compared in terms of classification accuracy and execution (running) time with our previous study where simple random sampling with a least square support vector machine (SRS-LS-SVM) was employed for EEG signal classification. We also compare the proposed method with other existing methods in the literature for the three databases. The experimental results show that the proposed algorithm can produce a better classification rate than the previous reported methods and takes much less execution time compared to the SRS-LS-SVM technique. The research findings in this paper indicate that the proposed approach is very efficient for classification of two-class EEG signals.
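    A rough sketch of the two-stage idea above is given below: per-epoch clustering supplies compact representative features, and a standard RBF SVM stands in for the least square SVM of the paper. The toy EEG-like signals and the sorted-centroid feature are assumptions for illustration.

      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(0)
      # Toy two-class "EEG" epochs: class 1 has extra high-frequency content.
      t = np.linspace(0, 1, 512)
      class0 = [np.sin(2 * np.pi * 10 * t) + rng.normal(0, 0.5, t.size) for _ in range(60)]
      class1 = [np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 35 * t)
                + rng.normal(0, 0.5, t.size) for _ in range(60)]

      def cluster_features(epoch, k=8):
          """Stage 1: cluster the samples of one epoch and use the sorted centroids as a
          compact, representative feature vector (the clustering step of the approach above)."""
          km = KMeans(n_clusters=k, n_init=5, random_state=0).fit(epoch.reshape(-1, 1))
          return np.sort(km.cluster_centers_.ravel())

      X = np.array([cluster_features(np.asarray(e)) for e in class0 + class1])
      y = np.array([0] * 60 + [1] * 60)
      # Stage 2: a standard SVM stands in here for the least square SVM used in the paper.
      print(cross_val_score(make_pipeline(StandardScaler(), SVC()), X, y, cv=5).mean())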

  20. Classification of remotely sensed data using OCR-inspired neural network techniques. [Optical Character Recognition

    Science.gov (United States)

    Kiang, Richard K.

    1992-01-01

    Neural networks have been applied to classifications of remotely sensed data with some success. To improve the performance of this approach, an examination was made of how neural networks are applied to the optical character recognition (OCR) of handwritten digits and letters. A three-layer, feedforward network, along with techniques adopted from OCR, was used to classify Landsat-4 Thematic Mapper data. Good results were obtained. To overcome the difficulties that are characteristic of remote sensing applications and to attain significant improvements in classification accuracy, a special network architecture may be required.

  1. Pattern Classification: An Improvement Using Combination of VQ and PCA Based Techniques

    Directory of Open Access Journals (Sweden)

    Alok Sharma

    2005-01-01

    Full Text Available This study firstly presents a survey on basic classifiers namely Minimum Distance Classifier (MDC), Vector Quantization (VQ), Principal Component Analysis (PCA), Nearest Neighbor (NN) and K-Nearest Neighbor (KNN). Then Vector Quantized Principal Component Analysis (VQPCA), which is generally used for representation purposes, is considered for performing classification tasks. Some classifiers achieve high classification accuracy but their data storage requirement and processing time are severely expensive. On the other hand some methods for which storage and processing time are economical do not provide sufficient levels of classification accuracy. In both the cases the performance is poor. By considering the limitations involved in the classifiers we have developed the Linear Combined Distance (LCD) classifier which is the combination of VQ and VQPCA techniques. The proposed technique is effective and outperforms all the other techniques in terms of getting high classification accuracy at very low data storage requirement and processing time. This would allow an object to be accurately classified as quickly as possible using very low data storage capacity.

  2. Advanced Music Therapy Supervision Training

    DEFF Research Database (Denmark)

    2009-01-01

    The presentation will illustrate training models in supervision for experienced music therapists where transference/counter-transference issues are in focus. Musical, verbal and body related tools will be illustrated from supervision practice by the presenters. A possibility to experience small supervision training excerpts live in the workshop will be offered. The workshop will include demonstrations of a variety of supervision methods and techniques used in a) post graduate music therapy training programs and b) a variety of work contexts such as psychiatry and somatic music psychotherapy.

  4. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,”“categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  5. Segmentation techniques evaluation based on a single compact breast mass classification scheme

    Science.gov (United States)

    Matheus, Bruno R. N.; Marcomini, Karem D.; Schiabel, Homero

    2016-03-01

    In this work some segmentation techniques are evaluated by using a simple centroid-based classification system regarding breast mass delineation in digital mammography images. The aim is to determine the best one for future CADx developments. Six techniques were tested: Otsu, SOM, EICAMM, Fuzzy C-Means, K-Means and Level-Set. All of them were applied to segment 317 mammography images from the DDSM database. A single compact set of attributes was extracted and two centroids were defined, one for malignant and another for benign cases. The final classification was based on proximity to a given centroid, and the best results were obtained with the Level-Set technique, with 68.1% accuracy, which indicates this method as the most promising for breast mass segmentation aiming at a more precise interpretation in CADx schemes.
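
    The centroid-based classification scheme used to compare the segmentation methods can be sketched in a few lines. This is a hedged illustration with synthetic mass features; the actual DDSM-derived attribute set is not reproduced.

        # Sketch of the single compact centroid-based classifier: one centroid per
        # class, assignment by proximity. Features below are synthetic stand-ins.
        import numpy as np

        rng = np.random.default_rng(2)
        benign = rng.normal(loc=[0.3, 0.4, 0.2], scale=0.1, size=(60, 3))
        malign = rng.normal(loc=[0.6, 0.7, 0.5], scale=0.1, size=(60, 3))

        c_benign, c_malign = benign.mean(axis=0), malign.mean(axis=0)   # the two centroids

        def classify(feature_vector):
            d_b = np.linalg.norm(feature_vector - c_benign)
            d_m = np.linalg.norm(feature_vector - c_malign)
            return "benign" if d_b < d_m else "malignant"

        print(classify(np.array([0.35, 0.42, 0.25])))   # expected: benign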

  6. Developing an efficient technique for satellite image denoising and resolution enhancement for improving classification accuracy

    Science.gov (United States)

    Thangaswamy, Sree Sharmila; Kadarkarai, Ramar; Thangaswamy, Sree Renga Raja

    2013-01-01

    Satellite images are corrupted by noise during image acquisition and transmission. Removing noise from an image by attenuating the high-frequency image components removes important details as well. In order to retain the useful information, improve the visual appearance, and accurately classify an image, an effective denoising technique is required. We discuss three important steps, namely image denoising, resolution enhancement, and classification, for improving accuracy in a noisy image. An effective denoising technique, hybrid directional lifting, is proposed to retain the important details of the images and improve visual appearance. A discrete wavelet transform based interpolation is developed for enhancing the resolution of the denoised image. The image is then classified using a support vector machine, which is superior to other neural network classifiers. Quantitative performance measures such as peak signal to noise ratio and classification accuracy show the significance of the proposed techniques.

  7. Whither Supervision?

    Directory of Open Access Journals (Sweden)

    Duncan Waite

    2006-11-01

    Full Text Available This paper asks whether school supervision is in decline. Dr. Waite responds that the answer depends on the perspective from which it is examined. He suggests considering three interrelated elements: the field itself; the experts in the field (the professor, the theorist, the student and the administrator); and the context. Reviewing these elements, he notes that there is no consensus about the field of supervision, but there is agreement on its importance and on its connection to improving students' practice in schools for their benefit. Dr. Waite observes that practice in this field is not always in harmony with what theorists affirm. Regarding the supervisor or expert, he indicates that their perspective depends on their epistemological beliefs and on the way they conceive of learning, which is why supervision can be understood in different ways. Regarding the context, he argues that the social and external forces that influence people and society must be taken into consideration, because education is affected through them. Dr. Waite concludes that the way supervision is understood depends on the practitioner's perspective. He answers the initial question by saying that the supervision authorities, the knowledge in this field, its practitioners, and its practice may be dispersed but are not extinct, because supervision will always be part of the great enterprise that we call education.

  8. Pre-trained Convolutional Networks and generative statistical models: a study in semi-supervised learning

    OpenAIRE

    John Michael Salgado Cebola

    2016-01-01

    Comparative study of the performance of Convolutional Networks using pre-trained models and of statistical generative models on image classification tasks in semi-supervised environments. Study of multiple ensembles using these techniques and of data generated from estimated pdfs. Pre-trained ConvNets, LDA, pLSA, Fisher Vectors, sparse-coded SPMs and TSVMs are the key models worked upon.

  9. An Empirical Study of the Applications of Classification Techniques in Students Database

    Directory of Open Access Journals (Sweden)

    Tariq O. Fadl Elsid

    2014-10-01

    Full Text Available University servers and databases store a huge amount of data, including personal details, registration details, evaluation assessments, performance profiles, and much more for students and lecturers alike. The main problem facing any system administrator or user is that data grow every second and are stored in different types and formats on the servers. Learning about students from such data, anticipating graduation and academic outcomes, and maintaining the structure and content of courses according to previous results are therefore becoming important. The objectives of the paper are to extract knowledge from incompletely structured data and to determine which data mining method or technique is suitable for extracting knowledge from a large amount of student data, so as to help the administration use technology to make quick decisions. Data mining aims to discover useful information or knowledge by applying one of its techniques; this paper uses the classification technique to discover knowledge from the students' server database, where all student information is registered and stored. The classification task uses the C4.5 decision tree classifier to predict the students' final academic results (grades). The data cover a four-year period [2006-2009]. Experimental results show that the classification process succeeded on the training set, with predicted instances similar to the training set, which supports the suggested classification model. The efficiency and effectiveness of the C4.5 algorithm in predicting academic results (grades) is very good, and the model can also improve the efficiency of retrieving academic results and evidently promote retrieval precision.
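
    As a rough illustration of the grade-prediction setup, the sketch below trains a decision tree on a handful of hypothetical student attributes. scikit-learn's DecisionTreeClassifier with the entropy criterion is used as a stand-in for C4.5, which scikit-learn does not implement, and all column names and values are invented for the example.

        # Hedged sketch of grade prediction with a decision tree (C4.5 stand-in).
        # The data frame below is entirely hypothetical.
        import pandas as pd
        from sklearn.tree import DecisionTreeClassifier

        data = pd.DataFrame({
            "attendance":  [90, 60, 75, 95, 50, 85, 70, 98, 40, 88],
            "midterm":     [70, 45, 60, 88, 35, 72, 55, 91, 30, 80],
            "assignments": [ 8,  4,  6,  9,  3,  7,  5, 10,  2,  9],
            "grade":       ["B", "D", "C", "A", "F", "B", "C", "A", "F", "A"],
        })
        X, y = data.drop(columns="grade"), data["grade"]

        tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
        tree.fit(X, y)

        new_student = pd.DataFrame({"attendance": [82], "midterm": [68], "assignments": [7]})
        print("predicted grade:", tree.predict(new_student)[0])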

  10. Application of intelligent techniques for classification of bacteria using protein sequence-derived features.

    Science.gov (United States)

    Banerjee, Amit Kumar; Ravi, Vadlamani; Murty, U S N; Sengupta, Neelava; Karuna, Batepatti

    2013-07-01

    Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. Modern artificial intelligence-based techniques, such as radial basis functions, genetic algorithms, artificial neural networks, and support vector machines, are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to a single molecular parameter. This study was conducted with a dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed using the RapidMiner data mining platform. Support vector machines proved to be the best method, with a maximum accuracy of more than 91%.

  11. PERFORMANCE EVALUATION OF THE DATA MINING CLASSIFICATION METHODS

    Directory of Open Access Journals (Sweden)

    CRISTINA OPREA

    2014-05-01

    Full Text Available The paper analyzes the performance evaluation of different classification models in the data mining process. Classification is the most widely used data mining technique for supervised learning. It is the process of identifying a set of features and templates that describe the data classes or concepts. We applied various classification algorithms to different data sets to streamline and improve algorithm performance.
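
    A minimal version of such a performance evaluation is sketched below: several standard classifiers are compared on one dataset with the same cross-validation protocol. The bundled iris data and default hyper-parameters are assumptions made purely for illustration.

        # Illustrative comparison of classification algorithms under one protocol.
        from sklearn.datasets import load_iris
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import GaussianNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.svm import SVC

        X, y = load_iris(return_X_y=True)
        models = {
            "naive Bayes":   GaussianNB(),
            "k-NN":          KNeighborsClassifier(n_neighbors=5),
            "decision tree": DecisionTreeClassifier(random_state=0),
            "SVM (RBF)":     SVC(),
        }
        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=5)      # 5-fold accuracy
            print(f"{name:>13s}: {scores.mean():.3f} +/- {scores.std():.3f}")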

  12. FEATURE FUSION TECHNIQUE FOR COLOUR TEXTURE CLASSIFICATION SYSTEM BASED ON GRAY LEVEL CO-OCCURRENCE MATRIX

    OpenAIRE

    Shunmuganathan, K. L.; A. Suresh

    2012-01-01

    In this study, an efficient feature fusion based technique for the classification of colour texture images in the VisTex album is presented. The Gray Level Co-occurrence Matrix (GLCM) and its associated texture features contrast, correlation, energy and homogeneity are used in the proposed approach. The proposed GLCM texture features are obtained from the original colour texture as well as the first non-singleton dimension of the same image. These features are fused at feature level to classify the colour texture image using a nearest neighbor classifier.

  13. Automatic Defect Detection and Classification Technique from Image: A Special Case Using Ceramic Tiles

    OpenAIRE

    Rahaman, G. M. Atiqur; Hossain, Md. Mobarak

    2009-01-01

    Quality control is an important issue in the ceramic tile industry. On the other hand, maintaining the rate of production with respect to time is also a major issue in ceramic tile manufacturing. Again, the price of ceramic tiles also depends on purity of texture, accuracy of color, shape, etc. Considering these criteria, an automated defect detection and classification technique has been proposed in this report that can help ensure better quality of tiles in the manufacturing process as well as the production rate.

  14. AN IMPROVED TECHNIQUE FOR IDENTIFICATION AND CLASSIFICATION OF BRAIN DISORDER FROM MRI BRAIN IMAGE

    Directory of Open Access Journals (Sweden)

    Finitha Joseph

    2015-11-01

    Full Text Available Medical image processing has been developing rapidly due to its wide applications, and an efficient MRI image segmentation method is needed at present. In this paper, MRI brain segmentation is done by semi-supervised learning, which does not require pathology modelling and thus allows a high degree of automation. In abnormality detection, a vector is characterized as anomalous if it does not comply with the probability distribution obtained from normal data. The estimation of the probability density function, however, is usually not feasible due to large data dimensionality. In order to overcome this challenge, we treat every image as a network of locally coherent image partitions (overlapping blocks). We formulate and maximize a strictly concave likelihood function estimating abnormality for each partition and fuse the local estimates into a globally optimal estimate that satisfies the consistency constraints, based on a distributed estimation algorithm. After this, features are extracted by the Gray-Level Co-occurrence Matrix (GLCM) algorithm and given to Particle Swarm Optimization (PSO), and finally classification is done using a Library Support Vector Machine (LIBSVM). The results are evaluated and the method's efficiency is demonstrated in terms of accuracy.

  15. Verdict Accuracy of Quick Reduct Algorithm using Clustering and Classification Techniques for Gene Expression Data

    Directory of Open Access Journals (Sweden)

    T.Chandrasekhar

    2012-01-01

    Full Text Available In most gene expression data, the number of training samples is very small compared to the large number of genes involved in the experiments. However, among the large number of genes, only a small fraction is effective for performing a certain task. Furthermore, a small subset of genes is desirable in developing gene expression based diagnostic tools for delivering reliable and understandable results. With the gene selection results, the cost of biological experiments and decisions can be greatly reduced by analyzing only the marker genes. An important application of gene expression data in functional genomics is to classify samples according to their gene expression profiles. Feature selection (FS) is a process which attempts to select more informative features. It is one of the important steps in knowledge discovery. Conventional supervised FS methods evaluate various feature subsets using an evaluation function or metric to select only those features which are related to the decision classes of the data under consideration. This paper studies a feature selection method based on rough set theory. Further, the K-Means and Fuzzy C-Means (FCM) algorithms have been implemented for the reduced feature set without considering class labels, and the obtained results are compared with the original class labels. A Back Propagation Network (BPN) has also been used for classification. The performance of K-Means, FCM, and BPN is then analyzed through the confusion matrix, and it is found that BPN performs comparatively well.

  16. Study on Web page content classification technology based on semi-supervised learning

    Institute of Scientific and Technical Information of China (English)

    赵夫群

    2016-01-01

    For the key issue of how to use labeled and unlabeled data for Web page classification, a classifier combining a generative model with a discriminative model is explored. Maximum likelihood estimation is adopted on the unlabeled training set to construct a semi-supervised classifier with good classification performance. A Dirichlet-multinomial mixture distribution is used to model the text, and a hybrid model suitable for semi-supervised learning is proposed. Since the EM algorithm for semi-supervised learning converges quickly and easily falls into local optima, two intelligent optimization methods, the simulated annealing algorithm and the genetic algorithm, are introduced and analysed. Combining these two algorithms yields a new intelligent semi-supervised classification algorithm, and the feasibility of the algorithm is verified.
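
    A heavily simplified sketch of the generative semi-supervised idea is given below: a multinomial naive Bayes text model is fitted on labeled pages and then repeatedly re-fitted on its own hard label estimates for the unlabeled ones (a hard-EM/self-training loop). The simulated annealing and genetic refinements described in the paper are not reproduced, and the tiny documents are invented.

        # Hard-EM self-training with a multinomial naive Bayes generative model.
        # This is a hedged, simplified stand-in for the paper's hybrid model.
        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB

        labelled = ["cheap offer buy now", "limited offer discount",
                    "meeting agenda attached", "project review meeting"]
        labels = np.array([1, 1, 0, 0])                  # 1 = promo page, 0 = work page
        unlabelled = ["discount offer today", "agenda for the review", "buy cheap today"]

        vec = CountVectorizer()
        X_lab = vec.fit_transform(labelled)
        X_unl = vec.transform(unlabelled)

        nb = MultinomialNB().fit(X_lab, labels)          # initial fit on labelled pages
        for _ in range(5):                               # a few hard-EM rounds
            pseudo = nb.predict(X_unl)                   # E-step: hard label estimates
            X_all = np.vstack([X_lab.toarray(), X_unl.toarray()])
            y_all = np.concatenate([labels, pseudo])
            nb = MultinomialNB().fit(X_all, y_all)       # M-step on all documents

        print(dict(zip(unlabelled, nb.predict(X_unl))))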

  17. Semi-supervised binary classification algorithm based on global and local regularization

    Institute of Scientific and Technical Information of China (English)

    吕佳

    2012-01-01

    For semi-supervised classification problems, it is difficult to obtain a good classification function over the entire input space if global learning is used alone, while local learning alone can yield a good classification function only on certain regions of the input space. Accordingly, a new semi-supervised binary classification algorithm based on a mixture of local and global regularization is presented in this paper. The algorithm integrates the benefits of the global regularizer and the local regularizer: the global regularizer is built to smooth the class labels of the data so as to lessen the insufficient training of the local regularizer, and the local regularizer, constructed on neighbouring regions, makes the class label of each data point have the desired properties; from these the objective function of the semi-supervised binary classification problem is constructed. Comparative semi-supervised binary classification experiments on benchmark datasets show that the average classification accuracy and standard error of the proposed algorithm are clearly superior to those of methods based on the Laplacian regularizer, the regularized Laplacian regularizer and the local-learning regularizer.
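
    The sketch below shows a related, but not identical, graph-based formulation: a global smoothness (graph Laplacian) term is minimised subject to the labeled points, i.e. the classic harmonic label-propagation solution. The paper's specific combination of global and local regularizers is not reproduced; this is only meant to make the "smooth the class labels over the data" idea concrete.

        # Harmonic (graph Laplacian) label propagation as a related sketch of the
        # global-smoothness term; not the paper's exact mixed regularization.
        import numpy as np

        rng = np.random.default_rng(3)
        X = np.vstack([rng.normal(-2, 0.8, size=(30, 2)), rng.normal(2, 0.8, size=(30, 2))])
        y = np.full(60, -1)                       # -1 = unlabelled
        y[0], y[30] = 0, 1                        # one labelled point per class

        # RBF affinity matrix and its Laplacian.
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / 2.0)
        np.fill_diagonal(W, 0.0)
        L = np.diag(W.sum(1)) - W

        labelled = y >= 0
        F = np.zeros(60)
        F[labelled] = y[labelled]
        # Harmonic solution: minimise F^T L F subject to the labelled values.
        L_uu = L[np.ix_(~labelled, ~labelled)]
        L_ul = L[np.ix_(~labelled, labelled)]
        F[~labelled] = np.linalg.solve(L_uu, -L_ul @ F[labelled])

        pred = (F > 0.5).astype(int)
        print("labels of the two clusters:", pred[:5], pred[30:35])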

  18. Partial imputation to improve predictive modelling in insurance risk classification using a hybrid positive selection algorithm and correlation-based feature selection

    CSIR Research Space (South Africa)

    Duma, M

    2013-09-01

    Full Text Available We propose a hybrid missing data imputation technique using positive selection and correlation-based feature selection for insurance data. The hybrid is used to help supervised learning methods improve their classification accuracy and resilience...

  19. Hardwood species classification with DWT based hybrid texture feature extraction techniques

    Indian Academy of Sciences (India)

    Arvind R Yadav; R S Anand; M L Dewal; Sangeeta Gupta

    2015-12-01

    In this work, discrete wavelet transform (DWT) based hybrid texture feature extraction techniques have been used to categorize the microscopic images of hardwood species into 75 different classes. Initially, the DWT has been employed to decompose the image up to 7 levels using the Daubechies (db3) wavelet as the decomposition filter. Further, first-order statistics (FOS) and four variants of local binary pattern (LBP) descriptors are used to acquire distinct features of these images at various levels. The linear support vector machine (SVM), radial basis function (RBF) kernel SVM and random forest classifiers have been employed for classification. The classification accuracies obtained with state-of-the-art and DWT based hybrid texture features using various classifiers are compared. The DWT based FOS-uniform local binary pattern (DWTFOSLBPu2) texture features at the 4th level of image decomposition have produced the best classification accuracy of 97.67 ± 0.79% and 98.40 ± 0.64% for grayscale and RGB images, respectively, using the linear SVM classifier. Reduction of the feature dataset by the minimal redundancy maximal relevance (mRMR) feature selection method is achieved, and the best classification accuracies of 99.00 ± 0.79% and 99.20 ± 0.42% have been obtained for the DWT based FOS-LBP histogram Fourier features (DWTFOSLBP-HF) technique at the 5th and 6th levels of image decomposition for grayscale and RGB images, respectively, using the linear SVM classifier. The DWTFOSLBP-HF features selected with the mRMR method have also established superiority amongst the DWT based hybrid texture feature extraction techniques for the database randomly divided into different proportions of training and test datasets.

  20. Qualitative classification of milled rice grains using computer vision and metaheuristic techniques.

    Science.gov (United States)

    Zareiforoush, Hemad; Minaei, Saeid; Alizadeh, Mohammad Reza; Banakar, Ahmad

    2016-01-01

    Qualitative grading of milled rice grains was carried out in this study using a machine vision system combined with some metaheuristic classification approaches. Images of four different classes of milled rice, including Low-processed sound grains (LPS), Low-processed broken grains (LPB), High-processed sound grains (HPS), and High-processed broken grains (HPB), representing quality grades of the product, were acquired using a computer vision system. Four different metaheuristic classification techniques, namely artificial neural networks, support vector machines, decision trees and Bayesian networks, were utilized to classify milled rice samples. Results of the validation process indicated that an artificial neural network with 12-5*4 topology had the highest classification accuracy (98.72 %). Support vector machine with the Universal Pearson VII kernel function (98.48 %), decision tree with the REP algorithm (97.50 %), and Bayesian network with the Hill Climber search algorithm (96.89 %) followed in accuracy, respectively. Results presented in this paper can be utilized for developing an efficient system for fully automated classification and sorting of milled rice grains.

  1. A Novel Prostate Cancer Classification Technique Using Intermediate Memory Tabu Search

    Directory of Open Access Journals (Sweden)

    Tahir Muhammad Atif

    2005-01-01

    Full Text Available The introduction of multispectral imaging in pathology problems such as the identification of prostatic cancer is recent. Unlike conventional RGB color space, it allows the acquisition of a large number of spectral bands within the visible spectrum. This results in a feature vector of size greater than 100. For such a high dimensionality, pattern recognition techniques suffer from the well-known curse of dimensionality problem. The two well-known techniques to solve this problem are feature extraction and feature selection. In this paper, a novel feature selection technique using tabu search with an intermediate-term memory is proposed. The cost of a feature subset is measured by leave-one-out correct-classification rate of a nearest-neighbor (1-NN) classifier. The experiments have been carried out on the prostate cancer textured multispectral images and the results have been compared with a reported classical feature extraction technique. The results have indicated a significant boost in the performance both in terms of minimizing features and maximizing classification accuracy.
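
    A minimal tabu-search feature selection loop with the same leave-one-out 1-NN cost can be sketched as follows. Only a short-term tabu list is shown (the intermediate-term memory of the paper is not reproduced), and the data are synthetic rather than multispectral prostate images.

        # Hedged sketch: tabu search over feature subsets scored by LOO 1-NN accuracy.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.model_selection import LeaveOneOut, cross_val_score

        X, y = make_classification(n_samples=60, n_features=15, n_informative=4,
                                   n_redundant=2, random_state=0)

        def loo_1nn_accuracy(mask):
            if not mask.any():
                return 0.0
            knn = KNeighborsClassifier(n_neighbors=1)
            return cross_val_score(knn, X[:, mask], y, cv=LeaveOneOut()).mean()

        rng = np.random.default_rng(0)
        current = rng.random(15) < 0.5                   # random initial subset
        best, best_score = current.copy(), loo_1nn_accuracy(current)
        tabu, tabu_len = [], 5

        for _ in range(15):                              # tabu-search iterations
            moves = [f for f in range(15) if f not in tabu]
            scored = []
            for f in moves:                              # neighbourhood: flip one feature
                cand = current.copy()
                cand[f] = ~cand[f]
                scored.append((loo_1nn_accuracy(cand), f, cand))
            score, flipped, current = max(scored, key=lambda t: t[0])
            tabu = (tabu + [flipped])[-tabu_len:]        # short-term memory only
            if score > best_score:
                best, best_score = current.copy(), score

        print("selected features:", np.flatnonzero(best),
              "LOO accuracy:", round(best_score, 3))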

  2. Review of Intelligent Techniques Applied for Classification and Preprocessing of Medical Image Data

    Directory of Open Access Journals (Sweden)

    H S Hota

    2013-01-01

    Full Text Available Medical image data like ECG, EEG, MRI and CT-scan images are the most important way to diagnose human disease precisely and are widely used by physicians. Problems can be clearly identified with the help of these medical images, and a robust model can classify the medical image data in a better way. In this paper, intelligent techniques like neural networks and fuzzy logic are explored for MRI medical image data to identify tumors in the human brain, and the need for preprocessing of medical image data is also explored. Classification techniques have been used extensively in the field of medical imaging. The conventional method in medical science for classifying medical image data is human inspection, which may result in misclassification; such manual identification is impractical for large amounts of data and for noisy data. Noisy data may be produced by technical faults of the machine or by human error and can lead to misclassification of medical image data. We have collected a number of papers based on neural networks and fuzzy logic, along with hybrid techniques, to explore the efficiency and robustness of the models for brain MRI data. It has been analyzed that an intelligent model combined with data preprocessing using principal component analysis (PCA) and segmentation may be the competitive model in this domain.

  3. A new coordination pattern classification to assess gait kinematics when utilising a modified vector coding technique.

    Science.gov (United States)

    Needham, Robert A; Naemi, Roozbeh; Chockalingam, Nachiappan

    2015-09-18

    A modified vector coding (VC) technique was used to quantify lumbar-pelvic coordination during gait. The outcome measure from the modified VC technique is known as the coupling angle (CA) which can be classified into one of four coordination patterns. This study introduces a new classification for this coordination pattern that expands on a current data analysis technique by introducing the terms in-phase with proximal dominancy, in-phase with distal dominancy, anti-phase with proximal dominancy and anti-phase with distal dominancy. This proposed coordination pattern classification can offer an interpretation of the CA that provides either in-phase or anti-phase coordination information, along with an understanding of the direction of segmental rotations and the segment that is the dominant mover at each point in time. Classifying the CA against the new defined coordination patterns and presenting this information in a traditional time-series format in this study has offered an insight into segmental range of motion. A new illustration is also presented which details the distribution of the CA within each of the coordination patterns and allows for the quantification of segmental dominancy. The proposed illustration technique can have important implications in demonstrating gait coordination data in an easily comprehensible fashion by clinicians and scientists alike.
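
    The coupling angle and the four-way pattern classification can be illustrated with a short sketch. The 45-degree bin logic below follows directly from the definitions (the sign and relative magnitude of the proximal and distal rotations); the exact boundary convention of the paper may differ, so treat the bins as an assumption.

        # Coupling angle from a modified vector-coding calculation, plus a four-way
        # coordination-pattern classification. Bin boundaries are illustrative.
        import numpy as np

        def coupling_angle(proximal, distal):
            """Coupling angle (degrees, 0-360) from consecutive segment angles."""
            d_prox = np.diff(proximal)
            d_dist = np.diff(distal)
            return np.degrees(np.arctan2(d_dist, d_prox)) % 360.0

        def coordination_pattern(ca):
            in_phase = (ca % 180.0) < 90.0               # segment rotations share a sign
            proximal_dominant = (ca % 90.0) < 45.0 if in_phase else (ca % 90.0) >= 45.0
            phase = "in-phase" if in_phase else "anti-phase"
            dom = "proximal" if proximal_dominant else "distal"
            return f"{phase} with {dom} dominancy"

        t = np.linspace(0, 2 * np.pi, 101)
        pelvis = 10 * np.sin(t)                          # toy proximal segment angle
        lumbar = 6 * np.sin(t + 0.3)                     # toy distal segment angle

        ca = coupling_angle(pelvis, lumbar)
        for angle in ca[:3]:
            print(round(float(angle), 1), "->", coordination_pattern(angle))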

  4. Probabilistic neural networks and the polynomial Adaline as complementary techniques for classification.

    Science.gov (United States)

    Specht, D F

    1990-01-01

    Two methods for classification based on the Bayes strategy and nonparametric estimators for probability density functions are reviewed. The two methods are named the probabilistic neural network (PNN) and the polynomial Adaline. Both methods involve one-pass learning algorithms that can be implemented directly in parallel neural network architectures. The performances of the two methods are compared with multipass backpropagation networks, and relative advantages and disadvantages are discussed. PNN and the polynomial Adaline are complementary techniques because they implement the same decision boundaries but have different advantages for applications. PNN is easy to use and is extremely fast for moderate-sized databases. For very large databases and for mature applications in which classification speed is more important than training speed, the polynomial equivalent can be found.
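
    A minimal PNN is easy to write down: every training example contributes a Gaussian Parzen kernel, the kernels are summed per class, and the Bayes decision picks the larger class density. The smoothing parameter and the toy data below are arbitrary choices for illustration.

        # Minimal probabilistic neural network (PNN) sketch with Gaussian kernels.
        import numpy as np

        rng = np.random.default_rng(4)
        X_train = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(3, 1, (40, 2))])
        y_train = np.array([0] * 40 + [1] * 40)

        def pnn_predict(x, sigma=0.8):
            d2 = ((X_train - x) ** 2).sum(axis=1)        # pattern layer: squared distances
            k = np.exp(-d2 / (2 * sigma ** 2))           # Gaussian kernel activations
            scores = [k[y_train == c].mean() for c in (0, 1)]   # summation layer
            return int(np.argmax(scores))                # output (decision) layer

        print(pnn_predict(np.array([0.2, -0.1])))        # expected 0
        print(pnn_predict(np.array([2.8, 3.3])))         # expected 1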

  5. Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

    Science.gov (United States)

    Alexander, Tiffaney Miller

    2017-01-01

    Research results have shown that more than half of aviation, aerospace and aeronautics mishaps and incidents are attributed to human error. As a part of quality within space exploration ground processing operations, the underlying contributors and causes of human error must be identified and classified in order to manage human error. This presentation provides a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and the Human Factor Analysis and Classification System (HFACS) as analysis tools to identify contributing factors, assess their impact on human error events, and predict the Human Error Probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of launch vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

  6. Automatic Defect Detection and Classification Technique from Image: A Special Case Using Ceramic Tiles

    CERN Document Server

    Rahaman, G M Atiqur

    2009-01-01

    Quality control is an important issue in the ceramic tile industry. On the other hand, maintaining the rate of production with respect to time is also a major issue in ceramic tile manufacturing. Again, the price of ceramic tiles also depends on purity of texture, accuracy of color, shape, etc. Considering these criteria, an automated defect detection and classification technique has been proposed in this report that can help ensure better quality of tiles in the manufacturing process as well as the production rate. Our proposed method plays an important role in the ceramic tile industry in detecting defects and controlling the quality of ceramic tiles. This automated classification method helps us to acquire knowledge about the pattern of defects within a very short period of time and also to decide about the recovery process so that the defective tiles are not mixed with the fresh tiles.

  7. 7 CFR 27.73 - Supervision of transfers of cotton.

    Science.gov (United States)

    2010-01-01

    7 CFR 27.73 - Supervision of transfers of cotton. Whenever the owner of any cotton inspected and sampled for classification... be effected under the supervision of an exchange inspection agency or a supervisor of...

  8. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Science.gov (United States)

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
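
    The comparison above can be mimicked on synthetic band values: a per-class Gaussian "maximum likelihood" style classifier (here scikit-learn's quadratic discriminant analysis, used as a stand-in for the remote sensing MLC) next to multinomial (polytomous) logistic regression. Class means and band counts are invented for the example.

        # Hedged side-by-side sketch of a Gaussian per-class classifier and
        # polytomous (multinomial) logistic regression on synthetic band data.
        import numpy as np
        from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(5)
        means = np.array([[50, 80, 60], [120, 90, 70], [200, 180, 150]])   # 3 cover classes
        X = np.vstack([rng.normal(m, 15, size=(150, 3)) for m in means])
        y = np.repeat([0, 1, 2], 150)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=5)

        mlc = QuadraticDiscriminantAnalysis().fit(X_tr, y_tr)   # Gaussian per class
        plr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr) # multinomial logit
        print("maximum likelihood (QDA):", round(mlc.score(X_te, y_te), 3))
        print("polytomous logistic regr.:", round(plr.score(X_te, y_te), 3))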

  9. SROT: Sparse representation-based over-sampling technique for classification of imbalanced dataset

    Science.gov (United States)

    Zou, Xionggao; Feng, Yueping; Li, Huiying; Jiang, Shuyu

    2017-08-01

    As one of the most popular research fields in machine learning, research on imbalanced datasets has received more and more attention in recent years. The imbalance problem usually occurs when minority classes have far fewer samples than the others. Traditional classification algorithms do not take the distribution of the dataset into consideration, so they fail to deal with class-imbalanced learning, and classification performance tends to be dominated by the majority class. SMOTE is one of the most effective over-sampling methods for this problem; it changes the distribution of training sets by increasing the size of the minority class. However, SMOTE can easily result in over-fitting on account of too many repetitive data samples. To address this issue, this paper proposes an improved method based on sparse representation theory and over-sampling, named SROT (Sparse Representation-based Over-sampling Technique). SROT uses a sparse dictionary to create synthetic samples directly for solving the imbalanced problem. The experiments are performed on 10 UCI datasets using C4.5 as the learning algorithm. The experimental results show that, compared with random over-sampling techniques, SMOTE and other methods, SROT can achieve better performance in terms of AUC.

  10. Supervised retinal biometrics in different lighting conditions.

    Science.gov (United States)

    Azemin, Mohd Zulfaezal Che; Kumar, Dinesh K; Sugavaneswaran, Lakshmi; Krishnan, Sridhar

    2011-01-01

    Retinal images have been considered for a number of health and biometrics applications. However, their reliability has not been investigated thoroughly. The variation observed in retina scans taken at different times is attributable to differences in illumination and positioning of the camera, which causes some bifurcations and crossovers of the retinal vessels to be missed. Exhaustive selection of optimal parameters is needed to construct the best similarity-metric equation to overcome the incomplete landmarks. In this paper, we extract multiple features from the retina scans and employ supervised classification to overcome the shortcomings of current techniques. Experimental results on 60 retina scans with different lighting conditions demonstrate the efficacy of this technique. The results were compared with existing methods.

  11. Review on Feature Selection Techniques and the Impact of SVM for Cancer Classification using Gene Expression Profile

    CERN Document Server

    George, G Victo Sudha; 10.5121/ijcses.2011.2302

    2011-01-01

    The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray data based cancer classification and also the predominant role of SVM for cancer classification.

  12. Graph-based Techniques for Topic Classification of Tweets in Spanish

    Directory of Open Access Journals (Sweden)

    Hector Cordobés

    2014-03-01

    Full Text Available Topic classification of texts is one of the most interesting challenges in Natural Language Processing (NLP). Topic classifiers commonly use a bag-of-words approach, in which the classifier uses (and is trained with) selected terms from the input texts. In this work we present techniques based on graph similarity to classify short texts by topic. In our classifier we build graphs from the input texts, and then use properties of these graphs to classify them. We have tested the resulting algorithm by classifying Twitter messages in Spanish among a predefined set of topics, achieving more than 70% accuracy.
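
    A toy version of graph-based topic classification is sketched below: each text becomes a set of word co-occurrence edges and a new text is assigned to the topic whose training graph it overlaps most (Jaccard similarity on edges). The authors' actual graph properties and Spanish Twitter data are not reproduced; documents and topics here are invented.

        # Toy graph-similarity topic classifier (co-occurrence edge sets + Jaccard).
        from itertools import combinations

        def edge_set(text):
            words = sorted(set(text.lower().split()))
            return set(combinations(words, 2))           # co-occurrence edges

        training = {
            "football": "the team scored a goal in the final minute of the match",
            "politics": "the parliament passed the new budget law after a long debate",
        }
        topic_graphs = {topic: edge_set(doc) for topic, doc in training.items()}

        def classify(text):
            g = edge_set(text)
            def jaccard(a, b):
                return len(a & b) / len(a | b) if a | b else 0.0
            return max(topic_graphs, key=lambda t: jaccard(g, topic_graphs[t]))

        print(classify("a late goal decided the match for the team"))   # football
        print(classify("a new law was debated in parliament"))          # politics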

  13. FEATURE FUSION TECHNIQUE FOR COLOUR TEXTURE CLASSIFICATION SYSTEM BASED ON GRAY LEVEL CO-OCCURRENCE MATRIX

    Directory of Open Access Journals (Sweden)

    K. L. Shunmuganathan

    2012-01-01

    Full Text Available In this study, an efficient feature fusion based technique for the classification of colour texture images in the VisTex album is presented. The Gray Level Co-occurrence Matrix (GLCM) and its associated texture features contrast, correlation, energy and homogeneity are used in the proposed approach. The proposed GLCM texture features are obtained from the original colour texture as well as the first non-singleton dimension of the same image. These features are fused at feature level to classify the colour texture image using a nearest neighbor classifier. The results demonstrate that the proposed fusion of difference image GLCM features is much more efficient than the original GLCM features.
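
    The GLCM-plus-nearest-neighbour pipeline can be sketched compactly. The GLCM below is computed directly with NumPy for a single horizontal offset and a few grey levels, and the "fusion" is simply the concatenation of the statistics of the image and of a shifted copy, standing in for the original/first-dimension pair used by the authors.

        # Hedged sketch: GLCM texture statistics, simple feature fusion, 1-NN classifier.
        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(6)

        def glcm_features(img, levels=8):
            q = (img * (levels - 1)).astype(int)         # quantise to grey levels
            glcm = np.zeros((levels, levels))
            for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):   # horizontal pairs
                glcm[i, j] += 1
            p = glcm / glcm.sum()
            idx = np.arange(levels)
            ii, jj = np.meshgrid(idx, idx, indexing="ij")
            contrast = (p * (ii - jj) ** 2).sum()
            energy = (p ** 2).sum()
            homogeneity = (p / (1.0 + np.abs(ii - jj))).sum()
            return np.array([contrast, energy, homogeneity])

        def fused_features(img):
            return np.concatenate([glcm_features(img),
                                   glcm_features(np.roll(img, 1, axis=0))])

        smooth = [rng.random((32, 32)) * 0.2 + 0.4 for _ in range(20)]  # low-contrast textures
        noisy = [rng.random((32, 32)) for _ in range(20)]               # high-contrast textures
        X = np.array([fused_features(im) for im in smooth + noisy])
        y = np.array([0] * 20 + [1] * 20)

        knn = KNeighborsClassifier(n_neighbors=1).fit(X[:-2], y[:-2])
        print("predictions for held-out textures:", knn.predict(X[-2:]), "true:", y[-2:])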

  14. Association Technique based on Classification for Classifying Microcalcification and Mass in Mammogram

    Directory of Open Access Journals (Sweden)

    Herwanto

    2013-01-01

    Full Text Available Currently, mammography is recognized as the most effective imaging modality for breast cancer screening. The challenge of using mammography is how to locate the area which is indeed a solitary geographic abnormality. In mammography screening it is important to define the risk for women who have radiologically negative findings and for those who might develop malignancy later in life. Microcalcification and mass segmentation are used frequently as the first step in mammography screening. The main objective of this paper is to apply an association technique based on a classification algorithm to classify microcalcification and mass in mammograms. The system that we propose consists of: (i) a preprocessing phase to enhance the quality of the image, followed by segmenting the region of interest; (ii) a phase for mining a transactional table; and (iii) a phase for organizing the resulting association rules in a classification model. This paper also illustrates how important the data cleaning phase is in building the data mining process for image classification. The proposed method was evaluated using the mammogram data from the Mammographic Image Analysis Society (MIAS). The MIAS data consist of 207 images of normal breasts, 64 benign, and 51 malignant. 85 mammograms of the MIAS data have mass, and 25 mammograms have microcalcification. The features of mean and Gray Level Co-occurrence Matrix homogeneity have been proved to be potential for discriminating microcalcification from mass. The accuracy obtained by this method is 83%.

  15. Transfer learning improves supervised image segmentation across imaging protocols

    DEFF Research Database (Denmark)

    van Opbroek, Annegreet; Ikram, M. Arfan; Vernooij, Meike W.;

    2015-01-01

    The variation between images obtained with different scanners or different imaging protocols presents a major challenge in automatic segmentation of biomedical images. This variation especially hampers the application of otherwise successful supervised-learning techniques which, in order to perform well, often require a large amount of labeled training data that is exactly representative of the target data. We therefore propose to use transfer learning for image segmentation. Transfer-learning techniques can cope with differences in distributions between training and target data, and therefore may improve performance over supervised learning for segmentation across scanners and scan protocols. We present four transfer classifiers that can train a classification scheme with only a small amount of representative training data, in addition to a larger amount of other training data...

  16. Real-time network traffic classification technique for wireless local area networks based on compressed sensing

    Science.gov (United States)

    Balouchestani, Mohammadreza

    2017-05-01

    Network traffic or data traffic in a Wireless Local Area Network (WLAN) is the amount of network packets moving across the wireless network from each wireless node to another, which provides the sampling load in a wireless network. A WLAN's network traffic is the main component of network traffic measurement, network traffic control and simulation. Traffic classification is an essential tool for improving the Quality of Service (QoS) in different wireless networks and in complex applications such as local area networks, wireless local area networks, wireless personal area networks, wireless metropolitan area networks, and wide area networks. Network traffic classification is also an essential component of products for QoS control in different wireless network systems and applications. Classifying network traffic in a WLAN allows us to see what kinds of traffic are present in each part of the network, to organize the various kinds of network traffic on each path into different classes, and to generate a network traffic matrix in order to identify and organize network traffic, which is an important key for improving the QoS feature. To achieve effective network traffic classification, a Real-time Network Traffic Classification (RNTC) algorithm for WLANs based on Compressed Sensing (CS) is presented in this paper. The fundamental goal of this algorithm is to solve difficult wireless network management problems. The proposed architecture allows reducing the False Detection Rate (FDR) to 25% and the Packet Delay (PD) to 15%. The proposed architecture also increases the accuracy of wireless transmission by 10%, which provides a good basis for establishing high-quality wireless local area networks.

  17. Computer-aided classification of lung nodules on computed tomography images via deep learning technique

    Directory of Open Access Journals (Sweden)

    Hua KL

    2015-08-01

    Full Text Available Lung cancer has a poor prognosis when not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scan is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD) scheme requires several image processing and pattern recognition steps to accomplish a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning of classification performance in a conventional CAD scheme is very complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of an automatic exploitation feature and tuning of performance in a seamless fashion. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods could achieve better discriminative results and hold promise in the CAD application domain. Keywords: nodule classification, deep learning, deep belief network, convolutional neural network

  18. Surface roughness classification using polarimetric radar data and ensemble learning techniques

    Science.gov (United States)

    Alvarez-Mozos, Jesus; Peters, Jan; Larrañaga, Arantzazu; Gonzalez-Audicana, Maria; Verhoest, Niko E. C.; Casali, Javier

    2010-05-01

    The availability of space-borne radar sensors with polarimetric capabilities, such as RADARSAT-2, brings new expectations for the retrieval of soil moisture and roughness from remote sensing. The additional information provided by those sensors is expected to enable a separation of the confounding effects of soil moisture and roughness on the radar signal, resulting in more robust surface parameter retrievals. In this study we analyze two RADARSAT-2 Fine Quad-Pol scenes acquired during October 2008 over an agricultural area surrounding Pamplona (Spain). At that time of the year agricultural fields were bare and showed a variety of roughness conditions due to the different tillage operations performed. Approximately 50 agricultural fields were visited and their roughness condition was qualitatively evaluated. Fields were classified as rough, medium or smooth and their tillage direction was measured. The objective of this study is to evaluate the ability of different polarimetric variables to classify agricultural fields according to their roughness condition. With this aim a recently developed machine learning technique called 'Random Forests' (RF) is used. RF is an ensemble learning technique that generates many classification trees and aggregates the individual results through majority vote. RF has been applied to a wide variety of phenomena, and in recent years it has been used with success in several geoscience and remote sensing applications. In addition, RF can be used to estimate the importance of each predictive variable and to detect variable interactions. RF classification was applied at the pixel and at the field scale. Preliminary analyses showed better classification results for smooth and medium roughness fields than for rough ones. The research is ongoing and the influence of tillage direction and surface slope needs to be studied in detail.
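
    The ensemble-learning step can be illustrated with a random forest on a few synthetic polarimetric variables, including the variable-importance estimate mentioned above. Feature names, class means and noise levels are invented for the example.

        # Hedged sketch: random forest roughness classification with variable importances.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(7)
        n_per_class = 60
        features = ["HH/VV ratio", "entropy", "alpha angle"]
        class_means = {0: [0.8, 0.2, 15], 1: [1.0, 0.5, 30], 2: [1.3, 0.8, 45]}  # smooth/medium/rough

        X = np.vstack([rng.normal(m, [0.1, 0.1, 5], size=(n_per_class, 3))
                       for m in class_means.values()])
        y = np.repeat(list(class_means.keys()), n_per_class)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

        rf = RandomForestClassifier(n_estimators=200, random_state=7).fit(X_tr, y_tr)
        print("accuracy:", round(rf.score(X_te, y_te), 3))
        for name, imp in zip(features, rf.feature_importances_):
            print(f"  importance of {name}: {imp:.2f}")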

  19. Data classification using metaheuristic Cuckoo Search technique for Levenberg Marquardt back propagation (CSLM) algorithm

    Science.gov (United States)

    Nawi, Nazri Mohd.; Khan, Abdullah; Rehman, M. Z.

    2015-05-01

    Nature-inspired metaheuristic techniques provide derivative-free solutions to complex problems. One of the latest additions to this group of nature-inspired optimization procedures is the Cuckoo Search (CS) algorithm. Artificial Neural Network (ANN) training is an optimization task, since it is desired to find an optimal weight set for the network during training. Traditional training algorithms have limitations such as getting trapped in local minima and slow convergence rates. This study proposes a new technique, CSLM, combining the best features of two known algorithms, back-propagation (BP) and the Levenberg-Marquardt (LM) algorithm, to improve the convergence speed of ANN training and to avoid the local minima problem. Some selected benchmark classification datasets are used for simulation. The experimental results show that the proposed cuckoo search with Levenberg-Marquardt algorithm performs better than the other algorithms used in this study.

  20. Induction of formal concepts by lattice computing techniques for tunable classification

    Directory of Open Access Journals (Sweden)

    V. G. Kaburlasos

    2014-03-01

    Full Text Available This work proposes an enhancement of Formal Concept Analysis (FCA) by Lattice Computing (LC) techniques. More specifically, a novel Galois connection is introduced toward defining tunable metric distances as well as tunable inclusion measure functions between formal concepts induced from hybrid (i.e., nominal and numerical) data. An induction of formal concepts is pursued here by a novel extension of the Karnaugh map, or K-map for short, technique from digital electronics. In conclusion, granular classification can be pursued. The capacity of a classifier based on formal concepts is demonstrated here with promising results. The formal concepts are interpreted as descriptive decision-making knowledge (rules) induced from the training data.

  1. Computer-aided classification of lung nodules on computed tomography images via deep learning technique.

    Science.gov (United States)

    Hua, Kai-Lung; Hsu, Che-Hao; Hidayati, Shintami Chusnul; Cheng, Wen-Huang; Chen, Yu-Jen

    2015-01-01

    Lung cancer has a poor prognosis when not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scan is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD) scheme requires several image processing and pattern recognition steps to accomplish a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning of classification performance in a conventional CAD scheme is very complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of an automatic exploitation feature and tuning of performance in a seamless fashion. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods could achieve better discriminative results and hold promise in the CAD application domain.
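
    For a concrete, if toy-sized, picture of the convolutional model mentioned above, the sketch below defines a tiny CNN for single-channel 32x32 patches in PyTorch and runs a few optimisation steps on random tensors. The architecture, patch size and data are assumptions for illustration; the paper's deep belief network is not shown.

        # Hedged sketch: a tiny CNN for patch-level nodule classification (PyTorch).
        import torch
        import torch.nn as nn

        class NoduleCNN(nn.Module):
            def __init__(self, n_classes=2):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                )
                self.classifier = nn.Linear(16 * 8 * 8, n_classes)

            def forward(self, x):
                x = self.features(x)                     # 32x32 -> 16x16 -> 8x8 maps
                return self.classifier(x.flatten(1))

        model = NoduleCNN()
        optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()

        patches = torch.randn(16, 1, 32, 32)             # stand-in CT patches
        labels = torch.randint(0, 2, (16,))              # benign / malignant
        for _ in range(3):                               # a few illustrative steps
            optimiser.zero_grad()
            loss = loss_fn(model(patches), labels)
            loss.backward()
            optimiser.step()
        print("final toy loss:", float(loss))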

  2. Brain Tumor Classification Using AFM in Combination with Data Mining Techniques

    Directory of Open Access Journals (Sweden)

    Marlene Huml

    2013-01-01

    Full Text Available Although classification of astrocytic tumors is standardized by the WHO grading system, which is mainly based on microscopy-derived, histomorphological features, there is great interobserver variability. The main causes are thought to be the complexity of morphological details varying from tumor to tumor and from patient to patient, variations in the technical histopathological procedures like staining protocols, and finally the individual experience of the diagnosing pathologist. Thus, to raise astrocytoma grading to a more objective standard, this paper proposes a methodology based on atomic force microscopy (AFM) derived images made from histopathological samples in combination with data mining techniques. By comparing AFM images with corresponding light microscopy images of the same area, the progressive formation of cavities due to cell necrosis was identified as a typical morphological marker for a computer-assisted analysis. Using genetic programming as a tool for feature analysis, a best model was created that achieved 94.74% classification accuracy in distinguishing grade II tumors from grade IV ones. While utilizing modern image analysis techniques, AFM may become an important tool in astrocytic tumor diagnosis. In this way, patients suffering from grade II tumors, who have a lower risk of malignant transformation, are identified unambiguously. They would benefit from early adjuvant therapies.

  3. Brain tumor classification using AFM in combination with data mining techniques.

    Science.gov (United States)

    Huml, Marlene; Silye, René; Zauner, Gerald; Hutterer, Stephan; Schilcher, Kurt

    2013-01-01

    Although classification of astrocytic tumors is standardized by the WHO grading system, which is mainly based on microscopy-derived, histomorphological features, there is great interobserver variability. The main causes are thought to be the complexity of morphological details varying from tumor to tumor and from patient to patient, variations in the technical histopathological procedures like staining protocols, and finally the individual experience of the diagnosing pathologist. Thus, to raise astrocytoma grading to a more objective standard, this paper proposes a methodology based on atomic force microscopy (AFM) derived images made from histopathological samples in combination with data mining techniques. By comparing AFM images with corresponding light microscopy images of the same area, the progressive formation of cavities due to cell necrosis was identified as a typical morphological marker for a computer-assisted analysis. Using genetic programming as a tool for feature analysis, a best model was created that achieved 94.74% classification accuracy in distinguishing grade II tumors from grade IV ones. While utilizing modern image analysis techniques, AFM may become an important tool in astrocytic tumor diagnosis. In this way, patients suffering from grade II tumors, who have a lower risk of malignant transformation, are identified unambiguously. They would benefit from early adjuvant therapies.

  4. Supervised Classification of Satellite Images to Analyze Multi-Temporal Land Use and Coverage : A Case Study for the Town of MARABA, State of PARA, Brazil

    Directory of Open Access Journals (Sweden)

    Priscila Siqueira Aranha

    2015-03-01

    Full Text Available Amazon has one of the most diversified biomes of the planet, and its environmental preservation has an impact on the global scenario. However, besides the environmental features, the complexity of the region involves other aspects such as social, economic and cultural ones; in fact, these aspects are intrinsically interrelated, for example, cultural aspects may affect land use/land cover characteristics. This paper proposes an innovative methodology to investigate changes of critical factors in the environment, based on a case study in the 26 de Março Settlement, in the city of Marabá, in the Brazilian Amazon. The proposed methodology demonstrated, from the obtained results, an improvement of the efficiency of the classification technique to determine different thematic classes as well as a substantial enhancement in the precision of classified images. Another important aspect is the automation in the process

  5. Social networks in supervision

    DEFF Research Database (Denmark)

    Lystbæk, Christian Tang

    and practice have focused on conceptual frameworks and practical techniques of promoting reflection through conversation in general and questioning in particular. However, in recent years, supervision research has started to focus on the social and technological aspects of supervision. This calls...... is constituted by the relationality of the actors, not by the actors themselves. In other words, no one acts in a vacuum but rather always under the influence of a wide range of surrounding and interconnected factors. Actors are actors because they are in a networked relationship. Thus, focusing on social...... and space. That involves mobilised and enrolled actors, both animate and inanimate (e.g. books, computers, etc.). Actor-network theory defines a symmetry between animate and inanimate, i.e. subjects and objects, because "human powers increasingly derive from the complex interconnections of humans with material...

  6. Brain tumor classification using the diffusion tensor image segmentation (D-SEG) technique

    Science.gov (United States)

    Jones, Timothy L.; Byrnes, Tiernan J.; Yang, Guang; Howe, Franklyn A.; Bell, B. Anthony; Barrick, Thomas R.

    2015-01-01

    Background There is an increasing demand for noninvasive brain tumor biomarkers to guide surgery and subsequent oncotherapy. We present a novel whole-brain diffusion tensor imaging (DTI) segmentation (D-SEG) to delineate tumor volumes of interest (VOIs) for subsequent classification of tumor type. D-SEG uses isotropic (p) and anisotropic (q) components of the diffusion tensor to segment regions with similar diffusion characteristics. Methods DTI scans were acquired from 95 patients with low- and high-grade glioma, metastases, and meningioma and from 29 healthy subjects. D-SEG uses k-means clustering of the 2D (p,q) space to generate segments with different isotropic and anisotropic diffusion characteristics. Results Our results are visualized using a novel RGB color scheme incorporating p, q and T2-weighted information within each segment. The volumetric contribution of each segment to gray matter, white matter, and cerebrospinal fluid spaces was used to generate healthy tissue D-SEG spectra. Tumor VOIs were extracted using a semiautomated flood-filling technique and D-SEG spectra were computed within the VOI. Classification of tumor type using D-SEG spectra was performed using support vector machines. D-SEG was computationally fast and stable and delineated regions of healthy tissue from tumor and edema. D-SEG spectra were consistent for each tumor type, with constituent diffusion characteristics potentially reflecting regional differences in tissue microstructure. Support vector machines classified tumor type with an overall accuracy of 94.7%, providing better classification than previously reported. Conclusions D-SEG presents a user-friendly, semiautomated biomarker that may provide a valuable adjunct in noninvasive brain tumor diagnosis and treatment planning. PMID:25121771
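
    The pipeline above clusters per-voxel (p, q) diffusion values with k-means, turns each VOI into a "spectrum" of segment fractions, and classifies that spectrum with an SVM. The following is a minimal sketch of that shape using scikit-learn on synthetic numbers; the cluster count, VOI sizes and tumour labels are illustrative assumptions, not values from the study.

```python
# Minimal sketch of the D-SEG idea: k-means clustering of per-voxel (p, q)
# diffusion features followed by SVM classification of per-case "spectra".
# Synthetic data only; shapes and labels are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical (p, q) values (isotropic, anisotropic) for all voxels of one scan.
voxels_pq = rng.normal(size=(5000, 2))

# Segment the (p, q) plane into k diffusion classes.
kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(voxels_pq)

def dseg_spectrum(pq_values, km):
    """Fraction of voxels falling into each (p, q) segment."""
    labels = km.predict(pq_values)
    return np.bincount(labels, minlength=km.n_clusters) / len(labels)

# Build one spectrum per (synthetic) tumour VOI and classify tumour type with an SVM.
X = np.array([dseg_spectrum(rng.normal(size=(800, 2)), kmeans) for _ in range(40)])
y = rng.integers(0, 3, size=40)          # placeholder tumour-type labels
clf = SVC(kernel="rbf").fit(X, y)
print(clf.score(X, y))
```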

  7. Block truncation coding with color clumps: A novel feature extraction technique for content based image classification

    Indian Academy of Sciences (India)

    SUDEEP THEPADE; RIK DAS; SAURAV GHOSH

    2016-09-01

    The paper explores the principle of block truncation coding (BTC) as a means of feature extraction for content-based image classification. A variation of block truncation coding, named BTC with color clumps, has been implemented in this work to generate feature vectors. Classification performance with the proposed feature extraction technique has been compared to existing techniques. Two widely used public datasets, the Wang dataset and the Caltech dataset, have been used for analyses and comparisons of classification performance based on four different metrics. The study establishes BTC with color clumps as an effective alternative for feature extraction compared to existing methods. The experiments were carried out in RGB color space. Two different categories of classifiers, the K Nearest Neighbor (KNN) classifier and the RIDOR classifier, were used to measure classification performance. A paired t test was conducted to establish the statistical significance of the findings. Evaluation of the classifier algorithms was done in receiver operating characteristic (ROC) space.
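
    Block truncation coding itself reduces each image block to two reconstruction levels around the block mean, and those levels can serve as features. The sketch below shows a plain per-channel BTC feature extractor in NumPy under that reading; the "color clumps" refinement and the KNN/RIDOR evaluation of the paper are not reproduced, and the block size is an arbitrary choice.

```python
# Rough sketch of classic block truncation coding (BTC) used as a feature
# extractor: per block and per colour channel, pixels are split around the
# block mean and the two reconstruction levels become features.
import numpy as np

def btc_features(image, block=8):
    """image: H x W x 3 RGB array -> 1D feature vector of per-block BTC levels."""
    h, w, _ = image.shape
    feats = []
    for c in range(3):                                  # R, G, B channels
        chan = image[:h - h % block, :w - w % block, c].astype(float)
        for i in range(0, chan.shape[0], block):
            for j in range(0, chan.shape[1], block):
                blk = chan[i:i + block, j:j + block]
                m = blk.mean()
                hi, lo = blk[blk >= m], blk[blk < m]
                feats.append(hi.mean() if hi.size else m)   # upper level
                feats.append(lo.mean() if lo.size else m)   # lower level
    return np.asarray(feats)

demo = (np.random.default_rng(1).random((64, 64, 3)) * 255).astype(np.uint8)
print(btc_features(demo).shape)
```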

  8. Towards Multi Label Text Classification through Label Propagation

    Directory of Open Access Journals (Sweden)

    Shweta C. Dharmadhikari

    2012-06-01

    Full Text Available Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and often inherently ambiguous by nature. Multi-label learning deals with such ambiguous objects. Classification of such ambiguous text objects often makes the classifier's task of assigning relevant classes to an input document difficult. Traditional single-label and multi-class text classification paradigms cannot efficiently classify such a multifaceted text corpus. In this paper we propose a novel label propagation approach based on semi-supervised learning for multi-label text classification. Our proposed approach models the relationship between class labels and also effectively represents input text documents. We use a semi-supervised learning technique for effective utilization of labeled and unlabeled data for classification. The proposed approach promises better classification accuracy and handling of complexity, and is evaluated on standard datasets such as Enron, Slashdot and Bibtex.
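
    As a rough illustration of semi-supervised label propagation for multi-label text, the sketch below runs scikit-learn's LabelPropagation once per label (one-vs-rest) over TF-IDF vectors, with -1 marking unlabeled documents. The documents, labels and kernel settings are invented for the example, and the paper's actual propagation scheme may differ.

```python
# Hedged sketch: semi-supervised label propagation for multi-label text,
# one propagation per label (one-vs-rest), with -1 marking unlabeled documents.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelPropagation

docs = ["price of oil rises", "football final tonight", "oil exports and trade",
        "league results and scores", "markets react to oil news", "cup match preview"]
# Two hypothetical labels per document: [economy, sports]; -1 = unknown.
Y = np.array([[1, 0], [0, 1], [1, 0], [-1, -1], [-1, -1], [-1, -1]])

X = TfidfVectorizer().fit_transform(docs).toarray()   # dense features for the rbf kernel

predicted = np.zeros_like(Y)
for k in range(Y.shape[1]):                           # propagate each label separately
    lp = LabelPropagation(kernel="rbf", gamma=1.0).fit(X, Y[:, k])
    predicted[:, k] = lp.transduction_
print(predicted)
```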

  9. [Administrative reform thinking on the regulations on the supervision and administration of medical devices].

    Science.gov (United States)

    Yue, Wei

    2014-09-01

    This paper studies and interprets the regulations on the supervision and administration of medical devices, sorts out the lines of thinking behind the administrative reform, and then puts forward ten principles: full supervision, classification-based supervision, risk classification, safety-effectiveness-economy, encouragement of innovation, simplified licensing, scientific standards, sincerity and self-discipline, clear responsibility, and severe punishment. These principles are then discussed.

  10. Objective Classification of Rainfall in Northern Europe for Online Operation of Urban Water Systems Based on Clustering Techniques

    DEFF Research Database (Denmark)

    Löwe, Roland; Madsen, Henrik; McSharry, Patrick

    2016-01-01

    This study evaluated methods for automated classification of rain events into groups of "high" and "low" spatial and temporal variability in offline and online situations. The applied classification techniques are fast and based on rainfall data only, and can thus be applied by, e.g., water system...... and quadratic discriminant analysis both provided a fast and reliable identification of rain events of "high" variability, while the k-means provided the smallest number of rain events falsely identified as being of "high" variability (false hits). A simple classification method based on a threshold...

  11. SVM and ANN Based Classification of Plant Diseases Using Feature Reduction Technique

    Directory of Open Access Journals (Sweden)

    Jagadeesh D.Pujari

    2016-06-01

    Full Text Available Computers have been used for mechanization and automation in different applications of agriculture/horticulture. Critical decisions on agricultural yield and plant protection are supported by expert systems (decision support systems) using computer vision techniques. One of the areas considered in the present work is the processing of images of plant diseases affecting agriculture/horticulture crops. The first symptoms of plant disease have to be correctly detected, identified, and quantified in the initial stages. Color and texture features have been used to work with the sample images of plant diseases. Algorithms for extraction of color and texture features have been developed, which are in turn used to train support vector machine (SVM) and artificial neural network (ANN) classifiers. The study presents an approach based on a reduced feature set for recognition and classification of images of plant diseases. The results reveal that the SVM classifier is more suitable for identification and classification of plant diseases affecting agriculture/horticulture crops.

  12. Text Classification using Artificial Intelligence

    CERN Document Server

    Kamruzzaman, S M

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using an artificial intelligence technique that requires fewer documents for training. Instead of using words, word relations, i.e. association rules derived from these words, are used to derive the feature set from pre-classified text documents. The concept of the naïve Bayes classifier is then used on the derived features, and finally a single concept of the genetic algorithm has been added for final classification. A syste...
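
    For readers who want the baseline concrete, the snippet below is a generic naive Bayes text classifier in scikit-learn. It uses plain word counts rather than the association-rule features and genetic-algorithm step described above, and the training texts are made up.

```python
# A minimal, generic Naive Bayes text classifier, shown only to make the
# baseline concrete; not the paper's algorithm.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["stock markets fall", "team wins the championship",
               "central bank raises rates", "player scores twice"]
train_labels = ["finance", "sports", "finance", "sports"]   # illustrative data

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)
print(model.predict(["bank cuts interest rates", "late goal wins match"]))
```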

  13. Classification of ECG signals using LDA with factor analysis method as feature reduction technique.

    Science.gov (United States)

    Kaur, Manpreet; Arora, A S

    2012-11-01

    The analysis of the ECG signal, especially the QRS complex as the most characteristic wave in the ECG, is a widely accepted approach to study and to classify cardiac dysfunctions. In this paper, wavelet coefficients calculated for the QRS complex are first taken as features. Next, factor analysis procedures without rotation and with orthogonal rotation (varimax, equimax and quartimax) are used for feature reduction. The procedure uses the 'Principal Component Method' to estimate component loadings. Further, classification has been done with an LDA classifier. The MIT-BIH arrhythmia database is used and five types of beats (normal, PVC, paced, LBBB and RBBB) are considered for analysis. Accuracy, sensitivity and positive predictivity are the performance parameters used to compare the feature reduction techniques. Results demonstrate that the equimax rotation method yields a maximum average accuracy of 99.056% on unknown data sets among the methods used.
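
    A minimal sketch of the reported pipeline shape (feature reduction followed by an LDA classifier) is given below, with random data standing in for the wavelet coefficients of MIT-BIH QRS complexes. The varimax/equimax/quartimax rotations compared in the paper are not reproduced here.

```python
# Hedged sketch: factor-analysis feature reduction followed by LDA,
# evaluated with cross-validation. Data are random placeholders.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))            # placeholder wavelet coefficients
y = rng.integers(0, 5, size=500)          # 5 beat classes (normal, PVC, ...)

pipe = make_pipeline(FactorAnalysis(n_components=10, random_state=0),
                     LinearDiscriminantAnalysis())
print(cross_val_score(pipe, X, y, cv=5).mean())
```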

  14. An automatic classification technique for attenuation correction in positron emission tomography

    Energy Technology Data Exchange (ETDEWEB)

    Bettinardi, V.; Pagani, E.; Gilardi, M.C.; Landoni, C.; Riddell, C.; Rizzo, G.; Castiglioni, I.; Belluzzo, D.; Lucignani, G.; Fazio, F. [INB-CNR, Scientific Inst. H San Raffaele, Univ. of Milan (Italy); Schubert, S. [GE Medical System, Milwaukee, WI (United States)

    1999-05-01

    In this paper a clustering technique is proposed for attenuation correction (AC) in positron emission tomography (PET). The method is unsupervised and adaptive with respect to counting statistics in the transmission (TR) images. The technique allows the classification of pre- or post-injection TR images into main tissue components in terms of attenuation coefficients. The classified TR images are then forward projected to generate new TR sinograms to be used for AC in the reconstruction of the corresponding emission (EM) data. The technique has been tested on phantoms and clinical data of brain, heart and whole-body PET studies. The method allows: (a) reduction of noise propagation from TR into EM images, (b) reduction of TR scanning to a few minutes (3 min) with maintenance of the quantitative accuracy (within 6%) of longer acquisition scans (15-20 min), (c) reduction of the radiation dose to the patient, (d) performance of quantitative whole-body studies. (orig.) With 8 figs., 4 tabs., 25 refs.

  15. Quality classification of Italian wheat durum spaghetti by means of different spectrophotometric techniques

    Science.gov (United States)

    Menesatti, P.; Bucarelli, A.

    2007-09-01

    Durum wheat pasta (spaghetti in particular) can be considered the most typical Italian food product. Many small or craft pasta factories make products of different quality, depending on the use of organic wheat and the application of mild (lower drying temperature) or traditional (bronze draw-plate) technologies, in competition with large industrial enterprises. The application of higher quality standards increases production costs and leads to higher pasta prices. In order to set up a reliable, easy-to-use methodology to distinguish different production technology approaches, spectrophotometric visible and near-infrared (VIS-NIR) techniques were applied to the intact pasta. Eighteen samples of commercial brand spaghetti, classified according to five different quality production factors (fully industrial, i.e. Teflon-drawn with high-temperature, short-time drying; semolina from organic cultivation; bronze-drawn treatment; low-temperature, long-time drying; traditional high-quality pasta, i.e. bronze-drawn with low-temperature drying), were analyzed by three different spectrometric techniques: a VIS (400-700 nm) spectral imaging system and a NIR (1000-1700 nm) spectral imaging system, both acquiring reflected spectral images of a spaghetti bundle, and a portable VIS-NIR system (400-800 nm) working with an interactance probe on a single spaghetti string. Principal component analysis (PCA) and partial least squares regressions (PLS) were performed on about 1500 spectral arrays to test the ability of the systems to distinguish the different pasta products (commercial brands). Reflectance visible data presented the highest percentage of correct classification: 98.6% overall, and 100% for high-quality spaghetti (bronze-drawn and/or low-temperature drying). The NIR reflectance and VIS-NIR interactance systems gave 85% and 70% overall correct classification, while for high-quality pasta the percentages rise to 75% and 83%.

  16. Fast classification and compositional analysis of cornstover fractions using Fourier transform near-infrared techniques.

    Science.gov (United States)

    Philip Ye, X; Liu, Lu; Hayes, Douglas; Womac, Alvin; Hong, Kunlun; Sokhansanj, Shahab

    2008-10-01

    The objectives of this research were to determine the variation of chemical composition across botanical fractions of cornstover, and to probe the potential of Fourier transform near-infrared (FT-NIR) techniques in qualitatively classifying separated cornstover fractions and in quantitatively analyzing chemical compositions of cornstover by developing calibration models to predict chemical compositions of cornstover based on FT-NIR spectra. Large variations of cornstover chemical composition for wide calibration ranges, which is required by a reliable calibration model, were achieved by manually separating the cornstover samples into six botanical fractions, and their chemical compositions were determined by conventional wet chemical analyses, which proved that chemical composition varies significantly among different botanical fractions of cornstover. Different botanic fractions, having total saccharide content in descending order, are husk, sheath, pith, rind, leaf, and node. Based on FT-NIR spectra acquired on the biomass, classification by Soft Independent Modeling of Class Analogy (SIMCA) was employed to conduct qualitative classification of cornstover fractions, and partial least square (PLS) regression was used for quantitative chemical composition analysis. SIMCA was successfully demonstrated in classifying botanical fractions of cornstover. The developed PLS model yielded root mean square error of prediction (RMSEP %w/w) of 0.92, 1.03, 0.17, 0.27, 0.21, 1.12, and 0.57 for glucan, xylan, galactan, arabinan, mannan, lignin, and ash, respectively. The results showed the potential of FT-NIR techniques in combination with multivariate analysis to be utilized by biomass feedstock suppliers, bioethanol manufacturers, and bio-power producers in order to better manage bioenergy feedstocks and enhance bioconversion.
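
    The quantitative part of the workflow is a PLS calibration from spectra to constituent content, evaluated by RMSEP. The sketch below shows that shape with scikit-learn's PLSRegression on synthetic spectra; the number of latent variables and the fake constituent values are illustrative assumptions, and the SIMCA classification step is not reproduced.

```python
# Hedged sketch: a PLS calibration from spectra to one constituent, evaluated
# by RMSEP, mirroring the shape of the FT-NIR models above. Synthetic data only.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 700))                         # 120 samples x 700 wavelengths
y = X[:, 100] * 2.0 + rng.normal(scale=0.1, size=120)   # fake constituent content

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
pls = PLSRegression(n_components=8).fit(X_tr, y_tr)
rmsep = np.sqrt(mean_squared_error(y_te, pls.predict(X_te).ravel()))
print(f"RMSEP: {rmsep:.3f}")
```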

  17. Combining Multiple Electrode Arrays for Two-Dimensional Electrical Resistivity Imaging Using the Unsupervised Classification Technique

    Science.gov (United States)

    Ishola, K. S.; Nawawi, M. N. M.; Abdullah, K.

    2015-06-01

    This article describes the use of k-means clustering, an unsupervised image classification technique, to help interpret subsurface targets. The k-means algorithm is employed to combine and classify the two-dimensional (2D) inverse resistivity models obtained from three different electrode arrays. The algorithm is initialized through the selection of the number of clusters, number of iterations and other parameters such as stopping criteria. Automatically, it seeks to find groups of closely related resistivity values that belong to the same cluster and are more similar to each other than resistivity values belonging to other clusters. The approach is applied to both synthetic and field data. The 2D postinversions of the resistivity data were preprocessed by resampling and interpolating to the same coordinates. Following the preprocessing, the three images are combined into a single classified image. All the image preprocessing, manipulation and analysis are performed using the PCI Geomatics software package. The results of the clustering and classification are presented as classified images. An assessment of the performance of the individual and combined images for the synthetic models is carried out using an error matrix, mean absolute error and mean absolute percent error. The estimated errors show that images obtained from maximum values of the reconstructed resistivity for the different models give the best representation of the true models. Additionally, the overall accuracy and kappa values show good agreement between the combined classified images and true models. Depending on the model, the overall accuracy ranges from 86 to 99 %, while the kappa coefficient is in the range of 54-98 %. Classified images with kappa coefficients greater than 0.8 show strong agreement, while images with kappa coefficients greater than 0.5 but less than 0.8 give moderate agreement. For the field data, the k-means classifier produces images that incorporate structural features of

  18. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, Decision Tree and KNN classification techniques

    Directory of Open Access Journals (Sweden)

    Muhammad Bilal

    2016-07-01

    Full Text Available Sentiment mining is a field of text mining that determines the attitude of people toward a particular product, topic or politician in newsgroup posts, on review sites, in comments on Facebook posts, on Twitter, etc. There are many issues involved in opinion mining. One important issue is that opinions can be expressed in different languages (English, Urdu, Arabic, etc.). Tackling each language according to its orientation is a challenging task. Most of the research work in sentiment mining has been done on the English language. Currently, limited research is being carried out on sentiment classification of other languages like Arabic, Italian, Urdu and Hindi. In this paper, three classification models are used for text classification using the Waikato Environment for Knowledge Analysis (WEKA). Opinions written in Roman-Urdu and English are extracted from a blog. These extracted opinions are documented in text files to prepare a training dataset containing 150 positive and 150 negative opinions as labeled examples. The test data set is supplied to the three different models and the results in each case are analyzed. The results show that naïve Bayesian outperformed the decision tree and KNN in terms of accuracy, precision, recall and F-measure.

  19. Comparing document classification schemes using k-means clustering

    OpenAIRE

    Šivić, Artur; Žmak, Lovro; Dalbelo Bašić, Bojana; Moens, Marie-Francine

    2008-01-01

    In this work, we jointly apply several text mining methods to a corpus of legal documents in order to compare the separation quality of two inherently different document classification schemes. The classification schemes are compared with the clusters produced by the k-means algorithm. In the future, we believe that our comparison method will be coupled with semi-supervised and active learning techniques. Also, this paper presents the idea of combining k-means and Principal Component Analysis...

  20. EMOTION INTERACTION WITH VIRTUAL REALITY USING HYBRID EMOTION CLASSIFICATION TECHNIQUE TOWARD BRAIN SIGNALS

    National Research Council Canada - National Science Library

    Faris A. Abuhashish; Jamal Zraqou; Wesam Alkhodour; Mohd S. Sunar; Hoshang Kolivand

    2015-01-01

    In the last decade many researchers have focused on emotion classification in order to employ emotion in interaction with virtual reality; the classification is done based on electroencephalogram (EEG) brain signals...

  1. Automatic approach to solve the morphological galaxy classification problem using the sparse representation technique and dictionary learning

    Science.gov (United States)

    Diaz-Hernandez, R.; Ortiz-Esquivel, A.; Peregrina-Barreto, H.; Altamirano-Robles, L.; Gonzalez-Bernal, J.

    2016-06-01

    The observation of celestial objects in the sky is a practice that helps astronomers to understand the way in which the Universe is structured. However, due to the large number of observed objects with modern telescopes, the analysis of these by hand is a difficult task. An important part in galaxy research is the morphological structure classification based on the Hubble sequence. In this research, we present an approach to solve the morphological galaxy classification problem in an automatic way by using the Sparse Representation technique and dictionary learning with K-SVD. For the tests in this work, we use a database of galaxies extracted from the Principal Galaxy Catalog (PGC) and the APM Equatorial Catalogue of Galaxies obtaining a total of 2403 useful galaxies. In order to represent each galaxy frame, we propose to calculate a set of 20 features such as Hu's invariant moments, galaxy nucleus eccentricity, gabor galaxy ratio and some other features commonly used in galaxy classification. A stage of feature relevance analysis was performed using Relief-f in order to determine which are the best parameters for the classification tests using 2, 3, 4, 5, 6 and 7 galaxy classes making signal vectors of different length values with the most important features. For the classification task, we use a 20-random cross-validation technique to evaluate classification accuracy with all signal sets achieving a score of 82.27 % for 2 galaxy classes and up to 44.27 % for 7 galaxy classes.
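
    Loosely following the sparse-representation idea, the sketch below learns a dictionary with scikit-learn's DictionaryLearning (not K-SVD itself), uses the resulting sparse codes as features, and classifies them with a simple logistic regression rather than the class-residual rule. Feature counts and labels are placeholders, not the PGC/APM data.

```python
# Hedged sketch of sparse-representation classification with a learned
# dictionary; DictionaryLearning stands in for K-SVD and random data stand in
# for the 20 morphology features described above.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))            # 300 galaxies x 20 morphology features
y = rng.integers(0, 2, size=300)          # e.g., 2 galaxy classes (placeholder)

dico = DictionaryLearning(n_components=12, alpha=0.1, random_state=0,
                          transform_algorithm="lasso_lars")
codes = dico.fit_transform(X)             # sparse codes become the new features

clf = LogisticRegression(max_iter=1000).fit(codes, y)
print(clf.score(codes, y))
```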

  2. Supervised Mineral Classification with Semi-automatic Training and Validation Set Generation in Scanning Electron Microscope Energy Dispersive Spectroscopy Images of Thin Sections

    DEFF Research Database (Denmark)

    Flesche, Harald; Nielsen, Allan Aasbjerg; Larsen, Rasmus

    2000-01-01

    This paper addresses the problem of classifying minerals common in siliciclastic and carbonate rocks. Twelve chemical elements are mapped from thin sections by energy dispersive spectroscopy in a scanning electron microscope (SEM). Extensions to traditional multivariate statistical methods...... are applied to perform the classification. First, training and validation sets are grown from one or a few seed points by a method that ensures spatial and spectral closeness of observations. Spectral closeness is obtained by excluding observations that have high Mahalanobis distances to the training class...

  3. Comparative techniques used to evaluate Thematic Mapper data for land cover classification in Logan County, West Virginia

    Science.gov (United States)

    Brumfield, J. O.; Witt, R. G.; Blodget, H. W.; Marcell, R. F.

    1985-01-01

    Several digital data processing techniques were evaluated in an effort to identify and map active/abandoned, partially reclaimed, and fully revegetated surface mine areas in the central portion of Logan County. The TM data were first subjected to various enhancement procedures, including a linear contrast stretch, principal components and canonical analysis transformations. At the same time, four general procedures were followed to produce six classifications as a means of comparing the techniques involved. Preliminary results show that various feature extraction/data reduction techniques provide classification results equal or superior to the more straightforward unsupervised clustering technique. Analyst interaction time for labelling clusters is reduced using the canonical analysis and principal components procedures, though the canonical technique has clearly produced better results to date.

  4. Unsupervised classification of remote multispectral sensing data

    Science.gov (United States)

    Su, M. Y.

    1972-01-01

    The new unsupervised classification technique for classifying multispectral remote sensing data, which can come either from a multispectral scanner or from digitized color-separation aerial photographs, consists of two parts: (a) a sequential statistical clustering which is a one-pass sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. Applications of the technique using an IBM-7094 computer on multispectral data sets over Purdue's Flight Line C-1 and the Yellowstone National Park test site have been accomplished. Comparisons between the classification maps by the unsupervised technique and the supervised maximum likelihood technique indicate that the classification accuracies are in agreement.

  5. Data mining classification techniques: an application to tobacco consumption in teenagers

    Directory of Open Access Journals (Sweden)

    Juan J. Montaño-Moreno

    2014-05-01

    Full Text Available This study is aimed at analysing the predictive power of different psychosocial and personality variables on the consumption or non-consumption of nicotine in a teenage population using different classification techniques from the field of Data Mining. More specifically, we analyse ANNs, namely the Multilayer Perceptron (MLP), Radial Basis Functions (RBF) and Probabilistic Neural Networks (PNNs), as well as decision trees, the logistic regression model and discriminant analysis. To this end, we worked with a sample of 2666 teenagers, 1378 of whom do not consume nicotine while 1288 are nicotine consumers. The models analysed were able to discriminate correctly between both types of subjects within a range of 77.39% to 78.20%, achieving 91.29% sensitivity and 74.32% specificity. With this study, we place at the disposal of specialists in addictive behaviours a set of advanced statistical techniques that are capable of simultaneously processing a large quantity of variables and subjects, as well as learning complex patterns and relationships automatically, in such a way that they are very appropriate for predicting and preventing addictive behaviour.

  6. Rules for Supervision and Inspection of Offshore Oil Industry

    Institute of Scientific and Technical Information of China (English)

    Dai Zhongliang; Song Lisong

    1994-01-01

    In short, safety supervision and technique inspection mean safety supervision by the government and inspection by a technical organization, and these are put into practice through a series of administrative rules and regulations.

  7. A Gestalt Approach to Group Supervision

    Science.gov (United States)

    Melnick, Joseph; Fall, Marijane

    2008-01-01

    The authors define and then describe the practice of group supervision. The role of creative experiment in assisting supervisees who perceive themselves as confused, moving in circles, or immobilized is described. Fictional case examples illustrate these issues in supervision. The authors posit the "good fit" of Gestalt theory and techniques with…

  9. Quantitative classification in catering trade and countermeasures of supervision and management in Hunan Province

    Institute of Scientific and Technical Information of China (English)

    刘秀兰; 陈立章; 何翔

    2012-01-01

    Objective: To analyze the status quo of quantitative classification in the Hunan Province catering industry, and to discuss the countermeasures in depth. Methods: According to relevant laws and regulations, and after referring to the daily supervision and quantitative scoring sheet and consulting experts, a checklist of key supervision indicators was made. The implementation of quantitative classification in 10 cities in Hunan Province was studied, and the status quo was analyzed. Results: All 390 catering units surveyed implemented quantitative classified management. The larger the catering enterprise, the higher the level of quantitative classification. Apart from cafeterias, the smaller the catering unit, the more points were deducted, with snack bars and beverage stores deducted the most. For those quantified and classified as C and D, more points were deducted in the procurement and storage of raw materials, operation and processing, and other aspects. Conclusion: The quantitative classification of Hunan Province has relatively wide coverage. There are hidden food safety risks in small catering units, snack bars, and beverage stores. The food hygiene conditions of Hunan Province need to be improved.

  10. Reliable and reproducible classification system for scoliotic radiograph using image processing techniques.

    Science.gov (United States)

    Anitha, H; Prabhu, G K; Karunakar, A K

    2014-11-01

    Scoliosis classification is useful for guiding treatment and assessing the clinical outcome. State-of-the-art classification procedures are inherently unreliable and non-reproducible due to technical and human judgmental error. In the current diagnostic system, each examiner has a diagrammatic summary of the classification procedure, the number of scoliosis curves, the apex level, etc. It is very difficult to define the required anatomical parameters in noisy radiographs. The classification task demands an automatic image understanding system. The proposed automated classification procedure extracts the anatomical features using image processing and applies classification procedures based on computer-assisted algorithms. The reliability and reproducibility of the proposed computerized image understanding system are compared with the manual and computer-assisted systems using kappa values.
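
    Reliability and reproducibility comparisons of this kind are commonly summarized with Cohen's kappa. The toy example below shows how such an agreement score can be computed; the rating vectors are invented.

```python
# Small illustration of quantifying inter-rater agreement with Cohen's kappa;
# the rating vectors below are made up, not the study's data.
from sklearn.metrics import cohen_kappa_score

manual    = ["type1", "type2", "type1", "type3", "type2", "type1"]  # examiner
automated = ["type1", "type2", "type1", "type2", "type2", "type1"]  # proposed system

print(cohen_kappa_score(manual, automated))   # 1.0 = perfect agreement, 0 = chance
```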

  11. Automatic sleep stages classification using EEG entropy features and unsupervised pattern analysis techniques

    OpenAIRE

    Jose Luis Rodríguez-Sotelo; Alejandro Osorio-Forero; Alejandro Jiménez-Rodríguez; David Cuesta-Frau; Eva Cirugeda-Roldán; Diego Peluffo

    2014-01-01

    Sleep is a growing area of research interest in medicine and neuroscience. Actually, one major concern is to find a correlation between several physiologic variables and sleep stages. There is a scientific agreement on the characteristics of the five stages of human sleep, based on EEG analysis. Nevertheless, manual stage classification is still the most widely used approach. This work proposes a new automatic sleep classification method based on unsupervised feature classification algorithms...

  12. Automatic classification of time-variable X-ray sources

    CERN Document Server

    Lo, Kitty K; Murphy, Tara; Gaensler, B M

    2014-01-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the second XMM-Newton serendipitous source catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10-fold cross validation accuracy of the training data is ~97% on a seven-class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest der...
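
    A generic version of the evaluation described above (Random Forest with 10-fold cross-validation, then class probabilities for a probabilistic catalogue) looks roughly like the sketch below; the synthetic seven-class data stand in for the time-series, spectral and multi-wavelength features.

```python
# Hedged sketch: Random Forest with 10-fold cross-validation and class
# probabilities, on synthetic data shaped like the problem above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=873, n_features=30, n_informative=12,
                           n_classes=7, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0)
print(cross_val_score(rf, X, y, cv=10).mean())

# Class probabilities for new sources -> a probabilistically classified catalogue.
rf.fit(X, y)
print(rf.predict_proba(X[:3]))
```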

  13. Plant leaf image classification based on supervised orthogonal locality preserving projections

    Institute of Scientific and Technical Information of China (English)

    张善文; 张传雷; 程雷

    2013-01-01

    This problem degrades the recognition performance of these algorithms. To overcome it, a supervised orthogonal LPP (SOLPP) algorithm is presented and applied to plant classification using leaf images, based on locality preserving projections (LPP). LPP can be trained and applied as a linear projection and can model feature vectors that are assumed to lie on a nonlinear embedding subspace by preserving local relations among input features, so it has an advantage over conventional linear dimensionality reduction algorithms like principal component analysis (PCA) and linear discriminant analysis (LDA). First, the class information matrix is computed by the Warshall algorithm, an efficient method for computing the transitive closure of a relation: it takes as input a matrix representing the relationship of the observed data, and outputs the matrix of the transitive closure of that relationship. Based on this matrix, the within-class and between-class matrices are obtained by making full use of the local information and class information of the data. After dimensionality reduction, in the subspace, the distances between same-class samples become smaller, while the distances between different-class samples become larger, which improves the classification performance of the proposed algorithm. Compared with classical supervised subspace dimensionality reduction algorithms, the proposed method does not need to judge whether any two samples belong to the same class when constructing the within-class and between-class scatter matrices, which further improves its classification performance. Finally, the K-nearest neighbor classifier is applied to classify plants. Comparison experiments with other existing algorithms, such as neighborhood rough set (NRS), support vector machine (SVM), efficient moving center hypersphere (MCH), modified locally linear discriminant embedding (MLLDE) and

  14. Classification of human colonic tissues using FTIR spectra and advanced statistical techniques

    Science.gov (United States)

    Zwielly, A.; Argov, S.; Salman, A.; Bogomolny, E.; Mordechai, S.

    2010-04-01

    One of the major public health hazards is colon cancer. There is a great necessity to develop new methods for early detection of cancer. If colon cancer is detected and treated early, cure rate of more than 90% can be achieved. In this study we used FTIR microscopy (MSP), which has shown a good potential in the last 20 years in the fields of medical diagnostic and early detection of abnormal tissues. Large database of FTIR microscopic spectra was acquired from 230 human colonic biopsies. Five different subgroups were included in our database, normal and cancer tissues as well as three stages of benign colonic polyps, namely, mild, moderate and severe polyps which are precursors of carcinoma. In this study we applied advanced mathematical and statistical techniques including principal component analysis (PCA) and linear discriminant analysis (LDA), on human colonic FTIR spectra in order to differentiate among the mentioned subgroups' tissues. Good classification accuracy between normal, polyps and cancer groups was achieved with approximately 85% success rate. Our results showed that there is a great potential of developing FTIR-micro spectroscopy as a simple, reagent-free viable tool for early detection of colon cancer in particular the early stages of premalignancy among the benign colonic polyps.

  15. Sensitivity of Support Vector Machine Classification to Various Training Features

    Directory of Open Access Journals (Sweden)

    Fuling Bian

    2013-07-01

    Full Text Available Remote sensing image classification is one of the most important techniques in image interpretation and can be used for environmental monitoring, evaluation and prediction. Many algorithms have been developed for image classification in the literature. The support vector machine (SVM) is a kind of supervised classifier that has been widely used recently. The classification accuracy produced by SVM may vary depending on the choice of training features. In this paper, SVM was used for land cover classification using Quickbird images. Spectral and textural features were extracted for the classification and the results were analyzed thoroughly. Results showed that employing more features in SVM did not necessarily yield better accuracy; different features are suitable for extracting different types of land cover. This study verifies the effectiveness and robustness of SVM in the classification of high spatial resolution remote sensing images.

  16. Comparison of statistical clustering techniques for the classification of modelled atmospheric trajectories

    Science.gov (United States)

    Kassomenos, P.; Vardoulakis, S.; Borge, R.; Lumbreras, J.; Papaloukas, C.; Karakitsios, S.

    2010-10-01

    In this study, we used and compared three different statistical clustering methods: a hierarchical technique, a non-hierarchical technique (K-means) and an artificial neural network technique (self-organizing maps, SOM). These classification methods were applied to a 4-year dataset of 5-day kinematic back trajectories of air masses arriving in Athens, Greece at 12.00 UTC at three different heights above the ground. The atmospheric back trajectories were simulated with the HYSPLIT Version 4.7 model of the National Oceanic and Atmospheric Administration (NOAA). The meteorological data used for the computation of trajectories were obtained from the NOAA reanalysis database. A comparison of the three statistical clustering methods through statistical indices was attempted. It was found that all three statistical methods seem to depend on the arrival height of the trajectories, but the degree of dependence differs substantially. Hierarchical clustering showed the highest dependence on the arrival height for fast-moving trajectories, followed by SOM; K-means was found to be the clustering technique least dependent on the arrival height. The air quality management applications of these results in relation to PM10 concentrations recorded in Athens, Greece, were also discussed. PM10 concentrations during certain clusters were found to be statistically different (at the 95% confidence level), indicating that these clusters appear to be associated with long-range transport of particulates. This study can improve the interpretation of modelled atmospheric trajectories, leading to a more reliable analysis of synoptic weather circulation patterns and their impacts on urban air quality.
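
    To make the comparison concrete, the sketch below clusters synthetic trajectory-style feature vectors with a hierarchical (Ward) and a non-hierarchical (k-means) method and compares them with a silhouette score; SOM is omitted because it is not part of scikit-learn, and the statistical indices of the study are not reproduced.

```python
# Hedged sketch: hierarchical vs. k-means clustering of trajectory-like
# feature vectors (synthetic stand-ins for the back-trajectory data).
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# e.g., each row = one back trajectory summarised by displacements at several lags
X = rng.normal(size=(400, 10))

labels_h = AgglomerativeClustering(n_clusters=6, linkage="ward").fit_predict(X)
labels_k = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)

print("hierarchical silhouette:", silhouette_score(X, labels_h))
print("k-means silhouette:     ", silhouette_score(X, labels_k))
```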

  17. Application of Metamorphic Testing to Supervised Classifiers

    Science.gov (United States)

    Xie, Xiaoyuan; Ho, Joshua; Kaiser, Gail; Xu, Baowen; Chen, Tsong Yueh

    2010-01-01

    Many applications in the field of scientific computing - such as computational biology, computational linguistics, and others - depend on Machine Learning algorithms to provide important core functionality to support solutions in the particular problem domains. However, it is difficult to test such applications because often there is no “test oracle” to indicate what the correct output should be for arbitrary input. To help address the quality of such software, in this paper we present a technique for testing the implementations of supervised machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called “metamorphic testing”, which has been shown to be effective in such cases. More importantly, we demonstrate that our technique not only serves the purpose of verification, but also can be applied in validation. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas outside scientific computing, as well. PMID:21243103
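
    A toy metamorphic test in this spirit checks a necessary property of a classifier without needing a test oracle: for k-NN, permuting the order of the training samples should leave every prediction unchanged. The relation and data below are illustrative, not taken from the case study.

```python
# Toy metamorphic test: permuting the training-set order must not change any
# k-NN prediction. Illustrative data; not the paper's framework or relations.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
rng = np.random.default_rng(1)
perm = rng.permutation(len(X))

pred_original = KNeighborsClassifier(n_neighbors=5).fit(X, y).predict(X)
pred_permuted = KNeighborsClassifier(n_neighbors=5).fit(X[perm], y[perm]).predict(X)

assert np.array_equal(pred_original, pred_permuted), "metamorphic relation violated"
print("permutation metamorphic relation holds")
```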

  18. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  19. White Kwao Krua variety classification by botanical characteristics and ISSR-Touchdown PCR technique.

    Science.gov (United States)

    Bunmanop, S; Sakuanrungsirikul, S; Manakasem, Y

    2011-07-01

    White Kwao Krua [Pueraria candollei Grah. var. mirifica (Airy Shaw et Suvatabandhu) Niyomdham] is a herb used as an ingredient in dietary supplements and cosmetics. The tuberous roots of White Kwao Krua (WKK) contain estrogen-like substances. Seeds of WKK, collected from Prachuab Khiri Khan, were planted and propagated on the farm of Suranaree University of Technology, and their genetic backgrounds were ambiguous. Thirty-six WKK plants of the same age were sampled for classification using 7 botanical characteristics and DNA fingerprinting by the ISSR-Touchdown PCR technique. The relationship of the 7 botanical characteristics, analyzed using principal component analysis (PCA), showed that the WKK plants fell into 3 groups. The first group contained plant number 34, which was distinguished from the other plants by its small leaf size. The second group consisted of 23 plants with elliptic leaf shape, acute leaf base, and acuminate leaf apex. The third group consisted of 12 plants with ovate leaf shape, obtuse leaf base, and cuspidate leaf apex. The ISSR-Touchdown PCR technique with 41 primers detected 355 DNA loci, with an average of 8.6 loci per primer. The sizes of the DNA fragments ranged from 280 bp to 1550 bp. Two hundred ninety-three loci exhibited polymorphisms (82.54%) and the remaining 62 loci were monomorphic (17.46%). The polymorphism information content (PIC) was between 0.0315 and 0.9779 (average 0.4779) and the number of effective alleles per locus (Ne) ranged between 1.1250 and 1.8541 (average 1.5544). The unweighted pair group method with arithmetic mean (UPGMA), the Jaccard similarity coefficient and PCA were used to construct the genetic relationships of WKK. The genetic similarity (GS) of WKK ranged between 0.50 and 0.86 (average 0.77). At a GS of 0.56 from the cluster analysis, the WKK varieties could be divided into 2 major groups. The first group comprised plant numbers 34 and 7, and the second group could be further divided into 2 subgroups at a GS of 0.69. None of the WKK plants was identical in

  20. Supervision as Metaphor

    Science.gov (United States)

    Lee, Alison; Green, Bill

    2009-01-01

    This article takes up the question of the language within which discussion of research degree supervision is couched and framed, and the consequences of such framings for supervision as a field of pedagogical practice. It examines the proliferation and intensity of metaphor, allegory and allusion in the language of candidature and supervision,…

  1. A Supervision of Solidarity

    Science.gov (United States)

    Reynolds, Vikki

    2010-01-01

    This article illustrates an approach to therapeutic supervision informed by a philosophy of solidarity and social justice activism. Called a "Supervision of Solidarity", this approach addresses the particular challenges in the supervision of therapists who work alongside clients who are subjected to social injustice and extreme marginalization. It…

  2. Classification and Evaluation the Privacy Preserving Data Mining Techniques by using a Data Modification-based Framework

    CERN Document Server

    Keyvanpour, MohammadReza

    2011-01-01

    In recent years, data mining techniques have faced a serious challenge due to increasing concern and worry about privacy, that is, protecting the privacy of critical and sensitive data. Different techniques and algorithms have already been presented for privacy-preserving data mining, which can be classified into three common approaches: the data modification approach, the data sanitization approach and the secure multi-party computation approach. This paper presents a data modification-based framework for classification and evaluation of privacy-preserving data mining techniques. Based on our framework, the techniques are divided into two major groups, namely the perturbation approach and the anonymization approach. Also, in the proposed framework, eight functional criteria are used to analyze and comparatively assess the techniques in these two major groups. The proposed framework provides a good basis for a more accurate comparison of the given privacy-preserving data mining techniques. In addition, t...

  3. Regularized generalized eigen-decomposition with applications to sparse supervised feature extraction and sparse discriminant analysis

    DEFF Research Database (Denmark)

    Han, Xixuan; Clemmensen, Line Katrine Harder

    2015-01-01

    We propose a general technique for obtaining sparse solutions to generalized eigenvalue problems, and call it Regularized Generalized Eigen-Decomposition (RGED). For decades, Fisher's discriminant criterion has been applied in supervised feature extraction and discriminant analysis...... techniques, for instance, 2D-Linear Discriminant Analysis (2D-LDA). Furthermore, an iterative algorithm based on the alternating direction method of multipliers is developed. The algorithm approximately solves RGED with monotonically decreasing convergence and at an acceptable speed for results of modest...... accuracy. Numerical experiments based on four data sets of different types of images show that RGED has competitive classification performance with existing multidimensional and sparse techniques of discriminant analysis....
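
    At the core of criteria like Fisher's discriminant lies a generalized eigenvalue problem S_b w = λ S_w w. The sketch below solves such a problem with SciPy on stand-in scatter matrices; the sparsity-inducing regularization and ADMM iteration of RGED itself are not reproduced.

```python
# Hedged sketch: the plain generalized eigenvalue problem Sb v = lambda Sw v
# that underlies Fisher-type criteria, solved with SciPy on random stand-in
# scatter matrices (not the RGED algorithm).
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
B = rng.normal(size=(5, 5))
Sb = A @ A.T                     # stand-in between-class scatter (PSD)
Sw = B @ B.T + np.eye(5)         # stand-in within-class scatter (positive definite)

eigvals, eigvecs = eigh(Sb, Sw)  # solves Sb v = lambda * Sw v
w = eigvecs[:, -1]               # direction maximizing the generalized Rayleigh quotient
print(eigvals[-1], w)
```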

  4. African Journal of Science and Technology (AJST) SUPERVISED ...

    African Journals Online (AJOL)

    NORBERT OPIYO AKECH

    ABSTRACT: This paper proposes a new method for supervised color image classification by the ... learning vector quantisation (LVQ), is constructed and compared to the K-means clustering ..... colored scanned maps, Machine Vision and.

  5. Spatio-temporal analysis of discharge regimes based on hydrograph classification techniques in an agricultural catchment

    Science.gov (United States)

    Chen, Xiaofei; Bloeschl, Guenter; Blaschke, Alfred Paul; Silasari, Rasmiaditya; Exner-Kittridge, Mike

    2016-04-01

    Stream discharge and groundwater hydrographs integrate the spatial and temporal variations of small-scale hydrological responses. Characterizing the discharge response regime of drained farmland is essential for irrigation strategies and hydrologic modeling. Especially for agricultural basins, diurnal hydrographs of drainage discharge have been investigated to infer drainage processes at varying magnitudes. To explore the variability of discharge responses, we developed an objective method to characterize and classify discharge hydrographs based on features of magnitude and the time series. Cluster analysis (hierarchical k-means) and principal component analysis techniques are applied to discharge time series and groundwater level hydrographs to analyze their event characteristics, using 8 discharge and 18 groundwater level hydrographs as a test case. Owing to the variability of rainfall activity, system location, discharge regime and pre-event soil moisture conditions in the catchment, three main clusters of discharge hydrographs are identified from the test. The results show that: (1) the hydrographs of the drainage discharges had similar shapes but different magnitudes for an individual rainstorm, and this similarity also held for the overland flow discharge and the spring system; (2) within each cluster the shapes remained similar, but the rising slopes differed due to different antecedent wetness conditions and rainfall accumulation, while the differences in regression slope can be explained by system location and discharge area; and (3) surface water always has a closely proportional relation with soil moisture throughout the year, whereas only after the soil moisture exceeds a certain threshold does the outflow of tile drainage systems have a directly proportional relationship with soil moisture and an inverse relationship with the groundwater levels. Finally, we discussed the potential application of hydrograph classification in a wider range of

  6. Feature selection for data classification based on PLS supervised feature extraction and false nearest neighbors

    Institute of Scientific and Technical Information of China (English)

    颜克胜; 李太福; 魏正元; 苏盈盈; 姚立忠

    2012-01-01

    In the classification of high-dimensional data, multicollinearity, redundant features and noise often lead the classifier to low recognition accuracy and large time and space overhead. A feature selection method based on partial least squares (PLS) and false nearest neighbors (FNN) is proposed. First, the partial least squares method is employed to extract the principal components of the high-dimensional data and to overcome the multicollinearity existing between the original features, yielding an independent principal component space that carries supervision information. Then, a similarity measure based on FNN is established by calculating the correlation in this space before and after each feature is removed, which in turn produces a ranking of the original features by their ability to explain the dependent variable. Finally, the features with weak explanatory ability are removed in turn to construct various classification models, and the recognition rate of a support vector machine (SVM) is used as the evaluation criterion to search for the model that has the highest recognition rate while containing the fewest features; that model defines the best feature subset. A series of experiments on different data models were conducted. The simulation results show that this method has a good capability to select the best feature subset, consistent with the nature of the classification features of the data set. Therefore, this research provides a new approach to feature selection for data classification.

  7. A Robust Geometric Model for Argument Classification

    Science.gov (United States)

    Giannone, Cristina; Croce, Danilo; Basili, Roberto; de Cao, Diego

    Argument classification is the task of assigning semantic roles to syntactic structures in natural language sentences. Supervised learning techniques for frame semantics have been recently shown to benefit from rich sets of syntactic features. However argument classification is also highly dependent on the semantics of the involved lexicals. Empirical studies have shown that domain dependence of lexical information causes large performance drops in outside domain tests. In this paper a distributional approach is proposed to improve the robustness of the learning model against out-of-domain lexical phenomena.

  8. Good supervision and PBL

    DEFF Research Database (Denmark)

    Otrel-Cass, Kathrin

    This field study was conducted at the Faculty of Social Sciences at Aalborg University with the intention to investigate how students reflect on their experiences with supervision in a PBL environment. The overall aim of this study was to inform about the continued work in strengthening supervision...... at this faculty. This particular study invited Master level students to discuss: • How a typical supervision process proceeds • How they experienced and what they expected of PBL in the supervision process • What makes a good supervision process...

  9. Event classification and optimization methods using artificial intelligence and other relevant techniques: Sharing the experiences

    Science.gov (United States)

    Mohamed, Abdul Aziz; Hasan, Abu Bakar; Ghazali, Abu Bakar Mhd.

    2017-01-01

    Classification of large data into respected classes or groups could be carried out with the help of artificial intelligence (AI) tools readily available in the market. To get the optimum or best results, optimization tool could be applied on those data. Classification and optimization have been used by researchers throughout their works, and the outcomes were very encouraging indeed. Here, the authors are trying to share what they have experienced in three different areas of applied research.

  10. Mapping forested wetlands in the Great Zhan River Basin through integrating optical, radar, and topographical data classification techniques.

    Science.gov (United States)

    Na, X D; Zang, S Y; Wu, C S; Li, W L

    2015-11-01

    Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. While with some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than the per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference in their kappa coefficients is statistically significant. For the object-based image analysis, there were also statistically significant differences between the RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of the land cover types of interest, while the object-based classifications using the KNN algorithm showed noticeable commission errors for forested wetlands and omission errors for agricultural land. This research proves that object-based classification with RF using optical, radar, and topographical data improves the mapping accuracy of land covers and provides a feasible approach to discriminating forested wetlands from the other land cover types in a forestry area.

  11. Classroom Supervision and Informal Analysis of Behavior. A Manual for Supervision.

    Science.gov (United States)

    Hull, Ray; Hansen, John

    This manual for supervision addresses itself to those with responsibility for helping teachers develop into skilled professionals through use of a rational plan of feedback and assistance. It describes the supervision cycle and outlines simple and practical techniques to collect effective data that will assist the classroom teacher. The manual has…

  12. Improving Efficiency of Classification using PCA and Apriori based Attribute Selection Technique

    Directory of Open Access Journals (Sweden)

    K. Rajeswari

    2013-12-01

    Full Text Available The aim of this study is to select significant features that contribute to accuracy in classification. Data mining deals with the huge, uneven data sets, containing both useful and useless attributes, that are available in a data warehouse. Implementing classification on such data sets with a large number of features wastes time, degrades the efficiency of classification algorithms, and makes the results less accurate. Hence we propose a system in which we first use PCA (Principal Component Analysis) to select the attributes on which we perform classification using the Bayes theorem, a Multi-Layer Perceptron, and the decision tree J48; this indeed gives better results than performing classification on the complete data sets with all attributes. Association rule mining using the traditional Apriori algorithm is also used to find a subset of features related to the class label. The experiments are conducted using the WEKA 3.6.0 tool.
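
    The pipeline described above can be approximated outside WEKA; the sketch below is a hedged scikit-learn equivalent in which PCA reduces the attribute space before Naive Bayes, a multi-layer perceptron, and a decision tree (standing in for J48) are compared. The breast cancer dataset is only a stand-in for the data sets used in the study.

    # PCA-based attribute reduction followed by three classifiers (sketch).
    from sklearn.datasets import load_breast_cancer
    from sklearn.decomposition import PCA
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)       # stand-in data set
    classifiers = {"NaiveBayes": GaussianNB(),
                   "MLP": MLPClassifier(max_iter=1000, random_state=0),
                   "DecisionTree": DecisionTreeClassifier(random_state=0)}
    for name, clf in classifiers.items():
        pipe = make_pipeline(StandardScaler(), PCA(n_components=10), clf)
        print(name, "mean CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean().round(3))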

  13. Fractographic classification in metallic materials by using 3D processing and computer vision techniques

    Directory of Open Access Journals (Sweden)

    Maria Ximena Bastidas-Rodríguez

    2016-09-01

    Full Text Available Failure analysis aims at collecting information about how and why a failure is produced. The first step in this process is a visual inspection of the flaw surface that reveals the features, marks, and texture which characterize each type of fracture. This is often carried out by personnel with little experience who may lack the knowledge to do it. This paper proposes a classification method for three kinds of fractures in crystalline materials: brittle, fatigue, and ductile. The method uses 3D vision, and it is expected to support failure analysis. The features used in this work were (i) Haralick's features and (ii) the fractal dimension. These features were applied to 3D images obtained with a Zeiss LSM 700 confocal laser scanning microscope. For the classification, we evaluated two classifiers: Artificial Neural Networks and Support Vector Machines. The performance evaluation was made by extracting four marginal relations from the confusion matrix: accuracy, sensitivity, specificity, and precision, plus three evaluation methods: Receiver Operating Characteristic space, the Individual Classification Success Index, and Jaccard's coefficient. Although the classification percentage obtained by an expert is better than that obtained with the algorithm, the algorithm achieves a classification percentage near or exceeding 60% accuracy for the analyzed failure modes. The results presented here provide a good approach to address future research on texture analysis using 3D data.
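
    A minimal sketch of the texture step is given below, assuming grayscale fracture patches are already available: Haralick-style statistics are computed from gray-level co-occurrence matrices and fed to an SVM. The patches and labels are synthetic placeholders; scikit-image (>= 0.19) and scikit-learn are assumed.

    # GLCM (Haralick-style) texture features + SVM classification (sketch).
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    def glcm_features(patch):
        glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        props = ["contrast", "homogeneity", "energy", "correlation"]
        return np.hstack([graycoprops(glcm, p).ravel() for p in props])

    rng = np.random.default_rng(0)
    patches = rng.integers(0, 256, size=(90, 64, 64), dtype=np.uint8)  # placeholder images
    labels = rng.integers(0, 3, size=90)       # 0 = brittle, 1 = fatigue, 2 = ductile

    X = np.array([glcm_features(p) for p in patches])
    print("5-fold accuracy:", cross_val_score(SVC(kernel="rbf"), X, labels, cv=5).mean())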

  14. Predicting return visits to the emergency department for pediatric patients: Applying supervised learning techniques to the Taiwan National Health Insurance Research Database.

    Science.gov (United States)

    Hu, Ya-Han; Tai, Chun-Tien; Chen, Solomon Chih-Cheng; Lee, Hai-Wei; Sung, Sheng-Feng

    2017-06-01

    Return visits (RVs) to the emergency department (ED) consume medical resources and may represent a patient safety issue. The occurrence of unexpected RVs is considered a performance indicator for ED care quality. Because children are susceptible to medical errors and utilize considerable ED resources, knowing the factors that affect RVs in pediatric patients helps improve the quality of pediatric emergency care. We collected data on visits made by patients aged ≤ 18 years to EDs from the National Health Insurance Research Database. The outcome of interest was a RV within 3 days of the initial visit. Potential factors were categorized into demographics, medical history, features of ED visits, physician characteristics, hospital characteristics, and treatment-seeking behavior. A multivariate logistic regression was used to identify independent predictors of RVs. We compared the performance of various data mining techniques, including Naïve Bayes, classification and regression tree (CART), random forest, and logistic regression, in predicting RVs. Finally, we developed a decision tree to stratify the risk of RVs. Of 125,940 visits, 6,282 (5.0%) were followed by a RV within 3 days. Predictors of RVs included younger age, higher acuity, intravenous fluid, more examination types, complete blood count, consultation, lower hospital level, hospitalization within one week before the initial visit, frequent ED visits in the past year, and visits made in spring or on Saturdays. Patients with allergic diseases and those who underwent ultrasound examination were less likely to return. Decision tree models performed better in predicting RVs in terms of the area under the curve. The decision tree constructed using the CART technique showed that the number of ED visits in the past year, diagnosis category, testing of complete blood count, and age were important discriminators of the risk of RVs. We identified several factors which are associated with RVs to the ED in pediatric patients.
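
    A toy sketch of the CART risk-stratification step is shown below with synthetic visit-level predictors; the feature names only mimic the kind of variables reported above (prior ED visits, diagnosis category, CBC testing, age) and are not taken from the NHIRD.

    # Shallow CART tree for 3-day return-visit risk stratification (sketch, synthetic data).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 20000
    X = np.column_stack([rng.poisson(1.5, n),      # ED visits in the past year (assumed feature)
                         rng.integers(0, 10, n),   # diagnosis category code (assumed feature)
                         rng.integers(0, 2, n),    # complete blood count ordered? (assumed feature)
                         rng.integers(0, 19, n)])  # age in years
    logit = 0.4 * X[:, 0] + 0.6 * X[:, 2] - 0.05 * X[:, 3] - 3.0
    y = rng.random(n) < 1 / (1 + np.exp(-logit))   # synthetic 3-day return-visit label

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    tree = DecisionTreeClassifier(max_depth=4, class_weight="balanced", random_state=0).fit(X_tr, y_tr)
    print("AUC:", roc_auc_score(y_te, tree.predict_proba(X_te)[:, 1]).round(3))
    print(export_text(tree, feature_names=["prior_visits", "dx_cat", "cbc", "age"]))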

  15. Machine Learning Techniques for Single Nucleotide Polymorphism—Disease Classification Models in Schizophrenia

    Directory of Open Access Journals (Sweden)

    Cristian R. Munteanu

    2010-07-01

    Full Text Available Single nucleotide polymorphisms (SNPs) can be used as inputs in disease computational studies such as pattern searching and classification models. Schizophrenia is an example of a complex disease with an important social impact. The multiple causes of this disease create the need of new genetic or proteomic patterns that can diagnose patients using biological information. This work presents a computational study of disease machine learning classification models using only single nucleotide polymorphisms at the HTR2A and DRD3 genes from Galician (Northwest Spain) schizophrenic patients. These classification models establish for the first time, to the best knowledge of the authors, a relationship between the sequence of the nucleic acid molecule and schizophrenia (Quantitative Genotype – Disease Relationships) that can automatically recognize schizophrenia DNA sequences and correctly classify between 78.3–93.8% of schizophrenia subjects when using datasets which include simulated negative subjects and a linear artificial neural network.

  16. Automatic Sleep Stages Classification Using EEG Entropy Features and Unsupervised Pattern Analysis Techniques

    Directory of Open Access Journals (Sweden)

    Jose Luis Rodríguez-Sotelo

    2014-12-01

    Full Text Available Sleep is a growing area of research interest in medicine and neuroscience. Currently, one major concern is to find a correlation between several physiologic variables and sleep stages. There is a scientific agreement on the characteristics of the five stages of human sleep, based on EEG analysis. Nevertheless, manual stage classification is still the most widely used approach. This work proposes a new automatic sleep classification method based on recently developed unsupervised feature classification algorithms and on EEG entropy measures. This scheme extracts entropy metrics from EEG records to obtain a feature vector. Then, these features are optimized in terms of relevance using the Q-α algorithm. Finally, the resulting set of features is entered into a clustering procedure to obtain a final segmentation of the sleep stages. The proposed method reached up to an average of 80% correctly classified stages for each patient separately while keeping the computational cost low.
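
    A minimal sketch of the unsupervised stage follows: one entropy feature is computed per EEG epoch (a spectral entropy from the Welch periodogram, used here as a stand-in for the paper's entropy set and its Q-α relevance step) and the epochs are then clustered without labels. The signals and sampling rate are assumed placeholders.

    # Entropy feature extraction + unsupervised clustering of EEG epochs (sketch).
    import numpy as np
    from scipy.signal import welch
    from scipy.stats import entropy
    from sklearn.cluster import KMeans

    fs = 100                                   # assumed sampling rate (Hz)
    rng = np.random.default_rng(0)
    epochs = rng.normal(size=(300, 30 * fs))   # placeholder 30-second EEG epochs

    def spectral_entropy(epoch):
        f, pxx = welch(epoch, fs=fs, nperseg=256)
        return entropy(pxx / pxx.sum())        # Shannon entropy of the normalized spectrum

    features = np.array([[spectral_entropy(e)] for e in epochs])
    stages = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(features)
    print("epochs per cluster:", np.bincount(stages))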

  17. Operator functional state classification using least-square support vector machine based recursive feature elimination technique.

    Science.gov (United States)

    Yin, Zhong; Zhang, Jianhua

    2014-01-01

    This paper proposes two psychophysiological-data-driven classification frameworks for operator functional state (OFS) assessment in safety-critical human-machine systems with stable generalization ability. Recursive feature elimination (RFE) and the least squares support vector machine (LSSVM) are combined and used for binary and multiclass feature selection. Besides typical binary LSSVM classifiers for two-class OFS assessment, two multiclass classifiers based on multiclass LSSVM-RFE and the decision directed acyclic graph (DDAG) scheme are developed, one used for recognizing the high mental workload and fatigued state and the other for differentiating overloaded and baseline states from normal states. Feature selection results reveal that different dimensions of OFS can be characterized by specific sets of psychophysiological features. Performance comparison studies show that reasonably high and stable classification accuracy can be achieved by both classification frameworks if the RFE procedure is properly implemented and utilized.
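
    The RFE idea can be sketched with scikit-learn as below; a standard linear SVM stands in for the least squares SVM (which scikit-learn does not ship), and the psychophysiological features are simulated, so this is only an illustration of the selection-plus-classification loop.

    # Recursive feature elimination wrapped around a linear SVM (sketch).
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.feature_selection import RFE
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 40))                           # placeholder EEG/ECG-derived features
    y = (X[:, :5].sum(axis=1) + 0.5 * rng.normal(size=400) > 0).astype(int)  # synthetic OFS label

    selector = RFE(LinearSVC(max_iter=5000), n_features_to_select=8, step=2)
    pipe = make_pipeline(StandardScaler(), selector, LinearSVC(max_iter=5000))
    print("CV accuracy with 8 selected features:",
          cross_val_score(pipe, X, y, cv=5).mean().round(3))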

  18. Current Directions in Supervised Learning Research

    Institute of Scientific and Technical Information of China (English)

    蒋艳凰; 周海芳; 杨学军

    2003-01-01

    Supervised learning is very important in the machine learning area. It has been making great progress in many directions. This article summarizes three of these directions, which are the hot problems in the supervised learning field. These three directions are (a) improving classification accuracy by learning ensembles of classifiers, (b) methods for scaling up supervised learning algorithms, (c) extracting understandable rules from classifiers.

  19. Exploring the Sample Quality Using Rough Sets Theory for the Supervised Classification of Remotely Sensed Imagery

    Institute of Scientific and Technical Information of China (English)

    葛咏; 白鹤翔; 李三平; 李德玉

    2008-01-01

    In the supervised classification of remotely sensed imagery, the quality of the samples is one of the important factors affecting the accuracy of the image classification as well as one of the keys to evaluating the classification. In general, samples are acquired on the basis of prior knowledge, experience and higher-resolution images. With the same sample size and the same sampling model, several sets of training sample data can be obtained. Among such sets, which one best reflects the spectral characteristics and ensures the accuracy of the classification can only be known after the classification accuracy has been assessed. So, before classification, it is meaningful research to measure and assess the quality of samples in order to guide and optimize the subsequent classification process. Based on rough set theory, a new index for measuring sample quality is proposed. The experimental data are the Landsat TM imagery of the Chinese Yellow River Delta acquired on August 8th, 1999. The experiment compares the Bhattacharyya distance matrices and the rough-set-based purity indices ⊿ and ⊿x of five sample data sets and also analyzes their effect on sample quality.

  20. Realizing parameterless automatic classification of remote sensing imagery using ontology engineering and cyberinfrastructure techniques

    Science.gov (United States)

    Sun, Ziheng; Fang, Hui; Di, Liping; Yue, Peng

    2016-09-01

    It has long been an elusive goal for remote sensing experts to realize fully automatic image classification without inputting any parameter values. Experts usually spend hours and hours tuning the input parameters of classification algorithms in order to obtain the best results. With the rapid development of knowledge engineering and cyberinfrastructure, many data processing and knowledge reasoning capabilities have become accessible, shareable and interoperable online. Based on these recent improvements, this paper presents an idea of parameterless automatic classification which only requires an image and automatically outputs a labeled vector. No parameters or operations are needed from end consumers. An approach is proposed to realize the idea. It adopts an ontology database to store the experience of tuning values for classifiers. A sample database is used to record training samples of image segments. Geoprocessing Web services are used as functionality blocks to carry out the basic classification steps. Workflow technology is involved to turn the overall image classification into a fully automatic process. A Web-based prototypical system named PACS (Parameterless Automatic Classification System) is implemented. A number of images are fed into the system for evaluation purposes. The results show that the approach can automatically classify remote sensing images with a fairly good average accuracy. It is indicated that the classified results will be more accurate if the two databases are of higher quality. Once the experience and samples accumulated in the databases match those of an expert, the approach should be able to produce results of similar quality to those a human expert can obtain. Since the approach is fully automatic and parameterless, it can not only relieve remote sensing workers from the heavy and time-consuming parameter tuning work, but also significantly shorten the waiting time for consumers and facilitate them to engage in image

  1. An improved brain image classification technique with mining and shape prior segmentation procedure.

    Science.gov (United States)

    Rajendran, P; Madheswaran, M

    2012-04-01

    An improved brain image classification system, developed using a shape prior segmentation procedure and pruned association rules with the ImageApriori algorithm, is presented in this paper. The CT scan brain images have been classified into three categories, namely normal, benign and malignant, considering the low-level features extracted from the images and high-level knowledge from specialists to enhance the accuracy of the decision process. The experimental results on pre-diagnosed brain images showed 97% sensitivity, 91% specificity and 98.5% accuracy. The proposed algorithm is expected to assist physicians in efficient classification with multiple key features per image.

  2. Improving Maritime Domain Awareness Using Neural Networks for Target of Interest Classification

    Science.gov (United States)

    2015-03-01

    neural network training performance are presented using mean squared error convergence plots. In all implementations, the SCG learning...the implementation of the feature extraction techniques in MATLAB, implementation of the neural networks using the MATLAB Neural Network Toolbox, and...thesis. The Neural Network Toolbox supports supervised learning neural networks, which were chosen to best implement object classification.

  3. Subsampled Hessian Newton Methods for Supervised Learning.

    Science.gov (United States)

    Wang, Chien-Chih; Huang, Chun-Heng; Lin, Chih-Jen

    2015-08-01

    Newton methods can be applied in many supervised learning approaches. However, for large-scale data, the use of the whole Hessian matrix can be time-consuming. Recently, subsampled Newton methods have been proposed to reduce the computational time by using only a subset of data for calculating an approximation of the Hessian matrix. Unfortunately, we find that in some situations, the running speed is worse than the standard Newton method because cheaper but less accurate search directions are used. In this work, we propose some novel techniques to improve the existing subsampled Hessian Newton method. The main idea is to solve a two-dimensional subproblem per iteration to adjust the search direction to better minimize the second-order approximation of the function value. We prove the theoretical convergence of the proposed method. Experiments on logistic regression, linear SVM, maximum entropy, and deep networks indicate that our techniques significantly reduce the running time of the subsampled Hessian Newton method. The resulting algorithm becomes a compelling alternative to the standard Newton method for large-scale data classification.
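
    The core idea can be sketched in a few lines of NumPy/SciPy for L2-regularized logistic regression (a generic illustration, not the authors' algorithm with its two-dimensional subproblem): the gradient uses all data, while Hessian-vector products inside conjugate gradient use only a random subsample.

    # Subsampled-Hessian Newton steps for logistic regression (sketch).
    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg

    rng = np.random.default_rng(0)
    n, d, lam = 20000, 50, 1e-3
    X = rng.normal(size=(n, d))
    y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def loss_grad(w):
        m = y * (X @ w)
        loss = np.mean(np.logaddexp(0.0, -m)) + 0.5 * lam * w @ w
        grad = -(X.T @ (y * sigmoid(-m))) / n + lam * w
        return loss, grad

    w = np.zeros(d)
    for it in range(10):
        loss, grad = loss_grad(w)
        idx = rng.choice(n, size=n // 10, replace=False)      # 10% subsample for the Hessian
        Xs = X[idx]
        D = sigmoid(Xs @ w) * (1.0 - sigmoid(Xs @ w))         # per-example curvature weights

        def hess_vec(v):
            v = np.ravel(v)
            return (Xs.T @ (D * (Xs @ v))) / len(idx) + lam * v

        H = LinearOperator((d, d), matvec=hess_vec)
        step, _ = cg(H, -grad, maxiter=50)                    # approximate Newton direction
        t = 1.0
        while loss_grad(w + t * step)[0] > loss + 1e-4 * t * (grad @ step):
            t *= 0.5                                          # backtracking line search
        w = w + t * step
        print(f"iter {it}: loss {loss:.4f}")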

  4. Study of USGS/NASA land use classification system. [compatibility of land use classification system with computer processing techniques employed for land use mapping from ERTS data]

    Science.gov (United States)

    Spann, G. W.; Faust, N. L.

    1974-01-01

    It is known from several previous investigations that many categories of land use can be mapped via computer processing of Earth Resources Technology Satellite data. The results of one such experiment using the USGS/NASA land-use classification system are presented. Douglas County, Georgia, was chosen as the test site for this project, primarily because of its recent rapid growth and future growth potential. Results of the investigation indicate an overall land-use mapping accuracy of 67%, with higher accuracies in rural areas and lower accuracies in urban areas. It is estimated, however, that 95% of the State of Georgia could be mapped by these techniques with an accuracy of 80% to 90%.

  5. Supervised Discrete Hashing With Relaxation.

    Science.gov (United States)

    Gui, Jie; Liu, Tongliang; Sun, Zhenan; Tao, Dacheng; Tan, Tieniu

    2016-12-29

    Data-dependent hashing has recently attracted attention due to being able to support efficient retrieval and storage of high-dimensional data, such as documents, images, and videos. In this paper, we propose a novel learning-based hashing method called ''supervised discrete hashing with relaxation'' (SDHR) based on ''supervised discrete hashing'' (SDH). SDH uses ordinary least squares regression and traditional zero-one matrix encoding of class label information as the regression target (code words), thus fixing the regression target. In SDHR, the regression target is instead optimized. The optimized regression target matrix satisfies a large margin constraint for correct classification of each example. Compared with SDH, which uses the traditional zero-one matrix, SDHR utilizes the learned regression target matrix and, therefore, more accurately measures the classification error of the regression model and is more flexible. As expected, SDHR generally outperforms SDH. Experimental results on two large-scale image data sets (CIFAR-10 and MNIST) and a large-scale and challenging face data set (FRGC) demonstrate the effectiveness and efficiency of SDHR.

  6. Helicopter and aircraft detection and classification using adaptive beamforming and tracking techniques

    NARCIS (Netherlands)

    Koersel, A.C. van; Beerens, S.P.

    2002-01-01

    Measurements of different types of aircraft are performed and used to obtain information on target characteristics and develop an algorithm to perform classification between jet aircraft, propeller aircraft and helicopters. To obtain a larger detection range, reduce background noise and to reduce

  8. Development of a technique for classification of integumentary elements of a landscape

    Science.gov (United States)

    Voloshyn, V. I.; Bushuyev, Ye. I.; Parshyna, O. I.

    Methodological support for the classification of the integumentary (land-cover) elements of a landscape on the basis of Earth remote sensing data is considered as a primary task of territory management. The classes of integumentary elements and the technological operations of satellite data processing are listed.

  9. Establishing structure-property correlations and classification of base oils using statistical techniques and artificial neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Kapur, G.S.; Sastry, M.I.S.; Jaiswal, A.K.; Sarpal, A.S

    2004-03-17

    The present paper describes various classification techniques like cluster analysis and principal component (PC)/factor analysis to classify different types of base stocks. The API classification of base oils (Group I-III) has been compared to a more detailed classification based on NMR-derived chemical compositional and molecular structural parameters in order to point out the similarities of the base oils in the same group and the differences between the oils placed in different groups. The detailed compositional parameters have been generated using ¹H and ¹³C nuclear magnetic resonance (NMR) spectroscopic methods. Further, oxidation stability, measured in terms of rotating bomb oxidation test (RBOT) life, of non-conventional base stocks and their blends with conventional base stocks, has been quantitatively correlated with their ¹H NMR and elemental (sulphur and nitrogen) data with the help of multiple linear regression (MLR) and artificial neural networks (ANN) techniques. The MLR based model developed using NMR and elemental data showed a high correlation between the 'measured' and 'estimated' RBOT values for both training (R=0.859) and validation (R=0.880) data sets. The ANN based model, developed using a fewer number of input variables (only ¹H NMR data), also showed high correlation between the 'measured' and 'estimated' RBOT values for training (R=0.881), validation (R=0.860) and test (R=0.955) data sets.
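
    A hedged sketch of the MLR step follows: oxidation stability (RBOT life) is regressed on a few NMR-derived compositional parameters plus sulphur and nitrogen content. All numbers below are synthetic placeholders, not the paper's measurements, and the feature list is an assumption for illustration only.

    # Multiple linear regression of RBOT life on spectroscopic/elemental features (sketch).
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 60
    X = np.column_stack([rng.uniform(5, 20, n),     # aromatic H fraction from 1H NMR (assumed)
                         rng.uniform(60, 85, n),    # paraffinic C fraction from 13C NMR (assumed)
                         rng.uniform(0.0, 1.5, n),  # sulphur, wt% (assumed)
                         rng.uniform(0.0, 0.1, n)]) # nitrogen, wt% (assumed)
    rbot = 300 - 8 * X[:, 0] + 2 * X[:, 1] - 90 * X[:, 2] + rng.normal(0, 10, n)  # synthetic RBOT life

    X_tr, X_te, y_tr, y_te = train_test_split(X, rbot, test_size=0.3, random_state=0)
    mlr = LinearRegression().fit(X_tr, y_tr)
    print("R (train):", np.corrcoef(mlr.predict(X_tr), y_tr)[0, 1].round(3))
    print("R (test): ", np.corrcoef(mlr.predict(X_te), y_te)[0, 1].round(3))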

  10. Comparison of the Application of Supervised Classification Methods Based on Spectral Characteristics in the Yellow River Estuary Wetland

    Institute of Scientific and Technical Information of China (English)

    马尔仑; 郑艳楠

    2014-01-01

    Six commonly used supervised classification methods based on spectral characteristics are applied to classify the same hyperspectral CHRIS image of the Yellow River Estuary wetland. The classification results are then compared in order to analyze and summarize the differences in classification accuracy among the six methods and their respective characteristics.

  11. Supervised Learning in Multilayer Spiking Neural Networks

    CERN Document Server

    Sporea, Ioana

    2012-01-01

    The current article introduces a supervised learning algorithm for multilayer spiking neural networks. The algorithm presented here overcomes some limitations of existing learning algorithms, as it can be applied to neurons firing multiple spikes and it can in principle be applied to any linearisable neuron model. The algorithm is applied successfully to various benchmarks, such as the XOR problem and the Iris data set, as well as complex classification problems. The simulations also show the flexibility of this supervised learning algorithm, which permits different encodings of the spike timing patterns, including precise spike train encoding.

  12. A Hybrid Ensemble Learning Approach to Star-Galaxy Classification

    CERN Document Server

    Kim, Edward J; Kind, Matias Carrasco

    2015-01-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template fitting method. Using data from the CFHTLenS survey, we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2, SDSS, VIPERS, and VVDS, and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, s...
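
    As a simplified stand-in for the Bayesian combination described above, the sketch below averages the class probabilities of heterogeneous classifiers (soft voting). The features and labels are synthetic placeholders rather than CFHTLenS photometry.

    # Soft-voting combination of heterogeneous star/galaxy classifiers (sketch).
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(3000, 8))     # e.g. morphology + colour features (assumed)
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 3000) > 0).astype(int)  # synthetic star/galaxy label

    ensemble = VotingClassifier(
        estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                    ("lr", LogisticRegression(max_iter=1000)),
                    ("knn", KNeighborsClassifier(n_neighbors=15))],
        voting="soft")                 # average the predicted class probabilities
    print("combined CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean().round(3))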

  13. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    Science.gov (United States)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats soils are experiencing, especially in semi-arid, Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high-resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50.000 soil map covering the area under the direct control of the Republic of Cyprus (5.760 km2). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). Of particular interest is the possibility of using, as soil properties, data coming from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011). Ten highly characterizing elements were selected and used as predictors in the present study. For the other factors, the usual variables were used: temperature and aridity index for climate; total loss on ignition, vegetation and forestry type maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better bound locations related to parent material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique in which many trees, instead of a single one, are developed and compared to increase the stability and reliability of the prediction. The model is trained and verified on areas where a published 1:25.000 soil map obtained from field work is available, and it is then applied for predictive mapping to the other areas. Preliminary results obtained in a small area of the plain around the city of Lefkosia, where eight different soil classes are present, show very good capabilities of the method. The Random Forest approach leads to reproduce soil

  14. Semi-supervised image classification algorithm based on fuzzy rough sets

    Institute of Scientific and Technical Information of China (English)

    张德军; 何发智; 袁志勇; 石强

    2016-01-01

    To address the problem that only a small number of samples are labeled in image classification, a semi-supervised image classification approach based on fuzzy rough sets was proposed. Firstly, the fuzziness and roughness of the data were modeled simultaneously by fuzzy rough sets, the relevancy between the features and the decisions was measured by fuzzy entropy, and the membership of a sample to a class was approximated by fuzzy rough approximation operators. Secondly, the feature evaluation approach was improved with fuzzy entropy under the regularization framework, and the optimal subset was selected under the framework of semi-supervised feature selection via spectral analysis. Thirdly, the prediction of unlabeled samples was improved with neighbor constraints, and informative unlabeled samples were selected by constrained self-learning based on fuzzy rough sets to update the training set. Finally, the classifier was trained on the updated sample set to complete the image classification task. Several experiments demonstrate that the proposed method can achieve higher classification accuracy with a small number of labeled samples.
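
    The constrained self-learning step has a generic analogue in ordinary self-training, sketched below with scikit-learn's SelfTrainingClassifier; this is only a simplified stand-in for the fuzzy-rough, neighbor-constrained selection described in the paper, and the data are synthetic.

    # Generic self-training with ~5% labeled samples (sketch, stand-in for the paper's method).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    rng = np.random.default_rng(0)
    y_semi = y.copy()
    y_semi[rng.random(y.size) > 0.05] = -1        # mark ~95% of the samples as unlabeled

    model = SelfTrainingClassifier(SVC(probability=True, gamma="auto"), threshold=0.9)
    model.fit(X, y_semi)                          # confident predictions are added iteratively
    print("accuracy on all samples:", model.score(X, y))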

  15. [Classification technique for hyperspectral image based on subspace of bands feature extraction and LS-SVM].

    Science.gov (United States)

    Gao, Heng-zhen; Wan, Jian-wei; Zhu, Zhen-zhen; Wang, Li-bao; Nian, Yong-jian

    2011-05-01

    The present paper proposes a novel hyperspectral image classification algorithm based on the LS-SVM (least squares support vector machine). The LS-SVM uses features extracted from subspaces of bands (SOBs), with the maximum noise fraction (MNF) method adopted as the feature extraction method. The spectral correlations of the hyperspectral image are used to divide the feature space into several SOBs. Then the MNF is used to extract characteristic features of the SOBs. The extracted features are combined into the feature vector for classification. In this way the strong band correlation is avoided and spectral redundancy is reduced. The LS-SVM classifier, which replaces the inequality constraints in the SVM by equality constraints, is adopted, so the computational cost is reduced and the learning performance is improved. The proposed method optimizes spectral information through feature extraction and reduces the spectral noise. The classifier performance is improved. Experimental results show the superiority of the proposed algorithm.

  16. Correlation technique and least square support vector machine combine for frequency domain based ECG beat classification.

    Science.gov (United States)

    Dutta, Saibal; Chatterjee, Amitava; Munshi, Sugata

    2010-12-01

    The present work proposes the development of an automated medical diagnostic tool that can classify ECG beats. This is considered an important problem as accurate, timely detection of cardiac arrhythmia can help to provide proper medical attention to cure/reduce the ailment. The proposed scheme utilizes a cross-correlation based approach where the cross-spectral density information in frequency domain is used to extract suitable features. A least square support vector machine (LS-SVM) classifier is developed utilizing the features so that the ECG beats are classified into three categories: normal beats, PVC beats and other beats. This three-class classification scheme is developed utilizing a small training dataset and tested with an enormous testing dataset to show the generalization capability of the scheme. The scheme, when employed for 40 files in the MIT/BIH arrhythmia database, could produce high classification accuracy in the range 95.51-96.12% and could outperform several competing algorithms.
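
    The frequency-domain feature idea can be sketched as below: the cross-spectral density of each beat against a template beat supplies the features, and a standard SVM (standing in for the LS-SVM) performs the three-class classification. The beats and labels are synthetic placeholders, not MIT/BIH records.

    # Cross-spectral-density features + SVM for ECG beat classification (sketch).
    import numpy as np
    from scipy.signal import csd
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    fs = 360                                       # MIT/BIH sampling rate (Hz)
    rng = np.random.default_rng(0)
    beats = rng.normal(size=(600, fs))             # placeholder 1-second beat windows
    labels = rng.integers(0, 3, size=600)          # 0 = normal, 1 = PVC, 2 = other
    template = beats[labels == 0].mean(axis=0)     # reference (normal) beat

    def csd_features(beat):
        f, pxy = csd(beat, template, fs=fs, nperseg=128)
        return np.abs(pxy)                         # cross-spectral magnitude as feature vector

    X = np.array([csd_features(b) for b in beats])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    print("CV accuracy:", cross_val_score(clf, X, labels, cv=5).mean().round(3))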

  17. A comprehensive analysis about the influence of low-level preprocessing techniques on mass spectrometry data for sample classification.

    Science.gov (United States)

    López-Fernández, Hugo; Reboiro-Jato, Miguel; Glez-Peña, Daniel; Fernández-Riverola, Florentino

    2014-01-01

    Matrix-Assisted Laser Desorption Ionisation Time-of-Flight (MALDI-TOF) is one of the high-throughput mass spectrometry technologies that produce data requiring extensive preprocessing before subsequent analyses. In this context, several low-level preprocessing techniques have been successfully developed for different tasks, including baseline correction, smoothing, normalisation, peak detection and peak alignment. In this work, we present a systematic comparison of different software packages aiding in the compulsory preprocessing of MALDI-TOF data. In order to guarantee the validity of our study, we test multiple configurations of each preprocessing technique, which are subsequently used to train a set of classifiers whose performance (kappa and accuracy) provides us with accurate information for the final comparison. Results from the experiments show the real impact of preprocessing techniques on classification, evidencing that MassSpecWavelet provides the best performance and that Support Vector Machines (SVM) are among the most accurate classifiers.

  18. Classification Techniques for Quantum-Limited and Classical-Intensity Images

    Science.gov (United States)

    1989-12-01

    Automatic classification of both quantum-limited and classical-intensity images is considered...based upon information gained through various measurement procedures. Increasingly, imaging technologies of all kinds are relied upon to provide this...

  19. Intelligent feature selection techniques for pattern classification of Lamb wave signals

    Energy Technology Data Exchange (ETDEWEB)

    Hinders, Mark K.; Miller, Corey A. [College of William and Mary, Department of Applied Science, Williamsburg, Virginia 23187-8795 (United States)

    2014-02-18

    Lamb wave interaction with flaws is a complex, three-dimensional phenomenon, which often frustrates signal interpretation schemes based on mode arrival time shifts predicted by dispersion curves. As the flaw severity increases, scattering and mode conversion effects will often dominate the time-domain signals, obscuring available information about flaws because multiple modes may arrive on top of each other. Even for idealized flaw geometries the scattering and mode conversion behavior of Lamb waves is very complex. Here, multi-mode Lamb waves in a metal plate are propagated across a rectangular flat-bottom hole in a sequence of pitch-catch measurements corresponding to the double crosshole tomography geometry. The flaw is sequentially deepened, with the Lamb wave measurements repeated at each flaw depth. Lamb wave tomography reconstructions are used to identify which waveforms have interacted with the flaw and thereby carry information about its depth. Multiple features are extracted from each of the Lamb wave signals using wavelets, which are then fed to statistical pattern classification algorithms that identify flaw severity. In order to achieve the highest classification accuracy, an optimal feature space is required but it’s never known a priori which features are going to be best. For structural health monitoring we make use of the fact that physical flaws, such as corrosion, will only increase over time. This allows us to identify feature vectors which are topologically well-behaved by requiring that sequential classes “line up” in feature vector space. An intelligent feature selection routine is illustrated that identifies favorable class distributions in multi-dimensional feature spaces using computational homology theory. Betti numbers and formal classification accuracies are calculated for each feature space subset to establish a correlation between the topology of the class distribution and the corresponding classification accuracy.

  20. On the Performance of Classification Techniques with Pixel Removal Applied to Digit Recognition

    Directory of Open Access Journals (Sweden)

    Jozette V. Roberts

    2016-08-01

    Full Text Available The successive loss of the outermost pixel values or frames in the digital representation of handwritten digits is postulated to have an increasing impact on the degree of accuracy of categorizations of these digits. This removal of frames is referred to as trimming. The first few frames do not contain significant amounts of information and the impact on accuracy should be negligible. As more frames are trimmed, the impact becomes more significant on the ability of each classification model to correctly identify digits. This study focuses on the effects of the trimming of frames of pixels, on the ability of the Recursive Partitioning and Classification Trees method, the Naive Bayes method, the k-Nearest Neighbor method and the Support Vector Machine method in the categorization of handwritten digits. The results from the application of the k-Nearest Neighbour and Recursive Partitioning and Classification Trees methods exemplified the white noise effect in the trimming of the first few frames whilst the Naive Bayes and the Support Vector Machine did not. With respect to time all models saw a relative decrease in time from the initial dataset. The k-Nearest Neighbour method had the greatest decreases whilst the Support Vector Machine had significantly fluctuating times.
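
    The trimming experiment is easy to reproduce in miniature; the sketch below uses the 8x8 scikit-learn digits as a stand-in for the handwritten-digit data used in the paper and re-measures k-nearest-neighbour accuracy as outer frames of pixels are removed.

    # Frame trimming of digit images followed by k-NN classification (sketch).
    from sklearn.datasets import load_digits
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    digits = load_digits()
    images, y = digits.images, digits.target            # images are 8x8 grayscale arrays

    for n_trim in range(0, 3):                          # trim 0, 1, 2 outer frames
        size = 8 - 2 * n_trim
        trimmed = images[:, n_trim:8 - n_trim, n_trim:8 - n_trim].reshape(-1, size * size)
        acc = cross_val_score(KNeighborsClassifier(n_neighbors=3), trimmed, y, cv=5).mean()
        print(f"{n_trim} frame(s) trimmed -> {size}x{size} pixels, accuracy {acc:.3f}")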

  1. Networks of Professional Supervision

    Science.gov (United States)

    Annan, Jean; Ryba, Ken

    2013-01-01

    An ecological analysis of the supervisory activity of 31 New Zealand school psychologists examined simultaneously the theories of school psychology, supervision practices, and the contextual qualities that mediated participants' supervisory actions. The findings indicated that the school psychologists worked to achieve the supervision goals of…

  2. Diversity in supervision (Forskellighed i supervision)

    DEFF Research Database (Denmark)

    Petersen, Birgitte; Beck, Emma

    2009-01-01

    Impressions and trends from the second Danish conference on supervision, held at Københavns Universitet in October 2008.

  3. Experiments in Virtual Supervision.

    Science.gov (United States)

    Walker, Rob

    This paper examines the use of First Class conferencing software to create a virtual culture among research students and as a vehicle for supervision and advising. Topics discussed include: computer-mediated communication and research; entry to cyberculture, i.e., research students' induction into the research community; supervision and the…

  4. Study on the Key-Frame Extraction Technique in Construction Supervision Systems

    Institute of Scientific and Technical Information of China (English)

    王毅霞; 崔大力

    2013-01-01

    Construction supervision means that a supervision company monitors the whole construction process by certain means so that the construction company completes the tasks specified in the contract with the required quality and quantity and fulfils its contractual duties. A frame is the smallest indivisible unit of a video, and a key frame is the picture in two-dimensional animation that most emphasizes the characteristics of the subject; key-frame extraction can clearly provide the key points we need in a shot. With the growing number of construction projects and concerns about construction quality, construction supervision is becoming more and more important. The key-frame extraction technique for construction supervision video data files can provide the scenes and actions needed by construction supervision and should be studied further for better application. This paper explores the key-frame extraction technique in construction supervision systems.

  5. Pilot Application Study of a Domestic Waste Classification Collection and Transportation System and Informatization Supervision Technology in Pudong New Area

    Institute of Scientific and Technical Information of China (English)

    彭斌

    2013-01-01

    Based on the domestic waste classification pilot research, an operation mode of classified deposition, classified collection and classified transport of food residue and other waste was established in the pilot area. Through a comparative study of domestic waste classification supervision technologies and methods, a supervision system for waste collection and transportation in the pilot area was established using Internet of Things, GIS and vehicle GPS technologies, realizing informatized supervision of the whole classification process.

  6. A Novel Hybrid Dimension Reduction Technique for Undersized High Dimensional Gene Expression Data Sets Using Information Complexity Criterion for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Esra Pamukçu

    2015-01-01

    Full Text Available Gene expression data typically are large, complex, and highly noisy. Their dimension is high, with several thousand genes (i.e., features) but only a limited number of observations (i.e., samples). Although the classical principal component analysis (PCA) method is widely used as a first standard step in dimension reduction and in supervised and unsupervised classification, it suffers from several shortcomings in the case of data sets involving undersized samples, since the sample covariance matrix degenerates and becomes singular. In this paper we address these limitations within the context of probabilistic PCA (PPCA) by introducing and developing a novel approach using the maximum entropy covariance matrix and its hybridized smoothed covariance estimators. To reduce the dimensionality of the data and to choose the number of probabilistic PCs (PPCs) to be retained, we further introduce and develop the celebrated Akaike information criterion (AIC), the consistent Akaike information criterion (CAIC), and the information theoretic measure of complexity (ICOMP) criterion of Bozdogan. Six publicly available undersized benchmark data sets were analyzed to show the utility, flexibility, and versatility of our approach with hybridized smoothed covariance matrix estimators, which do not degenerate, to perform PPCA to reduce the dimension and to carry out supervised classification of cancer groups in high dimensions.

  7. Digital image processing techniques for enhancement and classification of SeaMARC II side scan sonar imagery

    Science.gov (United States)

    Reed, Thomas Beckett, IV; Hussong, Donald

    1989-06-01

    The recent growth in the production rate of digital side scan sonar images, coupled with the rapid expansion of systematic seafloor exploration programs, has created a need for fast and quantitative means of processing seafloor imagery. Computer-aided analytical techniques fill this need. A number of numerical techniques used to enhance and classify imagery produced by SeaMARC II, a long-range combination side scan sonar, and bathymetric seafloor mapping system are documented. Three categories of techniques are presented: (1) preprocessing corrections (radiometric and geometric), (2) feature extraction, and (3) image segmentation and classification. An introduction to the concept of "feature vectors" is provided, along with an explanation of the method of evaluation of a texture feature vector based upon gray-level co-occurrence matrices (GLCM). An alternative to the a priori texel (texture element) subdivision of images is presented in the form of region growing and texture analysis (REGATA). This routine provides a texture map of spatial resolution superior to that obtainable with arbitrarily assigned texel boundaries and minimizes the possibility of mixed texture signals due to the combination of two or more textures in an arbitrarily assigned texel. Computer classification of these textural features extracted via the GLCM technique results in transformation of images into maps of image texture. These maps may either be interpreted in terms of the theoretical relationships shown between texture signatures and wavelength or converted to geologic maps by correlation of texture signatures with ground truth data. These techniques are applied to SeaMARC II side scan sonar imagery from a variety of geologic environments, including lithified and nonlithified sedimentary formations, volcanic and sedimentary debris flows, and crystalline basaltic outcrops. Application of the above processing steps provided not only superior images for both subjective and quantitative

  8. Lung Cancer Early Diagnosis Using Some Data Mining Classification Techniques: A Survey

    Directory of Open Access Journals (Sweden)

    Thangaraju P

    2014-06-01

    Full Text Available Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Data mining is primarily used to meet this requirement and thus finds applications in diverse fields such as retail, finance, communication, marketing organizations and medicine. Data mining plays an important role in healthcare organizations because, with the growth of population and dangerous deadly diseases like cancer, SARS, leprosy and HIV, lung cancer is one of the most dangerous diseases. This survey covers appropriate medical image mining, data preprocessing, feature extraction, rule generation and classification, and provides a basic framework for further improvement in medical diagnosis.

  10. Restricted Boltzmann machines based oversampling and semi-supervised learning for false positive reduction in breast CAD.

    Science.gov (United States)

    Cao, Peng; Liu, Xiaoli; Bao, Hang; Yang, Jinzhu; Zhao, Dazhe

    2015-01-01

    The false-positive reduction (FPR) is a crucial step in the computer aided detection system for the breast. The issues of imbalanced data distribution and the limitation of labeled samples complicate the classification procedure. To overcome these challenges, we propose oversampling and semi-supervised learning methods based on the restricted Boltzmann machines (RBMs) to solve the classification of imbalanced data with a few labeled samples. To evaluate the proposed method, we conducted a comprehensive performance study and compared its results with the commonly used techniques. Experiments on benchmark dataset of DDSM demonstrate the effectiveness of the RBMs based oversampling and semi-supervised learning method in terms of geometric mean (G-mean) for false positive reduction in Breast CAD.

  11. Supervision as a form of teaching in adult special education (Supervision som undervisningsform i voksenspecialundervisningen)

    DEFF Research Database (Denmark)

    Kristensen, René

    2000-01-01

    Supervision as a form of teaching in adult special education. Process work in the teaching of adults.

  12. Feature extraction and classification for EEG signals using wavelet transform and machine learning techniques.

    Science.gov (United States)

    Amin, Hafeez Ullah; Malik, Aamir Saeed; Ahmad, Rana Fayyaz; Badruddin, Nasreen; Kamel, Nidal; Hussain, Muhammad; Chooi, Weng-Tink

    2015-03-01

    This paper describes a discrete wavelet transform-based feature extraction scheme for the classification of EEG signals. In this scheme, the discrete wavelet transform is applied to EEG signals and the relative wavelet energy is calculated in terms of the detailed coefficients and the approximation coefficients of the last decomposition level. The extracted relative wavelet energy features are passed to classifiers for the classification purpose. The EEG dataset employed for the validation of the proposed method consisted of two classes: (1) EEG signals recorded during a complex cognitive task, Raven's Advanced Progressive Matrices test, and (2) EEG signals recorded in the resting condition with eyes open. The performance of four different classifiers was evaluated with four performance measures, i.e., accuracy, sensitivity, specificity and precision values. An accuracy above 98% was achieved by the support vector machine, multi-layer perceptron and K-nearest neighbor classifiers with the approximation (A4) and detailed coefficients (D4), which represent the frequency ranges of 0.53-3.06 and 3.06-6.12 Hz, respectively. The findings of this study demonstrate that the proposed feature extraction approach has the potential to classify EEG signals recorded during a complex cognitive task by achieving a high accuracy rate.
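
    The feature-extraction step can be sketched with PyWavelets as below: a 4-level DWT is applied to each EEG segment and the relative wavelet energies of the last-level approximation (A4) and detail (D4) coefficients form the feature vector. The signals and labels are synthetic placeholders, and the choice of the db4 wavelet is an assumption.

    # Relative wavelet energy (A4, D4) features + SVM (sketch).
    import numpy as np
    import pywt
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    signals = rng.normal(size=(200, 1024))               # placeholder EEG segments
    labels = rng.integers(0, 2, size=200)                # task vs. rest (assumed labels)

    def relative_wavelet_energy(sig, wavelet="db4", level=4):
        coeffs = pywt.wavedec(sig, wavelet, level=level) # [A4, D4, D3, D2, D1]
        energies = np.array([np.sum(c ** 2) for c in coeffs])
        return (energies / energies.sum())[:2]           # keep relative energy of A4 and D4

    X = np.array([relative_wavelet_energy(s) for s in signals])
    print("CV accuracy:", cross_val_score(SVC(), X, labels, cv=5).mean().round(3))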

  13. Automated Classification of Heritage Buildings for As-Built BIM Using Machine Learning Techniques

    Science.gov (United States)

    Bassier, M.; Vergauwen, M.; Van Genechten, B.

    2017-08-01

    Semantically rich three-dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the required information to varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVM) are proposed for the classification. The algorithm is evaluated using real data of a variety of existing buildings. The results prove that the used classifier recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.

  14. Critical analysis of classification techniques for polarimetric synthetic aperture radar data

    Directory of Open Access Journals (Sweden)

    Vikas Mittal

    2016-04-01

    Full Text Available Fully polarimetric SAR data, known as PolSAR, contain information in terms of microwave energy backscattered through different scattering mechanisms (surface, double-bounce and volume scattering) by targets on the land surface. The information on these scattering mechanisms differs across features. Similarly, different classifiers have different capabilities when it comes to identifying the targets corresponding to these scattering mechanisms. Extracting different features and understanding the role of the classifier are therefore important for identifying which feature is most suitable with which classifier for land cover classification. The selection of suitable features and their combinations has always been an active area of research for the development of advanced classification algorithms. Fully polarimetric data have their own advantages because the different channels give distinct scattering features for various land covers. Therefore, the first-hand statistics HH, HV and VV of PolSAR data, along with their ratios and linear combinations, should be investigated to explore their importance vis-à-vis the relevant classifier for land management at the global scale. It has been observed that, individually, the first-hand statistics yield low accuracies, and their ratios do not improve the results either. However, improved accuracies are achieved when these natural features are stacked together.

  15. IMAGE RECONSTRUCTION AND OBJECT CLASSIFICATION IN CT IMAGING SYSTEM

    Institute of Scientific and Technical Information of China (English)

    张晓明; 蒋大真; et al.

    1995-01-01

    By obtaining a feasible filter function, reconstructed images can be obtained with linear interpolation and filtered backprojection techniques. Considering the gray-level and spatial correlation information of the neighbourhood of each pixel, a new supervised classification method is put forward for the reconstructed images, and an experiment with a noisy image is done; the result shows that the method is feasible and accurate compared with ideal phantoms.

  16. Pattern classification of Myo-Electrical signal during different Maximum Voluntary Contractions: A study using BSS techniques

    Science.gov (United States)

    Naik, Ganesh R.; Kumar, Dinesh K.; Arjunan, Sridhar P.

    2010-01-01

    The presence of noise and cross-talk from closely located and simultaneously active muscles is exaggerated when the level of muscle contraction is very low. Because of this, current applications of the surface electromyogram (sEMG) are infeasible and unreliable for pattern classification. This research reports a new sEMG technique using Independent Component Analysis (ICA). The technique uses blind source separation (BSS) methods to classify the patterns of myoelectric signals during different Maximum Voluntary Contractions (MVCs) at different low-level finger movements. The results of the experiments indicate that sEMG patterns obtained using ICA are a reliable measure across the different levels of MVC. The authors propose that ICA of sEMG is a useful indicator of muscle properties and of the level of muscle activity.
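
    A minimal sketch of the BSS step with FastICA follows: mixed surface-EMG channels are unmixed into independent components before any pattern classification. The mixing here is simulated; real multi-channel recordings would replace the synthetic sources.

    # FastICA blind source separation of simulated multi-channel sEMG (sketch).
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    t = np.linspace(0, 2, 2000)
    sources = np.c_[np.sin(40 * t), np.sign(np.sin(7 * t)), rng.laplace(size=t.size)]
    mixing = rng.normal(size=(4, 3))                      # 4 electrodes observing 3 muscle sources
    semg = sources @ mixing.T + 0.05 * rng.normal(size=(t.size, 4))

    ica = FastICA(n_components=3, random_state=0)
    recovered = ica.fit_transform(semg)                   # estimated independent components
    print("recovered components shape:", recovered.shape)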

  17. An on-chip instrument for white blood cells classification based on a lens-less shadow imaging technique

    Science.gov (United States)

    Wang, Runlong; Su, Dong

    2017-01-01

    Routine blood tests provide important basic information for disease diagnoses. The proportions of three subtypes of white blood cells (WBCs), namely neutrophils, monocytes and lymphocytes, are key information for disease diagnosis. However, current instruments for routine blood tests, such as blood cell analyzers, flow cytometers, and optical microscopes, are cumbersome, time consuming and expensive. To make a smaller, automatic, low-cost blood cell analyzer, much research has focused on a technique called lens-less shadow imaging, which can obtain microscopic images of cells in a lens-less system. Nevertheless, the efficiency of this imaging system is not satisfactory because of two problems: low resolution and imaging diffraction phenomena. In this paper, a novel method of classifying cells with the shadow imaging technique is proposed. It can be used for the classification of the three subtypes of WBCs, and the correlation of the classification results between the proposed system and the reference system (BC-5180, Mindray) was 0.93. Moreover, the instrument was only 10 × 10 × 10 cm, and the cost was less than $100. Based on the lens-free shadow imaging technology, the main hardware could be integrated at the chip scale, and the device can be called an on-chip instrument. PMID:28350891

  18. First results on a process-oriented rain area classification technique using Meteosat Second Generation SEVIRI nighttime data

    Directory of Open Access Journals (Sweden)

    B. Thies

    2008-04-01

    Full Text Available A new technique for process-oriented rain area classification using Meteosat Second Generation SEVIRI nighttime data is introduced. It is based on a combination of the Advective Convective Technique (ACT which focuses on precipitation areas connected to convective processes and the Rain Area Delineation Scheme during Nighttime (RADS-N a new technique for the improved detection of stratiform precipitation areas (e.g. in connection with mid-latitude frontal systems. The ACT which uses positive brightness temperature differences between the water vapour (WV and the infrared (IR channels (ΔTWV-IR for the detection of convective clouds and connected precipitating clouds has been transferred from Meteosat First Generation (MFG Metesoat Visible and Infra-Red Imager radiometer (MVIRI to Meteosat Second Generation (MSG Spinning Enhanced Visible and InfraRed Imager (SEVIRI. RADS-N is based on the new conceptual model that precipitating cloud areas are characterised by a large cloud water path (cwp and the presence of ice particles in the upper part of the cloud. The technique considers information about both parameters inherent in the channel differences ΔT3.9-10.8, ΔT3.9-7.3, ΔT8.7-10.8, and ΔT10.8-12.1, to detect potentially precipitating cloud areas. All four channel differences are used to gain implicit knowledge about the cwp. ΔT8.7-10.8 and ΔT10.8-12.1 are additionally considered to gain information about the cloud phase. First results of a comparison study between the classified rain areas and corresponding ground based radar data for precipitation events in connection with a cold front occlusion show encouraging performance of the new proposed process-oriented rain area classification scheme.

  19. Determining the adulteration of spices with Sudan I-II-III-IV dyes by UV-visible spectroscopy and multivariate classification techniques.

    Science.gov (United States)

    Di Anibal, Carolina V; Odena, Marta; Ruisánchez, Itziar; Callao, M Pilar

    2009-08-15

    We propose a very simple and fast method for detecting Sudan dyes (I, II, III and IV) in commercial spices, based on characterizing samples through their UV-visible spectra and using multivariate classification techniques to establish classification rules. We applied three classification techniques: K-Nearest Neighbour (KNN), Soft Independent Modelling of Class Analogy (SIMCA) and Partial Least Squares Discriminant Analysis (PLS-DA). A total of 27 commercial spice samples (turmeric, curry, hot paprika and mild paprika) were analysed by chromatography (HPLC-DAD) to check that they were free of Sudan dyes. These samples were then spiked with Sudan dyes (I, II, III and IV) up to a concentration of 5 mg L(-1). Our final data set consisted of 135 samples distributed in five classes: samples without Sudan dyes, samples spiked with Sudan I, samples spiked with Sudan II, samples spiked with Sudan III and samples spiked with Sudan IV. Classification results were good and satisfactory using the classification techniques mentioned above: 99.3%, 96.3% and 90.4% of correct classification with PLS-DA, KNN and SIMCA, respectively. It should be pointed out that with SIMCA, there are no real classification errors as no samples were assigned to the wrong class: they were just not assigned to any of the pre-defined classes.
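
    One of the classifiers named above (KNN) can be sketched as follows, assuming scikit-learn; the simulated spectra, the number of wavelengths and the neighbour count are placeholders, not the paper's data or settings.

      # KNN classification of UV-visible spectra into five classes (no dye, Sudan I-IV).
      import numpy as np
      from sklearn.model_selection import train_test_split
      from sklearn.neighbors import KNeighborsClassifier

      rng = np.random.default_rng(1)
      n_per_class, n_wavelengths = 27, 200
      classes = ["none", "sudan_I", "sudan_II", "sudan_III", "sudan_IV"]
      X = np.vstack([rng.normal(loc=i, scale=1.0, size=(n_per_class, n_wavelengths))
                     for i in range(len(classes))])        # simulated spectra
      y = np.repeat(classes, n_per_class)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
      knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
      print("correct classification rate:", knn.score(X_te, y_te))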

  20. Improved semi-supervised online boosting for object tracking

    Science.gov (United States)

    Li, Yicui; Qi, Lin; Tan, Shukun

    2016-10-01

    The advantage of an online semi-supervised boosting method, which treats the object tracking problem as a classification problem, is that it trains a binary classifier from labeled and unlabeled examples, with appropriate object features selected based on real-time changes in the object. However, online semi-supervised boosting faces one key problem: traditional self-training, which uses the classification results to update the classifier itself, often leads to drifting or tracking failure because of the error accumulated during each update of the tracker. To overcome these disadvantages of semi-supervised online boosting for object tracking, the contribution of this paper is an improved online semi-supervised boosting method in which the learning process is guided by positive (P) and negative (N) constraints, termed P-N constraints, which restrict the labeling of the unlabeled samples. First, we train the classifier by online semi-supervised boosting. Then, this classifier is used to process the next frame. Finally, the classification is analyzed by the P-N constraints, which verify whether the labels assigned to unlabeled data by the classifier are in line with the assumptions made about positive and negative samples. The proposed algorithm can effectively improve the discriminative ability of the classifier and significantly alleviate the drifting problem in tracking applications. In the experiments, we demonstrate real-time tracking on several challenging test sequences, where our tracker outperforms other related online tracking methods and achieves promising tracking performance.

  1. Equality of Opportunity in Supervised Learning

    OpenAIRE

    Hardt, Moritz; Price, Eric; Srebro, Nathan

    2016-01-01

    We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to...

  2. Applying a Machine Learning Technique to Classification of Japanese Pressure Patterns

    Directory of Open Access Journals (Sweden)

    H Kimura

    2009-04-01

    In climate research, pressure patterns are often very important. When climatologists need to know the days with a specific pressure pattern, for example "low pressure over western areas of Japan and high pressure over eastern areas of Japan" (Japanese winter-type weather), they have to visually check a huge number of surface weather charts. To overcome this problem, we propose an automatic classification system using a support vector machine (SVM), which is a machine-learning method. We attempted to classify pressure patterns into two classes: "winter type" and "non-winter type". For both the training and test datasets, we used the JRA-25 dataset from 1981 to 2000. An experimental evaluation showed that our method obtained an F-measure greater than 0.8. We noted that variations in results were based on differences in training datasets.
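
    A hedged sketch of such a binary SVM classifier, assuming scikit-learn; the flattened pressure grids and labels below are random placeholders rather than JRA-25 reanalysis data, and the kernel settings are assumptions.

      # SVM separating "winter type" from "non-winter type" pressure patterns.
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import f1_score

      rng = np.random.default_rng(2)
      n_days, n_gridpoints = 400, 20 * 24        # hypothetical grid flattened per day
      X = rng.standard_normal((n_days, n_gridpoints))
      y = rng.integers(0, 2, size=n_days)        # 1 = winter type, 0 = non-winter type

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)
      clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
      print("F-measure:", f1_score(y_te, clf.predict(X_te)))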

  3. Building a Classification Model for Enrollment In Higher Educational Courses using Data Mining Techniques

    OpenAIRE

    Saini, Priyanka

    2014-01-01

    Data mining is the process of extracting useful patterns from huge databases, and many data mining techniques are used for mining these patterns. Recently, one remarkable fact in higher educational institutes is the rapid growth of data, and this educational data is expanding quickly without any advantage to educational management. The main aim of the management is to refine the education standard; therefore, by applying various data mining techniques on this data one ca...

  4. Clinical Supervision in Denmark

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard

    Data from the Danish study of psychotherapists' professional development, collected using the DPCCQ. The presentation focuses on supervision (received, given, and training in) among Danish psychologists working psychotherapeutically....

  5. Supervision af psykoterapi

    DEFF Research Database (Denmark)

    SUPERVISION OF PSYCHOTHERAPY occupies a central position in the training and development of psychotherapists. Despite several points of similarity with psychotherapy, teaching and consultation, psychotherapy supervision is an independent field of practice. The supervisor must, in addition to being a trained psychotherapist, know... the framework of supervision and its place in relation to organisation and society. A number of chapters concern the supervisor's tasks, roles and control function, supervision seen from the supervisee's perspective, and reflections on relationships and processes in supervision. The advantages and disadvantages of the... different ways in which a case can be presented are discussed. The first part of the book concludes with reflections on the ethical aspects of psychotherapy supervision. The second part of the book deals with the special conditions that apply to supervision of a number of specialised forms of treatment or of psychotherapy with particular...

  6. Psykoterapi og supervision

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard

    2014-01-01

    The chapter describes the functions of supervision in relation to psychotherapy. Supervision of psychotherapy generally refers to a psychotherapist consulting an often more experienced colleague (the supervisor) in order to discuss a specific, ongoing psychotherapeutic treatment. The purpose... is to promote the professional development of this professional (the psychotherapist) and to ensure the quality of the treatment. It is explained why supervision is an important part of the psychotherapist's profession, and it is shown how supervision, in addition to supporting professional development, is also an important tool in the quality assurance of psychotherapy. After a discussion of some ethical aspects of supervision, a few research results concerning psychotherapy supervision of Danish psychologists are finally presented....

  7. Supervision and group dynamics

    DEFF Research Database (Denmark)

    Hansen, Søren; Jensen, Lars Peter

    2004-01-01

    An important aspect of the problem-based and project-organized study at Aalborg University is the supervision of the project groups. At the basic education (first year) it is stated in the curriculum that part of the supervisors' job is to deal with group dynamics. This is due to the experience... that many students have difficulties with practical issues such as collaboration, communication, and project management. Most supervisors either ignore this demand, because they do not find it important, or they find it frustrating, because they do not know how to supervise group dynamics... as well as at Aalborg University. The first visible result has been participating supervisors telling us that the course has inspired them to try supervising group dynamics in the future. This paper will explore some aspects of supervising group dynamics as well as how to develop the Aalborg model...

  8. Parameter optimization of image classification techniques to delineate crowns of coppice trees on UltraCam-D aerial imagery in woodlands

    Science.gov (United States)

    Erfanifard, Yousef; Stereńczak, Krzysztof; Behnia, Negin

    2014-01-01

    The need to estimate optimal parameters is a drawback of some classification techniques, as the parameter values affect their performance for a given dataset and can reduce classification accuracy. This study aimed to optimize the combination of effective parameters of support vector machine (SVM), artificial neural network (ANN), and object-based image analysis (OBIA) classification techniques by the Taguchi method. The optimized techniques were applied to delineate crowns of Persian oak coppice trees on UltraCam-D very high spatial resolution aerial imagery in the Zagros semiarid woodlands, Iran. The imagery was classified and the maps were assessed by the receiver operating characteristic curve and other performance metrics. The results showed that Taguchi is a robust approach to optimizing the combination of effective parameters in these image classification techniques. The area under the curve (AUC) showed that the optimized OBIA could discriminate tree crowns on the imagery well (AUC = 0.897), while SVM and ANN yielded slightly lower AUC values of 0.819 and 0.850, respectively. The indices of accuracy (0.999) and precision (0.999) and the performance metrics of specificity (0.999) and sensitivity (0.999) for the optimized OBIA were higher than those of the other techniques. The optimization of effective parameters of image classification techniques by the Taguchi method thus provided encouraging results for discriminating the crowns of Persian oak coppice trees on UltraCam-D aerial imagery in the Zagros semiarid woodlands.

  9. A new supervised learning algorithm for spiking neurons.

    Science.gov (United States)

    Xu, Yan; Zeng, Xiaoqin; Zhong, Shuiming

    2013-06-01

    The purpose of supervised learning with temporal encoding for spiking neurons is to make the neurons emit a specific spike train encoded by the precise firing times of spikes. If only the running time is considered, supervised learning for a spiking neuron is equivalent to distinguishing the times of desired output spikes from all other times during the running process of the neuron by adjusting synaptic weights, which can be regarded as a classification problem. Based on this idea, this letter proposes a new supervised learning method for spiking neurons with temporal encoding; it first transforms the supervised learning into a classification problem and then solves the problem by using the perceptron learning rule. The experimental results show that the proposed method has higher learning accuracy and efficiency than the existing learning methods, so it is more powerful for solving complex and real-time problems.
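
    The core idea, treating each time step as a binary decision (desired spike or not) and adjusting weights with the classical perceptron rule, can be sketched as below; the feature encoding, learning rate and threshold are illustrative assumptions and not the letter's temporal-encoding scheme.

      # Perceptron-style weight updates toward a desired spike train.
      import numpy as np

      rng = np.random.default_rng(3)
      n_steps, n_synapses = 200, 10
      X = rng.random((n_steps, n_synapses))           # presynaptic drive per time step
      desired = np.zeros(n_steps)
      desired[[40, 90, 150]] = 1                      # desired firing times

      w, lr, threshold = np.zeros(n_synapses), 0.1, 0.5
      for _ in range(100):                            # training epochs
          for t in range(n_steps):
              fired = float(X[t] @ w > threshold)     # actual output at time t
              w += lr * (desired[t] - fired) * X[t]   # perceptron update
      print("learned weights:", np.round(w, 3))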

  10. Study on sanitation supervision with quantitative classification management for ship food supply companies at Shandong ports

    Institute of Scientific and Technical Information of China (English)

    邵柏; 杨冰; 刘明杰; 李磊

    2012-01-01

    Objective: To validate a method of quantitative, classified management for the sanitation supervision of port-based ship food supply companies. Methods: The hygiene licensing review and routine quantitative classified supervision scoring forms for port ship food supply companies were redesigned, and food hygiene credit grades were assigned to 65 such companies across the province. Results: After evaluation, 3 companies (4.62%) were rated level A, 46 (70.76%) level B, and 16 (24.62%) level C. Conclusion: Implementing quantitative classified management of sanitation supervision for port ship food supply companies can optimize health supervision resources.

  11. Olive oil sensory defects classification with data fusion of instrumental techniques and multivariate analysis (PLS-DA).

    Science.gov (United States)

    Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga

    2016-07-15

    Three instrumental techniques, headspace-mass spectrometry (HS-MS), mid-infrared spectroscopy (MIR) and UV-visible spectrophotometry (UV-vis), have been combined to classify virgin olive oil samples based on the presence or absence of sensory defects. The reference sensory values were provided by an official taste panel. Different data fusion strategies were studied to improve the discrimination capability compared to using each instrumental technique individually. A general model was applied to discriminate high-quality non-defective olive oils (extra-virgin) and the lowest-quality olive oils considered non-edible (lampante). A specific identification of key off-flavours, such as musty, winey, fusty and rancid, was also studied. The data fusion of the three techniques improved the classification results in most of the cases. Low-level data fusion was the best strategy to discriminate musty, winey and fusty defects, using HS-MS, MIR and UV-vis, and the rancid defect using only HS-MS and MIR. The mid-level data fusion approach using partial least squares-discriminant analysis (PLS-DA) scores was found to be the best strategy for defective vs non-defective and edible vs non-edible oil discrimination. However, the data fusion did not sufficiently improve the results obtained by a single technique (HS-MS) to classify non-defective classes. These results indicate that instrumental data fusion can be useful for the identification of sensory defects in virgin olive oils.
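
    Low-level data fusion and PLS-DA can be sketched roughly as follows, assuming scikit-learn; PLS-DA is approximated here by PLS regression on one-hot class labels, and all matrices, block sizes and component counts are synthetic placeholders rather than the study's data.

      # Concatenate HS-MS, MIR and UV-vis feature blocks (low-level fusion), then PLS-DA.
      import numpy as np
      from sklearn.cross_decomposition import PLSRegression

      rng = np.random.default_rng(4)
      n = 60
      ms, mir, uvvis = rng.random((n, 50)), rng.random((n, 200)), rng.random((n, 80))
      X = np.hstack([ms, mir, uvvis])            # one wide fused matrix
      y = rng.integers(0, 2, size=n)             # 1 = defective, 0 = non-defective
      Y = np.column_stack([y == 0, y == 1]).astype(float)   # one-hot targets

      pls = PLSRegression(n_components=5).fit(X, Y)
      pred = pls.predict(X).argmax(axis=1)       # class with the largest predicted score
      print("training accuracy:", (pred == y).mean())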

  12. Discrete classification technique applied to TV advertisements liking recognition system based on low-cost EEG headsets.

    Science.gov (United States)

    Soria Morillo, Luis M; Alvarez-Garcia, Juan A; Gonzalez-Abril, Luis; Ortega Ramírez, Juan A

    2016-07-15

    In this paper a new approach is applied to the area of marketing research. The aim is to recognize how brain activity responds during the viewing of short video advertisements using discrete classification techniques. By means of low-cost electroencephalography (EEG) devices, the activation level of some brain regions has been studied while the ads are shown to users. One may ask how useful neuroscience knowledge is in marketing, what neuroscience could provide to the marketing sector, and why this approach can improve accuracy and final user acceptance compared with other works. Using discrete techniques over the EEG frequency bands of a generated dataset, C4.5, an ANN, and a new recognition system based on Ameva, a discretization algorithm, are applied to obtain the score given by subjects to each TV ad. The proposed technique reaches more than 75% accuracy, which is an excellent result considering the type of EEG sensors used in this work. Furthermore, the time consumption of the proposed algorithm is reduced by up to 30% compared with the other techniques presented in this paper. This brings about a battery lifetime improvement on the devices where the algorithm runs, extending the experience in the ubiquitous context where the new approach has been tested.

  13. Cross-classification analysis using prediction logic versus theory-testing logic : Comments on the use of the DEL-technique

    NARCIS (Netherlands)

    Kok, R.A.W.; Postma, T.J.B.M.; Steerneman, A.G.M.

    2008-01-01

    Without acknowledging the paradigm difference between testing theory and predicting events, researchers in the field of management and organization continue to use the DEL-technique as a promising technique to evaluate theory based on cross-classification data analysis. We address the purpose and in

  14. Two Approaches to Clinical Supervision.

    Science.gov (United States)

    Anderson, Eugene M.

    Criteria are established for a definition of "clinical supervision" and the effectiveness of such supervisory programs in a student teaching context are considered. Two differing genres of clinical supervision are constructed: "supervision by pattern analysis" is contrasted with "supervision by performance objectives." An outline of procedural…

  15. Counselor Supervision: A Consumer's Guide.

    Science.gov (United States)

    Yager, Geoffrey G.; Littrell, John M.

    This guide attempts to solve problems caused when a certain designated "brand" of supervision is forced on the counselor trainee with neither choice nor checklist of important criteria. As a tentative start on a guide to supervision the paper offers the following: a definition of supervision; a summary of the various types of supervision; a…

  16. Mineral classification map using MF and SAM techniques: A case study in the Nohwa Island, Korea

    Energy Technology Data Exchange (ETDEWEB)

    Son, Young-Sun; Yoon, Wang-Jung [Department of Energy and Resources Engineering, Chonnam National University, Gwangju 500-757 (Korea, Republic of)

    2015-03-10

    The purpose of this study is to map the pyrophyllite distribution at the surface of the Nohwa deposit, Korea, using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data. For this, a combined Spectral Angle Mapper (SAM) and Matched Filtering (MF) technique was applied. The regional distribution of high-grade and low-grade pyrophyllite in the Nohwa deposit area could be differentiated by this method. The results of this study show that ASTER data analysis using a combination of the SAM and MF techniques will assist in the exploration of pyrophyllite at the exposed surface.
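
    The spectral angle mapper itself reduces to a small computation, sketched below with NumPy; the nine-band cube, the pyrophyllite reference spectrum and the angle threshold are hypothetical stand-ins for the ASTER data and endmembers used in the study.

      # Spectral Angle Mapper: angle between each pixel spectrum and a reference spectrum.
      import numpy as np

      def spectral_angle(pixels, reference):
          # pixels: (n_pixels, n_bands); reference: (n_bands,)
          dot = pixels @ reference
          norms = np.linalg.norm(pixels, axis=1) * np.linalg.norm(reference)
          return np.arccos(np.clip(dot / norms, -1.0, 1.0))   # radians

      rng = np.random.default_rng(5)
      cube = rng.random((100, 9))            # e.g. 100 pixels, 9 VNIR/SWIR bands
      pyrophyllite_ref = rng.random(9)       # placeholder endmember spectrum
      angles = spectral_angle(cube, pyrophyllite_ref)
      candidates = np.where(angles < 0.1)[0] # arbitrary angle threshold
      print(len(candidates), "candidate pixels")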

  17. Feature-Free Activity Classification of Inertial Sensor Data With Machine Vision Techniques: Method, Development, and Evaluation.

    Science.gov (United States)

    Dominguez Veiga, Jose Juan; O'Reilly, Martin; Whelan, Darragh; Caulfield, Brian; Ward, Tomas E

    2017-08-04

    Inertial sensors are one of the most commonly used sources of data for human activity recognition (HAR) and exercise detection (ED) tasks. The time series produced by these sensors are generally analyzed through numerical methods. Machine learning techniques such as random forests or support vector machines are popular in this field for classification efforts, but they need to be supported through the isolation of a potentially large number of additionally crafted features derived from the raw data. This feature preprocessing step can involve nontrivial digital signal processing (DSP) techniques. However, in many cases, the researchers interested in this type of activity recognition problem do not possess the necessary technical background for this feature-set development. The study aimed to present a novel application of established machine vision methods to provide interested researchers with an easier entry path into the HAR and ED fields, by removing the need for deep DSP skills through transfer learning, that is, by using a pretrained convolutional neural network (CNN) developed for machine vision purposes for the exercise classification effort. The new method should simply require researchers to generate plots of the signals that they would like to build classifiers with, store them as images, and then place them in folders according to their training label before retraining the network. We applied a CNN, an established machine vision technique, to the task of ED. Tensorflow, a high-level framework for machine learning, was used to facilitate infrastructure needs. Simple time series plots generated directly from accelerometer and gyroscope signals are used to retrain an openly available neural network (Inception), originally developed for machine vision tasks. Data from 82 healthy volunteers, performing 5 different exercises while wearing a lumbar-worn inertial measurement unit (IMU), were collected. The ability of the
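
    The plot-to-image step described can be sketched as follows with matplotlib; the window length, channel count, figure size and file name are assumptions, and the subsequent retraining of Inception (e.g. with TensorFlow) is not shown.

      # Turn a window of accelerometer/gyroscope samples into an image file for
      # transfer learning with a pretrained vision network.
      import numpy as np
      import matplotlib
      matplotlib.use("Agg")                    # render without a display
      import matplotlib.pyplot as plt

      rng = np.random.default_rng(15)
      window = rng.standard_normal((250, 6))   # 250 samples x 6 IMU channels (hypothetical)

      fig, ax = plt.subplots(figsize=(2.24, 2.24), dpi=100)
      ax.plot(window)                          # simple time-series plot, one line per channel
      ax.axis("off")
      fig.savefig("squat_0001.png", bbox_inches="tight")   # filed by exercise label
      plt.close(fig)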

  18. Jay Haley's Supervision of a Case of Dissociative.

    Science.gov (United States)

    Haley, Jay

    2015-01-01

    This is a transcript of a supervision session with a young therapist caught in the complex world of a woman with multiple personality. Occurring very early in the written literature about treating multiple personalities, the highlight of this paper is the supervision style and technique of Jay Haley. His approach to supervision will make the reader wish that he or she could be in the room during this session.

  19. Multivariate Cross-Classification: Applying machine learning techniques to characterize abstraction in neural representations

    Directory of Open Access Journals (Sweden)

    Jonas eKaplan

    2015-03-01

    Here we highlight an emerging trend in the use of machine learning classifiers to test for abstraction across patterns of neural activity. When a classifier algorithm is trained on data from one cognitive context, and tested on data from another, conclusions can be drawn about the role of a given brain region in representing information that abstracts across those cognitive contexts. We call this kind of analysis Multivariate Cross-Classification (MVCC), and review several domains where it has recently made an impact. MVCC has been important in establishing correspondences among neural patterns across cognitive domains, including motor-perception matching and cross-sensory matching. It has been used to test for similarity between neural patterns evoked by perception and those generated from memory. Other work has used MVCC to investigate the similarity of representations for semantic categories across different kinds of stimulus presentation, and in the presence of different cognitive demands. We use these examples to demonstrate the power of MVCC as a tool for investigating neural abstraction and discuss some important methodological issues related to its application.

  20. Study on classification of soy sauce by electronic tongue technique combined with artificial neural network.

    Science.gov (United States)

    Ou-Yang, Qin; Zhao, Jie-Wen; Chen, Quan-Sheng; Lin, Hao; Huang, Xing-Yi

    2011-01-01

    An electronic tongue, as an analytical tool coupled with pattern recognition, was used in an attempt to classify 4 different brands and 2 categories (produced by different processes) of Chinese soy sauce. An electronic tongue system was used for data acquisition from the samples. Some effective variables were extracted from the electronic tongue data by principal component analysis (PCA). A backpropagation artificial neural network (BP-ANN) was applied to build identification models. PCA score plots show an obvious cluster trend of different brands and different categories of soy sauce in the 2-dimensional space. The optimal BP-ANN model for different brands was achieved with 2 principal components (PCs), and the identification rate of the discrimination model was 100% in both the calibration set and the prediction set; the optimal BP-ANN model for different categories gave the same result. This work demonstrates that electronic tongue technology combined with a suitable pattern recognition method can be successfully used in the classification of different brands and categories of soy sauce.
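
    A rough sketch of the PCA plus back-propagation ANN pipeline, assuming scikit-learn, with MLPClassifier standing in for the BP-ANN; the sensor readings, brand labels and network size are simulated assumptions rather than real electronic-tongue data.

      # PCA feature extraction followed by a small feed-forward network.
      import numpy as np
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.decomposition import PCA
      from sklearn.neural_network import MLPClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(6)
      X = rng.random((120, 7))               # 7 hypothetical sensor channels
      y = rng.integers(0, 4, size=120)       # 4 soy sauce brands

      model = make_pipeline(StandardScaler(), PCA(n_components=2),
                            MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                                          random_state=6))
      print("cross-validated identification rate:",
            cross_val_score(model, X, y, cv=5).mean())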

  1. Hand Gesture recognition and classification by Discriminant and Principal Component Analysis using Machine Learning techniques

    Directory of Open Access Journals (Sweden)

    Sauvik Das Gupta

    2012-12-01

    This paper deals with the recognition of different hand gestures through machine learning approaches and principal component analysis. A biomedical signal amplifier is built after a software simulation with the help of NI Multisim. At first, a pair of surface electrodes is used to obtain the electromyogram (EMG) signals from the hands. These signals from the surface electrodes have to be amplified with the help of the biomedical signal amplifier. The biomedical signal amplifier used is basically an instrumentation amplifier built around the IC AD620. The output from the instrumentation amplifier is then filtered with the help of a suitable band-pass filter. The output from the band-pass filter is then fed to an analog-to-digital converter (ADC), which in this case is the NI USB-6008. The data from the ADC are then fed into a suitable algorithm which helps in the recognition of the different hand gestures. The algorithm analysis is done in MATLAB. The results shown in this paper show a close to one hundred per cent (100%) classification result for three given hand gestures.

  2. Land Use / Land Cover Classification of the Kanniyakumari Coast, Tamil Nadu, India, Using Remote Sensing and GIS Techniques

    Directory of Open Access Journals (Sweden)

    Hajeeran Beevi.N,

    2015-07-01

    The land use / land cover details of the Kanniyakumari coast, located in the southern part of Tamil Nadu (India), are studied. Satellite imagery is used to identify the land use / land cover status of the study area. Software such as ERDAS and ArcGIS is used to demarcate the land use / land cover features of the Kanniyakumari coast. Remote sensing and GIS provided more consistent and accurate baseline information than many of the conventional surveys employed for such a task. The total area of the Kanniyakumari coast is 715 sq. km. The land use / land cover classes of the study area have been categorized into thirteen types, namely plantation, sandy area, waterlogged area, scrub forest, crop land, water bodies, land with scrub, reserve forest, land without scrub, salt area, beach ridge, settlement and fallow land, on the basis of the NRSA classification. Among these categories, land with scrub is predominant throughout the study area, occupying about 336.36 sq. km (44.61 percent); crop land covers 273.82 sq. km (38.29 percent), water bodies share about 20.44 sq. km (2.85 percent), settlement occupies 6.96 sq. km (0.97 percent), and fallow land occupies 13.98 sq. km (1.95 percent).

  3. Glaucoma detection using novel optic disc localization, hybrid feature set and classification techniques.

    Science.gov (United States)

    Akram, M Usman; Tariq, Anam; Khalid, Shehzad; Javed, M Younus; Abbas, Sarmad; Yasin, Ubaid Ullah

    2015-12-01

    Glaucoma is a chronic and irreversible neuro-degenerative disease in which the neuro-retinal nerve that connects the eye to the brain (optic nerve) is progressively damaged, and patients suffer from vision loss and blindness. The timely detection and treatment of glaucoma is crucial to save the patient's vision. Computer-aided diagnostic systems are used for the automated detection of glaucoma and calculate the cup-to-disc ratio from colored retinal images. In this article, we present a novel method for the early and accurate detection of glaucoma. The proposed system consists of preprocessing, optic disc segmentation, extraction of features from the optic disc region of interest, and classification for the detection of glaucoma. The main novelty of the proposed method lies in the formation of a feature vector which consists of spatial and spectral features along with the cup-to-disc ratio and rim-to-disc ratio, and in the modeling of a novel medoids-based classifier for accurate detection of glaucoma. The performance of the proposed system is tested using publicly available fundus image databases along with one locally gathered database. Experimental results using a variety of publicly available and local databases demonstrate the superiority of the proposed approach as compared to the competitors.

  4. Bioremediation techniques-classification based on site of application: principles, advantages, limitations and prospects.

    Science.gov (United States)

    Azubuike, Christopher Chibueze; Chikere, Chioma Blaise; Okpokwasili, Gideon Chijioke

    2016-11-01

    Environmental pollution has been on the rise in the past few decades owing to increased human activities on energy reservoirs, unsafe agricultural practices and rapid industrialization. Amongst the pollutants that are of environmental and public health concern due to their toxicities are heavy metals, nuclear wastes, pesticides, greenhouse gases, and hydrocarbons. Remediation of polluted sites using microbial processes (bioremediation) has proven effective and reliable due to its eco-friendly features. Bioremediation can be carried out either ex situ or in situ, depending on several factors, which include, but are not limited to, cost, site characteristics, and the type and concentration of pollutants. Generally, ex situ techniques are apparently more expensive than in situ techniques as a result of the additional costs attributable to excavation. However, the cost of on-site installation of equipment, and the inability to effectively visualize and control the subsurface of polluted sites, are major concerns when carrying out in situ bioremediation. Therefore, choosing an appropriate bioremediation technique, which will effectively reduce pollutant concentrations to an innocuous state, is crucial for a successful bioremediation project. Furthermore, the two major approaches to enhance bioremediation are biostimulation and bioaugmentation, provided that the environmental factors that determine the success of bioremediation are maintained at an optimal range. This review provides more insight into the two major bioremediation techniques, their principles, advantages, limitations and prospects.

  5. The Analysis of Dimensionality Reduction Techniques in Cryptographic Object Code Classification

    Energy Technology Data Exchange (ETDEWEB)

    Jason L. Wright; Milos Manic

    2010-05-01

    This paper compares the application of three different dimension reduction techniques to the problem of locating cryptography in compiled object code. A simple classifier is used to compare dimension reduction via sorted covariance, principal component analysis, and correlation-based feature subset selection. The analysis concentrates on the classification accuracy as the number of dimensions is increased.
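
    The kind of experiment described, classification accuracy as a function of the number of retained dimensions, can be sketched as below with scikit-learn; the features, the Gaussian naive Bayes stand-in for the paper's simple classifier, and the dimension values are all assumptions.

      # Accuracy of a simple classifier after PCA, for increasing dimensionality.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.naive_bayes import GaussianNB
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(7)
      X = rng.random((300, 64))              # e.g. 64 static features per code region
      y = rng.integers(0, 2, size=300)       # 1 = contains cryptography, 0 = does not

      for k in (2, 4, 8, 16, 32):
          Xk = PCA(n_components=k).fit_transform(X)
          acc = cross_val_score(GaussianNB(), Xk, y, cv=5).mean()
          print(f"{k:2d} dimensions: accuracy {acc:.3f}")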

  6. Supervised Object Class Colour Normalisation

    DEFF Research Database (Denmark)

    Riabchenko, Ekatarina; Lankinen, Jukka; Buch, Anders Glent;

    2013-01-01

    Colour is an important cue in many applications of computer vision and image processing, but robust usage often requires estimation of the unknown illuminant colour. Usually, to obtain images invariant to the illumination conditions under which they were taken, colour normalisation is used. In this work, we develop such a colour normalisation technique, where true colours are not important per se but where examples of the same classes have photometrically consistent appearance. This is achieved by supervised estimation of a class-specific canonical colour space where the examples have minimal variation...

  7. Comparing face processing strategies between typically-developed observers and observers with autism using sub-sampled-pixels presentation in response classification technique.

    Science.gov (United States)

    Nagai, Masayoshi; Bennett, Patrick J; Rutherford, M D; Gaspar, Carl M; Kumada, Takatsune; Sekuler, Allison B

    2013-03-07

    In the present study we modified the standard classification image method by subsampling visual stimuli to provide us with a technique capable of examining an individual's face-processing strategy in detail with fewer trials. Experiment 1 confirmed that one testing session (1450 trials) was sufficient to produce classification images that were qualitatively similar to those obtained previously with 10,000 trials (Sekuler et al., 2004). Experiment 2 used this method to compare classification images obtained from observers with autism spectrum disorders (ASD) and typically-developing (TD) observers. As was found in Experiment 1, classification images obtained from TD observers suggested that they all discriminated faces based on information conveyed by pixels in the eyes/brow region. In contrast, classification images obtained from ASD observers suggested that they used different perceptual strategies: three out of five ASD observers used a typical strategy of making use of information in the eye/brow region, but two used an atypical strategy that relied on information in the forehead region. The advantage of using the response classification technique is that there is no restriction to specific theoretical perspectives or a priori hypotheses, which enabled us to see unexpected strategies, like ASD's forehead strategy, and thus showed this technique is particularly useful in the examination of special populations.

  8. Comparison on three classification techniques for sex estimation from the bone length of Asian children below 19 years old: an analysis using different group of ages.

    Science.gov (United States)

    Darmawan, M F; Yusuf, Suhaila M; Kadir, M R Abdul; Haron, H

    2015-02-01

    Sex estimation is used in forensic anthropology to assist in the identification of individual remains. However, estimation techniques tend to be unique and applicable only to a certain population. This paper analyzed sex estimation for living individuals below 19 years old using the lengths of the 19 bones of the left hand with three classification techniques: Discriminant Function Analysis (DFA), Support Vector Machine (SVM) and Artificial Neural Network (ANN) multilayer perceptron. These techniques were carried out on X-ray images of the left hand taken from an Asian population data set. All 19 bones of the left hand were measured using the Free Image software, and all the techniques were performed using MATLAB. The "16-19" and "7-9" year-old age groups were the ones that could be used for sex estimation, as their average accuracy percentages were above 80%. The ANN model was the best classification technique, with the highest average accuracy percentage in these two age groups compared to the other classification techniques. The results show that each classification technique achieves its best accuracy percentage in a different age group. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  9. High Classification Rates for Continuous Cow Activity Recognition using Low-cost GPS Positioning Sensors and Standard Machine Learning Techniques

    DEFF Research Database (Denmark)

    Godsk, Torben; Kjærgaard, Mikkel Baun

    2011-01-01

    In precision livestock farming, spotting cows in need of extra attention due to health or welfare issues is essential, since the time a farmer can devote to each animal is decreasing due to growing herd sizes and increasing efficiency demands. Often, the symptoms of health and welfare state... activities. By preprocessing the raw cow position data, we obtain high classification rates using standard machine learning techniques to recognize cow activities. Our objectives were to (i) determine to what degree it is possible to robustly recognize cow activities from GPS positioning data, using low-cost GPS receivers; and (ii) determine which types of activities can be classified, and what robustness to expect within the different classes. To provide data for this study, low-cost GPS receivers were mounted on 14 dairy cows on grass for a day while they were observed from a distance...

  10. Automatic detection and classification of damage zone(s) for incorporating in digital image correlation technique

    Science.gov (United States)

    Bhattacharjee, Sudipta; Deb, Debasis

    2016-07-01

    Digital image correlation (DIC) is a technique developed for monitoring the surface deformation/displacement of an object under loading conditions. This method is further refined to make it capable of handling discontinuities on the surface of the sample. A damage zone refers to a surface area that has fractured and opened in the course of loading. In this study, an algorithm is presented to automatically detect multiple damage zones in a deformed image. The algorithm identifies the pixels located inside these zones and eliminates them from the FEM-DIC processes. The proposed algorithm is successfully implemented on several damaged samples to estimate the displacement fields of an object under loading conditions. This study shows that the displacement fields represent the damage conditions reasonably well as compared to the regular FEM-DIC technique that does not consider the damage zones.

  11. Comparative Study on the Different Testing Techniques in Tree Classification for Detecting the Learning Motivation

    Science.gov (United States)

    Juliane, C.; Arman, A. A.; Sastramihardja, H. S.; Supriana, I.

    2017-03-01

    Having motivation to learn is a requirement for success in a learning process, and this motivation needs to be maintained properly. This study aims to measure learning motivation, especially in the process of electronic learning (e-learning). Here, a data mining approach was chosen as the research method. For the testing process, a comparative study of the accuracy of different testing techniques was conducted, involving cross-validation and percentage split. The best accuracy was generated by the J48 algorithm with the percentage split technique, reaching 92.19%. This study provides an overview of how to detect the presence of learning motivation in the context of e-learning. It is expected to be a useful contribution for education and to alert teachers to the learners for whom they have to provide motivation.
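
    The two testing techniques mentioned can be compared as in the sketch below, where scikit-learn's DecisionTreeClassifier stands in for J48; the activity indicators, labels and the 66/34 split ratio are assumptions made only for illustration.

      # Cross-validation versus a percentage split for a decision-tree classifier.
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import cross_val_score, train_test_split

      rng = np.random.default_rng(8)
      X = rng.random((500, 12))              # e.g. 12 e-learning activity indicators
      y = rng.integers(0, 2, size=500)       # 1 = motivated, 0 = not motivated

      tree = DecisionTreeClassifier(random_state=8)
      cv_acc = cross_val_score(tree, X, y, cv=10).mean()
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.34, random_state=8)
      split_acc = tree.fit(X_tr, y_tr).score(X_te, y_te)
      print(f"10-fold CV: {cv_acc:.3f}   66/34 percentage split: {split_acc:.3f}")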

  12. SVM and ANN Based Classification of Plant Diseases Using Feature Reduction Technique

    OpenAIRE

    Pujari, Jagadeesh D.; Rajesh Yakkundimath; Abdulmunaf. Syedhusain. Byadgi

    2016-01-01

    Computers have been used for mechanization and automation in different applications of agriculture/horticulture. Critical decisions on agricultural yield and plant protection are supported by the development of expert systems (decision support systems) using computer vision techniques. One of the areas considered in the present work is the processing of images of plant diseases affecting agriculture/horticulture crops. The first symptoms of plant disease have to be correctly detected, identi...

  13. Class Discovery in Galaxy Classification

    CERN Document Server

    Bazell, D; Bazell, David; Miller, David J.

    2004-01-01

    In recent years, automated, supervised classification techniques have been fruitfully applied to labeling and organizing large astronomical databases. These methods require off-line classifier training, based on labeled examples from each of the (known) object classes. In practice, only a small batch of labeled examples, hand-labeled by a human expert, may be available for training. Moreover, there may be no labeled examples for some classes present in the data, i.e. the database may contain several unknown classes. Unknown classes may be present due to 1) uncertainty in or lack of knowledge of the measurement process, 2) an inability to adequately ``survey'' a massive database to assess its content (classes), and/or 3) an incomplete scientific hypothesis. In recent work, new class discovery in mixed labeled/unlabeled data was formally posed, with a proposed solution based on mixture models. In this work we investigate this approach, propose a competing technique suitable for class discovery in neural network...

  14. Urdu Text Classification using Majority Voting

    Directory of Open Access Journals (Sweden)

    Muhammad Usman

    2016-08-01

    Text classification is a tool to assign predefined categories to text documents using supervised machine learning algorithms. It has various practical applications such as spam detection, sentiment detection, and detection of a natural language. Based on this idea, we applied five well-known classification techniques to an Urdu language corpus and assigned a class to each document using majority voting. The corpus contains 21,769 news documents of seven categories (Business, Entertainment, Culture, Health, Sports, and Weird, ...). The algorithms were not able to work directly on the data, so we applied preprocessing techniques such as tokenization, stop-word removal and a rule-based stemmer. After preprocessing, 93,400 features were extracted from the data to apply machine learning algorithms. Furthermore, we achieved up to 94% precision and recall using majority voting.
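
    Majority voting over several classifiers can be sketched as follows with scikit-learn's VotingClassifier; since the abstract does not name the five classifiers, three common ones are used here purely for illustration, on synthetic term-count features.

      # Hard (majority) voting over three text classifiers.
      import numpy as np
      from sklearn.ensemble import VotingClassifier
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.linear_model import LogisticRegression
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(9)
      X = rng.integers(0, 5, size=(700, 1000)).astype(float)   # term counts
      y = rng.integers(0, 7, size=700)                         # 7 news categories

      vote = VotingClassifier(
          estimators=[("nb", MultinomialNB()),
                      ("lr", LogisticRegression(max_iter=1000)),
                      ("dt", DecisionTreeClassifier(random_state=9))],
          voting="hard")                                       # majority voting
      print("accuracy:", cross_val_score(vote, X, y, cv=3).mean())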

  15. Classification of gamma-ray burst durations using robust model-comparison techniques

    Science.gov (United States)

    Kulkarni, Soham; Desai, Shantanu

    2017-04-01

    Gamma-Ray Bursts (GRBs) have been conventionally bifurcated into two distinct categories dubbed "short" and "long", depending on whether their durations are less than or greater than two seconds respectively. However, many authors have pointed to the existence of a third class of GRBs with mean durations intermediate between the short and long GRBs. Here, we apply multiple model comparison techniques to verify these claims. For each category, we obtain the best-fit parameters by maximizing a likelihood function based on a weighted superposition of two (or three) lognormal distributions. We then do model-comparison between each of these hypotheses by comparing the chi-square probabilities, Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). We uniformly apply these techniques to GRBs from Swift (both observer and intrinsic frame), BATSE, BeppoSAX, and Fermi-GBM. We find that the Swift GRB distributions (in the observer frame) for the entire dataset favor three categories at about 2.4σ from difference in chi-squares, and show decisive evidence in favor of three components using both AIC and BIC. However, when the same analysis is done for the subset of Swift GRBs with measured redshifts, two components are favored with marginal significance. For all the other datasets, evidence for three components is either very marginal or disfavored.
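
    Model comparison with AIC and BIC can be sketched as below; a Gaussian mixture on log10 durations stands in for the weighted lognormal superposition used in the paper, and the simulated durations are not a real GRB catalogue.

      # Compare two- and three-component mixtures of log-durations via AIC/BIC.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(10)
      log_t90 = np.concatenate([rng.normal(-0.3, 0.5, 100),    # "short" bursts
                                rng.normal(1.5, 0.45, 400)])   # "long" bursts
      X = log_t90.reshape(-1, 1)

      for k in (2, 3):
          gm = GaussianMixture(n_components=k, random_state=10).fit(X)
          print(f"{k} components: AIC={gm.aic(X):.1f}  BIC={gm.bic(X):.1f}")
      # Lower AIC/BIC indicates the preferred number of components.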

  16. A High Accuracy Method for Semi-supervised Information Extraction

    Energy Technology Data Exchange (ETDEWEB)

    Tratz, Stephen C.; Sanfilippo, Antonio P.

    2007-04-22

    Customization to specific domains of discourse and/or user requirements is one of the greatest challenges for today's Information Extraction (IE) systems. While demonstrably effective, both rule-based and supervised machine learning approaches to IE customization pose too high a burden on the user. Semi-supervised learning approaches may in principle offer a more resource-effective solution but are still insufficiently accurate to grant realistic application. We demonstrate that this limitation can be overcome by integrating fully-supervised learning techniques within a semi-supervised IE approach, without increasing resource requirements.

  17. Resistance to group clinical supervision

    DEFF Research Database (Denmark)

    Buus, Niels; Delgado, Cynthia; Traynor, Michael

    2017-01-01

    This present study is a report of an interview study exploring personal views on participating in group clinical supervision among mental health nursing staff members who do not participate in supervision. There is a paucity of empirical research on resistance to supervision, which has traditionally been theorized as a supervisee's maladaptive coping with anxiety in the supervision process. The aim of the present study was to examine resistance to group clinical supervision by interviewing nurses who did not participate in supervision. In 2015, we conducted semistructured interviews with 24...

  18. MULTILEVEL APPROACH OF CBIR TECHNIQUES FOR VEGETABLE CLASSIFICATION USING HYBRID IMAGE FEATURES

    Directory of Open Access Journals (Sweden)

    D. Latha

    2016-02-01

    CBIR is a technique to retrieve images semantically relevant to a query image from an image database. The challenge in CBIR is to develop a method that increases the retrieval accuracy and reduces the retrieval time. In order to improve retrieval accuracy and runtime, a multilevel CBIR approach is proposed in this paper. At the first level, color attributes such as the mean and standard deviation are computed in the HSV color space to retrieve the images with the minimum disparity distance from the database. In order to minimize the search area, at the second level the Local Ternary Pattern is computed on the images selected at the first level. Experimental results and comparisons demonstrate the superiority of the proposed approach.
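
    The first-level color features can be sketched as below, assuming NumPy and matplotlib; the random images, distance measure and candidate count are illustrative assumptions, not the paper's database or exact disparity distance.

      # Per-channel HSV mean and standard deviation, compared by Euclidean distance.
      import numpy as np
      from matplotlib.colors import rgb_to_hsv

      def hsv_features(rgb_image):
          hsv = rgb_to_hsv(rgb_image).reshape(-1, 3)   # expects RGB values in [0, 1]
          return np.concatenate([hsv.mean(axis=0), hsv.std(axis=0)])

      rng = np.random.default_rng(14)
      database = [rng.random((64, 64, 3)) for _ in range(20)]
      query = rng.random((64, 64, 3))

      feats = np.array([hsv_features(img) for img in database])
      dists = np.linalg.norm(feats - hsv_features(query), axis=1)
      top_k = np.argsort(dists)[:5]          # candidates passed on to the LTP level
      print("first-level candidates:", top_k)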

  19. DELINEATION OF TECHNIQUES TO IMPLEMENT ON THE ENHANCED PROPOSED MODEL USING DATA MINING FOR PROTEIN SEQUENCE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Ananya Basu

    2014-02-01

    In the post-genomic era, with the advent of new technologies, a huge amount of complex molecular data is generated with high throughput. The management of this biological data is a challenging task, due to the complexity and heterogeneity of the data, when it comes to discovering new knowledge. Issues like managing noisy and incomplete data need to be dealt with. The use of data mining in the biological domain has been notably successful. Discovering new knowledge from biological data is a major challenge for data mining techniques. The novelty of the proposed model is its combined use of intelligent techniques to classify protein sequences faster and more efficiently. The use of FFT, a fuzzy classifier, a string-weighted algorithm, a gram encoding method, a neural network model and a rough set classifier in a single model, each in an appropriate place, can enhance the quality of the classification system. Thus the primary challenge is to identify and classify the large protein sequences in a fast, easy but intelligent way, to decrease the time complexity and space complexity.

  20. Low-cost computer classification of land cover in the Portland area, Oregon, by signature extension techniques

    Science.gov (United States)

    Gaydos, Leonard

    1978-01-01

    Computer-aided techniques for interpreting multispectral data acquired by Landsat offer economies in the mapping of land cover. Even so, the actual establishment of the statistical classes, or "signatures," is one of the relatively more costly operations involved. Analysts have therefore been seeking cost-saving signature extension techniques that would accept training data acquired for one time or place and apply them to another. Opportunities to extend signatures occur in preprocessing steps and in the classification steps that follow. In the present example, land cover classes were derived by the simplest and most direct form of signature extension: Classes statistically derived from a Landsat scene for the Puget Sound area, Wash., were applied to the Portland area, Oreg., using data for the next Landsat scene acquired less than 25 seconds down orbit. Many features can be recognized on the reduced-scale version of the Portland land cover map shown in this report, although no statistical assessment of its accuracy is available.

  1. Classification of a set of vectors using self-organizing map- and rule-based technique

    Science.gov (United States)

    Ae, Tadashi; Okaniwa, Kaishirou; Nosaka, Kenzaburou

    2005-02-01

    There exist various objects, such as pictures, music, and texts, in our environment. We form a view of these objects by looking, reading or listening. Our view is deeply connected with our behavior and is very important for understanding our behavior. We form a view of an object and decide the next action (data selection, etc.) based on that view. Such a series of actions constructs a sequence. Therefore, we propose a method which acquires a view as a vector from several words describing the view, and apply the vector to sequence generation. We focus on sequences of data which a user selects from a multimedia database containing pictures, music, movies, etc. These data cannot be stereotyped because each user's view of them differs. Therefore, we represent the structure of the multimedia database by the vector representing the user's view and the stereotyped vector, and acquire sequences containing the structure as elements. Such vectors can be classified by a Self-Organizing Map (SOM). The Hidden Markov Model (HMM) is a method for generating sequences. Therefore, we use an HMM in which each state corresponds to a representative vector of the user's view, and acquire sequences containing changes in the user's view. We call this the Vector-state Markov Model (VMM). We introduce rough set theory as a rule-based technique, which plays the role of classifying sets of data such as the sets of "Tour".

  2. Collective academic supervision

    DEFF Research Database (Denmark)

    Nordentoft, Helle Merete; Thomsen, Rie; Wichmann-Hansen, Gitte

    2013-01-01

    ... are interconnected. Collective Academic Supervision provides possibilities for systematic interaction between individual master's students in their writing process. In this process they learn core academic competencies, such as the ability to assess theoretical and practical problems in their practice and present them...

  3. Reflecting reflection in supervision

    DEFF Research Database (Denmark)

    Lystbæk, Christian Tang

    Reflection has moved from the margins to the mainstream in supervision. Notions of reflection have become well established since the late 1980s. These notions have provided useful framing devices to help conceptualize some important processes in guidance and counseling. However, some applications...

  4. Clinical Supervision in Denmark

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard

    2011-01-01

    ... on giving and receiving clinical supervision as reported by therapists in Denmark. Method: Currently, the Danish sample consists of 350 clinical psychologists practising psychotherapy who completed the DPCCQ. Data are currently being prepared for statistical analysis. Results: This paper will focus primarily...

  5. Kontraktetablering i supervision

    DEFF Research Database (Denmark)

    Mortensen, Karen Vibeke; Jacobsen, Claus Haugaard

    2007-01-01

    The chapter deals with the establishment of contracts in supervision, an element that has often been neglected or even completely overlooked at the start of supervision. Clear agreements on matters such as time, place, procedures for case presentation, confidentiality, the division of responsibility and evaluation do, however, create security...

  6. Etiske betragtninger ved supervision

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard; Agerskov, Kirsten

    2007-01-01

    The chapter presents some ethical considerations regarding supervision. While ethical guidelines for psychotherapeutic work have long existed, corresponding guidance has, surprisingly, been lacking in the field of supervision. This does not mean, however, that such considerations are not relevant. In the chapter...

  7. Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization

    CERN Document Server

    Garcia-Cardona, Cristina; Percus, Allon G

    2013-01-01

    We present a graph-based variational algorithm for classification of high-dimensional data, generalizing the binary diffuse interface model to the case of multiple classes. Motivated by total variation techniques, the method involves minimizing an energy functional made up of three terms. The first two terms promote a stepwise continuous classification function with sharp transitions between classes, while preserving symmetry among the class labels. The third term is a data fidelity term, allowing us to incorporate prior information into the model in a semi-supervised framework. The performance of the algorithm on synthetic data, as well as on the COIL and MNIST benchmark datasets, is competitive with state-of-the-art graph-based multiclass segmentation methods.

  8. Water area variations in seasonal lagoons from the Biosphere Reserve of "La Mancha Húmeda" (Spain) determined by remote sensing classification methods and data mining techniques

    Science.gov (United States)

    Dona, Carolina; Niclòs, Raquel; Chang, Ni-Bin; Caselles, Vicente; Sánchez, Juan Manuel; Camacho, Antonio

    2015-04-01

    La Mancha Húmeda is a wetland-rich area located in central Spain that was designated as a Biosphere Reserve in 1980. This area includes several dozen temporal lagoons, mostly saline, whose water levels fluctuate and which usually become dry during the warmest season. Water inflows into these lagoons come from runoff from very small catchments and, in some cases, from groundwater, although some of them also receive wastewater from nearby towns. Most lack surface outlets and behave as endorheic systems, with the main water withdrawal due to evaporation, causing salt accumulation in the lake beds. Under several layers of legal protection in addition to that of the Biosphere Reserve, including Ramsar and Natura 2000 sites, management plans are being developed in order to accomplish the goals enforced by the European Water Framework Directive and the Habitats Directive, which establish that all EU countries have to achieve a good ecological status and a favorable conservation status of these sites, and especially of their water bodies. A core task in carrying out the management plans is understanding the hydrological trend of these lagoons through a sound monitoring scheme. To do so, estimating the temporal evolution of the flooded area of each lagoon, and its relationship with meteorological patterns, which can be achieved using remote sensing technologies, is a key procedure. The current study aims to develop a remote sensing methodology capable of estimating the changing water coverage of each lagoon from satellite remote sensing images and ground truth data sets. ETM+ images on board Landsat-7 were used to fulfill this goal. These images are useful for monitoring small-to-medium size water bodies due to their 30-m spatial resolution. In this work several methods were applied to estimate the wet and dry pixels, such as water and vegetation indexes, single bands, supervised classification methods and genetic programming. All of the results were compared with ground
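
    One common water-index step in such workflows is sketched below; NDWI from the ETM+ green and near-infrared bands is only one of the index types the study mentions, and the reflectance arrays, band choice and threshold here are assumptions.

      # Normalised Difference Water Index (NDWI) and a simple flooded-area estimate.
      import numpy as np

      rng = np.random.default_rng(11)
      green = rng.random((50, 50))           # ETM+ band 2 reflectance (hypothetical)
      nir = rng.random((50, 50))             # ETM+ band 4 reflectance (hypothetical)

      ndwi = (green - nir) / (green + nir + 1e-9)
      water_mask = ndwi > 0.0                # simple threshold; real studies tune this
      print("flooded pixels:", int(water_mask.sum()), "of", water_mask.size)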

  9. Automatic classification of time-variable X-ray sources

    Energy Technology Data Exchange (ETDEWEB)

    Lo, Kitty K.; Farrell, Sean; Murphy, Tara; Gaensler, B. M. [Sydney Institute for Astronomy, School of Physics, The University of Sydney, Sydney, NSW 2006 (Australia)

    2014-05-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the Second XMM-Newton Serendipitous Source Catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10-fold cross-validation accuracy of the training data is ∼97% on a 7-class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest derived outlier measure, we identified 12 anomalous sources, of which 2XMM J180658.7–500250 appears to be the most unusual source in the sample. Its X-ray spectrum is suggestive of an ultraluminous X-ray source, but its variability makes it highly unusual. Machine-learned classification and anomaly detection will facilitate scientific discoveries in the era of all-sky surveys.
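
    A Random Forest classifier with probabilistic outputs and a simple classification margin can be sketched as below with scikit-learn; the feature table and seven class labels are synthetic placeholders, and the margin shown is not the paper's outlier measure.

      # Random Forest classification with class probabilities and margins.
      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(12)
      X = rng.random((873, 20))              # e.g. timing, spectral and contextual features
      y = rng.integers(0, 7, size=873)       # 7 variable-source classes

      rf = RandomForestClassifier(n_estimators=500, random_state=12)
      print("10-fold CV accuracy:", cross_val_score(rf, X, y, cv=10).mean())
      rf.fit(X, y)
      proba = rf.predict_proba(X[:5])        # probabilistic classifications
      ranked = np.sort(proba, axis=1)
      margin = ranked[:, -1] - ranked[:, -2] # gap between the top two class probabilities
      print("classification margins:", np.round(margin, 2))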

  10. Weighted Chebyshev distance classification method for hyperspectral imaging

    Science.gov (United States)

    Demirci, S.; Erer, I.; Ersoy, O.

    2015-06-01

    The main objective of classification is to partition the surface materials into non-overlapping regions by using some decision rules. In supervised classification, the hyperspectral imagery (HSI) is compared with reference reflectance spectra of materials with similar spectral characteristics. As a spectral-similarity-based classification method, the Multi-Scale Vector Tunnel Algorithm (MS-VTA) rests on predicting different levels of upper and lower spectral boundaries for the spectral signature of each class across the spectral bands. The vector tunnel (VT) scaling parameters obtained from the means and standard deviations of the class references are used. In this study, the MS-VT method is improved and a spectral-similarity-based technique referred to as the Weighted Chebyshev Distance (WCD) method is introduced for the supervised classification of HSI. This is also shown to be equivalent to the use of the WCD in which the weights are chosen as an inverse power of the standard deviation per spectral band. The use of WCD measures defined in terms of the inverse power of the standard deviations, and the optimization of the power parameter, constitute the core of the study. The algorithms are trained with the same kinds of training sets, and their performance is evaluated as a function of the power of the standard deviation. In these experiments, various values of the power parameter are assessed according to the efficiency of the algorithms in choosing the best values of the weights.
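    An illustrative sketch, not the authors' implementation, of a weighted Chebyshev distance classifier in which the per-band weights are an inverse power of each class's standard deviation, as the abstract describes. Array shapes, the power value p and all data are assumptions.

        import numpy as np

        def wcd_classify(pixels, class_means, class_stds, p=1.0):
            """pixels: (n, bands); class_means, class_stds: (k, bands)."""
            weights = 1.0 / (class_stds ** p + 1e-12)            # inverse power of per-band std
            diff = np.abs(pixels[:, None, :] - class_means[None, :, :]) * weights[None, :, :]
            dist = diff.max(axis=2)                              # weighted Chebyshev distance, shape (n, k)
            return dist.argmin(axis=1)                           # nearest reference class per pixel

        rng = np.random.default_rng(1)
        means = rng.random((4, 10))                              # 4 class references, 10 bands
        stds = 0.05 + 0.1 * rng.random((4, 10))
        pixels = means[rng.integers(0, 4, 100)] + rng.normal(scale=0.05, size=(100, 10))
        print(wcd_classify(pixels, means, stds, p=1.5)[:10])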

  11. An AdaBoost algorithm for multiclass semi-supervised learning

    NARCIS (Netherlands)

    Tanha, J.; van Someren, M.; Afsarmanesh, H.; Zaki, M.J.; Siebes, A.; Yu, J.X.; Goethals, B.; Webb, G.; Wu, X.

    2012-01-01

    We present an algorithm for multiclass Semi-Supervised learning which is learning from a limited amount of labeled data and plenty of unlabeled data. Existing semi-supervised algorithms use approaches such as one-versus-all to convert the multiclass problem to several binary classification problems

  12. How To Survive a Classification Study.

    Science.gov (United States)

    Mosley, Shelley

    1998-01-01

    Examines job classification studies in public libraries, focusing on point-factor classification systems. Discusses job descriptions, job auditing biases, preparedness, what not to say to auditors, and the appeals process. Outlines typical factors in a job classification: decision-making, guidelines, education, experience, complexity, supervision,…

  13. Classification and mass production technique for three-quarter shoe insoles using non-weight-bearing plantar shapes.

    Science.gov (United States)

    Sun, Shuh-Ping; Chou, Yi-Jiun; Sue, Chun-Chia

    2009-07-01

    We have developed a technique for the mass production and classification of three-quarter shoe insoles via a 3D anthropometric measurement of full-size non-weight-bearing plantar shapes. The plantar shapes of fifty 40-60-year-old adults from Taiwan were categorized and, in conjunction with commercially available flat or leisure shoe models, three-quarter shoe-insole models were generated using a CAD system. Applying a rapid prototype system, these models were then used to provide the parameters for manufacturing the shoe insoles. The insoles developed in this study have been classified into S, M and L types that offer user-friendly options for foot-care providers. We concluded that these insoles can mate tightly with the foot arch and disperse the pressure in the heel and forefoot over the foot arch. Thus, practically, the pressure difference over the plantar region can be minimised, and the user can experience comfort when wearing flat or leisure shoes.

  14. Carbon Nanotube Emissions from Arc Discharge Production: Classification of Particle Types with Electron Microscopy and Comparison with Direct Reading Techniques.

    Science.gov (United States)

    Ludvigsson, Linus; Isaxon, Christina; Nilsson, Patrik T; Tinnerberg, Hakan; Messing, Maria E; Rissler, Jenny; Skaug, Vidar; Gudmundsson, Anders; Bohgard, Mats; Hedmer, Maria; Pagels, Joakim

    2016-05-01

    An increased production and use of carbon nanotubes (CNTs) is occurring worldwide. In parallel, a growing concern is emerging on the adverse effects the unintentional inhalation of CNTs can have on humans. There is currently a debate regarding which exposure metrics and measurement strategies are the most relevant to investigate workplace exposures to CNTs. This study investigated workplace CNT emissions using a combination of time-integrated filter sampling for scanning electron microscopy (SEM) and direct reading aerosol instruments (DRIs). Field measurements were performed during small-scale manufacturing of multiwalled carbon nanotubes using the arc discharge technique. Measurements with highly time- and size-resolved DRI techniques were carried out both in the emission and background (far-field) zones. Novel classifications and counting criteria were set up for the SEM method. Three classes of CNT-containing particles were defined: type 1: particles with aspect ratio length:width >3:1 (fibrous particles); type 2: particles without fibre characteristics but with high CNT content; and type 3: particles with visible embedded CNTs. Offline sampling using SEM showed emissions of CNT-containing particles in 5 out of 11 work tasks. The particles were classified into the three classes, of which type 1, fibrous CNT particles contributed 37%. The concentration of all CNT-containing particles and the occurrence of the particle classes varied strongly between work tasks. Based on the emission measurements, it was assessed that more than 85% of the exposure originated from open handling of CNT powder during the Sieving, mechanical work-up, and packaging work task. The DRI measurements provided complementary information, which combined with SEM provided information on: (i) the background adjusted emission concentration from each work task in different particle size ranges, (ii) identification of the key procedures in each work task that lead to emission peaks, (iii

  15. Semi-supervised learning for ordinal Kernel Discriminant Analysis.

    Science.gov (United States)

    Pérez-Ortiz, M; Gutiérrez, P A; Carbonero-Ruz, M; Hervás-Martínez, C

    2016-12-01

    Ordinal classification considers those classification problems where the labels of the variable to predict follow a given order. Naturally, labelled data is scarce or difficult to obtain in this type of problems because, in many cases, ordinal labels are given by a user or expert (e.g. in recommendation systems). Firstly, this paper develops a new strategy for ordinal classification where both labelled and unlabelled data are used in the model construction step (a scheme which is referred to as semi-supervised learning). More specifically, the ordinal version of kernel discriminant learning is extended for this setting considering the neighbourhood information of unlabelled data, which is proposed to be computed in the feature space induced by the kernel function. Secondly, a new method for semi-supervised kernel learning is devised in the context of ordinal classification, which is combined with our developed classification strategy to optimise the kernel parameters. The experiments conducted compare 6 different approaches for semi-supervised learning in the context of ordinal classification in a battery of 30 datasets, showing (1) the good synergy of the ordinal version of discriminant analysis and the use of unlabelled data and (2) the advantage of computing distances in the feature space induced by the kernel function.

  16. [Changes in the pleura of subjects occupationally-exposed to asbestos: radiological study technique, spectrum, etiological classification and coding according to the ILO classification].

    Science.gov (United States)

    Wiebe, V; Müller, K M; Reichel, G

    1991-01-01

    Pleural abnormalities in 119 occupationally asbestos-exposed subjects with a prominent internal stripe of the lateral thoracic wall were analysed radiodiagnostically, using plain chest films in four views and computed tomography, in the course of medical expert certification. Abnormalities were coded according to the 1980 ILO international classification of pneumoconioses. Just under half of the patients had pleural abnormalities caused by asbestos exposure: pleural plaques, "diffuse" pleural fibrosis, pleural effusions, organized pleural effusions and pleural tumors. The other half of the patients had pleural involvement from pulmonary and chest wall abnormalities, or variations of the lateral thoracic wall, not related to asbestos exposure. The 1980 ILO classification of pneumoconioses proved inadequate for complete coding of the abnormalities, since only the postero-anterior plain film of the thorax may be used, the normal appearance of the pleura is insufficiently defined, and the entity of organized pleural effusion is lacking.

  17. Multiclass Semi-Supervised Boosting and Similarity Learning

    NARCIS (Netherlands)

    Tanha, J.; Saberian, M.J.; van Someren, M.; Xiong, H.; Karypis, G.; Thuraisingham, B.; Cook, D.; Wu, X.

    2013-01-01

    In this paper, we consider the multiclass semi-supervised classification problem. A boosting algorithm is proposed to solve the multiclass problem directly. The proposed multiclass approach uses a new multiclass loss function, which includes two terms. The first term is the cost of the multiclass ma

  18. Assessing Uncertainty in LULC Classification Accuracy by Using Bootstrap Resampling

    Directory of Open Access Journals (Sweden)

    Lin-Hsuan Hsiao

    2016-08-01

    Full Text Available Supervised land-use/land-cover (LULC) classifications are typically conducted using class assignment rules derived from a set of multiclass training samples. Consequently, classification accuracy varies with the training data set and is thus associated with uncertainty. In this study, we propose a bootstrap resampling and reclassification approach that can be applied for assessing not only the uncertainty in classification results of the bootstrap-training data sets, but also the classification uncertainty of individual pixels in the study area. Two measures of pixel-specific classification uncertainty, namely the maximum class probability and Shannon entropy, were derived from the class probability vector of individual pixels and used for the identification of unclassified pixels. Unclassified pixels that are identified using the traditional chi-square threshold technique represent outliers of individual LULC classes, but they are not necessarily associated with higher classification uncertainty. By contrast, unclassified pixels identified using the equal-likelihood technique are associated with higher classification uncertainty and they mostly occur on or near the borders of different land-cover classes.
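    A minimal sketch, not the study's code, of the bootstrap-and-reclassify idea: resample the training set, classify every pixel with each bootstrap model, and summarise per-pixel uncertainty with the maximum class probability and Shannon entropy of the resulting class probability vector. A Random Forest stands in here for whatever supervised LULC classifier is used; all data are synthetic.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.utils import resample

        rng = np.random.default_rng(2)
        X_train = rng.normal(size=(300, 6)); y_train = rng.integers(0, 4, 300)
        X_pixels = rng.normal(size=(1000, 6))
        n_boot, n_classes = 50, 4

        counts = np.zeros((X_pixels.shape[0], n_classes))
        for b in range(n_boot):
            Xb, yb = resample(X_train, y_train, random_state=b)          # bootstrap training set
            model = RandomForestClassifier(n_estimators=50, random_state=b).fit(Xb, yb)
            labels = model.predict(X_pixels)
            counts[np.arange(len(labels)), labels] += 1                  # tally class votes per pixel

        prob = counts / n_boot                                           # per-pixel class probability vector
        max_prob = prob.max(axis=1)                                      # measure 1: maximum class probability
        entropy = -(prob * np.log(prob + 1e-12)).sum(axis=1)             # measure 2: Shannon entropy
        print("mean max-probability %.2f, mean entropy %.2f" % (max_prob.mean(), entropy.mean()))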

  19. Ethics in education supervision

    Directory of Open Access Journals (Sweden)

    Fatma ÖZMEN

    2008-06-01

    Full Text Available Supervision in education plays a crucial role in attaining educational goals. In addition to determining the present situation, it has a theoretical and practical function regarding the actions to be taken in general, and the achievement of teacher development in particular, to meet the educational goals in the most effective way. When education supervisors act ethically while carrying out this vital mission, they find it easier to build trust and to enhance the level of collaboration and sharing, thus contributing to organizational effectiveness. Ethics is an essential component of educational supervision, yet it remains rather vague in practice because it depends on conditions, persons, and situations. Developing ethical standards in institutions is therefore a difficult process. Based on a literature review, some research results, and sample cases reported by teachers and supervisors, this study aims to clarify the concept of ethics, to highlight its importance, and to make recommendations for more effective supervision from an ethical standpoint.

  20. Evaluating pixel and object based image classification techniques for mapping plant invasions from UAV derived aerial imagery: Harrisia pomanensis as a case study

    Science.gov (United States)

    Mafanya, Madodomzi; Tsele, Philemon; Botai, Joel; Manyama, Phetole; Swart, Barend; Monate, Thabang

    2017-07-01

    Invasive alien plants (IAPs) not only pose a serious threat to biodiversity and water resources but also have impacts on human and animal wellbeing. To support decision making in IAPs monitoring, semi-automated image classifiers which are capable of extracting valuable information in remotely sensed data are vital. This study evaluated the mapping accuracies of supervised and unsupervised image classifiers for mapping Harrisia pomanensis (a cactus plant commonly known as the Midnight Lady) using two interlinked evaluation strategies, i.e. point- and area-based accuracy assessment. Results of the point-based accuracy assessment show that, with reference to 219 ground control points, the supervised image classifiers (i.e. Maxver and Bhattacharya) mapped H. pomanensis better than the unsupervised image classifiers (i.e. K-mediuns, Euclidian Length and Isoseg). In this regard, user and producer accuracies were 82.4% and 84%, respectively, for the Maxver classifier. The user and producer accuracies for the Bhattacharya classifier were 90% and 95.7%, respectively. Though the Maxver produced a higher overall accuracy and Kappa estimate than the Bhattacharya classifier, the Maxver Kappa estimate of 0.8305 is not statistically significantly greater than the Bhattacharya Kappa estimate of 0.8088 at a 95% confidence level. The area-based accuracy assessment results show that the Bhattacharya classifier estimated the spatial extent of H. pomanensis with an average mapping accuracy of 86.1%, whereas the Maxver classifier only gave an average mapping accuracy of 65.2%. Based on these results, the Bhattacharya classifier is therefore recommended for mapping H. pomanensis. These findings will aid algorithm selection in the development of a semi-automated image classification system for mapping IAPs.
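    A hedged sketch of the point-based accuracy assessment reported above: overall accuracy, producer's and user's accuracies and Cohen's kappa computed from a confusion matrix of reference points against classified pixels. The reference and predicted labels below are synthetic stand-ins, not the study's data.

        import numpy as np
        from sklearn.metrics import confusion_matrix, cohen_kappa_score

        rng = np.random.default_rng(3)
        reference = rng.integers(0, 3, 219)                              # e.g. 219 ground control points
        predicted = np.where(rng.random(219) < 0.85, reference, rng.integers(0, 3, 219))

        cm = confusion_matrix(reference, predicted)                      # rows = reference, columns = mapped
        overall = np.trace(cm) / cm.sum()
        producers = np.diag(cm) / cm.sum(axis=1)                         # producer's accuracy (omission view)
        users = np.diag(cm) / cm.sum(axis=0)                             # user's accuracy (commission view)
        kappa = cohen_kappa_score(reference, predicted)
        print(overall, producers, users, kappa)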

  1. A New View of Classification in Astronomy with the Archetype Technique: An Astronomical Case of the NP-complete Set Cover Problem

    CERN Document Server

    Zhu, Guangtun

    2016-01-01

    We introduce a new generic Archetype technique for source classification and identification, based on the NP-complete set cover problem (SCP) in computer science and operations research (OR). We have developed a new heuristic SCP solver, by combining the greedy algorithm and the Lagrangian Relaxation (LR) approximation method. We test the performance of our code on the test cases from Beasley's OR Library and show that our SCP solver can efficiently yield solutions that are on average 99% optimal in terms of the cost. We discuss how to adopt SCP for classification purposes and put forward a new Archetype technique. We use an optical spectroscopic dataset of extragalactic sources from the Sloan Digital Sky Survey (SDSS) as an example to illustrate the steps of the technique. We show how the technique naturally selects a basis set of physically-motivated archetypal systems to represent all the extragalactic sources in the sample. We discuss several key aspects in the technique and in any general classification ...

  2. Polarimetric SAR Image Supervised Classification Method Integrating Eigenvalues

    Institute of Scientific and Technical Information of China (English)

    邢艳肖; 张毅; 李宁; 王宇; 胡桂香

    2016-01-01

    Since classification methods based on the H/a space have the drawback of yielding poor classification results for terrains with similar scattering features, in this study we propose a polarimetric Synthetic Aperture Radar (SAR) image classification method based on eigenvalues. First, we extract eigenvalues and fit their distribution with an adaptive Gaussian mixture model. Then, using the naive Bayesian classifier, we obtain preliminary classification results. The distributions of eigenvalues in two kinds of terrain may be similar, leading to incorrect classification in the preliminary step. We therefore calculate the similarity of every terrain pair and add pairs to a similarity table if their similarity is greater than a given threshold. We then apply the Wishart distance-based KNN classifier to these similar pairs to obtain refined classification results. We applied the proposed method to both airborne and spaceborne SAR datasets, and the results show that our method overcomes the shortcoming of the H/a-based unsupervised classification method in its use of eigenvalues, and produces results comparable with the Support Vector Machine (SVM)-based classification method.
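    A rough sketch of the first stage only, under assumed data: fit a Gaussian mixture to each class's eigenvalue features and assign each pixel to the class with the largest class-conditional log-likelihood, in the spirit of the generative preliminary classification described above. The adaptive choice of mixture components and the Wishart-distance KNN refinement are omitted.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(4)
        n_classes = 3
        # Synthetic eigenvalue features (3 per pixel) for each training class.
        X_train = [rng.normal(loc=c, scale=0.4, size=(200, 3)) for c in range(n_classes)]
        models = [GaussianMixture(n_components=2, random_state=0).fit(X) for X in X_train]

        X_test = rng.normal(loc=1.0, scale=0.6, size=(50, 3))
        log_lik = np.column_stack([m.score_samples(X_test) for m in models])  # per-class log-likelihood
        labels = log_lik.argmax(axis=1)                                       # preliminary class labels
        print(labels[:10])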

  3. Representation learning for cross-modality classification

    NARCIS (Netherlands)

    G. van Tulder (Gijs); M. de Bruijne (Marleen)

    2017-01-01

    Differences in scanning parameters or modalities can complicate image analysis based on supervised classification. This paper presents two representation learning approaches, based on autoencoders, that address this problem by learning representations that are similar across domains. Bot

  4. Strategies to Increase Accuracy in Text Classification

    NARCIS (Netherlands)

    D. Blommesteijn (Dennis)

    2014-01-01

    Text classification via supervised learning involves various steps from processing raw data, features extraction to training and validating classifiers. Within these steps implementation decisions are critical to the resulting classifier accuracy. This paper contains a report of the

  5. Strategies to Increase Accuracy in Text Classification

    NARCIS (Netherlands)

    Blommesteijn, D.

    2014-01-01

    Text classification via supervised learning involves various steps from processing raw data, features extraction to training and validating classifiers. Within these steps implementation decisions are critical to the resulting classifier accuracy. This paper contains a report of the study performed

  6. Classification of the flower and its cultivation techniques

    Institute of Scientific and Technical Information of China (English)

    田雪慧; 何瑞林; 欧雅丽

    2015-01-01

    The flower industry is a sunrise industry that has emerged in recent years. In the Chinese term for flowers, "hua" refers to plants with ornamental value, while "hui" is a general term for grasses. In its broad sense, "flowers" covers herbaceous and woody plants with ornamental value; in its narrow sense, it refers only to herbaceous plants with ornamental value. Besides their high commodity value, flowers beautify landscapes with their gorgeous colors and refreshing fragrance, improve the environment, enrich people's lives and promote the progress and development of human society, making them an indispensable part of today's world trade and social life. This article discusses the development of flowers mainly in terms of their classification and cultivation techniques.

  7. Stratifying land use/land cover for spatial analysis of disease ecology and risk: an example using object-based classification techniques.

    Science.gov (United States)

    Koch, David E; Mohler, Rhett L; Goodin, Douglas G

    2007-11-01

    Landscape epidemiology has made significant strides recently, driven in part by increasing availability of land cover data derived from remotely-sensed imagery. Using an example from a study of land cover effects on hantavirus dynamics at an Atlantic Forest site in eastern Paraguay, we demonstrate how automated classification methods can be used to stratify remotely-sensed land cover for studies of infectious disease dynamics. For this application, it was necessary to develop a scheme that could yield both land cover and land use data from the same classification. Hypothesizing that automated discrimination between classes would be more accurate using an object-based method compared to a per-pixel method, we used a single Landsat Enhanced Thematic Mapper+ (ETM+) image to classify land cover into eight classes using both per-pixel and object-based classification algorithms. Our results show that the object-based method achieves 84% overall accuracy, compared to only 43% using the per-pixel method. Producer's and user's accuracies for the object-based map were higher for every class compared to the per-pixel classification. The Kappa statistic was also significantly higher for the object-based classification. These results show the importance of using image information from domains beyond the spectral domain, and also illustrate the importance of object-based techniques for remote sensing applications in epidemiological studies.

  8. Stratifying land use/land cover for spatial analysis of disease ecology and risk: an example using object-based classification techniques

    Directory of Open Access Journals (Sweden)

    David E. Koch

    2007-11-01

    Full Text Available Landscape epidemiology has made significant strides recently, driven in part by increasing availability of land cover data derived from remotely-sensed imagery. Using an example from a study of land cover effects on hantavirus dynamics at an Atlantic Forest site in eastern Paraguay, we demonstrate how automated classification methods can be used to stratify remotely-sensed land cover for studies of infectious disease dynamics. For this application, it was necessary to develop a scheme that could yield both land cover and land use data from the same classification. Hypothesizing that automated discrimination between classes would be more accurate using an object-based method compared to a per-pixel method, we used a single Landsat Enhanced Thematic Mapper+ (ETM+) image to classify land cover into eight classes using both per-pixel and object-based classification algorithms. Our results show that the object-based method achieves 84% overall accuracy, compared to only 43% using the per-pixel method. Producer’s and user’s accuracies for the object-based map were higher for every class compared to the per-pixel classification. The Kappa statistic was also significantly higher for the object-based classification. These results show the importance of using image information from domains beyond the spectral domain, and also illustrate the importance of object-based techniques for remote sensing applications in epidemiological studies.

  9. Performance Analysis of Distributed Applications using Automatic Classification of Communication Inefficiencies

    Energy Technology Data Exchange (ETDEWEB)

    Vetter, J.

    1999-11-01

    We present a technique for performance analysis that helps users understand the communication behavior of their message passing applications. Our method automatically classifies individual communication operations and it reveals the cause of communication inefficiencies in the application. This classification allows the developer to focus quickly on the culprits of truly inefficient behavior, rather than manually foraging through massive amounts of performance data. Specifically, we trace the message operations of MPI applications and then classify each individual communication event using decision tree classification, a supervised learning technique. We train our decision tree using microbenchmarks that demonstrate both efficient and inefficient communication. Since our technique adapts to the target system's configuration through these microbenchmarks, we can simultaneously automate the performance analysis process and improve classification accuracy. Our experiments on four applications demonstrate that our technique can improve the accuracy of performance analysis, and dramatically reduce the amount of data that users must encounter.
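    An illustrative sketch of the classification idea, with invented feature names: a decision tree is trained on microbenchmark events labelled efficient or inefficient and then applied to events from an application trace. The feature set, labelling rule and data are assumptions, not the paper's.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(5)
        # Microbenchmark training events: [message_size_bytes, wait_time_us, queue_depth]
        X_bench = np.column_stack([rng.integers(64, 1 << 20, 2000),
                                   rng.exponential(50.0, 2000),
                                   rng.integers(0, 32, 2000)])
        y_bench = (X_bench[:, 1] > 100).astype(int)          # toy rule: long waits labelled inefficient

        tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_bench, y_bench)

        # Events traced from the target application, classified individually.
        X_events = np.column_stack([rng.integers(64, 1 << 20, 10),
                                    rng.exponential(50.0, 10),
                                    rng.integers(0, 32, 10)])
        print(tree.predict(X_events))                        # 1 flags a candidate inefficient operation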

  10. Supervised hub-detection for brain connectivity

    Science.gov (United States)

    Kasenburg, Niklas; Liptrot, Matthew; Reislev, Nina Linde; Garde, Ellen; Nielsen, Mads; Feragen, Aasa

    2016-03-01

    A structural brain network consists of physical connections between brain regions. Brain network analysis aims to find features associated with a parameter of interest through supervised prediction models such as regression. Unsupervised preprocessing steps like clustering are often applied, but can smooth discriminative signals in the population, degrading predictive performance. We present a novel hub-detection optimized for supervised learning that both clusters network nodes based on population level variation in connectivity and also takes the learning problem into account. The found hubs are a low-dimensional representation of the network and are chosen based on predictive performance as features for a linear regression. We apply our method to the problem of finding age-related changes in structural connectivity. We compare our supervised hub-detection (SHD) to an unsupervised hub-detection and a linear regression using the original network connections as features. The results show that the SHD is able to retain regression performance, while still finding hubs that represent the underlying variation in the population. Although here we applied the SHD to brain networks, it can be applied to any network regression problem. Further development of the presented algorithm will be the extension to other predictive models such as classification or non-linear regression.

  11. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  12. Researching online supervision

    DEFF Research Database (Denmark)

    Smedegaard Ernst Bengtsen, Søren; Mathiasen, Helle

    2014-01-01

    , or a poor substitution of such. This one-sidedness on the conceptual level makes it challenging to empirically study the deeper implications digital tools have for the supervisory dialogue. Drawing on phenomenology and systems theory we argue that we need new concepts in qualitative methodology that allow...... us to research the digital tools on their own premises as autonomous things in themselves, possessing an ontological creativity of their own. In order for qualitative research to match the ontological nature of digital tools we conclude the article by formulating three criteria of a ‘torn......’ methodology that makes room for new approaches to researching online supervision at the university....

  13. Researching online supervision

    DEFF Research Database (Denmark)

    Bengtsen, Søren S. E.; Mathiasen, Helle

    2014-01-01

    us to research the digital tools on their own premises as autonomous things in themselves, possessing an ontological creativity of their own. In order for qualitative research to match the ontological nature of digital tools we conclude the article by formulating three criteria of a ‘torn......’ methodology that makes room for new approaches to researching online supervision at the university......., or a poor substitution of such. This one-sidedness on the conceptual level makes it challenging to empirically study the deeper implications digital tools have for the supervisory dialogue. Drawing on phenomenology and systems theory we argue that we need new concepts in qualitative methodology that allow...

  14. Online supervision at the university

    DEFF Research Database (Denmark)

    Bengtsen, Søren Smedegaard; Jensen, Gry Sandholm

    2015-01-01

    The article presents and condenses the background, findings and results of a one yearlong research project on online supervision and feedback at the university. The article builds on presentations and discussions in different research environments and conferences on higher education research...... supervision proves unhelpful when trying to understand how online supervision and feedback is a pedagogical phenomenon in its own right, and irreducible to the face-to-face context. Secondly we show that not enough attention has been given to the way different digital tools and platforms influence...... the supervisory dialogue in the specific supervision context. We conclude by terming this challenge in online supervision a form of ‘torn pedagogy’; that online tools and platforms destabilise and ‘tear’ traditional understandings of supervision pedagogy ‘apart’. Also, we conclude that on the backdrop of a torn...

  15. Voltage sags and transient detection and classification using half/one-cycle windowing techniques based on continuous s-transform with neural network

    Science.gov (United States)

    Daud, Kamarulazhar; Abidin, Ahmad Farid; Ismail, Ahmad Puad

    2017-08-01

    This study detects and classifies different power quality disturbances (PQDs) using half- and one-cycle windowing techniques (WT) based on the Continuous S-Transform (CST) and a Neural Network (NN). A 14-bus system based on the IEEE standard was designed in MATLAB©/Simulink to provide PQD data. The PQD data are analysed using the WT based on CST to extract their features and characteristics. The study also addresses an important issue concerning the identification, selection and detection of PQDs: the features and characteristics of two types of signals, voltage sags and transients, are obtained. After feature extraction, classification is performed using the NN to give the percentage of PQDs classified as either voltage sags or transients. The analysis shows which choice of windowing cycle provides smooth detection of PQDs and which characteristics are most suitable for achieving the highest classification percentage.

  16. Weakly supervised visual dictionary learning by harnessing image attributes.

    Science.gov (United States)

    Gao, Yue; Ji, Rongrong; Liu, Wei; Dai, Qionghai; Hua, Gang

    2014-12-01

    Bag-of-features (BoFs) representation has been extensively applied to deal with various computer vision applications. To extract discriminative and descriptive BoF, one important step is to learn a good dictionary to minimize the quantization loss between local features and codewords. While most existing visual dictionary learning approaches are engaged with unsupervised feature quantization, the latest trend has turned to supervised learning by harnessing the semantic labels of images or regions. However, such labels are typically too expensive to acquire, which restricts the scalability of supervised dictionary learning approaches. In this paper, we propose to leverage image attributes to weakly supervise the dictionary learning procedure without requiring any actual labels. As a key contribution, our approach establishes a generative hidden Markov random field (HMRF), which models the quantized codewords as the observed states and the image attributes as the hidden states, respectively. Dictionary learning is then performed by supervised grouping the observed states, where the supervised information is stemmed from the hidden states of the HMRF. In such a way, the proposed dictionary learning approach incorporates the image attributes to learn a semantic-preserving BoF representation without any genuine supervision. Experiments in large-scale image retrieval and classification tasks corroborate that our approach significantly outperforms the state-of-the-art unsupervised dictionary learning approaches.

  17. On protocols and measures for the validation of supervised methods for the inference of biological networks

    Directory of Open Access Journals (Sweden)

    Marie eSchrynemackers

    2013-12-01

    Full Text Available Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines on how to best exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs.

  18. Crowdsourcing as a novel technique for retinal fundus photography classification: analysis of images in the EPIC Norfolk cohort on behalf of the UK Biobank Eye and Vision Consortium.

    Science.gov (United States)

    Mitry, Danny; Peto, Tunde; Hayat, Shabina; Morgan, James E; Khaw, Kay-Tee; Foster, Paul J

    2013-01-01

    Crowdsourcing is the process of outsourcing numerous tasks to many untrained individuals. Our aim was to assess the performance and repeatability of crowdsourcing for the classification of retinal fundus photography. One hundred retinal fundus photograph images with pre-determined disease criteria were selected by experts from a large cohort study. After reading brief instructions and an example classification, we requested that knowledge workers (KWs) from a crowdsourcing platform classify each image as normal or abnormal with grades of severity. Each image was classified 20 times by different KWs. Four study designs were examined to assess the effect of varying incentive and KW experience on classification accuracy. All study designs were conducted twice to examine repeatability. Performance was assessed by comparing the sensitivity, specificity and area under the receiver operating characteristic curve (AUC). Without restriction on eligible participants, two thousand classifications of 100 images were received in under 24 hours at minimal cost. In trial 1, all study designs had an AUC (95%CI) of 0.701 (0.680-0.721) or greater for classification of normal/abnormal. In trial 1, the highest AUC (95%CI) for normal/abnormal classification was 0.757 (0.738-0.776) for KWs with moderate experience. Comparable results were observed in trial 2. In trial 1, between 64-86% of any abnormal image was correctly classified by over half of all KWs. In trial 2, this ranged between 74-97%. Sensitivity was ≥ 96% for normal versus severely abnormal detections across all trials. Sensitivity for normal versus mildly abnormal varied between 61-79% across trials. With minimal training, crowdsourcing represents an accurate, rapid and cost-effective method of retinal image analysis which demonstrates good repeatability. Larger studies with more comprehensive participant training are needed to explore the utility of this compelling technique in large scale medical image analysis.
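    A hedged sketch of the evaluation step described above: aggregate the 20 knowledge-worker classifications per image into a crowd score, then compare against the expert labels with sensitivity, specificity and AUC. All labels and vote probabilities below are synthetic placeholders.

        import numpy as np
        from sklearn.metrics import confusion_matrix, roc_auc_score

        rng = np.random.default_rng(6)
        expert = rng.integers(0, 2, 100)                       # 0 = normal, 1 = abnormal (reference standard)
        p_abnormal_vote = np.where(expert[:, None] == 1, 0.8, 0.25)
        votes = (rng.random((100, 20)) < p_abnormal_vote).astype(int)   # 20 KW votes per image

        crowd_score = votes.mean(axis=1)                       # fraction of KWs voting "abnormal"
        crowd_label = (crowd_score >= 0.5).astype(int)         # majority vote

        tn, fp, fn, tp = confusion_matrix(expert, crowd_label).ravel()
        print("sensitivity %.2f  specificity %.2f  AUC %.2f" %
              (tp / (tp + fn), tn / (tn + fp), roc_auc_score(expert, crowd_score)))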

  19. Crowdsourcing as a novel technique for retinal fundus photography classification: analysis of images in the EPIC Norfolk cohort on behalf of the UK Biobank Eye and Vision Consortium.

    Directory of Open Access Journals (Sweden)

    Danny Mitry

    Full Text Available AIM: Crowdsourcing is the process of outsourcing numerous tasks to many untrained individuals. Our aim was to assess the performance and repeatability of crowdsourcing for the classification of retinal fundus photography. METHODS: One hundred retinal fundus photograph images with pre-determined disease criteria were selected by experts from a large cohort study. After reading brief instructions and an example classification, we requested that knowledge workers (KWs) from a crowdsourcing platform classify each image as normal or abnormal with grades of severity. Each image was classified 20 times by different KWs. Four study designs were examined to assess the effect of varying incentive and KW experience on classification accuracy. All study designs were conducted twice to examine repeatability. Performance was assessed by comparing the sensitivity, specificity and area under the receiver operating characteristic curve (AUC). RESULTS: Without restriction on eligible participants, two thousand classifications of 100 images were received in under 24 hours at minimal cost. In trial 1, all study designs had an AUC (95%CI) of 0.701 (0.680-0.721) or greater for classification of normal/abnormal. In trial 1, the highest AUC (95%CI) for normal/abnormal classification was 0.757 (0.738-0.776) for KWs with moderate experience. Comparable results were observed in trial 2. In trial 1, between 64-86% of any abnormal image was correctly classified by over half of all KWs. In trial 2, this ranged between 74-97%. Sensitivity was ≥ 96% for normal versus severely abnormal detections across all trials. Sensitivity for normal versus mildly abnormal varied between 61-79% across trials. CONCLUSIONS: With minimal training, crowdsourcing represents an accurate, rapid and cost-effective method of retinal image analysis which demonstrates good repeatability. Larger studies with more comprehensive participant training are needed to explore the utility of this compelling

  20. Validation of automated supervised segmentation of multibeam backscatter data from the Chatham Rise, New Zealand

    Science.gov (United States)

    Hillman, Jess I. T.; Lamarche, Geoffroy; Pallentin, Arne; Pecher, Ingo A.; Gorman, Andrew R.; Schneider von Deimling, Jens

    2017-01-01

    Using automated supervised segmentation of multibeam backscatter data to delineate seafloor substrates is a relatively novel technique. Low-frequency multibeam echosounders (MBES), such as the 12-kHz EM120, present particular difficulties since the signal can penetrate several metres into the seafloor, depending on substrate type. We present a case study illustrating how a non-targeted dataset may be used to derive information from multibeam backscatter data regarding distribution of substrate types. The results allow us to assess limitations associated with low frequency MBES where sub-bottom layering is present, and test the accuracy of automated supervised segmentation performed using SonarScope® software. This is done through comparison of predicted and observed substrate from backscatter facies-derived classes and substrate data, reinforced using quantitative statistical analysis based on a confusion matrix. We use sediment samples, video transects and sub-bottom profiles acquired on the Chatham Rise, east of New Zealand. Inferences on the substrate types are made using the Generic Seafloor Acoustic Backscatter (GSAB) model, and the extents of the backscatter classes are delineated by automated supervised segmentation. Correlating substrate data to backscatter classes revealed that backscatter amplitude may correspond to lithologies up to 4 m below the seafloor. Our results emphasise several issues related to substrate characterisation using backscatter classification, primarily because the GSAB model does not only relate to grain size and roughness properties of substrate, but also accounts for other parameters that influence backscatter. Better understanding these limitations allows us to derive first-order interpretations of sediment properties from automated supervised segmentation.

  1. Classification of reflected signals from cavitated tooth surfaces using an artificial intelligence technique incorporating a fiber optic displacement sensor

    Science.gov (United States)

    Rahman, Husna Abdul; Harun, Sulaiman Wadi; Arof, Hamzah; Irawati, Ninik; Musirin, Ismail; Ibrahim, Fatimah; Ahmad, Harith

    2014-05-01

    An enhanced dental cavity diameter measurement mechanism using an intensity-modulated fiber optic displacement sensor (FODS) scanning and imaging system, fuzzy logic, and a single-layer perceptron (SLP) neural network is presented. The SLP network was employed for the classification of the reflected signals, which were obtained from the surfaces of teeth samples and captured using the FODS. Two features were used for the classification of the reflected signals, one of them being the output of the fuzzy logic. The test results showed that the combined fuzzy logic and SLP network methodology achieved a 100% classification accuracy. This high classification accuracy demonstrates the suitability of the proposed features and of SLP-network classification for the reflected signals from teeth surfaces, enabling the sensor to accurately measure small tooth cavity diameters of up to 0.6 mm. The method remains simple enough to allow its easy integration in existing dental restoration support systems.

  2. Analysis of Different Classification Techniques for Two-Class Functional Near-Infrared Spectroscopy-Based Brain-Computer Interface

    Directory of Open Access Journals (Sweden)

    Noman Naseer

    2016-01-01

    Full Text Available We analyse and compare the classification accuracies of six different classifiers for a two-class mental task (mental arithmetic and rest) using functional near-infrared spectroscopy (fNIRS) signals. The signals of the mental arithmetic and rest tasks from the prefrontal cortex region of the brain for seven healthy subjects were acquired using a multichannel continuous-wave imaging system. After removal of the physiological noises, six features were extracted from the oxygenated hemoglobin (HbO) signals. Two- and three-dimensional combinations of those features were used for classification of mental tasks. In the classification, six different modalities, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), k-nearest neighbour (kNN), the Naïve Bayes approach, support vector machine (SVM), and artificial neural networks (ANN), were utilized. With these classifiers, the average classification accuracies among the seven subjects for the 2- and 3-dimensional combinations of features were 71.6, 90.0, 69.7, 89.8, 89.5, and 91.4% and 79.6, 95.2, 64.5, 94.8, 95.2, and 96.3%, respectively. ANN showed the maximum classification accuracies: 91.4 and 96.3%. In order to validate the results, a statistical significance test was performed, which confirmed that the p values were statistically significant relative to all of the other classifiers (p < 0.005) using HbO signals.

  3. Analysis of Different Classification Techniques for Two-Class Functional Near-Infrared Spectroscopy-Based Brain-Computer Interface

    Science.gov (United States)

    Qureshi, Nauman Khalid; Noori, Farzan Majeed; Hong, Keum-Shik

    2016-01-01

    We analyse and compare the classification accuracies of six different classifiers for a two-class mental task (mental arithmetic and rest) using functional near-infrared spectroscopy (fNIRS) signals. The signals of the mental arithmetic and rest tasks from the prefrontal cortex region of the brain for seven healthy subjects were acquired using a multichannel continuous-wave imaging system. After removal of the physiological noises, six features were extracted from the oxygenated hemoglobin (HbO) signals. Two- and three-dimensional combinations of those features were used for classification of mental tasks. In the classification, six different modalities, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), k-nearest neighbour (kNN), the Naïve Bayes approach, support vector machine (SVM), and artificial neural networks (ANN), were utilized. With these classifiers, the average classification accuracies among the seven subjects for the 2- and 3-dimensional combinations of features were 71.6, 90.0, 69.7, 89.8, 89.5, and 91.4% and 79.6, 95.2, 64.5, 94.8, 95.2, and 96.3%, respectively. ANN showed the maximum classification accuracies: 91.4 and 96.3%. In order to validate the results, a statistical significance test was performed, which confirmed that the p values were statistically significant relative to all of the other classifiers (p < 0.005) using HbO signals.
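    A minimal comparison scaffold in the spirit of the two abstracts above: the same six classifier families evaluated with cross-validation on feature vectors extracted from HbO signals. The features and data here are synthetic placeholders; only the comparison loop is illustrated.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.naive_bayes import GaussianNB
        from sklearn.svm import SVC
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(7)
        X = rng.normal(size=(200, 3))          # e.g. signal mean, slope and peak per trial (assumed features)
        y = rng.integers(0, 2, 200)            # mental arithmetic vs rest

        classifiers = {
            "LDA": LinearDiscriminantAnalysis(),
            "QDA": QuadraticDiscriminantAnalysis(),
            "kNN": KNeighborsClassifier(),
            "Naive Bayes": GaussianNB(),
            "SVM": SVC(),
            "ANN": MLPClassifier(max_iter=1000, random_state=0),
        }
        for name, clf in classifiers.items():
            print(name, cross_val_score(clf, X, y, cv=5).mean())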

  4. Public Supervision over Private Relationships : Towards European Supervision Private Law?

    NARCIS (Netherlands)

    Cherednychenko, O.O.

    2014-01-01

    The rise of public supervision over private relationships in many areas of private law has led to the development of what, in the author’s view, could be called ‘European supervision private law’. This emerging body of law forms part of European regulatory private law and is made up of contract-rela

  5. Supervising PETE Candidates Using the Situational Supervision Model

    Science.gov (United States)

    Levy, Linda S.; Johnson, Lynn V.

    2012-01-01

    Physical education teacher candidates (PETCs) often, as part of their curricular requirements, engage in early field experiences that prepare them for student teaching. Matching the PETC's developmental level with the mentor's supervision style enhances this experience. The situational supervision model, based on the situational leadership model,…

  6. Exploring Clinical Supervision as Instrument for Effective Teacher Supervision

    Science.gov (United States)

    Ibara, E. C.

    2013-01-01

    This paper examines clinical supervision approaches that have the potential to promote and implement effective teacher supervision in Nigeria. The various approaches have been analysed based on the conceptual framework of instructional supervisory behavior. The findings suggest that a clear distinction can be made between the prescriptive and…

  7. Supervision Learning as Conceptual Threshold Crossing: When Supervision Gets "Medieval"

    Science.gov (United States)

    Carter, Susan

    2016-01-01

    This article presumes that supervision is a category of teaching, and that we all "learn" how to teach better. So it enquires into what novice supervisors need to learn. An anonymised digital questionnaire sought data from supervisors [n226] on their experiences of supervision to find out what was difficult, and supervisor interviews…

  8. Supervision Learning as Conceptual Threshold Crossing: When Supervision Gets "Medieval"

    Science.gov (United States)

    Carter, Susan

    2016-01-01

    This article presumes that supervision is a category of teaching, and that we all "learn" how to teach better. So it enquires into what novice supervisors need to learn. An anonymised digital questionnaire sought data from supervisors [n226] on their experiences of supervision to find out what was difficult, and supervisor interviews…

  9. Supervision of Supervised Agricultural Experience Programs: A Synthesis of Research.

    Science.gov (United States)

    Dyer, James E.; Williams, David L.

    1997-01-01

    A review of literature from 1964 to 1993 found that supervised agricultural experience (SAE) teachers, students, parents, and employers value the teachers' supervisory role. Implementation practices vary widely and there are no cumulative data to guide policies and standards for SAE supervision. (SK)

  10. Binary classification of ¹⁸F-flutemetamol PET using machine learning

    DEFF Research Database (Denmark)

    Vandenberghe, Rik; Nelissen, Natalie; Salmon, Eric

    2013-01-01

    (18)F-flutemetamol is a positron emission tomography (PET) tracer for in vivo amyloid imaging. The ability to classify amyloid scans in a binary manner as 'normal' versus 'Alzheimer-like' is of high clinical relevance. We evaluated whether a supervised machine learning technique, support vector machines (SVM), can replicate the assignments made by visual readers blind to the clinical diagnosis, which image components have highest diagnostic value according to SVM and how (18)F-flutemetamol-based classification using SVM relates to structural MRI-based classification using SVM within the same

  11. Process of social supervision in nursing: possibility of transformation of the care model

    Directory of Open Access Journals (Sweden)

    Valesca Silveira Correia

    2013-01-01

    Full Text Available This is a qualitative, descriptive and exploratory study carried out with nurses of a Family Health Unit. It aimed to understand nurses' social representations of the process of social supervision in nursing within the Family Health Strategy. A semi-structured interview and a focus group were used as data collection techniques, and Bardin's content analysis was used to analyse the data. The study showed that situational strategic planning, teamwork and the use of supervision techniques and instruments are strategies to be considered for developing the process of social supervision in the Family Health Strategy. However, the nurses showed representations grounded in traditional supervision when they conceived of planning as dissociated from execution and when they revealed that their professional practice is influenced by the campaign-based sanitarist model of care. It is concluded that the nurses' social representations of the process of social supervision in the family health team point to the need to move beyond traditional supervision towards a new vision of health practices through social supervision, with family health as a proposal for changing the hegemonic care model.

  12. AN OVERVIEW OF RESEARCH CHALLENGES FOR CLASSIFICATION OF CARDIOTOCOGRAM DATA

    Directory of Open Access Journals (Sweden)

    C. Sundar

    2013-01-01

    Full Text Available Cardiotocography (CTG) is the simultaneous recording of the Fetal Heart Rate (FHR) and Uterine Contractions (UC), and is among the most common diagnostic techniques for evaluating maternal and fetal well-being during pregnancy and before delivery. By observing CTG trace patterns, doctors can assess the state of the fetus. Several signal processing and computer programming based techniques exist for interpreting typical CTG data. A model-based CTG data classification system is presented, using a supervised Artificial Neural Network (ANN) that classifies CTG data according to its training data. The performance of the neural network based classification model has been compared with the most commonly used unsupervised clustering methods, Fuzzy C-means and k-means clustering. The results show that the supervised machine learning based classification approach performed significantly better than the compared unsupervised clustering methods. While the traditional clustering methods can identify Normal CTG patterns, they were incapable of finding Suspicious and Pathologic patterns. The ANN based classifier was capable of identifying Normal, Suspicious and Pathologic conditions from the nature of the CTG data with very good accuracy.
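    A hedged illustration of the comparison discussed above: a supervised neural network versus unsupervised k-means clustering on the same CTG-style feature vectors. The features, labels and network size are assumptions for demonstration only.

        import numpy as np
        from sklearn.neural_network import MLPClassifier
        from sklearn.cluster import KMeans
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import accuracy_score, adjusted_rand_score

        rng = np.random.default_rng(8)
        X = rng.normal(size=(600, 8))                          # stand-in for FHR/UC summary features
        y = rng.integers(0, 3, 600)                            # Normal / Suspicious / Pathologic

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0).fit(X_tr, y_tr)
        print("ANN accuracy:", accuracy_score(y_te, ann.predict(X_te)))

        clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
        print("k-means agreement with labels (ARI):", adjusted_rand_score(y, clusters))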

  13. Tuning, Diagnostics & Data Preparation for Generalized Linear Models Supervised Algorithm in Data Mining Technologies

    Directory of Open Access Journals (Sweden)

    Sachin Bhaskar

    2015-07-01

    Full Text Available Data mining techniques are the result of a long process of research and product development. In data mining, large amounts of data are searched to find trends and patterns that go beyond simple analysis, and complex mathematical algorithms are used to segment the data and to evaluate the probability of future events. Each data mining model is produced by a specific algorithm, and some data mining problems are best solved by using more than one algorithm. Data mining technologies can be used through Oracle. The Generalized Linear Models (GLM) algorithm is used in the Regression and Classification Oracle Data Mining functions. GLM is one of the popular statistical techniques for linear modelling, and Oracle Data Mining implements GLM for regression and binary classification. GLM provides row diagnostics as well as model statistics and extensive coefficient statistics, and it also supports confidence bounds. This paper outlines and analyses the GLM algorithm as a guide to understanding the tuning, diagnostics and data preparation process and the importance of the Regression and Classification supervised Oracle Data Mining functions, which are utilized in marketing, time series prediction, financial forecasting, overall business planning, trend analysis, environmental modelling, biomedical and drug response modelling, etc.
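    A generic illustration, not Oracle Data Mining's API: a Gaussian GLM for regression and a binomial GLM (logistic regression) for binary classification, the two GLM uses named above, fitted with statsmodels on synthetic data. The printed coefficient statistics and confidence bounds correspond to the kinds of outputs the abstract mentions.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(9)
        X = sm.add_constant(rng.normal(size=(200, 3)))                  # intercept + 3 predictors
        y_cont = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=200)
        y_bin = (y_cont > y_cont.mean()).astype(int)

        reg = sm.GLM(y_cont, X, family=sm.families.Gaussian()).fit()    # GLM for regression
        clf = sm.GLM(y_bin, X, family=sm.families.Binomial()).fit()     # GLM for binary classification
        print(reg.params)                                               # coefficient statistics
        print(clf.conf_int())                                           # confidence bounds on coefficients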

  14. Inductive Supervised Quantum Learning

    Science.gov (United States)

    Monràs, Alex; Sentís, Gael; Wittek, Peter

    2017-05-01

    In supervised learning, an inductive learning algorithm extracts general rules from observed training instances, then the rules are applied to test instances. We show that this splitting of training and application arises naturally, in the classical setting, from a simple independence requirement with a physical interpretation of being nonsignaling. Thus, two seemingly different definitions of inductive learning happen to coincide. This follows from the properties of classical information that break down in the quantum setup. We prove a quantum de Finetti theorem for quantum channels, which shows that in the quantum case, the equivalence holds in the asymptotic setting, that is, for large numbers of test instances. This reveals a natural analogy between classical learning protocols and their quantum counterparts, justifying a similar treatment, and allowing us to inquire about standard elements in computational learning theory, such as structural risk minimization and sample complexity.

  15. Supervision in Special Language Programs.

    Science.gov (United States)

    Florez-Tighe, Viola

    Too little emphasis is placed on instructional supervision in special language programs for limited-English-proficient students. Such supervision can provide a mechanism to promote the growth of instructional staff, improve the instructional program, and lead to curriculum development. Many supervisors are undertrained and unable to provide…

  16. Unfinished Business: Subjectivity and Supervision

    Science.gov (United States)

    Green, Bill

    2005-01-01

    Within the now burgeoning literature on doctoral research education, postgraduate research supervision continues to be a problematical issue, practically and theoretically. This paper seeks to explore and understand supervision as a distinctive kind of pedagogic practice. Informed by a larger research project, it draws on poststructuralism,…

  17. Supervision af psykoterapi via Skype

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard; Grünbaum, Liselotte

    2011-01-01

    clinical experience of Skype™ in supervision, mainly of psychoanalytic child psychotherapy, is presented and reflected upon. Finally, the reluctance of the Danish Board for Psychologists to recognize audiovisual distance supervision as part of the required training demands is discussed. It is concluded...

  18. Supervisees' Perception of Clinical Supervision

    Science.gov (United States)

    Willis, Lisa

    2010-01-01

    Supervisors must become aware of the possible conflicts that could arise during clinical supervision. It is important that supervisors communicate their roles and expectations effectively with their supervisees. This paper supports the notion that supervision is a mutual agreement between the supervisee and the supervisor and the roles of…

  19. Assessment of Counselors' Supervision Processes

    Science.gov (United States)

    Ünal, Ali; Sürücü, Abdullah; Yavuz, Mustafa

    2013-01-01

    The aim of this study is to investigate elementary and high school counselors' supervision processes and efficiency of their supervision. The interview method was used as it was thought to be better for realizing the aim of the study. The study group was composed of ten counselors who were chosen through purposeful sampling method. Data were…

  20. Tværfaglig supervision

    DEFF Research Database (Denmark)

    Interdisciplinary supervision covers the supervision of different professional groups. It is a complex discipline that places great demands on the supervisor. The first part of the book presents four supervision models: a general, a psychodynamic, a cognitive-behavioural and a narrative model. The second part...