WorldWideScience

Sample records for based classification model

  1. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the... classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups... datasets. Our model also outperforms A Decision Cluster Classification (ADCC) and the Decision Cluster Forest Classification (DCFC) models on the Reuters-21578 dataset....
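
    The cluster-then-classify idea described above can be sketched in a few lines: cluster the training examples, fit a simple classifier on each cluster, and route a new example through its nearest cluster. The toy 2-D features, the naive k-means initialization, and the majority-vote per-cluster "classifier" are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

def kmeans(points, k, iters=10):
    """Naive k-means: evenly spaced initial centroids, fixed iteration count."""
    centroids = points[::max(1, len(points) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids

# toy 2-D "document" features with labels (purely illustrative)
X = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
y = ["ham", "ham", "spam", "spam"]

centroids = kmeans(X, k=2)

# one simple classifier per cluster -- here just a majority vote over its members
votes = [Counter() for _ in centroids]
for p, label in zip(X, y):
    ci = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
    votes[ci][label] += 1

def classify(p):
    ci = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
    return votes[ci].most_common(1)[0][0]
```

    Each per-cluster classifier sees fewer examples than a single global model, which is the simplification the abstract claims.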

  2. An Agent Based Classification Model

    CERN Document Server

    Gu, Feng; Greensmith, Julie

    2009-01-01

    The major function of this model is to access the UCI Wisconsin Breast Cancer data-set [1] and classify the data items into two categories, normal and anomalous. This kind of classification can be referred to as anomaly detection, which discriminates anomalous behaviour from normal behaviour in computer systems. One popular solution for anomaly detection is Artificial Immune Systems (AIS). AIS are adaptive systems inspired by theoretical immunology and observed immune functions, principles and models, which are applied to problem solving. The Dendritic Cell Algorithm (DCA) [2] is an AIS algorithm that is developed specifically for anomaly detection. It has been successfully applied to intrusion detection in computer security. It is believed that agent-based modelling is an ideal approach for implementing AIS, as intelligent agents could be the perfect representations of immune entities in AIS. This model evaluates the feasibility of re-implementing the DCA in an agent-based simulation environ...
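
    The Dendritic Cell Algorithm mentioned above fuses input signals (PAMP, danger, safe) into a cell context and scores each antigen by the fraction of samplings seen in a mature context (the MCAV). The sketch below uses assumed signal weights and an assumed 0.5 threshold; the published DCA uses empirically derived weights and a population of cells.

```python
# Assumed weights for the mature-context score; published DCA weights differ.
MATURE_W = {"pamp": 2.0, "danger": 1.0, "safe": -3.0}

def cell_context(signals):
    """Fuse one sampling's signals into 'mature' (anomalous) or 'semi-mature'."""
    score = sum(MATURE_W[name] * value for name, value in signals.items())
    return "mature" if score > 0 else "semi-mature"

def mcav(samplings):
    """Mature Context Antigen Value: fraction of samplings in a mature context."""
    mature = sum(1 for s in samplings if cell_context(s) == "mature")
    return mature / len(samplings)

# invented signal streams for a normal and an anomalous data item
normal = [{"pamp": 0.0, "danger": 0.1, "safe": 0.9}] * 5
anomalous = [{"pamp": 0.8, "danger": 0.7, "safe": 0.1}] * 5

def is_anomalous(samplings, threshold=0.5):  # threshold is an assumption
    return mcav(samplings) > threshold
```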

  3. An Agent Based Classification Model

    OpenAIRE

    Gu, Feng; Aickelin, Uwe; Greensmith, Julie

    2009-01-01

    The major function of this model is to access the UCI Wisconsin Breast Cancer data-set [1] and classify the data items into two categories, normal and anomalous. This kind of classification can be referred to as anomaly detection, which discriminates anomalous behaviour from normal behaviour in computer systems. One popular solution for anomaly detection is Artificial Immune Systems (AIS). AIS are adaptive systems inspired by theoretical immunology and observed immune functions, p...

  4. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Roč. 40, č. 3 (2004), s. 293-304. ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  5. An Efficient Semantic Model For Concept Based Clustering And Classification

    Directory of Open Access Journals (Sweden)

    SaiSindhu Bandaru

    2012-03-01

    Full Text Available Usually in text mining techniques, basic measures like the term frequency of a term (word or phrase) are computed to measure the importance of the term in the document. But with statistical analysis alone, the computed weight may not capture the exact semantics of the term. To overcome this problem, a new framework has been introduced which relies on a concept based model and a synonym based approach. The proposed model can efficiently find significant matching and related concepts between documents according to concept based and synonym based approaches. Large sets of experiments using the proposed model on different datasets in clustering and classification are conducted. Experimental results demonstrate the substantial enhancement of the clustering quality using sentence based, document based, corpus based and combined-approach concept analysis. A new similarity measure has been proposed to find the similarity between a document and the existing clusters, which can be used in classification of the document with existing clusters.

  6. A Fuzzy Similarity Based Concept Mining Model for Text Classification

    CERN Document Server

    Puri, Shalini

    2012-01-01

    Text classification is a challenging and highly active field with great importance in text categorization applications. A lot of research work has been done in this field, but there is still a need to categorize a collection of text documents into mutually exclusive categories by extracting the concepts or features using a supervised learning paradigm and different classification algorithms. In this paper, a new Fuzzy Similarity Based Concept Mining Model (FSCMM) is proposed to classify a set of text documents into pre-defined Category Groups (CG) by providing them training and preparation on the sentence, document and integrated corpora levels, along with feature reduction and ambiguity removal on each level to achieve high system performance. A Fuzzy Feature Category Similarity Analyzer (FFCSA) is used to analyze each extracted feature of the Integrated Corpora Feature Vector (ICFV) with the corresponding categories or classes. This model uses a Support Vector Machine Classifier (SVMC) to classify correct...

  7. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  8. Semi-Supervised Classification based on Gaussian Mixture Model for remote imagery

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Semi-Supervised Classification (SSC), which makes use of both labeled and unlabeled data to determine classification borders in feature space, has great advantages in extracting classification information from mass data. In this paper, a novel SSC method based on the Gaussian Mixture Model (GMM) is proposed, in which each class's feature space is described by one GMM. Experiments show the proposed method can achieve high classification accuracy with a small amount of labeled data. However, for the same accuracy, supervised classification methods such as Support Vector Machine, Object Oriented Classification, etc. should be provided with much more labeled data.
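
    A minimal sketch of the class-conditional idea: fit one Gaussian per class on the labeled data (a one-component "GMM", purely for illustration), label unlabeled samples by maximum likelihood, and refit. The real method uses multi-component GMMs per class fitted with EM; the 1-D pixel features and values here are invented.

```python
import math

def fit_gaussian(xs):
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, max(var, 1e-6)  # floor the variance to avoid degeneracy

def loglik(x, mu, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

# invented 1-D pixel features for two land-cover classes
labeled = {"water": [0.10, 0.20, 0.15], "forest": [0.80, 0.90, 0.85]}
models = {c: fit_gaussian(xs) for c, xs in labeled.items()}

def classify(x):
    return max(models, key=lambda c: loglik(x, *models[c]))

# one pseudo-labelling pass: absorb unlabeled pixels into their predicted
# class and refit the class models
for x in [0.12, 0.88, 0.22]:
    labeled[classify(x)].append(x)
models = {c: fit_gaussian(xs) for c, xs in labeled.items()}
```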

  9. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    OpenAIRE

    Li, N.; Liu, C; Pfeifer, N; Yin, J. F.; Liao, Z.Y.; Zhou, Y.

    2016-01-01

    Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation could kee...

  10. Pitch Based Sound Classification

    OpenAIRE

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U.

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classif...
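
    The harmonic product spectrum used above for pitch extraction can be sketched directly: multiply the magnitude spectrum by its 2x- and 3x-downsampled copies and read the fundamental off the surviving peak, since only the true fundamental has energy at all of its harmonic multiples. The toy spectrum below is an assumption for illustration.

```python
def harmonic_product_spectrum(mag, n_harmonics=3):
    """Return the bin of the estimated fundamental frequency."""
    n = len(mag) // n_harmonics      # keep i * h inside the spectrum
    best_bin, best_val = 1, -1.0
    for i in range(1, n):            # skip the DC bin
        product = 1.0
        for h in range(1, n_harmonics + 1):
            product *= mag[i * h]
        if product > best_val:
            best_bin, best_val = i, product
    return best_bin

# toy magnitude spectrum: fundamental at bin 10 with harmonics at 20 and 30,
# plus a stronger spurious peak at bin 7 that HPS rejects
mag = [0.0] * 64
for b, a in [(10, 1.0), (20, 0.8), (30, 0.6), (7, 1.2)]:
    mag[b] = a
```

    Note that a plain argmax of the spectrum would pick the spurious bin 7; the harmonic product suppresses it because bins 14 and 21 are empty.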

  11. Assessing the Performance of a Classification-Based Vulnerability Analysis Model

    OpenAIRE

    Wang, Tai-Ran; Mousseau, Vincent; Pedroni, Nicola; Zio, Enrico

    2015-01-01

    In this article, a classification model based on the majority rule sorting (MR-Sort) method is employed to evaluate the vulnerability of safety-critical systems with respect to malevolent intentional acts. The model is built on the basis of a (limited-size) set of data representing (a priori known) vulnerability classification examples. The empirical construction of the classification model introduces a source of uncertainty into the vulnerability analysis process: a quantitative assessment ...

  12. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft...

  13. Human Cancer Classification: A Systems Biology- Based Model Integrating Morphology, Cancer Stem Cells, Proteomics, and Genomics

    OpenAIRE

    Halliday A Idikio

    2011-01-01

    Human cancer classification is currently based on the idea of cell of origin and the light and electron microscopic attributes of the cancer. What is not yet integrated into cancer classification are the functional attributes of these cancer cells. Recent innovative techniques in biology have provided a wealth of information on the genomic, transcriptomic and proteomic changes in cancer cells. The emergence of the concept of cancer stem cells needs to be included in a classification model to capture...

  14. About Classification Methods Based on Tensor Modelling for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Salah Bourennane

    2010-03-01

    Full Text Available Denoising and Dimensionality Reduction (DR) are key issues in improving classifier efficiency for hyperspectral images (HSI). The recently developed multi-way Wiener filtering is used, and Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Projection Pursuit (PP) approaches to DR have been investigated. These matrix algebra methods are applied on vectorized images; thereby, the spatial arrangement is lost. To jointly take advantage of the spatial and spectral information, HSI has recently been represented as a tensor. Offering multiple ways to decompose data orthogonally, we introduce filtering and DR methods based on multilinear algebra tools. The DR is performed on the spectral way using PCA, or PP, jointly with an orthogonal projection onto a lower-dimensional subspace of the spatial ways. We show the classification improvement obtained with the introduced methods relative to existing methods. This experiment is exemplified using real-world HYDICE data. Keywords: multi-way filtering, dimensionality reduction, matrix and multilinear algebra tools, tensor processing.

  15. Topic Modelling for Object-Based Classification of VHR Satellite Images Based on Multiscale Segmentations

    Science.gov (United States)

    Shen, Li; Wu, Linmei; Li, Zhipeng

    2016-06-01

    Multiscale segmentation is a key prerequisite step for object-based classification methods. However, it is often not possible to determine a sole optimal scale for the image to be classified because in many cases different geo-objects, and even an identical geo-object, may appear at different scales in one image. In this paper, an object-based classification method based on multiscale segmentation results in the framework of topic modelling is proposed to classify VHR satellite images in an entirely unsupervised fashion. In the stage of topic modelling, grayscale histogram distributions for each geo-object class and each segment are learned in an unsupervised manner from multiscale segments. In the stage of classification, each segment is allocated a geo-object class label by the similarity comparison between the grayscale histogram distributions of each segment and each geo-object class. Experimental results show that the proposed method can perform better than the traditional methods based on topic modelling.
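
    The classification stage described above reduces to comparing a segment's grayscale histogram with each learned class histogram. A minimal sketch, with assumed bin counts, toy pixel values, and an L1 distance standing in for whatever similarity measure the paper uses:

```python
def histogram(pixels, bins=4, lo=0, hi=256):
    """Normalized grayscale histogram with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in pixels:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

# per-class histograms, as if learned from multiscale segments (toy values)
class_hists = {
    "water":    histogram([10, 20, 30, 25, 15]),
    "building": histogram([200, 220, 210, 230, 190]),
}

def classify_segment(pixels):
    """Assign the class whose histogram is closest to the segment's."""
    seg = histogram(pixels)
    return min(class_hists, key=lambda c: l1(seg, class_hists[c]))
```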

  16. An Efficient Semantic Model For Concept Based Clustering And Classification

    OpenAIRE

    SaiSindhu Bandaru; Dr. K B Madhuri

    2012-01-01

    Usually in text mining techniques, basic measures like the term frequency of a term (word or phrase) are computed to measure the importance of the term in the document. But with statistical analysis alone, the computed weight may not capture the exact semantics of the term. To overcome this problem, a new framework has been introduced which relies on a concept based model and a synonym based approach. The proposed model can efficiently find significant matching and related concepts between doc...

  17. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  18. Content-based similarity for 3D model retrieval and classification

    Institute of Scientific and Technical Information of China (English)

    Ke Lü; Ning He; Jian Xue

    2009-01-01

    With the rapid development of 3D digital shape information, content-based 3D model retrieval and classification has become an important research area. This paper presents a novel 3D model retrieval and classification algorithm. For feature representation, a method combining a distance histogram and moment invariants is proposed to improve the retrieval performance. The major advantage of using a distance histogram is its invariance to the transforms of scaling, translation and rotation. Based on the premise that two similar objects should have high mutual information, the querying of 3D data should convey a great deal of information on the shape of the two objects, and so we propose a mutual information distance measurement to perform the similarity comparison of 3D objects. The proposed algorithm is tested with a 3D model retrieval and classification prototype, and the experimental evaluation demonstrates satisfactory retrieval results and classification accuracy.

  19. Ligand and structure-based classification models for Prediction of P-glycoprotein inhibitors

    DEFF Research Database (Denmark)

    Klepsch, Freya; Poongavanam, Vasanthanathan; Ecker, Gerhard Franz

    2014-01-01

    obtained by docking into a homology model of P-gp, to supervised machine learning methods, such as Kappa nearest neighbor, support vector machine (SVM), random forest and binary QSAR, by using a large, structurally diverse data set. In addition, the applicability domain of the models was assessed using an...... algorithm based on Euclidean distance. Results show that random forest and SVM performed best for classification of P-gp inhibitors and non-inhibitors, correctly predicting 73/75 % of the external test set compounds. Classification based on the docking experiments using the scoring function Chem...

  20. Wearable-Sensor-Based Classification Models of Faller Status in Older Adults

    Science.gov (United States)

    2016-01-01

    Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor-based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6-month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, the Community Health Activities Model Program for Seniors questionnaire, and the six-minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling techniques for fall risk classification. Fall risk classification models developed for point-of-care environments should be developed using support vector machines and neural networks, with a multi-sensor single-task gait assessment. PMID:27054878

  1. Computerized Classification Testing under the One-Parameter Logistic Response Model with Ability-Based Guessing

    Science.gov (United States)

    Wang, Wen-Chung; Huang, Sheng-Yun

    2011-01-01

    The one-parameter logistic model with ability-based guessing (1PL-AG) has been recently developed to account for effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…

  2. A Feature-based Classification of Model Repair Approaches

    OpenAIRE

    Macedo, Nuno; Jorge, Tiago; Cunha, Alcino

    2015-01-01

    Consistency management, the ability to detect, diagnose and handle inconsistencies, is crucial during the development process in Model-driven Engineering (MDE). As the popularity and application scenarios of MDE have expanded, a variety of different techniques have been proposed to address these tasks in specific contexts. Of the various stages of consistency management, this work focuses on inconsistency fixing in MDE, where this task is embodied by model repair techniques. This paper proposes a featu...

  3. Hierarchical Web Page Classification Based on a Topic Model and Neighboring Pages Integration

    OpenAIRE

    Wongkot Sriurai; Phayung Meesad; Choochart Haruechaiyasak

    2010-01-01

    Most Web page classification models typically apply the bag of words (BOW) model to represent the feature space. The original BOW representation, however, is unable to recognize semantic relationships between terms. One possible solution is to apply the topic model approach based on the Latent Dirichlet Allocation algorithm to cluster the term features into a set of latent topics. Terms assigned into the same topic are semantically related. In this paper, we propose a novel hierarchical class...

  4. A Trust Model Based on Service Classification in Mobile Services

    CERN Document Server

    Liu, Yang; Xia, Feng; Lv, Xiaoning; Bu, Fanyu

    2010-01-01

    Internet of Things (IoT) and B3G/4G communication are promoting pervasive mobile services with their advanced features. However, security problems also hamper their development. This paper proposes a trust model to protect the user's security. The billing or trust operator works as an agent to provide trust authentication for all the service providers. The services are classified by a sensitivity value calculation. With this value, the user's trustworthiness for the corresponding service can be obtained. For decision making, three trust regions are divided, corresponding to three ranks: high, medium and low. The trust region tells the customer, given his calculated trust value, which rank he has attained and which authentication methods should be used for access. Authentication history and penalties are also taken into account.
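
    The three trust regions can be sketched as a simple threshold lookup from trust value to rank, and from rank to the required authentication method. The thresholds and method names below are assumptions for illustration; the paper derives trust values from service sensitivity calculations.

```python
# Assumed region boundaries and per-rank authentication requirements.
REGIONS = [(0.7, "high"), (0.4, "medium"), (0.0, "low")]  # (lower bound, rank)
AUTH = {"high": "password",
        "medium": "password + SMS code",
        "low": "password + SMS code + manual review"}

def trust_rank(trust_value):
    """Map a calculated trust value onto one of the three trust regions."""
    for lower_bound, rank in REGIONS:
        if trust_value >= lower_bound:
            return rank
    return "low"

def required_auth(trust_value):
    """Authentication method the customer must use, given his trust value."""
    return AUTH[trust_rank(trust_value)]
```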

  5. A model presented for classification ECG signals base on Case-Based Reasoning

    Directory of Open Access Journals (Sweden)

    Elaheh Sayari

    2013-07-01

    Full Text Available Early detection of heart diseases/abnormalities can prolong life and enhance the quality of living through appropriate treatment; thus, classifying cardiac signals helps in the immediate diagnosis of heart beat type in cardiac patients. The present paper utilizes case-based reasoning (CBR) for the classification of ECG signals. Four types of ECG beats (normal beat, congestive heart failure beat, ventricular tachyarrhythmia beat and atrial fibrillation beat) obtained from the PhysioBank database were classified by the proposed CBR model. The main purpose of this article is classifying heart signals and diagnosing the type of heart beat in cardiac patients; in the proposed CBR system, training and testing data for diagnosing and classifying types of heart beat have been used. The evaluation results show that the proposed model has high accuracy in classifying heart signals and supports clinical decisions for diagnosing the type of heart beat in cardiac patients, which indeed has a high impact on computer-aided diagnosis of heart beat type.
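
    The retrieve step of case-based reasoning amounts to finding the stored case(s) most similar to the query beat and reusing their diagnosis. A minimal sketch with a hypothetical two-feature case base (real ECG cases would use morphological or interval features, and a richer similarity measure):

```python
import math

# hypothetical case base: (feature vector, diagnosed beat type)
case_base = [
    ((0.80, 0.10), "normal"),
    ((0.20, 0.90), "atrial fibrillation"),
    ((0.15, 0.25), "ventricular tachyarrhythmia"),
    ((0.85, 0.80), "congestive heart failure"),
]

def retrieve(query, k=1):
    """CBR retrieve step: rank stored cases by similarity (here, Euclidean)."""
    ranked = sorted(case_base, key=lambda case: math.dist(query, case[0]))
    labels = [label for _, label in ranked[:k]]
    return labels[0] if k == 1 else labels
```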

  6. Latent classification models

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2005-01-01

    parametric family of distributions. In this paper we propose a new set of models for classification in continuous domains, termed latent classification models. The latent classification model can roughly be seen as combining the Naive Bayes model with a mixture of factor analyzers, thereby relaxing the assumptions of... classification model, and we demonstrate empirically that the accuracy of the proposed model is significantly higher than the accuracy of other probabilistic classifiers....

  7. Model-based Methods of Classification: Using the mclust Software in Chemometrics

    Directory of Open Access Journals (Sweden)

    Chris Fraley

    2007-01-01

    Full Text Available Due to recent advances in methods and software for model-based clustering, and to the interpretability of the results, clustering procedures based on probability models are increasingly preferred over heuristic methods. The clustering process estimates a model for the data that allows for overlapping clusters, producing a probabilistic clustering that quantifies the uncertainty of observations belonging to components of the mixture. The resulting clustering model can also be used for some other important problems in multivariate analysis, including density estimation and discriminant analysis. Examples of the use of model-based clustering and classification techniques in chemometric studies include multivariate image analysis, magnetic resonance imaging, microarray image segmentation, statistical process control, and food authenticity. We review model-based clustering and related methods for density estimation and discriminant analysis, and show how the R package mclust can be applied in each instance.

  8. Sparse coding based dense feature representation model for hyperspectral image classification

    Science.gov (United States)

    Oguslu, Ender; Zhou, Guoqing; Zheng, Zezhong; Iftekharuddin, Khan; Li, Jiang

    2015-11-01

    We present a sparse coding based dense feature representation model (a preliminary version of the paper was presented at the SPIE Remote Sensing Conference, Dresden, Germany, 2013) for hyperspectral image (HSI) classification. The proposed method learns a new representation for each pixel in HSI through the following four steps: sub-band construction, dictionary learning, encoding, and feature selection. The new representation usually has a very high dimensionality requiring a large amount of computational resources. We applied the l1/lq regularized multiclass logistic regression technique to reduce the size of the new representation. We integrated the method with a linear support vector machine (SVM) and a composite kernels SVM (CKSVM) to discriminate different types of land cover. We evaluated the proposed algorithm on three well-known HSI datasets and compared our method to four recently developed classification methods: SVM, CKSVM, simultaneous orthogonal matching pursuit, and image fusion and recursive filtering. Experimental results show that the proposed method can achieve better overall and average classification accuracies with a much more compact representation leading to more efficient sparse models for HSI classification.

  9. Semi-Automated Object-Based Classification of Coral Reef Habitat using Discrete Choice Models

    Directory of Open Access Journals (Sweden)

    Steven Saul

    2015-11-01

    Full Text Available As for terrestrial remote sensing, pixel-based classifiers have traditionally been used to map coral reef habitats. For pixel-based classifiers, habitat assignment is based on the spectral or textural properties of each individual pixel in the scene. More recently, however, object-based classifications, those based on information from a set of contiguous pixels with similar properties, have found favor with the reef mapping community and are starting to be extensively deployed. Object-based classifiers have an advantage over pixel-based in that they are less compromised by the inevitable inhomogeneity in per-pixel spectral response caused, primarily, by variations in water depth. One aspect of the object-based classification workflow is the assignment of each image object to a habitat class on the basis of its spectral, textural, or geometric properties. While a skilled image interpreter can achieve this task accurately through manual editing, full or partial automation is desirable for large-scale reef mapping projects of the magnitude which are useful for marine spatial planning. To this end, this paper trials the use of multinomial logistic discrete choice models to classify coral reef habitats identified through object-based segmentation of satellite imagery. Our results suggest that these models can attain assignment accuracies of about 85%, while also reducing the time needed to produce the map, as compared to manual methods. Limitations of this approach include misclassification of image objects at the interface between some habitat types due to the soft gradation in nature between habitats, the robustness of the segmentation algorithm used, and the selection of a strong training dataset. Finally, due to the probabilistic nature of multinomial logistic models, the analyst can estimate a map of uncertainty associated with the habitat classifications. Quantifying uncertainty is important to the end-user when developing marine spatial
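
    The multinomial logistic (discrete choice) assignment described above gives each image object a probability over habitat classes via a softmax of per-class linear utilities, which is also what lets the analyst map classification uncertainty. The coefficients and object features below are invented for illustration:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of utilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# assumed per-class coefficients over two object features
# (e.g. mean band reflectance and a texture statistic)
betas = {"coral": (2.0, -1.0), "sand": (-1.0, 2.0)}

def class_probs(x):
    """Softmax over per-class linear utilities beta . x."""
    utilities = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas.values()]
    return dict(zip(betas, softmax(utilities)))

def assign(x):
    probs = class_probs(x)
    return max(probs, key=probs.get)
```

    The full probability vector, not just the argmax, is what supports the per-object uncertainty map mentioned at the end of the abstract.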

  10. Unsupervised polarimetric SAR urban area classification based on model-based decomposition with cross scattering

    Science.gov (United States)

    Xiang, Deliang; Tang, Tao; Ban, Yifang; Su, Yi; Kuang, Gangyao

    2016-06-01

    Since it has been validated that cross-polarized scattering (HV) is caused not only by vegetation but also by rotated dihedrals, in this study, we use rotated dihedral corner reflectors to form a cross scattering matrix and propose an extended four-component model-based decomposition method for PolSAR data over urban areas. Unlike other urban area decomposition techniques which need to discriminate the urban and natural areas before decomposition, this proposed method is applied on PolSAR image directly. The building orientation angle is considered in this scattering matrix, making it flexible and adaptive in the decomposition. Therefore, we can separate cross scattering of urban areas from the overall HV component. Further, the cross and helix scattering components are also compared. Then, using these decomposed scattering powers, the buildings and natural areas can be easily discriminated from each other using a simple unsupervised K-means classifier. Moreover, buildings aligned and not aligned along the radar flight direction can be also distinguished clearly. Spaceborne RADARSAT-2 and airborne AIRSAR full polarimetric SAR data are used to validate the performance of our proposed method. The cross scattering power of oriented buildings is generated, leading to a better decomposition result for urban areas with respect to other state-of-the-art urban decomposition techniques. The decomposed scattering powers significantly improve the classification accuracy for urban areas.

  11. A technical study and analysis on fuzzy similarity based models for text classification

    CERN Document Server

    Puri, Shalini; 10.5121/ijdkp.2012.2201

    2012-01-01

    In the current era of technological advancement, efficient and effective text document classification is becoming a challenging and highly required area to capably categorize text documents into mutually exclusive categories. Fuzzy similarity provides a way to find the similarity of features among various documents. In this paper, a technical review of various fuzzy similarity based models is given. These models are discussed and compared to frame out their use and necessity. A tour of different methodologies is provided which is based upon fuzzy-similarity-related concerns. It shows how text and web documents are categorized efficiently into different categories. Various experimental results of these models are also discussed. The technical comparisons among each model's parameters are shown in the form of a 3-D chart. Such a study and technical review provide a strong base for research work done on fuzzy similarity based text document categorization.

  12. Generative embedding for model-based classification of fMRI data.

    Directory of Open Access Journals (Sweden)

    Kay H Brodersen

    2011-06-01

    Full Text Available Decoding models, such as those underlying multivariate classification algorithms, have been increasingly used to infer cognitive or clinical brain states from measures of brain activity obtained by functional magnetic resonance imaging (fMRI. The practicality of current classifiers, however, is restricted by two major challenges. First, due to the high data dimensionality and low sample size, algorithms struggle to separate informative from uninformative features, resulting in poor generalization performance. Second, popular discriminative methods such as support vector machines (SVMs rarely afford mechanistic interpretability. In this paper, we address these issues by proposing a novel generative-embedding approach that incorporates neurobiologically interpretable generative models into discriminative classifiers. Our approach extends previous work on trial-by-trial classification for electrophysiological recordings to subject-by-subject classification for fMRI and offers two key advantages over conventional methods: it may provide more accurate predictions by exploiting discriminative information encoded in 'hidden' physiological quantities such as synaptic connection strengths; and it affords mechanistic interpretability of clinical classifications. Here, we introduce generative embedding for fMRI using a combination of dynamic causal models (DCMs and SVMs. We propose a general procedure of DCM-based generative embedding for subject-wise classification, provide a concrete implementation, and suggest good-practice guidelines for unbiased application of generative embedding in the context of fMRI. We illustrate the utility of our approach by a clinical example in which we classify moderately aphasic patients and healthy controls using a DCM of thalamo-temporal regions during speech processing. Generative embedding achieves a near-perfect balanced classification accuracy of 98% and significantly outperforms conventional activation-based and

  13. Gene function classification using Bayesian models with hierarchy-based priors

    Directory of Open Access Journals (Sweden)

    Neal Radford M

    2006-10-01

    Full Text Available Abstract Background We investigate whether annotation of gene function can be improved using a classification scheme that is aware that functional classes are organized in a hierarchy. The classifiers look at phylogenetic descriptors, sequence-based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and an MNL model with a prior that introduces correlations between the parameters for classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. Results The results from all three models show substantial improvement over previous methods, which were based on the C5 decision tree algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining the three sources of information in this dataset, our new approach to combining data sources produces a higher accuracy rate than applying our models to each data source alone. Conclusion Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information.

  14. Integrated knowledge-based modeling and its application for classification problems

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Knowledge discovered directly from data is inevitably biased towards the collected experimental data, whereas expert systems are hampered by the manual knowledge-acquisition bottleneck. It is therefore plausible that integrating the knowledge embedded in data with that possessed by experts can lead to a superior modeling approach. Aiming at classification problems, a novel integrated knowledge-based modeling methodology, oriented by experts and driven by data, is proposed. It starts with experts identifying the modeling parameters; the input space is then partitioned and fuzzified. Afterwards, single rules are generated and aggregated to form a rule base, on which a fuzzy inference mechanism is proposed. The experts are allowed to make necessary changes to the rule base to improve the model accuracy. A real-world application, welding fault diagnosis, is presented to demonstrate the effectiveness of the methodology.

  15. Research on evaluating water resource resilience based on projection pursuit classification model

    Science.gov (United States)

    Liu, Dong; Zhao, Dan; Liang, Xu; Wu, Qiuchen

    2016-03-01

    Water is a fundamental natural resource, and agricultural water guarantees grain output, so the utilization and management of water resources have significant practical meaning. A regional agricultural water resource system features unpredictability, self-organization, and non-linearity, which makes evaluating the resilience of regional agricultural water resources difficult. Current research on water resource resilience remains focused on qualitative analysis, while quantitative analysis is still at a primary stage; to address these issues, a projection pursuit classification model is put forward. With the help of the artificial fish-swarm algorithm (AFSA), the model optimizes the projection index function and seeks the optimal projection direction, and AFSA is improved through a self-adaptive artificial fish step and crowding factor. Taking the Hongxinglong Administration of Heilongjiang as the research base and building on the improved AFSA, the projection pursuit classification model was established to evaluate the resilience of the agricultural water resource system, alongside an analysis of a projection pursuit classification model based on an accelerating genetic algorithm. The research shows that the water resource resilience of Hongxinglong is the best, followed by Raohe Farm, with 597 Farm last. Further analysis shows that the key driving factors influencing agricultural water resource resilience are precipitation and agricultural water consumption. The results reveal the restoration status of the local water resource system, providing a foundation for agricultural water resource management.

  16. SAR Images Statistical Modeling and Classification Based on the Mixture of Alpha-Stable Distributions

    Directory of Open Access Journals (Sweden)

    Fangling Pu

    2013-05-01

    Full Text Available This paper proposes the mixture of Alpha-stable (MAS) distributions for modeling the statistical properties of Synthetic Aperture Radar (SAR) images in a supervised Markovian classification algorithm. Our work is motivated by the fact that natural scenes consist of various reflectors of different types that are typically concentrated within a small area, and SAR images generally exhibit sharp peaks, heavy tails, and even multimodal statistical properties, especially at high resolution. Unimodal distributions do not fit such statistical properties well, and thus a multimodal approach is necessary. Driven by the multimodality and impulsiveness of high-resolution SAR image histograms, we utilize the mixture of Alpha-stable distributions to describe such characteristics. A pseudo-simulated annealing (PSA) estimator based on Markov chain Monte Carlo (MCMC) is presented to efficiently estimate the model parameters of the mixture of Alpha-stable distributions. To validate the proposed PSA estimator, we apply it to simulated data and compare its performance to that of a state-of-the-art estimator. Finally, we exploit the MAS distributions and a Markovian context for SAR image classification. The effectiveness of the proposed classifier is demonstrated by experiments on TerraSAR-X images, which verifies the validity of the MAS distributions for modeling and classification of SAR images.

  17. Chinese Short-Text Classification Based on Topic Model with High-Frequency Feature Expansion

    Directory of Open Access Journals (Sweden)

    Hu Y. Jun

    2013-08-01

    Full Text Available Short text differs from traditional documents in its shortness and sparseness. Feature extension can ease the problem of high sparseness in the vector space model, but it inevitably introduces noise. To resolve this problem, this paper proposes a high-frequency feature expansion method based on a latent Dirichlet allocation (LDA) topic model. High-frequency features are extracted from each category to form the feature space, LDA is used to derive latent topics from the corpus, and topic words are appended to the short text. Extensive experiments are conducted on Chinese short messages and news titles. The proposed method for classifying Chinese short texts outperforms conventional classification methods.
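    The expansion step described above can be illustrated with a toy sketch in plain Python, where a hand-made topic-to-word table stands in for a trained LDA model; all names and data here are illustrative, not from the paper:

    ```python
    def expand_short_text(tokens, topic_words, top_k=2):
        """Append words from the best-matching latent topic to a short text.

        topic_words maps topic id -> list of high-probability words (in the
        paper these would come from an LDA model trained on the corpus;
        here it is a hand-made stand-in).
        """
        # Score each topic by its word overlap with the short text.
        scores = {t: len(set(tokens) & set(words)) for t, words in topic_words.items()}
        best = max(scores, key=scores.get)
        if scores[best] == 0:
            return list(tokens)  # no topic matches; leave the text unchanged
        # Extend with the topic's top words that the text does not already contain.
        extra = [w for w in topic_words[best] if w not in tokens][:top_k]
        return list(tokens) + extra

    topics = {
        0: ["stock", "market", "shares", "trading"],
        1: ["match", "team", "goal", "league"],
    }
    print(expand_short_text(["market", "rally"], topics))
    # -> ['market', 'rally', 'stock', 'shares']
    ```

    The sparse two-token text gains two topic words, giving a vector space model more terms to match against.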

  18. A FEASIBILITY STUDY ON USING PHYSICS-BASED MODELER OUTPUTS TO TRAIN PROBABILISTIC NEURAL NETWORKS FOR UXO CLASSIFICATION

    Science.gov (United States)

    A probabilistic neural network (PNN) has been applied to the detection and classification of unexploded ordnance (UXO) measured using magnetometry data collected using the Multi-sensor Towed Array Detection System (MTADS). Physical parameters obtained from a physics based modeler...

  19. A Quaternary-Stage User Interest Model Based on User Browsing Behavior and Web Page Classification

    Institute of Scientific and Technical Information of China (English)

    Zongli Jiang; Hang Su

    2012-01-01

    The key to a personalized search engine lies in its user model. A traditional personalized model biases the results of secondary searches towards long-term interests; moreover, forgetting of long-term interests prevents effective recollection of user interests. This paper presents a quaternary-stage user interest model based on user browsing behavior and web page classification, which borrows the principles of the cache and the recycle bin in operating systems: an illuminating text stage and a recycle-bin interest stage are set up in front of and behind the traditional interest model, respectively, to constitute the quaternary-stage user interest model. The model better reflects user interests by using an adaptive natural weight and its calculation method, and by efficiently integrating user browsing behavior and web document content.

  20. SAR Imagery Simulation of Ship Based on Electromagnetic Calculations and Sea Clutter Modelling for Classification Applications

    International Nuclear Information System (INIS)

    Ship detection and classification with space-borne SAR have many potential applications in maritime surveillance, fishery activity management, ship traffic monitoring, and military security. While ship detection techniques with SAR imagery are well established, ship classification is still an open issue. One of the main reasons may be ascribed to the difficulty of acquiring the required quantities of real data of vessels under different observation and environmental conditions with precise ground truth. Therefore, simulation of SAR images with high scenario flexibility and reasonable computation costs is essential for the development of ship classification algorithms. However, the simulation of SAR imagery of a ship over the sea surface is challenging. Though great efforts have been devoted to tackling this difficult problem, it is far from being conquered. This paper proposes a novel scheme for SAR imagery simulation of a ship over the sea surface. The simulation is implemented with the high-frequency electromagnetic calculation methods of PO, MEC, PTD and GO. SAR imagery of sea clutter is modelled by the representative K-distribution clutter model. The simulated SAR imagery of the ship is then produced by inserting simulated SAR imagery chips of the ship into the SAR imagery of sea clutter. The proposed scheme has been validated with canonical and complex ship targets over a typical sea scene.

  1. Dynamic Latent Classification Model

    DEFF Research Database (Denmark)

    Zhong, Shengtong; Martínez, Ana M.; Nielsen, Thomas Dyhre

    possible. Motivated by this problem setting, we propose a generative model for dynamic classification in continuous domains. At each time point the model can be seen as combining a naive Bayes model with a mixture of factor analyzers (FA). The latent variables of the FA are used to capture the dynamics in...

  2. Stability classification model of mine-lane surrounding rock based on distance discriminant analysis method

    Institute of Scientific and Technical Information of China (English)

    ZHANG Wei; LI Xi-bing; GONG Feng-qiang

    2008-01-01

    Based on the principles of Mahalanobis distance discriminant analysis (DDA) theory, a stability classification model for mine-lane surrounding rock was established, including six discriminant-factor indexes that reflect the engineering quality of the surrounding rock: lane depth below surface, span of lane, ratio of the immediate roof thickness to coal thickness, uniaxial compressive strength of the surrounding rock, development degree coefficient of surrounding rock joints, and range of the broken surrounding rock zone. A DDA model was obtained by training on 15 practically measured samples. The re-substitution method was introduced to verify the stability of the DDA model, and its misdiscrimination ratio is zero. The DDA model was used to discriminate 3 new samples and the results are identical to the actual rock categories. Compared with the artificial neural network method and the support vector machine method, the results show that this model has high prediction accuracy and can be used in practical engineering.
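    The core of such a distance discriminant model can be sketched in plain Python. This is a minimal sketch under stated assumptions: two indexes instead of the paper's six, toy training samples, and a pooled within-class covariance with a closed-form 2x2 inverse. A new sample is assigned to the class with the smallest Mahalanobis distance:

    ```python
    def mean(rows):
        n = len(rows)
        return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

    def pooled_cov(groups):
        # Pooled within-class covariance, as in linear discriminant analysis.
        d = len(groups[0][0])
        s = [[0.0] * d for _ in range(d)]
        n_total = sum(len(g) for g in groups)
        for g in groups:
            m = mean(g)
            for r in g:
                for i in range(d):
                    for j in range(d):
                        s[i][j] += (r[i] - m[i]) * (r[j] - m[j])
        k = len(groups)
        return [[s[i][j] / (n_total - k) for j in range(d)] for i in range(d)]

    def mahalanobis2(x, m, cov):
        # Squared Mahalanobis distance for d=2 via the closed-form 2x2 inverse.
        (a, b), (c, d_) = cov
        det = a * d_ - b * c
        inv = [[d_ / det, -b / det], [-c / det, a / det]]
        diff = [x[0] - m[0], x[1] - m[1]]
        return sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))

    def classify(x, groups):
        means = [mean(g) for g in groups]
        cov = pooled_cov(groups)
        dists = [mahalanobis2(x, m, cov) for m in means]
        return dists.index(min(dists))

    # Toy samples for two stability classes (values are illustrative only).
    stable   = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.8]]
    unstable = [[3.0, 0.5], [3.2, 0.7], [2.8, 0.4]]
    print(classify([1.1, 1.9], [stable, unstable]))  # -> 0 (stable)
    ```

    The re-substitution check mentioned in the abstract amounts to running `classify` on each training sample and counting disagreements with its known class.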

  3. A physiologically-inspired model of numerical classification based on graded stimulus coding

    Directory of Open Access Journals (Sweden)

    John Pearson

    2010-01-01

    Full Text Available In most natural decision contexts, the process of selecting among competing actions takes place in the presence of informative, but potentially ambiguous, stimuli. Decisions about magnitudes (quantities like time, length, and brightness that are linearly ordered) constitute an important subclass of such decisions. It has long been known that perceptual judgments about such quantities obey Weber's Law, wherein the just-noticeable difference in a magnitude is proportional to the magnitude itself. Current physiologically inspired models of numerical classification assume discriminations are made via a labeled line code of neurons selectively tuned for numerosity, a pattern observed in the firing rates of neurons in the ventral intraparietal area (VIP) of the macaque. By contrast, neurons in the contiguous lateral intraparietal area (LIP) signal numerosity in a graded fashion, suggesting the possibility that numerical classification could be achieved in the absence of neurons tuned for number. Here, we consider the performance of a decision model based on this analog coding scheme in a paradigmatic discrimination task: numerosity bisection. We demonstrate that a basic two-neuron classifier model, derived from experimentally measured monotonic responses of LIP neurons, is sufficient to reproduce the numerosity bisection behavior of monkeys, and that the threshold of the classifier can be set by reward maximization via a simple learning rule. In addition, our model predicts deviations from Weber's Law scaling of choice behavior at high numerosity. Together, these results suggest both a generic neuronal framework for magnitude-based decisions and a role for reward contingency in the classification of such stimuli.
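    The flavor of a two-neuron graded-code classifier can be conveyed with a toy sketch. The response functions, the rate ceiling of 16, and the threshold below are purely illustrative assumptions, not fitted LIP data; the point is only that two monotonic units plus a threshold suffice for bisection:

    ```python
    import math

    def classify_numerosity(n, threshold=0.0):
        """Bisection decision read out from two graded (monotonic) units.

        One unit's rate grows with numerosity, the other's shrinks; the
        choice compares their difference to a learned threshold. The log
        compression yields Weber-like scaling. (Illustrative, not fitted.)
        """
        r_up = math.log(n)            # monotonically increasing unit
        r_down = math.log(16.0 / n)   # monotonically decreasing unit
        return "large" if (r_up - r_down) > threshold else "small"

    # Bisection between anchors 2 and 8: the indifference point sits at their
    # geometric mean, 4, which corresponds to a threshold of 0 here.
    print([classify_numerosity(n) for n in (2, 3, 5, 8)])
    # -> ['small', 'small', 'large', 'large']
    ```

    In the paper the threshold is tuned by a reward-maximizing learning rule rather than set analytically as it is here.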

  4. Unsupervised amplitude and texture based classification of SAR images with multinomial latent model

    OpenAIRE

    Kayabol, Koray; Zerubia, Josiane

    2012-01-01

    We combine both amplitude and texture statistics of Synthetic Aperture Radar (SAR) images for classification purposes. We use the Nakagami density to model the class amplitudes and a non-Gaussian Markov Random Field (MRF) texture model with t-distributed regression error to model the textures of the classes. A non-stationary Multinomial Logistic (MnL) latent class label model is used as a mixture density to obtain spatially smooth class segments. The Classification Expectation-Maximization (CE...

  5. Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models.

    Science.gov (United States)

    Fernandez-Lozano, Carlos; Cuiñas, Rubén F; Seoane, José A; Fernández-Blanco, Enrique; Dorado, Julian; Munteanu, Cristian R

    2015-11-01

    Signaling proteins are an important topic in drug development due to the increased importance of finding fast, accurate and cheap methods to evaluate new molecular targets involved in specific diseases. The complexity of the protein structure hinders the direct association of signaling activity with molecular structure. Therefore, the proposed solution involves the use of protein star graphs to encode the peptide sequence information into specific topological indices calculated with the S2SNet tool. The Quantitative Structure-Activity Relationship classification model obtained with Machine Learning techniques is able to predict new signaling peptides. The best classification model, which is the first signaling prediction model of its kind, is based on eleven descriptors and was obtained using the Support Vector Machines-Recursive Feature Elimination (SVM-RFE) technique with the Laplacian kernel (RFE-LAP), achieving an AUROC of 0.961. The prediction performance of the model was assessed by testing a set of 3114 proteins of unknown function from the PDB database. Important signaling pathways are presented for three UniprotIDs (34 PDBs) with a signaling prediction greater than 98.0%. PMID:26297890
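    The recursive feature elimination loop at the heart of SVM-RFE can be sketched in plain Python. Note the stand-ins: SVM-RFE ranks features by the magnitude of the linear model's weights, and the paper used a Laplacian-kernel SVM; here a simple perceptron supplies the linear weights, and the data are toy values, so this is a sketch of the elimination schedule rather than the paper's classifier:

    ```python
    def train_perceptron(X, y, epochs=50, lr=0.1):
        # Simple linear stand-in for the SVM used in the paper.
        w = [0.0] * len(X[0])
        b = 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):  # yi in {-1, +1}
                score = sum(wj * xj for wj, xj in zip(w, xi)) + b
                if yi * score <= 0:
                    w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                    b += lr * yi
        return w

    def rfe(X, y, n_keep):
        """Recursive feature elimination: repeatedly retrain and drop the
        feature whose linear weight has the smallest magnitude."""
        active = list(range(len(X[0])))
        while len(active) > n_keep:
            Xa = [[row[j] for j in active] for row in X]
            w = train_perceptron(Xa, y)
            worst = min(range(len(active)), key=lambda j: abs(w[j]))
            active.pop(worst)
        return sorted(active)

    # Toy data: only feature 0 separates the classes; features 1-2 are constant.
    X = [[2.0, 1.0, 1.0], [1.5, 1.0, 1.0], [-2.0, 1.0, 1.0], [-1.8, 1.0, 1.0]]
    y = [1, 1, -1, -1]
    print(rfe(X, y, 1))  # -> [0]
    ```

    The paper's eleven retained descriptors correspond to stopping the loop at `n_keep=11` over the full topological-index set.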

  6. Hidden semi-Markov Model based earthquake classification system using Weighted Finite-State Transducers

    Directory of Open Access Journals (Sweden)

    M. Beyreuther

    2011-02-01

    Full Text Available Automatic earthquake detection and classification is required for efficient analysis of large seismic datasets. Such techniques are particularly important now because access to measures of ground motion is nearly unlimited and the target waveforms (earthquakes) are often hard to detect and classify. Here, we propose to use models from speech synthesis which extend the double stochastic models from speech recognition by integrating a more realistic duration of the target waveforms. The method, which has general applicability, is applied to earthquake detection and classification. First, we generate characteristic functions from the time-series. The Hidden semi-Markov Models are estimated from the characteristic functions and Weighted Finite-State Transducers are constructed for the classification. We test our scheme on one month of continuous seismic data, which corresponds to 370 151 classifications, showing that incorporating the time dependency explicitly in the models significantly improves the results compared to Hidden Markov Models.

  7. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models

    Directory of Open Access Journals (Sweden)

    Jin Dai

    2014-01-01

    Full Text Available The similarity between objects is a core research area of data mining. In order to reduce the interference of the uncertainty of natural language, a similarity measurement between normal cloud models is applied to text classification research. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed. It can efficiently accomplish the conversion between qualitative concepts and quantitative data. Through the conversion from a text set to a text information table based on the VSM model, the qualitative text concepts extracted from the same category are jumped up into a whole-category concept. According to the cloud similarity between the test text and each category concept, the test text is assigned to the most similar category. Comparisons among different text classifiers over different feature selection sets fully prove that not only does CCJU-TC have a strong ability to adapt to different text features, but its classification performance is also better than that of traditional classifiers.

  8. Site effect classification based on microtremor data analysis using concentration–area fractal model

    Directory of Open Access Journals (Sweden)

    A. Adib

    2014-07-01

    Full Text Available The aim of this study is to classify the site effect using the concentration–area (C–A) fractal model in Meybod city, Central Iran, based on microtremor data analysis. Log–log plots of the frequency, amplification, and vulnerability index (k-g) indicate a multifractal nature for the parameters in the area. The results obtained from the C–A fractal modeling reveal that proper soil types are located around the central city. The results derived via the fractal modeling were utilized to improve the Nogoshi's classification results in Meybod city. The resulting categories are: (1) hard soil and weak rock with frequencies of 6.2 to 8 Hz, (2) stiff soil with frequencies of about 4.9 to 6.2 Hz, (3) moderately soft soil with frequencies of 2.4 to 4.9 Hz, and (4) soft soil with frequencies lower than 2.4 Hz.

  9. [Eco-value level classification model of forest ecosystem based on modified projection pursuit technique].

    Science.gov (United States)

    Wu, Chengzhen; Hong, Wei; Hong, Tao

    2006-03-01

    To optimize the projection function and direction of the projection pursuit technique, simplify its implementation, and overcome its drawbacks of long computation time and the difficulty of optimizing the projection direction and programming it, this paper presents a modified simplex method (MSM). Based on MSM, it puts forward the eco-value level classification model (EVLCM) of forest ecosystems, which integrates the multidimensional classification indices into a one-dimensional projection value, with high projection values denoting high ecosystem-service value. Examples of forest ecosystems were reasonably classified by the new model according to their projection values, suggesting that EVLCM, driven directly by sample data of forest ecosystems, is simple, feasible, and practical. The calculation time and the value of the projection function were 34% and 143%, respectively, of those obtained with the traditional projection pursuit technique. This model can be applied widely to classify and estimate all kinds of non-linear, multidimensional data in ecology, biology, and regional sustainable development. PMID:16724723

  10. Research of the Classification Model Based on Dominance Rough Set Approach for China Emergency Communication

    Directory of Open Access Journals (Sweden)

    Fan Zifu

    2015-01-01

    Full Text Available Ensuring smooth communication and recovering damaged communication systems quickly and efficiently are key to emergency response, command, control, and rescue throughout an incident. Classifying the emergency communication level is the premise of any emergency communication guarantee, so we use the dominance-based rough set approach (DRSA) to construct a classification model for judging emergency communication in this paper. In this model, we first propose a classification index system for emergency communication using expert interviews, and then use DRSA to complete the data samples, reduce attributes, and extract the preference decision rules for emergency communication classification. Finally, the recognition accuracy of the model is verified; the test results prove that the model proposed in this paper is valid.

  11. AGE CLASSIFICATIONS BASED ON SECOND ORDER IMAGE COMPRESSED AND FUZZY REDUCED GREY LEVEL (SICFRG MODEL

    Directory of Open Access Journals (Sweden)

    Jangala. Sasi Kiran

    2013-06-01

    Full Text Available One of the most fundamental issues in image classification and recognition is how to characterize images using derived features. Many texture classification and recognition problems in the literature require computation on the entire image set over a large range of grey-level values to achieve efficient and precise classification and recognition, which makes evaluating the feature parameters highly complex. To address this, the present paper derives a Second-order Image Compressed and Fuzzy Reduced Grey-level (SICFRG) model, which reduces the image dimension and grey-level range without any loss of significant feature information. The present paper derives GLCM features on the proposed SICFRG model for efficient age classification, which assigns facial images to one of five groups. The SICFRG model of age classification is derived in three stages. In the first stage, each 5 x 5 matrix is compressed into a 2 x 2 second-order sub-matrix without losing any significant attributes, primitives, or other local properties. In stage 2, fuzzy logic is applied to reduce the grey-level range of the compressed image. In stage 3, the GLCM is derived on the SICFRG model of the image. The experimental evidence on the FG-NET and Google aging databases clearly indicates the high classification rate of the proposed method over other methods.
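    The stage-3 GLCM step can be sketched in plain Python on a toy image that is assumed to have already passed the compression and fuzzy grey-level reduction stages; the offset, number of levels, and data are illustrative:

    ```python
    def glcm(img, dx=1, dy=0, levels=4):
        """Grey-level co-occurrence matrix for one offset, normalised to sum to 1."""
        counts = [[0] * levels for _ in range(levels)]
        h, w = len(img), len(img[0])
        for y in range(h):
            for x in range(w):
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < h and 0 <= x2 < w:
                    counts[img[y][x]][img[y2][x2]] += 1
        total = sum(map(sum, counts))
        return [[c / total for c in row] for row in counts]

    def contrast(p):
        # Haralick contrast: weights co-occurrences by squared level difference.
        n = len(p)
        return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

    def energy(p):
        # Haralick energy (angular second moment): sum of squared entries.
        return sum(v * v for row in p for v in row)

    # A 4x4 image already reduced to 4 grey levels (the SICFRG compression
    # and fuzzy reduction stages are assumed to have produced this).
    img = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [2, 2, 3, 3],
           [2, 2, 3, 3]]
    p = glcm(img)
    print(round(contrast(p), 3), round(energy(p), 3))  # -> 0.333 0.167
    ```

    Reducing the grey-level range as the paper describes shrinks the `levels x levels` matrix, which is exactly why the SICFRG reduction keeps the feature computation cheap.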

  12. A speaker classification framework for non-intrusive user modeling : speech-based personalization of in-car services

    OpenAIRE

    Feld, Michael

    2011-01-01

    Speaker Classification, i.e. the automatic detection of certain characteristics of a person based on his or her voice, has a variety of applications in modern computer technology and artificial intelligence: As a non-intrusive source for user modeling, it can be employed for personalization of human-machine interfaces in numerous domains. This dissertation presents a principled approach to the design of a novel Speaker Classification system for automatic age and gender recognition which meets...

  13. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Science.gov (United States)

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933

  14. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Full Text Available Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  15. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    OpenAIRE

    C. Fernandez-Lozano; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  16. Model-based Clustering of Categorical Time Series with Multinomial Logit Classification

    Science.gov (United States)

    Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea

    2010-09-01

    A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership, we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using a Markov chain Monte Carlo (MCMC) sampling scheme, a (full) Gibbs-type sampler involving only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.
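    The core of Markov chain clustering can be sketched in plain Python: estimate a first-order transition matrix per cluster, then assign each sequence to the chain under which it is most likely. This toy sketch (add-alpha smoothing, hand-made "stayer"/"mover" sequences over two states) stands in for the full Bayesian MCMC treatment described above; all data and names are illustrative:

    ```python
    import math

    def transition_matrix(seqs, n_states, alpha=1.0):
        """Estimate a first-order transition matrix from sequences, with
        add-alpha smoothing so unseen transitions keep nonzero probability."""
        counts = [[alpha] * n_states for _ in range(n_states)]
        for s in seqs:
            for a, b in zip(s, s[1:]):
                counts[a][b] += 1
        return [[c / sum(row) for c in row] for row in counts]

    def log_lik(seq, P):
        # Log-likelihood of a sequence under transition matrix P.
        return sum(math.log(P[a][b]) for a, b in zip(seq, seq[1:]))

    def assign(seq, models):
        """Model-based assignment: pick the cluster whose Markov chain
        gives the sequence the highest likelihood."""
        return max(range(len(models)), key=lambda k: log_lik(seq, models[k]))

    # Two toy mobility patterns over states {0, 1}: "stayers" vs "movers".
    stayers = [[0, 0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]]
    movers  = [[0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1]]
    models = [transition_matrix(stayers, 2), transition_matrix(movers, 2)]
    print(assign([0, 0, 0, 1, 1], models), assign([1, 0, 1, 0], models))  # -> 0 1
    ```

    The Bayesian version replaces these point estimates with posterior draws and adds the multinomial logit model over the latent group indicators.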

  17. Classification of thermal waters based on their inorganic fingerprint and hydrogeothermal modelling

    Directory of Open Access Journals (Sweden)

    I. Delgado-Outeiriño

    2011-05-01

    Full Text Available Hydrothermal features in Galicia have been used since ancient times for therapeutic purposes. A characterization of these thermal waters was carried out in order to understand their behaviour based on their inorganic pattern and water-rock interaction mechanisms. To this end, 15 thermal water samples were collected in the same hydrographical system. The results of the hydrogeochemical analysis showed one main water family of sodium bicarbonate-type waters, typical of the post-orogenic basins of Galicia. Principal component analysis (PCA) and partial least squares (PLS) clustered the selected thermal waters into two groups according to their chemical composition. This classification agreed with the results obtained by the use of geothermometers and the hydrogeochemical modelling. The first group included thermal samples that could be in contact with surface waters; therefore, their residence time in the reservoir and their water-rock interaction would be less important than for the thermal waters of the second group.

  18. Mel-frequencies Stochastic Model for Gender Classification based on Pitch and Formant

    Directory of Open Access Journals (Sweden)

    Syifaun Nafisah

    2016-02-01

    Full Text Available Speech recognition applications are becoming more and more useful nowadays. Before this technology is applied, the first step is to test the system to measure its reliability. Reliability can be measured by the accuracy with which the system recognizes speaker characteristics such as identity or gender. This paper introduces a stochastic model based on mel-frequencies to identify the gender of a speaker in a noisy environment. The Euclidean minimum distance and back-propagation neural networks were used to create a model that recognizes gender from the speech signal based on the formants and pitch of the mel-frequencies. The system uses a threshold technique as the identification tool. Using this threshold value, the proposed method identifies the gender of the speaker with up to 94.11% accuracy, and the average processing duration is 15.47 ms. The implementation results show the good performance of the proposed technique for gender classification based on speech signals in a noisy environment.

  19. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim at using a new texton-based texture classification method for classifying pulmonary emphysema in computed tomography (CT) images of the lungs. Unlike conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the texton dictionary is first learned by applying sparse representation (SR) to image patches in the training dataset. The SR coefficients of the test images over the dictionary are then used to construct histograms as texture representations. Finally, classification is performed using a nearest-neighbour classifier with a histogram dissimilarity measure as the distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is comparably higher than that of the state-of-the-art method based on basic rotation-invariant local binary pattern histograms and the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.

  20. Climatic Classification over Asia during the Middle Holocene Climatic Optimum Based on PMIP Models

    Institute of Scientific and Technical Information of China (English)

    Hyuntaik Oh; Ho-Jeong Shin

    2016-01-01

    ABSTRACT: When considering potential global warming projections, it is useful to understand the impact of each climate condition at 6 kyr before present. The Asian paleoclimate was simulated by performing an integration of the multi-model ensemble with the paleoclimate modeling intercomparison project (PMIP) models. The reconstructed winter (summer) surface air temperature at 6 kyr before present was 0.85 °C (0.21 °C) lower (higher) than the present day over Asia, 60°E–150°E, 10°N–60°N. The seasonal variation and the heating differences of land and ocean in summer at 6 kyr before present might be much larger than at present day. The winter and summer precipitation at 6 kyr before present were 0.067 and 0.017 mm·day⁻¹ larger than present day, respectively. The Group B climate, the dry climates of the Köppen climate classification, at 6 kyr before present decreased 17% compared to present day, while Group D, the continental and microthermal climates, increased by over 7%. The model simulation agrees with published paleo-proxy records within the limits of the sparse available data.

  1. Novel classification method for remote sensing images based on information entropy discretization algorithm and vector space model

    Science.gov (United States)

    Xie, Li; Li, Guangyao; Xiao, Mang; Peng, Lei

    2016-04-01

    Various kinds of remote sensing image classification algorithms have been developed to adapt to the rapid growth of remote sensing data. Conventional methods typically have restrictions in either classification accuracy or computational efficiency. To overcome these difficulties, a new solution for remote sensing image classification is presented in this study. A discretization algorithm based on information entropy is applied to extract features from the data set, and a vector space model (VSM) is employed as the feature representation algorithm. Because of the simple structure of the feature space, the training rate is accelerated. The performance of the proposed method is compared with that of two other algorithms: the back propagation neural network (BPNN) method and the ant colony optimization (ACO) method. Experimental results confirm that the proposed method is superior to the other algorithms in terms of both classification accuracy and computational efficiency.
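
    The information-entropy discretization step can be illustrated on a single feature: choose the cut point that minimizes the weighted class entropy of the two resulting intervals. The values and labels below are invented for illustration:

    ```python
    import math

    def entropy(labels):
        """Shannon entropy of a list of class labels."""
        n = len(labels)
        if n == 0:
            return 0.0
        return -sum((c / n) * math.log2(c / n)
                    for c in (labels.count(l) for l in set(labels)))

    def best_split(values, labels):
        """Cut point minimizing the weighted class entropy, as in
        entropy-based discretization."""
        pairs = sorted(zip(values, labels))
        best = (float("inf"), None)
        for i in range(1, len(pairs)):
            left = [l for _, l in pairs[:i]]
            right = [l for _, l in pairs[i:]]
            w = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = min(best, (w, cut))
        return best[1]

    # Illustrative band reflectances and land-cover labels
    vals = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
    labs = ["water", "water", "water", "urban", "urban", "urban"]
    print(best_split(vals, labs))  # -> 0.55
    ```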

  2. Discrimination-based Artificial Immune System: Modeling the Learning Mechanism of Self and Non-self Discrimination for Classification

    Directory of Open Access Journals (Sweden)

    Kazushi Igawa

    2007-01-01

    Full Text Available This study presents a new artificial immune system for classification, named the discrimination-based artificial immune system (DAIS), based on the principle of self and non-self discrimination by T cells in the human immune system. The ability of the natural immune system to distinguish between self and non-self molecules is applicable to classification, in that one class is distinguished from the others. We model this, together with the mechanism of education in the thymus, for classification. In particular, we introduce a method to decide the recognition distance threshold of an artificial lymphocyte, as the negative selection algorithm. We apply DAIS to real-world datasets and show its performance to be comparable to that of other classifier systems. We conclude that this modeling is appropriate and that DAIS is a useful classifier.
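
    The negative selection idea that DAIS builds on can be sketched as follows; the recognition radius and the self samples are hypothetical, whereas DAIS itself derives the threshold rather than fixing it:

    ```python
    import math
    import random

    def train_detectors(self_set, n_detectors, radius, dim, seed=0):
        """Negative selection: keep only random detectors that do NOT match
        (lie within `radius` of) any self sample."""
        rng = random.Random(seed)
        detectors = []
        while len(detectors) < n_detectors:
            cand = [rng.random() for _ in range(dim)]
            if all(math.dist(cand, s) > radius for s in self_set):
                detectors.append(cand)
        return detectors

    def is_nonself(x, detectors, radius):
        """An item is classified non-self if any detector matches it."""
        return any(math.dist(x, d) <= radius for d in detectors)

    self_set = [[0.1, 0.1], [0.15, 0.2], [0.2, 0.1]]  # "self" training points
    dets = train_detectors(self_set, n_detectors=200, radius=0.15, dim=2)
    print(is_nonself([0.9, 0.9], dets, 0.15))   # far from all self samples
    print(is_nonself([0.1, 0.1], dets, 0.15))   # -> False (a self sample)
    ```

    By construction no surviving detector can match a point of the self set, so self samples are never flagged as non-self.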

  3. In Vivo Mouse Intervertebral Disc Degeneration Model Based on a New Histological Classification

    Science.gov (United States)

    Ohnishi, Takashi; Sudo, Hideki; Iwasaki, Koji; Tsujimoto, Takeru; Ito, Yoichi M.; Iwasaki, Norimasa

    2016-01-01

    Although human intervertebral disc degeneration can lead to several spinal diseases, its pathogenesis remains unclear. This study aimed to create a new histological classification applicable to an in vivo mouse intervertebral disc degeneration model induced by needle puncture. One hundred and six mice were operated on, and the L4/5 intervertebral disc was punctured with a 35- or 33-gauge needle. Micro-computed tomography scanning was performed, and the punctured region was confirmed. Evaluation was performed using magnetic resonance imaging and histology, employing our classification scoring system. Our histological classification scores correlated well with the findings of magnetic resonance imaging and could detect degenerative progression, irrespective of the punctured region. However, the magnetic resonance imaging analysis revealed that there was no significant degenerative intervertebral disc change between the ventrally punctured and non-punctured control groups. To induce significant degeneration in the lumbar intervertebral discs, the central or dorsal region should be punctured instead of the ventral region. PMID:27482708

  4. Interpretable exemplar-based shape classification using constrained sparse linear models

    Science.gov (United States)

    Sigurdsson, Gunnar A.; Yang, Zhen; Tran, Trac D.; Prince, Jerry L.

    2015-03-01

    Many types of diseases manifest themselves as observable changes in the shape of the affected organs. Using shape classification, we can look for signs of disease and discover relationships between diseases. We formulate the problem of shape classification in a holistic framework that utilizes a lossless scalar field representation and a non-parametric classification based on sparse recovery. This framework generalizes over certain classes of unseen shapes while using the full information of the shape, bypassing feature extraction. The output of the method is the class whose combination of exemplars most closely approximates the shape, and furthermore, the algorithm returns the most similar exemplars along with their similarity to the shape, which makes the result simple to interpret. Our results show that the method offers accurate classification between three cerebellar diseases and controls in a database of cerebellar ataxia patients. For reproducible comparison, promising results are presented on publicly available 2D datasets, including the ETH-80 dataset where the method achieves 88.4% classification accuracy.

  5. Hidden semi-Markov Model based earthquake classification system using Weighted Finite-State Transducers

    OpenAIRE

    M. Beyreuther; Wassermann, J.

    2011-01-01

    Automatic earthquake detection and classification is required for efficient analysis of large seismic datasets. Such techniques are particularly important now because access to measures of ground motion is nearly unlimited and the target waveforms (earthquakes) are often hard to detect and classify. Here, we propose to use models from speech synthesis which extend the double stochastic models from speech recognition by integrating a more realistic duration of the target waveforms. The method,...

  6. BClass: A Bayesian Approach Based on Mixture Models for Clustering and Classification of Heterogeneous Biological Data

    Directory of Open Access Journals (Sweden)

    Arturo Medrano-Soto

    2004-12-01

    Full Text Available Based on mixture models, we present a Bayesian method (called BClass) to classify biological entities (e.g. genes) when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability that each entry belongs to each element (group) in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities of group membership for each entry. The number of groups in the analysis is controlled dynamically by rendering groups 'alive' or 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked) plots and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.

  7. Modeling Wood Fibre Length in Black Spruce (Picea mariana (Mill.) BSP) Based on Ecological Land Classification

    OpenAIRE

    Elisha Townshend; Bharat Pokharel; Art Groot; Doug Pitt; DECH, JEFFERY P.

    2015-01-01

    Effective planning to optimize the forest value chain requires accurate and detailed information about the resource; however, estimates of the distribution of fibre properties on the landscape are largely unavailable prior to harvest. Our objective was to fit a model of the tree-level average fibre length related to ecosite classification and other forest inventory variables depicted at the landscape scale. A series of black spruce increment cores were collected at breast height from trees in...

  8. Avoiding overfit by restricted model search in tree-based EEG classification

    Czech Academy of Sciences Publication Activity Database

    Klaschka, Jan

    The Hague: International Statistical Institute, 2012, s. 5077-5082. ISBN 978-90-73592-33-9. [ISI 2011. Session of the International Statistical Institute /58./. Dublin (IE), 21.08.2011-26.08.2011] R&D Projects: GA MŠk ME 949 Institutional research plan: CEZ:AV0Z10300504 Keywords : model search * electroencephalography * classification trees and forests * random forests Subject RIV: BB - Applied Statistics, Operational Research http://2011.isiproceedings.org/papers/950644.pdf

  9. Pulmonary emphysema classification based on an improved texton learning model by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2013-03-01

    In this paper, we present a texture classification method based on textons learned via sparse representation (SR), with new feature histogram maps, for the classification of emphysema. First, an overcomplete dictionary of textons is learned via K-SVD on image patches of every class in the training dataset. In this stage, a high-pass filter is introduced to exclude patches in smooth areas and speed up the dictionary learning process. Second, 3D joint-SR coefficients and intensity histograms of the test images are used to characterize regions of interest (ROIs), instead of the conventional feature histograms constructed from SR coefficients of the test images over the dictionary. Classification is then performed using a classifier with a histogram dissimilarity measure as distance. Four hundred and seventy annotated ROIs extracted from 14 test subjects, including 6 paraseptal emphysema (PSE) subjects, 5 centrilobular emphysema (CLE) subjects and 3 panlobular emphysema (PLE) subjects, are used to evaluate the effectiveness and robustness of the proposed method. The proposed method is tested on 167 PSE, 240 CLE and 63 PLE ROIs consisting of mild, moderate and severe pulmonary emphysema. The accuracy of the proposed system is around 74%, 88% and 89% for PSE, CLE and PLE, respectively.

  10. Modeling Wood Fibre Length in Black Spruce (Picea mariana (Mill. BSP Based on Ecological Land Classification

    Directory of Open Access Journals (Sweden)

    Elisha Townshend

    2015-09-01

    Full Text Available Effective planning to optimize the forest value chain requires accurate and detailed information about the resource; however, estimates of the distribution of fibre properties on the landscape are largely unavailable prior to harvest. Our objective was to fit a model of the tree-level average fibre length related to ecosite classification and other forest inventory variables depicted at the landscape scale. A series of black spruce increment cores were collected at breast height from trees in nine different ecosite groups within the boreal forest of northeastern Ontario, and processed using standard techniques for maceration and fibre length measurement. Regression tree analysis and random forests were used to fit hierarchical classification models and find the most important predictor variables for the response variable, area-weighted mean stem-level fibre length. Ecosite group was the best predictor in the regression tree. Longer mean fibre length was associated with more productive ecosites that supported faster growth. The explanatory power of the model on the fitted data was good; however, random forest simulations indicated poor generalizability. These results suggest the potential to develop localized models linking wood fibre length in black spruce to landscape-level attributes, and to improve the sustainability of forest management by identifying ideal locations to harvest wood that has desirable fibre characteristics.
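
    The core operation of regression tree fitting, choosing the split on a predictor that minimizes the squared error within the two resulting leaves, can be sketched for a single predictor; the site-index and fibre-length numbers are invented for illustration:

    ```python
    def sse(ys):
        """Sum of squared errors around the mean of a leaf."""
        if not ys:
            return 0.0
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys)

    def best_regression_split(x, y):
        """One step of regression-tree fitting: the cut point on a single
        predictor that minimizes the summed squared error of the two leaves."""
        pairs = sorted(zip(x, y))
        best_err, best_cut = float("inf"), None
        for i in range(1, len(pairs)):
            left = [v for _, v in pairs[:i]]
            right = [v for _, v in pairs[i:]]
            err = sse(left) + sse(right)
            if err < best_err:
                best_err, best_cut = err, (pairs[i - 1][0] + pairs[i][0]) / 2
        return best_cut

    # Illustrative data: site productivity index vs mean fibre length (mm)
    site_index = [8, 9, 10, 14, 15, 16]
    fibre_len = [2.9, 3.0, 3.0, 3.4, 3.5, 3.5]
    print(best_regression_split(site_index, fibre_len))  # -> 12.0
    ```

    A full regression tree applies this split search recursively over all candidate predictors (ecosite group, inventory variables, and so on).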

  11. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are much useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  12. Biogeography based Satellite Image Classification

    CERN Document Server

    Panchal, V K; Kaur, Navdeep; Kundra, Harish

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization is a burgeoning nature inspired technique to find the optimal solution of the problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in past by using various techniques, the researchers are always finding alternative strategies for satellite image classification so that they may be prepared to select the most appropriate technique for the feature extraction task in hand. This paper is focused on classification of the satellite image of a particular land cover using the theory of Biogeography based Optimization. The original BBO algorithm does not have the inbuilt property of clustering which is required during image classification. Hence modifications have been proposed to the original algorithm and...

  13. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models

    DEFF Research Database (Denmark)

    Kheir, Rania Bou; Greve, Mogens Humlekrog; Bøcher, Peder Klith;

    2010-01-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to Kyoto protocol). This paper predicts and maps...... the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect......, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI) were generated to...

  14. Persistent pulmonary subsolid nodules: model-based iterative reconstruction for nodule classification and measurement variability on low-dose CT

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Hyungjin; Kim, Seong Ho; Lee, Sang Min; Lee, Kyung Hee [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University Medical Research Center, Institute of Radiation Medicine, Seoul (Korea, Republic of); Park, Chang Min; Park, Sang Joon; Goo, Jin Mo [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University Medical Research Center, Institute of Radiation Medicine, Seoul (Korea, Republic of); Seoul National University, Cancer Research Institute, Seoul (Korea, Republic of)

    2014-11-15

    To compare the pulmonary subsolid nodule (SSN) classification agreement and measurement variability between filtered back projection (FBP) and model-based iterative reconstruction (MBIR). Low-dose CTs were reconstructed using FBP and MBIR for 47 patients with 47 SSNs. Two readers independently classified SSNs into pure or part-solid ground-glass nodules, and measured the size of the whole nodule and solid portion twice on both reconstruction algorithms. Nodule classification agreement was analyzed using Cohen's kappa and compared between reconstruction algorithms using McNemar's test. Measurement variability was investigated using Bland-Altman analysis and compared with the paired t-test. Cohen's kappa for inter-reader SSN classification agreement was 0.541-0.662 on FBP and 0.778-0.866 on MBIR. Between the two readers, nodule classification was consistent in 79.8 % (75/94) with FBP and 91.5 % (86/94) with MBIR (p = 0.027). The inter-reader measurement variability range was -5.0 to 2.1 mm on FBP and -3.3 to 1.8 mm on MBIR for whole nodule size, and -6.5 to 0.9 mm on FBP and -5.5 to 1.5 mm on MBIR for solid portion size. Inter-reader measurement differences were significantly smaller on MBIR (p = 0.027, whole nodule; p = 0.011, solid portion). MBIR significantly improved SSN classification agreement and reduced measurement variability of both whole nodules and solid portions between readers. (orig.)

  15. Classification-based reasoning

    Science.gov (United States)

    Gomez, Fernando; Segami, Carlos

    1991-01-01

    A representation formalism for N-ary relations, quantification, and definition of concepts is described. Three types of conditions are associated with the concepts: (1) necessary and sufficient properties, (2) contingent properties, and (3) necessary properties. Also explained is how complex chains of inferences can be accomplished by representing existentially quantified sentences, and concepts denoted by restrictive relative clauses as classification hierarchies. The representation structures that make possible the inferences are explained first, followed by the reasoning algorithms that draw the inferences from the knowledge structures. All the ideas explained have been implemented and are part of the information retrieval component of a program called Snowy. An appendix contains a brief session with the program.

  16. Fuzzy One-Class Classification Model Using Contamination Neighborhoods

    Directory of Open Access Journals (Sweden)

    Lev V. Utkin

    2012-01-01

    Full Text Available A fuzzy classification model is studied in the paper. It is based on the contaminated (robust) model, which produces fuzzy expected risk measures characterizing classification errors. Optimal classification parameters of the model are derived by minimizing the fuzzy expected risk. It is shown that an algorithm for computing the classification parameters reduces to a set of standard support vector machine tasks with weighted data points. Experimental results with synthetic data illustrate the proposed fuzzy model.

  17. Normalization Benefits Microarray-Based Classification

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2006-01-01

    Full Text Available When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is the expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.
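
    The offset and linear-regression normalizations compared in the study can be sketched directly; the log-ratio (M) and intensity (A) arrays below are illustrative:

    ```python
    def offset_normalize(values):
        """Offset normalization: subtract the mean so values center at zero."""
        m = sum(values) / len(values)
        return [v - m for v in values]

    def linreg_normalize(ratios, intensities):
        """Linear-regression normalization: fit M = a + b*A by least squares,
        then subtract the fitted trend from each log-ratio M."""
        n = len(ratios)
        mean_a = sum(intensities) / n
        mean_m = sum(ratios) / n
        cov = sum((a - mean_a) * (m - mean_m)
                  for a, m in zip(intensities, ratios))
        var = sum((a - mean_a) ** 2 for a in intensities)
        b = cov / var
        a0 = mean_m - b * mean_a
        return [m - (a0 + b * a) for m, a in zip(ratios, intensities)]

    M = [0.5, 0.7, 0.9, 1.1]     # illustrative log-ratios with a linear trend
    A = [8.0, 10.0, 12.0, 14.0]  # corresponding average log-intensities
    print([round(v, 6) for v in linreg_normalize(M, A)])  # residuals vanish
    ```

    Lowess normalization replaces the single global line with a locally weighted fit, which is why it handles intensity-dependent bias that the offset method cannot.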

  18. Normalization Benefits Microarray-Based Classification

    Directory of Open Access Journals (Sweden)

    Edward R. Dougherty

    2006-08-01

    Full Text Available When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is the expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.

  19. hERG classification model based on a combination of support vector machine method and GRIND descriptors

    DEFF Research Database (Denmark)

    Li, Qiyuan; Jorgensen, Flemming Steen; Oprea, Tudor;

    2008-01-01

    The human Ether-a-go-go Related Gene (hERG) potassium channel is one of the major critical factors associated with QT interval prolongation and development of the arrhythmia called Torsades de Pointes (TdP). It has become a growing concern of both regulatory agencies and pharmaceutical industries who...... invest substantial effort in the assessment of cardiac toxicity of drugs. The development of in silico tools to filter out potential hERG channel inhibitors in early stages of the drug discovery process is of considerable interest. Here, we describe binary classification models based on a large and...... diverse library of 495 compounds. The models combine pharmacophore-based GRIND descriptors with a support vector machine (SVM) classifier in order to discriminate between hERG blockers and nonblockers. Our models were applied at different thresholds from 1 to 40 μM and achieved an overall accuracy up to...

  20. AdaBoost classification for model-based segmentation of the outer wall of the common carotid artery in CTA

    Science.gov (United States)

    Vukadinovic, D.; van Walsum, T.; Manniesing, R.; van der Lugt, A.; de Weert, T. T.; Niessen, W. J.

    2008-03-01

    A novel 2D slice-based, fully automatic method for model-based segmentation of the outer vessel wall of the common carotid artery in CTA datasets is introduced. The method utilizes a lumen segmentation and AdaBoost, a fast and robust machine learning algorithm, to initially classify (mark) regions outside and inside the vessel wall using the distance from the lumen and intensity profiles sampled radially from the gravity center of the lumen. A similar method, using the distance from the lumen and the image intensity as features, is used to classify calcium regions. Subsequently, an ellipse-shaped deformable model is fitted to the classification result. The method achieves a smaller detection error than the inter-observer variability, and it is robust against variation of the training data sets.
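
    AdaBoost with decision stumps can be sketched on a toy one-feature problem; the feature (distance from the lumen) and the labels are invented, whereas the actual method classifies radial intensity profiles:

    ```python
    import math

    def stump_predict(x, feat, thresh, sign):
        """Decision stump: return `sign` when the feature exceeds the threshold."""
        return sign if x[feat] > thresh else -sign

    def fit_adaboost(X, y, rounds=10):
        """AdaBoost: each round fits the stump with the lowest weighted error,
        then re-weights the training points toward the mistakes."""
        n = len(X)
        w = [1.0 / n] * n
        ensemble = []
        for _ in range(rounds):
            best = None  # (error, feat, thresh, sign)
            for feat in range(len(X[0])):
                for thresh in sorted({x[feat] for x in X}):
                    for sign in (1, -1):
                        err = sum(wi for wi, x, yi in zip(w, X, y)
                                  if stump_predict(x, feat, thresh, sign) != yi)
                        if best is None or err < best[0]:
                            best = (err, feat, thresh, sign)
            err = max(min(best[0], 1 - 1e-10), 1e-10)  # clip for log stability
            alpha = 0.5 * math.log((1 - err) / err)
            ensemble.append((alpha,) + best[1:])
            w = [wi * math.exp(-alpha * y[i] *
                               stump_predict(X[i], best[1], best[2], best[3]))
                 for i, wi in enumerate(w)]
            s = sum(w)
            w = [wi / s for wi in w]
        return ensemble

    def predict(ensemble, x):
        score = sum(a * stump_predict(x, f, t, s) for a, f, t, s in ensemble)
        return 1 if score > 0 else -1

    # Toy data: feature = [distance from lumen]; +1 inside wall, -1 outside
    X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
    y = [1, 1, 1, -1, -1, -1]
    model = fit_adaboost(X, y, rounds=5)
    print([predict(model, x) for x in X])  # -> [1, 1, 1, -1, -1, -1]
    ```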

  1. Clear-sky classification procedures and models using a world-wide data-base

    International Nuclear Information System (INIS)

    Clear-sky data need to be extracted from all-sky measured solar-irradiance datasets, often by using algorithms that rely on other measured meteorological parameters. Current procedures for clear-sky data extraction have been examined and compared with each other to determine their reliability and location dependency. New clear-sky determination algorithms are proposed that are based on a combination of clearness index, diffuse ratio, cloud cover and Linke's turbidity limits. Various researchers have proposed clear-sky irradiance models that rely on synoptic parameters; four of these models, MRM, PRM, YRM and REST2, have been compared for six worldwide locations. Based on a previously-developed comprehensive accuracy scoring method, the models MRM, REST2 and YRM were found to be of satisfactory performance, in decreasing order. The so-called Page radiation model (PRM) was found to underestimate solar radiation, even though local turbidity data were provided for its operation.
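
    A minimal sketch of threshold-based clear-sky flagging of the kind described, combining clearness index, diffuse ratio and cloud cover; the limit values here are illustrative, not the published ones:

    ```python
    def is_clear_sky(kt, diffuse_ratio, cloud_octas,
                     kt_min=0.65, dr_max=0.25, cc_max=1):
        """Flag a record as clear-sky when the clearness index is high, the
        diffuse ratio is low and cloud cover is minimal (illustrative limits)."""
        return (kt >= kt_min and diffuse_ratio <= dr_max
                and cloud_octas <= cc_max)

    records = [
        {"kt": 0.72, "dr": 0.12, "cc": 0},  # a clear hour
        {"kt": 0.40, "dr": 0.60, "cc": 6},  # an overcast hour
    ]
    clear = [r for r in records if is_clear_sky(r["kt"], r["dr"], r["cc"])]
    print(len(clear))  # -> 1
    ```

    The proposed algorithms additionally bound Linke turbidity, which this sketch omits.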

  2. A Kernel Method Based on Topic Model for Very High Spatial Resolution (VHSR) Remote Sensing Image Classification

    Science.gov (United States)

    Wu, Linmei; Shen, Li; Li, Zhipeng

    2016-06-01

    A kernel-based method for very high spatial resolution remote sensing image classification is proposed in this article. The new kernel is based on spectral-spatial information as well as structure information, which is acquired from a topic model, the Latent Dirichlet Allocation model. The final kernel function is defined as K = u1Kspec + u2Kspat + u3Kstru, in which Kspec, Kspat and Kstru are radial basis function (RBF) kernels and u1 + u2 + u3 = 1. In the experiment, a comparison with three other kernel methods (the spectral-based, the spectral- and spatial-based, and the spectral- and structure-based methods) is provided for a panchromatic QuickBird image of a suburban area with a size of 900 × 900 pixels and a spatial resolution of 0.6 m. The results show that the overall accuracy of the spectral- and structure-based kernel method is 80%, higher than the spectral-based kernel method (67%) and the spectral- and spatial-based method (74%). Moreover, the accuracy of the proposed composite kernel method, which jointly uses the spectral, spatial and structure information, is the highest of the four methods at 83%. The experiment also verifies the validity of the expression of structure information about the remote sensing image.
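
    The composite kernel K = u1Kspec + u2Kspat + u3Kstru can be written down directly; the feature vectors and weights below are illustrative:

    ```python
    import math

    def rbf(u, v, gamma=1.0):
        """Radial basis function kernel between two feature vectors."""
        return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

    def composite_kernel(x1, x2, u=(0.4, 0.3, 0.3)):
        """K = u1*Kspec + u2*Kspat + u3*Kstru with u1 + u2 + u3 = 1.
        Each sample is a tuple of (spectral, spatial, structure) vectors."""
        assert abs(sum(u) - 1.0) < 1e-9
        return sum(ui * rbf(a, b) for ui, (a, b) in zip(u, zip(x1, x2)))

    # Illustrative per-pixel features: spectral, spatial and topic/structure
    x = ([0.2, 0.4], [1.0, 0.0], [0.3, 0.3, 0.4])
    print(round(composite_kernel(x, x), 9))  # -> 1.0 for identical samples
    ```

    A convex combination of valid kernels is itself a valid kernel, so the composite can be dropped into any SVM without further changes.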

  3. Multinomial mixture model with heterogeneous classification probabilities

    Science.gov (United States)

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.

  4. Combining Individual and Global Tree-based Models in EEG Classification

    Czech Academy of Sciences Publication Activity Database

    Klaschka, Jan

    Lisabon : Instituto Nacional de Estatística, 2008 - (Gomes, M.; Pinto Martins, J.; Silva, J.), s. 3786-3789 ISBN 978-972-673-992-0. [ISI 2007. Session of the International Statistical Institute /56./. Lisboa (PT), 22.08.2007-29.08.2007] R&D Projects: GA MŠk ME 701 Institutional research plan: CEZ:AV0Z10300504 Keywords : EEG spectra * classification forest * random forests * OOB estimates Subject RIV: BB - Applied Statistics, Operational Research

  6. Modulation classification based on spectrogram

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The aim of modulation classification (MC) is to identify the modulation type of a communication signal. It plays an important role in many cooperative or noncooperative communication applications. Three spectrogram-based modulation classification methods are proposed. Their recognition scope and performance are investigated and evaluated by theoretical analysis and extensive simulation studies. The method taking moment-like features is robust to frequency offset, while the other two, which make use of principal component analysis (PCA) with different transformation inputs, can achieve satisfactory accuracy even at low SNR (as low as 2 dB). Due to the properties of the spectrogram, the statistical pattern recognition techniques, and the image preprocessing steps, all of our methods are insensitive to unknown phase and frequency offsets, timing errors, and the arriving sequence of symbols.
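
    A minimal magnitude spectrogram, the common front end for all three methods, can be computed with a sliding-window DFT; windowing functions and zero-padding are omitted for brevity:

    ```python
    import cmath
    import math

    def spectrogram(signal, frame_len=8, hop=4):
        """Magnitude spectrogram: slide a window over the signal and take the
        DFT magnitude of each frame (no window function, for brevity)."""
        frames = []
        for start in range(0, len(signal) - frame_len + 1, hop):
            frame = signal[start:start + frame_len]
            mags = []
            for k in range(frame_len // 2 + 1):  # non-negative frequencies only
                coeff = sum(x * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                            for n, x in enumerate(frame))
                mags.append(abs(coeff))
            frames.append(mags)
        return frames

    # A pure tone at 1/4 of the sampling rate: its energy lands in bin k = 2
    # of an 8-point frame (0.25 * 8 = 2).
    tone = [math.cos(2 * math.pi * 0.25 * n) for n in range(32)]
    S = spectrogram(tone)
    peak_bin = max(range(len(S[0])), key=lambda k: S[0][k])
    print(peak_bin)  # -> 2
    ```

    The proposed methods then extract moment-like features or PCA projections from such time-frequency maps.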

  7. Ligand and Structure-Based Classification Models for Prediction of P-Glycoprotein Inhibitors

    OpenAIRE

    Klepsch, Freya; Vasanthanathan, Poongavanam; Ecker, Gerhard F

    2014-01-01

    The ABC transporter P-glycoprotein (P-gp) actively transports a wide range of drugs and toxins out of cells, and is therefore related to multidrug resistance and the ADME profile of therapeutics. Thus, development of predictive in silico models for the identification of P-gp inhibitors is of great interest in the field of drug discovery and development. So far in silico P-gp inhibitor prediction was dominated by ligand-based approaches because of the lack of high-quality structural informatio...

  8. Predicting student satisfaction with courses based on log data from a virtual learning environment – a neural network and classification tree model

    Directory of Open Access Journals (Sweden)

    Ivana Đurđević Babić

    2015-03-01

    Full Text Available Student satisfaction with courses in academic institutions is an important issue and is recognized as a form of support in ensuring effective and quality education, as well as enhancing student course experience. This paper investigates whether there is a connection between student satisfaction with courses and log data on student courses in a virtual learning environment. Furthermore, it explores whether a successful classification model for predicting student satisfaction with a course can be developed based on course log data, and compares the results obtained from the implemented methods. The research was conducted at the Faculty of Education in Osijek and included analysis of log data and course satisfaction on a sample of third and fourth year students. Multilayer Perceptron (MLP) neural networks with different activation functions and Radial Basis Function (RBF) neural networks, as well as classification tree models, were developed, trained and tested in order to classify students into one of two categories of course satisfaction. Type I and type II errors, classification accuracy, and input variable importance were used for model comparison. The results indicate that a successful classification model using the tested methods can be created. The MLP model provides the highest average classification accuracy and the lowest tendency to misclassify students with a low level of course satisfaction, although a t-test for the difference in proportions showed that the difference in performance between the compared models is not statistically significant. Student involvement in forum discussions is recognized as a valuable predictor of student satisfaction with courses in all observed models.

  9. A Classification of Constructivist Instructional Design Models Based on Learning and Teaching Approaches

    Science.gov (United States)

    Fardanesh, Hashem

    2006-01-01

    In a conceptual-analytical study using a deductive classificatory content analysis method ten constructivist instructional design models were selected, and learning/teaching approaches within each model were appraised. Using the original writings of the originators of each design model, the learning and teaching approaches employed or permitted to…

  10. Deep Reconstruction Models for Image Set Classification.

    Science.gov (United States)

    Hayat, Munawar; Bennamoun, Mohammed; An, Senjian

    2015-04-01

    Image set classification finds its applications in a number of real-life scenarios such as classification from surveillance videos, multi-view camera networks and personal albums. Compared with single-image-based classification, it offers more promise and has therefore attracted significant research attention in recent years. Unlike many existing methods which assume images of a set to lie on a certain geometric surface, this paper introduces a deep learning framework which makes no such prior assumptions and can automatically discover the underlying geometric structure. Specifically, a Template Deep Reconstruction Model (TDRM) is defined whose parameters are initialized by performing unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBMs). The initialized TDRM is then separately trained for images of each class and class-specific DRMs are learnt. Based on the minimum reconstruction errors from the learnt class-specific models, three different voting strategies are devised for classification. Extensive experiments are performed to demonstrate the efficacy of the proposed framework for the tasks of face and object recognition from image sets. Experimental results show that the proposed method consistently outperforms existing state-of-the-art methods. PMID:26353289

  11. Projection Classification Based Iterative Algorithm

    Science.gov (United States)

    Zhang, Ruiqiu; Li, Chen; Gao, Wenhua

    2015-05-01

    Iterative algorithms perform well in the 3D image reconstruction area because they do not require complete projection data. This makes them applicable to the inspection of BGA solder joints, which is usually performed with x-ray laminography and suffers from both slow convergence and poorer reconstructed images than conventional reconstruction. This paper explores a projection classification based method that separates the object into three parts, i.e. solute, solution and air, and assumes that the reconstruction speed decreases linearly from the solution to the other two parts on both sides. The SART and CAV algorithms are then improved under the proposed idea. Simulation experiments with incomplete projection images indicate the fast convergence speed of the improved iterative algorithms and the effectiveness of the proposed method. The fewer the projection images, the greater the advantage of the proposed method.
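The SART baseline that the paper improves can be sketched on a toy linear system; the small matrix below stands in for real projection geometry, and this is the standard update rather than the authors' modified version:

```python
import numpy as np

def sart(A, b, n_iter=50, lam=0.5):
    """Minimal SART (Simultaneous Algebraic Reconstruction Technique):
    x_{k+1} = x_k + lam * V^-1 A^T W^-1 (b - A x_k),
    where W, V are diagonal row-sum and column-sum scalings of A."""
    eps = 1e-12
    row_sum = np.maximum(A.sum(axis=1), eps)
    col_sum = np.maximum(A.sum(axis=0), eps)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual = (b - A @ x) / row_sum   # row-normalized data mismatch
        x = x + lam * (A.T @ residual) / col_sum
    return x

# Toy consistent system standing in for projections of a 3-pixel object.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
x_true = np.array([1.0, 2.0, 3.0])
b = A @ x_true
x_rec = sart(A, b, n_iter=500)
```

For a consistent system with full column rank, the iteration converges to the true solution; the paper's idea amounts to making the relaxation behave differently in the solute, solution and air regions.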

  12. Models for concurrency: towards a classification

    DEFF Research Database (Denmark)

    Sassone, Vladimiro; Nielsen, Mogens; Winskel, Glynn

    1996-01-01

    Models for concurrency can be classified with respect to three relevant parameters: behaviour/ system, interleaving/noninterleaving, linear/branching time. When modelling a process, a choice concerning such parameters corresponds to choosing the level of abstraction of the resulting semantics. In...... this paper, we move a step towards a classification of models for concurrency based on the parameters above. Formally, we choose a representative of any of the eight classes of models obtained by varying the three parameters, and we study the formal relationships between them using the language of...

  13. Supervised and Unsupervised Classification Using Mixture Models

    Science.gov (United States)

    Girard, S.; Saracco, J.

    2016-05-01

    This chapter is dedicated to model-based supervised and unsupervised classification. Probability distributions are defined over possible labels as well as over the observations given the labels. To this end, the basic tools are mixture models. This methodology yields a posterior distribution over the labels given the observations, which allows one to quantify the uncertainty of the classification. The role of Gaussian mixture models is emphasized, leading to the Linear Discriminant Analysis and Quadratic Discriminant Analysis methods. Some links with Fisher Discriminant Analysis and logistic regression are also established. The Expectation-Maximization algorithm is introduced and compared to the K-means clustering method. The methods are illustrated on both simulated and real datasets using the R software.
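The Expectation-Maximization algorithm mentioned above can be sketched for a two-component one-dimensional Gaussian mixture; the data, initialization, and iteration count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two 1-D Gaussian clusters standing in for labelled/unlabelled observations.
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 300)])

# EM for a two-component Gaussian mixture.
mu, sigma, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(100):
    # E-step: posterior responsibility of each component for each point.
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
           / (sigma * np.sqrt(2 * np.pi))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted updates of mixing weights, means and variances.
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

labels = resp.argmax(axis=1)  # hard classification from the soft posteriors
```

Unlike K-means, which assigns each point to exactly one cluster, the responsibilities `resp` quantify the uncertainty of each assignment, which is the key point of the chapter.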

  14. Web Text Classification Model Study Based on SAS

    Institute of Scientific and Technical Information of China (English)

    向来生; 孙威; 刘希玉

    2016-01-01

    In this paper, we establish a model for text classification of e-commerce customers' query information, to help companies understand users' spending habits and to help users find the goods they need. The study first acquires customer query data and preprocesses the text data. An improved TF-IDF method is then applied to obtain text feature vectors. Finally, a classification model is established by combining Naive Bayes text classification with a semi-supervised EM iterative algorithm, and various criteria are used to evaluate the model and verify its effectiveness. When selecting features for multi-class text collections, keyword weights are prone to large fluctuations; this study improves the keyword weight calculation formula to improve the classification results. The experimental results show that the classifier achieves a good classification effect.
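The Naive Bayes component of the pipeline can be sketched as follows; the toy corpus is an assumption, and the TF-IDF weighting and semi-supervised EM steps from the abstract are omitted:

```python
import math
from collections import Counter

# Toy labelled queries standing in for preprocessed customer query data.
docs = [("cheap phone case", "electronics"),
        ("new phone battery", "electronics"),
        ("summer dress sale", "clothing"),
        ("cotton dress shirt", "clothing")]

# Multinomial Naive Bayes with Laplace (add-one) smoothing over word counts.
classes = sorted({c for _, c in docs})
vocab = sorted({w for d, _ in docs for w in d.split()})
counts = {c: Counter() for c in classes}
for d, c in docs:
    counts[c].update(d.split())

def log_posterior(text, c):
    prior = math.log(sum(1 for _, y in docs if y == c) / len(docs))
    total = sum(counts[c].values()) + len(vocab)  # smoothing denominator
    return prior + sum(math.log((counts[c][w] + 1) / total)
                       for w in text.split())

def classify(text):
    return max(classes, key=lambda c: log_posterior(text, c))

pred = classify("phone sale")
```

In the paper's semi-supervised setting, EM would alternate between classifying unlabelled queries with the current model (E-step) and re-estimating the counts from the soft labels (M-step).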

  15. Web spam detection : new classification features based on qualified link analysis and language models

    OpenAIRE

    Araujo, Lourdes; Martínez-Romo, Juan

    2010-01-01

    Web spam is a serious problem for search engines because the quality of their results can be severely degraded by the presence of this kind of page. In this paper, we present an efficient spam detection system based on a classifier that combines new lin

  16. General regression and representation model for classification.

    Directory of Open Access Journals (Sweden)

    Jianjun Qian

    Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (the weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: the basic general regression and representation classifier (B-GRR) and the robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.
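The CRC baseline that GRR builds on can be sketched in a few lines: code the test sample over all training samples with a ridge (Tikhonov) penalty, then assign the class whose samples give the smallest reconstruction residual. This is a sketch of plain CRC on synthetic data, not of GRR itself:

```python
import numpy as np

def crc_classify(X_train, y_train, x_test, lam=0.1):
    """Collaborative-representation style classifier: ridge-regularized
    coding over the whole training set, class-wise residual comparison."""
    D = X_train.T                                  # dictionary: columns = samples
    coef = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x_test)
    residuals = {}
    for c in np.unique(y_train):
        mask = (y_train == c)
        residuals[c] = np.linalg.norm(x_test - D[:, mask] @ coef[mask])
    return min(residuals, key=residuals.get)

# Two synthetic classes in 5-D feature space (illustrative assumption).
rng = np.random.default_rng(2)
X0 = rng.normal(0.0, 0.3, (20, 5)) + np.array([1.0, 0, 0, 0, 0])
X1 = rng.normal(0.0, 0.3, (20, 5)) + np.array([0.0, 0, 0, 0, 1.0])
X = np.vstack([X0, X1])
y = np.array([0] * 20 + [1] * 20)
pred = crc_classify(X, y, np.array([1.0, 0, 0, 0, 0]))
```

GRR generalizes the identity matrix in the ridge term to a learned Tikhonov matrix encoding residual correlations, and additionally weights individual pixels of the test sample.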

  17. Classification of thermal waters based on their inorganic fingerprint and hydrogeothermal modelling

    OpenAIRE

    I. Delgado-Outeiriño; Araujo-Nespereira, P.; J. A. Cid-Fernández; J. C. Mejuto; E. Martínez-Carballo; Simal-Gándara, J.

    2011-01-01

    Hydrothermal features in Galicia have been used since ancient times for therapeutic purposes. A characterization of these thermal waters was carried out in order to understand their behaviour based on their inorganic pattern and water-rock interaction mechanisms. In this way, 15 thermal water samples were collected in the same hydrographical system. The results of the hydrogeochemical analysis showed one main water family of bicarbonate-type sodium waters, typical of the post-orogenic basins of Gal...

  18. Classification techniques based on AI application to defect classification in cast aluminum

    Science.gov (United States)

    Platero, Carlos; Fernandez, Carlos; Campoy, Pascual; Aracil, Rafael

    1994-11-01

    This paper describes the Artificial Intelligence techniques applied to the interpretation of images of cast aluminum surfaces presenting different defects. The whole process includes on-line defect detection, feature extraction and defect classification. These topics are discussed in depth throughout the paper. The data preprocessing, as well as segmentation and feature extraction, are described; the algorithms employed, along with the descriptors used, are presented. A syntactic filter has been developed to model the information and to generate the input vector for the classification system. Classification of defects is achieved by means of rule-based systems, fuzzy models and neural nets. Different classification subsystems work together to solve a pattern recognition problem (hybrid systems). Firstly, syntactic methods are used to obtain the filter that reduces the dimension of the input vector to the classification process. Rule-based classification is achieved by associating a grammar with each defect type; the knowledge base is formed by the information derived from the syntactic filter along with the inferred rules. The fuzzy classification subsystem uses production rules with fuzzy antecedents, whose consequents are membership degrees for each defect type. Different architectures of neural nets have been implemented, with different results, as shown in the paper. At the highest classification level, the information given by the heterogeneous systems, as well as the history of the process, is supplied to an Expert System in order to drive the casting process.

  19. Ladar-based terrain cover classification

    Science.gov (United States)

    Macedo, Jose; Manduchi, Roberto; Matthies, Larry H.

    2001-09-01

    An autonomous vehicle driving in a densely vegetated environment needs to be able to discriminate between obstacles (such as rocks) and penetrable vegetation (such as tall grass). We propose a technique for terrain cover classification based on the statistical analysis of the range data produced by a single-axis laser rangefinder (ladar). We first present theoretical models for the range distribution in the presence of homogeneously distributed grass and of obstacles partially occluded by grass. We then validate our results with real-world cases, and propose a simple algorithm to robustly discriminate between vegetation and obstacles based on the local statistical analysis of the range data.

  20. Transportation Mode Choice Analysis Based on Classification Methods

    OpenAIRE

    Zeņina, N; Borisovs, A

    2011-01-01

    Mode choice analysis has received the most attention among discrete choice problems in travel behavior literature. Most traditional mode choice models are based on the principle of random utility maximization derived from econometric theory. This paper investigates performance of mode choice analysis with classification methods - decision trees, discriminant analysis and multinomial logit. Experimental results have demonstrated satisfactory quality of classification.

  1. The high-density lipoprotein-adjusted SCORE model worsens SCORE-based risk classification in a contemporary population of 30 824 Europeans

    DEFF Research Database (Denmark)

    Mortensen, Martin B; Afzal, Shoaib; Nordestgaard, Børge G;

    2015-01-01

    AIMS: Recent European guidelines recommend to include high-density lipoprotein (HDL) cholesterol in risk assessment for primary prevention of cardiovascular disease (CVD), using a SCORE-based risk model (SCORE-HDL). We compared the predictive performance of SCORE-HDL with SCORE in an independent...... with SCORE, but deteriorated risk classification based on NRI. Future guidelines should consider lower decision thresholds and prioritize CVD morbidity and people above age 65....

  2. Digital image-based classification of biodiesel.

    Science.gov (United States)

    Costa, Gean Bezerra; Fernandes, David Douglas Sousa; Almeida, Valber Elias; Araújo, Thomas Souto Policarpo; Melo, Jessica Priscila; Diniz, Paulo Henrique Gonçalves Dias; Véras, Germano

    2015-07-01

    This work proposes a simple, rapid, inexpensive, and non-destructive methodology based on digital images and pattern recognition techniques for classification of biodiesel according to oil type (cottonseed, sunflower, corn, or soybean). For this, differing color histograms in RGB (extracted from digital images), HSI, and Grayscale channels, and their combinations, were used as analytical information, which was then statistically evaluated using Soft Independent Modeling by Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and variable selection using the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). Despite good performances by the SIMCA and PLS-DA classification models, SPA-LDA provided better results (up to 95% for all approaches) in terms of accuracy, sensitivity, and specificity for both the training and test sets. The variables selected by the Successive Projections Algorithm clearly contained the information necessary for biodiesel type classification. This is important since a product may exhibit different properties, depending on the feedstock used. Such variations directly influence the quality, and consequently the price. Moreover, intrinsic advantages such as quick analysis, requiring no reagents, and a noteworthy reduction of waste generation (avoiding chemical characterization) all contribute towards the primary objective of green chemistry. PMID:25882407

  3. An Automatic Segmentation and Classification Framework Based on PCNN Model for Single Tooth in MicroCT Images

    Science.gov (United States)

    Wang, Liansheng; Li, Shusheng; Chen, Rongzhen; Liu, Sze-Yu; Chen, Jyh-Cheng

    2016-01-01

    Accurate segmentation and classification of the different anatomical structures of teeth from medical images plays an essential role in many clinical applications. Usually, the anatomical structures of teeth are manually labelled by experienced clinical doctors, which is time consuming. However, automatic segmentation and classification is a challenging task because the anatomical structures and surroundings of the tooth in medical images are rather complex. Therefore, in this paper, we propose an effective framework which is designed to segment the tooth with a Selective Binary and Gaussian Filtering Regularized Level Set (GFRLS) method, improved by fully utilizing three dimensional (3D) information, and to classify the tooth by employing an unsupervised-learning Pulse Coupled Neural Network (PCNN) model. In order to evaluate the proposed method, experiments were conducted on different datasets of mandibular molars, and the experimental results show that our method can achieve better accuracy and robustness compared to four other state-of-the-art clustering methods. PMID:27322421

  4. A Model for Classification Secondary School Student Enrollment Approval Based on E-Learning Management System and E-Games

    Directory of Open Access Journals (Sweden)

    Hany Mohamed El-katary

    2016-02-01

    Full Text Available The student is the key of the educational process, where students' creativity and interactions are strongly encouraged. There are many tools embedded in Learning Management Systems (LMS) that serve to evaluate learners. A current problem is that the assessment process is not always fair or accurate in classifying students according to accumulated knowledge. Therefore, there is a need to apply a new model for better decision making for students' enrollment and assessments. The proposed model may run along with an assessment tool within an LMS. The proposed model performs analysis and obtains knowledge regarding the classification capability of the assessment process. It offers knowledge for course managers regarding the course materials, quizzes, activities and e-games. The proposed model is an accurate assessment tool and thus enables better classification among learners. The proposed model was developed for learning management systems, which are commonly used in e-learning in Egyptian language schools. The proposed model demonstrated good accuracy compared to real sample data (250 students).

  5. Waste Classification based on Waste Form Heat Generation in Advanced Nuclear Fuel Cycles Using the Fuel-Cycle Integration and Tradeoffs (FIT) Model

    Energy Technology Data Exchange (ETDEWEB)

    Denia Djokic; Steven J. Piet; Layne F. Pincock; Nick R. Soelberg

    2013-02-01

    This study explores the impact of wastes generated from potential future fuel cycles and the issues presented by classifying these under current classification criteria, and discusses the possibility of a comprehensive and consistent characteristics-based classification framework based on new waste streams created from advanced fuel cycles. A static mass flow model, Fuel-Cycle Integration and Tradeoffs (FIT), was used to calculate the composition of waste streams resulting from different nuclear fuel cycle choices. This analysis focuses on the impact of waste form heat load on waste classification practices, although classifying by metrics of radiotoxicity, mass, and volume is also possible. The value of separation of heat-generating fission products and actinides in different fuel cycles is discussed. It was shown that the benefits of reducing the short-term fission-product heat load of waste destined for geologic disposal are neglected under the current source-based radioactive waste classification system, and that it is useful to classify waste streams based on how favorable the impact of interim storage is in increasing repository capacity.

  6. Arabic Text Mining Using Rule Based Classification

    OpenAIRE

    Fadi Thabtah; Omar Gharaibeh; Rashid Al-Zubaidy

    2012-01-01

    A well-known classification problem in the domain of text mining is text classification, which concerns mapping textual documents into one or more predefined categories based on their content. The text classification arena has recently attracted many researchers because of the massive amounts of online documents and text archives which hold essential information for decision-making processes. In this field, most such research focuses on classifying English documents while there are limited studi...

  7. Texture Classification based on Gabor Wavelet

    OpenAIRE

    Amandeep Kaur; Savita Gupta

    2012-01-01

    This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture features of an image can be computed using texture descriptors. In this paper we have used the Homogeneous Texture Descriptor, which is based on Gabor wavelets. For texture classification, we have used an online texture database, Brodatz's database, and three advanced well known classifiers: Support Vec...

  8. Nonlinear Inertia Classification Model and Application

    Directory of Open Access Journals (Sweden)

    Mei Wang

    2014-01-01

    Full Text Available The classification model of the support vector machine (SVM) overcomes the problem of a big number of samples. But the kernel parameter and the punishment factor have great influence on the quality of the SVM model. Particle swarm optimization (PSO) is an evolutionary search algorithm based on swarm intelligence, which is suitable for parameter optimization. Accordingly, a nonlinear inertia convergence classification model (NICCM) is proposed after the nonlinear inertia convergence PSO (NICPSO) is developed in this paper. The velocity of NICPSO is first defined as the weighted velocity of the inertia PSO, and the inertia factor is selected to be a nonlinear function. NICPSO is used to optimize the kernel parameter and the punishment factor of SVM. Then, the NICCM classifier is trained using the optimal punishment factor and the optimal kernel parameter that come from the optimal particle. Finally, NICCM is applied to the classification of the normal state and fault states of an online power cable. It is experimentally proved that the number of iterations for the proposed NICPSO to reach the optimal position decreases from 15 to 5 compared with PSO; the training duration is decreased by 0.0052 s and the recognition precision is increased by 4.12% compared with SVM.
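The PSO component can be illustrated with a minimal global-best variant minimizing a stand-in objective. This is the constant-inertia form; the paper's NICPSO replaces the fixed inertia `w` with a nonlinear function of the iteration, and its objective would be SVM validation error as a function of (kernel parameter, punishment factor):

```python
import numpy as np

def pso(objective, dim, n_particles=30, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=3):
    """Minimal global-best particle swarm optimization."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()                                 # personal bests
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()           # global best
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Sphere function with optimum at (1, 1), a stand-in for SVM tuning error.
best, best_val = pso(lambda p: np.sum((p - 1.0) ** 2), dim=2)
```

Each particle is pulled toward its own best position and the swarm's best position; the inertia term controls how quickly exploration gives way to convergence, which is what the nonlinear schedule in NICPSO tunes.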

  9. Cardiac arrhythmia classification using autoregressive modeling

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanan

    2002-11-01

    Full Text Available Abstract Background Computer-assisted arrhythmia recognition is critical for the management of cardiac disorders. Various techniques have been utilized to classify arrhythmias. Generally, these techniques classify two or three arrhythmias or have significantly large processing times. A simpler autoregressive (AR) modeling technique is proposed to classify normal sinus rhythm (NSR) and various cardiac arrhythmias including atrial premature contraction (APC), premature ventricular contraction (PVC), supraventricular tachycardia (SVT), ventricular tachycardia (VT) and ventricular fibrillation (VF). Methods AR modeling was performed on ECG data from normal sinus rhythm as well as various arrhythmias. The AR coefficients were computed using Burg's algorithm. The AR coefficients were classified using a generalized linear model (GLM) based algorithm in various stages. Results AR modeling results showed that an order of four was sufficient for modeling the ECG signals. The accuracy of detecting NSR, APC, PVC, SVT, VT and VF was 93.2% to 100% using the GLM based classification algorithm. Conclusion The results show that AR modeling is useful for the classification of cardiac arrhythmias, with reasonably high accuracies. Further validation of the proposed technique will yield acceptable results for clinical implementation.
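The feature-extraction step, AR coefficients via Burg's algorithm, can be sketched as follows; the synthetic AR(2) series is an assumption standing in for ECG data, and the paper used order four:

```python
import numpy as np

def burg_ar(x, order):
    """Burg's method: AR coefficients a such that the prediction error is
    e[n] = x[n] + a[0]*x[n-1] + ... + a[order-1]*x[n-order]."""
    f = np.asarray(x[1:], dtype=float)   # forward prediction errors
    b = np.asarray(x[:-1], dtype=float)  # backward prediction errors
    a = np.zeros(0)
    for _ in range(order):
        k = -2.0 * (f @ b) / (f @ f + b @ b)          # reflection coefficient
        a = np.concatenate([a + k * a[::-1], [k]]) if a.size else np.array([k])
        f, b = (f + k * b)[1:], (b + k * f)[:-1]      # Levinson-style update
    return a

# Synthetic AR(2) series: x[n] = 0.75*x[n-1] - 0.5*x[n-2] + noise,
# so the recovered coefficients should be close to [-0.75, 0.5].
rng = np.random.default_rng(4)
x = np.zeros(4000)
for n in range(2, len(x)):
    x[n] = 0.75 * x[n - 1] - 0.5 * x[n - 2] + rng.normal()
a = burg_ar(x, order=2)
```

In the paper, the four coefficients per ECG segment become the feature vector fed to the staged GLM classifier.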

  10. Operational risk modeled analytically II: the consequences of classification invariance

    OpenAIRE

    Vivien Brunel

    2015-01-01

    Most of the banks' operational risk internal models are based on loss pooling in risk and business line categories. The parameters and outputs of operational risk models are sensitive to the pooling of the data and the choice of the risk classification. In a simple model, we establish the link between the number of risk cells and the model parameters by requiring invariance of the bank's loss distribution upon a change in classification. We provide details on the impact of this requirement on...

  11. Domain-Based Classification of CSCW Systems

    Directory of Open Access Journals (Sweden)

    M. Khan

    2011-11-01

    Full Text Available CSCW systems are widely used for group activities in different organizations and setups. This study briefly describes the existing classifications of CSCW systems and their shortcomings. These existing classifications are helpful to categorize systems based on a general set of CSCW characteristics but do not provide any guidance towards system design and evaluation. After a literature review of the ACM CSCW conference (1986-2010), a new classification is proposed to categorize CSCW systems on the basis of domains. This proposed classification may help researchers to come up with more effective design and evaluation methods for CSCW systems.

  12. Computerized Classification Testing with the Rasch Model

    Science.gov (United States)

    Eggen, Theo J. H. M.

    2011-01-01

    If classification in a limited number of categories is the purpose of testing, computerized adaptive tests (CATs) with algorithms based on sequential statistical testing perform better than estimation-based CATs (e.g., Eggen & Straetmans, 2000). In these computerized classification tests (CCTs), the Sequential Probability Ratio Test (SPRT) (Wald,…
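The SPRT decision rule in such computerized classification tests can be sketched under the Rasch model: accumulate the log-likelihood ratio between two ability levels and stop when it crosses Wald's bounds. The ability levels, error rates, and item difficulties below are illustrative assumptions:

```python
import math

def rasch_p(theta, b):
    """Rasch model probability of a correct response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sprt_classify(responses, difficulties, theta0=-0.5, theta1=0.5,
                  alpha=0.05, beta=0.05):
    """Wald's SPRT for a pass/fail decision between ability levels theta0
    (fail) and theta1 (pass); returns 'pass', 'fail', or 'continue'."""
    upper = math.log((1 - beta) / alpha)   # accept theta1 ("pass")
    lower = math.log(beta / (1 - alpha))   # accept theta0 ("fail")
    llr = 0.0
    for u, b in zip(responses, difficulties):
        p1, p0 = rasch_p(theta1, b), rasch_p(theta0, b)
        llr += math.log(p1 / p0) if u == 1 else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "pass"
        if llr <= lower:
            return "fail"
    return "continue"  # test not yet decisive; administer more items

# A run of correct answers on medium-difficulty items drives the ratio up.
decision = sprt_classify([1] * 12, [0.0] * 12)
```

The sequential nature is what makes such CCTs shorter on average than estimation-based CATs: testing stops as soon as the evidence is decisive.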

  13. Information Gain Based LDA Model for Short Text Classification

    Institute of Scientific and Technical Information of China (English)

    沈竞

    2011-01-01

    In this paper, short text classification based on LDA is improved by combining it with information gain. The method uses information gain to compute each word's contribution to text classification, raising the weight of discriminative words and filtering out non-discriminative ones. The filtered short texts are then modelled with LDA topic modelling, and a centroid vector method is used to build a model for each text category. Experiments show that, as the proportion of retained words is reduced, classification performance improves considerably.
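The information-gain criterion for scoring a word's contribution can be sketched as follows; the toy corpus is an assumption for illustration:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(docs, labels, term):
    """IG(term) = H(C) - P(t) H(C|t) - P(not t) H(C|not t), the criterion
    used to up-weight discriminative words before LDA modelling."""
    n = len(docs)
    classes = set(labels)
    h_c = entropy([labels.count(c) / n for c in classes])
    with_t = [y for d, y in zip(docs, labels) if term in d.split()]
    without_t = [y for d, y in zip(docs, labels) if term not in d.split()]
    cond = 0.0
    for subset in (with_t, without_t):
        if subset:
            cond += (len(subset) / n) * entropy(
                [subset.count(c) / len(subset) for c in classes])
    return h_c - cond

docs = ["goal match team", "score goal team",
        "stock match price", "market stock price"]
labels = ["sport", "sport", "finance", "finance"]
ig_goal = information_gain(docs, labels, "goal")    # only in "sport" docs
ig_match = information_gain(docs, labels, "match")  # once in each class
```

Here "goal" perfectly predicts the class (IG = 1 bit) while "match" appears once per class and carries no class information (IG = 0); thresholding on such scores is how low-contribution words are filtered out before topic modelling.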

  14. Using DRG to analyze hospital production: a re-classification model based on a linear tree-network topology

    Directory of Open Access Journals (Sweden)

    Achille Lanzarini

    2014-09-01

    Full Text Available Background: Hospital discharge records are widely classified through the Diagnosis Related Group (DRG) system; the version currently used in Italy counts 538 different codes, covering thousands of diagnoses and procedures. These numbers reflect a considerable effort of simplification, yet the current classification system is of little use for evaluating hospital production and performance. Methods: As the case-mix of a given Hospital Unit (HU) is driven by its physicians' specializations, a grouping of DRGs into a specialization-driven classification system has been conceived through the analysis of HU discharges and ICD-9-CM codes. We propose a three-fold classification, based on the analysis of 1,670,755 Hospital Discharge Cards (HDCs) produced by Lombardy hospitals in 2010; it consists of 32 specializations (e.g. Neurosurgery), 124 sub-specializations (e.g. skull surgery) and 337 sub-sub-specializations (e.g. craniotomy). Results: We give a practical application of the three-layered approach, based on the production of a Neurosurgical HU: a synthetic profile of production (1,305 hospital discharges for 79 different DRG codes of 16 different MDCs, grouped into a few groups of homogeneous DRG codes), a more informative production comparison (through process-specific comparisons, rather than crude or case-mix standardized comparisons) and a potentially more adequate production planning (the Neurosurgical HUs of the same city produce only a limited share of the whole neurosurgical production, because the same activity can be performed by non-Neurosurgical HUs). Conclusion: Our work may help to evaluate hospital production for a rational planning of available resources, blunting information asymmetries between physicians and managers.

  15. Knowledge-Based Classification in Automated Soil Mapping

    Institute of Scientific and Technical Information of China (English)

    ZHOU BIN; WANG RENCHAO

    2003-01-01

    A machine-learning approach was developed for automated building of knowledge bases for soil resources mapping by using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than using the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge base built by the machine-learning method was of good quality for mapping the distribution of soil classes over the study area.

  16. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Directory of Open Access Journals (Sweden)

    Le Li

    Full Text Available Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  17. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Science.gov (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  19. Texture Classification Based on Texton Features

    Directory of Open Access Journals (Sweden)

    U Ravi Babu

    2012-08-01

    Full Text Available Texture analysis plays an important role in the interpretation, understanding and recognition of terrain, biomedical and microscopic images. Each texture analysis method depends on how well the selected texture features characterize the image; whenever a new texture feature is derived, it must be tested to confirm that it classifies textures precisely. Not only the texture features themselves but also the way in which they are applied is significant for precise and accurate texture classification. To achieve high classification accuracy, the present paper proposes a new method based on textons for efficient, rotationally invariant texture classification. The proposed Texton Features (TF) evaluate the relationship between the values of neighboring pixels, and the proposed classification algorithm applies histogram-based techniques to the TF for precise classification. The experimental results on various stone textures indicate the efficacy of the proposed method when compared to other methods.
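The core mechanics the abstract relies on, comparing neighboring pixel values to form texton-like codes and then classifying by code histograms, can be sketched compactly. The 3-bit code below is a simplified stand-in for the paper's actual texton definition:

```python
import numpy as np

def texton_hist(img):
    """Histogram of simple 2x2 'texton' codes: each pixel is coded by
    whether its right, bottom, and diagonal neighbors exceed it."""
    a = img[:-1, :-1]
    code = ((img[:-1, 1:] > a).astype(int)
            + 2 * (img[1:, :-1] > a)
            + 4 * (img[1:, 1:] > a))
    return np.bincount(code.ravel(), minlength=8) / code.size

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
# Two synthetic textures: oriented stripes vs. unstructured noise.
stripes = np.sin(2 * np.pi * xx / 8) + rng.normal(0, 0.05, (64, 64))
blobs = rng.normal(size=(64, 64))

h1, h2 = texton_hist(stripes), texton_hist(blobs)
dist = np.abs(h1 - h2).sum()  # L1 distance between texture histograms
```

Histogram distances like `dist` are what a nearest-neighbor texture classifier would compare; structured and unstructured textures produce clearly different code histograms.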

  20. Fingerprint Classification based on Orientation Estimation

    Directory of Open Access Journals (Sweden)

    Manish Mathuria

    2013-06-01

    Full Text Available The geometric characteristics of an object make it distinguishable; objects present in the environment are known by their features and properties. A fingerprint image, viewed as an object, may be classified into subclasses based on its minutiae structure, and the minutiae structure may be categorized as ridge curves generated by orientation estimation. The extracted curves are invariant to location, rotation and scaling. This classification approach helps to manage fingerprints by their classes and supports data mining based on classification.
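The orientation estimation step the abstract builds on is classically computed from the image gradient tensor, averaged per block. A minimal sketch under that standard formulation (the block size and the synthetic "ridge" image are assumptions; the paper does not specify its estimator):

```python
import numpy as np

def orientation_field(img, block=8):
    """Estimate ridge orientation per block via gradient-tensor averaging:
    theta = 0.5 * arctan2(2*Gxy, Gxx - Gyy) over each block."""
    gy, gx = np.gradient(img.astype(float))
    h, w = img.shape
    theta = np.zeros((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            sl = (slice(i * block, (i + 1) * block),
                  slice(j * block, (j + 1) * block))
            gxx = (gx[sl] ** 2).sum()
            gyy = (gy[sl] ** 2).sum()
            gxy = (gx[sl] * gy[sl]).sum()
            theta[i, j] = 0.5 * np.arctan2(2 * gxy, gxx - gyy)
    return theta

# Synthetic "ridges": horizontal sinusoidal stripes.
yy, xx = np.mgrid[0:64, 0:64]
img = np.sin(yy / 3.0)
theta = orientation_field(img)
```

For horizontal stripes the gradient points vertically, so every block's estimated angle comes out at pi/2; real fingerprint ridges would yield a smoothly varying field from which ridge curves can be traced.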

  1. Fuzzy Rule Base System for Software Classification

    Directory of Open Access Journals (Sweden)

    Adnan Shaout

    2013-07-01

    Full Text Available Given the central role that software development plays in the delivery and application of information technology, managers have been focusing on process improvement in the software development area. This improvement has increased the demand for software measures, or metrics, to manage the process. These metrics provide a quantitative basis for the development and validation of models during the software development process. In this paper a fuzzy rule-based system is developed to classify Java applications using object-oriented (OO) metrics. The system contains the following features: an automated method to extract the OO metrics from the source code; a default/base set of rules that can be easily configured via an XML file, so that companies, developers, team leaders, etc., can modify the set of rules according to their needs; a framework so that new metrics, fuzzy sets and fuzzy rules can be added or removed depending on the needs of the end user; general classification of the software application and fine-grained classification of the Java classes based on OO metrics; and two interfaces to the system: a GUI and a command line.
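A tiny sketch of how a fuzzy rule base over OO metrics can drive classification. The metric names (WMC, CBO), membership ranges, and class labels are all hypothetical; only the mechanism (triangular memberships, min for AND, comparing aggregated rule strengths) reflects standard fuzzy-rule practice:

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify(wmc, cbo):
    """Classify a class as 'complex' or 'simple' from two illustrative
    OO metrics via two Mamdani-style rules (min = AND)."""
    high_wmc = tri(wmc, 10, 25, 40)
    high_cbo = tri(cbo, 5, 12, 20)
    low_wmc = tri(wmc, -1, 0, 12)
    low_cbo = tri(cbo, -1, 0, 6)
    complex_deg = min(high_wmc, high_cbo)   # IF wmc high AND cbo high
    simple_deg = min(low_wmc, low_cbo)      # IF wmc low AND cbo low
    return "complex" if complex_deg > simple_deg else "simple"
```

In the paper's design these fuzzy sets and rules would come from the configurable XML file rather than being hard-coded.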

  2. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, the different kinds of modification are treated as classes, and a classification algorithm is used to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  3. Classification using Hierarchical Naive Bayes models

    DEFF Research Database (Denmark)

    Langseth, Helge; Dyhre Nielsen, Thomas

    2006-01-01

    Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe an...... instance are conditionally independent given the class of that instance. When this assumption is violated (which is often the case in practice) it can reduce classification accuracy due to “information double-counting” and interaction omission. In this paper we focus on a relatively new set of models...... context of classification. Experimental results show that the learned models can significantly improve classification accuracy as compared to other frameworks....

  4. Support vector classification algorithm based on variable parameter linear programming

    Institute of Scientific and Technical Information of China (English)

    Xiao Jianhua; Lin Jian

    2007-01-01

    To solve the problems of SVM in dealing with large sample sizes and asymmetrically distributed samples, a support vector classification algorithm based on variable parameter linear programming is proposed. In the proposed algorithm, linear programming is employed to solve the optimization problem of classification, decreasing the computation time and reducing the complexity compared with the original model. The adjusted punishment parameter greatly reduces the classification error resulting from asymmetrically distributed samples, and the detailed procedure of the proposed algorithm is given. An experiment is conducted to verify whether the proposed algorithm is suitable for asymmetrically distributed samples.
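To make the idea of solving the classification problem as a linear program concrete, here is a sketch of the well-known 1-norm (LP) soft-margin formulation, solved with `scipy.optimize.linprog`. This is a generic LP-SVM, not the paper's exact variable-parameter model, and the penalty `C` stands in for its adjustable punishment parameter:

```python
import numpy as np
from scipy.optimize import linprog

def lp_svm(X, y, C=1.0):
    """1-norm soft-margin linear classifier as a linear program.

    Variables: w = u - v (u, v >= 0), b = bp - bn, slacks xi >= 0.
    Minimize sum(u + v) + C * sum(xi)
    s.t.     y_i (w . x_i + b) >= 1 - xi_i  for every sample i.
    """
    n, d = X.shape
    c = np.concatenate([np.ones(2 * d), [0.0, 0.0], C * np.ones(n)])
    # Rewrite each margin constraint as A_ub @ z <= -1.
    A = np.hstack([-y[:, None] * X, y[:, None] * X,
                   -y[:, None], y[:, None], -np.eye(n)])
    res = linprog(c, A_ub=A, b_ub=-np.ones(n), method="highs")
    z = res.x
    w = z[:d] - z[d:2 * d]
    b = z[2 * d] - z[2 * d + 1]
    return w, b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
w, b = lp_svm(X, y)
pred = np.sign(X @ w + b)
```

Because the objective and constraints are all linear, the LP solver replaces the quadratic program of a standard SVM, which is exactly the source of the speed-up the abstract claims.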

  5. Knowledge discovery from patients' behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services.

    Science.gov (United States)

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build a successful customer relationship management (CRM) system to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information, and data mining techniques can be used to analyze this data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public sector hospital in Iran, with the idea that there is a contrast between patient and customer loyalty, to estimate customer lifetime value (CLV) for each patient. We used Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and find out target, potential and loyal customers in order to implement a stronger CRM. Two approaches are used for classification: first, the result of clustering is considered as the decision attribute in the classification process; second, the result of segmentation based on the CLV value of patients (estimated by RFML) is considered as the decision attribute. Finally, the results of the CHAID algorithm show the significant hidden rules and identify existing patterns of hospital consumers. PMID:27610177
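The first of the two approaches above (cluster labels becoming the decision attribute of a tree) can be sketched briefly. The RFML values are synthetic, and scikit-learn has no CHAID, so a CART `DecisionTreeClassifier` is used as a stand-in:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Hypothetical RFML table: Recency, Frequency, Monetary, Length per patient.
rfml = np.column_stack([
    rng.integers(1, 365, 300),    # days since last visit
    rng.integers(1, 30, 300),     # number of visits
    rng.uniform(10, 5000, 300),   # total spend
    rng.integers(1, 3650, 300),   # days since first visit
]).astype(float)
z = (rfml - rfml.mean(0)) / rfml.std(0)  # standardize before clustering

segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(z)
# Cluster membership becomes the decision attribute of the tree,
# which then yields readable segmentation rules over the raw metrics.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(rfml, segments)
```

The tree's paths approximate the cluster boundaries as if-then rules over R, F, M and L, which is the "hidden rules" output the abstract refers to.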

  7. A Comparison of Computer-Based Classification Testing Approaches Using Mixed-Format Tests with the Generalized Partial Credit Model

    Science.gov (United States)

    Kim, Jiseon

    2010-01-01

    Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of…

  8. Iris Image Classification Based on Hierarchical Visual Codebook.

    Science.gov (United States)

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition as a reliable method for personal identification has been well-studied with the objective of assigning the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image into an application-specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), and coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT) and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantage of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection. PMID:26353275

  9. Texture Classification based on Gabor Wavelet

    Directory of Open Access Journals (Sweden)

    Amandeep Kaur

    2012-07-01

    Full Text Available This paper presents a comparison of texture classification algorithms based on Gabor wavelets, focusing on the feature extraction scheme for texture classification. The texture features for an image can be computed using texture descriptors; in this paper we have used the Homogeneous Texture Descriptor, which is based on Gabor wavelets. For texture classification, we have used an online texture database, Brodatz's database, and three well-known classifiers: Support Vector Machine, the K-nearest neighbor method and the decision tree induction method. The results show that classification using Support Vector Machines gives better results than the other classifiers, as it can accurately discriminate between the testing image data and the training data.
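The Gabor feature extraction step that all three classifiers share can be sketched with a hand-rolled filter bank (avoiding any particular image library). The filter parameters and the per-orientation mean/std features are common conventions, assumed here rather than taken from the paper:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(theta, freq=0.25, sigma=3.0, size=15):
    """Real part of a Gabor filter at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr)

def gabor_features(img, n_orient=4):
    """Mean/std of filter-bank responses: a simple homogeneous texture
    descriptor (2 features per orientation)."""
    feats = []
    for k in range(n_orient):
        resp = fftconvolve(img, gabor_kernel(k * np.pi / n_orient),
                           mode="valid")
        feats += [resp.mean(), resp.std()]
    return np.array(feats)

# Two synthetic textures: vertical vs. horizontal stripes of period 4.
yy, xx = np.mgrid[0:64, 0:64]
f_v = gabor_features(np.sin(2 * np.pi * xx / 4))
f_h = gabor_features(np.sin(2 * np.pi * yy / 4))
```

The resulting feature vectors are what would be fed to the SVM, K-NN, or decision tree; orientation-tuned responses are what make the two stripe textures separable.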

  10. Latent Classification Models for Binary Data

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2009-01-01

    class of that instance. To relax this independence assumption, we have in previous work proposed a family of models, called latent classification models (LCMs). LCMs are defined for continuous domains and generalize the naive Bayes model by using latent variables to model class-conditional dependencies...... between the attributes. In addition to providing good classification accuracy, the LCM model has several appealing properties, including a relatively small parameter space making it less susceptible to over-fitting. In this paper we take a first-step towards generalizing LCMs to hybrid domains, by...

  11. Enhanced water level model in image classification

    Science.gov (United States)

    Deng, Shangrong; Qian, Kai; Hung, Chih-Cheng

    2005-05-01

    The water-level model is an effective method for density-based classification. We use biased sampling, local similarity and popularity as preprocessing, and employ a merging operation in the water-level model for classification. Biased sampling is used to obtain information about the global structure, while similarity and local density are mainly used to understand the local structure. In biased sampling, images are divided into many l x l patches and a sample pixel is selected from each patch. Similarity at a point p, denoted by sim(p), measures the change of gray level between point p and its neighborhood N(p). Besides using biased sampling to combine spectral and spatial information, we use similarity and local popularity in selecting sample points: a sample point is chosen based on the minimum value of sim(p) + [1-P(p)] after normalization. The selected pixel is a better representative, especially near the border of an object. To make the method more effective, one has to deal with small spikes and bumps. To get rid of the small spikes, we establish the threshold |[f(P1)-f(P2)]*(P1-P2)| > c*l*l, where c is a constant, P1 is a local maximum point to be tested and P2 is the nearest local minimum from P1. The condition is only related to the size of the patches, l x l. The merging operation we include in the model makes the threshold constant less sensitive in the process. DBScan is combined with the enhanced water-level model to reduce noise and to obtain connected components. Preliminary experiments have been conducted using the proposed methods and the results are promising.

  12. Identification of new events in Apollo 16 lunar seismic data by Hidden Markov Model-based event detection and classification

    Science.gov (United States)

    Knapmeyer-Endrun, Brigitte; Hammer, Conny

    2015-10-01

    Detection and identification of interesting events in single-station seismic data with little prior knowledge and under tight time constraints is a typical scenario in planetary seismology. The Apollo lunar seismic data, with the only confirmed events recorded on any extraterrestrial body yet, provide a valuable test case. Here we present the application of a stochastic event detector and classifier to the data of station Apollo 16. Based on a single-waveform example for each event class and some hours of background noise, the system is trained to recognize deep moonquakes, impacts, and shallow moonquakes and performs reliably over 3 years of data. The algorithm's demonstrated ability to detect rare events and flag previously undefined signal classes as new event types is of particular interest in the analysis of the first seismic recordings from a completely new environment. We are able to classify more than 50% of previously unclassified lunar events, and additionally find over 200 new events not listed in the current lunar event catalog. These events include deep moonquakes as well as impacts and could be used to update studies on temporal variations in event rate or deep moonquakes stacks used in phase picking for localization. No unambiguous new shallow moonquake was detected, but application to data of the other Apollo stations has the potential for additional new discoveries 40 years after the data were recorded. Besides, the classification system could be useful for future seismometer missions to other planets, e.g., the InSight mission to Mars.
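The detection-and-classification machinery described above rests on hidden Markov models. A self-contained sketch of the core inference step, Viterbi decoding of a two-state (noise vs. event) HMM over a synthetic seismic-like trace; the state variances, transition probabilities, and trace are illustrative assumptions, not values from the Apollo analysis:

```python
import numpy as np

def viterbi(obs_loglik, log_trans, log_init):
    """Most likely state path for a discrete-state HMM (log domain)."""
    T, S = obs_loglik.shape
    delta = log_init + obs_loglik[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # (from_state, to_state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + obs_loglik[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Toy single-station trace: unit-variance noise with one high-variance "event".
rng = np.random.default_rng(0)
x = rng.normal(0, 1, 300)
x[100:180] = rng.normal(0, 5, 80)
# State 0 = background (sigma 1), state 1 = event (sigma 5); sticky transitions.
sig = np.array([1.0, 5.0])
obs = -0.5 * (x[:, None] / sig) ** 2 - np.log(sig)
path = viterbi(obs, np.log([[0.99, 0.01], [0.01, 0.99]]),
               np.log([0.9, 0.1]))
```

Sticky transition probabilities keep the decoded path from flickering on single noisy samples, which is what lets such a detector flag coherent event intervals rather than isolated outliers.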

  13. Full-polarization radar remote sensing and data mining for tropical crops mapping: a successful SVM-based classification model

    Science.gov (United States)

    Denize, J.; Corgne, S.; Todoroff, P.; LE Mezo, L.

    2015-12-01

    In Reunion, a tropical island of 2,512 km², 700 km east of Madagascar in the Indian Ocean and constrained by a rugged relief, agricultural sectors are competing for highly fragmented agricultural land constituted by heterogeneous farming systems, from corporate to small-scale farming. Policymakers, planners and institutions are in dire need of reliable and updated land use references. In practice, conventional land use mapping methods are inefficient in the tropics, with frequent cloud cover and loosely synchronous vegetative cycles of the crops due to a constant temperature. This study aims to provide an appropriate method for the identification and mapping of tropical crops by remote sensing. For this purpose, we assess the potential of polarimetric SAR imagery associated with machine learning algorithms. The method has been developed and tested on a study area of 25*25 km using 6 RADARSAT-2 images from 2014 in full polarization. A set of radar indicators (backscatter coefficient, band ratios, indices, polarimetric decompositions (Freeman-Durden, Van Zyl, Yamaguchi, Cloude and Pottier, Krogager), texture, etc.) was calculated from the coherency matrix. A random forest procedure allowed the selection of the most important variables of each image to reduce the dimension of the dataset and the processing time. Support Vector Machines (SVM) allowed the classification of these indicators based on a learning database created from field observations in 2013. The method shows an overall accuracy of 88% with a Kappa index of 0.82 for the identification of four major crops.
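The two-stage pipeline (random forest for variable selection, then an SVM on the retained indicators) can be sketched on synthetic data. The "30 radar indicators" and the indices carrying signal are invented here purely to exercise the mechanism:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical stack of 30 radar indicators per pixel; only 3 carry signal.
X = rng.normal(size=(400, 30))
y = (X[:, 0] + X[:, 5] - X[:, 9] > 0).astype(int)

# Stage 1: random forest ranks the indicators by importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:5]  # keep 5 best indicators

# Stage 2: SVM classifies on the reduced indicator set.
svm = SVC(kernel="rbf").fit(X[:, top], y)
```

Dropping the uninformative indicators before the SVM is what shrinks both the dataset dimension and the processing time, as the abstract notes.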

  14. Improved classification of lung cancer tumors based on structural and physicochemical properties of proteins using data mining models.

    Directory of Open Access Journals (Sweden)

    R Geetha Ramani

    Full Text Available Detecting divergence between oncogenic tumors plays a pivotal role in cancer diagnosis and therapy. This research work was focused on designing a computational strategy to predict the class of lung cancer tumors from the structural and physicochemical properties (1497 attributes) of protein sequences obtained from genes defined by microarray analysis. The proposed methodology involved the use of hybrid feature selection techniques (gain ratio and correlation based subset evaluators) with Incremental Feature Selection, followed by Bayesian Network prediction to discriminate lung cancer tumors as Small Cell Lung Cancer (SCLC), Non-Small Cell Lung Cancer (NSCLC) and the COMMON classes. Moreover, this methodology eliminated the need for extensive data cleansing strategies on the protein properties and revealed the optimal and minimal set of features that contributed to lung cancer tumor classification with an improved accuracy compared to previous work. We also attempted to predict via supervised clustering the possible clusters in the lung tumor data. Our results revealed that supervised clustering algorithms exhibited poor performance in differentiating the lung tumor classes. Hybrid feature selection identified the distribution of solvent accessibility, polarizability and hydrophobicity as the highest ranked features, with Incremental Feature Selection and Bayesian Network prediction generating the optimal Jack-knife cross-validation accuracy of 87.6%. Precise categorization of oncogenic genes causing SCLC and NSCLC based on the structural and physicochemical properties of their protein sequences is expected to unravel the functionality of proteins that are essential in maintaining the genomic integrity of a cell and also act as an informative source for drug design, targeting essential protein properties and their composition that are found to exist in lung cancer tumors.
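The shape of this pipeline, filter-style feature selection over ~1500 protein descriptors followed by a Bayesian classifier, can be sketched with scikit-learn. Mutual information replaces the paper's gain-ratio/correlation evaluators, and Gaussian naive Bayes stands in for the Bayesian network; the data and the two "signal" descriptors are entirely synthetic:

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
# Hypothetical table: 1497 structural/physicochemical descriptors for 120
# protein sequences labeled SCLC / NSCLC / COMMON (encoded 0 / 1 / 2).
X = rng.normal(size=(120, 1497))
y = rng.integers(0, 3, 120)
X[:, 10] += 2.0 * y    # plant class signal in two descriptors
X[:, 200] -= 2.0 * y

pipe = Pipeline([
    ("select", SelectKBest(partial(mutual_info_classif, random_state=0),
                           k=20)),
    ("nb", GaussianNB()),  # simple stand-in for the Bayesian network
]).fit(X, y)
```

Filtering down to a small descriptor subset before the probabilistic classifier is what makes a 1497-attribute problem tractable with only ~100 sequences.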

  15. Risk-based classification system of nanomaterials

    International Nuclear Information System (INIS)

    Various stakeholders are increasingly interested in the potential toxicity and other risks associated with nanomaterials throughout the different stages of a product's life cycle (e.g., development, production, use, disposal). Risk assessment methods and tools developed and applied to chemical and biological materials may not be readily adaptable for nanomaterials because of the current uncertainty in identifying the relevant physico-chemical and biological properties that adequately describe the materials. Such uncertainty is further driven by the substantial variations in the properties of the original material due to variable manufacturing processes employed in nanomaterial production. To guide scientists and engineers in nanomaterial research and application as well as to promote the safe handling and use of these materials, we propose a decision support system for classifying nanomaterials into different risk categories. The classification system is based on a set of performance metrics that measure both the toxicity and physico-chemical characteristics of the original materials, as well as the expected environmental impacts through the product life cycle. Stochastic multicriteria acceptability analysis (SMAA-TRI), a formal decision analysis method, was used as the foundation for this task. This method allowed us to cluster various nanomaterials in different ecological risk categories based on our current knowledge of nanomaterial physico-chemical characteristics, variation in produced material, and best professional judgments. SMAA-TRI uses Monte Carlo simulations to explore all feasible values for weights, criteria measurements, and other model parameters to assess the robustness of nanomaterial grouping for risk management purposes.

  16. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  17. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using deep belief network (DBN). DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer by layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with testing sensory dataset. Health diagnosis using DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications: aircraft engine health diagnosis and electric power transformer health diagnosis are employed to demonstrate the efficacy of the proposed approach
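The layer-by-layer pretraining structure described above can be approximated with scikit-learn's `BernoulliRBM`, stacking two restricted Boltzmann machines before a supervised read-out. The digits dataset stands in for multi-sensor health data, and all hyperparameters are illustrative guesses rather than the paper's settings:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

# Stand-in for preprocessed sensory data: digits images scaled to [0, 1].
X, y = load_digits(return_X_y=True)
X = X / 16.0

# Greedy layer-by-layer pretraining (two stacked RBMs), then a
# supervised classification layer for the predefined "health states".
dbn = Pipeline([
    ("rbm1", BernoulliRBM(n_components=64, learning_rate=0.05,
                          n_iter=15, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=32, learning_rate=0.05,
                          n_iter=15, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)
```

Each RBM is trained on the previous layer's activations, mirroring the successive learning process the abstract describes; validation on held-out sensory data would follow as its third stage.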

  18. Malware Classification based on Call Graph Clustering

    OpenAIRE

    Kinable, Joris; Kostakis, Orestis

    2010-01-01

    Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations...

  19. Bayesian modeling and classification of neural signals

    OpenAIRE

    Lewicki, Michael S.

    1994-01-01

    Signal processing and classification algorithms often have limited applicability resulting from an inaccurate model of the signal's underlying structure. We present here an efficient, Bayesian algorithm for modeling a signal composed of the superposition of brief, Poisson-distributed functions. This methodology is applied to the specific problem of modeling and classifying extracellular neural waveforms which are composed of a superposition of an unknown number of action potentials (APs). ...

  20. Image-based Vehicle Classification System

    CERN Document Server

    Ng, Jun Yee

    2012-01-01

    The electronic toll collection (ETC) system has become a common approach to toll collection on toll roads nowadays. The implementation of electronic toll collection allows vehicles to travel at low or full speed during toll payment, which helps to avoid traffic delays at the toll road. One of the major components of an electronic toll collection system is the automatic vehicle detection and classification (AVDC) system, which is important for classifying the vehicle so that the toll is charged according to the vehicle class. A vision-based vehicle classification system is one type of vehicle classification system, adopting a camera as the input sensing device. This type of system has an advantage over the rest in that it is cost efficient, since a low-cost camera is used. The implementation of a vision-based vehicle classification system requires a lower initial investment cost and is very suitable for the toll collection trend migration in Malaysia from a single ETC system to full-scale multi-lane free flow (MLFF). This project ...

  1. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    Science.gov (United States)

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI) were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The three best classification tree models, i.e. those with the lowest misclassification error (ME) and the fewest nodes (N), are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39). 
The produced SOC maps at 1:50,000 cartographic scale using these trees are highly matching with coincidence values equal to 90.5% (Map T1

  2. Robust Model Selection for Classification of Microarrays

    Directory of Open Access Journals (Sweden)

    Ikumi Suzuki

    2009-01-01

    Full Text Available Recently, microarray-based cancer diagnosis systems have been increasingly investigated. However, cost reduction and reliability assurance of such diagnosis systems remain problems in real clinical settings. To reduce the cost, we need a supervised classifier involving the smallest number of genes, as long as the classifier is sufficiently reliable. To achieve a reliable classifier, we should assess candidate classifiers and select the best one. In the selection of the best classifier, however, the assessment criterion involves large variance because of the limited number of samples and non-negligible observation noise. Therefore, even if a classifier with a very small number of genes exhibited the smallest leave-one-out cross-validation (LOO) error rate, it would not necessarily be reliable, because classifiers based on a small number of genes tend to show large variance. We propose a robust model selection criterion, the min-max criterion, based on a resampling bootstrap simulation to assess the variance of the estimated classification error rates. We applied our assessment framework to four published real gene expression datasets and one synthetic dataset. We found that a state-of-the-art procedure, weighted voting classifiers with the LOO criterion, had a non-negligible risk of selecting extremely poor classifiers and, on the other hand, that the new min-max criterion could eliminate that risk. These findings suggest that our criterion provides a safer procedure for designing a practical cancer diagnosis system.
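The min-max idea above can be sketched numerically: estimate the bootstrap distribution of each candidate classifier's error rate, then pick the classifier whose worst bootstrap error is smallest. This is a minimal sketch under assumptions — the paper's exact resampling scheme and the 0/1 per-sample error encoding used here are illustrative, not the authors' implementation:

```python
import numpy as np

def minmax_select(errors_per_classifier, n_boot=1000, seed=0):
    """Select the classifier whose worst-case (max) bootstrap error is smallest.

    errors_per_classifier: one 0/1 array per candidate classifier, where
    entry i is 1 if sample i was misclassified (e.g. in LOO cross-validation).
    """
    rng = np.random.default_rng(seed)
    worst = []
    for err in errors_per_classifier:
        err = np.asarray(err)
        n = len(err)
        # bootstrap distribution of the error rate: resample samples with replacement
        idx = rng.integers(0, n, size=(n_boot, n))
        rates = err[idx].mean(axis=1)
        worst.append(rates.max())           # worst error over bootstrap replicates
    return int(np.argmin(worst)), worst     # min-max choice over classifiers
```

A classifier with a slightly higher mean error but low variance can beat one with a low mean but high variance, which is exactly the risk the criterion guards against.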

  3. A new circulation type classification based upon Lagrangian air trajectories

    Science.gov (United States)

    Ramos, Alexandre; Sprenger, Michael; Wernli, Heini; Durán-Quesada, Ana María; Lorenzo, Maria Nieves; Gimeno, Luis

    2014-10-01

    A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is briefly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.
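The cluster step can be illustrated with a plain k-means over flattened trajectory coordinates. This is only a schematic stand-in for the paper's cluster analysis; the farthest-point initialisation and the flattened (lon, lat) trajectory representation are assumptions of this sketch, not the study's configuration:

```python
import numpy as np

def kmeans_trajectories(X, k, n_iter=50):
    """Plain k-means; X is (n_traj, n_points * 2), each row a flattened trajectory."""
    X = np.asarray(X, dtype=float)
    # farthest-point initialisation: deterministic and well spread out
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # assign each trajectory to the nearest air-stream centre
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels, centers
```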

  4. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is briefly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  5. Classification of LiDAR Data with Point Based Classification Methods

    Science.gov (United States)

    Yastikli, N.; Cetin, Z.

    2016-06-01

    LiDAR is one of the most effective systems for 3-dimensional (3D) data collection over wide areas. Nowadays, airborne LiDAR data are used frequently in various applications, such as object extraction, 3D modelling, change detection and map revision, with increasing point density and accuracy. The classification of LiDAR points is the first step of the LiDAR data processing chain and should be handled properly, since applications such as 3D city modelling, building extraction and DEM generation directly use the classified point clouds. Different classification methods can be found in recent research, and most studies work with a gridded LiDAR point cloud. In grid-based processing of LiDAR data, the loss of characteristic points, especially for vegetation and buildings, or the loss of height accuracy during the interpolation stage, is inevitable. In this case, a possible solution is to use the raw point cloud data for classification, avoiding the data and accuracy losses of the gridding process. In this study, the point-based classification possibilities of the LiDAR point cloud are investigated to obtain more accurate classes. Automatic point-based approaches, based on hierarchical rules, are proposed to derive ground, building and vegetation classes from the raw LiDAR point cloud. In the proposed approaches, every single LiDAR point is analyzed according to features such as height and multi-return, and is then automatically assigned to the class to which it belongs. The use of the un-gridded point cloud in the proposed point-based classification process helped in determining more realistic rule sets. Detailed parameter analyses were performed to obtain the most appropriate parameters in the rule sets to achieve accurate classes. Hierarchical rule sets were created for the proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features
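Hierarchical rules of this kind can be sketched as simple per-point tests on height above terrain and echo count. The thresholds and rule order below are illustrative assumptions, not the parameters tuned in the study:

```python
import numpy as np

GROUND, BUILDING, VEGETATION = 0, 1, 2

def classify_points(height, n_returns, h_ground=0.5):
    """Rule-based classification of raw (un-gridded) LiDAR points.

    height    : height above local terrain (m)
    n_returns : number of echoes per pulse (multiple returns hint at vegetation)
    """
    height = np.asarray(height, dtype=float)
    n_returns = np.asarray(n_returns)
    cls = np.full(height.shape, BUILDING)
    cls[height < h_ground] = GROUND                            # rule 1: near-terrain points
    cls[(height >= h_ground) & (n_returns > 1)] = VEGETATION   # rule 2: porous canopy
    return cls
```

In practice each rule would be refined with further spatial features (planarity, local roughness), which is exactly the parameter analysis the abstract describes.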

  6. Movie Review Classification and Feature based Summarization of Movie Reviews

    Directory of Open Access Journals (Sweden)

    Sabeeha Mohammed Basheer, Syed Farook

    2013-07-01

    Full Text Available Sentiment classification and feature-based summarization are essential steps in the classification and summarization of movie reviews. The movie review classification is based on sentiment classification, and condensed descriptions of movie reviews are generated from the feature-based summarization. Experiments are conducted to identify the best machine-learning-based sentiment classification approach. Latent Semantic Analysis and Latent Dirichlet Allocation were compared to identify features, which in turn affect the summary size. The focus of the system design is on classification accuracy and system response time.

  7. Online Network Traffic Classification Algorithm Based on RVM

    Directory of Open Access Journals (Sweden)

    Zhang Qunhui

    2013-06-01

    Full Text Available Compared with the Support Vector Machine (SVM), the Relevance Vector Machine (RVM) avoids the over-learning that is characteristic of the SVM, greatly reduces the amount of kernel-function computation, and sidesteps SVM drawbacks such as weak sparsity, a large amount of calculation, the requirement that the kernel function satisfy Mercer's condition, and parameters that must be determined empirically. We therefore propose a new online traffic classification algorithm based on the RVM. After analysing the basic principles of the RVM and the steps of its modeling, we use the trained RVM traffic classification model, together with "port number + DPI", to identify network traffic in real time. When the RVM predicted probability falls in the query interval, we jointly use the port number and DPI. Finally, a detailed experimental validation shows that, compared with the SVM network traffic classification algorithm, this algorithm achieves online network traffic classification, and the classification prediction probability is greatly improved.
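The joint "RVM + port number + DPI" decision can be sketched as follows. RVM implementations are not part of mainstream libraries, so `rvm_proba` stands in for the trained model's predicted probability; the class labels, query interval, port table and payload signatures are all illustrative assumptions:

```python
def classify_flow(rvm_proba, port, payload,
                  query_interval=(0.4, 0.6),
                  port_table={80: "http", 443: "https"},
                  signatures={b"GET ": "http", b"\x16\x03": "tls"}):
    """Hybrid decision: trust the RVM unless its probability is ambiguous.

    rvm_proba : P(positive class) from a trained RVM (stand-in here);
    the "p2p"/"web" labels are placeholders for the two traffic classes.
    """
    lo, hi = query_interval
    if not (lo <= rvm_proba <= hi):
        return "p2p" if rvm_proba > hi else "web"   # decisive RVM output
    # ambiguous probability: fall back to port number + deep packet inspection
    for sig, proto in signatures.items():
        if payload.startswith(sig):
            return proto
    return port_table.get(port, "unknown")
```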

  8. Mechanism-based drug exposure classification in pharmacoepidemiological studies

    NARCIS (Netherlands)

    Verdel, B.M.

    2010-01-01

    Mechanism-based classification of drug exposure in pharmacoepidemiological studies In pharmacoepidemiology and pharmacovigilance, the relation between drug exposure and clinical outcomes is crucial. Exposure classification in pharmacoepidemiological studies is traditionally based on pharmacotherapeu

  9. Module-Based Breast Cancer Classification

    OpenAIRE

    Zhang, Yuji; Xuan, Jianhua; Clarke, Robert; Ressom, Habtom W

    2013-01-01

    The reliability and reproducibility of gene biomarkers for classification of cancer patients has been challenged due to measurement noise and biological heterogeneity among patients. In this paper, we propose a novel module-based feature selection framework, which integrates biological network information and gene expression data to identify biomarkers not as individual genes but as functional modules. Results from four breast cancer studies demonstrate that the identified module biomarkers i...

  10. Contextual Deep CNN Based Hyperspectral Classification

    OpenAIRE

    Lee, Hyungtae; Kwon, Heesung

    2016-01-01

    In this paper, we describe a novel deep convolutional neural networks (CNN) based approach called contextual deep CNN that can jointly exploit spatial and spectral features for hyperspectral image classification. The contextual deep CNN first concurrently applies multiple 3-dimensional local convolutional filters with different sizes jointly exploiting spatial and spectral features of a hyperspectral image. The initial spatial and spectral feature maps obtained from applying the variable size...

  11. Porcelain shard image classification based on the Gaussian color model

    Institute of Scientific and Technical Information of China (English)

    郑霞; 胡浩基; 周明全; 樊亚春

    2012-01-01

    Since the RGB color space does not closely match human visual perception and cannot describe spatial structure, the Gaussian color model, which integrates spatial and color information in one model, is used to obtain more complete image features. A color-texture approach based on the Gaussian color model and a multi-scale filter bank is introduced to classify porcelain shard images. First, the RGB color space of the image is transformed into the Gaussian color model, and the normalized multi-scale LM filter bank is then used to construct filtered images on the three channels. Afterwards, the primary feature images are found using principal component analysis, and the maximum Laplacian-of-Gaussian and maximum Gaussian responses are selected for each channel. These images compose a feature image set from which the feature parameters are extracted. Finally, a support vector machine is used as the classifier for learning and classification. Experimental results show that, compared with grayscale-based, RGB-based and RGB bior4.4 wavelet-based methods, the proposed method achieves better classification results, with an accuracy of 96.7% on the Outex texture image database and 94.2% on the porcelain shard image set. The method can be extended to other color texture classification tasks.
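The first step, converting RGB to the Gaussian color model, is a fixed linear transform in the commonly used approximation of Geusebroek et al.; treat the exact coefficients below as an assumption of this sketch:

```python
import numpy as np

# Linear RGB -> Gaussian color model (E, El, Ell): one intensity-like channel
# and two chromatic derivatives. Coefficients as commonly cited from
# Geusebroek et al.; the precise values are an assumption here.
M = np.array([[0.06,  0.63,  0.27],
              [0.30,  0.04, -0.35],
              [0.34, -0.60,  0.17]])

def rgb_to_gaussian(img):
    """img: (..., 3) RGB array -> (..., 3) Gaussian color model channels."""
    return np.asarray(img, dtype=float) @ M.T
```

A quick sanity check: an achromatic pixel should land almost entirely in the first (intensity) channel, with near-zero chromatic responses.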

  12. Uncovering Arabidopsis membrane protein interactome enriched in transporters using mating-based split ubiquitin assays and classification models

    Directory of Open Access Journals (Sweden)

    Jin eChen

    2012-06-01

    Full Text Available High-throughput data are a double-edged sword; the benefit of a large amount of data comes with an associated cost of noise. To increase the reliability and scalability of high-throughput protein interaction data generation, we tested the efficacy of classification to enrich potential protein-protein interactions (pPPIs). We applied this method to identify interactions among Arabidopsis membrane proteins enriched in transporters. We validated our method with multiple retests. Classification improved the quality of the ensuing interaction network and was effective in reducing the search space and increasing the true positive rate. The final network of 541 interactions among 239 proteins (of which 179 are transporters) is the first protein interaction network enriched in membrane transporters reported for any organism. This network has topological attributes similar to those of other published protein interaction networks. It also extends and fills gaps in currently available biological networks in plants and allows building a number of hypotheses about processes and mechanisms involving signal-transduction and transport systems.

  13. Hierarchical Real-time Network Traffic Classification Based on ECOC

    Directory of Open Access Journals (Sweden)

    Yaou Zhao

    2013-09-01

    Full Text Available Classification of network traffic is basic and essential for many network research and management tasks. With the rapid development of peer-to-peer (P2P) applications using dynamic port disguising techniques and encryption to avoid detection, port-based and simple payload-based network traffic classification methods have been diminished. Alternative methods based on statistics and machine learning have attracted researchers' attention in recent years. However, most of the proposed algorithms are off-line and usually use a single classifier. In this paper a new hierarchical real-time model is proposed, comprising a three-tuple (source IP, destination IP and destination port) look-up table (TT-LUT) part and a layered milestone part. The TT-LUT is used to quickly classify short flows, which need not pass through the layered milestone part, while the milestones in the layered milestone part classify the other flows in real time with real-time feature selection and statistics. Every milestone is an ECOC (Error-Correcting Output Codes) based model, used to improve classification performance. Experiments show that the proposed model can improve real-time efficiency to 80%, and the multi-class classification accuracy encouragingly to 91.4%, on datasets captured from the backbone router of our campus over a week.

  14. Collaborative Representation based Classification for Face Recognition

    CERN Document Server

    Zhang, Lei; Feng, Xiangchu; Ma, Yi; Zhang, David

    2012-01-01

    By coding a query sample as a sparse linear combination of all training samples and then classifying it by evaluating which class leads to the minimal coding residual, sparse representation based classification (SRC) leads to interesting results for robust face recognition. It is widely believed that the l1-norm sparsity constraint on coding coefficients plays a key role in the success of SRC, while its use of all training samples to collaboratively represent the query sample is rather ignored. In this paper we discuss how SRC works, and show that the collaborative representation mechanism used in SRC is much more crucial to its success in face classification. The SRC is a special case of collaborative representation based classification (CRC), which has various instantiations by applying different norms to the coding residual and coding coefficient. More specifically, the l1 or l2 norm characterization of coding residual is related to the robustness of CRC to outlier facial pixels, while the l1 or l2 norm c...
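The l2-regularised instantiation of CRC can be sketched in a few lines: code the query over all training samples with ridge regression, then assign the class whose portion of the code gives the smallest reconstruction residual. The regularisation value and toy data layout below are illustrative:

```python
import numpy as np

def crc_classify(X, labels, y, lam=1e-3):
    """Collaborative representation classification with l2-regularised coding.

    X      : (d, n) matrix of training samples as columns
    labels : length-n class label per column
    y      : query vector of length d
    """
    d, n = X.shape
    # ridge-regularised coding: alpha = (X^T X + lam I)^(-1) X^T y
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    best, best_r = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        r = np.linalg.norm(y - X[:, mask] @ alpha[mask])  # class-wise residual
        if r < best_r:
            best, best_r = c, r
    return best
```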

  15. Texture feature based liver lesion classification

    Science.gov (United States)

    Doron, Yeela; Mayer-Wolf, Nitzan; Diamant, Idit; Greenspan, Hayit

    2014-03-01

    Liver lesion classification is a difficult clinical task. Computerized analysis can support clinical workflow by enabling more objective and reproducible evaluation. In this paper, we evaluate the contribution of several types of texture features for a computer-aided diagnostic (CAD) system which automatically classifies liver lesions from CT images. Based on the assumption that liver lesions of various classes differ in their texture characteristics, a variety of texture features were examined as lesion descriptors. Although texture features are often used for this task, there is currently a lack of detailed research focusing on the comparison across different texture features, or their combinations, on a given dataset. In this work we investigated the performance of Gray Level Co-occurrence Matrix (GLCM), Local Binary Patterns (LBP), Gabor, gray level intensity values and Gabor-based LBP (GLBP), where the features are obtained from a given lesion's region of interest (ROI). For the classification module, SVM and KNN classifiers were examined. Using a single type of texture feature, the best result, 91% accuracy, was obtained with Gabor filtering and SVM classification. Combining Gabor, LBP and intensity features improved the results to a final accuracy of 97%.
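A GLCM of the kind used here can be computed directly. The sketch below builds the co-occurrence matrix for a single pixel offset and derives two common Haralick statistics (contrast and homogeneity); the offset and number of gray levels are illustrative choices, and a real CAD pipeline would feed such features into an SVM or KNN classifier:

```python
import numpy as np

def glcm_features(img, levels=8, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one offset, plus two Haralick stats.

    img : 2-D integer array with values in [0, levels).
    """
    img = np.asarray(img)
    glcm = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[img[y, x], img[y + dy, x + dx]] += 1
    glcm /= glcm.sum()                       # normalise to a joint probability
    i, j = np.indices(glcm.shape)
    contrast = (glcm * (i - j) ** 2).sum()             # local intensity variation
    homogeneity = (glcm / (1.0 + (i - j) ** 2)).sum()  # closeness to the diagonal
    return glcm, contrast, homogeneity
```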

  16. PLANNING BASED ON CLASSIFICATION BY INDUCTION GRAPH

    Directory of Open Access Journals (Sweden)

    Sofia Benbelkacem

    2013-11-01

    Full Text Available In Artificial Intelligence, planning refers to an area of research that aims to develop systems that can automatically generate a result set, in the form of an integrated decision-making system, through a formal procedure known as a plan. Instead of resorting to scheduling algorithms to generate plans, we propose to apply automatic learning by decision tree to save time. In this paper, we propose to build a classification model by induction graph from a learning sample containing plans, each with an associated set of descriptors whose values change from plan to plan. This model is then used to classify new cases by assigning the appropriate plan.

  17. Extension of Companion Modeling Using Classification Learning

    Science.gov (United States)

    Torii, Daisuke; Bousquet, François; Ishida, Toru

    Companion Modeling is a methodology of refining initial models for understanding reality through a role-playing game (RPG) and a multiagent simulation. In this research, we propose a novel agent model construction methodology in which classification learning is applied to the RPG log data in Companion Modeling. This methodology enables a systematic model construction that handles multiple parameters, independent of the modeler's ability. There are three problems in applying classification learning to the RPG log data: 1) It is difficult to gather enough data for the number of features because the cost of gathering data is high. 2) Noise data can affect the learning results because the amount of data may be insufficient. 3) The learning results should be explainable as a human decision-making model and should be recognized by the expert as a result that reflects reality. We realized an agent model construction system using the following two approaches: 1) Using a feature selection method, the feature subset that has the best prediction accuracy is identified. In this process, the important features chosen by the expert are always included. 2) The expert eliminates irrelevant features from the learning results after evaluating the learning model through a visualization of the results. Finally, using the RPG log data from the Companion Modeling of agricultural economics in northeastern Thailand, we confirm the capability of this methodology.

  18. Directional wavelet based features for colonic polyp classification.

    Science.gov (United States)

    Wimmer, Georg; Tamaki, Toru; Tischendorf, J J W; Häfner, Michael; Yoshida, Shigeto; Tanaka, Shinji; Uhl, Andreas

    2016-07-01

    -of-the-art methods on most of the databases. We will also show that the Weibull distribution is better suited to model the subband coefficient distribution than other commonly used probability distributions like the Gaussian distribution and the generalized Gaussian distribution. So this work gives a reasonable summary of wavelet based methods for colonic polyp classification and the huge amount of endoscopic polyp databases used for our experiments assures a high significance of the achieved results. PMID:26948110

  19. BROAD PHONEME CLASSIFICATION USING SIGNAL BASED FEATURES

    Directory of Open Access Journals (Sweden)

    Deekshitha G

    2014-12-01

    Full Text Available Speech is the most efficient and popular means of human communication. Speech is produced as a sequence of phonemes. Phoneme recognition is the first step performed by an automatic speech recognition system. The state-of-the-art recognizers use mel-frequency cepstral coefficient (MFCC) features derived through short-time analysis, for which the recognition accuracy is limited. Instead, broad phoneme classification is achieved here using features derived directly from the speech at the signal level itself. Broad phoneme classes include vowels, nasals, fricatives, stops, approximants and silence. The features identified as useful for broad phoneme classification are the voiced/unvoiced decision, zero-crossing rate (ZCR), short-time energy, most dominant frequency, energy in the most dominant frequency, spectral flatness measure and the first three formants. Features derived from short-time frames of training speech are used to train a multilayer feedforward neural network based classifier, with manually marked class labels as output, and the classification accuracy is then tested. Later, this broad phoneme classifier is used for broad syllable structure prediction, which is useful for applications such as automatic speech recognition and automatic language identification.
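Two of the signal-level features listed, zero-crossing rate and short-time energy, can be computed per frame as follows. The frame length and hop size are illustrative; the remaining spectral features would be derived analogously from each frame:

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128):
    """Per-frame zero-crossing rate and short-time energy."""
    signal = np.asarray(signal, dtype=float)
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # fraction of consecutive samples whose sign changes
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
        energy = np.sum(frame ** 2) / frame_len
        feats.append((zcr, energy))
    return np.array(feats)
```

High ZCR with low energy is typical of unvoiced fricatives, while low ZCR with high energy is typical of vowels, which is why these two features already separate several broad classes.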

  20. Nominated Texture Based Cervical Cancer Classification

    Directory of Open Access Journals (Sweden)

    Edwin Jayasingh Mariarputham

    2015-01-01

    Full Text Available Accurate classification of Pap smear images is a challenging task in medical image processing. This can be improved in two ways. One way is by selecting suitable, well-defined specific features and the other is by selecting the best classifier. This paper presents a nominated texture based cervical cancer (NTCC) classification system which classifies Pap smear images into any one of seven classes. This is achieved by extracting well-defined texture features and selecting the best classifier. Seven sets of texture features (24 features) are extracted, which include the relative size of nucleus and cytoplasm, the dynamic range and first four moments of the intensities of nucleus and cytoplasm, the relative displacement of the nucleus within the cytoplasm, the gray-level co-occurrence matrix, the local binary pattern histogram, Tamura features, and the edge orientation histogram. A few types of support vector machine (SVM) and neural network (NN) classifiers are used for the classification. The performance of the NTCC algorithm is tested and compared to other algorithms on the public image database of Herlev University Hospital, Denmark, with 917 Pap smear images. The SVM output is found to be the best for most of the classes and gives better results for the remaining classes.

  1. Optimizing Mining Association Rules for Artificial Immune System based Classification

    Directory of Open Access Journals (Sweden)

    SAMEER DIXIT

    2011-08-01

    Full Text Available The primary function of a biological immune system is to protect the body from foreign molecules known as antigens. It has great pattern recognition capability that may be used to distinguish between foreign cells entering the body (non-self, or antigen) and the body cells (self). Immune systems have many characteristics such as uniqueness, autonomy, recognition of foreigners, distributed detection, and noise tolerance. Inspired by biological immune systems, Artificial Immune Systems have emerged during the last decade. They have prompted many researchers to design and build immune-based models for a variety of application domains. Artificial immune systems can be defined as a computational paradigm that is inspired by theoretical immunology, observed immune functions, principles and mechanisms. Association rule mining is one of the most important and well researched techniques of data mining. The goal of association rules is to extract interesting correlations, frequent patterns, associations or causal structures among sets of items in the transaction databases or other data repositories. Association rules are widely used in various areas such as inventory control, telecommunication networks, intelligent decision making, market analysis and risk management. Apriori is the most widely used algorithm for mining association rules. Other popular association rule mining algorithms are frequent pattern (FP) growth, Eclat, dynamic itemset counting (DIC), etc. Associative classification uses association rule mining in the rule discovery process to predict the class labels of the data. This technique has shown great promise over many other classification techniques. Associative classification also integrates the process of rule discovery and classification to build the classifier for the purpose of prediction. The main problem with the associative classification approach is the discovery of high-quality association rules in a very large space of
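The level-wise Apriori procedure mentioned above can be sketched compactly: count frequent single items, then repeatedly join frequent (k-1)-itemsets and prune candidates that have an infrequent subset. Minimum support is given as an absolute transaction count here; an associative classifier would then turn the frequent itemsets into class-labelled rules:

```python
from itertools import combinations

def apriori(transactions, min_support=2):
    """Frequent itemsets via level-wise Apriori candidate generation."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    # level 1: frequent single items
    freq = {frozenset([i]) for i in items
            if sum(i in t for t in transactions) >= min_support}
    result, k = {}, 1
    while freq:
        for s in freq:
            result[s] = sum(s <= t for t in transactions)  # record support
        k += 1
        # join step: unions of frequent sets that form a k-itemset
        cands = {a | b for a in freq for b in freq if len(a | b) == k}
        # prune step: every (k-1)-subset must itself be frequent
        freq = {c for c in cands
                if all(frozenset(sub) in result for sub in combinations(c, k - 1))
                and sum(c <= t for t in transactions) >= min_support}
    return result
```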

  2. Interaction profile-based protein classification of death domain

    Directory of Open Access Journals (Sweden)

    Pio Frederic

    2004-06-01

    Full Text Available Abstract Background The increasing number of protein sequences and 3D structures obtained from genomic initiatives is leading many of us to focus on proteomics, and to dedicate our experimental and computational efforts to the creation and analysis of information derived from 3D structure. In particular, the high-throughput generation of protein-protein interaction data from a few organisms makes such an approach very important towards understanding the molecular recognition that makes up the entire protein-protein interaction network. Since the generation of sequences and experimental protein-protein interactions increases faster than the 3D structure determination of protein complexes, there is tremendous interest in developing in silico methods that generate such structures for prediction and classification purposes. In this study we focused on classifying protein family members based on their protein-protein interaction distinctiveness. Structure-based classification of protein-protein interfaces has been described initially by Ponstingl et al. 1 and more recently by Valdar et al. 2 and Mintseris et al. 3, from complex structures that have been solved experimentally. However, little has been done on protein classification based on the prediction of protein-protein complexes obtained from homology modeling and docking simulation. Results We have developed an in silico classification system entitled HODOCO (Homology modeling, Docking and Classification Oracle), in which protein Residue Potential Interaction Profiles (RPIPs) are used to summarize protein-protein interaction characteristics. This system, applied to a dataset of 64 proteins of the death domain superfamily, was used to classify each member into its proper subfamily. Two classification methods were attempted, heuristic and support vector machine learning. Both methods were tested with a 5-fold cross-validation. The heuristic approach yielded a 61% average accuracy, while the machine

  3. A MapReduce based Parallel SVM for Email Classification

    OpenAIRE

    Ke Xu; Cui Wen; Qiong Yuan; Xiangzhu He; Jun Tie

    2014-01-01

    Support Vector Machine (SVM) is a powerful classification and regression tool. Varying approaches including SVM based techniques are proposed for email classification. Automated email classification according to messages or user-specific folders and information extraction from chronologically ordered email streams have become interesting areas in text machine learning research. This paper presents a parallel SVM based on MapReduce (PSMR) algorithm for email classification. We discuss the chal...

  4. Variable-length Positional Modeling for Biological Sequence Classification

    OpenAIRE

    Malousi, Andigoni; Chouvarda, Ioanna; Koutkias, Vassilis; Kouidou, Sofia; Maglaveras, Nicos

    2008-01-01

    Selecting the most informative features in supervised biological classification problems is a decisive preprocessing step for two main reasons: (1) to deal with the dimensionality reduction problem, and (2) to ascribe biological meaning to the underlying feature interactions. This paper presents a filter-based feature selection method that is suitable for positional modeling of biological sequences. The basic motivation is the problem of using a positional model of fixed length that sub-optim...

  5. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Full Text Available Abstract Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machine (SVM) methods, and produce highly interpretable models.

  6. Reducing Spatial Data Complexity for Classification Models

    Science.gov (United States)

    Ruta, Dymitr; Gabrys, Bogdan

    2007-11-01

    Intelligent data analytics is gradually becoming a day-to-day reality for today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling up to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a proposition of a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. On the example of the Parzen Labelled Data Compressor (PLDC) we demonstrate a simulated data condensation process directly inspired by electrostatic field interaction, where the data are moved and merged following the attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result we achieve a model that reduces the labelled datasets much further than any competitive approaches, yet with the maximum retention of the original class densities and hence the classification performance. 
PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates, and, as shown in a series of experiments, when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competitive data condensation methods in terms of classification performance at the

  7. Reducing Spatial Data Complexity for Classification Models

    International Nuclear Information System (INIS)

    Intelligent data analytics is gradually becoming a day-to-day reality for today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. Moreover, adaptive and real-time operational environments require multiple models to be retrained frequently, which further hinders their use. Various data reduction techniques, ranging from data sampling up to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. Using the example of the Parzen Labelled Data Compressor (PLDC), we demonstrate a simulated data condensation process directly inspired by electrostatic field interaction, where the data are moved and merged following the attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result we obtain a model that reduces the labelled datasets much further than any competing approach, yet with maximum retention of the original class densities and hence the classification performance.
PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates; as shown in a series of experiments, when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competing data condensation methods in terms of classification performance at the

  8. Histological image classification using biologically interpretable shape-based features

    International Nuclear Information System (INIS)

    Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions.
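The Fourier shape descriptors mentioned above can be sketched in a generic form (boundary points treated as a complex signal); this is a textbook formulation, not the paper's exact feature pipeline.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=8):
    """Translation-, scale-, rotation- and start-point-invariant Fourier
    shape descriptors of a closed contour given as an (N, 2) point array."""
    z = contour[:, 0] + 1j * contour[:, 1]  # boundary as a complex signal
    F = np.fft.fft(z)
    F[0] = 0.0                    # drop DC component -> translation invariance
    mags = np.abs(F)
    mags = mags / mags[1]         # normalise by first harmonic -> scale invariance
    return mags[1:n_coeffs + 1]   # magnitudes only -> rotation invariance
```

For a circle the first descriptor is 1 and the rest vanish; elongated or lobed structures spread energy into higher harmonics, which is what makes these features discriminative for cell and tissue shapes.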

  9. Cirrhosis classification based on texture classification of random features.

    Science.gov (United States)

    Liu, Hui; Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang

    2014-01-01

    Accurate staging of hepatic cirrhosis is important in investigating its cause and slowing its progression. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and help them select a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high soft-tissue resolution, no radiation, and multiparameter imaging modalities, so in this paper multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase images, are used. However, existing CAD does not meet the clinical needs of cirrhosis assessment, and few researchers are currently working on it. Cirrhosis is characterized by widespread fibrosis and regenerative nodules in the liver, which produce different texture patterns at different stages, so extracting texture features is the primary task. Compared with typical gray level co-occurrence matrix (GLCM) features, texture classification from random features provides an effective alternative; we adopt it and propose CCTCRF for triple classification (normal, early, and middle-and-advanced stage). CCTCRF requires no strong assumptions beyond the sparse character of the image, contains sufficient texture information, uses a concise and effective process, and makes case decisions with high accuracy. Experimental results illustrate its satisfying performance, and it is also compared with a typical neural network using GLCM features. PMID:24707317
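For contrast with the random-feature approach, a minimal version of the GLCM texture features the paper compares against might look like this (a single offset and few gray levels; real pipelines use several offsets, angles and quantization levels).

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    g = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            g[img[y, x], img[y + dy, x + dx]] += 1
    return g / g.sum()

def glcm_features(g):
    """Classic Haralick-style statistics computed from a GLCM."""
    i, j = np.indices(g.shape)
    return {
        "contrast": float(((i - j) ** 2 * g).sum()),
        "energy": float((g ** 2).sum()),
        "homogeneity": float((g / (1.0 + np.abs(i - j))).sum()),
    }
```

A perfectly uniform region has zero contrast and maximal energy; fibrotic, nodular texture drives contrast up, which is the kind of stage-dependent difference a classifier can exploit.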

  10. National security vulnerability database classification based on an LDA topic model

    Institute of Scientific and Technical Information of China (English)

    廖晓锋; 王永吉; 范修斌; 吴敬征

    2012-01-01

    The current vulnerabilities in China are analyzed using a dataset from the China National Vulnerability Database of Information Security (CNNVD), with a combined latent Dirichlet allocation (LDA) topic model and a support vector machine (SVM) used to construct a classifier in the topic vector space. Tests show that the classifier based on topic vectors has about 8% better classification performance than one based on text vectors.
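The topic-vector pipeline can be illustrated schematically. Note that the "topic" projection below is a truncated-SVD (LSA) stand-in for a real LDA model, and the nearest-centroid step stands in for the SVM; the vulnerability-category labels and count matrix are invented.

```python
import numpy as np

def fit_topic_space(counts, k=2):
    """Stand-in for the LDA topic model: truncated SVD (LSA) of the
    document-term count matrix, yielding a k-dimensional 'topic' basis."""
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    return Vt[:k]

def project(basis, counts):
    """Map raw term-count vectors into the low-dimensional topic space."""
    return counts @ basis.T

def nearest_centroid(train_vecs, labels, test_vec):
    """Minimal classifier in topic space (the paper uses an SVM here)."""
    classes = sorted(set(labels))
    cents = {c: np.mean([v for v, l in zip(train_vecs, labels) if l == c], axis=0)
             for c in classes}
    return min(classes, key=lambda c: np.linalg.norm(test_vec - cents[c]))
```

The point of the reduction is the same as in the paper: classifying in a compact topic space rather than over raw term vectors.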

  11. PSG-Based Classification of Sleep Phases

    OpenAIRE

    Králík, M.

    2015-01-01

    This work is focused on the classification of sleep phases using an artificial neural network. An unconventional approach was used to calculate classification features from polysomnographic (PSG) data of real patients. This approach makes it possible to increase the time resolution of the analysis and thus achieve more accurate classification results.

  12. From Local Patterns to Classification Models

    Science.gov (United States)

    Bringmann, Björn; Nijssen, Siegfried; Zimmermann, Albrecht

    Using pattern mining techniques to build a predictive model is currently a popular topic of research. The aim of these techniques is to obtain classifiers with better predictive performance than greedily constructed models, as well as to allow the construction of predictive models for data not represented as attribute-value vectors. In this chapter we provide an overview of recent techniques we developed for integrating pattern mining and classification tasks. These techniques span the entire range from approaches that select relevant patterns from a previously mined set for propositionalization of the data, through inducing pattern-based rule sets, to algorithms that integrate pattern mining and model construction. We also review the algorithms most closely related to our approaches in order to put our techniques in context.

  13. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    Science.gov (United States)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    Ionosphere is an anisotropic, inhomogeneous, time-varying and spatio-temporally dispersive medium whose parameters can almost always be estimated only by indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify due to the extensive range in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to identify the regional effects of global geomagnetic storms and classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, is used together with solar and geomagnetic indices to model the regional and local variability that differs from global activity. For the automated classification of the regional disturbances, we propose a classification technique based on the Support Vector Machine (SVM), a robust machine learning method that has found widespread use. SVM is a supervised learning model with associated learning algorithms that analyze data and recognize patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher-dimensional feature spaces. Performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. As a result of implementing the developed classification
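The kernel-embedding idea that lets an SVM classify nonlinearly can be demonstrated with a much simpler kernelized learner, a kernel perceptron, on XOR-like data that no linear boundary separates. This is an illustrative stand-in, not the study's SVM, and the data are toy points.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: an implicit embedding into a
    high-dimensional feature space where the data become separable."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def kernel_perceptron(X, y, kernel, epochs=10):
    """Kernelised perceptron: a simple stand-in for the SVM,
    showing how a kernel yields a nonlinear decision boundary."""
    alpha = np.zeros(len(X))
    for _ in range(epochs):
        for i in range(len(X)):
            s = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(len(X)))
            if (1 if s > 0 else -1) != y[i]:
                alpha[i] += 1  # mistake-driven update on this support point
    def predict(x):
        s = sum(alpha[j] * y[j] * kernel(X[j], x) for j in range(len(X)))
        return 1 if s > 0 else -1
    return predict
```

No linear classifier can solve the XOR labelling below, but the RBF embedding makes it separable, which is exactly the mechanism the abstract describes for SVM.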

  14. Review of Image Classification Techniques Based on LDA, PCA and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Mukul Yadav

    2014-02-01

    Full Text Available Image classification plays an important role in security surveillance, given today's huge image databases. Rapid change in the feature content of images is a major issue in classification. Image classification has been improved by various authors using different classifier models; the efficiency of a classifier model depends on the feature extraction process applied to the traffic images. For feature extraction, various authors have used techniques such as Gabor features, histograms, and many other methods. We apply FLDA-GA to improve the classification rate of content-based image classification. The improved method uses a genetic algorithm as a heuristic: the GA acts as a feature optimizer for FLDA classification. Normal FLDA suffers from core and outlier problems. The two-sided kernel technique improves the classification process of the support vector machine. FLDA performs better classification in comparison with other binary and multi-class classifiers. A directed acyclic graph applies a graph-partition technique for mapping the feature data; mapping the feature space correctly automatically improves the voting process of the classification.
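The "GA as feature optimizer" step might be sketched as below, with bit-mask genomes, elitism, one-point crossover and bit-flip mutation. The toy fitness (agreement with a known-good mask) replaces the cross-validated FLDA accuracy a real run would use; all parameters are illustrative.

```python
import random

def ga_feature_select(n_feats, fitness, pop_size=12, gens=30, seed=1):
    """Tiny genetic algorithm over feature bit-masks:
    elitism + one-point crossover + bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_feats)   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:            # occasional bit-flip mutation
                i = rng.randrange(n_feats)
                child[i] ^= 1
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Toy fitness: agreement with a hypothetical known-good mask. A real run
# would score each mask by classifier accuracy on the selected features.
target = [1, 0, 1, 0, 1]
best = ga_feature_select(5, lambda g: sum(gi == ti for gi, ti in zip(g, target)))
```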

  15. Network Traffic Anomalies Identification Based on Classification Methods

    Directory of Open Access Journals (Sweden)

    Donatas Račys

    2015-07-01

    Full Text Available A problem of network traffic anomaly detection in computer networks is analyzed. An overview of anomaly detection methods is given, and the advantages and disadvantages of the different methods are then analyzed. A model for traffic anomaly detection was developed based on IBM SPSS Modeler and is used to analyze SNMP data from a router. The investigation of traffic anomalies was carried out using three classification methods and different sets of learning data. Based on the results of the investigation, it was determined that the C5.1 decision tree method has the highest accuracy and performance and can be successfully used for identification of network traffic anomalies.

  16. Malware Classification based on Call Graph Clustering

    CERN Document Server

    Kinable, Joris

    2010-01-01

    Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away and enable the detection of structural similarities between samples. The ability to cluster similar samples together makes more generic detection techniques possible, targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DB...
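The clustering step over a precomputed graph-distance matrix can be sketched with a basic k-medoids loop. The distance matrix here is a toy 1-D example rather than approximated graph edit distances, and the PAM-style update is simplified.

```python
import numpy as np

def k_medoids(D, k, iters=50, seed=0):
    """k-medoids on a precomputed distance matrix D (n x n):
    alternate assignment to the nearest medoid and medoid re-selection."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(iters):
        assign = np.argmin(D[:, medoids], axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.where(assign == c)[0]
            if len(members):
                # new medoid: member minimising total distance within the cluster
                new[c] = members[np.argmin(D[np.ix_(members, members)].sum(axis=1))]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids, np.argmin(D[:, medoids], axis=1)
```

Because the algorithm only consumes a distance matrix, it works unchanged whether the distances come from points or from pairwise graph-edit-distance approximations between call graphs.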

  17. Classification of archaeological pieces into their respective stratum by a chemometric model based on the soil concentration of 25 selected elements

    International Nuclear Information System (INIS)

    The aim of this work was to demonstrate that an archaeological ceramic piece has remained buried underground in the same stratum for centuries without being removed. For this purpose, a chemometric model based on Principal Component Analysis, Soft Independent Modelling of Class Analogy and Linear Discriminant Analysis classification techniques was created from the concentrations of selected elements in both the soil of the stratum and the soil adhered to the ceramic piece. Ceramic pieces from four different stratigraphic units, coming from a Roman archaeological site in Alava (North of Spain), and their respective stratum soils were collected. The soil adhered to the ceramic pieces was removed and treated in the same way as the soil from the respective stratum. The digestion was carried out following the US Environmental Protection Agency EPA 3051A method. A total of 54 elements were determined in the extracts by a rapid screening inductively coupled plasma mass spectrometry method. After rejecting the major elements and those which could have changed from the original composition of the soils (migration or retention from/to the buried objects), the following 25 elements were finally taken into account to construct the model: Li, V, Co, As, Y, Nb, Sn, Ba, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Au, Th and U. A total of 33 subsamples were treated from 10 soils belonging to 4 different stratigraphic units. The final model groups and discriminates them into four groups, according to the stratigraphic unit, with both the stratum soils and the soils adhered to the pieces falling into the same group.
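A minimal two-class Fisher/LDA step can be sketched as follows. This is a simplification: the study discriminates four strata and combines LDA with PCA and SIMCA, while here two invented clusters of element-concentration vectors are separated along a single discriminant direction.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Two-class Fisher/LDA discriminant direction w = Sw^-1 (m1 - m0)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # pooled within-class scatter matrix
    Sw = np.cov(X0.T, bias=True) * len(X0) + np.cov(X1.T, bias=True) * len(X1)
    return np.linalg.solve(Sw, m1 - m0)

def lda_classify(w, X0, X1, x):
    """Assign x to the side of the midpoint threshold along w."""
    t = 0.5 * (w @ X0.mean(axis=0) + w @ X1.mean(axis=0))
    return 1 if w @ x > t else 0
```

In the archaeological setting each row would be a normalized element-concentration profile, and membership of the adhered-soil sample in its stratum's group is what the classification confirms.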

  18. A unified Bayesian hierarchical model for MRI tissue classification.

    Science.gov (United States)

    Feng, Dai; Liang, Dong; Tierney, Luke

    2014-04-15

    Various works have used magnetic resonance imaging (MRI) tissue classification extensively to study a number of neurological and psychiatric disorders. Various noise characteristics and other artifacts make this classification a challenging task. Instead of splitting the procedure into different steps, we extend a previous work to develop a unified Bayesian hierarchical model, which simultaneously addresses the partial volume effect and intensity non-uniformity, the two major acquisition artifacts. We adopted a normal mixture model, with the means and variances depending on the tissue types of voxels, to model the observed intensity values. We modeled the relationship among the components of the index vector of tissue types by a hidden Markov model, which captures the spatial similarity of voxels. Furthermore, we addressed the partial volume effect by constructing a higher-resolution image in which each voxel is divided into subvoxels. Finally, we achieved the bias field correction by using a Gaussian Markov random field model with a band precision matrix designed in light of image filtering. Sparse matrix methods and parallel computations based on conditional independence are exploited to improve the speed of the Markov chain Monte Carlo simulation. The unified model provides more accurate tissue classification results for both simulated and real data sets. PMID:24738112

  19. Automatic web services classification based on rough set theory

    Institute of Scientific and Technical Information of China (English)

    陈立; 张英; 宋自林; 苗壮

    2013-01-01

    With the development of web services technology, the number of services on the internet is growing day by day. In order to achieve automatic and accurate service classification, which can benefit service-related tasks, a rough set theory based method for services classification was proposed. First, the service descriptions were preprocessed and represented as vectors. Inspired by discernibility-matrix-based attribute reduction in rough set theory, and taking into account the characteristics of the decision table for services classification, a method based on continuous discernibility matrices was proposed for dimensionality reduction. Finally, services were classified automatically. In the experiment, the proposed method achieved satisfactory classification results in all five testing categories. The experimental results show that the proposed method is accurate and could be used in practical web services classification.
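The discernibility-matrix construction from rough set theory, which the reduction method above builds on, can be sketched for discrete attribute tables. The objects, attribute values and service categories below are invented, and the brute-force reduct search is only viable for tiny tables.

```python
from itertools import combinations

def discernibility_matrix(objects, decisions):
    """For each pair of objects with different decisions, record the set
    of attribute indices on which the two objects differ."""
    cells = []
    n = len(objects)
    for i in range(n):
        for j in range(i + 1, n):
            if decisions[i] != decisions[j]:
                cells.append({a for a, (u, v)
                              in enumerate(zip(objects[i], objects[j])) if u != v})
    return cells

def is_reduct(attrs, cells):
    """A reduct must intersect every discernibility cell."""
    return all(cell & attrs for cell in cells)

def min_reduct(n_attrs, cells):
    """Smallest attribute subset preserving discernibility (exhaustive)."""
    for k in range(1, n_attrs + 1):
        for combo in combinations(range(n_attrs), k):
            if is_reduct(set(combo), cells):
                return set(combo)
```

Dropping the attributes outside a reduct is exactly the dimensionality reduction step: the reduced table still distinguishes every pair of differently classified services.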

  20. SPEECH/MUSIC CLASSIFICATION USING WAVELET BASED FEATURE EXTRACTION TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Thiruvengatanadhan Ramalingam

    2014-01-01

    Full Text Available With the rapid growth in audio data volume, audio classification is a fundamental step in multimedia information retrieval; due to the increasing size of multimedia sources, speech/music classification is one of its most important problems. In this work a speech/music discrimination system is developed which utilizes the Discrete Wavelet Transform (DWT) as the acoustic feature. Multi-resolution analysis is a significant statistical way to extract features from an input signal, and in this study a method is deployed to model the extracted wavelet features. Support Vector Machines (SVM), which are based on the principle of structural risk minimization, are applied to classify audio into the classes speech and music by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function using maximum likelihood decision methods. The system shows significant results, with an accuracy of 94.5%.
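A one-level Haar DWT and sub-band energy features give a simplified sketch of the wavelet feature extraction described above; real systems typically use other wavelet families, more levels and frame-wise processing.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar discrete wavelet transform."""
    x = np.asarray(x, float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation (low-pass)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail (high-pass)
    return a, d

def dwt_features(x, levels=3):
    """Sub-band energies over several DWT levels: detail-band energies
    plus the final approximation energy, a common compact audio feature."""
    feats = []
    for _ in range(levels):
        x, d = haar_dwt(x)
        feats.append(float(np.sum(d ** 2)))
    feats.append(float(np.sum(x ** 2)))
    return feats
```

Because the Haar transform is orthonormal, the sub-band energies partition the signal energy exactly, which makes these features comparable across frames.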

  1. Modeling of the high-performance PLD-based sectioning method for classification of the shape of optical object images

    OpenAIRE

    Tymchenko, Leonid; Petrovskiy, Mycola; Kokryatskaya, Natalia; Gubernatorov, Volodymir; Kutaev, Yuriy

    2013-01-01

    An analysis of the sectioning method used to control the shape of a laser beam spot is presented in this paper. The possibility of using a form factor to solve this problem is discussed, and an experimental study of the sectioning method and the form factor for controlling the shape of the laser beam spot was conducted. A PLD-based technical implementation of the method was realized.

  2. Visualization of Nonlinear Classification Models in Neuroimaging - Signed Sensitivity Maps

    DEFF Research Database (Denmark)

    Rasmussen, Peter Mondrup; Schmah, Tanya; Madsen, Kristoffer Hougaard;

    2012-01-01

    underlying neural encoding of an experiment defining multiple brain states. In this relation there is a great desire for the researcher to generate brain maps that highlight brain locations of importance to the classifier's decisions. Based on sensitivity analysis, we develop further procedures for model...... direction the individual locations influence the classification. We illustrate the visualization procedure on real data from a simple functional magnetic resonance imaging experiment.

  3. Hidden Markov Modeling for humpback whale (Megaptera novaeangliae) call classification

    OpenAIRE

    PACE, Federica; White, Paul; Adam, Olivier

    2012-01-01

    International audience This study proposes a new approach for the classification of the calls detected in songs, using Hidden Markov Models (HMMs) based on the concept of subunits as building blocks. HMMs have been used once before for such a task, in an unsupervised algorithm with promising results, and they are used extensively in speech recognition and in a few bioacoustics studies. Their flexibility suggests that they may be suitable for the analysis of the varied repertoir...

  4. Classification model of arousal and valence mental states by EEG signals analysis and Brodmann correlations

    Directory of Open Access Journals (Sweden)

    Adrian Rodriguez Aguinaga

    2015-06-01

    Full Text Available This paper proposes a methodology for classifying emotional states through the analysis of EEG signals, wavelet decomposition and an electrode discrimination process that associates electrodes of the 10/20 model with Brodmann regions and reduces the computational burden. Classification was performed with a Support Vector Machine, achieving an 81.46% classification rate on a multi-class problem; the emotion modeling is based on a space adjusted from the Russell arousal-valence space and the Geneva model.

  5. Predicting student satisfaction with courses based on log data from a virtual learning environment – a neural network and classification tree model

    OpenAIRE

    Ivana Đurđević Babić

    2015-01-01

    Student satisfaction with courses in academic institutions is an important issue and is recognized as a form of support in ensuring effective and quality education, as well as enhancing student course experience. This paper investigates whether there is a connection between student satisfaction with courses and log data on student courses in a virtual learning environment. Furthermore, it explores whether a successful classification model for predicting student satisfaction with course can be...

  6. Graph-based Methods for Orbit Classification

    Energy Technology Data Exchange (ETDEWEB)

    Bagherjeiran, A; Kamath, C

    2005-09-29

    An important step in the quest for low-cost fusion power is the ability to perform and analyze experiments in prototype fusion reactors. One of the tasks in the analysis of experimental data is the classification of orbits in Poincare plots. These plots are generated by the particles in a fusion reactor as they move within the toroidal device. In this paper, we describe the use of graph-based methods to extract features from orbits. These features are then used to classify the orbits into several categories. Our results show that existing machine learning algorithms are successful in classifying orbits with few points, a situation which can arise in data from experiments.

  7. A power spectrum based backpropagation artificial neural network model for classification of sleep-wake stages in rats

    Directory of Open Access Journals (Sweden)

    Amit Kumar Ray

    2003-05-01

    Full Text Available A three-layered feed-forward backpropagation artificial neural network architecture is designed to classify sleep-wake stages in rats. Continuous three-channel polygraphic signals (electroencephalogram, electrooculogram and electromyogram) were recorded from conscious rats for eight hours during the daytime. The signals were also stored on a computer hard disk with the help of an analog-to-digital converter and its compatible data acquisition software. The power spectra (in dB scale) of the digitized signals in the three sleep-wake stages were calculated. Selected power spectrum data from all three simultaneously recorded polygraphic signals were used to train the network and to classify slow wave sleep, rapid eye movement sleep and awake stages. The ANN architecture used in the present study shows very good agreement with manual sleep stage scoring, with an average of 94.83% across all 1200 samples tested from the SWS, REM and AWA stages. The high performance of the ANN-based system highlights the value of this computational tool for sleep research.
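The power-spectrum (dB) feature computation can be sketched with a plain periodogram; the sampling rate, FFT length and band edges below are illustrative, not the paper's settings.

```python
import numpy as np

def power_spectrum_db(signal, fs, nfft=256):
    """Periodogram in dB of a single channel, the kind of spectral
    feature fed to the network as input."""
    sig = np.asarray(signal, float) - np.mean(signal)   # remove DC offset
    spec = np.abs(np.fft.rfft(sig, n=nfft)) ** 2 / (fs * len(sig))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, 10 * np.log10(spec + 1e-12)           # floor avoids log(0)

def band_power(freqs, pxx_db, lo, hi):
    """Mean in-band level in dB (e.g. an EEG delta band 0.5-4 Hz)."""
    mask = (freqs >= lo) & (freqs < hi)
    return float(np.mean(pxx_db[mask]))
```

Stage-dependent shifts of energy between such bands (delta, theta, alpha, EMG bands) are what the network learns to separate SWS, REM and wakefulness.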

  8. A MapReduce based Parallel SVM for Email Classification

    Directory of Open Access Journals (Sweden)

    Ke Xu

    2014-06-01

    Full Text Available Support Vector Machine (SVM) is a powerful classification and regression tool. Various approaches, including SVM-based techniques, have been proposed for email classification. Automated email classification according to messages or user-specific folders, and information extraction from chronologically ordered email streams, have become interesting areas in text machine learning research. This paper presents a parallel SVM based on MapReduce (PSMR) algorithm for email classification. We discuss the challenges that arise from the differences between email foldering and traditional document classification. We show experimental results from an array of automated classification methods and evaluation methodologies, including Naive Bayes, SVM and the PSMR method, on foldering of the Enron datasets based on the timeline. By distributing, processing and optimizing the subsets of the training data across multiple participating nodes, the parallel SVM based on the MapReduce algorithm reduces the training time significantly.
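The map/reduce training pattern can be illustrated schematically. The per-node learner below is a plain perceptron and the reduce step a simple weight average, a deliberately simplified stand-in for the per-node SVM training and model merging in PSMR; the data are toy points.

```python
import numpy as np

def train_linear(X, y, epochs=20):
    """Perceptron with a bias term: stand-in for per-node SVM training."""
    w = np.zeros(X.shape[1] + 1)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:
                w += yi * xi
    return w

def map_reduce_train(X, y, n_nodes=4):
    """'Map': train independently on disjoint shards of the training set;
    'reduce': merge the node models (here, by averaging weights)."""
    shards = np.array_split(np.arange(len(X)), n_nodes)
    ws = [train_linear(X[idx], y[idx]) for idx in shards]   # map phase
    return np.mean(ws, axis=0)                              # reduce phase

def predict(w, x):
    return 1 if w @ np.append(x, 1.0) > 0 else -1
```

The speedup comes from the map phase: each shard is a fraction of the data, and (super-linear) training cost per node drops accordingly, just as the abstract claims for the distributed SVM.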

  9. Model sparsity and brain pattern interpretation of classification models in neuroimaging

    DEFF Research Database (Denmark)

    Rasmussen, Peter Mondrup; Madsen, Kristoffer Hougaard; Churchill, Nathan W; Hansen, Lars Kai; Strother, Stephen C

    2012-01-01

    Interest is increasing in applying discriminative multivariate analysis techniques to the analysis of functional neuroimaging data. Model interpretation is of great importance in the neuroimaging context, and is conventionally based on a ‘brain map’ derived from the classification model. In this...... study we focus on the relative influence of model regularization parameter choices on both the model generalization, the reliability of the spatial patterns extracted from the classification model, and the ability of the resulting model to identify relevant brain networks defining the underlying neural...... for both ℓ2 and ℓ1 regularization. Importantly, we illustrate a trade-off between model spatial reproducibility and prediction accuracy. We show that known parts of brain networks can be overlooked in pursuing maximization of classification accuracy alone with either ℓ2 and/or ℓ1 regularization. This...

  10. Small Sample Issues for Microarray-Based Classification

    OpenAIRE

    Dougherty, Edward R

    2006-01-01

    In order to study the molecular biological differences between normal and diseased tissues, it is desirable to perform classification among diseases and stages of disease using microarray-based gene-expression values. Owing to the limited number of microarrays typically used in these studies, serious issues arise with respect to the design, performance and analysis of classifiers based on microarray data. This paper reviews some fundamental issues facing small-sample classification: classific...

  11. Gender Classification Based on Geometry Features of Palm Image

    OpenAIRE

    Ming Wu; Yubo Yuan

    2014-01-01

    This paper presents a novel gender classification method based on geometry features of palm image which is simple, fast, and easy to handle. This gender classification method based on geometry features comprises two main attributes. The first one is feature extraction by image processing. The other one is classification system with polynomial smooth support vector machine (PSSVM). A total of 180 palm images were collected from 30 persons to verify the validity of the proposed gender classi...

  12. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Full Text Available Network intrusion awareness is an important factor for risk analysis of network security. In the current decade, various methods and frameworks are available for intrusion detection and security awareness; some methods are based on the knowledge discovery process and some frameworks on neural networks, and all of these models take rule-based decisions for the generation of security alerts. In this paper we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of feature gathering from events, and the support vector machine is a powerful classifier of data. Here we used SVM for the detection of closed items of the rule-based technique. Our proposed method was simulated on the KDD1999 DARPA data set and gives better empirical evaluation results in comparison with the rule-based technique and a neural network model.

  13. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Full Text Available Network intrusion awareness is an important factor for risk analysis of network security. In the current decade, various methods and frameworks are available for intrusion detection and security awareness; some methods are based on the knowledge discovery process and some frameworks on neural networks, and all of these models take rule-based decisions for the generation of security alerts. In this paper we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of feature gathering from events, and the support vector machine is a powerful classifier of data. Here we used SVM for the detection of closed items of the rule-based technique. Our proposed method was simulated on the KDD1999 DARPA data set and gives better empirical evaluation results in comparison with the rule-based technique and a neural network model.

  14. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States); Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  15. Archaeological pieces classification into their respective stratum by a chemometric model based on the soil metal concentration of 23 selected elements

    International Nuclear Information System (INIS)

    Complete text of publication follows. An archaeological ceramic piece that has been buried in the same stratum for centuries will, when removed during excavation, necessarily carry remains of the soil in which it was buried. This soil adhered to the ceramic piece and the stratum soil should therefore have a similar composition for some chemical elements. The aim of this work was to find the best chemometric model (by means of Principal Component Analysis (PCA) and Discriminant Analysis (DA) classification techniques), based on the concentrations of selected metals in both soils, that groups the corresponding soils and discriminates the different ones. Ceramic pieces from four different stratigraphic units, coming from a Roman archaeological site in Alava (northern Spain), and their respective stratum soils were collected. The soil adhered to the ceramic pieces was removed and treated in the same way as the soil from the respective stratum. The samples were air dried (fume hood for 24 h) before milling and sieving to <2 mm. Digestion was carried out following the US Environmental Protection Agency EPA 3051A method, based on microwave-assisted acid digestion of soil samples. A total of 54 elements were determined in the extracts by a rapid screening ICP-MS method. As is done in environmental analysis (S. Fernandez et al., J. Marine Systems 72 (2008) 332-341), metal concentration data were normalized to Mn as the conservative metal (Fe, Si, Ti, Ca and Mg were also tested). After rejecting the major elements and those whose concentrations could change from the original composition of the soils (migration or retention from/to the buried objects), the following elements were finally taken into account to construct the model: Li, Ti, Cr, Co, Ni, Ga, Rb, Sr, Zr, Nb, Mo, Ag, Sb, Cs, Ba, Sn, Eu, Tm, Hf, W, Pb, Th and U. A total of 33 subsamples were treated from 10 soils belonging to 4 different stratigraphic units. The final model groups and discriminates them in
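
    A minimal sketch of the PCA-plus-discriminant-analysis workflow described above, assuming synthetic Mn-normalized concentrations for two hypothetical strata (the study's 23-element, 33-subsample dataset is not reproduced):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
# Two hypothetical strata, each with its own elemental signature
# (three stand-in element ratios instead of the 23 used in the study).
stratum_a = rng.normal([1.0, 0.2, 0.5], 0.05, size=(15, 3))
stratum_b = rng.normal([0.4, 0.8, 0.1], 0.05, size=(15, 3))
X = np.vstack([stratum_a, stratum_b])
y = np.array([0] * 15 + [1] * 15)

scores = PCA(n_components=2).fit_transform(X)      # unsupervised grouping
lda = LinearDiscriminantAnalysis().fit(scores, y)  # supervised discrimination
print(lda.score(scores, y))  # fraction of samples grouped with their stratum
```

    The same two-step pattern (project with PCA, then discriminate) applies when the adhered-soil subsamples are used as unknowns to be assigned to a stratum.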

  16. Hierarchical diagnostic classification models morphing into unidimensional 'diagnostic' classification models-a commentary.

    Science.gov (United States)

    von Davier, Matthias; Haberman, Shelby J

    2014-04-01

    This commentary addresses the modeling and final analytical path taken, as well as the terminology used, in the paper "Hierarchical diagnostic classification models: a family of models for estimating and testing attribute hierarchies" by Templin and Bradshaw (Psychometrika, doi: 10.1007/s11336-013-9362-0, 2013). It raises several issues concerning use of cognitive diagnostic models that either assume attribute hierarchies or assume a certain form of attribute interactions. The issues raised are illustrated with examples, and references are provided for further examination. PMID:24478022

  17. Classification of CMEs Based on Their Dynamics

    Science.gov (United States)

    Nicewicz, J.; Michalek, G.

    2016-05-01

    A large set of coronal mass ejections (CMEs; 6621 events) was selected to study their dynamics in the field of view (LFOV) of the Large Angle and Spectroscopic Coronagraph (LASCO) onboard the Solar and Heliospheric Observatory (SOHO). These events were selected for having at least six height-time measurements so that their dynamic properties in the LFOV could be evaluated with reasonable accuracy. Height-time measurements (from the SOHO/LASCO catalog) were used to determine the velocities and accelerations of individual CMEs at successive distances from the Sun, and linear and quadratic functions were fitted to these data points. On the basis of the best fits to the velocity data points, we were able to classify CMEs into four groups. These types of CMEs differ not only in their dynamic behavior but also in their masses, widths, velocities, and accelerations. We also show that these groups of events are initiated by different onset mechanisms. The results of our study allow us to present a consistent classification of CMEs based on their dynamics.
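
    The fit-based grouping can be sketched for a single synthetic velocity profile. The data and the decision rule below are illustrative only, not the paper's exact criteria or catalog measurements:

```python
import numpy as np

# Synthetic height-time-derived velocity profile for one CME.
t = np.linspace(0, 10, 8)          # hours since first observation
v = 400 + 20 * t + 1.5 * t**2      # velocity [km/s], accelerating case

# Fit linear and quadratic models and compare residuals.
lin_fit = np.polyfit(t, v, 1)
quad_fit = np.polyfit(t, v, 2)
lin_res = np.sum((np.polyval(lin_fit, t) - v) ** 2)
quad_res = np.sum((np.polyval(quad_fit, t) - v) ** 2)

# Illustrative rule: positive quadratic term and a better quadratic fit
# suggest an accelerating CME; other sign/fit combinations give other groups.
accelerating = quad_fit[0] > 0 and quad_res < lin_res
print("accelerating" if accelerating else "other type")
```

    Applying such fits to every catalog event, and inspecting the sign and significance of the quadratic term, is one way the four dynamic groups could be separated.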

  18. Constructing Customer Consumption Classification Models Based on Rough Sets and Neural Network%基于粗糙神经网络的客户消费分类模型研究

    Institute of Scientific and Technical Information of China (English)

    万映红; 胡万平; 曹小鹏

    2011-01-01

    Given that customer consumption attributes are multidimensional, correlated, and uncertain, a customer consumption classification model based on a rough set neural network (RS-NN) is proposed. After revealing the rough-set character of the customer consumption classification problem, a research framework is designed that consists of preprocessing the classification knowledge space, building the consumption classification model, and applying the classification model. A series of key techniques is described systematically: rough-set-based reduction of consumption attributes, extraction of classification rules, construction of the initial topology of the rough set neural network, and training and testing of the network model. Telecom customer management in one region serves as a modeling example. The results show that the RS-NN model outperforms the BP-NN algorithm in model structure, efficiency, and classification prediction accuracy, and is an effective and practical new method for customer classification.%The customer consumption classification topic is receiving increasing attention from researchers in the field of customer relationship management. Current research on customer consumption classification can be further improved in many areas. For instance, customer consumption classification models should take multidimensional and other related consumption attributes into classification analysis, avoid attribute redundancy, and select core classification attributes. Customer consumption models should identify input neurons, hidden layers and hidden neurons in order to reduce the complexity of the classification structure and improve the model's explanatory power. Existing classification methods are not effective at representing the inconsistency of consumption attributes and classes. This paper proposed a customer consumption classification model by integrating rough sets and neural networks based on the rough set neural network (RS-NN) model. Rough set theory is the core theory underpinning this study. This paper reduced attribute values and adopted core consumption attributes in order to solve attribute redundancy and inconsistency problems. This paper also used customer classification rules and solved attribute inconsistency problems. In addition, by integrating classification

  19. The role of catchment classification in rainfall-runoff modeling

    OpenAIRE

    He, Y.; A. Bárdossy; E. Zehe

    2011-01-01

    A sound catchment classification scheme is a fundamental step towards improved catchment hydrology science and prediction in ungauged basins. Two categories of catchment classification methods are presented in the paper. The first one is based directly on physiographic properties and climatic conditions over a catchment and regarded as a Linnaean type or natural classification scheme. The second one is based on numerical clustering and regionalization methods and considered as a statistical o...

  20. Classification problems in object-based representation systems

    OpenAIRE

    Napoli, Amedeo

    1999-01-01

    Classification is a process that consists in two dual operations: generating a set of classes and then classifying given objects into the created classes. The class generation may be understood as a learning process and object classification as a problem-solving process. The goal of this position paper is to introduce and to make precise the notion of a classification problem in object-based representation systems, e.g. a query against a class hierarchy, to define a subsumption relation betwe...

  1. Fuzzy Inference System & Fuzzy Cognitive Maps based Classification

    OpenAIRE

    Kanika Bhutani; Gaurav; Megha Kumar

    2015-01-01

    Fuzzy classification is valuable because it can use interpretable rules, and it overcomes the limitations of crisp rule-based classifiers. This paper deals with classification on the basis of the soft computing techniques of fuzzy cognitive maps and fuzzy inference systems on the lenses dataset. The results obtained with the FIS show 100% accuracy. Sometimes the data available for classification contain missing or ambiguous values, so Neutrosophic logic is used for cla...

  2. A new classification algorithm based on RGH-tree search

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we put forward a new classification algorithm based on RGH-tree search and carry out a classification analysis and comparison study. The algorithm saves computing resources and increases classification efficiency. Experiments show that it performs better on three-dimensional, multi-class data, and that it generalizes well from small training sets to large test sets.

  3. A model for anomaly classification in intrusion detection systems

    Science.gov (United States)

    Ferreira, V. O.; Galhardi, V. V.; Gonçalves, L. B. L.; Silva, R. C.; Cansian, A. M.

    2015-09-01

    Intrusion Detection Systems (IDS) are traditionally divided into two types according to the detection methods they employ, namely (i) misuse detection and (ii) anomaly detection. Anomaly detection has been widely used and its main advantage is the ability to detect new attacks. However, the analysis of the anomalies generated can become expensive, since they often carry no clear information about the malicious events they represent. In this context, this paper presents a model for automated classification of the alerts generated by an anomaly-based IDS. The main goal is either to classify the detected anomalies into well-defined attack taxonomies or to identify whether an alert is a false positive misclassified by the IDS. Some common attacks on computer networks were considered, and we achieved important results that can equip security analysts with better resources for their analyses.

  4. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Full Text Available Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially ventilated at constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume) consisting of a series of time bins, each with spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes), dictated by the test response, in a given model, and determining which of the three probabilities was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
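
    The joint-probability computation can be sketched as follows; the per-bin spike probabilities and the test train below are hypothetical, not recorded SAR data:

```python
import numpy as np

# Each model is a series of per-bin spike probabilities, one per
# stimulus volume (values are invented for illustration).
models = {
    "vol1": np.array([0.9, 0.8, 0.1, 0.1]),
    "vol2": np.array([0.1, 0.2, 0.9, 0.8]),
    "vol3": np.array([0.5, 0.5, 0.5, 0.5]),
}

def joint_log_prob(spikes, p):
    # log P(sequence) = sum over bins of log p if spike, else log(1 - p).
    return np.sum(np.where(spikes == 1, np.log(p), np.log(1 - p)))

test_train = np.array([1, 1, 0, 0])  # binary spike/no-spike per bin
best = max(models, key=lambda m: joint_log_prob(test_train, models[m]))
print(best)  # → vol1
```

    Working in log space avoids underflow when the number of bins is large; the classification rule is otherwise exactly "assign the train to the model under which its event sequence is most probable".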

  5. Fuzzy classification rules based on similarity

    Czech Academy of Sciences Publication Activity Database

    Holeňa, Martin; Štefka, D.

    Seňa : PONT s.r.o., 2012 - (Horváth, T.), s. 25-31 ISBN 978-80-971144-0-4. [ITAT 2012. Conference on Theory and Practice of Information Technologies. Ždiar (SK), 17.09.2012-21.09.2012] R&D Projects: GA ČR GA201/08/0802 Institutional support: RVO:67985807 Keywords : classification rules * fuzzy classification * fuzzy integral * fuzzy measure * similarity Subject RIV: IN - Informatics, Computer Science

  6. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available Existing material classification schemes are designed to improve inventory management; however, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers material quality character, quality cost, and quality influence. The Analytic Hierarchy Process helps to make feature selection and classification decisions. We use an improved Kraljic Portfolio Matrix to establish a three-dimensional classification model in which aircraft materials are divided into eight types, including general, key, risk, and leveraged types. Aiming to improve the classification accuracy for various materials, the Support Vector Machine algorithm is introduced. Finally, we compare SVM and a BP neural network in this application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  7. Application of Crossed Classification Credibility Model in Third-party Auto Insurance in Slovak Republic

    Directory of Open Access Journals (Sweden)

    Erik Šoltés

    2006-12-01

    Full Text Available In this article we review the two-way crossed classification credibility model, an extension of the hierarchical models of Jewell and Taylor. When the risk factors are not nested, a hierarchical model is not applicable. In crossed classification credibility models, the risk factors are modelled without the restrictions of a hierarchical structure, which makes them of great practical interest. In the two-way crossed classification credibility model, the risks in a portfolio are classified on the basis of two risk factors. In this model the credibility premium for a certain contract is equal to the overall mean for the portfolio plus adjustments for the risk experience within the contract itself and for the risk experience within the classes of the risk factors to which it belongs. The objective of this article is to show alternatives for applying crossed classification credibility models to third-party auto insurance in the Slovak Republic.
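
    The premium structure described above can be illustrated numerically. The credibility factors and class means below are assumed for illustration only; they are not estimates from Slovak third-party auto data:

```python
# Two-way crossed classification credibility premium:
# overall portfolio mean plus credibility-weighted adjustments for the
# contract's own experience and for the class of each of the two
# (non-nested) risk factors.
overall_mean = 250.0  # hypothetical portfolio mean claim

adj_contract = 0.30 * (310.0 - overall_mean)  # z_contract = 0.30 (assumed)
adj_factor_a = 0.50 * (280.0 - overall_mean)  # class of risk factor A
adj_factor_b = 0.40 * (230.0 - overall_mean)  # class of risk factor B

premium = overall_mean + adj_contract + adj_factor_a + adj_factor_b
print(premium)  # → 275.0
```

    Each adjustment shrinks the observed class or contract mean toward the portfolio mean, with the credibility factor controlling how much weight the individual experience receives.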

  8. Object-Based Classification of Abandoned Logging Roads under Heavy Canopy Using LiDAR

    OpenAIRE

    Jason Sherba; Leonhard Blesius; Jerry Davis

    2014-01-01

    LiDAR-derived slope models may be used to detect abandoned logging roads in steep forested terrain. An object-based classification approach of abandoned logging road detection was employed in this study. First, a slope model of the study site in Marin County, California was created from a LiDAR derived DEM. Multiresolution segmentation was applied to the slope model and road seed objects were iteratively grown into candidate objects. A road classification accuracy of 86% was achieved using th...

  9. Preliminary Research on Grassland Fine-classification Based on MODIS

    International Nuclear Information System (INIS)

    Grassland ecosystems are important for climate regulation and for maintaining soil and water. Research on grassland monitoring methods can provide an effective reference for grassland resource investigation. In this study, we used the vegetation index method for grassland classification. Because China spans several climate types, we used China's Main Climate Zone Maps to divide the study region into four climate zones. Based on the grassland classification system of the first nation-wide grass resource survey in China, we established a new grassland classification system suited to this research. We used MODIS images as the basic data resource and applied an expert classifier for grassland classification. Based on the 1:1,000,000 Grassland Resource Map of China, we obtained the basic distribution of all grassland types, selected 20 samples evenly distributed within each type, and used NDVI/EVI products to summarize the spectral features of the different grassland types. Finally, we introduced auxiliary classification data such as elevation, accumulated temperature (AT), humidity index (HI), and rainfall. The nation-wide grassland classification map results from merging the grassland classifications of the different climate zones. The overall classification accuracy is 60.4%. The results indicate that an expert classifier is suitable for nation-wide grassland classification, but the classification accuracy needs to be improved.

  10. Drug related webpages classification using images and text information based on multi-kernel learning

    Science.gov (United States)

    Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan

    2015-12-01

    In this paper, multi-kernel learning (MKL) is used for classifying drug-related webpages. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, a text-based BOW model is used to generate the text representation, and an image-based BOW model is used to generate the image representation. Last, the text and image representations are fused by several methods. Experimental results demonstrate that the classification accuracy of MKL is higher than that of all other fusion methods at the decision level and feature level, and much higher than the accuracy of single-modal classification.

  11. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    Full Text Available We introduce an imbalanced data classification approach based on logistic regression significance discriminant and Fisher discriminant analysis. First, a key-indicator extraction model based on logistic regression significance discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of a linear weighting that uses the Fisher discriminant. A customer rating model, in which the number of customers across ratings follows a normal distribution, is then constructed. The performance of the proposed model and of the classical SVM classification method is evaluated in terms of the ability to correctly classify consumers as default or non-default customers. Empirical results using data on 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating qualified customers for banks and bond investors.
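
    The Fisher-discriminant scoring step can be sketched as follows; the customer features are synthetic stand-ins (not the 2157-customer dataset), and the logistic-regression feature-selection stage is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)
# Imbalanced synthetic data: many non-default customers, few defaults.
nondefault = rng.normal([1.0, 1.0], 0.2, size=(50, 2))
default = rng.normal([3.0, 3.0], 0.2, size=(10, 2))

mu0, mu1 = nondefault.mean(axis=0), default.mean(axis=0)
# Within-class scatter matrix (sum of per-class scatters).
Sw = np.cov(nondefault.T) * 49 + np.cov(default.T) * 9
w = np.linalg.solve(Sw, mu1 - mu0)  # Fisher discriminant direction

scores = np.vstack([nondefault, default]) @ w  # customer scores
threshold = (mu0 @ w + mu1 @ w) / 2            # midpoint cut-off
flagged = scores > threshold
print(int(flagged.sum()))  # → 10 (the ten default customers)
```

    Projecting onto the Fisher direction gives one score per customer; binning these scores so that counts follow a normal distribution would give the rating model described in the abstract.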

  12. Classification of Product Requirements Based on Product Environment

    OpenAIRE

    Chen, Zhen Yu; Zeng, Yong

    2006-01-01

    Effective management of product requirements is critical for designers to deliver a quality design solution in a reasonable range of cost and time. The management depends on a well-defined classification and a flexible representation of product requirements. This article proposes two classification criteria in terms of different partitions of product environment based on a formal structure of produ...

  13. A Curriculum-Based Classification System for Community Colleges.

    Science.gov (United States)

    Schuyler, Gwyer

    2003-01-01

    Proposes and tests a community college classification system based on curricular characteristics and their association with institutional characteristics. Seeks readily available data correlates to represent percentage of a college's course offerings that are in the liberal arts. A simple two-category classification system using total enrollment…

  14. An Object-Based Method for Chinese Landform Types Classification

    Science.gov (United States)

    Ding, Hu; Tao, Fei; Zhao, Wufan; Na, Jiaming; Tang, Guo'an

    2016-06-01

    Landform classification is a necessary task for various fields of landscape and regional planning, for example landscape evaluation, erosion studies, and hazard prediction. This study proposes an improved object-based classification of Chinese landform types using random forest factor-importance analysis and the gray-level co-occurrence matrix (GLCM). Based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. The GLCM is then used to build the knowledge base for classification. The classification result was checked against the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification and 15.7% higher than the traditional object-based classification method.
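
    The factor-importance step can be sketched as follows; the terrain factors and labels below are synthetic stand-ins for the DEM-derived factors, not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
# Synthetic terrain factors for 300 cells.
slope = rng.uniform(0, 40, 300)     # degrees
relief = rng.uniform(0, 500, 300)   # metres
noise = rng.uniform(0, 1, 300)      # irrelevant factor
landform = (slope > 20).astype(int) # labels depend on slope only

X = np.column_stack([slope, relief, noise])
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, landform)

# Rank factors by importance; the top-ranked factors would then drive
# the multi-scale segmentation thresholds.
ranking = np.argsort(rf.feature_importances_)[::-1]
print(ranking[0])  # → 0 (slope, the factor the labels depend on)
```

    In the study the ranked factors feed into multiresolution segmentation; here the ranking alone is demonstrated.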

  15. A deep learning approach to the classification of 3D CAD models

    Institute of Scientific and Technical Information of China (English)

    Fei-wei QIN; Lu-ye LI; Shu-ming GAO; Xiao-ling YANG; Xiang CHEN

    2014-01-01

    Model classification is essential to the management and reuse of 3D CAD models. Manual model classification is laborious and error prone. At the same time, the automatic classification methods are scarce due to the intrinsic complexity of 3D CAD models. In this paper, we propose an automatic 3D CAD model classification approach based on deep neural networks. According to prior knowledge of the CAD domain, features are selected and extracted from 3D CAD models first, and then pre-processed as high dimensional input vectors for category recognition. By analogy with the thinking process of engineers, a deep neural network classifier for 3D CAD models is constructed with the aid of deep learning techniques. To obtain an optimal solution, multiple strategies are appropriately chosen and applied in the training phase, which makes our classifier achieve better performance. We demonstrate the efficiency and effectiveness of our approach through experiments on 3D CAD model datasets.

  16. Shape classification based on singular value decomposition transform

    Institute of Scientific and Technical Information of China (English)

    SHAABAN Zyad; ARIF Thawar; BABA Sami; KREKOR Lala

    2009-01-01

    In this paper, a new shape classification system based on the singular value decomposition (SVD) transform and a nearest neighbour classifier is proposed. The gray-scale image of the shape object is converted into a black and white image, and the squared Euclidean distance transform of the binary image is applied to extract the boundary image of the shape. SVD transform features are extracted from the boundary of the object shapes. The proposed classification system based on SVD-transform feature extraction is compared with a classifier based on moment invariants, using the same nearest neighbour classifier. The experimental results show the advantage of the proposed classification system.
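
    The SVD-signature idea can be sketched on toy binary images; the shapes, image sizes, and signature length below are illustrative, not the paper's actual boundary images:

```python
import numpy as np

def svd_features(img, k=3):
    # Leading singular values form a compact signature of the shape.
    s = np.linalg.svd(img.astype(float), compute_uv=False)
    return s[:k]

# Toy binary "shape images".
square = np.zeros((8, 8))
square[2:6, 2:6] = 1          # 4x4 block
bar = np.zeros((8, 8))
bar[3:5, 0:6] = 1             # 2x6 block
templates = {"square": svd_features(square), "bar": svd_features(bar)}

# A translated square: singular values are unchanged by translation here.
query = np.zeros((8, 8))
query[1:5, 1:5] = 1
label = min(templates,
            key=lambda t: np.linalg.norm(templates[t] - svd_features(query)))
print(label)  # → square
```

    Nearest-neighbour classification then reduces to comparing these short singular-value vectors by Euclidean distance, as in the last two lines.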

  17. Multiclass Classification Based on the Analytical Center of Version Space

    Institute of Scientific and Technical Information of China (English)

    ZENG Fanzi; QIU Zhengding; YUE Jianhai; LI Xiangqian

    2005-01-01

    The analytical center machine, based on the analytical center of version space, outperforms the support vector machine, especially when the version space is elongated or asymmetric. While the analytical center machine for binary classification is well understood, little is known about the corresponding multiclass classification. Moreover, the current multiclass approach, "one versus all", requires repeatedly constructing classifiers to separate a single class from all the others, which leads to daunting computation and low classification efficiency; and though the multiclass support vector machine corresponds to a simple quadratic optimization, it is not very effective when the version space is asymmetric or elongated. The multiclass classification approach based on the analytical center of version space is therefore proposed to address these problems. Experiments on wine recognition and glass identification datasets demonstrate the validity of the proposed approach.

  18. Music Genre Classification using the multivariate AR feature integration model

    DEFF Research Database (Denmark)

    Ahrendt, Peter; Meng, Anders

    2005-01-01

    Music genre classification systems are normally build as a feature extraction module followed by a classifier. The features are often short-time features with time frames of 10-30ms, although several characteristics of music require larger time scales. Thus, larger time frames are needed to take...... informative decisions about musical genre. For the MIREX music genre contest several authors derive long time features based either on statistical moments and/or temporal structure in the short time features. In our contribution we model a segment (1.2 s) of short time features (texture) using a multivariate...

  19. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    Science.gov (United States)

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy. PMID:26687087
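
    The 70/30 train-validate scheme above can be sketched with scikit-learn's GradientBoostingClassifier standing in for BRT; the spring/non-spring data are synthetic stand-ins for the thirteen HGP factors:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 5))                  # stand-ins for HGP factors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic spring presence

# ~70% of points for training, ~30% held out for validation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
brt = GradientBoostingClassifier().fit(X_tr, y_tr)

# Validate with the area under the ROC curve, as in the study.
auc = roc_auc_score(y_te, brt.predict_proba(X_te)[:, 1])
print(round(auc, 3))  # high AUC on this easy, separable data
```

    The same pattern, repeated for CART and RF models on the real spring inventory, yields the AUC comparison (0.8103 vs. 0.7870 vs. 0.7119) reported in the abstract.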

  20. Software Design Level Vulnerability Classification Model

    OpenAIRE

    Shabana Rehman; Khurram Mustafa

    2012-01-01

    Classification of software security vulnerabilities no doubt facilitates the understanding of security-related information and accelerates vulnerability analysis. The lack of a proper classification not only hinders understanding but also hampers the strategy of developing mitigation mechanisms for clustered vulnerabilities. Software developers and researchers now agree that the requirement and design phases of the software are the phases where security incorporation yields maximum...

  1. Development Of An Econometric Model Case Study: Romanian Classification System

    Directory of Open Access Journals (Sweden)

    Savescu Roxana

    2015-08-01

    Full Text Available The purpose of this paper is to illustrate an econometric model used to predict the lean meat content of pig carcasses from muscle thickness and back fat thickness measured with an optical probe (OptiGrade PRO). The analysis goes through all the steps involved in developing the model: statement of theory, specification of the mathematical model, sampling and collection of data, estimation of the parameters of the chosen econometric model, tests of the hypotheses derived from the model, and prediction equations. The data were collected in a controlled experiment conducted by the Romanian Carcass Classification Commission in 2007. The purpose of the experiment was to develop the prediction formulae used in implementing the SEUROP classification system imposed by European Union legislation. The research methodology consisted in reviewing the existing literature and normative acts, analyzing the primary data provided by the organization conducting the experiment, and interviewing representatives of the working team that participated in the trial.
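
    The parameter-estimation step can be sketched with ordinary least squares on synthetic carcass measurements; the coefficients below are invented for illustration and are not the SEUROP formulae from the 2007 trial:

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic probe measurements for 60 carcasses.
muscle = rng.uniform(40, 80, 60)   # muscle thickness [mm]
fat = rng.uniform(8, 30, 60)       # back fat thickness [mm]
# "True" relationship (assumed): lean % falls with fat, rises with muscle.
lean = 65 + 0.1 * muscle - 0.7 * fat + rng.normal(0, 0.5, 60)

# Estimate the prediction equation lean = b0 + b1*muscle + b2*fat by OLS.
A = np.column_stack([np.ones(60), muscle, fat])
coef, *_ = np.linalg.lstsq(A, lean, rcond=None)
intercept, b_muscle, b_fat = coef
print(round(b_fat, 2))  # should recover roughly -0.7
```

    The fitted coefficients give the prediction equation that a classification line would then apply to each measured carcass.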

  2. A Chemistry-Based Classification for Peridotite Xenoliths

    Science.gov (United States)

    Block, K. A.; Ducea, M.; Raye, U.; Stern, R. J.; Anthony, E. Y.; Lehnert, K. A.

    2007-12-01

    The development of a petrological and geochemical database for mantle xenoliths is important for interpreting EarthScope geophysical results. Interpretation of compositional characteristics of xenoliths requires a sound basis for comparing geochemical results, even when no petrographic modes are available. Peridotite xenoliths are generally classified on the basis of mineralogy (Streckeisen, 1973) derived from point-counting methods. Modal estimates, particularly on heterogeneous samples, are conducted using various methodologies and are therefore subject to large statistical error. Also, many studies simply do not report the modes. Other classifications for peridotite xenoliths based on host matrix or tectonic setting (cratonic vs. non-cratonic) are poorly defined and provide little information on where samples from transitional settings fit within a classification scheme (e.g., xenoliths from circum-cratonic locations). We present here a classification for peridotite xenoliths based on bulk rock major element chemistry, which is one of the most common types of data reported in the literature. A chemical dataset of over 1150 peridotite xenoliths is compiled from two online geochemistry databases, the EarthChem Deep Lithosphere Dataset and from GEOROC (http://www.earthchem.org), and is downloaded with the rock names reported in the original publications. Ternary plots of combinations of the SiO2- CaO-Al2O3-MgO (SCAM) components display sharp boundaries that define the dunite, harzburgite, lherzolite, or wehrlite-pyroxenite fields and provide a graphical basis for classification. In addition, for the CaO-Al2O3-MgO (CAM) diagram, a boundary between harzburgite and lherzolite at approximately 19% CaO is defined by a plot of over 160 abyssal peridotite compositions calculated from observed modes using the methods of Asimow (1999) and Baker and Beckett (1999). We anticipate that our SCAM classification is a first step in the development of a uniform basis for
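
    The CAM boundary mentioned above can be expressed as a simple rule. The ~19% CaO cut follows the text; the oxide values in the example calls, and the restriction to a single two-field boundary, are illustrative only:

```python
def cam_classify(cao, al2o3, mgo):
    """Classify a peridotite from bulk CaO-Al2O3-MgO (wt%).

    The three components are renormalized to 100% (CAM ternary
    coordinates) and a single CaO boundary is applied; the full SCAM
    scheme has additional fields (dunite, wehrlite-pyroxenite).
    """
    total = cao + al2o3 + mgo
    cao_pct = 100.0 * cao / total
    # ~19% CaO in CAM space separates harzburgite from lherzolite.
    return "lherzolite" if cao_pct >= 19.0 else "harzburgite"

print(cam_classify(cao=3.2, al2o3=3.8, mgo=41.0))  # CaO ≈ 6.7% → harzburgite
print(cam_classify(cao=3.4, al2o3=4.2, mgo=8.0))   # CaO ≈ 21.8% → lherzolite
```

    Because the rule needs only bulk major-element chemistry, it can be applied uniformly to compiled analyses even when no petrographic modes were reported.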

  3. Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation.

    Science.gov (United States)

    Sun, Rui; Zhang, Guanghai; Yan, Xiaoxing; Gao, Jun

    2016-01-01

    Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex backgrounds, and occlusion, pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance the invariance to varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discriminative information embedded in the hierarchical local features, with a Gaussian weight function as the measure to effectively handle occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods. PMID:27537888

  4. Pixel classification based color image segmentation using quaternion exponent moments.

    Science.gov (United States)

    Wang, Xiang-Yang; Wu, Zhi-Fang; Chen, Liang; Zheng, Hong-Liang; Yang, Hong-Ying

    2016-02-01

    Image segmentation remains an important, but hard-to-solve, problem since it appears to be application dependent, with usually no a priori information available regarding the image structure. In recent years, many image segmentation algorithms have been developed, but they are often very complex and undesired results occur frequently. In this paper, we propose a pixel-classification-based color image segmentation method using quaternion exponent moments. Firstly, the pixel-level image feature is extracted based on quaternion exponent moments (QEMs), which can effectively capture the image pixel content by considering the correlation between different color channels. Then, the pixel-level image feature is used as input to a twin support vector machines (TSVM) classifier, and the TSVM model is trained by selecting the training samples with Arimoto entropy thresholding. Finally, the color image is segmented with the trained TSVM model. The proposed scheme has the following advantages: (1) effective QEMs are introduced to describe color image pixel content, considering the correlation between different color channels; (2) an excellent TSVM classifier is utilized, which has lower computation time and higher classification accuracy. Experimental results show that our proposed method has very promising segmentation performance compared with the state-of-the-art segmentation approaches recently proposed in the literature. PMID:26618250

  5. Calibration of a Plastic Classification System with the Ccw Model

    International Nuclear Information System (INIS)

    This document describes the calibration of a plastic classification system with the Ccw model (Classification by Quantum's built with Wavelet Coefficients). The method is applied to spectra of plastics usually present in domestic wastes. The obtained results are shown. (Author) 16 refs

  6. Program Classification for Performance-Based Budgeting

    OpenAIRE

    Robinson, Marc

    2013-01-01

    This guide provides practical guidance on program classification, that is, on how to define programs and their constituent elements under a program budgeting system. Program budgeting is the most widespread form of performance budgeting as applied to the government budget as a whole. The defining characteristics of program budgeting are: (1) funds are allocated in the budget to results-bas...

  7. A Fuzzy Logic Based Sentiment Classification

    Directory of Open Access Journals (Sweden)

    J.I.Sheeba

    2014-07-01

    Full Text Available Sentiment classification aims to detect information such as opinions and explicit or implicit feelings expressed in text. Most existing approaches are able to detect either explicit or implicit expressions of sentiment in text, but not both. The proposed framework detects both implicit and explicit expressions available in meeting transcripts. It classifies positive, negative, and neutral words, and also identifies the topic of the particular meeting transcript, by using fuzzy logic. This paper aims to add additional features to improve the classification method. The quality of the sentiment classification is improved using the proposed fuzzy logic framework, which includes features such as fuzzy rules and the Fuzzy C-means algorithm. The quality of the output is evaluated using parameters such as precision, recall, and f-measure; the Fuzzy C-means clustering is measured in terms of purity and entropy. The dataset was validated using the 10-fold cross validation method, and a 95% confidence interval was observed for the accuracy values. Finally, the proposed fuzzy logic method produced more than 85% accurate results, with a much lower error rate than existing sentiment classification techniques.
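The Fuzzy C-means component named above assigns every point a membership degree in each cluster rather than a hard label. A minimal NumPy sketch of the standard FCM updates follows; the 1-D data, cluster count, and deterministic initialisation are illustrative choices, not details from the paper:

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=50):
    """Minimal Fuzzy C-means: alternate membership and center updates.

    m > 1 controls fuzziness; memberships in each row sum to 1.
    """
    # Deterministic init for the demo: spread initial centers across the data.
    centers = X[np.linspace(0, len(X) - 1, c).astype(int)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)      # membership update
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # center update
    return U, centers

X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
U, centers = fuzzy_cmeans(X, c=2)
print(np.round(U, 3))  # two well-separated groups get near-crisp memberships
```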

  8. Model classification rate control algorithm for video coding

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    A model classification rate control method for video coding is proposed. The macroblocks are classified according to their prediction errors, and different parameters are used in the rate-quantization and distortion-quantization models. The different model parameters are calculated from the previous frame of the same type during coding. These models are used to estimate the relations among rate, distortion and quantization of the current frame. Further steps, such as R-D optimization based quantization adjustment and smoothing of quantization between adjacent macroblocks, are used to improve quality. The experimental results prove that the technique is effective and easily realized. The method presented in this paper can serve as a good rate control approach for MPEG and H.264.

  9. A novel neural network based image reconstruction model with scale and rotation invariance for target identification and classification for Active millimetre wave imaging

    Science.gov (United States)

    Agarwal, Smriti; Bisht, Amit Singh; Singh, Dharmendra; Pathak, Nagendra Prasad

    2014-12-01

    Millimetre wave (MMW) imaging is gaining tremendous interest among researchers, with potential applications in security checks, standoff personal screening, automotive collision avoidance, and more. Current state-of-the-art imaging techniques, viz. microwave and X-ray imaging, suffer from lower resolution and harmful ionizing radiation, respectively. In contrast, MMW imaging operates at lower power and is non-ionizing, hence medically safe. Despite these favourable attributes, MMW imaging faces various challenges: it is still a relatively unexplored area and lacks a suitable imaging methodology for extracting complete target information. In view of these challenges, an MMW active imaging radar system at 60 GHz was designed for standoff imaging applications. A C-scan (horizontal and vertical scanning) methodology was developed that provides a cross-range resolution of 8.59 mm. The paper further details a suitable target identification and classification methodology. For identification of regular-shape targets, a mean-standard deviation based segmentation technique was formulated and validated using a different target shape. For classification, a probability density function based target material discrimination methodology was proposed and validated on a different dataset. Lastly, a novel artificial neural network based, scale and rotation invariant image reconstruction methodology is proposed to counter distortions in the image caused by noise, rotation, or scale variations. Once trained with sample images, the designed neural network automatically handles these deformations and successfully reconstructs the corrected image for the test targets. The techniques developed in this paper are tested and validated using four regular shapes, viz. rectangle, square, triangle and circle.
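The mean-standard deviation segmentation idea can be sketched as a simple global threshold on deviation from the image mean; the toy image and the sensitivity factor k below are our own illustrative values, not parameters from the paper:

```python
import numpy as np

def mean_std_segment(image, k=1.0):
    """Flag target pixels whose intensity deviates from the global mean
    by more than k standard deviations (k is an assumed sensitivity knob)."""
    mu, sigma = image.mean(), image.std()
    return np.abs(image - mu) > k * sigma

# Toy radar image: low background with a bright target patch.
img = np.array([[10., 11., 10., 55.],
                [ 9., 12., 60., 58.],
                [10., 10., 57., 56.],
                [11.,  9., 10., 10.]])
mask = mean_std_segment(img, k=1.0)
print(mask.astype(int))  # the five bright pixels are flagged
```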

  10. Hadoop-based Multi-classification Fusion for Intrusion Detection

    OpenAIRE

    Xun-Yi Ren; Yu-Zhu Qi

    2013-01-01

    Intrusion detection system is the most important security technology in computer network, currently clustering and classification of data mining technology are often used to build detection model. However, different classification and clustering device has its own advantages and disadvantages and the testing result of detection model is not ideal. Cloud Computing, which can integrate multiple inexpensive computing nodes into a distributed system with a stro...

  11. Egocentric visual event classification with location-based priors

    OpenAIRE

    Sundaram, Sudeep; Mayol-Cuevas, Walterio

    2010-01-01

    We present a method for visual classification of actions and events captured from an egocentric point of view. The method tackles the challenge of a moving camera by creating deformable graph models for classification of actions. Action models are learned from low resolution, roughly stabilized difference images acquired using a single monocular camera. In parallel, raw images from the camera are used to estimate the user's location using a visual Simultaneous Localization and Mapping (SLAM) ...

  12. A NEW WASTE CLASSIFYING MODEL: HOW WASTE CLASSIFICATION CAN BECOME MORE OBJECTIVE?

    Directory of Open Access Journals (Sweden)

    Burcea Stefan Gabriel

    2015-07-01

    Full Text Available The waste management specialist must be able to identify and analyze waste generation sources and to propose proper solutions to prevent waste generation and encourage waste minimisation. In certain situations, such as implementing an integrated waste management system and configuring waste collection methods and capacities, practitioners face the challenge of classifying the generated waste. This tends to be all the more demanding as the literature does not provide a coherent system of criteria for an objective waste classification process. Waste incineration will no doubt lead to a different waste classification than waste composting or mechanical and biological treatment. The main question, then, is: what are the proper classification criteria that can be used for an objective waste classification? The article provides a short critical literature review of the existing waste classification criteria and suggests that the literature cannot provide a unitary waste classification system that is unanimously accepted and assumed by ideologists and practitioners. There are various classification criteria and interesting perspectives in the literature regarding waste classification, but the most common criteria by which specialists classify waste into classes, categories and types are the generation source, physical and chemical features, aggregation state, origin or derivation, hazardous degree, etc. The traditional classification criteria divide waste into various categories, subcategories and types; such an approach is conjectural because, inevitably, the criteria used differ significantly according to the context in which the waste classification is required; hence the need to standardize waste classification systems. The first part of the article is based on the indirect observation research method, analyzing the literature and the various

  13. Computer vision-based limestone rock-type classification using probabilistic neural network

    Institute of Scientific and Technical Information of China (English)

    Ashok Kumar Patel; Snehamoy Chatterjee

    2016-01-01

    Proper quality planning of limestone raw materials is essential for maintaining the desired feed in a cement plant. Rock-type identification is an integral part of quality planning for a limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory scale vision-based model was developed using a probabilistic neural network (PNN) with color histogram features as input. The color image histogram-based features, which include weighted mean, skewness and kurtosis, are extracted for all three color channels: red, green, and blue. A total of nine features are used as input for the PNN classification model. The smoothing parameter for the PNN model is selected judiciously to develop an optimal, or close to optimal, classification model. The developed PNN is validated using the test data set, and results reveal that the proposed vision-based model can perform satisfactorily for classifying limestone rock-types. Overall, the misclassification error is below 6%. When compared with three other classification algorithms, it is observed that the proposed method performs substantially better than all three.
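The nine-feature input described above (weighted mean, skewness and kurtosis of each of the R, G, B histograms) might be computed along these lines; the bin count and the random stand-in image are our own assumptions, not values from the paper:

```python
import numpy as np

def histogram_moments(channel, bins=32):
    """Weighted mean, skewness and kurtosis of one colour channel's
    histogram; three moments x three channels give nine PNN inputs."""
    hist, edges = np.histogram(channel, bins=bins, range=(0, 256))
    p = hist / hist.sum()                       # normalised histogram weights
    centers = (edges[:-1] + edges[1:]) / 2.0    # bin mid-points
    mean = np.sum(p * centers)                  # weighted mean
    var = np.sum(p * (centers - mean) ** 2)
    std = np.sqrt(var) + 1e-12                  # guard against zero variance
    skew = np.sum(p * ((centers - mean) / std) ** 3)
    kurt = np.sum(p * ((centers - mean) / std) ** 4)
    return mean, skew, kurt

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(64, 64, 3))    # stand-in for a rock image
features = np.concatenate([histogram_moments(rgb[..., c]) for c in range(3)])
print(features.shape)  # (9,)
```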

  14. Power Disturbances Classification Using S-Transform Based GA-PNN

    Science.gov (United States)

    Manimala, K.; Selvi, K.

    2015-09-01

    The significance of detection and classification of power quality events that disturb the voltage and/or current waveforms in electrical power distribution networks is well known. Nevertheless, in spite of a large number of research reports in this area, the selection of proper parameters for specific classifiers has so far not been explored. Parameter selection is very important for successfully modelling the input-output relationship in a function approximation model. In this study, a probabilistic neural network (PNN) has been used as a function approximation tool for power disturbance classification, and a genetic algorithm (GA) is utilised for optimisation of the smoothing parameter of the PNN. The important features extracted from the raw power disturbance signal using the S-Transform are given to the PNN for effective classification. The choice of smoothing parameter significantly impacts the classification accuracy of the PNN classifier. Hence, GA based parameter optimisation is performed to ensure good classification accuracy by selecting a suitable parameter for the PNN classifier. Testing results show that the proposed S-Transform based GA-PNN model has better classification ability than classifiers based on the conventional grid search method for parameter selection. Noisy and practical signals are considered in the classification process to show the effectiveness of the proposed method in comparison with existing methods.
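A Parzen-window PNN of the kind being tuned here can be sketched in a few lines: the class score is the mean Gaussian kernel between the query and that class's training patterns, and sigma is the smoothing parameter a GA (or a simple grid) would search over. The data, sigma value and function names below are illustrative assumptions:

```python
import numpy as np

def pnn_classify(x, X_train, y_train, sigma):
    """Minimal PNN: score each class by the average Gaussian kernel
    between x and that class's training patterns, then pick the max."""
    scores = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        d2 = np.sum((Xc - x) ** 2, axis=1)          # squared distances
        scores[c] = np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))
    return max(scores, key=scores.get)

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(pnn_classify(np.array([0.05, 0.1]), X, y, sigma=0.5))  # -> 0
print(pnn_classify(np.array([0.95, 1.0]), X, y, sigma=0.5))  # -> 1
```

A GA-based tuner would simply evaluate validation accuracy of `pnn_classify` for candidate sigma values and keep the fittest one.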

  15. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Directory of Open Access Journals (Sweden)

    Derek G Groenendyk

    Full Text Available Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization
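The clustering step described above can be illustrated with a plain k-means over simulated response curves; the exponential "drainage curves" below are synthetic stand-ins for HYDRUS-1D output, not the study's data:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means: group soils by the similarity of their simulated
    hydrologic responses rather than by texture class."""
    # Deterministic init for the demo: spread initial centers across the data.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        d2 = ((X[:, None] - centers[None]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                 # nearest-center assignment
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Synthetic stand-ins for simulated drainage: fast vs. slow draining soils.
t = np.linspace(0.0, 5.0, 20)
rng = np.random.default_rng(1)
fast = np.exp(-2.0 * t) + 0.01 * rng.normal(size=(5, 20))
slow = np.exp(-0.3 * t) + 0.01 * rng.normal(size=(5, 20))
labels = kmeans(np.vstack([fast, slow]), k=2)
print(labels)  # the two drainage behaviours fall into separate clusters
```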

  16. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Science.gov (United States)

    Groenendyk, Derek G; Ferré, Ty P A; Thorp, Kelly R; Rice, Amy K

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape

  17. Multiclass cancer classification based on gene expression comparison

    OpenAIRE

    Yang Sitan; Naiman Daniel Q.

    2014-01-01

    As the complexity and heterogeneity of cancer is being increasingly appreciated through genomic analyses, microarray-based cancer classification comprising multiple discriminatory molecular markers is an emerging trend. Such multiclass classification problems pose new methodological and computational challenges for developing novel and effective statistical approaches. In this paper, we introduce a new approach for classifying multiple disease states associated with cancer based on gene expre...

  18. Network planning tool based on network classification and load prediction

    OpenAIRE

    Hammami, Seif eddine; Afifi, Hossam; Marot, Michel; Gauthier, Vincent

    2016-01-01

    Real Call Detail Records (CDRs) are analyzed and classified based on the Support Vector Machine (SVM) algorithm. The daily classification results in three traffic classes. We use two different algorithms, K-means and SVM, to check the classification efficiency. A second, support vector regression (SVR) based algorithm is built to make an online prediction of traffic load using the history of CDRs. These algorithms are then integrated into a network planning tool which will help cellular operators...

  19. Compensatory neurofuzzy model for discrete data classification in biomedical

    Science.gov (United States)

    Ceylan, Rahime

    2015-03-01

    Biomedical data falls into two main categories: signals and discrete data. Studies in this area accordingly concern either biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models for the classification of ECG, EMG or EEG signals. Likewise, many models exist in the literature for the classification of discrete data, taken as sample values that can be the results of blood analysis or biopsy in the medical process. No single algorithm achieves a high accuracy rate on both signal and discrete data classification. In this study, a compensatory neurofuzzy network model is presented for the classification of discrete data in the biomedical pattern recognition area. The compensatory neurofuzzy network is a hybrid, binary classifier. In this system, the parameters of the fuzzy systems are updated by the backpropagation algorithm. The realized classifier model is applied to two benchmark datasets (the Wisconsin Breast Cancer dataset and the Pima Indian Diabetes dataset). Experimental studies show that the compensatory neurofuzzy network model achieved a 96.11% accuracy rate in classification of the breast cancer dataset, and a 69.08% accuracy rate was obtained in experiments on the diabetes dataset with only 10 iterations.

  20. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  1. Fast Wavelet-Based Visual Classification

    OpenAIRE

    Yu, Guoshen; Slotine, Jean-Jacques

    2008-01-01

    We investigate a biologically motivated approach to fast visual classification, directly inspired by the recent work of Serre et al. Specifically, trading-off biological accuracy for computational efficiency, we explore using wavelet and grouplet-like transforms to parallel the tuning of visual cortex V1 and V2 cells, alternated with max operations to achieve scale and translation invariance. A feature selection procedure is applied during learning to accelerate recognition. We introduce a si...

  2. Blurred Image Classification based on Adaptive Dictionary

    OpenAIRE

    Xiaofei Zhou; Guangling Sun; Jie Yin

    2012-01-01

    Two frameworks for blurred image classification based on adaptive dictionary are proposed. Given a blurred image, instead of image deblurring, the semantic category of the image is determined by blur-insensitive sparse coefficients calculated depending on an adaptive dictionary. The dictionary is adaptive to an assumed space-invariant Point Spread Function (PSF) estimated from the input blurred image. In one of th...

  3. A classification-based review recommender

    OpenAIRE

    O'Mahony, Michael P.; Smyth, Barry

    2010-01-01

    Many online stores encourage their users to submit product or service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare...

  4. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems, owing to the missing values they contain. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. Firstly, we initialize the hypergraph. Secondly, we classify the training set by the neighborhood hypergraph. Thirdly, under the guidance of rough set theory, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, Naive Bayes, and KNN. The experimental results show that the proposed algorithm has better performance in terms of Precision, Recall, AUC, and F-measure.

  5. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  6. Hybrid Support Vector Machines-Based Multi-fault Classification

    Institute of Scientific and Technical Information of China (English)

    GAO Guo-hua; ZHANG Yong-zhong; ZHU Yu; DUAN Guang-huang

    2007-01-01

    Support Vector Machines (SVM) is a new general machine-learning tool based on the structural risk minimization principle. This characteristic is very significant for fault diagnostics when the number of fault samples is limited. Considering that SVM theory is originally designed for two-class classification, a hybrid SVM scheme is proposed for multi-fault classification of rotating machinery in our paper. Two SVM strategies, 1-v-1 (one versus one) and 1-v-r (one versus rest), are respectively adopted at different classification levels. At the parallel classification level, using the 1-v-1 strategy, the fault features extracted by various signal analysis methods are fed into multiple parallel SVMs and the local classification results are obtained. At the serial classification level, these local results are fused by one serial SVM based on the 1-v-r strategy. The hybrid SVM scheme introduced in our paper not only generalizes the performance of single binary SVMs but also improves the precision and reliability of the fault classification results. The actual testing results show the availability and suitability of this new method.
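The 1-v-1 voting logic at the parallel level can be sketched as follows. To keep the example self-contained, each binary SVM is replaced by a nearest-centroid stand-in, so this only illustrates how per-pair decisions are fused by majority vote, not a real SVM training procedure:

```python
import numpy as np
from itertools import combinations

def centroid_binary(Xa, Xb):
    """Stand-in for a trained binary SVM: decide by the nearer centroid."""
    ca, cb = Xa.mean(axis=0), Xb.mean(axis=0)
    return lambda x: 0 if np.linalg.norm(x - ca) <= np.linalg.norm(x - cb) else 1

def one_vs_one_predict(x, X, y):
    """1-v-1 strategy: one binary classifier per class pair, then fuse the
    local decisions by majority vote across all pairs."""
    classes = np.unique(y)
    votes = dict.fromkeys(classes, 0)
    for a, b in combinations(classes, 2):
        clf = centroid_binary(X[y == a], X[y == b])
        votes[(a, b)[clf(x)]] += 1          # winner of this pair gets a vote
    return max(votes, key=votes.get)

# Three toy fault classes in feature space.
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]], dtype=float)
y = np.array([0, 0, 1, 1, 2, 2])
print(one_vs_one_predict(np.array([5.2, 5.1]), X, y))  # -> 1
```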

  7. Words semantic orientation classification based on HowNet

    Institute of Scientific and Technical Information of China (English)

    LI Dun; MA Yong-tao; GUO Jian-li

    2009-01-01

    Based on the text orientation classification, a new measurement approach to semantic orientation of words was proposed. According to the integrated and detailed definition of words in HowNet, seed sets including the words with intense orientations were built up. The orientation similarity between the seed words and the given word was then calculated using the sentiment weight priority to recognize the semantic orientation of common words. Finally, the words' semantic orientation and the context were combined to recognize the given words' orientation. The experiments show that the measurement approach achieves better results for common words' orientation classification and contributes particularly to the text orientation classification of large granularities.

  8. A survey of feature selection models for classification

    Directory of Open Access Journals (Sweden)

    B. Kalpana

    2012-01-01

    Full Text Available The success of a machine learning algorithm depends on the quality of the data. The data given for classification should not contain irrelevant or redundant attributes, as these increase the processing time. The dataset selected for classification should contain the right attributes for accurate results. Feature selection is an essential data processing step prior to applying a learning algorithm. Here we discuss some basic feature selection models and evaluation functions. Experimental results are compared for individual datasets with the filter and wrapper models.
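A filter model of the kind surveyed here scores each attribute independently of any learner. A minimal sketch using absolute Pearson correlation with the label as the evaluation function (the scoring choice and the synthetic data are our own assumptions):

```python
import numpy as np

def filter_rank(X, y):
    """Filter-model feature selection: rank attributes by a learner-free
    relevance score, here |Pearson correlation| with the class label."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1]          # most relevant feature first

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)             # binary class labels
relevant = y + 0.1 * rng.normal(size=200)    # attribute that tracks the label
noise = rng.normal(size=200)                 # irrelevant attribute
X = np.column_stack([noise, relevant])
ranking = filter_rank(X, y)
print(ranking)  # [1 0]: the relevant attribute ranks first
```

A wrapper model would instead train the target classifier on candidate subsets and keep the subset with the best validation accuracy, which is more accurate but far more expensive.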

  9. Analysis of uncertainty in multi-temporal object-based classification

    Science.gov (United States)

    Löw, Fabian; Knöfel, Patrick; Conrad, Christopher

    2015-07-01

    Agricultural management increasingly uses crop maps based on classification of remotely sensed data. However, classification errors can translate to errors in model outputs, for instance agricultural production monitoring (yield, water demand) or crop acreage calculation. Hence, knowledge of the spatial variability of classifier performance is important information for the user, but this is not provided by traditional assessments of accuracy, which are based on the confusion matrix. In this study, classification uncertainty was analyzed based on the support vector machines (SVM) algorithm. SVM was applied to multi-spectral time series data of RapidEye from different agricultural landscapes and years. Entropy was calculated as a measure of classification uncertainty, based on the per-object class membership estimations from the SVM algorithm. Permuting all possible combinations of available images allowed investigating the impact of the image acquisition frequency and timing, respectively, on the classification uncertainty. Results show that multi-temporal datasets decrease classification uncertainty for different crops compared to single data sets, but there was no "one-image-combination-fits-all" solution. The number and acquisition timing of the images, for which a decrease in uncertainty could be realized, proved to be specific to a given landscape, and for each crop they differed across different landscapes. For some crops, an increase of uncertainty was observed when increasing the quantity of images, even if classification accuracy was improved. Random forest regression was employed to investigate the impact of different explanatory variables on the observed spatial pattern of classification uncertainty. It was strongly influenced by factors related to the agricultural management and training sample density. Lower uncertainties were revealed for fields close to rivers or irrigation canals. This study demonstrates that classification uncertainty estimates
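The per-object entropy measure described above can be computed directly from the class membership probabilities; the probability vectors below are illustrative, not values from the study:

```python
import numpy as np

def classification_entropy(p, eps=1e-12):
    """Shannon entropy of a per-object class membership vector:
    0 bits for a fully confident assignment, log2(K) bits when all
    K classes are equally likely (maximum uncertainty)."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log2(p + eps)))

confident = classification_entropy([0.97, 0.01, 0.01, 0.01])  # low entropy
uniform = classification_entropy([0.25, 0.25, 0.25, 0.25])    # max entropy
print(round(confident, 3), round(uniform, 3))
```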

  10. Models of parallel computation :a survey and classification

    Institute of Scientific and Technical Information of China (English)

    ZHANG Yunquan; CHEN Guoliang; SUN Guangzhong; MIAO Qiankun

    2007-01-01

    In this paper, the state-of-the-art parallel computational model research is reviewed. We introduce various models that were developed during the past decades. According to their targeted architecture features, especially memory organization, we classify these parallel computational models into three generations. These models and their characteristics are discussed based on this three-generation classification. We believe that, with the ever increasing speed gap between the CPU and memory systems, incorporating non-uniform memory hierarchy into computational models will become unavoidable. With the emergence of multi-core CPUs, the parallelism hierarchy of current computing platforms becomes more and more complicated. Describing this complicated parallelism hierarchy in future computational models becomes more and more important. A semi-automatic toolkit that can extract model parameters and their values on real computers can reduce the model analysis complexity, thus allowing more complicated models with more parameters to be adopted. Hierarchical memory and hierarchical parallelism will be two very important features that should be considered in future model design and research.

  11. Feature Extraction based Face Recognition, Gender and Age Classification

    Directory of Open Access Journals (Sweden)

    Venugopal K R

    2010-01-01

    A face recognition system with large training sets for personal identification normally attains good accuracy. In this paper, we propose a Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm that requires only small training sets and yields good results even with one image per person. The process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images such as eyes, nose and mouth are located using the Canny edge operator, and face recognition is performed. Based on texture and shape information, gender and age classification is done using Posteriori Class Probability and an Artificial Neural Network, respectively. It is observed that face recognition accuracy is 100%, while gender and age classification accuracies are around 98% and 94%, respectively.

  12. A Human Gait Classification Method Based on Radar Doppler Spectrograms

    Directory of Open Access Journals (Sweden)

    Fok Hing Chi Tivive

    2010-01-01

    An image classification technique, recently introduced for visual pattern recognition, is successfully applied to human gait classification based on radar Doppler signatures depicted in the time-frequency domain. The proposed method has three processing stages. The first two stages extract Doppler features that can effectively characterize human motion based on the nature of arm swings, and the third stage performs classification. Three types of arm motion are considered: free-arm swings, one-arm confined swings, and no-arm swings. The last two can be indicative of a human carrying objects or a person in a stressed situation. The paper discusses the steps of the proposed method for extracting distinctive Doppler features and demonstrates their contribution to the final classification rates.

  13. A NOVEL RULE-BASED FINGERPRINT CLASSIFICATION APPROACH

    Directory of Open Access Journals (Sweden)

    Faezeh Mirzaei

    2014-03-01

    Fingerprint classification is an important phase in increasing the speed of a fingerprint verification system and narrowing down the search of the fingerprint database. Fingerprint verification is still a challenging problem due to poor-quality images and the need for faster response. Classification gets even harder when just one core has been detected in the input image. This paper proposes a new classification approach that covers images with one core. The algorithm extracts singular points (cores and deltas) from the input image and performs classification based on the number, locations and surrounding area of the detected singular points. The classifier is rule-based, where the rules are generated independently of a given data set. Moreover, shortcomings of a related paper have been reported in detail. Experimental results and comparisons on the FVC2002 database show the effectiveness and efficiency of the proposed method.
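
A rule base keyed on singular-point counts might look like the sketch below. These particular rules are illustrative only (a common textbook mapping from cores/deltas to the Henry classes), not the rules derived in the paper, which also uses locations and surrounding areas.

```python
def classify_fingerprint(n_cores, n_deltas, delta_side=None):
    """Toy rule-based fingerprint classifier.

    n_cores/n_deltas: counts of detected singular points.
    delta_side: 'left' or 'right' of the core, when one delta exists.
    """
    if n_cores == 0 and n_deltas == 0:
        return "arch"
    if n_cores == 1 and n_deltas == 0:
        return "tented arch"          # one-core case handled explicitly
    if n_cores == 1 and n_deltas == 1:
        return "left loop" if delta_side == "right" else "right loop"
    if n_cores == 2 or n_deltas == 2:
        return "whorl"
    return "unknown"

print(classify_fingerprint(0, 0))           # arch
print(classify_fingerprint(1, 1, "right"))  # left loop
```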

  14. Analysis of Kernel Approach in Fuzzy-Based Image Classifications

    Directory of Open Access Journals (Sweden)

    Mragank Singhal

    2013-03-01

    This paper presents a framework for the kernel approach in fuzzy-based image classification in remote sensing. The goal of image classification is to separate images according to their visual content into two or more disjoint classes. Fuzzy logic is a relatively young theory; its major advantage is that it allows the natural description, in linguistic terms, of problems that should be solved, rather than in terms of relationships between precise numerical values. This paper describes how remote sensing data with uncertainty are handled by fuzzy-based classification using the kernel approach for land use/land cover map generation. The introduction of fuzzification using the kernel approach provides the basis for developing more robust approaches to the remote sensing classification problem. The kernel explicitly defines a similarity measure between two samples and implicitly represents the mapping of the input space to the feature space.
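
The closing sentence can be made concrete with the Gaussian (RBF) kernel, a common choice in kernel methods; the kernel and the gamma value below are illustrative, not necessarily the ones used in the paper.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: an explicit similarity between two samples
    that implicitly corresponds to a mapping into a feature space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical samples -> 1.0
print(rbf_kernel([1.0, 2.0], [4.0, 6.0]))  # distant samples -> near 0
```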

  15. Bazhenov Fm Classification Based on Wireline Logs

    Science.gov (United States)

    Simonov, D. A.; Baranov, V.; Bukhanov, N.

    2016-03-01

    This paper considers the main aspects of Bazhenov Formation interpretation and the application of automatic classification algorithms to the Kolpashev type section of the Bazhenov Formation, changing the scale of research from small to large. Machine learning algorithms help interpret the Bazhenov Formation in a reference well and in other wells. During this study, unsupervised and supervised machine learning algorithms were applied to interpret lithology and reservoir properties. This greatly simplifies the routine task of manual interpretation and reduces the cost of laboratory analysis.

  16. Assessment of optimized Markov models in protein fold classification.

    Science.gov (United States)

    Lampros, Christos; Simos, Thomas; Exarchos, Themis P; Exarchos, Konstantinos P; Papaloukas, Costas; Fotiadis, Dimitrios I

    2014-08-01

    Protein fold classification is a challenging task strongly associated with the determination of proteins' structure. In this work, we tested an optimization strategy on a Markov chain and a recently introduced Hidden Markov Model (HMM) with reduced state-space topology. The proteins with unknown structure were scored against both these models. Then the derived scores were optimized following a local optimization method. The Protein Data Bank (PDB) and the annotation of the Structural Classification of Proteins (SCOP) database were used for the evaluation of the proposed methodology. The results demonstrated that the fold classification accuracy of the optimized HMM was substantially higher compared to that of the Markov chain or the reduced state-space HMM approaches. The proposed methodology achieved an accuracy of 41.4% on fold classification, while Sequence Alignment and Modeling (SAM), which was used for comparison, reached an accuracy of 38%. PMID:25152041
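
The Markov chain scoring idea can be sketched as follows. The two-letter alphabet and toy sequences stand in for amino-acid data, and the add-one smoothing is an assumption for the sketch, not the paper's exact formulation.

```python
import math
from collections import defaultdict

def train_markov_chain(sequences, alphabet, pseudo=1.0):
    """First-order Markov chain of transition probabilities,
    with add-`pseudo` smoothing so unseen transitions score finitely."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[a][b] for b in alphabet) + pseudo * len(alphabet)
        probs[a] = {b: (counts[a][b] + pseudo) / total for b in alphabet}
    return probs

def score(seq, probs):
    """Log-likelihood of a sequence under the chain; proteins of unknown
    structure would be scored against each fold's model."""
    return sum(math.log(probs[a][b]) for a, b in zip(seq, seq[1:]))

fold_model = train_markov_chain(["ACCA", "ACCCA"], "AC")
# A sequence resembling the training data scores higher than a shuffled one.
print(score("ACCA", fold_model) > score("CAAC", fold_model))
```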

  17. A Novel Fault Classification Scheme Based on Least Square SVM

    OpenAIRE

    Dubey, Harishchandra; Tiwari, A. K.; Nandita; Ray, P. K.; Mohanty, S. R.; Kishor, Nand

    2016-01-01

    This paper presents a novel approach for fault classification and section identification in a series compensated transmission line based on a least square support vector machine. The current signal corresponding to one-fourth of the post-fault cycle is used as input to the proposed modular LS-SVM classifier. The proposed scheme uses four binary classifiers: three for the selection of the three phases and a fourth for ground detection. The proposed classification scheme is found to be accurate and reliable in ...

  18. Feature Extraction based Face Recognition, Gender and Age Classification

    OpenAIRE

    Venugopal K R; L M Patnaik; Ramesha K; K B Raja

    2010-01-01

    A face recognition system with large training sets for personal identification normally attains good accuracy. In this paper, we propose a Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm that requires only small training sets and yields good results even with one image per person. The process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images like eyes, nose, mouth etc. are loc...

  19. A wavelet transform based feature extraction and classification of cardiac disorder.

    Science.gov (United States)

    Sumathi, S; Beaulah, H Lilly; Vanithamani, R

    2014-09-01

    This paper presents an intelligent diagnosis system using a hybrid Adaptive Neuro-Fuzzy Inference System (ANFIS) model for classification of electrocardiogram (ECG) signals. The method is based on the Symlet wavelet transform for analyzing ECG signals and extracting parameters related to dangerous cardiac arrhythmias. These parameters are used as input to the ANFIS classifier for five important types of ECG signals: Normal Sinus Rhythm (NSR), Atrial Fibrillation (AF), Pre-Ventricular Contraction (PVC), Ventricular Fibrillation (VF), and Ventricular Flutter (VFLU) Myocardial Ischemia. The inclusion of ANFIS in complex investigating algorithms yields very interesting recognition and classification capabilities across a broad spectrum of biomedical engineering. The performance of the ANFIS model was evaluated in terms of training performance and classification accuracy. The results indicate that the proposed ANFIS model shows a potential advantage in classifying ECG signals; a classification accuracy of 98.24% is achieved. PMID:25023652
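
A minimal sketch of wavelet-based feature extraction: the paper uses Symlet wavelets, but a Haar transform (with a simple averaging normalization) keeps the example short, and the beat samples below are made up.

```python
def haar_dwt(signal):
    """One level of a Haar-style discrete wavelet transform:
    pairwise averages (approximation) and half-differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_features(signal, levels=3):
    """Energy of each detail band, a common compact ECG feature set
    to feed a classifier such as ANFIS."""
    feats = []
    for _ in range(levels):
        signal, detail = haar_dwt(signal)
        feats.append(sum(d * d for d in detail))
    return feats

beat = [0.0, 0.1, 0.9, 1.2, 0.3, -0.2, 0.0, 0.05]  # made-up samples
print(wavelet_features(beat))  # one energy value per decomposition level
```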

  20. Vertebrae classification models - Validating classification models that use morphometrics to identify ancient salmonid (Oncorhynchus spp.) vertebrae to species

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Using morphometric characteristics of modern salmonid (Oncorhynchus spp.) vertebrae, we have developed classification models to identify salmonid vertebrae to the...

  1. Land Cover Classification from Full-Waveform LIDAR Data Based on Support Vector Machines

    Science.gov (United States)

    Zhou, M.; Li, C. R.; Ma, L.; Guan, H. C.

    2016-06-01

    In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in the Miyun area. The obtained backscattered full waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, consisting of distance, intensity, Full Width at Half Maximum (FWHM) and backscattering cross-section, were corrected and used as training attributes to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in the Miyun area as ground, trees, buildings and farmland. The classification results for these four land cover types were evaluated against ground truth information derived from CCD image data of the Miyun area. The proposed classification algorithm achieved an overall accuracy of 90.63%. To put the SVM results in context, they were compared with those of an Artificial Neural Network (ANN) method, and the SVM method achieved better classification results.
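
Of the extracted features, the FWHM follows directly from the width parameter of each Gaussian component recovered by waveform decomposition, via the standard identity FWHM = 2*sqrt(2*ln 2)*sigma (the sigma value below is made up).

```python
import math

def fwhm_from_sigma(sigma):
    """FWHM of a Gaussian echo pulse with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma

# A decomposed return with an assumed width parameter of 1.5 ns:
print(round(fwhm_from_sigma(1.5), 3))  # 3.532
```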

  2. Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system

    OpenAIRE

    Sanz Delgado, José Antonio; Galar Idoate, Mikel; Jurío Munárriz, Aránzazu; Brugos Larumbe, Antonio; Pagola Barrio, Miguel; Bustince Sola, Humberto

    2013-01-01

    Objective: To develop a classifier that tackles the problem of determining the risk of a patient of suffering from a cardiovascular disease within the next ten years. The system has to provide both a diagnosis and an interpretable model explaining the decision. In this way, doctors are able to analyse the usefulness of the information given by the system. Methods: Linguistic fuzzy rule-based classification systems are used, since they provide a good classification rate and a highly interpreta...

  3. A Multi-Lead ECG Classification Based on Random Projection Features

    OpenAIRE

    Bogdanova Vandergheynst, Iva; Vallejos, Rincon; Javier, Francisco; Atienza Alonso, David

    2012-01-01

    This paper presents a novel method for classification of multilead electrocardiogram (ECG) signals. The feature extraction is based on the random projection (RP) concept for dimensionality reduction. Furthermore, the classification is performed by a neuro-fuzzy classifier. Such a model can be easily implemented on portable systems for practical applications in both health monitoring and diagnostic purposes. Moreover, the RP implementation on portable systems is very challenging featuring both...

  4. A Multi-Lead Ecg Classification Based On Random Projection Features

    OpenAIRE

    Bogdanova, Iva; Rincon, Francisco; Atienza, David

    2012-01-01

    This paper presents a novel method for classification of multi-lead electrocardiogram (ECG) signals. The feature extraction is based on the random projection (RP) concept for dimensionality reduction. Furthermore, the classification is performed by a neuro-fuzzy classifier. Such a model can be easily implemented on portable systems for practical applications in both health monitoring and diagnostic purposes. Moreover, the RP implementation on portable systems is very challenging featuring bot...
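
Random projection for dimensionality reduction can be sketched in a few lines; the dimensions below are illustrative, not those used for the ECG signals in the paper.

```python
import random

def random_projection_matrix(d_in, d_out, seed=0):
    """Gaussian random projection matrix with entries ~ N(0, 1/d_out),
    so pairwise distances are approximately preserved
    (Johnson-Lindenstrauss lemma)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0 / d_out ** 0.5) for _ in range(d_in)]
            for _ in range(d_out)]

def project(x, R):
    """Matrix-vector product: the reduced feature vector."""
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) for row in R]

# Reduce a hypothetical 1000-sample ECG segment to 32 random features.
R = random_projection_matrix(1000, 32)
x = [float(i % 7) for i in range(1000)]
print(len(project(x, R)))  # 32
```

The cheapness of the projection (a fixed matrix multiply, no training) is what makes the approach attractive on portable monitoring hardware.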

  5. A Soft Intelligent Risk Evaluation Model for Credit Scoring Classification

    Directory of Open Access Journals (Sweden)

    Mehdi Khashei

    2015-09-01

    Risk management is one of the most important branches of business and finance. Classification models are the most popular and widely used analytical group of data mining approaches that can greatly help financial decision makers and managers to tackle credit risk problems. However, the literature clearly indicates that, despite numerous proposed classification models, credit scoring is often a difficult task, and there is no universal credit-scoring model that can be used accurately and explanatorily in all circumstances. Therefore, research on improving the efficiency of credit-scoring models has never stopped. In this paper, a hybrid soft intelligent classification model is proposed for credit-scoring problems. In the proposed model, the unique advantages of soft computing techniques are used to improve the performance of traditional artificial neural networks in credit scoring. Empirical results on Australian credit card data indicate that the proposed hybrid model outperforms its components as well as other classification models presented for credit scoring. The proposed model can therefore be considered an appropriate alternative tool for binary decision making in business and finance, especially under high uncertainty.

  6. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models.

    Science.gov (United States)

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
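
The error metrics used above to compare the three models can be computed as follows; the observed and predicted Cd values below are made up for illustration, not the study's data.

```python
def error_metrics(obs, pred):
    """Mean error (bias), mean absolute error, and root mean squared
    error, the scores used to compare the SLR, CART and RF models."""
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = (sum(e * e for e in errors) / n) ** 0.5
    return me, mae, rmse

obs = [0.20, 0.35, 0.50, 0.41]   # hypothetical observed Cd (mg/kg)
pred = [0.22, 0.30, 0.55, 0.40]  # hypothetical model predictions
print(error_metrics(obs, pred))
```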

  7. Analysis of data mining based customer classification model for electric power industry

    Institute of Scientific and Technical Information of China (English)

    宋才华; 王永才; 蓝源娟; 郑锦卿

    2014-01-01

    The continuous development of the social economy has promoted the rapid development of the electric power industry. With the development of customer classification theory and methods, they have been widely used in the marketing practice of China's electric power, telecommunications, banking and retail industries. Because customer behavior effectively and directly reflects consumer demand, it is widely applied in the market economy and is the best starting point for market segmentation. Proceeding from an overview of customer classification, this paper discusses the status quo of electric power customer classification, customer classification based on data mining, and model construction.

  8. Classification approach based on association rules mining for unbalanced data

    CERN Document Server

    Ndour, Cheikh

    2012-01-01

    This paper deals with supervised classification when the response variable is binary and its class distribution is unbalanced. In such a situation, it is not possible to build a powerful classifier using standard methods such as logistic regression, classification trees, discriminant analysis, etc. To overcome this shortcoming of these methods, which provide classifiers with low sensitivity, we tackle the classification problem through an approach based on association rules learning, because this approach has the advantage of allowing the identification of patterns that are well correlated with the target class. Association rules learning is a well-known method in the area of data mining, used with large databases for unsupervised discovery of local patterns that express hidden relationships between variables. In considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained from which one derives a classification rule...
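
The rule-mining step can be sketched as follows: enumerate small patterns and keep those whose confidence toward the (minority) target class exceeds a threshold, each surviving rule acting as a weak classifier. The data, the threshold, and the restriction to one- and two-item patterns are illustrative assumptions.

```python
from itertools import combinations

def mine_rules(records, labels, target, min_conf=0.6):
    """Mine rules 'pattern -> target' ranked by confidence.

    Each record is a set of categorical items;
    confidence(pattern -> target) = P(label == target | pattern in record).
    """
    items = sorted({i for r in records for i in r})
    patterns = [frozenset([i]) for i in items] + \
               [frozenset(c) for c in combinations(items, 2)]
    rules = []
    for p in patterns:
        covered = [y for r, y in zip(records, labels) if p <= r]
        if covered:
            conf = covered.count(target) / len(covered)
            if conf >= min_conf:
                rules.append((set(p), conf))
    return rules

records = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b"}, {"c"}]
labels = [1, 0, 0, 1, 0]  # class 1 is the rare target class
print(mine_rules(records, labels, target=1))
```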

  9. Ensemble polarimetric SAR image classification based on contextual sparse representation

    Science.gov (United States)

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun

    2016-05-01

    Polarimetric SAR image interpretation has become one of the most interesting topics, in which the construction of a reasonable and effective image classification technique is of key importance. Sparse representation represents the data using the most succinct sparse atoms of an over-complete dictionary, and its advantages have also been confirmed in the field of PolSAR classification. However, like any ordinary classifier, it is not perfect in every respect. Ensemble learning is therefore introduced to address this issue: a number of different learners are trained and their individual results are combined to obtain more accurate learning results. This paper presents a polarimetric SAR image classification method based on ensemble learning over sparse representations to achieve optimal classification.

  10. Blurred Image Classification Based on Adaptive Dictionary

    Directory of Open Access Journals (Sweden)

    Guangling Sun

    2013-02-01

    Two frameworks for blurred image classification based on an adaptive dictionary are proposed. Given a blurred image, instead of image deblurring, the semantic category of the image is determined by blur-insensitive sparse coefficients calculated depending on an adaptive dictionary. The dictionary is adaptive to an assumed space-invariant Point Spread Function (PSF) estimated from the input blurred image. In one of the two proposed frameworks, the PSF is inferred separately; in the other, the PSF is updated together with the sparse coefficient calculation in an alternating, iterative manner. The experiments evaluated three types of blur, namely defocus blur, simple motion blur and camera shake blur. The experimental results confirm the effectiveness of the proposed frameworks.

  11. Genetic Programming for the Generation of Crisp and Fuzzy Rule Bases in Classification and Diagnosis of Medical Data

    DEFF Research Database (Denmark)

    Dounias, George; Tsakonas, Athanasios; Jantzen, Jan; Axer, Hubertus; Bjerregaard, Beth; Keyserlingk, Diedrich Graf von

    2002-01-01

    the classification between all common types. A third model consisted of a GP-generated fuzzy rule-based system is tested on the same domain. The second medical domain is the classification of Pap-Smear Test examinations where a crisp rule-based system is constructed. Results denote the effectiveness...

  12. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of tumor cells to the microenvironment; it can be passed between cells through mechanisms of intercellular transfer of genetic information (exosomes); and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, represented at the genetic level by DNA, to improve reliability, and their analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is a byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  13. ELABORATION OF A VECTOR BASED SEMANTIC CLASSIFICATION OVER THE WORDS AND NOTIONS OF THE NATURAL LANGUAGE

    OpenAIRE

    Safonov, K.; Lichargin, D.

    2009-01-01

    The problem of vector-based semantic classification over the words and notions of the natural language is discussed. A set of generative grammar rules is offered for generating the semantic classification vector. Examples of the classification application and a theorem of optional formal classification incompleteness are presented. The principles of assigning the meaningful phrases functions over the classification word groups are analyzed.

  14. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT or MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types, with WWR put aside. Pattern classification was employed to train a patient-specific single-trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated using test data that were independent of the training data. In a subsequent analysis, the classification model just established was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that contributed most to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum, which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely used MT/LT symptom grouping scheme. In addition, WWR play a similar role to the LT, and thus should be placed in the LT type.

  15. A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model.

    Science.gov (United States)

    Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F

    2014-06-01

    To ensure appropriate soundscape management in urban environments, urban-planning authorities need a range of tools that enable such a task to be performed. An essential step in managing urban areas from a sound standpoint is the evaluation of the soundscape in the area; it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step in evaluating it, providing a basis for designing or adapting it to match people's expectations as well. Accordingly, this work proposes a model for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria, intended to be used as a tool for comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented to develop the classification model. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). PMID:24007752

  16. Co-occurrence Models in Music Genre Classification

    DEFF Research Database (Denmark)

    Ahrendt, Peter; Goutte, Cyril; Larsen, Jan

    2005-01-01

    Music genre classification has been investigated using many different methods, but most of them build on probabilistic models of feature vectors x_r which only represent the short time segment with index r of the song. Here, three different co-occurrence models are proposed which instead consider...... difficult 11 genre data set with a variety of modern music. The basis was a so-called AR feature representation of the music. Besides the benefit of having proper probabilistic models of the whole song, the lowest classification test errors were found using one of the proposed models....

  17. A Bayes fusion method based ensemble classification approach for Brown cloud application

    Directory of Open Access Journals (Sweden)

    M.Krishnaveni

    2014-03-01

    Classification is the recurrent task of determining a target function that maps each attribute set to one of the predefined class labels. Ensemble fusion is a classifier fusion technique that combines multiple classifiers to achieve higher classification accuracy than individual classifiers. The main objective of this paper is to combine base classifiers using the ensemble fusion methods Decision Template, Dempster-Shafer and Bayes, and to compare the accuracy of each fusion method on a brown cloud dataset. The base classifiers KNN, MLP and SVM are considered in the ensemble, each with four different function parameters. The experimental study shows that on the brown cloud image dataset, the Bayes fusion method achieves a better classification accuracy (95%) than Decision Template (80%) and Dempster-Shafer (85%).
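
A product-rule (naive Bayes) fusion of per-classifier posteriors can be sketched as follows; the class names and probability values are made up, and the paper's exact fusion formula may differ.

```python
def bayes_fusion(posteriors, priors):
    """Product-rule fusion of per-classifier posterior estimates.

    posteriors: list (one dict per base classifier) of class -> P(class|x).
    Assumes the base classifiers are conditionally independent.
    """
    fused = {}
    for c, prior in priors.items():
        prod = prior
        for post in posteriors:
            prod *= post[c] / prior  # divide out the shared prior
        fused[c] = prod
    total = sum(fused.values())
    return {c: v / total for c, v in fused.items()}  # renormalize

# Hypothetical outputs of three base classifiers for one sample.
knn = {"cloud": 0.7, "clear": 0.3}
mlp = {"cloud": 0.6, "clear": 0.4}
svm = {"cloud": 0.8, "clear": 0.2}
print(bayes_fusion([knn, mlp, svm], {"cloud": 0.5, "clear": 0.5}))
```

Note how three individually moderate votes for "cloud" fuse into a much more confident combined posterior.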

  18. 3D Land Cover Classification Based on Multispectral LIDAR Point Clouds

    Science.gov (United States)

    Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong

    2016-06-01

    Multispectral Lidar systems can emit simultaneous laser pulses at different wavelengths. The reflected multispectral energy is captured by the sensor's receiver, and the return signal, together with the position and orientation information of the sensor, is recorded. These recorded data are combined with GNSS/IMU data for further post-processing, forming high-density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, the Optech Titan system is capable of collecting point cloud data in all three channels: 532 nm visible (green), 1064 nm near infrared (NIR) and 1550 nm intermediate infrared (IR). It has become a new data source for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach that uses only multispectral Lidar point cloud datasets for 3D land cover classification. The approach consists of three steps. First, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Second, intensity objects are classified into nine categories using customized classification-index features and a combination of the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted by comparing random reference sample points from Google imagery tiles with the classification results. The classification results show high overall accuracy for most land cover types; over 90% overall accuracy is achieved using multispectral Lidar point clouds for 3D land cover classification.
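
The overall accuracy reported in the final accuracy-assessment step is simply the trace of the confusion matrix over the total sample count; a minimal sketch with made-up counts (3 of the 9 categories shown):

```python
def overall_accuracy(confusion):
    """Overall accuracy: correctly classified samples / all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Rows = reference class, columns = predicted class (hypothetical counts).
cm = [[90, 5, 5],
      [4, 88, 8],
      [2, 6, 92]]
print(round(overall_accuracy(cm), 3))  # 0.9
```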

  19. Super pixel density based clustering automatic image classification method

    Science.gov (United States)

    Xu, Mingxing; Zhang, Chuan; Zhang, Tianxu

    2015-12-01

    Image classification is an important means of image segmentation and data mining, and achieving rapid automated image classification has been a focus of research. This paper proposes an automatic image classification and outlier identification method based on superpixels and the density of cluster centers. Pixel location coordinates and gray values are used to compute density and distance, enabling automatic image classification and outlier extraction. Because a large number of pixels dramatically increases the computational complexity, the image is pre-processed into a small number of superpixel sub-blocks before the density and distance calculations. A normalized density-and-distance discrimination rule is designed to achieve automatic classification and cluster center selection, whereby the image is automatically classified and outliers are identified. Extensive experiments show that the method requires no human intervention, computes faster than the density clustering algorithm, and effectively achieves automated image classification and outlier extraction.
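
The density-and-distance computation at the heart of the method (in the spirit of density-peaks clustering) can be sketched as follows; the 2D points and cutoff distance are made up, standing in for superpixel centers. Points with high density and high distance are center candidates; low density with high distance flags outliers.

```python
def density_peaks(points, dc):
    """For each point: rho = number of neighbours within cutoff dc,
    delta = distance to the nearest point of strictly higher density
    (or the farthest point, for the densest point)."""
    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    n = len(points)
    rho = [sum(1 for j in range(n) if j != i and dist(points[i], points[j]) < dc)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist(points[i], points[j]) for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist(points[i], p) for p in points))
    return rho, delta

# Two tight clusters plus one isolated point (a candidate outlier).
pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (9, 0)]
rho, delta = density_peaks(pts, dc=0.5)
print(rho)  # [2, 2, 2, 1, 1, 0]
```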

  20. Network traffic classification based on ensemble learning and co-training

    Institute of Scientific and Technical Information of China (English)

    HE HaiTao; LUO XiaoNan; MA FeiTeng; CHE ChunHui; WANG JianMin

    2009-01-01

    Classification of network traffic is the essential step for many network researches. However, with the rapid evolution of Internet applications, the effectiveness of port-based or payload-based identification approaches has greatly diminished in recent years, and many researchers have turned their attention to an alternative: machine learning based methods. This paper presents a novel machine learning based classification model, which combines the ensemble learning paradigm with co-training techniques. Compared to previous approaches, most of which employ only a single classifier, our method applies multiple classifiers and semi-supervised learning, which mainly helps to overcome three shortcomings: limited flow accuracy, weak adaptability and the huge demand for labeled training sets. In this paper, statistical characteristics of IP flows are extracted from packet-level traces to establish the feature set; the classification model is then created and tested, and the empirical results prove its feasibility and effectiveness.
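
    The co-training loop outlined above can be sketched with two feature views of the same flows: each round, each view's classifier pseudo-labels the unlabeled flow it is most confident about and adds it to the shared labeled pool. A nearest-centroid learner stands in for the paper's ensemble, and the toy data and margin-based confidence are assumptions:

```python
import numpy as np

def nearest_centroid(train_X, train_y, X):
    """Predict a label plus a confidence margin (distance gap between the
    two nearest class centroids) for each row of X."""
    classes = sorted(set(train_y))
    cents = np.array([train_X[train_y == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    order = np.sort(d, axis=1)
    margin = order[:, 1] - order[:, 0]
    return np.array(classes)[d.argmin(axis=1)], margin

def co_train(view_a, view_b, y, labeled, rounds=3):
    """Co-training loop: each round, each view's classifier pseudo-labels
    its single most confident unlabeled flow for both views to reuse."""
    y = y.copy()
    labeled = set(labeled)
    for _ in range(rounds):
        for X in (view_a, view_b):
            unlabeled = [i for i in range(len(y)) if i not in labeled]
            if not unlabeled:
                return y
            idx = np.array(sorted(labeled))
            pred, margin = nearest_centroid(X[idx], y[idx], X[unlabeled])
            best = int(np.argmax(margin))        # most confident flow
            y[unlabeled[best]] = pred[best]
            labeled.add(unlabeled[best])
    return y

# Six flows, two classes; two feature "views" (e.g. size vs. timing stats).
view_a = np.array([[0.0, 0], [0.1, 0], [0.2, 0.1], [5, 5], [5.1, 5], [4.9, 5.2]])
view_b = np.array([[0.0, 0.1], [0, 0], [0.1, 0], [5, 4.9], [5, 5.1], [5.1, 5]])
y = np.array([0, -1, -1, 1, -1, -1])             # -1: unlabeled placeholder
labels = co_train(view_a, view_b, y, labeled={0, 3})
```

Starting from one labeled flow per class, the loop fills in the remaining labels from the two views' agreements.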

  1. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    This paper deals with classification of human gait types based on the notion that different gait types are in fact different types of locomotion, i.e., running is not simply walking done faster. We present the duty-factor, which is a descriptor based on this notion. The duty-factor is independent...... with known ground support. Silhouettes are extracted using the Codebook method and represented using Shape Contexts. The matching with database silhouettes is done using the Hungarian method. While manually estimated duty-factors show a clear classification the presented system contains...

  2. A Novel Multi label Text Classification Model using Semi supervised learning

    Directory of Open Access Journals (Sweden)

    Shweta C. Dharmadhikari

    2012-07-01

    Full Text Available Automatic text categorization (ATC) is a prominent research area within Information Retrieval. Through this paper a classification model for ATC in the multi-label domain is discussed. We propose a new multi-label text classification model for assigning a more relevant set of categories to every input text document. Our model is greatly influenced by a graph-based framework and semi-supervised learning. We demonstrate the effectiveness of our model using the Enron, Slashdot, Bibtex and RCV1 datasets. Our experimental results indicate that the use of semi-supervised learning in MLTC greatly improves the decision making capability of the classifier.

  3. A Novel Multi label Text Classification Model using Semi supervised learning

    Directory of Open Access Journals (Sweden)

    Shweta C. Dharmadhikari

    2012-09-01

    Full Text Available Automatic text categorization (ATC) is a prominent research area within Information Retrieval. Through this paper a classification model for ATC in the multi-label domain is discussed. We propose a new multi-label text classification model for assigning a more relevant set of categories to every input text document. Our model is greatly influenced by a graph-based framework and semi-supervised learning. We demonstrate the effectiveness of our model using the Enron, Slashdot, Bibtex and RCV1 datasets. Our experimental results indicate that the use of semi-supervised learning in MLTC greatly improves the decision making capability of the classifier.

  4. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-11-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.
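
    The difference-based RBR test can be sketched per pixel as follows; the threshold values, the multiplicative form of the haze correction, and the sample pixel values are assumptions for illustration, not the paper's fitted numbers:

```python
import numpy as np

def classify_sky(red, blue, csl_rbr, hcf=0.0, thin_t=0.05, thick_t=0.2):
    """Per-pixel cloud decision from the red-blue ratio (RBR), using the
    *difference* between pixel RBR and the clear-sky-library RBR, which
    the abstract reports outperforms the ratio."""
    rbr = red / blue
    diff = rbr - csl_rbr * (1.0 + hcf)   # hcf: haze correction (assumed form)
    labels = np.full(rbr.shape, "clear", dtype=object)
    labels[diff >= thin_t] = "thin"      # modest RBR excess: thin cloud
    labels[diff >= thick_t] = "thick"    # large RBR excess: thick cloud
    return labels

# Three pixels: clear, slightly reddened (thin), strongly reddened (thick).
red  = np.array([60.0, 62.0, 120.0])
blue = np.array([120.0, 110.0, 110.0])
csl  = np.array([0.5, 0.5, 0.5])         # clear-sky-library RBR per pixel
labels = classify_sky(red, blue, csl)
```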

  5. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-07-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.

  6. Classification Method of Text Sentiment Based on Sentiment Word Attributes and Cloud Model

    Institute of Scientific and Technical Information of China (English)

    孙劲光; 马志芳; 孟祥福

    2013-01-01

    In the era of big data, obtaining valid information from the Web has become a keen topic for business, government and researchers, and mining users' opinions is a research topic in Natural Language Processing (NLP) and text mining. However, due to the inherent fuzziness and randomness of language, and because the traditional term-weight calculation method is not suitable for sentiment words, the accuracy of text sentiment classification rarely reaches the performance of traditional text subject classification. To solve these problems, this paper proposes a sentiment classification method based on sentiment word attributes and a cloud model. It calculates the weights of sentiment words by combining the attributes and simple syntactic structure of the sentiment words, and uses the cloud model to convert between qualitative and quantitative representations of sentiment words. Experimental results show that this sentiment word weight calculation is valid, with recall of up to 78.8%. Compared with the dictionary-based method, its text sentiment classification results are more accurate, with a correct rate of up to 68.4%, an improvement of about 9 percentage points.

  7. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System

    Science.gov (United States)

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes a computationally efficient classification of SVB beats, using a simple correlation threshold criterion to find a close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability, for subsequent refined classification into the SVB or VB class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among an extended 210-sized set, embodying interactive second-order effects between the 20 independent features. The optimization process minimizes, at equal weight, the false positives in the SVB class and the false negatives in the VB class. Training with the European ST-T, AHA, and MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes), with the top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, and beat-to-template correlation. Unbiased test-validation with the MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for the SVB class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part of the VB class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3–6.8 percentage points, with the advantage of easy model-complexity configuration by pruning the tree, which consists of easily interpretable 'if-then' rules. PMID:26461492
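
    The Stage-1 correlation screen might look like the following sketch: a beat is accepted as matching the predominant normal template when its correlation with the template is high enough, and passed on to the feature-based second stage otherwise. The 0.98 threshold and the synthetic beats are assumptions, not the study's values:

```python
import numpy as np

def stage1_match(beat, template, thr=0.98):
    """Stage-1 screen: Pearson correlation of a beat against the
    predominant-normal template.  High correlation -> close match;
    otherwise the beat goes to the refined Stage-2 classification."""
    r = np.corrcoef(beat, template)[0, 1]
    return r >= thr, r

# Synthetic template and beats (real inputs would be ECG samples).
template = np.sin(np.linspace(0, 2 * np.pi, 50))
normal   = 1.1 * template + 0.01     # scaled copy: near-perfect correlation
ectopic  = np.roll(template, 12)     # shifted morphology: low correlation

ok_n, _ = stage1_match(normal, template)
ok_e, _ = stage1_match(ectopic, template)
```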

  8. A study of land use/land cover information extraction classification technology based on DTC

    Science.gov (United States)

    Wang, Ping; Zheng, Yong-guo; Yang, Feng-jie; Jia, Wei-jie; Xiong, Chang-zhen

    2008-10-01

    Decision Tree Classification (DTC) is one organizational form of a multi-level recognition system, which breaks a complicated classification into simple categories and then gradually resolves it. The paper carries out LULC Decision Tree Classification research on some areas of Gansu Province in western China. With mid-resolution remote sensing data as the main data resource, the authors adopt the decision-tree classification method, taking advantage of its fault tolerance and of the way it imitates human judgment and thinking, and build a decision-tree LULC classification pattern. The research shows that these methods and techniques can increase the level of automation and the accuracy of LULC information extraction, and can successfully extract LULC information for the research areas. The main aspects of the research are as follows: 1. We first collected training samples and established a comprehensive database supported by remote sensing and ground data. 2. Using the CART system, and based on multi-source, multi-temporal remote sensing data and other assistance data, the DTC technique effectively combined the unsupervised classification results with the experts' knowledge. The method and procedure for distilling the decision tree information were specifically developed. 3. In designing the decision tree, based on the classification rules of the various object types, we established and pruned the DTC model to achieve effective treatment of subdivision classification, and completed the land use and land cover classification of the research areas. The accuracy evaluation showed that the classification accuracy reached upwards of 80%.
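
    The kind of 'if-then' rule chain a trained decision tree yields can be illustrated with hand-written rules; the features (NDVI, NDWI, brightness) and thresholds below are hypothetical stand-ins, not the rules CART extracted in the study:

```python
def classify_pixel(ndvi, ndwi, brightness):
    """Hand-built decision-tree rules of the kind a CART run might produce
    for land use / land cover.  All thresholds are illustrative."""
    if ndwi > 0.3:                       # strong water signal first
        return "water"
    if ndvi > 0.4:                       # vegetated branch
        return "forest" if brightness < 0.35 else "cropland"
    return "bare soil" if brightness > 0.5 else "built-up"
```

Each path from root to leaf is one interpretable rule, which is what makes the pruned tree easy for experts to inspect and refine.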

  9. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers in the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machine (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  10. Object-Based Classification of Abandoned Logging Roads under Heavy Canopy Using LiDAR

    Directory of Open Access Journals (Sweden)

    Jason Sherba

    2014-05-01

    Full Text Available LiDAR-derived slope models may be used to detect abandoned logging roads in steep forested terrain. An object-based classification approach to abandoned logging road detection was employed in this study. First, a slope model of the study site in Marin County, California was created from a LiDAR-derived DEM. Multiresolution segmentation was applied to the slope model and road seed objects were iteratively grown into candidate objects. A road classification accuracy of 86% was achieved using this fully automated procedure, and post-processing increased this accuracy to 90%. In order to assess the sensitivity of the road classification to LiDAR ground point spacing, the LiDAR ground point cloud was repeatedly thinned by a fraction of 0.5 and the classification procedure was reapplied. The producer's accuracy of the road classification declined from 79% with a ground point spacing of 0.91 to below 50% with a ground point spacing of 2, indicating the importance of high point density for accurate classification of abandoned logging roads.

  11. Inductive Model Generation for Text Classification Using a Bipartite Heterogeneous Network

    Institute of Scientific and Technical Information of China (English)

    Rafael Geraldeli Rossi; Alneu de Andrade Lopes; Thiago de Paulo Faleiros; Solange Oliveira Rezende

    2014-01-01

    Algorithms for numeric data classification have been applied to text classification. Usually the vector space model is used to represent text collections. The characteristics of this representation, such as sparsity and high dimensionality, sometimes impair the quality of general-purpose classifiers. Networks can be used to represent text collections, avoiding the high sparsity and allowing relationships among the different objects that compose a text collection to be modeled. Such network-based representations can improve the quality of the classification results. One of the simplest ways to represent a textual collection by a network is through a bipartite heterogeneous network, which is composed of objects that represent the documents connected to objects that represent the terms. Heterogeneous bipartite networks do not require computation of similarities or relations among the objects and can be used to model any type of text collection. Due to the advantages of representing text collections through bipartite heterogeneous networks, in this article we present a text classifier which builds a classification model using the structure of a bipartite heterogeneous network. This algorithm, referred to as IMBHN (Inductive Model Based on Bipartite Heterogeneous Network), induces a classification model by assigning weights to the objects that represent the terms for each class of the text collection. An empirical evaluation using a large number of text collections from different domains shows that the proposed IMBHN algorithm produces significantly better results than the k-NN, C4.5, SVM, and Naive Bayes algorithms.
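
    The induction of a weight per (term, class) pair can be sketched as an error-correction loop over the document-term structure of the bipartite network: each document, scored through the terms it connects to, is pushed toward its own class. This is a simplified sketch of the IMBHN idea, with learning rate, epoch count, and toy collection all assumed:

```python
import numpy as np

def train_imbhn(doc_term, labels, n_classes, lr=0.1, epochs=200):
    """Induce a weight per (term, class) so that each document, scored
    through its terms, points to its own class (error-correction update;
    details simplified relative to the published algorithm)."""
    n_terms = doc_term.shape[1]
    W = np.zeros((n_terms, n_classes))
    for _ in range(epochs):
        scores = doc_term @ W                     # document-class scores
        target = np.eye(n_classes)[labels]        # one-hot true classes
        W += lr * doc_term.T @ (target - scores)  # push toward true class
    return W

# Tiny collection: docs 0-1 share term 0 (class 0), docs 2-3 share term 1.
X = np.array([[1.0, 0], [1, 0], [0, 1], [0, 1]])
y = np.array([0, 0, 1, 1])
W = train_imbhn(X, y, n_classes=2)
pred = (X @ W).argmax(axis=1)
```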

  12. Classification of consumers based on perceptions

    DEFF Research Database (Denmark)

    Høg, Esben; Juhl, Hans Jørn; Poulsen, Carsten Stig

    1999-01-01

    This paper reports some results from a recent Danish study of fish consumption. One purpose of the study was to identify consumer segments according to their perceptions of fish in comparison with other food categories. We present a model, which has the capabilities to determine the number of...... segments and putting in order of priority the alternatives examined. The model allows for ties, i.e. the consumer's expression of no preference among alternatives. The parameters in the model are estimated simultaneously by the method of maximum likelihood. The approach is illustrated using data from the...

  13. Classification of consumers based on perceptions

    DEFF Research Database (Denmark)

    Høg, Esben; Juhl, Hans Jørn; Poulsen, Carsten Stig

    1999-01-01

    This paper reports some results from a recent Danish study of fish consumption. One major purpose of the study was to identify consumer segments according to their perceptions of fish in comparison with other food categories. We present a model which has the capabilities to determine the number of...... segments and putting in order of priority the alternatives examined. Data consist of pairwise comparisons per respondent. The model allows for ties, i.e. the consumer's expression of no preference among alternatives. All the parameters in the model are estimated simultaneously by the method of maximum...

  14. Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

    Directory of Open Access Journals (Sweden)

    Ghazi Raho

    2015-02-01

    Full Text Available Feature selection is necessary for effective text classification, and dataset preprocessing is essential to obtain sound results and effective performance. This paper investigates the effectiveness of using feature selection. We compare the performance of different classifiers in different situations, using feature selection with and without stemming. The evaluation used a BBC Arabic dataset and different classification algorithms: the decision tree (DT), K-nearest neighbors (KNN), Naïve Bayes (NB) and Naïve Bayes Multinomial (NBM) classifiers. The experimental results are presented in terms of precision, recall, F-measure, accuracy and time to build the model.

  15. Credit Risk Evaluation Using a C-Variable Least Squares Support Vector Classification Model

    Science.gov (United States)

    Yu, Lean; Wang, Shouyang; Lai, K. K.

    Credit risk evaluation is one of the most important issues in financial risk management. In this paper, a C-variable least squares support vector classification (C-VLSSVC) model is proposed for credit risk analysis. The main idea of this model is based on the prior knowledge that different classes may have different importance for modeling and more weights should be given to those classes with more importance. The C-VLSSVC model can be constructed by a simple modification of the regularization parameter in LSSVC, whereby more weight is given to the least squares classification errors of important classes than to the least squares classification errors of unimportant classes, while keeping the regularized terms in their original form. For illustration purposes, a real-world credit dataset is used to test the effectiveness of the C-VLSSVC model.
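
    The C-variable idea of weighting the squared classification errors by class importance can be sketched in the primal with weighted regularized least squares. This is a stand-in for the LSSVC dual formulation, not the paper's model; the data, weights, and regularization value are assumptions:

```python
import numpy as np

def weighted_ls_classifier(X, y, sample_weight, lam=0.1):
    """Linear classifier by weighted regularized least squares: errors on
    high-weight samples (the important class) cost more, mimicking the
    C-variable per-class error weighting in a simple primal form."""
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append bias column
    W = np.diag(sample_weight)
    A = Xb.T @ W @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ W @ y)

X = np.array([[0.0], [1], [2], [3]])
y = np.array([-1.0, -1, 1, 1])                    # +1: the important class
weights = np.where(y > 0, 5.0, 1.0)               # heavier penalty for it
w = weighted_ls_classifier(X, y, weights)
pred = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```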

  16. A hidden Markov model based algorithm for data stream classification

    Institute of Scientific and Technical Information of China (English)

    潘怡; 何可可; 李国徽

    2014-01-01

    To improve classification accuracy on data streams with periodic concept drift, a hidden Markov model based stream data classification algorithm (HMM-SDC) is presented. The invisible concept-drift states are aligned with observable sequences through a hidden Markov chain model, so that the drifting concept can be forecast from the actual observation values. When the mean prediction error exceeds a user-defined threshold, the state transition probability matrix is updated automatically, allowing effective prediction of concept drift without re-learning historical data concepts. In addition, part of the unlabeled samples are classified by a semi-supervised K-Means method, which reduces the cost of manually labeling samples and mitigates the under-learning of the hidden Markov model caused by insufficient labeled data. The experimental results show that the new algorithm achieves better classification accuracy and timeliness than traditional ensemble classification algorithms on periodic data stream classification.
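
    The transition-matrix update and next-state prediction can be sketched with a simple count-based re-estimation; the two-state alternating history and the Laplace smoothing are illustrative assumptions, and the semi-supervised K-Means step is omitted:

```python
import numpy as np

def predict_next(trans, state):
    """Most likely next concept state under the current transition matrix."""
    return int(np.argmax(trans[state]))

def update_transitions(counts, seq):
    """Re-estimate the row-stochastic transition matrix from observed state
    history -- the 'update when prediction error exceeds the threshold'
    step, in its simplest count-based form."""
    for a, b in zip(seq, seq[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Two recurring concepts that alternate periodically: 0, 1, 0, 1, ...
history = [0, 1, 0, 1, 0, 1, 0]
counts = np.ones((2, 2))              # Laplace-smoothed transition counts
trans = update_transitions(counts, history)
nxt = predict_next(trans, history[-1])
```

Once the periodic pattern is captured in the matrix, the next drifted concept is predicted without revisiting the historical data.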

  17. Conceptualising Business Models: Definitions, Frameworks and Classifications

    Directory of Open Access Journals (Sweden)

    Erwin Fielt

    2013-12-01

    Full Text Available The business model concept is gaining traction in different disciplines but is still criticized for being fuzzy and vague and lacking consensus on its definition and compositional elements. In this paper we set out to advance our understanding of the business model concept by addressing three areas of foundational research: business model definitions, business model elements, and business model archetypes. We define a business model as a representation of the value logic of an organization in terms of how it creates and captures customer value. This abstract and generic definition is made more specific and operational by the compositional elements, which need to address the customer, value proposition, organizational architecture (firm and network level) and economics dimensions. Business model archetypes complement the definition and elements by providing a more concrete and empirical understanding of the business model concept. The main contributions of this paper are (1) explicitly including the customer value concept in the business model definition and focussing on value creation, (2) presenting four core dimensions that business model elements need to cover, (3) arguing for flexibility by adapting and extending business model elements to cater for different purposes and contexts (e.g. technology, innovation, strategy), (4) stressing a more systematic approach to business model archetypes by using business model elements for their description, and (5) suggesting the use of business model archetype research for the empirical exploration and testing of business model elements and their relationships.

  18. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    textabstractIn this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is constr

  19. Time Series Classification by Class-Based Mahalanobis Distances

    CERN Document Server

    Prekopcsák, Zoltán

    2010-01-01

    To classify time series by nearest neighbor, we need to specify or learn a distance. We consider several variations of the Mahalanobis distance and the related Large Margin Nearest Neighbor Classification (LMNN). We find that the conventional Mahalanobis distance is counterproductive. However, both LMNN and the class-based diagonal Mahalanobis distance are competitive.
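
    The class-based diagonal variant can be sketched as a 1-NN classifier in which each training series is compared using the inverse-variance weighting of its own class; the toy series and the uniform per-class weights below are assumptions for illustration:

```python
import numpy as np

def diag_mahalanobis(a, b, inv_var):
    """Diagonal Mahalanobis distance: per-timestamp squared difference
    weighted by an inverse-variance vector."""
    d = a - b
    return float(np.sqrt(np.sum(d * d * inv_var)))

def classify_1nn(x, train, labels, class_inv_var):
    """1-NN where each training series is compared with the weighting of
    *its own* class -- the class-based variant the abstract favours."""
    dists = [diag_mahalanobis(x, t, class_inv_var[c])
             for t, c in zip(train, labels)]
    return labels[int(np.argmin(dists))]

train = np.array([[0.0, 0, 1], [0, 0, 1.2], [1, 1, 0], [1.1, 1, 0]])
labels = [0, 0, 1, 1]
# One inverse-variance vector per class (uniform here, for a tiny example).
inv_var = {0: np.ones(3), 1: np.ones(3)}
pred = classify_1nn(np.array([0.1, 0.0, 1.1]), train, labels, inv_var)
```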

  20. Classification-Based Method of Linear Multicriteria Optimization

    OpenAIRE

    Vassilev, Vassil; Genova, Krassimira; Vassileva, Mariyana; Narula, Subhash

    2003-01-01

    The paper describes a classification-based, learning-oriented interactive method for solving linear multicriteria optimization problems. The method allows the decision makers to describe their preferences with greater flexibility, accuracy and reliability. The method is realized in an experimental software system supporting the solution of multicriteria optimization problems.

  1. Pulse frequency classification based on BP neural network

    Institute of Scientific and Technical Information of China (English)

    WANG Rui; WANG Xu; YANG Dan; FU Rong

    2006-01-01

    In Traditional Chinese Medicine (TCM), the pulse frequency is an important parameter in clinical disease diagnosis. This article identifies pulse types through pulse frequency classification based on back-propagation neural networks (BPNN), according to the eight major essentials of the pulse. The pulse frequency classes include the slow pulse, moderate pulse, rapid pulse, etc. Through research on the feature parameters of the pulse frequency, a system for identifying pulse frequency features is established. Feature parameters such as period and frequency are extracted from the pulse signal by the detecting system and compared with the standard feature values of each pulse type. The results show that the identification rate reaches 92.5% or above.
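
    A rate-band rule of the kind such a system learns can be illustrated as follows; the beats-per-minute cut-offs are assumed for illustration, not the boundaries trained by the BPNN:

```python
def classify_pulse(beats_per_min):
    """Map a measured pulse rate to a frequency class.  The cut-offs are
    illustrative stand-ins for the trained decision boundaries."""
    if beats_per_min < 60:
        return "slow"
    if beats_per_min <= 90:
        return "moderate"
    return "rapid"
```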

  2. Classification of CT-brain slices based on local histograms

    Science.gov (United States)

    Avrunin, Oleg G.; Tymkovych, Maksym Y.; Pavlov, Sergii V.; Timchik, Sergii V.; Kisała, Piotr; Orakbaev, Yerbol

    2015-12-01

    Neurosurgical intervention is a very complicated process. Modern operating procedures are based on data such as CT, MRI, etc., and automated analysis of these data is an important task for researchers. Some modern methods of brain-slice segmentation use additional information to process these images, and classification can be used to obtain this information. To classify CT images of the brain, we suggest using local histograms and features extracted from them. The paper shows the process of feature extraction from, and classification of, CT slices of the brain. The feature extraction process is specialized for axial cross-sections of the brain. The work can be applied to medical neurosurgical systems.
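
    Local-histogram features for a slice can be sketched by splitting the image into blocks and concatenating the per-block gray-level histograms; the grid size, bin count, and toy image are assumptions, not the paper's configuration:

```python
import numpy as np

def local_histograms(img, grid=2, bins=4, vrange=(0, 256)):
    """Feature vector for a slice: split the image into grid x grid blocks
    and concatenate the normalized per-block gray-level histograms."""
    h, w = img.shape
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = img[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
            hist, _ = np.histogram(block, bins=bins, range=vrange)
            feats.append(hist / hist.sum())
    return np.concatenate(feats)

img = np.zeros((8, 8), dtype=np.uint8)
img[:4, :4] = 200                        # one bright quadrant
fv = local_histograms(img)               # 4 blocks x 4 bins = 16 features
```

Each block contributes its own distribution, so the feature vector preserves where in the slice the intensities occur, unlike a single global histogram.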

  3. AN EFFICIENT CLASSIFICATION OF GENOMES BASED ON CLASSES AND SUBCLASSES

    Directory of Open Access Journals (Sweden)

    B.V. DHANDRA,

    2010-08-01

    Full Text Available The grass family has been the subject of intense research in the past. Reliable and fast classification and sub-classification of large sequences is rapidly gaining importance, as genome sequencing projects all over the world contribute large numbers of genome sequences to public gene banks. Sequence classification is thus important for predicting genome function, structure and evolutionary relationships, and gives insight into the features associated with the biological role of the class. Classification of functional genomes is therefore an important and challenging task for both computer scientists and biologists. The presence of motifs in grass genome chains predicts the functional behavior of the grass genome. The correlation between grass genome properties and their motifs is not always obvious, since more than one motif may exist within a genome chain. Due to the complexity of this association, most data mining algorithms are either inefficient or time consuming. Hence, in this paper we propose an efficient method based on classes and subclasses to reduce the time complexity of classifying large sequences in a grass genome dataset. The proposed approach classifies the given dataset into classes with a conserved threshold and then reclassifies each class, with a relaxed threshold, into major subclasses. Experimental results indicate that the proposed method reduces the time complexity while keeping the classification accuracy at the level of the general NNC algorithm.

  4. CLASSIFICATION OF LiDAR DATA WITH POINT BASED CLASSIFICATION METHODS

    OpenAIRE

    N. Yastikli; Cetin, Z.

    2016-01-01

    LiDAR is one of the most effective systems for 3-dimensional (3D) data collection over wide areas. Nowadays, airborne LiDAR data are used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps, with increasing point density and accuracy. The classification of the LiDAR points is the first step of the LiDAR data processing chain and should be handled in a proper way, since 3D city modelling, building extraction, DEM generation, etc. applicati...

  5. Torrent classification - Base of rational management of erosive regions

    Energy Technology Data Exchange (ETDEWEB)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta [Institute for the Development of Water Resources 'Jaroslav Cerni', 11226 Beograd (Pinosava), Jaroslava Cernog 80 (Serbia)], E-mail: gavrilovicz@sbb.rs

    2008-11-01

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia: the 'Erosion Potential Method'. One of the modules of that complex method is focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentiality'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, whose application enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  6. Torrent classification - Base of rational management of erosive regions

    International Nuclear Information System (INIS)

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia: the 'Erosion Potential Method'. One of the modules of that complex method is focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentiality'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, whose application enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  7. Optimal query-based relevance feedback in medical image retrieval using score fusion-based classification.

    Science.gov (United States)

    Behnam, Mohammad; Pourghassem, Hossein

    2015-04-01

    In this paper, a new content-based medical image retrieval (CBMIR) framework using an effective classification method and a novel relevance feedback (RF) approach is proposed. For a large-scale database with a diverse collection of different modalities, query image classification is inevitable: first, to reduce the computational complexity, and second, to increase the influence of data fusion by removing unimportant data and focusing on the more valuable information. Hence, we find the probability distribution of classes in the database using a Gaussian mixture model (GMM) for each feature descriptor, and then, using the fusion of the scores obtained from the dependency probabilities, the most relevant clusters are identified for a given query. Afterwards, the visual similarity between the query image and the images in the relevant clusters is calculated. This method is performed separately on all feature descriptors, and the results are then fused together using a feature-similarity ranking-level fusion algorithm. At the RF level, we propose a new approach to find the optimal queries based on relevant images. The main idea is based on density function estimation of positive images and a strategy of moving toward the aggregation of the estimated density function. The proposed framework has been evaluated on the ImageCLEF 2005 database consisting of 10,000 medical X-ray images of 57 semantic classes. The experimental results show that, compared with existing CBMIR systems, our framework obtains acceptable performance both in image classification and in image retrieval by RF. PMID:25246167
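The per-descriptor scoring and fusion step can be sketched as follows. This is a minimal illustration that substitutes a single Gaussian per class and per descriptor for the paper's full GMM; the descriptor names ("texture", "shape"), class labels and all numeric values are invented for illustration:

```python
import numpy as np

def gaussian_log_score(x, mean, var):
    """Log-likelihood of a feature vector under a diagonal Gaussian
    (a one-component stand-in for a per-descriptor GMM)."""
    return float(-0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var)))

# Hypothetical per-descriptor class models: {descriptor: {class: (mean, var)}}
models = {
    "texture": {"chest": (np.array([0.2, 0.8]), np.array([0.05, 0.05])),
                "hand":  (np.array([0.9, 0.1]), np.array([0.05, 0.05]))},
    "shape":   {"chest": (np.array([0.5]), np.array([0.1])),
                "hand":  (np.array([1.5]), np.array([0.1]))},
}

def classify(query):
    """Fuse scores across descriptors by summing log-likelihoods,
    then pick the most probable class for the query image."""
    fused = {}
    for cls in ["chest", "hand"]:
        fused[cls] = sum(gaussian_log_score(query[d], *models[d][cls])
                         for d in models)
    return max(fused, key=fused.get)

query = {"texture": np.array([0.25, 0.75]), "shape": np.array([0.6])}
print(classify(query))  # "chest"
```

Summing log-likelihoods across descriptors is one simple realization of score-level fusion; the paper's ranking-level fusion would replace the sum with a rank aggregation step.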

  8. Mathematical model for classification of EEG signals

    Science.gov (United States)

    Ortiz, Victor H.; Tapia, Juan J.

    2015-09-01

    A mathematical model to filter and classify brain signals from a brain-machine interface is developed. The model classifies the signals from the different lobes of the brain to differentiate the alpha, beta, gamma and theta rhythms, as well as the signals associated with vision, speech, and orientation. The model further eliminates noise signals that arise during signal acquisition. This mathematical model can be used on different interface platforms for the rehabilitation of physically handicapped persons.

  9. Classification of ECG Using Chaotic Models

    Directory of Open Access Journals (Sweden)

    Khandakar Mohammad Ishtiak

    2012-09-01

    Full Text Available Chaotic analysis has been shown to be useful in a variety of medical applications, particularly in cardiology. Chaotic parameters have shown potential in the identification of diseases, especially in the analysis of biomedical signals like the electrocardiogram (ECG). In this work, underlying chaos in ECG signals has been analyzed using various non-linear techniques. First, the ECG signal is processed through a series of steps to extract the QRS complex. From this extracted feature, the beat-to-beat interval (BBI) and instantaneous heart rate (IHR) have been calculated. Then some nonlinear parameters, like the standard deviation and coefficient of variation, and nonlinear techniques, like the central tendency measure (CTM) and the phase space portrait, have been determined from both the BBI and IHR. The standard MIT-BIH database is used as the reference data, where each ECG record contains 650,000 samples. CTM is calculated for both BBI and IHR for each ECG record of the database. A much higher value of CTM for IHR is observed for eleven patients with normal beats, with a mean of 0.7737 and SD of 0.0946. On the contrary, the CTM for IHR of eleven patients with abnormal rhythm shows low values, with a mean of 0.0833 and SD of 0.0748. CTM for BBI of the same eleven normal rhythm records also shows high values, with a mean of 0.6172 and SD of 0.1472. CTM for BBI of eleven abnormal rhythm records shows low values, with a mean of 0.0478 and SD of 0.0308. The phase space portrait also demonstrates a visible attractor with little dispersion for a healthy person's ECG and a widely dispersed plot in the 2-D plane for the ailing person's ECG. These results indicate that ECG can be classified based on this chaotic modeling, which works on the nonlinear dynamics of the system.
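The central tendency measure used above (the fraction of points in the second-order difference plot that fall within a given radius of the origin) can be sketched in a few lines; the radius and the two test series are illustrative, not the values used in the study:

```python
import numpy as np

def central_tendency_measure(x, radius):
    """Fraction of second-order difference-plot points
    (x[i+1]-x[i], x[i+2]-x[i+1]) lying within `radius` of the origin."""
    x = np.asarray(x, dtype=float)
    d1 = x[1:-1] - x[:-2]   # first-difference coordinate
    d2 = x[2:] - x[1:-1]    # second-difference coordinate
    return float((np.hypot(d1, d2) < radius).mean())

rng = np.random.default_rng(0)
regular = np.sin(np.linspace(0, 20 * np.pi, 1000))  # low variability series
irregular = rng.normal(size=1000)                   # highly dispersed series

ctm_reg = central_tendency_measure(regular, 0.1)    # near 1: tight attractor
ctm_irr = central_tendency_measure(irregular, 0.1)  # near 0: widely dispersed
print(ctm_reg, ctm_irr)
```

A regular series concentrates its difference-plot points near the origin (high CTM), matching the abstract's observation that normal rhythms yield high CTM while abnormal rhythms yield low CTM.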

  10. Investigation of the Effect of Traffic Parameters on Road Hazard Using Classification Tree Model

    Directory of Open Access Journals (Sweden)

    Md. Mahmud Hasan

    2012-09-01

    Full Text Available This paper presents a method for the identification of hazardous situations on freeways. For this study, an approximately 18 km long section of the Eastern Freeway in Melbourne, Australia was selected as a test bed. Three categories of data, i.e. traffic, weather and accident record data, were used for the analysis and modelling. In developing the crash risk probability model, a classification tree based model was developed in this study. In formulating the model, it was found that weather conditions did not have a significant impact on accident occurrence, so the classification tree was built using two traffic indices only: traffic flow and vehicle speed. The formulated classification tree is able to identify the possible hazard and non-hazard situations on the freeway. The outcome of the study will aid hazard mitigation strategies.
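A classification tree on the two retained indices reduces to nested threshold tests. The following toy sketch uses flow and speed as split variables, as in the abstract, but the thresholds and class assignments are invented for illustration and are not taken from the formulated model:

```python
def hazard_risk(flow_veh_per_h, speed_kmh):
    """Toy two-level classification tree on traffic flow and vehicle speed.
    Thresholds are hypothetical placeholders."""
    if flow_veh_per_h > 1800:  # heavily loaded section
        return "hazard" if speed_kmh > 80 else "non-hazard"
    # lightly loaded section: only very high speeds are flagged
    return "hazard" if speed_kmh > 110 else "non-hazard"

print(hazard_risk(2000, 95))   # "hazard"
print(hazard_risk(900, 100))   # "non-hazard"
```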

  11. Classification of integrable discrete Klein-Gordon models

    Science.gov (United States)

    Habibullin, Ismagil T.; Gudkova, Elena V.

    2011-04-01

    The Lie algebraic integrability test is applied to the problem of classification of integrable Klein-Gordon-type equations on quad graphs. The list of equations passing the test is presented, containing several well-known integrable models. A new integrable example is found; its higher symmetry is presented.

  12. Classification of integrable discrete Klein-Gordon models

    International Nuclear Information System (INIS)

    The Lie algebraic integrability test is applied to the problem of classification of integrable Klein-Gordon-type equations on quad graphs. The list of equations passing the test is presented, containing several well-known integrable models. A new integrable example is found; its higher symmetry is presented.

  13. Application of Classification Models to Pharyngeal High-Resolution Manometry

    Science.gov (United States)

    Mielens, Jason D.; Hoffman, Matthew R.; Ciucci, Michelle R.; McCulloch, Timothy M.; Jiang, Jack J.

    2012-01-01

    Purpose: The authors present 3 methods of performing pattern recognition on spatiotemporal plots produced by pharyngeal high-resolution manometry (HRM). Method: Classification models, including the artificial neural networks (ANNs) multilayer perceptron (MLP) and learning vector quantization (LVQ), as well as support vector machines (SVM), were…

  14. Computerized Classification Testing under the Generalized Graded Unfolding Model

    Science.gov (United States)

    Wang, Wen-Chung; Liu, Chen-Wei

    2011-01-01

    The generalized graded unfolding model (GGUM) has been recently developed to describe item responses to Likert items (agree-disagree) in attitude measurement. In this study, the authors (a) developed two item selection methods in computerized classification testing under the GGUM, the current estimate/ability confidence interval method and the cut…

  15. Fuzzy modeling of farmers' knowledge for land suitability classification

    NARCIS (Netherlands)

    Sicat, R.S.; Carranza, E.J.M.; Nidumolu, U.B.

    2005-01-01

    In a case study, we demonstrate fuzzy modeling of farmers' knowledge (FK) for agricultural land suitability classification using GIS. Capture of FK was through rapid rural participatory approach. The farmer respondents consider, in order of decreasing importance, cropping season, soil color, soil te

  16. Habitat classification modeling with incomplete data: pushing the habitat envelope.

    Science.gov (United States)

    Zarnetske, Phoebe L; Edwards, Thomas C; Moisen, Gretchen G

    2007-09-01

    Habitat classification models (HCMs) are invaluable tools for species conservation, land-use planning, reserve design, and metapopulation assessments, particularly at broad spatial scales. However, species occurrence data are often lacking and typically limited to presence points at broad scales. This lack of absence data precludes the use of many statistical techniques for HCMs. One option is to generate pseudo-absence points so that the many available statistical modeling tools can be used. Traditional techniques generate pseudo-absence points at random across broadly defined species ranges, often failing to include biological knowledge concerning the species-habitat relationship. We incorporated biological knowledge of the species-habitat relationship into pseudo-absence points by creating habitat envelopes that constrain the region from which points were randomly selected. We define a habitat envelope as an ecological representation of a species', or species feature's (e.g., nest), observed distribution (i.e., realized niche) based on a single attribute, or the spatial intersection of multiple attributes. We created HCMs for Northern Goshawk (Accipiter gentilis atricapillus) nest habitat during the breeding season across Utah forests with extant nest presence points and ecologically based pseudo-absence points using logistic regression. Predictor variables were derived from 30-m USDA Landfire and 250-m Forest Inventory and Analysis (FIA) map products. These habitat-envelope-based models were then compared to null envelope models, which use traditional practices for generating pseudo-absences. Models were assessed for fit and predictive capability using metrics such as kappa, threshold-independent receiver operating characteristic (ROC) plots, adjusted deviance (D^2_adj), and cross-validation, and were also assessed for ecological relevance. In all cases, habitat-envelope-based models outperformed null envelope models and were more ecologically relevant.
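The envelope-based pseudo-absence idea can be sketched as follows; the two habitat attributes and all values are hypothetical, not the Landfire/FIA predictors used in the study:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical presence points described by two habitat attributes
# (e.g. elevation in m, canopy cover in %); values are illustrative only.
presence = np.column_stack([rng.normal(2400, 150, 50),
                            rng.normal(60, 8, 50)])

# Habitat envelope: the attribute-wise range of the observed presences
# (the single-attribute-bounds variant of the envelope definition above).
lo, hi = presence.min(axis=0), presence.max(axis=0)

# Pseudo-absences are drawn only from inside the envelope, instead of
# at random across the whole study region.
pseudo_absence = rng.uniform(lo, hi, size=(50, 2))
inside = np.all((pseudo_absence >= lo) & (pseudo_absence <= hi), axis=1)
print(inside.all())  # True: every pseudo-absence respects the envelope
```

A real workflow would additionally exclude cells containing presence points and intersect several attribute envelopes before sampling.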

  17. Multiple Sclerosis and Employment: A Research Review Based on the International Classification of Function

    Science.gov (United States)

    Frain, Michael P.; Bishop, Malachy; Rumrill, Phillip D., Jr.; Chan, Fong; Tansey, Timothy N.; Strauser, David; Chiu, Chung-Yi

    2015-01-01

    Multiple sclerosis (MS) is an unpredictable, sometimes progressive chronic illness affecting people in the prime of their working lives. This article reviews the effects of MS on employment based on the World Health Organization's International Classification of Functioning, Disability and Health model. Correlations between employment and…

  18. A Multi-Dimensional Classification Model for Scientific Workflow Characteristics

    Energy Technology Data Exchange (ETDEWEB)

    Ramakrishnan, Lavanya; Plale, Beth

    2010-04-05

    Workflows have been used to model repeatable tasks or operations in manufacturing, business processes, and software. In recent years, workflows are increasingly used for orchestration of science discovery tasks that use distributed resources and web services environments through resource models such as grid and cloud computing. Workflows have disparate requirements and constraints that affect how they might be managed in distributed environments. In this paper, we present a multi-dimensional classification model illustrated by workflow examples obtained through a survey of scientists from different domains, including bioinformatics and biomedical, weather and ocean modeling, and astronomy, detailing their data and computational requirements. The survey results and classification model contribute to a high-level understanding of scientific workflows.

  19. A classification of empirical CGE modelling

    OpenAIRE

    Thissen, Mark

    1998-01-01

    This paper investigates asymmetric effects of monetary policy over the business cycle. A two-state Markov Switching Model is employed to model both recessions and expansions. For the United States and Germany, strong evidence is found that monetary policy is more effective in a recession than during a boom. Also some evidence is found for asymmetry in the United Kingdom and Belgium. In the Netherlands, monetary policy is not very effective in either regime.

  20. Upper limit for context based crop classification

    DEFF Research Database (Denmark)

    Midtiby, Henrik; Åstrand, Björn; Jørgensen, Rasmus Nyholm;

    2012-01-01

    Mechanical in-row weed control of crops like sugar beet requires precise knowledge of where individual crop plants are located. If crop plants are placed in a known pattern, information about plant locations can be used to discriminate between crop and weed plants. The success rate of such a classifier depends on the weed pressure, the position uncertainty of the crop plants and the crop upgrowth percentage. The first two measures can be combined into a normalized weed pressure, λ. Given the normalized weed pressure, an upper bound on the positive predictive value is shown to be 1/(1+λ). If the weed pressure is ρ = 400/m² and the crop position uncertainty is σ_x = 0.0148 m along the row and σ_y = 0.0108 m perpendicular to the row, the normalized weed pressure is λ ≈ 0.40; the upper bound on the positive predictive value is then 0.71. This means that when a position based...
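The quoted figures are consistent with a normalized weed pressure of the form λ = 2πσ_xσ_yρ; that closed form is our reconstruction, not a formula stated in the abstract, but it reproduces both numbers:

```python
import math

def normalized_weed_pressure(rho, sigma_x, sigma_y):
    # Assumed form: lambda = 2*pi*sigma_x*sigma_y*rho. This reproduces
    # the abstract's figures but is a reconstruction, not quoted text.
    return 2 * math.pi * sigma_x * sigma_y * rho

def ppv_upper_bound(lam):
    # Upper bound on the positive predictive value from the abstract.
    return 1.0 / (1.0 + lam)

lam = normalized_weed_pressure(400, 0.0148, 0.0108)
print(round(lam, 2))                   # 0.4
print(round(ppv_upper_bound(lam), 2))  # 0.71
```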

  1. Object-Based Classification and Change Detection of Hokkaido, Japan

    Science.gov (United States)

    Park, J. G.; Harada, I.; Kwak, Y.

    2016-06-01

    Topography and geology are factors that characterize the distribution of natural vegetation. Topographic contour particularly influences the living conditions of plants, such as soil moisture, sunlight, and windiness. Vegetation associations having similar characteristics are present in locations having similar topographic conditions, unless natural disturbances such as landslides and forest fires or artificial disturbances such as deforestation and man-made plantation bring about changes in such conditions. We developed a vegetation map of Japan using an object-based segmentation approach with topographic information (elevation, slope, slope direction) that is closely related to the distribution of vegetation. The results showed that object-based classification is more effective for producing a vegetation map than pixel-based classification.

  2. Cardiac arrhythmia classification based on multiple-lead electrocardiogram signals and multivariate autoregressive modeling method

    Institute of Scientific and Technical Information of China (English)

    Ge Dingfei; Li Shihui; Krishnan S. M.

    2004-01-01

    Artificial-intelligence analysis of electrocardiogram (ECG) signals is of great benefit for the automatic diagnosis of critically ill patients. Multivariate autoregressive (MAR) modeling for the purpose of classifying cardiac arrhythmias is introduced. The MAR coefficients and the K-L transformation of the MAR coefficients extracted from two-lead ECG signals were used to represent the ECG signals. The ECG data obtained from the MIT-BIH database included normal sinus rhythm (NSR), atrial premature contraction (APC), premature ventricular contraction (PVC), ventricular tachycardia (VT), and ventricular fibrillation (VF), with 300 sample signals of each class used for modeling and testing. The classification was performed using a stage-by-stage decision process with a quadratic discriminant function (QDF) classifier. The results showed that a MAR order of 4 was sufficient for the purpose of classification, and MAR coefficients produced slightly better results than the K-L transformation of MAR coefficients. A classification accuracy of 97.3% to 98.6% based on MAR coefficients was obtained.
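Fitting an order-4 multivariate AR model to a two-lead signal reduces to a least-squares problem over lagged samples. The sketch below uses a synthetic two-channel signal in place of real ECG data; everything else (the lag-stacking and the coefficient layout) is generic:

```python
import numpy as np

def fit_mar(x, order):
    """Least-squares fit of a multivariate AR model.
    x: (n_samples, n_channels). Returns A with shape (order, c, c)
    such that x[t] ≈ sum_k A[k-1] @ x[t-k]."""
    n, c = x.shape
    rows = [np.concatenate([x[t - k] for k in range(1, order + 1)])
            for t in range(order, n)]
    X = np.array(rows)            # (n-order, order*c) lagged design matrix
    Y = x[order:]                 # (n-order, c) one-step-ahead targets
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef.T.reshape(c, order, c).transpose(1, 0, 2)

# Synthetic two-lead signal generated by a known stable AR(1) process.
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
x = np.zeros((2000, 2))
for t in range(1, 2000):
    x[t] = A_true @ x[t - 1] + rng.normal(scale=0.1, size=2)

A_est = fit_mar(x, order=4)
print(A_est[0])  # first-lag matrix: close to A_true; higher lags near zero
```

The flattened coefficient vectors A_est.reshape(-1) would then serve as the feature vector fed to the QDF classifier.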

  3. Neural Network based Vehicle Classification for Intelligent Traffic Control

    Directory of Open Access Journals (Sweden)

    Saeid Fazli

    2012-06-01

    Full Text Available The number of vehicles has increased and traditional traffic control systems can no longer meet demand, leading to the emergence of intelligent traffic control systems. These systems improve traffic control and urban management and increase the confidence index on roads and highways. The goal of this article is vehicle classification based on neural networks. In this research, a fixed camera mounted at a low height above the road surface is used to detect and classify the vehicles. The algorithm comprises two general phases: first, moving vehicles in traffic scenes are extracted using image processing techniques, including background removal, edge detection and morphology operations. In the second phase, vehicles near the camera are selected and their specific features are processed and extracted. These features are applied to the neural network as a vector, and the outputs determine the vehicle type. The presented model is able to classify vehicles into three classes: heavy vehicles, light vehicles and motorcycles. Results demonstrate the accuracy of the algorithm and its high functional level.

  4. Classification Model with High Deviation for Intrusion Detection on System Call Traces

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    A new classification model for host intrusion detection based on unidentified short sequences and the RIPPER algorithm is proposed. The concepts of different short sequences on system call traces are strictly defined on the basis of an in-depth analysis of the completeness and correctness of pattern databases. Labels of short sequences are predicted by the learned RIPPER rule set, and the nature of the unidentified short sequences is confirmed by a statistical method. Experimental results indicate that the classification model clearly increases the deviation between attack and normal traces and improves detection capability against known and unknown attacks.
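The short-sequence idea can be illustrated with a sliding window over a system-call trace; the traces, the window length and the anomaly score below are invented for illustration (the paper labels sequences with a learned RIPPER rule set rather than a plain set lookup):

```python
def short_sequences(trace, length):
    """All contiguous system-call subsequences of the given length."""
    return [tuple(trace[i:i + length]) for i in range(len(trace) - length + 1)]

# Pattern database built from traces assumed to be normal.
normal_db = set()
for t in [["open", "read", "mmap", "read", "close"],
          ["open", "read", "close"]]:
    normal_db.update(short_sequences(t, 3))

# A trace is scored by the fraction of its short sequences absent from
# the pattern database (the "unidentified" short sequences).
trace = ["open", "read", "mmap", "exec", "close"]
seqs = short_sequences(trace, 3)
unidentified = [s for s in seqs if s not in normal_db]
score = len(unidentified) / len(seqs)
print(score)  # 2 of 3 windows are unidentified
```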

  5. A novel hybrid classification model of genetic algorithms, modified k-Nearest Neighbor and developed backpropagation neural network.

    Directory of Open Access Journals (Sweden)

    Nader Salari

    Full Text Available Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods in classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets.
The substantial findings of the comprehensive comparative study revealed that

  6. ALADDIN: a neural model for event classification in dynamic processes

    International Nuclear Information System (INIS)

    ALADDIN is a prototype system which combines fuzzy clustering techniques and artificial neural network (ANN) models in a novel approach to the problem of classifying events in dynamic processes. The main motivation for the development of such a system derived originally from the problem of finding new principled methods to perform alarm structuring/suppression in a nuclear power plant (NPP) alarm system. One such method consists in basing the alarm structuring/suppression on a fast recognition of the event generating the alarms, so that a subset of alarms sufficient to efficiently handle the current fault can be selected to be presented to the operator, minimizing in this way the operator's workload in a potentially stressful situation. The scope of application of a system like ALADDIN goes, however, beyond alarm handling, to include diagnostic tasks in general. The eventual application of the system to domains other than NPPs was also taken into special consideration during the design phase. In this document we report on the first phase of the ALADDIN project, which consisted mainly of a comparative study of a series of ANN-based approaches to event classification, and on the proposal of a first system prototype which is to undergo further tests and, eventually, be integrated in existing alarm, diagnosis, and accident management systems such as CASH, IDS, and CAMS. (author)

  7. Metagenome fragment classification based on multiple motif-occurrence profiles

    Directory of Open Access Journals (Sweden)

    Naoki Matsushita

    2014-09-01

    Full Text Available A vast amount of metagenomic data has been obtained by extracting multiple genomes simultaneously from microbial communities, including genomes from uncultivable microbes. By analyzing these metagenomic data, novel microbes are discovered and new microbial functions are elucidated. The first step in analyzing these data is sequenced-read classification into reference genomes from which each read can be derived. The Naïve Bayes Classifier is a method for this classification. To identify the derivation of the reads, this method calculates a score based on the occurrence of a DNA sequence motif in each reference genome. However, large differences in the sizes of the reference genomes can bias the scoring of the reads. This bias might cause erroneous classification and decrease the classification accuracy. To address this issue, we have updated the Naïve Bayes Classifier method using multiple sets of occurrence profiles for each reference genome by normalizing the genome sizes, dividing each genome sequence into a set of subsequences of similar length and generating profiles for each subsequence. This multiple profile strategy improves the accuracy of the results generated by the Naïve Bayes Classifier method for simulated and Sargasso Sea datasets.
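The multiple-profile strategy can be sketched as follows, using dinucleotide (2-mer) counts as a stand-in for the motif-occurrence profiles and an invented chunk length; the code assumes sequences over the plain ACGT alphabet:

```python
from itertools import product

def kmer_profile(seq, k=2):
    """Normalized occurrence profile of all k-mers in a sequence
    (assumes the ACGT alphabet)."""
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    total = max(sum(counts.values()), 1)
    return {m: c / total for m, c in counts.items()}

def genome_profiles(genome, chunk=1000, k=2):
    """One profile per similar-length subsequence, so a long reference
    genome contributes several profiles instead of a single
    size-biased one."""
    return [kmer_profile(genome[i:i + chunk], k)
            for i in range(0, len(genome), chunk)]

genome = "ACGT" * 600  # toy 2400-bp "reference genome"
profiles = genome_profiles(genome)
print(len(profiles))   # 3 subsequence profiles for one reference genome
```

A read is then scored against every subsequence profile of every reference genome, so genome length no longer skews the Naïve Bayes scores.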

  8. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Abstract Power quality disturbances (PQDs) cause serious problems for the reliability, safety and economy of the power system network. To improve electric power quality, PQDs must be detected and classified by type of transient fault. A methodology for the automatic classification of eight types of PQ signals (flicker, harmonics, sag, swell, impulse, fluctuation, notch and oscillatory transients) is presented, based on software analysis of the wavelet transform with a multiresolution analysis (MRA) algorithm and two feed-forward neural networks: a probabilistic neural network (PNN) and a multilayer feed-forward (MLFF) network. The wavelet family Db4 is chosen in this system to calculate the detailed energy distributions used as input features for classification, because it performs well in detecting and localizing various types of PQ disturbances. The classifiers identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently and achieves better classification accuracy than the MLFF network.
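The per-level detail-energy features can be sketched with a plain Haar transform; the paper uses Db4, which needs a wavelet library such as PyWavelets, so Haar is substituted here to keep the sketch self-contained, and the "voltage sag" test signal is invented:

```python
import numpy as np

def haar_energy_distribution(signal, levels):
    """Detail energies for levels 1..levels plus the final approximation
    energy from a multiresolution (Haar) analysis. The transform is
    orthonormal, so the energies sum to the signal energy."""
    a = np.asarray(signal, dtype=float)
    s = 1.0 / np.sqrt(2.0)
    energies = []
    for _ in range(levels):
        approx = s * (a[0::2] + a[1::2])   # low-pass half-band
        detail = s * (a[0::2] - a[1::2])   # high-pass half-band
        energies.append(float(np.sum(detail ** 2)))
        a = approx
    energies.append(float(np.sum(a ** 2)))
    return energies

t = np.arange(256) / 256.0
sag = np.sin(2 * np.pi * 8 * t)
sag[100:160] *= 0.5           # crude voltage-sag disturbance
dist = haar_energy_distribution(sag, levels=4)
print(dist)                   # feature vector fed to the classifier
```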

  9. Cardiac arrhythmia classification using autoregressive modeling

    OpenAIRE

    Srinivasan Narayanan; Ge Dingfei; Krishnan Shankar M

    2002-01-01

    Abstract Background Computer-assisted arrhythmia recognition is critical for the management of cardiac disorders. Various techniques have been utilized to classify arrhythmias. Generally, these techniques classify two or three arrhythmias or have significantly large processing times. A simpler autoregressive modeling (AR) technique is proposed to classify normal sinus rhythm (NSR) and various cardiac arrhythmias including atrial premature contraction (APC), premature ventricular contraction (...

  10. Invariance Properties for General Diagnostic Classification Models

    Science.gov (United States)

    Bradshaw, Laine P.; Madison, Matthew J.

    2016-01-01

    In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic…

  11. Virtual Sensor Based Fault Detection and Classification on a Plasma Etch Reactor

    CERN Document Server

    Sofge, D A

    2007-01-01

    The SEMATECH sponsored J-88-E project teaming Texas Instruments with NeuroDyne (et al.) focused on Fault Detection and Classification (FDC) on a Lam 9600 aluminum plasma etch reactor, used in the process of semiconductor fabrication. Fault classification was accomplished by implementing a series of virtual sensor models which used data from real sensors (Lam Station sensors, Optical Emission Spectroscopy, and RF Monitoring) to predict recipe setpoints and wafer state characteristics. Fault detection and classification were performed by comparing predicted recipe and wafer state values with expected values. Models utilized include linear PLS, Polynomial PLS, and Neural Network PLS. Prediction of recipe setpoints based upon sensor data provides a capability for cross-checking that the machine is maintaining the desired setpoints. Wafer state characteristics such as Line Width Reduction and Remaining Oxide were estimated on-line using these same process sensors (Lam, OES, RFM). Wafer-to-wafer measurement of thes...

  12. A computational theory for the classification of natural biosonar targets based on a spike code

    CERN Document Server

    Müller, R

    2003-01-01

    A computational theory for the classification of natural biosonar targets is developed based on the properties of an example stimulus ensemble. An extensive set of echoes (84,800) from four different foliages was transcribed into a spike code using a parsimonious model (linear filtering, half-wave rectification, thresholding). The spike code is assumed to consist of time differences (interspike intervals) between threshold crossings. Among the elementary interspike intervals flanked by exceedances of adjacent thresholds, a few intervals triggered by disjoint half-cycles of the carrier oscillation stand out in terms of resolvability, visibility across resolution levels and a simple stochastic structure (uncorrelatedness). They are therefore argued to be a stochastic analogue to edges in vision. A three-dimensional feature vector representing these interspike intervals sustained a reliable target classification performance (0.06% classification error) in a sequential probability ratio test, which models sequential pr...

  13. QSAR models for oxidation of organic micropollutants in water based on ozone and hydroxyl radical rate constants and their chemical classification

    KAUST Repository

    Sudhakaran, Sairam

    2013-03-01

    Ozonation is an oxidation process for the removal of organic micropollutants (OMPs) from water, and the chemical reaction is governed by second-order kinetics. An advanced oxidation process (AOP), wherein hydroxyl radicals (OH radicals) are generated, is more effective in removing a wider range of OMPs from water than direct ozonation. Second-order rate constants (kOH and kO3) are good indices to estimate the oxidation efficiency, where higher rate constants indicate more rapid oxidation. In this study, quantitative structure activity relationship (QSAR) models for O3 and AOP processes were developed, and the rate constants kOH and kO3 were predicted based on target compound properties. The kO3 and kOH values ranged from 5 × 10^-4 to 10^5 M^-1 s^-1 and from 0.04 × 10^9 to 18 × 10^9 M^-1 s^-1, respectively. Several molecular descriptors which potentially influence O3 and OH radical oxidation were identified and studied. The QSAR-defining descriptors were double bond equivalence (DBE), ionisation potential (IP), electron affinity (EA) and the weakly-polar component of solvent accessible surface area (WPSA), and the chemical and statistical significance of these descriptors was discussed. Multiple linear regression was used to build the QSAR models, resulting in a high goodness-of-fit, r^2 (>0.75). The models were validated by internal and external validation along with residual plots. © 2012 Elsevier Ltd.
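The descriptor-based multiple linear regression can be sketched as follows; the descriptor values and rate constants are synthetic, generated from an assumed linear relation purely to exercise the fit, and are not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic descriptor matrix: columns stand in for DBE, IP, EA, WPSA.
X = rng.uniform([1, 8, 0.0, 50], [8, 10, 1.5, 200], size=(30, 4))
w_true = np.array([0.4, -0.6, 1.2, 0.005])       # assumed "true" weights
log_k = X @ w_true + 1.0 + rng.normal(scale=0.05, size=30)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, log_k, rcond=None)

# Goodness of fit (r^2), the statistic quoted in the abstract.
resid = log_k - A @ coef
r2 = 1 - np.sum(resid ** 2) / np.sum((log_k - log_k.mean()) ** 2)
print(round(r2, 3))
```

With real data one would additionally run the internal/external validation and residual-plot checks the abstract describes before trusting the model.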

  14. An AERONET-based aerosol classification using the Mahalanobis distance

    Science.gov (United States)

    Hamill, Patrick; Giordano, Marco; Ward, Carolyne; Giles, David; Holben, Brent

    2016-09-01

    We present an aerosol classification based on AERONET aerosol data from 1993 to 2012. We used the AERONET Level 2.0 almucantar aerosol retrieval products to define several reference aerosol clusters which are characteristic of the following general aerosol types: Urban-Industrial, Biomass Burning, Mixed Aerosol, Dust, and Maritime. The classification of a particular aerosol observation as one of these aerosol types is determined by its five-dimensional Mahalanobis distance to each reference cluster. We have calculated the fractional aerosol type distribution at 190 AERONET sites, as well as the monthly variation in aerosol type at those locations. The results are presented on a global map and individually in the supplementary material. Our aerosol typing is based on recognizing that different geographic regions exhibit characteristic aerosol types. To generate reference clusters we only keep data points that lie within a Mahalanobis distance of 2 from the centroid. Our aerosol characterization is based on the AERONET retrieved quantities, therefore it does not include low optical depth values. The analysis is based on "point sources" (the AERONET sites) rather than globally distributed values. The classifications obtained will be useful in interpreting aerosol retrievals from satellite borne instruments.
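Classification by Mahalanobis distance to reference clusters can be sketched as follows; the two-dimensional feature space, the cluster statistics and the observation are invented (the paper uses five AERONET-derived dimensions):

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of observation x from a cluster (mean, cov)."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Hypothetical reference clusters: {type: (centroid, covariance)}.
clusters = {
    "Dust":            (np.array([0.3, 0.2]),
                        np.array([[0.02, 0.00], [0.00, 0.01]])),
    "Biomass Burning": (np.array([1.2, 1.5]),
                        np.array([[0.05, 0.01], [0.01, 0.04]])),
}

def classify(obs):
    """Assign the aerosol type whose reference cluster is nearest
    in Mahalanobis distance."""
    return min(clusters, key=lambda name: mahalanobis(obs, *clusters[name]))

obs = np.array([0.35, 0.25])
print(classify(obs))  # "Dust"
```

The paper's additional rule, keeping only points within distance 2 of the centroid when building reference clusters, would be a filter using the same `mahalanobis` function.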

  15. A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

    Directory of Open Access Journals (Sweden)

    Yong Wang

    2016-02-01

    Full Text Available Currently, with the rapid increase of data scales in network traffic classification, how to select traffic features efficiently is becoming a big challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, the execution time was still unsatisfactory due to the numerous iterative computations during processing. To address this issue, an efficient feature selection method for network traffic based on a new parallel computing framework called Spark is proposed in this paper. In our approach, the complete feature set is first preprocessed based on the Fisher score, and a sequential forward search strategy is employed for subsets. The optimal feature subset is then selected using the continuous iterations of the Spark computing framework. The implementation demonstrates that, while preserving the classification accuracy, our method reduces the time cost of modeling and classification and significantly improves the execution efficiency of feature selection.
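The Fisher-score preprocessing step can be sketched without Spark; the two synthetic features below are invented to show the ranking behaviour (a discriminative feature scores higher than pure noise):

```python
import numpy as np

def fisher_score(feature, labels):
    """Fisher score of one feature: between-class scatter over
    within-class scatter; higher means more discriminative."""
    classes = np.unique(labels)
    overall = feature.mean()
    between = sum(np.sum(labels == c) * (feature[labels == c].mean() - overall) ** 2
                  for c in classes)
    within = sum(np.sum((feature[labels == c] - feature[labels == c].mean()) ** 2)
                 for c in classes)
    return between / within if within > 0 else 0.0

rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 100)
good = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
noisy = rng.normal(0, 1, 200)   # carries no class information

print(fisher_score(good, labels) > fisher_score(noisy, labels))  # True
```

In the paper's pipeline, features ranked this way seed the sequential forward search that Spark then iterates over in parallel.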

  16. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    Science.gov (United States)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques: a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust features (SURF) based classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to the SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has slightly degraded performance but an almost order-of-magnitude improvement in computation time, enabling real-time implementation.

  17. Cancer pain: A critical review of mechanism-based classification and physical therapy management in palliative care

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2011-01-01

    Full Text Available Mechanism-based classification and physical therapy management of pain is essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides an understanding of the patient's symptoms and treatment responses. Existing evidence suggests that five mechanisms - central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive and cognitive-affective - operate in patients with cancer pain. A summary of studies providing evidence for physical therapy treatment methods for cancer pain follows, with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient's findings, using a biopsychosocial model of pain.

  18. Active Dictionary Learning in Sparse Representation Based Classification

    OpenAIRE

    Xu, Jin; He, Haibo; Man, Hong

    2014-01-01

    Sparse representation, which uses dictionary atoms to reconstruct input vectors, has been studied intensively in recent years. A proper dictionary is a key for the success of sparse representation. In this paper, an active dictionary learning (ADL) method is introduced, in which classification error and reconstruction error are considered as the active learning criteria in selection of the atoms for dictionary construction. The learned dictionaries are calculated in sparse representation based...

  19. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    OpenAIRE

    Junwei Fang; Ningning Zheng; Yang Wang; Huijuan Cao; Shujun Sun; Jianye Dai; Qianhua Li; Yongyu Zhang

    2013-01-01

    Acupuncture is an efficient therapy method that originated in ancient China; studying it on the basis of ZHENG classification is a systematic way to approach its complexity. The system perspective contributes to understanding the essence of phenomena, and, with the coming of the systems biology era, broader technology platforms such as omics technologies have been established for the objective study of traditional Chinese medicine (TCM). Omics technologies could dynamically determine molecular c...

  20. BCI Signal Classification using a Riemannian-based kernel

    OpenAIRE

    Barachant, Alexandre; Bonnet, Stéphane; Congedo, Marco; Jutten, Christian

    2012-01-01

    The use of spatial covariance matrix as feature is investigated for motor imagery EEG-based classification. A new kernel is derived by establishing a connection with the Riemannian geometry of symmetric positive definite matrices. Different kernels are tested, in combination with support vector machines, on a past BCI competition dataset. We demonstrate that this new approach outperforms significantly state of the art results without the need for spatial filtering.
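
One common way to build such a Riemannian-motivated kernel is a Gaussian kernel on the log-Euclidean distance between symmetric positive definite (SPD) covariance matrices. The sketch below assumes that form, which may differ in detail from the kernel derived in the paper:

```python
import numpy as np

def spd_log(M):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.log(w)) @ V.T

def log_euclidean_kernel(A, B, sigma=1.0):
    """Gaussian kernel on the log-Euclidean distance between SPD covariance
    matrices -- one standard Riemannian-geometry-based kernel construction."""
    d = np.linalg.norm(spd_log(A) - spd_log(B), ord="fro")
    return float(np.exp(-d**2 / (2 * sigma**2)))

# Toy 2x2 spatial covariance matrices standing in for EEG covariances.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.8, 0.2], [0.2, 1.1]])
print(log_euclidean_kernel(A, A))  # 1.0 for identical matrices
print(log_euclidean_kernel(A, B))
```

Such a kernel can be passed directly to an SVM as a precomputed Gram matrix, which is how kernel variants are typically compared on BCI competition data.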

  1. DATA MINING BASED TECHNIQUE FOR IDS ALERT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Hany Nashat Gabra

    2015-06-01

    Full Text Available Intrusion detection systems (IDSs have become a widely used measure for security systems. The main problem for such systems is the irrelevant alerts. We propose a data mining based method for classification to distinguish serious and irrelevant alerts with a performance of 99.9%, which is better in comparison with the other recent data mining methods that achieved 97%. A ranked alerts list is also created according to the alert’s importance to minimize human interventions.

  2. DATA MINING BASED TECHNIQUE FOR IDS ALERT CLASSIFICATION

    OpenAIRE

    Hany Nashat Gabra; Bahaa-Eldin, Ayman M.; Hoda Korashy Mohammed

    2015-01-01

    Intrusion detection systems (IDSs) have become a widely used measure for security systems. The main problem for such systems is the irrelevant alerts. We propose a data mining based method for classification to distinguish serious and irrelevant alerts with a performance of 99.9%, which is better in comparison with the other recent data mining methods that achieved 97%. A ranked alerts list is also created according to the alert’s importance to minimize human interventions.

  3. Data Mining Based Technique for IDS Alerts Classification

    OpenAIRE

    Gabra, Hany N.; Bahaa-Eldin, Ayman M.; Mohamed, Hoda K.

    2012-01-01

    Intrusion detection systems (IDSs) have become a widely used measure for security systems. The main problem for such systems is the irrelevant alerts in their results. We propose a data mining based method for classification to distinguish serious alerts from irrelevant ones with a performance of 99.9%, which is better in comparison with other recent data mining methods that have reached a performance of 97%. A ranked alerts list is also created according to alert importance to...

  4. Classification of objects in images based on various object representations

    OpenAIRE

    Cichocki, Radoslaw

    2006-01-01

    Object recognition is a hugely researched domain that employs methods derived from mathematics, physics and biology. This thesis combines approaches to object classification that are based on two features – color and shape. Color is represented by color histograms and shape by skeletal graphs. Four hybrids which combine those approaches in different manners are proposed, and the hybrids are then tested to find out which of them gives the best results.

  5. A Cluster Based Approach for Classification of Web Results

    OpenAIRE

    Apeksha Khabia; M. B. Chandak

    2014-01-01

    Nowadays a significant amount of information from the web is present in the form of text, e.g., reviews, forum postings, blogs, news articles, email messages, web pages. It becomes difficult to classify documents into predefined categories as the number of documents grows. Clustering is the classification of data into clusters, so that the data in each cluster share some common trait – often vicinity according to some defined measure. The underlying distribution of a data set can somewhat be depicted base...

  6. A new Multiple ANFIS model for classification of hemiplegic gait.

    Science.gov (United States)

    Yardimci, A; Asilkan, O

    2014-01-01

    A neuro-fuzzy system combines a neural network and a fuzzy system in such a way that neural network learning algorithms are used to determine the parameters of the fuzzy system. This paper describes the application of a multiple adaptive neuro-fuzzy inference system (MANFIS) model, which uses a hybrid learning algorithm, for classification of hemiplegic gait acceleration (HGA) signals. Decision making was performed in two stages: feature extraction using the wavelet transform (WT), and an ANFIS trained with the backpropagation gradient descent method in combination with the least squares method. The performance of the ANFIS model was evaluated in terms of training performance and classification accuracies, and the results confirmed that the proposed ANFIS model has potential in classifying the HGA signals. PMID:25160151

  7. Expected energy-based restricted Boltzmann machine for classification.

    Science.gov (United States)

    Elfwing, S; Uchibe, E; Doya, K

    2015-04-01

    In classification tasks, restricted Boltzmann machines (RBMs) have predominantly been used in the first stage, either as feature extractors or to provide initialization of neural networks. In this study, we propose a discriminative learning approach to provide a self-contained RBM method for classification, inspired by free-energy-based function approximation (FE-RBM), originally proposed for reinforcement learning. For classification, the FE-RBM method computes the output for an input vector and a class vector by the negative free energy of an RBM. Learning is achieved by stochastic gradient descent using a mean-squared error training objective. In an earlier study, we demonstrated that the performance and the robustness of FE-RBM function approximation can be improved by scaling the free energy by a constant related to the size of the network. In this study, we propose that the learning performance of RBM function approximation can be further improved by computing the output by the negative expected energy (EE-RBM), instead of the negative free energy. To create a deep learning architecture, we stack several RBMs on top of each other. We also connect the class nodes to all hidden layers to try to improve the performance even further. We validate the classification performance of EE-RBM using the MNIST data set and the NORB data set, achieving competitive performance compared with other classifiers such as standard neural networks, deep belief networks, classification RBMs, and support vector machines. The purpose of using the NORB data set is to demonstrate that EE-RBM with binary input nodes can achieve high performance in the continuous input domain. PMID:25318375
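
The distinction between the two outputs — negative free energy (FE-RBM) versus negative expected energy (EE-RBM) — can be written down directly for an RBM with binary hidden units. The weights below are random placeholders, not trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def negative_free_energy(v, W, b, c):
    """-F(v) = b.v + sum_j log(1 + exp(c_j + W_j.v)): the FE-RBM output."""
    act = c + W @ v
    return float(b @ v + np.sum(np.log1p(np.exp(act))))

def negative_expected_energy(v, W, b, c):
    """-E[E(v,h)] with each hidden unit replaced by its posterior mean
    sigmoid(c_j + W_j.v): the EE-RBM output."""
    act = c + W @ v
    return float(b @ v + np.sum(sigmoid(act) * act))

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 3))  # 4 hidden units, 3 visible units
b = rng.normal(scale=0.1, size=3)       # visible biases
c = rng.normal(scale=0.1, size=4)       # hidden biases
v = np.array([1.0, 0.0, 1.0])
print(negative_free_energy(v, W, b, c))
print(negative_expected_energy(v, W, b, c))
```

Since log(1 + e^a) ≥ a·σ(a) for all a, the negative free energy always upper-bounds the negative expected energy; the two outputs differ by an entropy-like term per hidden unit.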

  8. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
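
ROC-AUC, the statistic used above to compare the zone-specific models, can be computed directly from classifier scores via its rank-statistic (Mann-Whitney U) form. The scores and labels below are invented for illustration:

```python
def roc_auc(scores, labels):
    """ROC area under the curve as the probability that a randomly chosen
    positive case scores above a randomly chosen negative case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Scores from a hypothetical single-parameter classifier (e.g. a normalised
# T2 signal); labels mark significant cancer (1) vs. not (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0,   0]
print(roc_auc(scores, labels))
```

An AUC of 0.5 means the parameter carries no discriminative information; values around 0.77-0.79, as reported for the best univariate parameters in the study, indicate a usefully informative but imperfect classifier.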

  9. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    International Nuclear Information System (INIS)

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)

  10. Design and implementation of a vulnerability scanning system based on classification protection

    International Nuclear Information System (INIS)

    With the application and spread of classification protection, network security vulnerability scanning should consider both efficiency and functional expansion. This paper proposes a vulnerability scanning system oriented to classification protection, and elaborates the design and implementation of a vulnerability scanning system based on vulnerability-classification plug-in technology. According to the experiments, the system adapts and scales well under classification protection, and the efficiency of scanning is also confirmed. (authors)

  11. A new gammagraphic and functional-based classification for hyperthyroidism

    International Nuclear Information System (INIS)

    The absence of a universal classification for hyperthyroidism (HT) gives rise to inadequate interpretation of series and trials, and hinders decision making. We offer a tentative classification based on gammagraphic and functional findings. Clinical records from patients who underwent thyroidectomy in our Department from 1967 to 1997 were reviewed. Those with functional measurements of hyperthyroidism were considered. All were managed according to the same pre-established guidelines. HT was the surgical indication in 694 (27.1%) of the 2559 thyroidectomies. Based on gammagraphic studies, we classified HTs into: parenchymatous increased uptake, which could be diffuse, diffuse with cold nodules, or diffuse with at least one nodule; and nodular increased uptake (autonomously functioning thyroid nodules, AFTN), divided into solitary AFTN (toxic adenoma) and multiple AFTN (toxic multinodular goiter). This gammagraphic-based classification is useful and has high sensitivity in detecting these nodules while assessing their activity, supporting therapeutic decision making and, in some cases, the choice of surgical technique. (authors)

  12. The application of the Kohonen neural network in the nonparametric-quality-based classification of tomatoes

    Science.gov (United States)

    Boniecki, P.; Nowakowski, K.; Tomczak, R.; Kujawa, S.; Piekarska-Boniecka, H.

    2012-04-01

    By using the classification properties of Kohonen-type networks (Tipping 1996), a neural model was built for the quality-based identification of tomatoes. The resulting empirical data, in the form of digital images of tomatoes at various stages of storage, were subsequently used to draw up a topological SOFM (Self-Organizing Feature Map) which features cluster centers of "comparable" cases (Tadeusiewicz 1997, Boniecki 2008). Radial neurons from the Kohonen topological map were labeled appropriately to allow for the practical quality-based classification of tomatoes (De Grano 2007).

  13. Prediction of Runoff Model Based on Bayes Classification of Markov Method

    Institute of Scientific and Technical Information of China (English)

    邱林; 安可君; 王文川

    2011-01-01

    Aiming at the complexity of runoff formation, the randomness of hydrological processes, and the limitations of any single prediction method, a combined Bayes-Markov model based on Bayes theory and Markov theory is presented. The Bayes formula is first used to classify annual runoff as low or high, and a forecasting model is then built with the weighted Markov analysis method. The two prediction methods are combined so that the advantages of each are retained and the accuracy of runoff prediction is raised. The model was validated by predicting annual runoff variation at Lanzhou Station in the Yellow River Basin. The results show that the predicted values from 2003 to 2009 meet the requirements of the specifications, with an accuracy of 85.7%.
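
A minimal, unweighted version of the Markov step can be sketched as follows. The paper first applies the Bayes classification to obtain the low/high states and then weights transition matrices over several lags; here the states are given directly and only a single lag is used:

```python
from collections import Counter

# Illustrative annual runoff series already classified into "low"/"high" states.
states = ["low", "high", "high", "low", "low", "high", "low", "low", "low", "high"]

def transition_probs(seq):
    """First-order Markov transition probabilities estimated from a sequence."""
    counts = Counter(zip(seq, seq[1:]))
    totals = Counter(seq[:-1])
    return {(a, b): counts[(a, b)] / totals[a] for (a, b) in counts}

def predict_next(seq):
    """Most probable next state given the last observed state."""
    probs = transition_probs(seq)
    last = seq[-1]
    candidates = {b: p for (a, b), p in probs.items() if a == last}
    return max(candidates, key=candidates.get)

print(predict_next(states))
```

The weighted variant would repeat this with k-step transition matrices for several k and combine the resulting state probabilities using autocorrelation-based weights.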

  14. Music Genre Classification using an Auditory Memory Model

    DEFF Research Database (Denmark)

    Jensen, Kristoffer

    2011-01-01

    Audio feature estimation is potentially improved by including higher-level models. One such model is the Auditory Short Term Memory (STM) model. A new paradigm of audio feature estimation is obtained by adding the influence of notes in the STM. These notes are identified when the perceptual...... results, and an initial experiment with sensory dissonance has been undertaken with good results. The parameters obtained from the auditory memory model, along with the dissonance measure, are shown here to be of interest in genre classification....

  15. 3D head model classification using optimized EGI

    Science.gov (United States)

    Tong, Xin; Wong, Hau-san; Ma, Bo

    2006-02-01

    With the general availability of 3D digitizers and scanners, 3D graphical models have been used widely in a variety of applications. This has led to the development of search engines for 3D models. In particular, 3D head model classification and retrieval have received more and more attention in view of their many potential applications in criminal identification, computer animation, and the movie and medical industries. This paper addresses the 3D head model classification problem using 2D subspace analysis methods such as 2D principal component analysis (2DPCA[3]) and 2D Fisher discriminant analysis (2DLDA[5]). It takes advantage of the fact that the histogram is a 2D image, from which the most useful information can be extracted to obtain a good result. As a result, there are two main advantages: first, we can perform fewer calculations to obtain the same classification rate; second, we can reduce the dimensionality further than PCA, obtaining higher efficiency.

  16. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. The increase of information on molecular changes, however, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaptation. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease-specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  17. Sports Video Classification Based on Marked Genre Shots and Bag of Words Model

    Institute of Scientific and Technical Information of China (English)

    朱映映; 朱艳艳; 文振焜

    2013-01-01

    Content-based sports video classification is one of the key steps in efficiently managing large volumes of sports video data. To improve the accuracy and generalization ability of sports video classification, a new method combining marked genre shots with a bag-of-visual-words model is proposed. First, the definition of marked genre shots is given, and a training database of video frames is constructed from those shots. Second, a pyramid bag-of-visual-words model is built on the training database; each video frame is represented as a normalized word-frequency vector, and an SVM is used to classify the frames. Next, by analyzing the causes and forms of frame misclassification, an isolated-frame removal algorithm based on temporal continuity is proposed to eliminate misclassified frames. Finally, since sports videos can be divided by combination type into single sports videos and mixed sports videos, separate classification algorithms are proposed for each. Experimental results show that the proposed algorithms are simple to implement, fast, and highly accurate.

  18. Genre classification using chords and stochastic language models

    OpenAIRE

    Pérez Sancho, Carlos; Rizo Valero, David; Iñesta Quereda, José Manuel

    2009-01-01

    Music genre meta-data is of paramount importance for the organisation of music repositories. People use genre in a natural way when entering a music store or looking into music collections. Automatic genre classification has become a popular topic in music information retrieval research, both with digital audio and with symbolic data. This work focuses on the symbolic approach, bringing to music cognition some technologies, like stochastic language models, already successfully applied to text ...

  19. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    Science.gov (United States)

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself as widespread pain that impairs the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. Recently, however, the applicability and sufficiency of the ACR criteria have been under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, ACR had to update the criteria announced in 1990, in 2010 and 2011. The proposed rule based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It was observed that the fuzzy predictor was generally 95.56% consistent with at least one of the specialists who was not a creator of the fuzzy rule base. Thus, in diagnostic classification, where the severity of FMS was classified as well, consistent findings were obtained when comparing the interpretations and experiences of specialists with the fuzzy logic approach. The study proposes a rule base that could eliminate the shortcomings of the 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification; the probability of occurrence and the severity were classified as well. In addition, those who were not suffering from FMS were
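
A two-rule Mamdani-style fragment illustrates how such a fuzzy rule base maps inputs like tender point count and pain severity to a severity class. The membership ranges and rules below are invented for illustration, not the clinical ones from the study:

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fms_severity(tender_points, pain_severity):
    """Tiny two-input rule base in the spirit of the paper.
    Rule 1: many tender points AND high pain -> severe
    Rule 2: few tender points  AND low pain  -> mild"""
    many = tri(tender_points, 8, 14, 18)
    few = tri(tender_points, 0, 4, 10)
    high = tri(pain_severity, 5, 8, 10)
    low = tri(pain_severity, 0, 2, 6)
    severe = min(many, high)  # fuzzy AND via min (Mamdani inference)
    mild = min(few, low)
    return "severe" if severe > mild else "mild"

print(fms_severity(tender_points=13, pain_severity=8))  # severe
print(fms_severity(tender_points=3, pain_severity=2))   # mild
```

The full system would include all five input parameters, many more rules, and a defuzzification step producing a graded severity score rather than a hard label.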

  20. [Classification of THz Transmission Spectrum Based on Kernel Function of Convex Combination].

    Science.gov (United States)

    Wang, Rui-qi; Shen, Tao; Ma, Shuai; Quo, Jian-yi; Yu, Zheng-tao

    2015-05-01

    In the present paper, a support vector machine (SVM) based on a convex combination kernel function is used for the classification of THz pulse transmission spectra. Wavelet transform is used in data pre-processing. Peaks and valleys are regarded as location features of the THz pulse transmission spectra and are combined with maximum-interval features of term frequency-inverse document frequency (TF-IDF). From information theory, a weight is derived for each sampling point, representing the possibility that the sampling point becomes a feature. Because different terahertz transmission spectra lack obvious features, we composed an SVM classification model based on a convex combination kernel function, using an evaluation function to obtain the parameters of the optimal convex combination and achieve better accuracy. Once the optimal kernel-function parameters were determined, we composed the model for classification and prediction. Compared with a single kernel function, the method can iteratively combine transmission spectroscopic features with the classification model. Thanks to the dimensional mapping process, an outstanding margin between features can be gained for samples of different terahertz transmission spectra. We carried out experiments using different samples. The results demonstrated that the new approach is on par with or superior to SVM with a single kernel function in terms of accuracy, and much better in feature fusion. PMID:26415425
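
The key property exploited here is that a convex combination of two valid kernels is itself a valid (positive semi-definite) kernel. A minimal sketch with linear and RBF components as stand-ins for the paper's kernels:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian RBF kernel matrix between row vectors of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def linear_kernel(X, Y):
    return X @ Y.T

def convex_combination_kernel(X, Y, lam=0.3):
    """lam*K1 + (1-lam)*K2 is a valid kernel for any 0 <= lam <= 1,
    since the PSD cone is closed under non-negative combinations."""
    return lam * linear_kernel(X, Y) + (1 - lam) * rbf_kernel(X, Y)

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
K = convex_combination_kernel(X, X)
print(np.linalg.eigvalsh(K).min() >= -1e-10)  # PSD check
```

Optimizing the combination parameter lam against a validation-set criterion plays the role of the evaluation function mentioned in the abstract.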

  1. Various forms of indexing HDMR for modelling multivariate classification problems

    International Nuclear Information System (INIS)

    The Indexing HDMR method was recently developed for modelling multivariate interpolation problems. The method uses the Plain HDMR philosophy in partitioning the given multivariate data set into less-variate data sets and then constructing an analytical structure through these partitioned data sets to represent the given multidimensional problem. Indexing HDMR makes HDMR applicable to classification problems with real-world data. Mostly, we do not know all possible class values in the domain of the given problem; that is, we have a non-orthogonal data structure. However, Plain HDMR needs an orthogonal data structure in the problem to be modelled. In this sense, the main idea of this work is to offer various forms of Indexing HDMR to successfully model these real-life classification problems. To test these different forms, several well-known multivariate classification problems from the UCI Machine Learning Repository were used, and it was observed that the accuracy results lie between 80% and 95%, which is very satisfactory

  2. A Novel Algorithm of Network Trade Customer Classification Based on Fourier Basis Functions

    Directory of Open Access Journals (Sweden)

    Li Xinwu

    2013-11-01

    Full Text Available The learning algorithm of a neural network is always an important research topic in neural network theory and applications; learning algorithms for feed-forward neural networks in particular have no satisfactory solution because of their slow calculation speed. This paper presents a new Fourier-basis-function neural network algorithm and applies it to classify network trade customers. First, 21 customer classification indicators, including customer-characteristic variables and customer-behavior variables, are designed based on an analysis of the characteristics and behaviors of network trade customers. Second, Fourier basis functions are used to improve the calculation flow and algorithm structure of the original BP neural network algorithm to speed up its convergence, and a new Fourier basis neural network model is constructed. Finally, the experimental results show that the convergence-speed problem can be solved, and the accuracy of customer classification is ensured, when the new algorithm is applied to network trade customer classification in practice.
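
The core idea of a Fourier-basis layer — expanding each input in a truncated Fourier basis before the linear combination — can be sketched as follows (illustrative only; the paper's network structure and training procedure are more elaborate):

```python
import math

def fourier_features(x, order=3):
    """Expand a scalar input into a truncated Fourier basis
    [1, cos(x), sin(x), cos(2x), sin(2x), ...]. A hidden layer built from
    such basis functions replaces the sigmoid layer of a plain BP network."""
    feats = [1.0]
    for k in range(1, order + 1):
        feats.append(math.cos(k * x))
        feats.append(math.sin(k * x))
    return feats

print(len(fourier_features(0.5, order=3)))  # 7 features: bias + 3 cos/sin pairs
print(fourier_features(0.0, order=2))       # [1.0, 1.0, 0.0, 1.0, 0.0]
```

Because the basis functions are orthogonal, the output weights can be fitted by ordinary least squares in one pass, which is the source of the convergence speed-up over backpropagation through sigmoid layers.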

  3. A Novel Land Cover Classification Map Based on a MODIS Time-Series in Xinjiang, China

    Directory of Open Access Journals (Sweden)

    Linlin Lu

    2014-04-01

    Full Text Available Accurate mapping of land cover on a regional scale is useful for climate and environmental modeling. In this study, we present a novel land cover classification product based on spectral and phenological information for the Xinjiang Uygur Autonomous Region (XUAR) in China. The product is derived at a 500 m spatial resolution using an innovative approach employing moderate resolution imaging spectroradiometer (MODIS) surface reflectance and enhanced vegetation index (EVI) time series. The classification results capture regional-scale land cover patterns as well as small-scale phenomena. By applying a regionally specified classification scheme, an extensive collection of training data, and regionally tuned data processing, the quality and consistency of the phenological maps are significantly improved. With its ability to provide an updated land cover product under heterogeneous environmental and climatic conditions, the novel land cover map is valuable for research related to environmental change in this region.

  4. Partial volume tissue classification of multichannel magnetic resonance images-a mixel model.

    Science.gov (United States)

    Choi, H S; Haynor, D R; Kim, Y

    1991-01-01

    A single volume element (voxel) in a medical image may be composed of a mixture of multiple tissue types. The authors call voxels which contain multiple tissue classes mixels. A statistical mixel image model based on Markov random field (MRF) theory and an algorithm for the classification of mixels are presented. The authors concentrate on the classification of multichannel magnetic resonance (MR) images of the brain although the algorithm has other applications. The authors also present a method for compensating for the gray-level variation of MR images between different slices, which is primarily caused by the inhomogeneity of the RF field produced by the imaging coil. PMID:18222842

  5. A probabilistic neural network approach for modeling and classification of bacterial growth/no-growth data.

    Science.gov (United States)

    Hajmeer, M; Basheer, I

    2002-10-01

    In this paper, we propose to use probabilistic neural networks (PNNs) for classification of bacterial growth/no-growth data and modeling the probability of growth. The PNN approach combines both Bayes theorem of conditional probability and Parzen's method for estimating the probability density functions of the random variables. Unlike other neural network training paradigms, PNNs are characterized by high training speed and their ability to produce confidence levels for their classification decision. As a practical application of the proposed approach, PNNs were investigated for their ability in classification of growth/no-growth state of a pathogenic Escherichia coli R31 in response to temperature and water activity. A comparison with the most frequently used traditional statistical method based on logistic regression and multilayer feedforward artificial neural network (MFANN) trained by error backpropagation was also carried out. The PNN-based models were found to outperform linear and nonlinear logistic regression and MFANN in both the classification accuracy and ease by which PNN-based models are developed. PMID:12133614
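
    A PNN in the sense above can be sketched in a few lines: each class density is a Parzen average of Gaussian kernels centered on that class's training points, and Bayes' rule picks the class with the highest estimated density. The toy data and the smoothing width `sigma` are illustrative assumptions:

```python
import numpy as np

def pnn_classify(X_train, y_train, X_test, sigma=0.5):
    """Probabilistic neural network: Parzen-window density per class,
    the class with the highest estimated density wins."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        Xc = X_train[y_train == c]
        # squared distances from each test point to each training point of class c
        d2 = ((X_test[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
        # Gaussian kernel average = Parzen density estimate (up to a constant)
        scores.append(np.exp(-d2 / (2 * sigma ** 2)).mean(axis=1))
    return classes[np.argmax(np.stack(scores), axis=0)]

# toy growth/no-growth style data: two features (e.g. temperature, water activity)
X = np.array([[0., 0.], [0.1, 0.2], [1., 1.], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(pnn_classify(X, y, np.array([[0.05, 0.1], [1.0, 0.9]])))  # → [0 1]
```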

  6. Spectral classification of stars based on LAMOST spectra

    CERN Document Server

    Liu, Chao; Zhang, Bo; Wan, Jun-Chen; Deng, Li-Cai; Hou, Yonghui; Wang, Yuefei; Yang, Ming; Zhang, Yong

    2015-01-01

    In this work, we select high signal-to-noise ratio spectra of stars from the LAMOST data and map their MK classes to the spectral features. The equivalent widths of the prominent spectral lines, playing a similar role to multi-color photometry, form a clean stellar locus well ordered by MK class. The advantage of the stellar locus in line indices is that it gives a natural and continuous classification of stars consistent with either the broadly used MK classes or the stellar astrophysical parameters. We also employ an SVM-based classification algorithm to assign MK classes to the LAMOST stellar spectra. We find that the completeness of the classification is up to 90% for A and G type stars, while it is down to about 50% for OB and K type stars. About 40% of the OB and K type stars are mis-classified as A and G type stars, respectively. This is likely because the differences in spectral features between the late B type and early A type stars, or between the late G and early K type stars, are very we...

  7. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate with their defined risk posture. As part of this, the full suite of risk classifications, formal and informal, will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  8. Content-based image retrieval applied to BI-RADS tissue classification in screening mammography

    OpenAIRE

    2011-01-01

    AIM: To present a content-based image retrieval (CBIR) system that supports the classification of breast tissue density and can be used in the processing chain to adapt parameters for lesion segmentation and classification.

  9. A Hybrid Classification Approach based on FCA and Emerging Patterns - An application for the classification of biological inhibitors

    OpenAIRE

    Asses, Yasmine; Buzmakov, Aleksey; Bourquard, Thomas; Kuznetsov, Sergei O.; Napoli, Amedeo

    2012-01-01

    Classification is an important task in data analysis and learning. Classification can be performed using supervised or unsupervised methods. From the unsupervised point of view, Formal Concept Analysis (FCA) can be used for such a task in an efficient and well-founded way. From the supervised point of view, emerging patterns rely on pattern mining and can be used to characterize classes of objects w.r.t. a priori labels. In this paper, we present a hybrid classification method which is based ...

  10. [Animal Models of Depression: Behavior as the Basis for Methodology, Assessment Criteria and Classifications].

    Science.gov (United States)

    Grigoryan, G A; Gulyaeva, N V

    2015-01-01

    An analysis of the current state of depression modeling in animals is presented. Criteria and classification systems of the existing models are considered, as well as approaches to the assessment of model validity. Though numerous approaches to modeling depressive states based on disturbances of both motivational and emotional brain mechanisms have been elaborated, no satisfactory model of a stable depression state has been developed yet. However, the diversity of existing models is quite positive, since it allows performing targeted studies of selected neurobiological mechanisms and laws of depressive state development, as well as investigating mechanisms of action and predicting pharmacological profiles of potential antidepressants. PMID:26841653

  11. Classification images in a very general decision model.

    Science.gov (United States)

    Murray, Richard F

    2016-06-01

    Most of the theory supporting our understanding of classification images relies on standard signal detection models and the use of normally distributed stimulus noise. Here I show that the most common methods of calculating classification images by averaging stimulus noise samples within stimulus-response classes of trials are much more general than has previously been demonstrated, and that they give unbiased estimates of an observer's template for a wide range of decision rules and non-Gaussian stimulus noise distributions. These results are similar to findings on reverse correlation and related methods in the neurophysiology literature, but here I formulate them in terms that are tailored to signal detection analyses of visual tasks, in order to make them more accessible and useful to visual psychophysicists. I examine 2AFC and yes-no designs. These findings make it possible to use and interpret classification images in tasks where observers' decision strategies may not conform to classic signal detection models such as the difference rule, and in tasks where the stimulus noise is non-Gaussian. PMID:27174841
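
    The averaging scheme described here can be demonstrated on a simulated linear observer: average the noise fields within each stimulus-response class and combine them with the usual +/- weighting; the recovered image is proportional to the observer's template. The template, signal strength, and criterion below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
template = np.array([1.0, -1.0, 0.5, 0.0, -0.5])    # observer's internal template
n_trials, dim = 20000, 5
signal_present = rng.integers(0, 2, n_trials)       # 0 = noise trial, 1 = signal trial
noise = rng.normal(0, 1, (n_trials, dim))
stimulus = noise + 0.5 * signal_present[:, None] * template
response = (stimulus @ template > 0.25).astype(int) # template-matching "yes/no" observer

# classification image: average the noise within each stimulus-response class,
# then combine with the standard +/- weighting over responses
ci = np.zeros(dim)
for s in (0, 1):
    for r in (0, 1):
        sel = (signal_present == s) & (response == r)
        ci += (1 if r == 1 else -1) * noise[sel].mean(axis=0)

# the recovered image should be proportional to the true template
corr = np.corrcoef(ci, template)[0, 1]
print(round(corr, 2))
```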

  12. A minimum spanning forest based classification method for dedicated breast CT images

    Energy Technology Data Exchange (ETDEWEB)

    Pike, Robert [Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia 30329 (United States); Sechopoulos, Ioannis [Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia 30329 and Winship Cancer Institute of Emory University, Atlanta, Georgia 30322 (United States); Fei, Baowei, E-mail: bfei@emory.edu [Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia 30329 (United States); Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, Georgia 30322 (United States); Department of Mathematics and Computer Science, Emory University, Atlanta, Georgia 30322 (United States); Winship Cancer Institute of Emory University, Atlanta, Georgia 30322 (United States)

    2015-11-15

    Purpose: To develop and test an automated algorithm to classify different types of tissue in dedicated breast CT images. Methods: Images of a single breast of five different patients were acquired with a dedicated breast CT clinical prototype. The breast CT images were processed by a multiscale bilateral filter to reduce noise while keeping edge information and were corrected to overcome cupping artifacts. As skin and glandular tissue have similar CT values on breast CT images, morphologic processing is used to identify the skin based on its position information. A support vector machine (SVM) is trained and the resulting model used to create a pixelwise classification map of fat and glandular tissue. By combining the results of the skin mask with the SVM results, the breast tissue is classified as skin, fat, and glandular tissue. This map is then used to identify markers for a minimum spanning forest that is grown to segment the image using spatial and intensity information. To evaluate the authors’ classification method, they use DICE overlap ratios to compare the results of the automated classification to those obtained by manual segmentation on five patient images. Results: Comparison between the automatic and the manual segmentation shows that the minimum spanning forest based classification method was able to successfully classify dedicated breast CT image with average DICE ratios of 96.9%, 89.8%, and 89.5% for fat, glandular, and skin tissue, respectively. Conclusions: A 2D minimum spanning forest based classification method was proposed and evaluated for classifying the fat, skin, and glandular tissue in dedicated breast CT images. The classification method can be used for dense breast tissue quantification, radiation dose assessment, and other applications in breast imaging.
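
    The DICE overlap ratio used for evaluation is straightforward to compute from two label maps; the tiny 3×3 maps below are made-up stand-ins for the automatic and manual segmentations:

```python
import numpy as np

def dice(seg_a, seg_b, label):
    """DICE overlap ratio for one tissue label between two label maps."""
    a = (seg_a == label)
    b = (seg_b == label)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

auto   = np.array([[0, 1, 1], [2, 2, 1], [0, 0, 2]])  # e.g. 0=fat, 1=glandular, 2=skin
manual = np.array([[0, 1, 1], [2, 1, 1], [0, 0, 2]])
print(round(dice(auto, manual, 1), 3))  # → 0.857
```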

  13. A minimum spanning forest based classification method for dedicated breast CT images

    International Nuclear Information System (INIS)

    Purpose: To develop and test an automated algorithm to classify different types of tissue in dedicated breast CT images. Methods: Images of a single breast of five different patients were acquired with a dedicated breast CT clinical prototype. The breast CT images were processed by a multiscale bilateral filter to reduce noise while keeping edge information and were corrected to overcome cupping artifacts. As skin and glandular tissue have similar CT values on breast CT images, morphologic processing is used to identify the skin based on its position information. A support vector machine (SVM) is trained and the resulting model used to create a pixelwise classification map of fat and glandular tissue. By combining the results of the skin mask with the SVM results, the breast tissue is classified as skin, fat, and glandular tissue. This map is then used to identify markers for a minimum spanning forest that is grown to segment the image using spatial and intensity information. To evaluate the authors’ classification method, they use DICE overlap ratios to compare the results of the automated classification to those obtained by manual segmentation on five patient images. Results: Comparison between the automatic and the manual segmentation shows that the minimum spanning forest based classification method was able to successfully classify dedicated breast CT image with average DICE ratios of 96.9%, 89.8%, and 89.5% for fat, glandular, and skin tissue, respectively. Conclusions: A 2D minimum spanning forest based classification method was proposed and evaluated for classifying the fat, skin, and glandular tissue in dedicated breast CT images. The classification method can be used for dense breast tissue quantification, radiation dose assessment, and other applications in breast imaging.

  14. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-11-01

    Full Text Available In a content-based image retrieval system (CBIR), the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of retrieval performance of image features. This paper presents a review of fundamental aspects of content based image retrieval including feature extraction of color and texture features. Commonly used color features including color moments, color histogram and color correlogram and Gabor texture are compared. The paper reviews the increase in efficiency of image retrieval when the color and texture features are combined. The similarity measures based on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural network based pattern learning can be used to achieve effective classification.
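
    Of the color features reviewed, color moments are the simplest to compute: the mean, standard deviation, and (signed) skewness of each channel give a compact 9-dimensional descriptor. A minimal sketch:

```python
import numpy as np

def color_moments(img):
    """First three color moments (mean, std, skewness) per channel
    of an H x W x 3 image; a compact 9-dim color feature vector."""
    feats = []
    for c in range(img.shape[2]):
        ch = img[:, :, c].astype(float).ravel()
        mu = ch.mean()
        sigma = ch.std()
        skew = np.cbrt(((ch - mu) ** 3).mean())   # cube root keeps the sign
        feats.extend([mu, sigma, skew])
    return np.array(feats)

img = np.zeros((4, 4, 3))
img[:, :, 0] = 10.0        # constant red channel: mean 10, zero spread and skew
print(color_moments(img)[:3])
```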

  15. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-10-01

    Full Text Available In a content-based image retrieval system (CBIR), the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of retrieval performance of image features. This paper presents a review of fundamental aspects of content based image retrieval including feature extraction of color and texture features. Commonly used color features including color moments, color histogram and color correlogram and Gabor texture are compared. The paper reviews the increase in efficiency of image retrieval when the color and texture features are combined. The similarity measures based on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural network based pattern learning can be used to achieve effective classification.

  16. A mixed effects least squares support vector machine model for classification of longitudinal data

    OpenAIRE

    Luts, Jan; Molenberghs, Geert; Verbeke, Geert; Van Huffel, Sabine; Suykens, Johan A.K.

    2012-01-01

    A mixed effects least squares support vector machine (LS-SVM) classifier is introduced to extend the standard LS-SVM classifier for handling longitudinal data. The mixed effects LS-SVM model contains a random intercept and allows classification of highly unbalanced data, in the sense that there is an unequal number of observations for each case at non-fixed time points. The methodology consists of a regression modeling step and a classification step based on the obtained regression estimates. Regression...

  17. Variable Star Signature Classification using Slotted Symbolic Markov Modeling

    CERN Document Server

    Johnston, Kyle B

    2016-01-01

    With the advent of digital astronomy, new benefits and new challenges have been presented to the modern day astronomer. No longer can the astronomer rely on manual processing; instead, the profession as a whole has begun to adopt more advanced computational means. This paper focuses on the construction and application of a novel time-domain signature extraction methodology and the development of a supporting supervised pattern classification algorithm for the identification of variable stars. A methodology for the reduction of stellar variable observations (time-domain data) into a novel feature space representation is introduced. The methodology presented will be referred to as Slotted Symbolic Markov Modeling (SSMM) and has a number of advantages which will be demonstrated to be beneficial, specifically to the supervised classification of stellar variables. It will be shown that the methodology outperformed a baseline standard methodology on a standardized set of stellar light curve data. The performance on ...

  18. Texton Based Shape Features on Local Binary Pattern for Age Classification

    OpenAIRE

    V. Vijaya Kumar; B. Eswara Reddy; P. Chandra Sekhar Reddy

    2012-01-01

    Classification and recognition of objects is of interest to many researchers. Shape is a significant feature of objects and plays a crucial role in image classification and recognition. The present paper assumes that the features that most strongly affect the adulthood classification system are the shape features (SF) of the face. Based on this, the present paper proposes a new technique of adulthood classification by extracting feature parameters of the face on Integrated Texton based LBP (IT-LBP) ima...

  19. Classification of Ocean Acoustic Data Using AR Modeling and Wavelet Transforms

    OpenAIRE

    Fargues, Monique P.; Bennett, R.; Harris, J.; Barsanti, R. J.

    1997-01-01

    This study investigates the application of orthogonal and non-orthogonal wavelet-based procedures, and AR modeling, as feature extraction techniques to classify several classes of underwater signals consisting of sperm whale, killer whale, gray whale, pilot whale, humpback whale, and underwater earthquake data. A two-hidden-layer back-propagation neural network is used for the classification procedure. Performances obtained using the two wavelet-based schemes are compared with those obtained usin...

  20. Object-Based Crop Species Classification Based on the Combination of Airborne Hyperspectral Images and LiDAR Data

    Directory of Open Access Journals (Sweden)

    Xiaolong Liu

    2015-01-01

    Full Text Available Identification of crop species is an important issue in agricultural management. In recent years, many studies have explored this topic using multi-spectral and hyperspectral remote sensing data. In this study, we perform dedicated research to propose a framework for mapping crop species by combining hyperspectral and Light Detection and Ranging (LiDAR) data in an object-based image analysis (OBIA) paradigm. The aims of this work were the following: (i) to understand the performance of different spectral dimension-reduced features from hyperspectral data and their combination with LiDAR-derived height information in image segmentation; (ii) to understand what classification accuracies of crop species can be achieved by combining hyperspectral and LiDAR data in an OBIA paradigm, especially in regions that have a fragmented agricultural landscape and complicated crop planting structure; and (iii) to understand the contributions of the crop height derived from LiDAR data, as well as the geometric and textural features of image objects, to the separability of crop species. The study region was an irrigated agricultural area in the central Heihe river basin, which is characterized by many crop species, complicated crop planting structures, and a fragmented landscape. The airborne hyperspectral data acquired by the Compact Airborne Spectrographic Imager (CASI) with a 1 m spatial resolution and the Canopy Height Model (CHM) data derived from the LiDAR data acquired by the airborne Leica ALS70 LiDAR system were used for this study. The image segmentation accuracies of different feature combination schemes (very high-resolution imagery (VHR), VHR/CHM, and minimum noise fraction transformed data (MNF)/CHM) were evaluated and analyzed. The results showed that VHR/CHM outperformed the other two combination schemes with a segmentation accuracy of 84.8%. The object-based crop species classification results of different feature integrations indicated that

  1. Analytical models and system topologies for remote multispectral data acquisition and classification

    Science.gov (United States)

    Huck, F. O.; Park, S. K.; Burcher, E. E.; Kelly, W. L., IV

    1978-01-01

    Simple analytical models are presented of the radiometric and statistical processes that are involved in multispectral data acquisition and classification. Also presented are basic system topologies which combine remote sensing with data classification. These models and topologies offer a preliminary but systematic step towards the use of computer simulations to analyze remote multispectral data acquisition and classification systems.

  2. Generalization performance of graph-based semisupervised classification

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    Semi-supervised learning has been of growing interest over the past few years and many methods have been proposed. Although various algorithms are provided to implement semi-supervised learning, there are still gaps in our understanding of the dependence of generalization error on the numbers of labeled and unlabeled data. In this paper, we consider a graph-based semi-supervised classification algorithm and establish its generalization error bounds. Our results show the close relations between the generalization performance and the structural invariants of the data graph.
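
    One common graph-based semi-supervised classifier of the kind analyzed here is label propagation on a Gaussian affinity graph; a minimal sketch (the bandwidth `sigma` and the iteration count are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def label_propagation(X, y, n_iters=200, sigma=0.3):
    """Graph-based semi-supervised classification: build a Gaussian
    affinity graph and iteratively propagate labels (y = -1 marks
    unlabeled points), clamping the labeled points each sweep."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)         # row-normalised transition matrix
    classes = np.unique(y[y >= 0])
    F = np.zeros((len(X), len(classes)))
    F[y >= 0, :] = (y[y >= 0, None] == classes[None, :]).astype(float)
    clamped = F.copy()
    for _ in range(n_iters):
        F = P @ F
        F[y >= 0] = clamped[y >= 0]              # keep labeled points fixed
    return classes[F.argmax(axis=1)]

X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
y = np.array([0, -1, -1, -1, -1, 1])             # only the two endpoints are labeled
print(label_propagation(X, y))  # → [0 0 0 1 1 1]
```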

  3. Hydrophobicity classification of polymeric materials based on fractal dimension

    Directory of Open Access Journals (Sweden)

    Daniel Thomazini

    2008-12-01

    Full Text Available This study proposes a new method to obtain the hydrophobicity classification (HC) of high voltage polymer insulators. In the proposed method, the HC was analyzed by fractal dimension (fd), and its processing time was evaluated with a view to application in mobile devices. Texture images were created by spraying solutions produced from mixtures of isopropyl alcohol and distilled water in proportions ranging from 0 to 100% volume of alcohol (%AIA). Based on these solutions, the contact angles of the drops were measured and the textures were used as patterns for fractal dimension calculations.
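
    The fractal dimension of a binary texture image can be estimated by box counting: cover the image with boxes of shrinking size and regress log(occupied boxes) on log(1/size). A minimal sketch (not the authors' exact procedure):

```python
import numpy as np

def box_counting_dimension(img):
    """Estimate the fractal dimension of a binary image by box counting:
    the slope of log(occupied box count) versus log(1 / box size)."""
    n = img.shape[0]                      # assume a square 2^k x 2^k image
    sizes, counts = [], []
    size = n
    while size >= 1:
        k = n // size
        blocks = img.reshape(k, size, k, size)
        counts.append(blocks.any(axis=(1, 3)).sum())
        sizes.append(size)
        size //= 2
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# a filled square is 2-dimensional; a single row is ~1-dimensional
filled = np.ones((64, 64), dtype=bool)
line = np.zeros((64, 64), dtype=bool)
line[32, :] = True
print(round(box_counting_dimension(filled), 2), round(box_counting_dimension(line), 2))  # → 2.0 1.0
```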

  4. An AIS-Based E-mail Classification Method

    Science.gov (United States)

    Qing, Jinjian; Mao, Ruilong; Bie, Rongfang; Gao, Xiao-Zhi

    This paper proposes a new e-mail classification method based on the Artificial Immune System (AIS), which is endowed with good diversity and self-adaptive ability by using the immune learning, immune memory, and immune recognition. In our method, the features of spam and non-spam extracted from the training sets are combined together, and the number of false positives (non-spam messages that are incorrectly classified as spam) can be reduced. The experimental results demonstrate that this method is effective in reducing the false rate.

  5. Commercial Shot Classification Based on Multiple Features Combination

    Science.gov (United States)

    Liu, Nan; Zhao, Yao; Zhu, Zhenfeng; Ni, Rongrong

    This paper presents a commercial shot classification scheme combining well-designed visual and textual features to automatically detect TV commercials. To identify the inherent difference between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video texts typical of commercials. In addition, we introduce an ensemble-learning based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.

  6. A Method for Data Classification Based on Discernibility Matrix and Discernibility Function

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function in rough set theory can be used in data classification, so we put forward a method for data classification. Namely, first, we use the discernibility matrix and discernibility function to delete superfluous attributes in an information system and obtain a necessary attribute set. Second, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
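
    The discernibility matrix records, for each pair of objects with different decisions, the condition attributes that distinguish them; attributes appearing in singleton entries are indispensable (the core). A minimal sketch on a made-up decision table:

```python
from itertools import combinations

def discernibility_matrix(objects, decisions):
    """Discernibility matrix of a decision table: for each pair of objects
    with different decisions, the set of condition attributes on which
    they differ."""
    m = {}
    for i, j in combinations(range(len(objects)), 2):
        if decisions[i] != decisions[j]:
            m[(i, j)] = {a for a, (u, v) in enumerate(zip(objects[i], objects[j])) if u != v}
    return m

def core_attributes(matrix):
    """Attributes occurring in a singleton entry are indispensable."""
    return {next(iter(s)) for s in matrix.values() if len(s) == 1}

# toy decision table: 3 condition attributes, decisions kept in a separate column
objs = [(1, 0, 0), (1, 1, 0), (0, 1, 1)]
dec  = [0, 1, 1]
m = discernibility_matrix(objs, dec)
print(m[(0, 1)], core_attributes(m))  # → {1} {1}
```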

  7. Remote Sensing Classification Method of Wetland Based on AN Improved Svm

    Science.gov (United States)

    Lin, Y.; Shen, M.; Liu, B.; Ye, Q.

    2013-08-01

    Population growth and economic development, especially land use change and urbanization, have put wetland resources under huge pressure and caused a sharp decline in recent years. Therefore, wetland eco-environment degradation and sustainable development have become the focus of wetland research. Remote sensing technology has become an important means of dynamic environmental monitoring. Using remote sensing technology to develop dynamic monitoring of wetland spatial variation patterns has practical significance for wetland protection, restoration, and sustainable utilization. In view of the complexity of wetland information extraction and the performance of the SVM classifier, this paper proposes a feature weighted SVM classifier using a mixed kernel function. In order to ensure high accuracy of the classification result, the feature spaces and the interpretation keys are constructed from the properties of different data. We use GainRatio(featurei) to build the feature weighting parameter h and test different kernel functions in the SVM. Since different kernel functions influence the fitting ability and prediction accuracy of the SVM, and categories are more easily discriminated the higher the GainRatio, we introduce feature weights ω calculated from GainRatio into the model. Accordingly, we developed an improved model named "Feature weighted & Mixed kernel function SVM" based on a series of experiments. Taking the east beach of Chongming Island in Shanghai as a case study, the improved model shows superior extensibility and stability in comparison with the classification results of experiments applying Minimum Distance classification and SVM classification with the radial basis function and polynomial kernels, using Landsat TM data from 2009. This new model also avoids domination by weakly correlated or uncorrelated features and integrates different information sources effectively to
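
    The GainRatio weighting mentioned above is the information gain of a feature normalised by its split information; a minimal sketch for discrete features (the toy data are illustrative, not from the paper):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature, labels):
    """Information gain of a discrete feature normalised by its split
    information; usable as a per-feature weight in a weighted kernel."""
    n = len(labels)
    split_info, cond_entropy = 0.0, 0.0
    for v in set(feature):
        idx = [i for i, f in enumerate(feature) if f == v]
        p = len(idx) / n
        split_info -= p * math.log2(p)
        cond_entropy += p * entropy([labels[i] for i in idx])
    gain = entropy(labels) - cond_entropy
    return gain / split_info if split_info > 0 else 0.0

labels  = ['wet', 'wet', 'dry', 'dry']
useful  = ['a', 'a', 'b', 'b']      # perfectly predicts the class
useless = ['a', 'b', 'a', 'b']      # independent of the class
print(gain_ratio(useful, labels), gain_ratio(useless, labels))  # → 1.0 0.0
```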

  8. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification method based on human gait features and investigates the problem of two variations, clothing (wearing coats) and carrying a bag, in addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying the wavelet transform. Three different feature sets are proposed in this method. The first is the spatio-temporal distance, dealing with the distances between different parts of the human body (such as the feet, knees, hands, height, and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets, we divided the human body into upper and lower parts based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize its discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and relatively better performance compared with existing approaches.
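
    The Fisher score used here for feature selection rates each feature by the scatter of the class means relative to the within-class variance; a minimal sketch on synthetic data:

```python
import numpy as np

def fisher_score(X, y):
    """Fisher score per feature: between-class scatter of the class means
    divided by the within-class variance; higher = more discriminative."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / den

rng = np.random.default_rng(1)
y = np.array([0] * 50 + [1] * 50)
X = rng.normal(0, 1, (100, 3))
X[:, 0] += 3 * y                     # only feature 0 separates the classes
scores = fisher_score(X, y)
print(scores.argmax())  # → 0
```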

  9. Forest Classification Based on Forest texture in Northwest Yunnan Province

    International Nuclear Information System (INIS)

    Forest texture is an intrinsic characteristic and an important visual feature of a forest ecological system. Full utilization of forest texture is a great help in increasing the accuracy of forest classification based on remotely sensed data. Taking Shangri-La as the study area, forest classification was carried out based on texture. The results show that: (1) in terms of texture abundance, texture boundaries, entropy, and visual interpretation, the combination of the gray-gradient co-occurrence matrix and the wavelet transform is much better than either method of forest texture information extraction alone; (2) during forest texture information extraction, the size of the suitable texture window determined by the semi-variogram method depends on the forest type (evergreen broadleaf forest is 3×3, deciduous broadleaf forest is 5×5, etc.); (3) when classifying forest based on forest texture information, the texture factor assembly differs among forests: Variance, Heterogeneity, and Correlation should be selected when the window is between 3×3 and 5×5; Mean, Correlation, and Entropy should be used when the window is in the range of 7×7 to 19×19; and Correlation, Second Moment, and Variance should be used when the window is larger than 21×21.
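
    Several of the texture factors named here (Entropy, Second Moment, Variance) come from a gray-level co-occurrence matrix; a minimal sketch for one displacement (the checkerboard input and two gray levels are illustrative assumptions):

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    """Gray-level co-occurrence matrix for one pixel displacement,
    normalised to a joint probability table."""
    m = np.zeros((levels, levels))
    h, w = img.shape
    for i in range(h - dy):
        for j in range(w - dx):
            m[img[i, j], img[i + dy, j + dx]] += 1
    return m / m.sum()

def glcm_features(p):
    """Entropy, second moment (energy), and variance of the table."""
    eps = 1e-12
    entropy = -(p * np.log(p + eps)).sum()
    second_moment = (p ** 2).sum()
    i = np.arange(p.shape[0])[:, None]
    mean = (i * p).sum()
    variance = (((i - mean) ** 2) * p).sum()
    return entropy, second_moment, variance

texture = np.tile(np.array([[0, 1], [1, 0]]), (4, 4))   # checkerboard, 2 gray levels
p = glcm(texture, levels=2)
print(glcm_features(p)[1])  # second moment of a perfectly regular texture → 0.5
```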

  10. Fuzzy classification of phantom parent groups in an animal model

    Directory of Open Access Journals (Sweden)

    Fikse Freddy

    2009-09-01

    Full Text Available Abstract Background Genetic evaluation models often include genetic groups to account for unequal genetic level of animals with unknown parentage. The definition of phantom parent groups usually includes a time component (e.g. years). Combining several time periods to ensure sufficiently large groups may create problems since all phantom parents in a group are considered contemporaries. Methods To avoid the downside of such distinct classification, a fuzzy logic approach is suggested. A phantom parent can be assigned to several genetic groups, with proportions between zero and one that sum to one. Rules were presented for assigning coefficients to the inverse of the relationship matrix for fuzzy-classified genetic groups. This approach was illustrated with simulated data from ten generations of mass selection. Observations and pedigree records were randomly deleted. Phantom parent groups were defined on the basis of gender and generation number. In one scenario, uncertainty about generation of birth was simulated for some animals with unknown parents. In the distinct classification, one of the two possible generations of birth was randomly chosen to assign phantom parents to genetic groups for animals with simulated uncertainty, whereas the phantom parents were assigned to both possible genetic groups in the fuzzy classification. Results The empirical prediction error variance (PEV was somewhat lower for fuzzy-classified genetic groups. The ranking of animals with unknown parents was more correct and less variable across replicates in comparison with distinct genetic groups. In another scenario, each phantom parent was assigned to three groups, one pertaining to its gender, and two pertaining to the first and last generation, with proportion depending on the (true) generation of birth. Due to the lower number of groups, the empirical PEV of breeding values was smaller when genetic groups were fuzzy-classified. Conclusion Fuzzy-classification

  11. Classification of Lactococcus lactis cell envelope proteinase based on gene sequencing, peptides formed after hydrolysis of milk, and computer modeling

    DEFF Research Database (Denmark)

    Børsting, Mette Winther; Qvist, K.B.; Brockmann, E.; Vindeløv, J.; Pedersen, T.L.; Vogensen, Finn Kvist; Ardö, Ylva Margareta

    2015-01-01

    Lactococcus lactis strains depend on a proteolytic system for growth in milk to release essential AA from casein. The cleavage specificities of the cell envelope proteinase (CEP) can vary between strains and environments and whether the enzyme is released or bound to the cell wall. Thirty-eight Lc....... lactis strains were grouped according to their CEP AA sequences and according to identified peptides after hydrolysis of milk. Finally, AA positions in the substrate binding region were suggested by the use of a new CEP template based on Streptococcus C5a CEP. Aligning the CEP AA sequences of 38 strains...... of Lc. lactis showed that 21 strains, which were previously classified as group d, could be subdivided into 3 groups. Independently, similar subgroupings were found based on comparison of the Lc. lactis CEP AA sequences and based on normalized quantity of identified peptides released from αS1-casein...

  12. Active Build-Model Random Forest Method for Network Traffic Classification

    Directory of Open Access Journals (Sweden)

    Alhamza Munther

    2014-05-01

    Full Text Available Network traffic classification continues to be an interesting subject among numerous networking communities. This method introduces multi-beneficial solutions in different avenues, such as network security, network management, anomaly detection, and quality-of-service. In this paper, we propose a supervised machine learning method that efficiently classifies different types of applications using the Active Build-Model Random Forest (ABRF method. This method constructs a new build model for the original Random Forest (RF method to decrease processing time. This build model includes only the active trees (i.e., trees with high accuracy, whereas the passive trees are excluded from the forest. The passive trees were excluded without any negative effect on classification accuracy. Results show that the ABRF method decreases the processing time by up to 37.5% compared with the original RF method. Our model has an overall accuracy of 98.66% based on the benchmark dataset considered in this paper.
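
    The active-tree idea described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the callable "trees", the 0.7 accuracy threshold, and the function names are assumptions.

    ```python
    import numpy as np

    def prune_to_active_trees(trees, X_val, y_val, acc_threshold=0.7):
        """Keep only the 'active' trees whose individual validation
        accuracy clears the threshold; the rest are 'passive' and dropped."""
        return [t for t in trees
                if np.mean(t(X_val) == y_val) >= acc_threshold]

    def forest_predict(trees, X):
        """Majority vote over the remaining active trees (binary labels)."""
        votes = np.stack([t(X) for t in trees])
        return (votes.mean(axis=0) >= 0.5).astype(int)

    # Toy demo: each 'tree' is a threshold rule on a single feature.
    X_val = np.array([[0.1], [0.4], [0.6], [0.9]])
    y_val = np.array([0, 0, 1, 1])
    trees = [
        lambda X: (X[:, 0] > 0.5).astype(int),    # accurate tree
        lambda X: (X[:, 0] > 0.45).astype(int),   # accurate tree
        lambda X: np.zeros(len(X), dtype=int),    # weak ('passive') tree
    ]
    active = prune_to_active_trees(trees, X_val, y_val)
    pred = forest_predict(active, X_val)
    ```

    Dropping passive trees shrinks prediction cost roughly in proportion to the number of trees removed, which is the plausible source of the reported speed-up.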

  13. Dynamic Block-Based Parameter Estimation for MRF Classification of High-Resolution Images

    OpenAIRE

    Aghighi, Hossein; Trinder, John; Tarabalka, Yuliya; Lim, Samsung

    2014-01-01

    A Markov random field is a graphical model that is commonly used to combine spectral information and spatial context into image classification problems. The contributions of the spatial versus spectral energies are typically defined by using a smoothing parameter, which is often set empirically. We propose a new framework to estimate the smoothing parameter. For this purpose, we introduce the new concepts of dynamic blocks and class label co-occurrence matrices. The estimation is then based o...

  14. A DECISION TREE-BASED CLASSIFICATION APPROACH TO RULE EXTRACTION FOR SECURITY ANALYSIS

    OpenAIRE

    Ren, N.; M. ZARGHAM; Rahimi, S.

    2006-01-01

    Stock selection rules are extensively utilized as the guideline to construct high performance stock portfolios. However, the predictive performance of the rules developed by some economic experts in the past has decreased dramatically for the current stock market. In this paper, C4.5 decision tree classification method was adopted to construct a model for stock prediction based on the fundamental stock data, from which a set of stock selection rules was derived. The experimental results showe...

  15. A Fast Logdet Divergence Based Metric Learning Algorithm for Large Data Sets Classification

    OpenAIRE

    Jiangyuan Mei; Jian Hou; Jicheng Chen; Hamid Reza Karimi

    2014-01-01

    Classification of large data sets is widely used in many industrial applications. It is a challenging task to classify large data sets efficiently, accurately, and robustly, as they typically contain numerous instances with a high-dimensional feature space. To deal with this problem, in this paper we present an online Logdet divergence based metric learning (LDML) model that exploits the power of metric learning. We first generate a Mahalanobis matrix via l...
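
    At the core of such metric-learning models is the Mahalanobis distance parameterized by a learned matrix M. A minimal sketch follows; the diagonal "learned" M below is an invented stand-in for the matrix the online Logdet updates would actually produce.

    ```python
    import numpy as np

    def mahalanobis_sq(x, y, M):
        """Squared Mahalanobis distance d_M(x, y) = (x - y)^T M (x - y).
        M must be positive semi-definite for this to be a valid metric."""
        d = np.asarray(x, float) - np.asarray(y, float)
        return float(d @ M @ d)

    # With M = I the distance reduces to squared Euclidean distance;
    # a learned M can stretch discriminative dimensions (here feature 0).
    M_identity = np.eye(2)
    M_learned = np.diag([9.0, 1.0])   # hypothetical learned weighting

    a, b = [1.0, 0.0], [0.0, 0.0]
    d_euclid = mahalanobis_sq(a, b, M_identity)
    d_metric = mahalanobis_sq(a, b, M_learned)
    ```

    Metric learning then amounts to choosing M so that same-class pairs get small distances and different-class pairs get large ones.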

  16. A Novel Land Cover Classification Map Based on a MODIS Time-Series in Xinjiang, China

    OpenAIRE

    Linlin Lu; Claudia Kuenzer; Huadong Guo; Qingting Li; Tengfei Long; Xinwu Li

    2014-01-01

    Accurate mapping of land cover on a regional scale is useful for climate and environmental modeling. In this study, we present a novel land cover classification product based on spectral and phenological information for the Xinjiang Uygur Autonomous Region (XUAR) in China. The product is derived at a 500 m spatial resolution using an innovative approach employing moderate resolution imaging spectroradiometer (MODIS) surface reflectance and the enhanced vegetation index (EVI) time series. The ...

  17. A New Classification Analysis of Customer Requirement Information Based on Quantitative Standardization for Product Configuration

    OpenAIRE

    Zheng Xiao; Zude Zhou; Buyun Sheng

    2016-01-01

    Traditional methods used for the classification of customer requirement information are typically based on specific indicators, hierarchical structures, and data formats and involve a qualitative analysis in terms of stationary patterns. Because these methods neither consider the scalability of classification results nor do they regard subsequent application to product configuration, their classification becomes an isolated operation. However, the transformation of customer requirement inform...

  18. Style-based classification of Chinese ink and wash paintings

    Science.gov (United States)

    Sheng, Jiachuan; Jiang, Jianmin

    2013-09-01

    Following the fact that a large collection of ink and wash paintings (IWP) is being digitized and made available on the Internet, their automated content description, analysis, and management are attracting attention across research communities. While existing research in relevant areas is primarily focused on image processing approaches, a style-based algorithm is proposed to classify IWPs automatically by their authors. As IWPs do not have colors or even tones, the proposed algorithm applies edge detection to locate the local region and detect painting strokes to enable histogram-based feature extraction and capture of important cues to reflect the styles of different artists. Such features are then applied to drive a number of neural networks in parallel to complete the classification, and an information entropy balanced fusion is proposed to make an integrated decision for the multiple neural network classification results in which the entropy is used as a pointer to combine the global and local features. Evaluations via experiments support that the proposed algorithm achieves good performances, providing excellent potential for computerized analysis and management of IWPs.

  19. ECG-based heartbeat classification for arrhythmia detection: A survey.

    Science.gov (United States)

    Luz, Eduardo José da S; Schwartz, William Robson; Cámara-Chávez, Guillermo; Menotti, David

    2016-04-01

    An electrocardiogram (ECG) measures the electric activity of the heart and has been widely used for detecting heart diseases due to its simplicity and non-invasive nature. By analyzing the electrical signal of each heartbeat, i.e., the combination of action impulse waveforms produced by different specialized cardiac tissues found in the heart, it is possible to detect some of its abnormalities. In the last decades, several works were developed to produce automatic ECG-based heartbeat classification methods. In this work, we survey the current state-of-the-art methods of ECG-based automated abnormalities heartbeat classification by presenting the ECG signal preprocessing, the heartbeat segmentation techniques, the feature description methods and the learning algorithms used. In addition, we describe some of the databases used for evaluation of methods indicated by a well-known standard developed by the Association for the Advancement of Medical Instrumentation (AAMI) and described in ANSI/AAMI EC57:1998/(R)2008 (ANSI/AAMI, 2008). Finally, we discuss limitations and drawbacks of the methods in the literature presenting concluding remarks and future challenges, and also we propose an evaluation process workflow to guide authors in future works. PMID:26775139

  20. Proposed classification of medial maxillary labial frenum based on morphology

    Directory of Open Access Journals (Sweden)

    Ranjana Mohan

    2014-01-01

    Full Text Available Objectives: To propose a new classification of median maxillary labial frenum (MMLF) based on morphology in the permanent dentition, by conducting a cross-sectional survey. Materials and Methods: A unicentric study was conducted on 2,400 adults (1,414 males, 986 females) aged between 18 and 76 years (mean age = 38.62, standard deviation (SD) = 12.53; male mean age = 38.533 years, male SD = 12.498; female mean age = 38.71, female SD = 12.5750) over a period of 6 months at Teerthanker Mahaveer University, Moradabad, Northern India. The frenum morphology was determined by using the direct visual method under natural light and categorized. Results: Diverse frenum morphologies were observed. Several variations found in the study have not been documented in the past literature and were named and classified according to their morphology. Discussion: The MMLF presents a diverse array of morphological variations. Several other undocumented types of frena were observed, and a revised, detailed classification has been proposed based on the cross-sectional survey.

  1. A Cluster Based Approach for Classification of Web Results

    Directory of Open Access Journals (Sweden)

    Apeksha Khabia

    2014-12-01

    Full Text Available Nowadays a significant amount of information on the web is present in the form of text, e.g., reviews, forum postings, blogs, news articles, email messages, and web pages. It becomes difficult to classify documents into predefined categories as the number of documents grows. Clustering is the partitioning of data into clusters, so that the data in each cluster share some common trait, often proximity according to some defined measure. The underlying distribution of a data set can, to some extent, be depicted by the learned clusters under the guidance of the initial data set. Thus, clusters of documents can be employed to train the classifier by using defined features of those clusters. One of the important issues is also to classify text data from the web into different clusters by mining the knowledge. Accordingly, this paper presents a review of most of the document clustering techniques and cluster-based classification techniques used so far. Pre-processing of text datasets and document clustering methods are also explained in brief.
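
    The cluster-then-classify idea reviewed here can be sketched with a toy nearest-centroid variant: cluster the training set, label each cluster, then route test samples to their nearest cluster. The k-means routine, the majority-class labelling, and the data are illustrative assumptions, not any specific surveyed method.

    ```python
    import numpy as np

    def kmeans(X, k, iters=20, seed=0):
        """Plain k-means: returns cluster centres and training labels."""
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        labels = np.zeros(len(X), dtype=int)
        for _ in range(iters):
            labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
            for j in range(k):
                if (labels == j).any():
                    centers[j] = X[labels == j].mean(axis=0)
        return centers, labels

    def fit(X, y, k=2):
        """Cluster the training set, then tag each cluster with its majority
        class -- a toy stand-in for training one classifier per cluster."""
        centers, labels = kmeans(X, k)
        cluster_class = [int(np.bincount(y[labels == j]).argmax())
                         for j in range(k)]
        return centers, cluster_class

    def predict(model, X):
        """Route each sample to its nearest cluster; return that cluster's class."""
        centers, cluster_class = model
        nearest = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        return np.array([cluster_class[j] for j in nearest])

    # Two well-separated 'topics' in a 2-D feature space.
    X = np.array([[0, 0], [0.1, 0], [0, 0.1],
                  [5, 5], [5.1, 5], [5, 5.1]], dtype=float)
    y = np.array([0, 0, 0, 1, 1, 1])
    model = fit(X, y, k=2)
    ```

    In the full scheme each cluster would train its own classifier on a reduced feature set rather than a single majority label, but the routing logic is the same.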

  2. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    Directory of Open Access Journals (Sweden)

    Junwei Fang

    2013-01-01

    Full Text Available Acupuncture is an efficient therapy method that originated in ancient China, and studying it on the basis of ZHENG classification is a systematic way of understanding its complexity. The system perspective contributes to understanding the essence of phenomena, and, with the coming of the systems biology era, broader technology platforms such as omics technologies have been established for the objective study of traditional Chinese medicine (TCM). Omics technologies can dynamically determine molecular components at various levels, which can achieve a systematic understanding of acupuncture by finding out the relationships of the various responding parts. After reviewing the literature on acupuncture studied by omics approaches, the following points were found. Firstly, with the help of omics approaches, acupuncture was found to be able to treat diseases by regulating the neuroendocrine immune (NEI) network, and the change of this network could reflect the global effect of acupuncture. Secondly, the global effect of acupuncture could reflect ZHENG information at certain structure and function levels, which might reveal the mechanism of Meridian and Acupoint Specificity. Furthermore, based on comprehensive ZHENG classification, omics research could help us understand the action characteristics of acupoints and the molecular mechanisms of their synergistic effect.

  3. Target Image Classification through Encryption Algorithm Based on the Biological Features

    OpenAIRE

    Zhiwu Chen; Qing E. Wu; Weidong Yang

    2014-01-01

    In order to effectively perform biological image classification and identification, this paper studies the characteristics of biological features, gives an encryption algorithm, and presents a biological classification algorithm based on the encryption process. Through studying the composition characteristics of the palm, this paper uses the biological classification algorithm to carry out the classification or recognition of palms, and improves the accuracy and efficiency of the existing biological classifica...

  4. Rainfall Prediction using Data-Core Based Fuzzy Min-Max Neural Network for Classification

    OpenAIRE

    Rajendra Palange,; Nishikant Pachpute

    2015-01-01

    This paper proposes a rainfall prediction system based on a classification technique. An advanced and modified neural network called the Data Core Based Fuzzy Min-Max Neural Network (DCFMNN) is used for pattern classification, and this classification method is applied to predict rainfall. The fuzzy min-max neural network (FMNN), which creates hyperboxes for classification and prediction, has a problem of overlapping neurons that is resolved in DCFMNN to give greater accu...

  5. An Assessment of Case Base Reasoning for Short Text Message Classification

    OpenAIRE

    Healy, Matt, (Thesis); Delany, Sarah Jane; Zamolotskikh, Anton

    2004-01-01

    Message classification is a text classification task that has provoked much interest in machine learning. One aspect of message classification that presents a particular challenge is the classification of short text messages. This paper presents an assessment of applying a case based approach that was developed for long text messages (specifically spam filtering) to short text messages. The evaluation involves determining the most appropriate feature types and feature representation for short...

  6. AR-based Method for ECG Classification and Patient Recognition

    Directory of Open Access Journals (Sweden)

    Branislav Vuksanovic

    2013-09-01

    Full Text Available The electrocardiogram (ECG) is the recording of heart activity obtained by measuring the signals from electrical contacts placed on the skin of the patient. By analyzing the ECG, it is possible to detect the rate and consistency of heartbeats and identify possible irregularities in heart operation. This paper describes a set of techniques employed to pre-process ECG signals and extract a set of features, the autoregressive (AR) signal parameters, used to characterise the ECG signal. The extracted parameters are used in this work to accomplish two tasks. Firstly, the AR features belonging to each ECG signal are classified in groups corresponding to three different heart conditions: normal, arrhythmia and ventricular arrhythmia. The obtained classification results indicate accurate, zero-error classification of patients according to their heart condition using the proposed method. The sets of extracted AR coefficients are then extended by adding an additional parameter, the power of the AR modelling error, and the suitability of the developed technique for individual patient identification is investigated. Individual feature sets for each group of detected QRS sections are classified in p clusters, where p represents the number of patients in each group. The developed system has been tested using ECG signals available in the MIT/BIH and Politecnico di Milano VCG/ECG databases. The achieved recognition rates indicate that patient identification using ECG signals could be considered as a possible approach in some applications using the system developed in this work. The pre-processing stages, applied parameter extraction techniques and some intermediate and final classification results are described and presented in this paper.
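
    A minimal version of the AR feature extraction described above can be written with the Yule-Walker equations. The model order, the synthetic test signal, and the function names are illustrative; a real pipeline would first segment and pre-process QRS sections of the ECG.

    ```python
    import numpy as np

    def ar_features(signal, order=4):
        """Estimate AR coefficients by solving the Yule-Walker equations
        with biased autocorrelation estimates; also return the power of
        the AR modelling error (used above as an extra feature)."""
        x = np.asarray(signal, float)
        x = x - x.mean()
        n = len(x)
        r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
        R = np.array([[r[abs(i - j)] for j in range(order)]
                      for i in range(order)])
        a = np.linalg.solve(R, r[1:order + 1])          # AR coefficients
        err_power = r[0] - a @ r[1:order + 1]           # residual power
        return a, err_power

    # Synthetic AR(2) process: x[t] = 0.5 x[t-1] - 0.3 x[t-2] + e[t]
    rng = np.random.default_rng(0)
    e = rng.standard_normal(5000)
    x = np.zeros(5000)
    for t in range(2, 5000):
        x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + e[t]

    a, p = ar_features(x, order=2)   # a should be close to (0.5, -0.3)
    ```

    The recovered coefficient vector (plus the error power) is exactly the kind of compact feature vector that can then be fed to a clustering or classification stage.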

  7. Concept Association and Hierarchical Hamming Clustering Model in Text Classification

    Institute of Scientific and Technical Information of China (English)

    Su Gui-yang; Li Jian-hua; Ma Ying-hua; Li Sheng-hong; Yin Zhong-hang

    2004-01-01

    We propose two models in this paper. The concept association model is put forward to obtain the co-occurrence relationships among keywords in documents, and the hierarchical Hamming clustering model is used to reduce the dimensionality of the category feature vector space, which solves the problem of the extremely high dimensionality of the documents' feature space. The results of the experiments indicate that the concept association model can obtain the co-occurrence relations among keywords in the documents, which effectively improves the recall of the classification system. The hierarchical Hamming clustering model can reduce the dimensionality of the category feature vector efficiently; the size of the reduced vector space is only about 10% of the primary dimensionality.
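
    The co-occurrence statistic behind such a concept association model can be sketched in a few lines; the document representation and function name below are illustrative assumptions, not the paper's implementation.

    ```python
    from collections import Counter
    from itertools import combinations

    def keyword_cooccurrence(docs):
        """Count, for each unordered keyword pair, in how many documents
        the two keywords co-occur."""
        counts = Counter()
        for doc in docs:
            # each unique keyword pair counted once per document
            for pair in combinations(sorted(set(doc)), 2):
                counts[pair] += 1
        return counts

    docs = [
        ["network", "security", "intrusion"],
        ["network", "security"],
        ["network", "routing"],
    ]
    counts = keyword_cooccurrence(docs)
    ```

    Pairs with high counts ("network"/"security" here) indicate associated concepts, which is the signal used to boost recall.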

  8. Utilizing ECG-Based Heartbeat Classification for Hypertrophic Cardiomyopathy Identification.

    Science.gov (United States)

    Rahman, Quazi Abidur; Tereshchenko, Larisa G; Kongkatong, Matthew; Abraham, Theodore; Abraham, M Roselle; Shatkay, Hagit

    2015-07-01

    Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is (potentially fatally) obstructed. A test based on electrocardiograms (ECG) that record the heart electrical activity can help in early detection of HCM patients. This paper presents a cardiovascular-patient classifier we developed to identify HCM patients using standard 10-second, 12-lead ECG signals. Patients are classified as having HCM if the majority of their recorded heartbeats are recognized as characteristic of HCM. Thus, the classifier's underlying task is to recognize individual heartbeats segmented from 12-lead ECG signals as HCM beats, where heartbeats from non-HCM cardiovascular patients are used as controls. We extracted 504 morphological and temporal features—both commonly used and newly-developed ones—from ECG signals for heartbeat classification. To assess classification performance, we trained and tested a random forest classifier and a support vector machine classifier using 5-fold cross validation. We also compared the performance of these two classifiers to that obtained by a logistic regression classifier, and the first two methods performed better than logistic regression. The patient-classification precision of random forests and of support vector machine classifiers is close to 0.85. Recall (sensitivity) and specificity are approximately 0.90. We also conducted feature selection experiments by gradually removing the least informative features; the results show that a relatively small subset of 264 highly informative features can achieve performance measures comparable to those achieved by using the complete set of features. PMID:25915962

  9. Comparison of Cheng's Index-and SSR Marker-based Classification of Asian Cultivated Rice

    Institute of Scientific and Technical Information of China (English)

    WANG Cai-hong; XU Qun; YU Ping; YUAN Xiao-ping; YU Han-yong; WANG Yi-ping; TANG Sheng-xiang

    2013-01-01

    A total of 100 cultivated rice accessions, with a clear isozyme-based classification, were analyzed based on Cheng's index and simple sequence repeat (SSR) markers. The results showed that the isozyme-based classification was in high accordance with that based on Cheng's index and SSR markers. A Mantel test revealed that the Euclidean distance of Cheng's index was significantly correlated with Nei's unbiased genetic distance of SSR markers (r = 0.466, P ≤ 0.01). According to the model-based group and cluster analysis, the Cheng's index- and SSR-based classifications coincided with each other, with a goodness of fit of 82.1% and 84.7% in indica, and 97.4% and 95.1% in japonica, respectively, showing higher accordance than that within subspecies. Therefore, Cheng's index could be used to classify subspecies, while SSR markers could be more efficient for analyzing the subgroups within subspecies.

  10. A Categorical Framework for Model Classification in the Geosciences

    Science.gov (United States)

    Hauhs, Michael; Trancón y Widemann, Baltasar; Lange, Holger

    2016-04-01

    Models have a mixed record of success in the geosciences. In meteorology, model development and implementation has been among the first and most successful examples of triggering computer technology in science. On the other hand, notorious problems such as the 'equifinality issue' in hydrology have led to a rather mixed reputation of models in other areas. The most successful models in geosciences are applications of dynamic systems theory to non-living systems or phenomena. Thus, we start from the hypothesis that the success of model applications relates to the influence of life on the phenomenon under study. We thus focus on the (formal) representation of life in models. The aim is to investigate whether disappointment in model performance is due to system properties such as heterogeneity and historicity of ecosystems, or rather reflects an abstraction and formalisation problem at a fundamental level. As a formal framework for this investigation, we use category theory as applied in computer science to specify behaviour at an interface. Its methods have been developed for translating and comparing formal structures among different application areas and seem highly suited for a classification of the current "model zoo" in the geosciences. The approach is rather abstract, with a high degree of generality but a low level of expressibility. Here, category theory will be employed to check the consistency of assumptions about life in different models. It will be shown that it is sufficient to distinguish just four logical cases to check for consistency of model content. All four cases can be formalised as variants of coalgebra-algebra homomorphisms. It can be demonstrated that transitions between the four variants affect the relevant observations (time series or spatial maps), the formalisms used (equations, decision trees) and the test criteria of success (prediction, classification) of the resulting model types. We will present examples from hydrology and ecology in

  11. Likelihood ratio model for classification of forensic evidence

    International Nuclear Information System (INIS)

    One of the problems in the analysis of forensic evidence such as glass fragments is the determination of their use-type category, e.g. does a glass fragment originate from an unknown window or container? Very small glass fragments arise during various accidents and criminal offences, and can be carried on the clothes, shoes and hair of participants. It is therefore necessary to obtain information on their physicochemical composition in order to solve the classification problem. Scanning Electron Microscopy coupled with an Energy Dispersive X-ray Spectrometer and the Glass Refractive Index Measurement method are routinely used in many forensic institutes for the investigation of glass. A natural form of glass evidence evaluation for forensic purposes is the likelihood ratio, LR = p(E|H1)/p(E|H2). The main aim of this paper was to study the performance of LR models for glass object classification which consider one or two sources of data variability, i.e. between-glass-object variability and/or within-glass-object variability. Within the proposed model a multivariate kernel density approach was adopted for modelling the between-object distribution and a multivariate normal distribution was adopted for modelling within-object distributions. Moreover, a graphical method of estimating the dependence structure was employed to reduce the highly multivariate problem to several lower-dimensional problems. The performed analysis showed that the best likelihood model was the one which includes information about both between- and within-object variability, with variables derived from elemental compositions measured by SEM-EDX, and refractive index values determined before (RIb) and after (RIa) the annealing process, in the form of dRI = log10|RIa - RIb|. This model gave better results than the model with only between-object variability considered. In addition, when dRI and variables derived from elemental compositions were used, this model outperformed two other
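
    The LR form is easy to illustrate in one dimension, with normal densities standing in for the paper's multivariate kernel/normal models; the refractive-index means and standard deviations below are invented for the example.

    ```python
    import math

    def normal_pdf(x, mu, sigma):
        """Density of N(mu, sigma^2) at x."""
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

    def likelihood_ratio(e, h1, h2):
        """LR = p(E|H1) / p(E|H2); values > 1 support H1 (e.g. 'window'),
        values < 1 support H2 (e.g. 'container')."""
        return normal_pdf(e, *h1) / normal_pdf(e, *h2)

    window = (1.5180, 0.0004)     # hypothetical (mean RI, sd) under H1
    container = (1.5220, 0.0004)  # hypothetical (mean RI, sd) under H2
    lr = likelihood_ratio(1.5182, window, container)   # close to window mean
    ```

    The paper's models replace these one-dimensional densities with multivariate estimates that also account for within-object and between-object variability, but the decision statistic keeps this ratio form.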

  12. Optimal Non-Invasive Fault Classification Model for Packaged Ceramic Tile Quality Monitoring Using MMW Imaging

    Science.gov (United States)

    Agarwal, Smriti; Singh, Dharmendra

    2016-04-01

    Millimeter wave (MMW) frequency has emerged as an efficient tool for different stand-off imaging applications. In this paper, we have dealt with a novel MMW imaging application, i.e., non-invasive packaged goods quality estimation for industrial quality monitoring applications. An active MMW imaging radar operating at 60 GHz has been ingeniously designed for concealed fault estimation. Ceramic tiles covered with commonly used packaging cardboard were used as concealed targets for undercover fault classification. A comparison of computer vision-based state-of-the-art feature extraction techniques, viz, discrete Fourier transform (DFT), wavelet transform (WT), principal component analysis (PCA), gray level co-occurrence texture (GLCM), and histogram of oriented gradient (HOG) has been done with respect to their efficient and differentiable feature vector generation capability for undercover target fault classification. An extensive number of experiments were performed with different ceramic tile fault configurations, viz., vertical crack, horizontal crack, random crack, diagonal crack along with the non-faulty tiles. Further, an independent algorithm validation was done demonstrating classification accuracy: 80, 86.67, 73.33, and 93.33 % for DFT, WT, PCA, GLCM, and HOG feature-based artificial neural network (ANN) classifier models, respectively. Classification results show good capability for HOG feature extraction technique towards non-destructive quality inspection with appreciably low false alarm as compared to other techniques. Thereby, a robust and optimal image feature-based neural network classification model has been proposed for non-invasive, automatic fault monitoring for a financially and commercially competent industrial growth.

  13. Hyperspectral image classification based on spatial and spectral features and sparse representation

    Institute of Scientific and Technical Information of China (English)

    Yang Jing-Hui; Wang Li-Guo; Qian Jin-Xi

    2014-01-01

    To address the low classification accuracy and poor utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method, which is based on Gabor spatial texture features, nonparametric weighted spectral features, and the sparse representation classification method (Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed GNWSF–SRC method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods and achieves better classification performance.
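
    The reconstruction-error decision rule of SRC can be sketched with a least-squares stand-in for the l1 sparse-coding step; the tiny class dictionaries below are invented for illustration.

    ```python
    import numpy as np

    def src_predict(x, class_dicts):
        """Represent x with each class's dictionary (columns = training
        atoms) and return the class index with the smallest residual."""
        residuals = []
        for D in class_dicts:
            coef, *_ = np.linalg.lstsq(D, x, rcond=None)
            residuals.append(np.linalg.norm(x - D @ coef))
        return int(np.argmin(residuals))

    # Two classes whose training samples span different directions.
    D0 = np.array([[1.0], [0.0], [0.0]])   # class 0 atoms
    D1 = np.array([[0.0], [1.0], [0.0]])   # class 1 atoms
    c0 = src_predict(np.array([1.0, 0.1, 0.0]), [D0, D1])
    c1 = src_predict(np.array([0.05, 1.0, 0.0]), [D0, D1])
    ```

    In the full method, x would be a stacked Gabor/NWSF feature vector and the coding step would enforce sparsity; the class decision by minimum reconstruction error is the same.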

  14. Automatic earthquake detection and classification with continuous hidden Markov models: a possible tool for monitoring Las Canadas caldera in Tenerife

    International Nuclear Information System (INIS)

    A possible interaction of (volcano-) tectonic earthquakes with the continuous seismic noise recorded in the volcanic island of Tenerife was recently suggested, but existing catalogues seem to be far from being self consistent, calling for the development of automatic detection and classification algorithms. In this work we propose the adoption of a methodology based on Hidden Markov Models (HMMs), widely used already in other fields, such as speech classification.
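
    The HMM scoring step such a detector relies on can be sketched with a discrete-observation forward pass. The two toy models ("noise" vs "earthquake") and all their parameters are invented for illustration; real systems train on feature sequences extracted from continuous seismic records.

    ```python
    import numpy as np

    def forward_loglik(obs, pi, A, B):
        """Log-likelihood of a discrete observation sequence under an HMM
        (pi: initial state probs, A: transitions, B: emissions), computed
        with the scaled forward algorithm for numerical stability."""
        alpha = pi * B[:, obs[0]]
        s = alpha.sum()
        loglik = np.log(s)
        alpha = alpha / s
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
            s = alpha.sum()
            loglik += np.log(s)
            alpha = alpha / s
        return loglik

    def classify(obs, models):
        """Assign the sequence to the model with the highest likelihood."""
        return max(models, key=lambda name: forward_loglik(obs, *models[name]))

    pi = np.array([0.5, 0.5])
    A = np.array([[0.9, 0.1], [0.1, 0.9]])
    models = {
        "noise":      (pi, A, np.array([[0.9, 0.1], [0.8, 0.2]])),
        "earthquake": (pi, A, np.array([[0.1, 0.9], [0.2, 0.8]])),
    }
    label = classify([1, 1, 1, 1, 0], models)
    ```

    Detection and classification then reduce to scoring sliding windows of the continuous record against each trained event model.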

  15. Classification of EMG Signal Based on Human Percentile using SOM

    Directory of Open Access Journals (Sweden)

    M.H. Jali

    2014-07-01

    Full Text Available Electromyography (EMG) is a biosignal that is formed by physiological variations in the state of muscle fibre membranes. Pattern recognition is one of the fields in bio-signal processing in which signals are classified into certain desired categories subject to their area of application. This study describes the classification of the EMG signal based on human body percentile using the Self Organizing Map (SOM) technique. Different human percentiles definitively vary in arm circumference size. The variation in arm circumference is due to fatty tissue that lies between active muscle and skin. Generally, the fatty tissue decreases the overall amplitude of the EMG signal. Data collection was conducted randomly with fifteen subjects of various percentiles using a non-invasive technique at the biceps brachii muscle. The signals then go through a filtering process to prepare them for the next stage. Then, five well-known time domain feature extraction methods are applied to the signal before the classification process. The Self Organizing Map (SOM) technique is used as a classifier to discriminate between the human percentiles. Results show that SOM is capable of clustering the EMG signal into the desired human percentile categories by optimizing the neurons of the technique.
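
    A minimal 1-D SOM in the spirit of the classifier above can be sketched as follows. The grid size, learning schedule, and the two synthetic "percentile groups" are illustrative assumptions; the study's inputs would be time-domain EMG features rather than raw 2-D points.

    ```python
    import numpy as np

    def train_som(X, grid=4, iters=300, lr0=0.5, sigma0=1.5, seed=0):
        """Train a 1-D self-organizing map with `grid` units (toy sketch)."""
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((grid, X.shape[1])) * 0.1 + X.mean(0)
        for t in range(iters):
            x = X[rng.integers(len(X))]
            bmu = np.argmin(((W - x) ** 2).sum(1))       # best-matching unit
            lr = lr0 * (1 - t / iters)                    # decaying rate
            sigma = max(sigma0 * (1 - t / iters), 0.5)    # decaying radius
            d = np.abs(np.arange(grid) - bmu)             # grid distance
            h = np.exp(-(d ** 2) / (2 * sigma ** 2))      # neighbourhood
            W += lr * h[:, None] * (x - W)
        return W

    def map_to_unit(W, X):
        """Best-matching unit index for each sample."""
        return np.argmin(((X[:, None] - W) ** 2).sum(-1), axis=1)

    # Two synthetic feature groups standing in for two percentile classes.
    rng = np.random.default_rng(1)
    XA = rng.normal([0.0, 0.0], 0.1, size=(10, 2))
    XB = rng.normal([10.0, 10.0], 0.1, size=(10, 2))
    W = train_som(np.vstack([XA, XB]))
    ua, ub = map_to_unit(W, XA), map_to_unit(W, XB)
    ```

    After training, samples from the two groups should land on different map units, which is the clustering behaviour the study exploits.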

  16. Computational hepatocellular carcinoma tumor grading based on cell nuclei classification.

    Science.gov (United States)

    Atupelage, Chamidu; Nagahashi, Hiroshi; Kimura, Fumikazu; Yamaguchi, Masahiro; Tokiya, Abe; Hashiguchi, Akinori; Sakamoto, Michiie

    2014-10-01

    Hepatocellular carcinoma (HCC) is the most common histological type of primary liver cancer. HCC is graded according to the malignancy of the tissues. It is important to diagnose low-grade HCC tumors because these tissues have good prognosis. Image interpretation-based computer-aided diagnosis (CAD) systems have been developed to automate the HCC grading process. Generally, the HCC grade is determined by the characteristics of liver cell nuclei. Therefore, it is preferable that CAD systems utilize only liver cell nuclei for HCC grading. This paper proposes an automated HCC diagnosing method. In particular, it defines a pipeline-path that excludes nonliver cell nuclei in two consecutive pipeline modules and utilizes the liver cell nuclear features for HCC grading. The significance of excluding the nonliver cell nuclei for HCC grading is experimentally evaluated. Four categories of liver cell nuclear features were utilized for classifying the HCC tumors. Results indicated that nuclear texture is the dominant feature for HCC grading and others contribute to increase the classification accuracy. The proposed method was employed to classify a set of regions of interest selected from HCC whole slide images into five classes and resulted in a 95.97% correct classification rate. PMID:26158066

  17. Texture-Based Automated Lithological Classification Using Aeromagnetic Anomaly Images

    Science.gov (United States)

    Shankar, Vivek

    2009-01-01

    This report consists of a thesis submitted to the faculty of the Department of Electrical and Computer Engineering, in partial fulfillment of the requirements for the degree of Master of Science, Graduate College, The University of Arizona, 2004. Aeromagnetic anomaly images are geophysical prospecting tools frequently used in the exploration of metalliferous minerals and hydrocarbons. The amplitude and texture content of these images provide a wealth of information to geophysicists who attempt to delineate the nature of the Earth's upper crust. These images prove to be extremely useful in remote areas and locations where the minerals of interest are concealed by basin fill. Typically, geophysicists compile a suite of aeromagnetic anomaly images, derived from amplitude and texture measurement operations, in order to obtain a qualitative interpretation of the lithological (rock) structure. Texture measures have proven to be especially capable of capturing the magnetic anomaly signature of unique lithological units. We performed a quantitative study to explore the possibility of using texture measures as input to a machine vision system in order to achieve automated classification of lithological units. This work demonstrated a significant improvement in classification accuracy over random guessing based on a priori probabilities. Additionally, a quantitative comparison between the performances of five classes of texture measures in their ability to discriminate lithological units was achieved.

  18. Classification of chronic obstructive pulmonary disease based on chest radiography

    Directory of Open Access Journals (Sweden)

    Leilane Marcos

    2013-12-01

    Full Text Available Objective Quantitative analysis of chest radiographs of patients with and without chronic obstructive pulmonary disease (COPD), determining if the data obtained from such radiographic images could classify such individuals according to the presence or absence of disease. Materials and Methods For such a purpose, three groups of chest radiographic images were utilized, namely: group 1, including 25 individuals with COPD; group 2, including 27 individuals without COPD; and group 3 (utilized for the reclassification/validation of the analysis), including 15 individuals with COPD. The COPD classification was based on spirometry. The variables normalized by retrosternal height were the following: pulmonary width (LARGP); levels of right (ALBDIR) and left (ALBESQ) diaphragmatic eventration; costophrenic angle (ANGCF); and right (DISDIR) and left (DISESQ) intercostal distances. Results As the radiographic images of patients with and without COPD were compared, statistically significant differences were observed between the two groups on the variables related to the diaphragm. In the COPD reclassification the following variables presented the highest indices of correct classification: ANGCF (80%), ALBDIR (73.3%), ALBESQ (86.7%). Conclusion The radiographic assessment of the chest demonstrated that the variables related to the diaphragm allow a better differentiation between individuals with and without COPD.

  19. Classification of knee arthropathy with accelerometer-based vibroarthrography.

    Science.gov (United States)

    Moreira, Dinis; Silva, Joana; Correia, Miguel V; Massada, Marta

    2016-01-01

    One of the most common knee joint disorders is osteoarthritis, which results from the progressive degeneration of cartilage and subchondral bone over time and essentially affects elderly adults. Current evaluation techniques are either complex, expensive, or invasive, or simply fail to detect the small and progressive changes that occur within the knee. Vibroarthrography appeared as a new solution: the mechanical vibratory signals arising from the knee are recorded using only an accelerometer and subsequently analyzed, enabling differentiation between a healthy and an arthritic joint. In this study, a vibration-based classification system was created using a dataset with 92 healthy and 120 arthritic segments of knee joint signals collected from 19 healthy and 20 arthritic volunteers, evaluated with k-nearest neighbors and support vector machine classifiers. The best classification was obtained using the k-nearest neighbors classifier with only 6 time-frequency features, with an overall accuracy of 89.8% and a precision, recall and f-measure of 88.3%, 92.4% and 90.1%, respectively. Preliminary results showed that vibroarthrography can be a promising, non-invasive and low-cost tool that could be used for screening purposes. Despite these encouraging results, several upgrades to the data collection process and analysis can be further implemented. PMID:27225550
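
    A minimal sketch of the winning classifier type, k-nearest neighbours with leave-one-out evaluation and the metrics quoted above. The real vibroarthrographic features are not public, so two synthetic Gaussian clusters stand in for the healthy and arthritic feature vectors.

```python
import numpy as np

# Synthetic stand-ins for time-frequency features (2-D here, 6-D in the study)
# of healthy vs. arthritic signal segments.
rng = np.random.default_rng(1)
healthy = rng.normal(0.0, 0.5, size=(40, 2))
arthritic = rng.normal(2.0, 0.5, size=(40, 2))
X = np.vstack([healthy, arthritic])
y = np.array([0] * 40 + [1] * 40)

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training samples."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()

# Leave-one-out evaluation, then the metrics reported in the abstract.
pred = np.array([knn_predict(np.delete(X, i, axis=0), np.delete(y, i), X[i])
                 for i in range(len(X))])
tp = np.sum((pred == 1) & (y == 1))
fp = np.sum((pred == 1) & (y == 0))
fn = np.sum((pred == 0) & (y == 1))
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)
```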

  20. Product Image Classification Based on Fusion Features

    Institute of Scientific and Technical Information of China (English)

    YANG Xiao-hui; LIU Jing-jing; YANG Li-jun

    2015-01-01

    Two key challenges raised by a product image classification system are classification precision and classification time. In some categories, the classification precision of the latest techniques in product image classification systems is still low. In this paper, we propose a local texture descriptor termed the fan refined local binary pattern, which captures more detailed information by integrating the spatial distribution into the local binary pattern feature. We compare our approach with different methods on a subset of product images from Amazon/eBay and parts of PI100, and experimental results demonstrate that our proposed approach is superior to the existing methods. The highest classification precision is increased by 21% and the average classification time is reduced by 2/3.
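
    The plain 8-neighbour local binary pattern that the proposed "fan refined" descriptor builds on can be computed as below; the refinement itself is not reproduced here, and the tiny test images are illustrative.

```python
import numpy as np

def lbp_codes(img):
    """8-bit LBP code for each interior pixel of a 2-D grayscale image."""
    h, w = img.shape
    c = img[1:-1, 1:-1]                      # centre pixels
    # Neighbours clockwise from top-left; each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(c.shape, dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (n >= c).astype(int) << bit  # set bit where neighbour >= centre
    return codes

bright_centre = np.array([[0, 0, 0],
                          [0, 5, 0],
                          [0, 0, 0]])
uniform_patch = np.full((3, 3), 7)
code_bright = lbp_codes(bright_centre)[0, 0]   # no neighbour >= centre
code_uniform = lbp_codes(uniform_patch)[0, 0]  # every neighbour >= centre
```

    A histogram of these codes over an image region is the usual LBP texture feature fed to a classifier.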

  1. Variable selection in model-based discriminant analysis

    OpenAIRE

    Maugis, Cathy; Celeux, Gilles; Martin-Magniette, Marie-Laure

    2010-01-01

    A general methodology for selecting predictors for Gaussian generative classification models is presented. The problem is regarded as a model selection problem. Three different roles for each possible predictor are considered: a variable can be a relevant classification predictor or not, and the irrelevant classification variables can be linearly dependent on a part of the relevant predictors or independent variables. This variable selection model was inspired by the model-based clustering mo...

  2. A Method of Soil Salinization Information Extraction with SVM Classification Based on ICA and Texture Features

    Institute of Scientific and Technical Information of China (English)

    ZHANG Fei; TASHPOLAT Tiyip; KUNG Hsiang-te; DING Jian-li; MAMAT.Sawut; VERNER Johnson; HAN Gui-hong; GUI Dong-wei

    2011-01-01

    Salt-affected soil classification using remotely sensed images is one of the most common applications in remote sensing, and many algorithms have been developed and applied for this purpose in the literature. This study takes the Delta Oasis of the Weigan and Kuqa Rivers as a study area and discusses the prediction of soil salinization from ETM+ Landsat data. It reports a Support Vector Machine (SVM) classification method based on Independent Component Analysis (ICA) and texture features. The paper introduces the fundamental theory of the SVM algorithm and ICA, and then incorporates ICA and texture features. The classification result is compared qualitatively and quantitatively with ICA-SVM classification, single-data-source SVM classification, maximum likelihood classification (MLC) and neural network classification. The results show that this method can effectively solve the problems of low accuracy and fragmented classification results in single-data-source classification, and that it scales well to higher-dimensional input. The overall accuracy is 98.64%, which is 10.2% higher than maximum likelihood classification and 12.94% higher than neural network classification, thus achieving good effectiveness. Therefore, the classification method based on SVM incorporating ICA and texture features can be adapted to RS image classification and monitoring of soil salinization.

  3. Study on a pattern classification method of soil quality based on simplified learning sample dataset

    Science.gov (United States)

    Zhang, Jiahua; Liu, S.; Hu, Y.; Tian, Y.

    2011-01-01

    Based on the massive soil information in current soil quality grade evaluation, this paper constructs an intelligent classification approach for soil quality grade that relies on classical sampling techniques and a disordered (nominal) multiclass Logistic regression model. As a case study, the learning sample capacity was determined under a given confidence level and estimation accuracy, and the c-means algorithm was used to automatically extract a simplified learning sample dataset from the cultivated soil quality grade evaluation database for the study area, Longchuan County in Guangdong Province. A disordered Logistic classifier model was then built and the calculation and analysis steps of intelligent soil quality grade classification were given. The results indicated that soil quality grade can be effectively learned and predicted from the extracted simplified dataset through this method, which changes the traditional method of soil quality grade evaluation. © 2011 IEEE.
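
    A nominal multiclass logistic model of the kind referenced can be fitted by gradient descent on the softmax cross-entropy. The three synthetic "soil grades" below are illustrative stand-ins for the evaluation database, not the Longchuan data.

```python
import numpy as np

# Three synthetic soil quality grades, each a Gaussian blob of 2-D indicator
# features (illustrative only).
rng = np.random.default_rng(2)
n, k = 60, 3
X = np.vstack([rng.normal(g, 0.4, size=(n, 2)) for g in range(k)])
y = np.repeat(np.arange(k), n)
Y = np.eye(k)[y]                          # one-hot targets

# Multinomial (softmax) logistic regression fitted by batch gradient descent.
W = np.zeros((2, k))
b = np.zeros(k)
for _ in range(500):
    logits = X @ W + b
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)     # class probabilities
    grad = P - Y                          # cross-entropy gradient w.r.t. logits
    W -= 0.1 * X.T @ grad / len(X)
    b -= 0.1 * grad.mean(axis=0)

accuracy = np.mean((X @ W + b).argmax(axis=1) == y)
```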

  4. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    Full Text Available PURPOSE: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. MATERIALS AND METHODS: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. RESULTS: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. CONCLUSIONS: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense masses.
Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  5. Radiological classification of renal angiomyolipomas based on 127 tumors

    Energy Technology Data Exchange (ETDEWEB)

    Prando, Adilson [Hospital Vera Cruz, Campinas, SP (Brazil). Dept. de Radiologia]. E-mail: aprando@mpc.com.br

    2003-05-15

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials And Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense masses. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with

  6. Radiological classification of renal angiomyolipomas based on 127 tumors

    International Nuclear Information System (INIS)

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials And Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense masses. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with

  7. Hierarchical Markov random-field modeling for texture classification in chest radiographs

    Science.gov (United States)

    Vargas-Voracek, Rene; Floyd, Carey E., Jr.; Nolte, Loren W.; McAdams, Page

    1996-04-01

    A hierarchical Markov random field (MRF) modeling approach is presented for the classification of textures in selected regions of interest (ROIs) of chest radiographs. The procedure integrates possible texture classes and their spatial definition with other components present in an image such as noise and background trend. Classification is performed as a maximum a-posteriori (MAP) estimation of texture class and involves an iterative Gibbs sampling technique. Two cases are studied: classification of lung parenchyma versus bone and classification of normal lung parenchyma versus miliary tuberculosis (MTB). Accurate classification was obtained for all examined cases, showing the potential of the proposed modeling approach for texture analysis of radiographic images.
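
    A toy version of MAP labelling under an MRF prior can be sketched with iterated conditional modes (ICM), substituted here for the paper's Gibbs sampler to keep the example short; the two "texture classes" are reduced to classes that differ only in mean intensity, and all data are synthetic.

```python
import numpy as np

# Ground truth: two classes split left/right; the observation adds noise.
rng = np.random.default_rng(7)
true = np.zeros((20, 20), dtype=int)
true[:, 10:] = 1
img = rng.normal(true.astype(float), 0.6)

means = np.array([0.0, 1.0])        # class-conditional means
beta = 1.5                          # strength of the Potts smoothness prior
labels = (img > 0.5).astype(int)    # noisy initial labelling

# ICM: repeatedly set each pixel to the label minimising data + smoothness cost.
for _ in range(5):
    for i in range(20):
        for j in range(20):
            costs = []
            for c in (0, 1):
                data_cost = (img[i, j] - means[c]) ** 2 / (2 * 0.6 ** 2)
                nb = [labels[i2, j2]
                      for i2, j2 in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                      if 0 <= i2 < 20 and 0 <= j2 < 20]
                smooth = beta * sum(c != l for l in nb)  # disagreement penalty
                costs.append(data_cost + smooth)
            labels[i, j] = int(np.argmin(costs))

accuracy = (labels == true).mean()
```

    The smoothness term cleans up isolated misclassified pixels that the purely per-pixel threshold leaves behind.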

  8. Finite mixture models and model-based clustering

    Directory of Open Access Journals (Sweden)

    Volodymyr Melnykov

    2010-01-01

    Full Text Available Finite mixture models have a long history in statistics, having been used to model population heterogeneity, generalize distributional assumptions, and lately, to provide a convenient yet formal framework for clustering and classification. This paper provides a detailed review of mixture models and model-based clustering. Recent trends as well as open problems in the area are also discussed.
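
    The computational core of such model-based clustering is the EM algorithm for a finite Gaussian mixture. A minimal univariate two-component version on synthetic data:

```python
import numpy as np

# Synthetic 1-D data drawn from two Gaussian components.
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# EM for a two-component mixture: weights pi, means mu, std devs sigma.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
for _ in range(50):
    # E-step: responsibility of each component for each point.
    dens = (pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
            / (sigma * np.sqrt(2 * np.pi)))
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted maximum-likelihood updates.
    nk = r.sum(axis=0)
    pi = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

means = np.sort(mu)
```

    Assigning each point to the component with the highest responsibility turns the fitted mixture into a clustering, which is exactly the model-based clustering view.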

  9. A kernel-based multi-feature image representation for histopathology image classification

    International Nuclear Information System (INIS)

    This paper presents a novel strategy for building a high-dimensional feature space to represent histopathology image contents. Histogram features, related to colors, textures and edges, are combined together in a unique image representation space using kernel functions. This feature space is further enhanced by the application of latent semantic analysis, to model hidden relationships among visual patterns. All that information is included in the new image representation space. Then, support vector machine classifiers are used to assign semantic labels to images. Processing and classification algorithms operate on top of kernel functions, so that the structure of the feature space is completely controlled using similarity measures and a dual representation. The proposed approach has shown successful performance in a classification task using a dataset with 1,502 real histopathology images in 18 different classes. The results show that our approach for histological image classification obtains an improved average performance of 20.6% when compared to a conventional classification approach based on SVM directly applied to the original kernel.
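
    The multi-feature combination idea can be illustrated with histogram-intersection kernels: one kernel matrix per feature type, summed into a joint kernel that any kernel classifier (e.g. an SVM with a precomputed kernel) can consume. The feature names and sizes below are illustrative, not the paper's.

```python
import numpy as np

# Histogram-intersection kernel: K[i, j] = sum_b min(H[i, b], H[j, b]).
def intersection_kernel(H):
    return np.minimum(H[:, None, :], H[None, :, :]).sum(axis=2)

# Illustrative per-image histograms for two feature types.
rng = np.random.default_rng(4)
color_hist = rng.dirichlet(np.ones(8), size=6)  # 6 images, 8-bin colour histograms
edge_hist = rng.dirichlet(np.ones(4), size=6)   # 6 images, 4-bin edge histograms

# Summing per-feature kernels fuses the feature types in one representation.
K = intersection_kernel(color_hist) + intersection_kernel(edge_hist)
symmetric = bool(np.allclose(K, K.T))
eigvals = np.linalg.eigvalsh(K)  # a valid kernel is symmetric PSD
```

    Since a sum of positive semi-definite kernels is again positive semi-definite, the combined matrix remains a valid kernel for the dual SVM formulation.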

  10. Toward the classification of the realistic free fermionic models

    International Nuclear Information System (INIS)

    The realistic free fermionic models have had remarkable success in providing plausible explanations for various properties of the Standard Model which include the natural appearance of three generations, the explanation of the heavy top quark mass and the qualitative structure of the fermion mass spectrum in general, the stability of the proton and more. These intriguing achievements make evident the need to understand the general space of these models. While the number of possibilities is large, general patterns can be extracted. In this paper the author presents a detailed discussion on the construction of the realistic free fermionic models with the aim of providing some insight into the basic structures and building blocks that enter the construction. The role of free phases in the determination of the phenomenology of the models is discussed in detail. The author discusses the connection between the free phases and mirror symmetry in (2,2) models and the corresponding symmetries in the case of (2,0) models. The importance of the free phases in determining the effective low energy phenomenology is illustrated in several examples. The classification of the models in terms of boundary condition selection rules, real world-sheet fermion pairings, exotic matter states and the hidden sector is discussed

  11. Classification and thermal history of petroleum based on light hydrocarbons

    Science.gov (United States)

    Thompson, K. F. M.

    1983-02-01

    Classifications of oils and kerogens are described. Two indices are employed, termed the Heptane and Isoheptane Values, based on analyses of gasoline-range hydrocarbons. The indices assess degree of paraffinicity and allow the definition of four types of oil: normal, mature, supermature, and biodegraded. The values of these indices measured in sediment extracts are a function of maximum attained temperature and of kerogen type. Aliphatic and aromatic kerogens are definable. Only the extracts of sediments bearing aliphatic kerogens having a specific thermal history are identical to the normal oils, which form the largest group (41%) in the sample set. This group was evidently generated at subsurface temperatures on the order of 138°-149°C (280°-300°F), defined under specific conditions of burial history. It is suggested that all other petroleums are transformation products of normal oils.

  12. MICROWAVE BASED CLASSIFICATION OF MATERIAL USING NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    Anil H. Soni

    2011-07-01

    Full Text Available Microwave radar has emerged as a useful tool in many remote sensing applications, including material classification, target detection and shape extraction. In this paper, we present a method to classify materials based on their dielectric characteristics. Microwave radar in the X-band range is used for scanning targets made of various materials, such as acrylic, metal and wood, in free space. Depending on its electromagnetic properties, the reflections from each target are measured and a radar image is obtained. Various features, such as energy, entropy, normalized sum of image intensity and standard deviation, are then extracted and fed to a feedforward multilayer perceptron classifier, which determines whether the material is dielectric or non-dielectric (metallic). Results show good performance.

  13. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCD in celestial spectrum imagery and the implementation of many large sky survey programs (e. g., Sloan Digital Sky Survey (SDSS), Two-degree-Field Galaxy Redshift Survey (2dF), Spectroscopic Survey Telescope (SST), Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and Large Synoptic Survey Telescope (LSST) program, etc.), celestial observational data are coming into the world like torrential rain. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Therefore, it is a typical problem to recognize these two types of spectra in automatic spectra classification. Furthermore, the utilized method, nearest neighbor, is one of the most typical, classic, mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. For applicability in practice, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the superiority of NN is that this method does not need to be trained, which is useful in incremental learning and parallel computation in mass spectral data processing. In conclusion, the results in this work are helpful for studying galaxy and quasar spectra classification. PMID:22097877

  14. New classification system-based visual outcome in Eales' disease

    Directory of Open Access Journals (Sweden)

    Saxena Sandeep

    2007-01-01

    Full Text Available Purpose: A retrospective tertiary care center-based study was undertaken to evaluate the visual outcome in Eales' disease, based on a new classification system, for the first time. Materials and Methods: One hundred and fifty-nine consecutive cases of Eales' disease were included. All the eyes were staged according to the new classification: Stage 1: periphlebitis of small (1a) and large (1b) caliber vessels with superficial retinal hemorrhages; Stage 2a: capillary non-perfusion, 2b: neovascularization elsewhere/of the disc; Stage 3a: fibrovascular proliferation, 3b: vitreous hemorrhage; Stage 4a: traction/combined rhegmatogenous retinal detachment and 4b: rubeosis iridis, neovascular glaucoma, complicated cataract and optic atrophy. Visual acuity was graded as: Grade I: 20/20 or better; Grade II: 20/30 to 20/40; Grade III: 20/60 to 20/120 and Grade IV: 20/200 or worse. All the cases were managed by medical therapy, photocoagulation and/or vitreoretinal surgery. Visual acuity was converted into a decimal scale, denoting 20/20 = 1 and 20/800 = 0.01. Paired t-test / Wilcoxon signed-rank tests were used for statistical analysis. Results: Vitreous hemorrhage was the commonest presenting feature (49.32%). Cases with Stages 1 to 3, Stage 4a and Stage 4b achieved final visual acuities ranging from 20/15 to 20/40, 20/80 to 20/400 and 20/200 to 20/400, respectively. Statistically significant improvement in visual acuity was observed in all stages of the disease except Stages 1a and 4b. Conclusion: Significant improvement in visual acuity was observed in the majority of stages of Eales' disease following treatment. This study adds further to the limited available evidence of treatment effects in the literature and may have an effect on patient care and health policy in Eales' disease.

  15. Cancer Data Clustering and Classification Based on EFNN_PCA Method

    OpenAIRE

    J. Saranya; Hemalatha, R.

    2014-01-01

    One challenging area in the study of gene expression data is the classification of expression datasets into correct classes. The distinctive nature of the available gene expression data sets poses the foremost challenge: a large number of extraneous attributes (genes), a challenge arising from the application domain of cancer classification. Although accuracy plays a major role in cancer classification, biological relevance is another key criterion, as any biological data...

  16. BRAIN TUMOR CLASSIFICATION USING NEURAL NETWORK BASED METHODS

    OpenAIRE

    Kalyani A. Bhawar*, Prof. Nitin K. Bhil

    2016-01-01

    MRI (Magnetic Resonance Imaging) brain neoplasm image classification is a difficult task due to the variance and complexity of tumors. This paper presents two neural network techniques for the classification of magnetic resonance human brain images. The proposed neural network technique consists of three stages, namely feature extraction, dimensionality reduction, and classification. In the first stage, we obtained the features related to MRI images using d...

  17. Investigating text message classification using case-based reasoning

    OpenAIRE

    Healy, Matt, (Thesis)

    2007-01-01

    Text classification is the categorization of text into a predefined set of categories. Text classification is becoming increasingly important given the large volume of text stored electronically e.g. email, digital libraries and the World Wide Web (WWW). These documents represent a massive amount of information that can be accessed easily. To gain benefit from using this information requires organisation. One way of organising it automatically is to use text classification. A number of well k...

  18. Variable Star Signature Classification using Slotted Symbolic Markov Modeling

    Science.gov (United States)

    Johnston, Kyle B.; Peter, Adrian M.

    2016-01-01

    With the advent of digital astronomy, new benefits and new challenges have been presented to the modern-day astronomer. No longer can the astronomer rely on manual processing; instead, the profession as a whole has begun to adopt more advanced computational means. Our research focuses on the construction and application of a novel time-domain signature extraction methodology and the development of a supporting supervised pattern classification algorithm for the identification of variable stars. A methodology for the reduction of stellar variable observations (time-domain data) into a novel feature space representation is introduced. The methodology presented will be referred to as Slotted Symbolic Markov Modeling (SSMM) and has a number of advantages which will be demonstrated to be beneficial, specifically for the supervised classification of stellar variables. It will be shown that the methodology outperformed a baseline standard methodology on a standardized set of stellar light curve data. The performance on a set of data derived from the LINEAR dataset will also be shown.
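
    The symbolic-Markov idea can be sketched as follows: a light curve is discretised into amplitude symbols and summarised by its symbol-transition matrix, which then serves as the feature vector. The slotting/resampling of irregular time grids, central to the full SSMM method, is omitted from this toy.

```python
import numpy as np

def transition_matrix(series, n_symbols=4):
    """Row-normalised symbol-transition matrix of an amplitude-quantised series."""
    edges = np.quantile(series, np.linspace(0, 1, n_symbols + 1)[1:-1])
    symbols = np.digitize(series, edges)          # symbols in 0..n_symbols-1
    T = np.zeros((n_symbols, n_symbols))
    for a, b in zip(symbols[:-1], symbols[1:]):
        T[a, b] += 1
    rows = T.sum(axis=1, keepdims=True)
    return np.divide(T, rows, out=np.zeros_like(T), where=rows > 0)

t = np.linspace(0, 4 * np.pi, 200)
sine_features = transition_matrix(np.sin(t)).ravel()      # smooth periodic "variable"
rng = np.random.default_rng(5)
noise_features = transition_matrix(rng.normal(size=200)).ravel()
# A smooth curve concentrates transition mass near the diagonal; noise does not,
# so the two signatures separate easily in feature space.
```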

  19. Genetic Programming for the Generation of Crisp and Fuzzy Rule Bases in Classification and Diagnosis of Medical Data

    DEFF Research Database (Denmark)

    Dounias, George; Tsakonas, Athanasios; Jantzen, Jan;

    2002-01-01

    This paper demonstrates two methodologies for the construction of rule-based systems in medical decision making. The first approach consists of a method combining genetic programming and heuristic hierarchical rule-base construction. The second model is composed by a strongly-typed genetic...... programming system for the generation of fuzzy rule-based systems. Two different medical domains are used to evaluate the models. The first field is the diagnosis of subtypes of Aphasia. Two models for crisp rule-bases are presented. The first one discriminates between four major types and the second attempts...... the classification between all common types. A third model consisted of a GP-generated fuzzy rule-based system is tested on the same domain. The second medical domain is the classification of Pap-Smear Test examinations where a crisp rule-based system is constructed. Results denote the effectiveness of the proposed...

  20. The research of land covers classification based on waveform features correction of full-waveform LiDAR

    Science.gov (United States)

    Zhou, Mei; Liu, Menghua; Zhang, Zheng; Ma, Lian; Zhang, Huijing

    2015-10-01

    In order to solve the problems of insufficient classification types and low classification accuracy with traditional discrete LiDAR, in this paper the waveform features of full-waveform LiDAR were analyzed and corrected for use in land cover classification. Firstly, the waveforms were processed, including waveform preprocessing, waveform decomposition and feature extraction. The extracted features were distance, amplitude, waveform width and the backscattering cross-section. In order to decrease the differences among features of the same land cover type and further improve the effectiveness of the features for land cover classification, this paper comprehensively corrected the extracted features. The features of waveforms obtained in Zhangye were extracted and corrected; the variance of the corrected features was reduced by about 20% compared to the original features. The classification ability of the corrected features was then analyzed using measured waveform data with different characteristics. To further verify whether the corrected features can improve classification accuracy, this paper classified typical land covers based on both the original features and the corrected features. Since the features have independent Gaussian distributions, the Gaussian mixture density model (GMDM) was put forward as the classification model to classify the targets as road, trees, buildings and farmland. The classification results for these four land cover types were obtained according to ground truth information derived from CCD image data of the target region. It showed that classification accuracy can be improved by about 8% when the corrected features are used.
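
    Waveform decomposition of the kind described models each return as a sum of Gaussian echoes and extracts per-echo features (position, amplitude, width). Below is a crude moment-based sketch on a synthetic two-echo waveform, rather than the full least-squares decomposition used in practice.

```python
import numpy as np

def gaussian(t, a, mu, s):
    return a * np.exp(-0.5 * ((t - mu) / s) ** 2)

# Synthetic received waveform: two Gaussian echoes on a 1-ns sample grid.
t = np.arange(0, 100, 1.0)
waveform = gaussian(t, 5.0, 30.0, 3.0) + gaussian(t, 2.0, 70.0, 5.0)

# Crude decomposition: find local maxima, then estimate each echo's position
# and width from first/second moments in a window around the peak.
peaks = [i for i in range(1, len(t) - 1)
         if waveform[i] > waveform[i - 1] and waveform[i] > waveform[i + 1]
         and waveform[i] > 0.1]
echoes = []
for i in peaks:
    w = slice(max(i - 15, 0), min(i + 16, len(t)))
    mass = waveform[w].sum()
    mu = (t[w] * waveform[w]).sum() / mass                       # echo position
    s = np.sqrt(((t[w] - mu) ** 2 * waveform[w]).sum() / mass)   # echo width
    echoes.append((mu, waveform[i], s))                          # (position, amplitude, width)
```

    Position relates to range (distance), amplitude to return strength, and width to surface roughness/slope, which is why these are natural classification features.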

  1. ANALYZING AVIATION SAFETY REPORTS: FROM TOPIC MODELING TO SCALABLE MULTI-LABEL CLASSIFICATION

    Data.gov (United States)

    National Aeronautics and Space Administration — ANALYZING AVIATION SAFETY REPORTS: FROM TOPIC MODELING TO SCALABLE MULTI-LABEL CLASSIFICATION AMRUDIN AGOVIC*, HANHUAI SHAN, AND ARINDAM BANERJEE Abstract. The...

  2. Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    Directory of Open Access Journals (Sweden)

    Serafini Maria

    2003-11-01

    Full Text Available Abstract Background We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). Results With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
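
One elimination step of the entropy-driven idea can be sketched as follows. This is a simplification, not the paper's code: a least-squares fit stands in for the SVM weight vector, and the chunk-size mapping from entropy is illustrative. The point is that a low-entropy weight histogram (many near-zero weights) licenses dropping a large chunk of genes at once.

```python
import numpy as np

def weight_entropy(w, bins=10):
    """Entropy of the histogram of |w|; low entropy means many
    near-zero weights, so many features can be dropped at once."""
    h, _ = np.histogram(np.abs(w), bins=bins)
    p = h / h.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def erfe_step(X, y, keep_frac_low=0.5, keep_frac_high=0.9, bins=10):
    """One E-RFE-style step: rank features by a linear weight vector
    (least-squares stand-in for SVM weights) and keep a fraction of
    them that grows with the entropy of the weight distribution."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    H = weight_entropy(w, bins)
    # Low entropy -> eliminate aggressively; high entropy -> conservatively
    keep_frac = keep_frac_low + (keep_frac_high - keep_frac_low) * H / np.log(bins)
    k = max(1, int(keep_frac * X.shape[1]))
    return np.sort(np.argsort(np.abs(w))[-k:])
```

In the actual method this step is iterated inside the internal K-fold loop with a real SVM.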

  3. Multimodal Classification of Mild Cognitive Impairment Based on Partial Least Squares.

    Science.gov (United States)

    Wang, Pingyue; Chen, Kewei; Yao, Li; Hu, Bin; Wu, Xia; Zhang, Jiacai; Ye, Qing; Guo, Xiaojuan

    2016-08-10

    In recent years, increasing attention has been given to the identification of the conversion of mild cognitive impairment (MCI) to Alzheimer's disease (AD). Brain neuroimaging techniques have been widely used to support the classification or prediction of MCI. The present study combined magnetic resonance imaging (MRI), 18F-fluorodeoxyglucose PET (FDG-PET), and 18F-florbetapir PET (florbetapir-PET) to discriminate MCI converters (MCI-c, individuals with MCI who convert to AD) from MCI non-converters (MCI-nc, individuals with MCI who have not converted to AD in the follow-up period) based on the partial least squares (PLS) method. Two types of PLS models (informed PLS and agnostic PLS) were built based on 64 MCI-c and 65 MCI-nc from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The results showed that the three-modality informed PLS model achieved better classification accuracy of 81.40%, sensitivity of 79.69%, and specificity of 83.08% compared with the single-modality model, and the three-modality agnostic PLS model also achieved better classification compared with the two-modality model. Moreover, combining the three modalities with clinical test score (ADAS-cog), the agnostic PLS model (independent data: florbetapir-PET; dependent data: FDG-PET and MRI) achieved optimal accuracy of 86.05%, sensitivity of 81.25%, and specificity of 90.77%. In addition, the comparison of PLS, support vector machine (SVM), and random forest (RF) showed greater diagnostic power of PLS. These results suggested that our multimodal PLS model has the potential to discriminate MCI-c from the MCI-nc and may therefore be helpful in the early diagnosis of AD. PMID:27567818
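
The accuracy, sensitivity and specificity figures reported above follow directly from the binary confusion counts. A minimal helper (not from the paper; labels assumed 1 = MCI-c, 0 = MCI-nc):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (converter recall) and specificity
    (non-converter recall) from binary labels, with 1 = MCI-c."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```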

  4. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    OpenAIRE

    Huibin Lu; Zhengping Hu; Hongxiao Gao

    2015-01-01

    In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the...

  5. Three-Phase Tournament-Based Method for Better Email Classification

    OpenAIRE

    Sabah Sayed; Samir AbdelRahman; Ibrahim Farag

    2012-01-01

    Email classification performance has attracted much attention in the last decades. This paper proposes a tournament-based method to evolve email classification performance utilizing World Final Cup rules as a solution heuristics. Our proposed classification method passes through three phases: 1) clustering (grouping) email folders (topics or classes) based on their token and field similarities, 2) training binary classifiers on each class pair and 3) applying 2-layer tournament me...

  6. Uncertainty classification method of remote sensing image based on high-dimensional cloud model and RBF neural network

    Institute of Scientific and Technical Information of China (English)

    李刚; 万幼川

    2012-01-01

    Cloud model is an uncertainty conversion model between a qualitative concept described in natural language and its quantitative expression, and the RBF neural network has been widely applied to remote sensing image classification. Considering that the traditional RBF neural network classification technique cannot effectively express the uncertainty inherent in image classification and cannot adaptively determine the hidden-layer neurons, this paper proposes an uncertainty classification technique based on a high-dimensional cloud model and an improved RBF neural network. Firstly, high-dimensional normal cloud models are used to construct the hidden-layer neurons, so that the RBF neural network can fully express the uncertainty existing in image classification. Then, the optimal hidden-layer neurons are determined adaptively through peak-based cloud transform and a high-dimensional cloud algorithm. Finally, the RBF neural network is further optimized through probability-based weight determination and frequency threshold adjustment. Experiments show that the proposed method achieves high classification accuracy, with results largely consistent with human visual interpretation.

  7. Evaluation of soft segment modeling on a context independent phoneme classification system

    International Nuclear Information System (INIS)

    The geometric distribution of state durations is one of the main performance-limiting assumptions in hidden Markov modeling of speech signals. Stochastic segment models in general, and segmental HMMs in particular, partly overcome this deficiency at the cost of more complexity in both the training and recognition phases. In addition to this assumption, the gradual temporal changes of speech statistics have not been modeled in HMMs. In this paper, a new duration modeling approach is presented. The main idea of the model is to consider the effect of adjacent segments on the estimation and evaluation of each acoustic segment's probability density function. This idea not only makes the model robust against segmentation errors, but also models the gradual change from one segment to the next with a minimal set of parameters. The proposed idea is analytically formulated and tested on a TIMIT-based context-independent phoneme classification system. During the test procedure, phoneme classification of different phoneme classes was performed by applying the various proposed recognition algorithms. The system was optimized and the results were compared with a continuous density hidden Markov model (CDHMM) of similar computational complexity. The results show an 8-10% improvement in phoneme recognition rate compared with the standard CDHMM, indicating improved compatibility of the proposed model with the nature of speech. (author)

  8. Research and Application of Human Capital Strategic Classification Tool: Human Capital Classification Matrix Based on Biological Natural Attribute

    Directory of Open Access Journals (Sweden)

    Yong Liu

    2014-12-01

    Full Text Available To study the causes of weak strategic classification management of human capital structure in China, we analyze the increasing difficulty that enterprises worldwide face in human capital management. To provide strategically sound answers, HR managers need the critical information supplied by the right technology and analytical tools. In formal organizations there are different types and levels of human capital, which do not contribute equally to the organization. An important guarantee of the sustained and healthy development of a formal or informal organization is low human capital risk. Resisting this risk depends primarily on the hedging and appreciation forces of human capital value, which in turn largely depend on the strategic performance of senior managers. From the perspective of senior managers, we discuss the value, configuration principles and methods to be followed in the strategic classification of human capital based on the Boston Consulting Group (BCG) matrix, and build a Human Capital Classification (HCC) matrix based on biological natural attributes to effectively realize strategic classification of the human capital structure.

  9. [ECoG classification based on wavelet variance].

    Science.gov (United States)

    Yan, Shiyu; Liu, Chong; Wang, Hong; Zhao, Haibin

    2013-06-01

    For a typical electrocorticogram (ECoG)-based brain-computer interface (BCI) system in which the subject's task is to imagine movements of either the left small finger or the tongue, we proposed a feature extraction algorithm using wavelet variance. Firstly, the definition and significance of wavelet variance were presented and adopted as the feature, based on a discussion of the wavelet transform. Six channels with the most distinctive features were selected from 64 channels for analysis, and the EEG data were decomposed using the db4 wavelet. The variances of the wavelet coefficients containing the Mu and Beta rhythms were taken as features based on the ERD/ERS phenomenon. The features were classified linearly with a cross-validation algorithm. Off-line analysis showed high classification accuracies of 90.24% and 93.77% for the training and test data sets, respectively; wavelet variance is simple and effective and is suitable for feature extraction in BCI research. PMID:23865300
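
The feature itself, the variance of wavelet coefficients at each decomposition level, is easy to sketch. The following uses a hand-rolled Haar DWT for brevity rather than the db4 wavelet of the paper (which would normally come from a wavelet library), so it illustrates the feature construction, not the exact filter:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def wavelet_variance_features(x, levels=3):
    """Variance of the detail coefficients at each level, used as the
    per-channel feature vector (Haar here; db4 in the paper)."""
    feats = []
    a = x
    for _ in range(levels):
        a, d = haar_dwt(a)
        feats.append(np.var(d))
    return feats
```

In the BCI setting this would be computed per selected channel, and the per-level variances concatenated into the classifier input.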

  10. Support vector machine based classification and mapping of atherosclerotic plaques using fluorescence lifetime imaging (Conference Presentation)

    Science.gov (United States)

    Fatakdawala, Hussain; Gorpas, Dimitris S.; Bec, Julien; Ma, Dinglong M.; Yankelevich, Diego R.; Bishop, John W.; Marcu, Laura

    2016-02-01

    The progression of atherosclerosis in coronary vessels involves distinct pathological changes in the vessel wall. These changes manifest in the formation of a variety of plaque sub-types. The ability to detect and distinguish these plaques, especially thin-cap fibroatheromas (TCFA) may be relevant for guiding percutaneous coronary intervention as well as investigating new therapeutics. In this work we demonstrate the ability of fluorescence lifetime imaging (FLIm) derived parameters (lifetime values from sub-bands 390/40 nm, 452/45 nm and 542/50 nm respectively) for generating classification maps for identifying eight different atherosclerotic plaque sub-types in ex vivo human coronary vessels. The classification was performed using a support vector machine based classifier that was built from data gathered from sixteen coronary vessels in a previous study. This classifier was validated in the current study using an independent set of FLIm data acquired from four additional coronary vessels with a new rotational FLIm system. Classification maps were compared to co-registered histological data. Results show that the classification maps allow identification of the eight different plaque sub-types despite the fact that new data was gathered with a different FLIm system. Regions with diffuse intimal thickening (n=10), fibrotic tissue (n=2) and thick-cap fibroatheroma (n=1) were correctly identified on the classification map. The ability to identify different plaque types using FLIm data alone may serve as a powerful clinical and research tool for studying atherosclerosis in animal models as well as in humans.

  11. Hyperspectral remote sensing image classification based on decision level fusion

    Institute of Scientific and Technical Information of China (English)

    Peijun Du; Wei Zhang; Junshi Xia

    2011-01-01

    To apply decision level fusion to hyperspectral remote sensing (HRS) image classification, three decision level fusion strategies are experimented on and compared, namely, linear consensus algorithm, improved evidence theory, and the proposed support vector machine (SVM) combiner. To evaluate the effects of the input features on classification performance, four schemes are used to organize input features for member classifiers. In the experiment, by using the operational modular imaging spectrometer (OMIS) II HRS image, the decision level fusion is shown as an effective way for improving the classification accuracy of the HRS image, and the proposed SVM combiner is especially suitable for decision level fusion. The results also indicate that the optimization of input features can improve the classification performance.
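
Decision-level fusion operates on the labels output by the member classifiers rather than on their input features. As a minimal stand-in for the consensus strategies compared above (the SVM combiner and evidence theory are more elaborate), a majority vote over member predictions looks like this:

```python
from collections import Counter

def majority_fusion(member_predictions):
    """Decision-level fusion by majority vote: each pixel's fused label
    is the most common label among the member classifiers' outputs."""
    fused = []
    for votes in zip(*member_predictions):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused
```

The SVM combiner of the paper would instead feed the member decisions into a trained SVM rather than voting.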

  12. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    Directory of Open Access Journals (Sweden)

    Hongxia Li

    2013-08-01

    Full Text Available With the development of computer science and information technology, libraries are becoming increasingly digital and networked. The digitization process converts books into digital information, whose high-quality preservation and management are achieved through computer technology and text classification techniques, realizing knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward an ICA semantic clustering algorithm, realizing independent component analysis of complex-network text classification. Through ICA clustering on independent components, characteristic word clusters are extracted for text classification, improving the visualization of text retrieval. Finally, we compare a collocation algorithm with the ICA clustering algorithm through text classification and keyword search experiments, and report the clustering degree and accuracy of each algorithm. Simulation analysis shows that the ICA clustering algorithm improves the text classification clustering degree by 1.2% and accuracy by up to 11.1%, improving the efficiency and accuracy of text classification retrieval and providing a theoretical reference for the text retrieval classification of eBooks.

  13. Multiscale modeling for classification of SAR imagery using hybrid EM algorithm and genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    Xianbin Wen; Hua Zhang; Jianguang Zhang; Xu Jiao; Lei Wang

    2009-01-01

    A novel method that hybridizes a genetic algorithm (GA) and the expectation maximization (EM) algorithm for the classification of synthetic aperture radar (SAR) imagery is proposed, based on the finite Gaussian mixture model (GMM) and the multiscale autoregressive (MAR) model. This algorithm improves the global optimality and consistency of the classification performance. Experiments on SAR images show that the proposed algorithm significantly outperforms the standard EM method in classification accuracy.
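
The EM half of the hybrid can be sketched for the simplest case, a two-component 1-D Gaussian mixture. This omits the GA (which in the hybrid method supplies globally better initializations) and the multiscale model; means here are seeded from data quantiles instead:

```python
import numpy as np

def em_gmm_1d(x, iters=50):
    """Plain EM for a two-component 1-D Gaussian mixture.
    The GA seeding of the hybrid method is replaced by quantile init."""
    mu = np.quantile(x, [0.25, 0.75])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities of each component for each sample
        dens = pi / np.sqrt(2 * np.pi * var) * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted parameter updates
        n = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n + 1e-9
        pi = n / len(x)
    return mu, var, pi
```

EM of this kind converges only to a local optimum, which is precisely the weakness the GA hybridization targets.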

  14. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    Full Text Available We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. What is important is that the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.
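
A toy version of the frequency-impact idea can be sketched as follows. This is a simplified reading of frequentiment, not the authors' formula: each word's score is the mean rating of documents containing it minus the mean rating of documents that do not, and a document is scored by averaging its words' lexicon entries.

```python
def build_lexicon(docs, ratings):
    """Unigram lexicon in the spirit of frequentiment: a word's score is
    the mean rating of documents containing it minus the mean rating of
    documents without it (simplified illustration)."""
    vocab = set(w for d in docs for w in d.split())
    lex = {}
    for w in vocab:
        with_w = [r for d, r in zip(docs, ratings) if w in d.split()]
        without = [r for d, r in zip(docs, ratings) if w not in d.split()]
        if with_w and without:
            lex[w] = sum(with_w) / len(with_w) - sum(without) / len(without)
    return lex

def score(doc, lex):
    """Sentiment of a document: mean lexicon score of its known words."""
    hits = [lex[w] for w in doc.split() if w in lex]
    return sum(hits) / len(hits) if hits else 0.0
```

The ensemble step of the paper would combine several such lexicons' predictions as classifier inputs.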

  15. Classification of Histological Images Based on the Stationary Wavelet Transform

    International Nuclear Information System (INIS)

    Non-Hodgkin lymphomas are of many distinct types, and different classification systems make it difficult to diagnose them correctly. Many of these systems classify lymphomas only based on what they look like under a microscope. In 2008 the World Health Organisation (WHO) introduced the most recent system, which also considers the chromosome features of the lymphoma cells and the presence of certain proteins on their surface. The WHO system is the one that we apply in this work. Herewith we present an automatic method to classify histological images of three types of non-Hodgkin lymphoma. Our method is based on the Stationary Wavelet Transform (SWT), and it consists of three steps: 1) extracting sub-bands from the histological image through SWT, 2) applying Analysis of Variance (ANOVA) to clean noise and select the most relevant information, 3) classifying it by the Support Vector Machine (SVM) algorithm. The kernel types Linear, RBF and Polynomial were evaluated with our method applied to 210 images of lymphoma from the National Institute on Aging. We concluded that the following combination led to the most relevant results: detail sub-band, ANOVA and SVM with Linear and RBF kernels

  16. Defining and evaluating classification algorithm for high-dimensional data based on latent topics.

    Directory of Open Access Journals (Sweden)

    Le Luo

    Full Text Available Automatic text categorization is one of the key techniques in information retrieval and data mining. Classification is usually time-consuming when the training dataset is large and high-dimensional. Many methods have been proposed to solve this problem, but few achieve satisfactory efficiency. In this paper, we present a method which combines the Latent Dirichlet Allocation (LDA) algorithm and the Support Vector Machine (SVM). LDA is first used to generate a reduced-dimensional representation of topics as features in the vector space model; it reduces the feature set dramatically while keeping the necessary semantic information. The SVM is then employed to classify the data based on the generated features. We evaluate the algorithm on the 20 Newsgroups and Reuters-21578 datasets, respectively. The experimental results show that classification based on our proposed LDA+SVM model achieves high performance in terms of precision, recall and F1 measure, and does so within a much shorter time frame. Our process improves greatly upon previous work in this field and displays strong potential for a streamlined classification process across a wide range of applications.

  17. Ovarian Cancer Classification based on Mass Spectrometry Analysis of Sera

    Directory of Open Access Journals (Sweden)

    Baolin Wu

    2006-01-01

    Full Text Available In our previous study [1], we have compared the performance of a number of widely used discrimination methods for classifying ovarian cancer using Matrix Assisted Laser Desorption Ionization (MALDI mass spectrometry data on serum samples obtained from Reflectron mode. Our results demonstrate good performance with a random forest classifier. In this follow-up study, to improve the molecular classification power of the MALDI platform for ovarian cancer disease, we expanded the mass range of the MS data by adding data acquired in Linear mode and evaluated the resultant decrease in classification error. A general statistical framework is proposed to obtain unbiased classification error estimates and to analyze the effects of sample size and number of selected m/z features on classification errors. We also emphasize the importance of combining biological knowledge and statistical analysis to obtain both biologically and statistically sound results. Our study shows improvement in classification accuracy upon expanding the mass range of the analysis. In order to obtain the best classification accuracies possible, we found that a relatively large training sample size is needed to obviate the sample variations. For the ovarian MS dataset that is the focus of the current study, our results show that approximately 20-40 m/z features are needed to achieve the best classification accuracy from MALDI-MS analysis of sera. Supplementary information can be found at http://bioinformatics.med.yale.edu/proteomics/BioSupp2.html.

  18. Diagnostics of enterprise bankruptcy occurrence probability in an anti-crisis management: modern approaches and classification of models

    Directory of Open Access Journals (Sweden)

    I.V. Zhalinska

    2015-09-01

    Full Text Available Diagnostics of the probability of enterprise bankruptcy is an important tool for ensuring the viability of an organization under conditions of an unpredictable, dynamic environment. The paper aims to define the basic features of models for diagnosing the probability of bankruptcy and to classify them. The article substantiates the objectively increasing probability of crisis in modern enterprises, which creates the need to improve the efficiency of anti-crisis activities. The system of anti-crisis management is based on a subsystem for diagnosing the probability of bankruptcy; this subsystem underpins further measures to prevent and overcome a crisis. A classification of existing models of the probability of enterprise bankruptcy has been suggested, based on the methodical and methodological principles of the models. The following main groups of models are distinguished: models using financial ratios, aggregates and scores; discriminant analysis models; methods of strategic analysis; informal models; artificial intelligence systems; and combinations of these models. The classification made it possible to identify the analytical capabilities of each of the suggested groups of models.

  19. An approach for mechanical fault classification based on generalized discriminant analysis

    Institute of Scientific and Technical Information of China (English)

    LI Wei-hua; SHI Tie-lin; YANG Shu-zi

    2006-01-01

    To deal with pattern classification of complicated mechanical faults, an approach to multi-fault classification based on generalized discriminant analysis is presented. Compared with linear discriminant analysis (LDA), generalized discriminant analysis (GDA), one of the nonlinear discriminant analysis methods, is more suitable for classifying linearly non-separable problems. The connection and difference between kernel principal component analysis (KPCA) and GDA are discussed: KPCA is good at detecting machine abnormality, while GDA performs well in multi-fault classification based on collections of historical fault symptoms. When the proposed method is applied to air compressor condition classification and gear fault classification, it shows excellent performance in complicated multi-fault classification.
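
Both KPCA and GDA start from the same object: an eigendecomposition built on a centred kernel matrix. A minimal KPCA sketch with an RBF kernel (not from the paper; `gamma` and the component count are illustrative) shows that shared first step:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Pairwise RBF kernel matrix K[i, j] = exp(-gamma * ||xi - xj||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kpca(X, n_components=2, gamma=1.0):
    """Kernel PCA: eigendecompose the doubly-centred kernel matrix and
    project the training samples onto the leading kernel axes."""
    K = rbf_kernel(X, gamma)
    n = len(X)
    One = np.full((n, n), 1.0 / n)
    Kc = K - One @ K - K @ One + One @ K @ One  # double centring
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    return Kc @ (vecs / np.sqrt(np.maximum(vals, 1e-12)))
```

GDA differs in the second step: instead of maximizing variance, it maximizes the between-class to within-class scatter ratio in the same kernel feature space, which is why it suits multi-fault classification.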

  20. A NEW SVM BASED EMOTIONAL CLASSIFICATION OF IMAGE

    Institute of Scientific and Technical Information of China (English)

    Wang Weining; Yu Yinglin; Zhang Jianchao

    2005-01-01

    How a high-level emotional representation of art paintings can be inferred from perceptual-level features suited to the particular classes (dynamic vs. static classification) is presented. The key points are feature selection and classification. Based on the strong relationship between the notable lines of an image and human sensations, a novel feature vector, the Weighted Line Direction-Length Vector (WLDLV), is proposed, which includes both the orientation and length information of the lines in an image. Classification is performed by a support vector machine (SVM), and images can be classified as dynamic or static. Experimental results demonstrate the effectiveness and superiority of the algorithm.

  1. The ARMA model's pole characteristics of Doppler signals from the carotid artery and their classification application

    Institute of Scientific and Technical Information of China (English)

    CHEN Xi; WANG Yuanyuan; ZHANG Yu; WANG Weiqi

    2002-01-01

    In order to diagnose cerebral infarction, a classification system based on the ARMA model and a BP (back-propagation) neural network is presented to analyze blood flow Doppler signals from the carotid artery. In this system, an ARMA model is first used to analyze the audio Doppler blood flow signals from the carotid artery. Then several characteristic parameters of the pole distribution are estimated. After studying the sensitivity of these characteristic parameters to the diagnosis of cerebral infarction, a BP neural network using the sensitive parameters is established to classify the normal or abnormal state of the cerebral vessel. With 474 cases used to establish the appropriate neural network, and 52 cases used to test the network, the results show that the correct classification rates of both training and testing exceed 94%. This system is thus useful for diagnosing cerebral infarction.
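
The pole-based features come from the roots of the fitted model's AR polynomial. A minimal sketch (not the paper's code; it shows the AR part only, with coefficient estimation assumed done elsewhere):

```python
import numpy as np

def ar_poles(ar_coeffs):
    """Poles of an AR(p) model x[n] = a1*x[n-1] + ... + ap*x[n-p] + e[n]:
    the roots of z^p - a1*z^(p-1) - ... - ap."""
    return np.roots([1.0] + [-a for a in ar_coeffs])

def pole_features(ar_coeffs):
    """Characteristic parameters of the pole distribution, usable as
    neural-network inputs: magnitude and angle of each pole."""
    p = ar_poles(ar_coeffs)
    return np.abs(p), np.angle(p)
```

For a stable model all pole magnitudes lie inside the unit circle; shifts in magnitude and angle of the dominant poles are the kind of parameter the sensitivity study would examine.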

  2. Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

    OpenAIRE

    Weaver, Chelsea; Saito, Naoki

    2016-01-01

    Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples, with classification to the most-contributing class. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local princip...

  3. Analysis on Design of Kohonen-network System Based on Classification of Complex Signals

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The key methods of detection and classification of the electroencephalogram (EEG) used in recent years are introduced. Taking the EEG as an example, a design plan for a Kohonen neural network system based on the detection and classification of complex signals is proposed, and both the network design and the signal processing are analyzed, including pre-processing of signals, extraction of signal features, classification of signals, and network topology.

  4. Comparing Machine Learning Classifiers for Object-Based Land Cover Classification Using Very High Resolution Imagery

    OpenAIRE

    Yuguo Qian; Weiqi Zhou; Jingli Yan; Weifeng Li; Lijian Han

    2014-01-01

    This study evaluates and compares the performance of four machine learning classifiers—support vector machine (SVM), normal Bayes (NB), classification and regression tree (CART) and K nearest neighbor (KNN)—to classify very high resolution images, using an object-based classification procedure. In particular, we investigated how tuning parameters affect the classification accuracy with different training sample sizes. We found that: (1) SVM and NB were superior to CART and KNN, and both could...

  5. The Discriminative validity of "nociceptive," "peripheral neuropathic," and "central sensitization" as mechanisms-based classifications of musculoskeletal pain.

    LENUS (Irish Health Repository)

    Smart, Keith M

    2012-02-01

    OBJECTIVES: Empirical evidence of discriminative validity is required to justify the use of mechanisms-based classifications of musculoskeletal pain in clinical practice. The purpose of this study was to evaluate the discriminative validity of mechanisms-based classifications of pain by identifying discriminatory clusters of clinical criteria predictive of "nociceptive," "peripheral neuropathic," and "central sensitization" pain in patients with low back (+/- leg) pain disorders. METHODS: This study was a cross-sectional, between-patients design using the extreme-groups method. Four hundred sixty-four patients with low back (+/- leg) pain were assessed using a standardized assessment protocol. After each assessment, patients' pain was assigned a mechanisms-based classification. Clinicians then completed a clinical criteria checklist indicating the presence/absence of various clinical criteria. RESULTS: Multivariate analyses using binary logistic regression with Bayesian model averaging identified a discriminative cluster of 7, 3, and 4 symptoms and signs predictive of a dominance of "nociceptive," "peripheral neuropathic," and "central sensitization" pain, respectively. Each cluster was found to have high levels of classification accuracy (sensitivity, specificity, positive/negative predictive values, positive/negative likelihood ratios). DISCUSSION: By identifying a discriminatory cluster of symptoms and signs predictive of "nociceptive," "peripheral neuropathic," and "central" pain, this study provides some preliminary discriminative validity evidence for mechanisms-based classifications of musculoskeletal pain. Classification system validation requires the accumulation of validity evidence before their use in clinical practice can be recommended. Further studies are required to evaluate the construct and criterion validity of mechanisms-based classifications of musculoskeletal pain.

  6. A new mass classification system derived from multiple features and a trained MLP model

    Science.gov (United States)

    Tan, Maxine; Pu, Jiantao; Zheng, Bin

    2014-03-01

    High false-positive recall rate is an important clinical issue that reduces the efficacy of screening mammography. Aiming to improve the accuracy of classification between benign and malignant breast masses, and thereby reduce false-positive recalls, we developed and tested a new computer-aided diagnosis (CAD) scheme for mass classification using a database of 600 verified mass regions. The mass regions were segmented from regions of interest (ROIs) with a fixed size of 512×512 pixels, first by an automated scheme, with manual corrections to the mass boundary performed if there was noticeable segmentation error. We randomly divided the 600 ROIs into 400 ROIs (200 malignant and 200 benign) for training and 200 ROIs (100 malignant and 100 benign) for testing. We computed and analyzed 124 shape, texture, contrast, and spiculation based features in this study. Combined with 27 previously computed regional and shape based features for each ROI in our database, these formed an initial image feature pool. From this pool of 151 features, we extracted 13 features by applying the Sequential Forward Floating Selection algorithm to the ROIs in the training dataset. We then trained a multilayer perceptron model using these 13 features and applied the trained model to the ROIs in the testing dataset. Receiver operating characteristic (ROC) analysis was used to evaluate classification accuracy. The area under the ROC curve was 0.8814±0.025 for the testing dataset. The results show higher CAD mass classification performance, which needs to be validated further in a more comprehensive study.
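
    The feature-selection step named above can be sketched in a few lines. The following is a hypothetical, simplified Sequential Forward Floating Selection loop: the `score` callable stands in for the classifier-accuracy criterion that would be evaluated on the training ROIs, and all names are illustrative.

```python
def sffs(features, score, k):
    """Sequential Forward Floating Selection (simplified sketch).
    features: list of candidate feature names; score: callable mapping a
    frozenset of features to a quality estimate (a stand-in for classifier
    accuracy on the training data); k: target subset size."""
    selected = []
    while len(selected) < k:
        # forward step: add the single feature that helps the criterion most
        best = max((f for f in features if f not in selected),
                   key=lambda f: score(frozenset(selected + [f])))
        selected.append(best)
        # floating step: drop any feature whose removal improves the score
        improved = True
        while improved and len(selected) > 2:
            improved = False
            for f in list(selected):
                reduced = [g for g in selected if g != f]
                if score(frozenset(reduced)) > score(frozenset(selected)):
                    selected = reduced
                    improved = True
                    break
    return selected
```

    The floating step is what distinguishes SFFS from plain forward selection: after every addition, previously chosen features can be dropped again if that improves the criterion.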

  7. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification procedure used to determine the origin (i.e., manufacturing factory) of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts, collected from 11 Spanish cement factories, were analysed to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated as a binary decision tree was designed from the collected data. The performance of the obtained classifier was measured by ten-fold cross validation. The results show that the proposed method yields an easy-to-use expert system that is able to determine the origin of a clinker from its trace element content.

    In this work, the qualitative identification procedure for Spanish clinkers is described, with the aim of determining their origin (factory). The classification of the clinkers is based on their trace element content. Fifteen different clinkers from 11 Spanish cement factories were analysed, determining their contents of Mg, Sr, Ba, Mn, Ti, Zr, Zn and V. An expert system in the form of a binary decision tree was designed from the collected data, and the resulting classification was examined by ten-fold cross validation. The results show that the proposed model makes it possible to build, in a simple way, an expert system capable of determining the origin of a clinker from its trace element content.
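
    The evaluation protocol named above (ten-fold cross validation) can be sketched generically. The `train` and `predict` callables below are placeholders for the binary-decision-tree learner of the paper; the interleaved fold assignment and the helper names are illustrative assumptions.

```python
def k_fold_cross_validation(samples, labels, train, predict, k=10):
    """Estimate classifier accuracy by k-fold cross validation: each fold
    is held out once while a model is trained on the remaining samples."""
    n = len(samples)
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved folds
    correct = 0
    for fold in folds:
        hold = set(fold)
        tr_x = [samples[i] for i in range(n) if i not in hold]
        tr_y = [labels[i] for i in range(n) if i not in hold]
        model = train(tr_x, tr_y)
        for i in fold:
            if predict(model, samples[i]) == labels[i]:
                correct += 1
    return correct / n
```

    Averaging over the held-out folds gives an accuracy estimate that uses every sample for testing exactly once.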

  8. China's Classification-Based Forest Management: Procedures, Problems, and Prospects

    Science.gov (United States)

    Dai, Limin; Zhao, Fuqiang; Shao, Guofan; Zhou, Li; Tang, Lina

    2009-06-01

    China’s new Classification-Based Forest Management (CFM) is a two-class system, including Commodity Forest (CoF) and Ecological Welfare Forest (EWF) lands, so named according to differences in their distinct functions and services. The purposes of CFM are to improve forestry economic systems, strengthen resource management in a market economy, ease the conflicts between wood demands and public welfare, and meet the diversified needs for forest services in China. The formative process of China’s CFM has involved a series of trials and revisions. China’s central government accelerated the reform of CFM in the year 2000 and completed the final version in 2003. CFM was implemented at the provincial level with the aid of subsidies from the central government. About a quarter of the forestland in China was approved as National EWF lands by the State Forestry Administration in 2006 and 2007. Logging is prohibited on National EWF lands, and their landowners or managers receive subsidies of about 70 RMB (US $10) per hectare from the central government. CFM represents a new forestry strategy in China and its implementation inevitably faces challenges in promoting the understanding of forest ecological services, generalizing nationwide criteria for identifying EWF and CoF lands, setting up forest-specific compensation mechanisms for ecological benefits, enhancing the knowledge of administrators and the general public about CFM, and sustaining EWF lands under China’s current forestland tenure system. CFM does, however, offer a viable pathway toward sustainable forest management in China.

  9. Brazilian Cardiorespiratory Fitness Classification Based on Maximum Oxygen Consumption

    Science.gov (United States)

    Herdy, Artur Haddad; Caixeta, Ananda

    2016-01-01

    Background: Cardiopulmonary exercise testing (CPET) is the most complete tool available to assess functional aerobic capacity (FAC). Maximum oxygen consumption (VO2 max), an important biomarker, reflects the real FAC. Objective: To develop a cardiorespiratory fitness (CRF) classification based on VO2 max in a Brazilian sample of healthy and physically active individuals of both sexes. Methods: We selected 2837 CPETs from 2837 individuals aged 15 to 74 years, distributed into age groups as follows: G1 (15 to 24); G2 (25 to 34); G3 (35 to 44); G4 (45 to 54); G5 (55 to 64); and G6 (65 to 74). Good CRF was defined as the mean VO2 max obtained for each group, generating a subclassification ranging from Very Low (VL) up to values above 105% of the group mean. Results: Mean VO2 max (mL/kg/min) by group: Men: G1 53.13; G2 49.77; G3 47.67; G4 42.52; G5 37.06; G6 31.50. Women: G1 40.85; G2 40.01; G3 34.09; G4 32.66; G5 30.04; G6 26.36. Conclusions: This chart stratifies VO2 max measured on a treadmill in a robust Brazilian sample and can be used as an alternative for the real functional evaluation of physically active and healthy individuals stratified by age and sex. PMID:27305285

  10. Traffic classification model based on fusion of multiple classifiers with flow record preference

    Institute of Scientific and Technical Information of China (English)

    董仕; 丁伟

    2013-01-01

    The concept of multi-classifier fusion is introduced to improve classification accuracy and overcome the shortcomings of any single classifier. Dempster-Shafer (DS) evidence theory is applied in the decision module of traffic classification, and preference and timeliness weights are proposed for the flow records. The multi-classifier model is validated against measured traffic data; the results show that the model overcomes the one-sidedness of a single classifier, fusing multiple pieces of evidence to optimize the classification results.
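
    The fusion step in a DS-theory decision module rests on Dempster's rule of combination. A minimal sketch follows; the traffic-class labels and mass values are invented for illustration, and the paper's preference and timeliness weighting is not reproduced here.

```python
def dempster_combine(m1, m2):
    """Combine two basic probability assignments (mass functions) with
    Dempster's rule. Masses are dicts mapping frozenset hypotheses to
    belief mass; mass falling on the empty set is renormalised away."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    norm = 1.0 - conflict
    return {h: w / norm for h, w in combined.items()}
```

    Two classifiers each emit a mass function over the candidate traffic classes; combining them concentrates belief on hypotheses the evidence agrees on.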

  11. Basic Hand Gestures Classification Based on Surface Electromyography

    OpenAIRE

    Aleksander Palkowski; Grzegorz Redlarski

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method.

  12. Consistent image-based measurement and classification of skin color

    OpenAIRE

    Harville, Michael; Baker, Harlyn; Bhatti, Nina; Süsstrunk, Sabine

    2005-01-01

    Little prior image processing work has addressed estimation and classification of skin color in a manner that is independent of camera and illuminant. To this end, we first present new methods for 1) fast, easy-to-use image color correction, with specialization toward skin tones, and 2) fully automated estimation of facial skin color, with robustness to shadows, specularities, and blemishes. Each of these is validated independently against ground truth, and then combined with a classification...

  13. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    OpenAIRE

    Hongxia Li

    2013-01-01

    With the development of computer science and information technology, the library is developing toward information and network services. The library digitization process converts books into digital information, whose high-quality preservation and management are achieved by computer technology as well as text classification techniques, realizing knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward the ICA semantic clustering algorithm. It...

  14. Texture Features based Blur Classification in Barcode Images

    OpenAIRE

    Shamik Tiwari; Vidya Prasad Shukla; Sangappa Birada; Ajay Singh

    2013-01-01

    Blur is an undesirable phenomenon which appears as image degradation. Blur classification is extremely desirable before application of any blur parameters estimation approach in case of blind restoration of barcode image. A novel approach to classify blur in motion, defocus, and co-existence of both blur categories is presented in this paper. The key idea involves statistical features extraction of blur pattern in frequency domain and designing of blur classification system with feed forward ...

  15. IMPROVEMENT OF TCAM-BASED PACKET CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Xu Zhen; Zhang Jun; Rui Liyang; Sun Jun

    2008-01-01

    The feature of Ternary Content Addressable Memories (TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However, the limitations of TCAMs impede their utilization. In this paper, the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.

  16. Basic Hand Gestures Classification Based on Surface Electromyography.

    Science.gov (United States)

    Palkowski, Aleksander; Redlarski, Grzegorz

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method. PMID:27298630
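
    The parameter-optimisation step can be illustrated with a heavily simplified sketch. The real Cuckoo Search uses Levy-flight steps; in the hypothetical version below Gaussian steps are substituted, and the objective is a plain callable rather than the cross-validated SVM error of the paper. All names and defaults are illustrative.

```python
import random

def cuckoo_search(objective, lower, upper, n_nests=15, iters=200, pa=0.25, seed=0):
    """Minimise `objective` over [lower, upper] with a simplified
    one-dimensional cuckoo search: a fraction `pa` of the worst nests is
    abandoned each round and rebuilt at random positions."""
    rng = random.Random(seed)
    nests = [rng.uniform(lower, upper) for _ in range(n_nests)]
    for _ in range(iters):
        best = min(nests, key=objective)
        for i, x in enumerate(nests):
            # step size scales with the distance to the current best nest
            step = rng.gauss(0.0, 1.0) * (x - best) or rng.gauss(0.0, 0.1)
            cand = min(max(x + step, lower), upper)
            if objective(cand) < objective(x):
                nests[i] = cand  # a better egg replaces the old one
        # abandon the worst nests and rebuild them at random positions
        nests.sort(key=objective)
        for i in range(int(n_nests * (1 - pa)), n_nests):
            nests[i] = rng.uniform(lower, upper)
    return min(nests, key=objective)
```

    In the paper's setting the candidate would be an SVM kernel parameter and the objective its validation error; here any convex toy objective demonstrates the search.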

  17. Basic Hand Gestures Classification Based on Surface Electromyography

    Directory of Open Access Journals (Sweden)

    Aleksander Palkowski

    2016-01-01

    Full Text Available This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method.

  19. Parallelization of automatic classification systems based on support vector machines: Comparison and application to JET database

    International Nuclear Information System (INIS)

    In learning machines, the larger the training dataset, the better the model that can be obtained. The training phase can therefore be very demanding in terms of computational time on mono-processor computers. To overcome this difficulty, the codes should be parallelized. This article describes two general purpose parallelization techniques for a classification system based on support vector machines (SVM). Both of them have been applied to the recognition of the L-H confinement regime in JET. This has allowed reducing the training computation time from 70 h to 3 min.
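
    One common way to parallelize training is data decomposition: split the training set into chunks, train sub-models concurrently, and combine their decisions. The sketch below is a hypothetical illustration of that idea only, not the techniques of the article; a toy per-class-centroid learner stands in for the SVM, and a real cascade SVM would merge support vectors rather than keep an ensemble.

```python
from concurrent.futures import ThreadPoolExecutor

def train_centroid(chunk):
    """Toy sub-learner standing in for an SVM: per-class mean vectors."""
    sums, counts = {}, {}
    for x, y in chunk:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def parallel_train(data, n_workers=4):
    """Split the training set into chunks and train sub-models concurrently."""
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(train_centroid, chunks))

def predict(models, x):
    """Majority vote of nearest-centroid decisions from each sub-model."""
    votes = {}
    for m in models:
        y = min(m, key=lambda c: sum((a - b) ** 2 for a, b in zip(m[c], x)))
        votes[y] = votes.get(y, 0) + 1
    return max(votes, key=votes.get)
```

    Note that Python threads share one interpreter; for CPU-bound training a process pool (or a library releasing the GIL) would be needed for a real speed-up.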

  20. Recognizing Thousands of Legal Entities through Instance-based Visual Classification

    OpenAIRE

    Leveau, Valentin; Joly, Alexis; Buisson, Olivier; Letessier, Pierre; Valduriez, Patrick

    2014-01-01

    This paper considers the problem of recognizing legal entities in visual contents in a similar way to named-entity recognizers for text documents. Whereas previous works were restricted to the recognition of a few tens of logotypes, we generalize the problem to the recognition of thousands of legal persons, each being modeled by a rich corporate identity automatically built from web images. We introduce a new geometrically-consistent instance-based classification method that is shown to ou...

  1. Unsupervised amplitude and texture classification of SAR images with multinomial latent model

    OpenAIRE

    Kayabol, Koray; Zerubia, Josiane

    2013-01-01

    We combine both amplitude and texture statistics of Synthetic Aperture Radar (SAR) images for model-based classification purposes. In a finite mixture model, we bring together the Nakagami densities to model the class amplitudes and a 2D Auto-Regressive texture model with t-distributed regression error to model the textures of the classes. A nonstationary Multinomial Logistic (MnL) latent class label model is used as a mixture density to obtain spatially smooth class segments. The Classific...

  2. Wavelength-adaptive dehazing using histogram merging-based classification for UAV images.

    Science.gov (United States)

    Yoon, Inhye; Jeong, Seokhwa; Jeong, Jaeheon; Seo, Doochun; Paik, Joonki

    2015-01-01

    Since incoming light to an unmanned aerial vehicle (UAV) platform can be scattered by haze and dust in the atmosphere, the acquired image loses the original color and brightness of the subject. Enhancement of hazy images is an important task in improving the visibility of various UAV images. This paper presents a spatially-adaptive dehazing algorithm that merges color histograms with consideration of the wavelength-dependent atmospheric turbidity. Based on the wavelength-adaptive hazy image acquisition model, the proposed dehazing algorithm consists of three steps: (i) image segmentation based on geometric classes; (ii) generation of the context-adaptive transmission map; and (iii) intensity transformation for enhancing a hazy UAV image. The major contribution of the research is a novel hazy UAV image degradation model by considering the wavelength of light sources. In addition, the proposed transmission map provides a theoretical basis to differentiate visually important regions from others based on the turbidity and merged classification results. PMID:25808767
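
    The final intensity-transformation step rests on the standard atmospheric scattering model I = J·t + A·(1 − t), which can be inverted per pixel once airlight and transmission are estimated. This is a minimal sketch of that inversion only; the paper's wavelength-adaptive transmission-map generation is assumed given, and the function names are illustrative.

```python
def dehaze_pixel(intensity, airlight, transmission, t_min=0.1):
    """Invert I = J*t + A*(1-t) to recover scene radiance J from an
    observed hazy intensity; t_min avoids amplifying noise where the
    transmission (haze) is very low."""
    t = max(transmission, t_min)
    return (intensity - airlight) / t + airlight

def dehaze_image(pixels, airlight, tmap):
    """Apply the inversion over a grayscale image given a matching
    transmission map (nested lists of equal shape)."""
    return [[dehaze_pixel(p, airlight, t) for p, t in zip(prow, trow)]
            for prow, trow in zip(pixels, tmap)]
```

    Composing the forward model and then inverting it recovers the original radiance exactly wherever the transmission is above the floor.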

  3. Wavelength-Adaptive Dehazing Using Histogram Merging-Based Classification for UAV Images

    Directory of Open Access Journals (Sweden)

    Inhye Yoon

    2015-03-01

    Full Text Available Since incoming light to an unmanned aerial vehicle (UAV platform can be scattered by haze and dust in the atmosphere, the acquired image loses the original color and brightness of the subject. Enhancement of hazy images is an important task in improving the visibility of various UAV images. This paper presents a spatially-adaptive dehazing algorithm that merges color histograms with consideration of the wavelength-dependent atmospheric turbidity. Based on the wavelength-adaptive hazy image acquisition model, the proposed dehazing algorithm consists of three steps: (i image segmentation based on geometric classes; (ii generation of the context-adaptive transmission map; and (iii intensity transformation for enhancing a hazy UAV image. The major contribution of the research is a novel hazy UAV image degradation model by considering the wavelength of light sources. In addition, the proposed transmission map provides a theoretical basis to differentiate visually important regions from others based on the turbidity and merged classification results.

  4. Support vector machine based classification of fast Fourier transform spectroscopy of proteins

    Science.gov (United States)

    Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine

    2009-02-01

    Fast Fourier transform spectroscopy has proved to be a powerful method for studying the secondary structure of proteins, since peak positions and their relative amplitudes are affected by the number of hydrogen bridges that sustain this secondary structure. However, to the best of our knowledge, the method has not yet been used for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra, with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify the most important linear combinations of original spectral components and then employs a support vector machine (SVM) classification model applied to such identified combinations to categorize proteins into one of the given groups. Our experiments were performed on a set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2, and Osteopontin. The proposed combination of principal component analysis and support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.
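
    The PCA step can be sketched with a minimal power iteration that extracts the leading principal component of a spectrum matrix. This is an illustrative stand-in for the paper's pipeline: only the first component is computed, and the downstream SVM is replaced here by the projection alone.

```python
def power_iteration_pc(rows, iters=100):
    """First principal component of a list of equal-length rows via power
    iteration on the sample covariance matrix (pure-Python sketch)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(x[i] * x[j] for x in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def project(means, v, row):
    """Score of one spectrum along the leading component."""
    return sum((x - m) * c for x, m, c in zip(row, means, v))
```

    In the full method, several leading components would be retained and fed to an SVM; the projection scores are the low-dimensional features.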

  5. A neurally inspired musical instrument classification system based upon the sound onset.

    Science.gov (United States)

    Newton, Michael J; Smith, Leslie S

    2012-06-01

    Physiological evidence suggests that sound onset detection in the auditory system may be performed by specialized neurons as early as the cochlear nucleus. Psychoacoustic evidence shows that the sound onset can be important for the recognition of musical sounds. Here the sound onset is used in isolation to form tone descriptors for a musical instrument classification task. The task involves 2085 isolated musical tones from the McGill dataset across five instrument categories. A neurally inspired tone descriptor is created using a model of the auditory system's response to sound onset. A gammatone filterbank and spiking onset detectors, built from dynamic synapses and leaky integrate-and-fire neurons, create parallel spike trains that emphasize the sound onset. These are coded as a descriptor called the onset fingerprint. Classification uses a time-domain neural network, the echo state network. Reference strategies, based upon mel-frequency cepstral coefficients, evaluated either over the whole tone or only during the sound onset, provide context to the method. Classification success rates for the neurally-inspired method are around 75%. The cepstral methods perform between 73% and 76%. Further testing with tones from the Iowa MIS collection shows that the neurally inspired method is considerably more robust when tested with data from an unrelated dataset. PMID:22712950

  6. Association Technique based on Classification for Classifying Microcalcification and Mass in Mammogram

    Directory of Open Access Journals (Sweden)

    Herwanto

    2013-01-01

    Full Text Available Currently, mammography is recognized as the most effective imaging modality for breast cancer screening. The challenge of using mammography is how to locate the area which is indeed a solitary geographic abnormality. In mammography screening it is important to define the risk for women who have radiologically negative findings and for those who might develop malignancy later in life. Microcalcification and mass segmentation are used frequently as the first step in mammography screening. The main objective of this paper is to apply an association technique based on a classification algorithm to classify microcalcification and mass in mammograms. The system that we propose consists of: (i) a preprocessing phase to enhance the quality of the image, followed by segmentation of the region of interest; (ii) a phase for mining a transactional table; and (iii) a phase for organizing the resulting association rules in a classification model. This paper also illustrates how important the data cleaning phase is in building the data mining process for image classification. The proposed method was evaluated using mammogram data from the Mammographic Image Analysis Society (MIAS). The MIAS data consist of 207 images of normal breasts, 64 benign, and 51 malignant. 85 mammograms of the MIAS data have mass, and 25 mammograms have microcalcification. The features of mean and Gray Level Co-occurrence Matrix homogeneity have been proved to be potential for discriminating microcalcification from mass. The accuracy obtained by this method is 83%.
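
    One of the two discriminative features named above, Grey-Level Co-occurrence Matrix homogeneity, can be computed directly. The sketch below handles one pixel offset and assumes the image has already been quantised to `levels` grey levels; the defaults are illustrative, not the paper's settings.

```python
def glcm_homogeneity(image, dx=1, dy=0, levels=8):
    """GLCM homogeneity for one offset: accumulate co-occurrences of grey
    levels at the given displacement, then weight each pair inversely by
    its grey-level difference. `image` is a nested list of ints < levels."""
    glcm = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                glcm[image[y][x]][image[y2][x2]] += 1
                total += 1
    # homogeneity: sum over pairs of p(i, j) / (1 + |i - j|)
    return sum(glcm[i][j] / (1 + abs(i - j)) for i in range(levels)
               for j in range(levels)) / total
```

    A perfectly uniform region scores 1.0, while rapidly alternating grey levels, such as fine texture around microcalcifications, pull the score down.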

  7. Hydrogeological discrete fracture modelling to support rock suitability classification

    International Nuclear Information System (INIS)

    This report presents hydrogeological discrete fracture network (Hydro-DFN) modelling in support of the testing and development of the Posiva's Rock Suitability Classification (RSC) system. The aims are to: quantify information on the fulfilment of the inflow and large fracture criteria in the deposition tunnels and deposition holes; provide information on likely properties adjacent to large fractures and deformation zones; and quantify saline water upconing to support definition of the respect distances to the fault zones and hydraulically active zones. The work presented is an update to a previous RSC study performed in 2010, making use of an updated Hydro-DFN model developed for the 2012 Site Descriptive Model (SDM) accounting for new data, in particular that acquired underground in the pilot holes drilled ahead of the ONKALO facility. The interpretation of the tunnel pilot holes has provided a lower detection limit on the specific capacity of hydraulic fractures resulting in a higher overall number of inflows to deposition holes being simulated than in the previous study, although these are small in magnitude, and simulated inflows above 1 L/min are now very rare. Out of the 5391 possible deposition hole positions considered in the modelling, of order 100 are above the RSC limit of 0.1 L/min. It is demonstrated that screening such positions generally avoids locations with the highest post-closure flow-rates. Screening of positions in direct contact with large hydraulic fractures (defined as connected open fractures with equivalent radius greater than 75 m and at least one intersect with a deposition tunnel) is also very effective in avoiding the majority of locations with relatively high predicted post-closure flow-rates. Total inflows to the c. 40 km of deposition and adjacent central tunnels after grouting are of order 100 L/min. (orig.)

  8. Hydrogeological discrete fracture modelling to support rock suitability classification

    Energy Technology Data Exchange (ETDEWEB)

    Hartley, L.; Hoek, J.; Swan, D.; Baxter, D.; Woollard, H. [AMEC, Oxford (United Kingdom)

    2014-01-15

    This report presents hydrogeological discrete fracture network (Hydro-DFN) modelling in support of the testing and development of the Posiva's Rock Suitability Classification (RSC) system. The aims are to: quantify information on the fulfilment of the inflow and large fracture criteria in the deposition tunnels and deposition holes; provide information on likely properties adjacent to large fractures and deformation zones; and quantify saline water upconing to support definition of the respect distances to the fault zones and hydraulically active zones. The work presented is an update to a previous RSC study performed in 2010, making use of an updated Hydro-DFN model developed for the 2012 Site Descriptive Model (SDM) accounting for new data, in particular that acquired underground in the pilot holes drilled ahead of the ONKALO facility. The interpretation of the tunnel pilot holes has provided a lower detection limit on the specific capacity of hydraulic fractures resulting in a higher overall number of inflows to deposition holes being simulated than in the previous study, although these are small in magnitude, and simulated inflows above 1 L/min are now very rare. Out of the 5391 possible deposition hole positions considered in the modelling, of order 100 are above the RSC limit of 0.1 L/min. It is demonstrated that screening such positions generally avoids locations with the highest post-closure flow-rates. Screening of positions in direct contact with large hydraulic fractures (defined as connected open fractures with equivalent radius greater than 75 m and at least one intersect with a deposition tunnel) is also very effective in avoiding the majority of locations with relatively high predicted post-closure flow-rates. Total inflows to the c. 40 km of deposition and adjacent central tunnels after grouting are of order 100 L/min. (orig.)

  9. Review of Remotely Sensed Imagery Classification Patterns Based on Object-oriented Image Analysis

    Institute of Scientific and Technical Information of China (English)

    LIU Yongxue; LI Manchun; MAO Liang; XU Feifei; HUANG Shuo

    2006-01-01

    With the wide use of high-resolution remotely sensed imagery, the object-oriented remotely sensed information classification pattern has been intensively studied. Starting with the definition of the object-oriented classification pattern and a review of related research progress, this paper summarizes four development phases of the object-oriented classification pattern during the past 20 years. We then discuss three methodological aspects in detail, namely remotely sensed imagery segmentation, feature analysis and feature selection, and classification rule generation, comparing them with per-pixel classification methods. Finally, this paper presents several points that deserve attention in future studies on the object-oriented RS information classification pattern: 1) developing robust and highly effective image segmentation algorithms for multi-spectral RS imagery; 2) improving the feature set to include edge, spatial-adjacency and temporal characteristics; 3) exploring classification rule generation based on decision trees; 4) establishing evaluation methods for the results of object-oriented classification.

  10. INDUS - a composition-based approach for rapid and accurate taxonomic classification of metagenomic sequences

    OpenAIRE

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Reddy, Rachamalla Maheedhar; Reddy, Chennareddy Venkata Siva Kumar; Singh, Nitin Kumar; Sharmila S Mande

    2011-01-01

    Background Taxonomic classification of metagenomic sequences is the first step in metagenomic analysis. Existing taxonomic classification approaches are of two types, similarity-based and composition-based. Similarity-based approaches, though accurate and specific, are extremely slow. Since metagenomic projects generate millions of sequences, adopting similarity-based approaches becomes virtually infeasible for research groups having modest computational resources. In this study, we present ...
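
    Composition-based classifiers typically compare oligonucleotide (often tetranucleotide) frequency signatures of a read against reference genomes. The sketch below illustrates that general idea only; it is not INDUS's actual algorithm, and the L1 nearest-profile assignment and all names are illustrative.

```python
from collections import Counter

def kmer_profile(seq, k=4):
    """Normalised k-mer (tetranucleotide by default) frequency vector,
    the genomic signature used by composition-based binners."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def assign_taxon(read, references, k=4):
    """Assign a read to the reference whose composition is most similar,
    measured by L1 distance between k-mer profiles."""
    p = kmer_profile(read, k)
    def dist(ref_profile):
        keys = set(p) | set(ref_profile)
        return sum(abs(p.get(m, 0.0) - ref_profile.get(m, 0.0)) for m in keys)
    return min(references, key=lambda name: dist(references[name]))
```

    Because profiles are precomputed per reference, classification of each read costs only one vector comparison per taxon, which is the speed advantage over alignment-based similarity search.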

  11. Classification and Identification of Over-voltage Based on HHT and SVM

    Institute of Scientific and Technical Information of China (English)

    WANG Jing; YANG Qing; CHEN Lin; SIMA Wenxia

    2012-01-01

    This paper proposes an effective method for over-voltage classification based on the Hilbert-Huang transform (HHT), which is composed of empirical mode decomposition (EMD) and the Hilbert transform. Nine kinds of common power system over-voltages are calculated and analyzed by HHT. Based on the instantaneous amplitude spectrum, the Hilbert marginal spectrum, and the Hilbert time-frequency spectrum, three kinds of over-voltage characteristic quantities are obtained. A hierarchical classification system is built based on HHT and a support vector machine (SVM). This classification system was tested on 106 field over-voltage signals, and the average classification rate is 94.3%. This research shows that HHT is an effective time-frequency analysis algorithm for over-voltage classification and identification.

  12. 78 FR 18252 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-03-26

    ... Industry Classification System Based Federal Wage System Wage Surveys AGENCY: U. S. Office of Personnel... is issuing a proposed rule that would update the 2007 North American Industry Classification System..., the U.S. Office of Personnel Management (OPM) issued a final rule (73 FR 45853) to update the...

  13. 78 FR 58153 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-09-23

    ... RIN 3206-AM78 Prevailing Rate Systems; North American Industry Classification System Based Federal... Industry Classification System (NAICS) codes currently used in Federal Wage System wage survey industry..., 2013, the U.S. Office of Personnel Management (OPM) issued a proposed rule (78 FR 18252) to update...

  14. Classification Model of Customer Value Based on Rough Sets and Neural Networks

    Institute of Scientific and Technical Information of China (English)

    陈亮; 苏翔; 王金钟

    2011-01-01

    Evaluating customer value and then classifying customers accordingly is one of the core tasks of customer relationship management. This article first identifies factors that affect customer value classification and then applies a combined rough set-neural network model, exploiting the complementary advantages of rough sets and neural networks: the data are preprocessed with rough sets, a neural network is chosen as the evaluation method, and finally customers are classified by value according to the evaluation results.

  15. Validation of a novel classification model of psychogenic nonepileptic seizures by video-EEG analysis and a machine learning approach.

    Science.gov (United States)

    Magaudda, Adriana; Laganà, Angela; Calamuneri, Alessandro; Brizzi, Teresa; Scalera, Cinzia; Beghi, Massimiliano; Cornaggia, Cesare Maria; Di Rosa, Gabriella

    2016-07-01

    The aim of this study was to validate a novel classification for the diagnosis of PNESs. Fifty-five PNES video-EEG recordings were retrospectively analyzed by four epileptologists and one psychiatrist in a blind manner and classified into four distinct groups: Hypermotor (H), Akinetic (A), Focal Motor (FM), and with Subjective Symptoms (SS). Eleven signs and symptoms frequently found in PNESs were chosen for statistical validation of our classification. An artificial neural network (ANN) analyzed the PNES video recordings based on the signs and symptoms mentioned above. By comparing the results produced by the ANN with the classifications given by the examiners, we were able to assess whether the classification was objective and generalizable. Through accordance metrics based on signs and symptoms (range: 0-100%), we found that most of the seizures belonging to class A showed a high degree of accordance (mean±SD=73%±5%); a similar pattern was found for class SS (80%), while slightly lower accordance was reported for class H (58%±18%), with a minimum of 30% in some cases. Low agreement arose from the FM group. Seizures were univocally assigned to a given class in 83.6% of cases. The ANN classified PNESs in the same way as visual examination in 86.7% of cases. Agreement between ANN classification and visual classification reached 83.3% (SD=17.8%) for class H, 100% (SD=22%) for class A, 83.3% (SD=21.2%) for class SS, and 50% (SD=19.52%) for class FM. This is the first study in which the validity of a new PNES classification was established, and it was reached in two different ways. Video-EEG evaluation needs to be performed by an experienced clinician, but it may later be fed into ANN analysis, whose feedback will provide guidance for differential diagnosis. Our analysis, supported by the ML approach, showed that this model of classification could be objectively performed by video-EEG examination. PMID:27208925
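
The study's headline numbers are agreement rates between the ANN output and the examiners' visual classification. A minimal sketch of that comparison, using invented labels rather than the study's data:

```python
# Illustrative agreement computation between ANN-assigned and visually
# assigned PNES classes (H, A, FM, SS). The label sequences are invented.
def percent_agreement(ann_labels, visual_labels):
    matches = sum(a == v for a, v in zip(ann_labels, visual_labels))
    return 100.0 * matches / len(visual_labels)

ann    = ["H", "A", "SS", "A", "FM", "SS"]
visual = ["H", "A", "SS", "A", "H",  "SS"]
print(f"{percent_agreement(ann, visual):.1f}%")  # -> 83.3%
```

The study reports this kind of overall figure (86.7%) as well as per-class agreement, which would simply restrict the comparison to seizures of one visual class.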

  16. Classification of Counseling and Therapy Theorists, Methods, Processes, and Goals: The E-R-A Model.

    Science.gov (United States)

    L'Abate, Luciano

    1981-01-01

    Presents an Emotionality-Rationality-Activity model that integrates recent classifications of counseling and psychotherapy. The model also serves as a theoretical basis from which methods, goals, and processes during counseling, psychotherapy, and training can be derived and integrated. (Author)

  17. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (>85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
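
One of the two transformations named above, quantile discretization, can be sketched in a few lines: each sample's expression values are mapped onto a fixed number of equal-frequency bins, so values from different platforms end up on a common rank-based scale. This is a simplified illustration, not the paper's exact procedure; the input values are invented.

```python
# Simplified quantile discretization: replace each value by the index of
# its equal-frequency bin (0..q-1) within its own sample, making samples
# from different platforms numerically comparable.
def quantile_discretize(values, q=4):
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = (rank * q) // len(values)  # bin index 0..q-1
    return bins

cdna  = [0.2, 5.1, 1.3, 9.8]        # hypothetical cDNA log-ratios
oligo = [12.0, 640.0, 55.0, 900.0]  # hypothetical oligo intensities
print(quantile_discretize(cdna))    # -> [0, 2, 1, 3]
print(quantile_discretize(oligo))   # -> [0, 2, 1, 3]
```

Although the raw scales differ by orders of magnitude, both samples discretize to the same bin pattern, which is what allows a single SVM to be trained across platforms.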

  18. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate
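
The top-down hierarchical idea mentioned above can be sketched concisely: a child term is only considered if its parent was predicted positive, mirroring the true-path structure of the ontology. In this sketch the ontology fragment and the per-node "classifiers" (simple keyword rules) are invented stand-ins for trained models.

```python
# Sketch of top-down hierarchical multi-label classification over a tiny
# invented GO-like hierarchy. Real systems train a classifier per node;
# here each node is a placeholder keyword rule.
children = {
    "GO:root": ["GO:binding", "GO:catalysis"],
    "GO:binding": ["GO:dna_binding"],
    "GO:catalysis": [],
    "GO:dna_binding": [],
}
keyword = {  # hypothetical per-node classifier
    "GO:binding": "bind",
    "GO:catalysis": "catal",
    "GO:dna_binding": "dna",
}

def predict(text, node="GO:root"):
    labels = []
    for child in children[node]:
        if keyword[child] in text:          # stand-in for a local classifier
            labels.append(child)
            labels += predict(text, child)  # descend only on positives
    return labels

print(predict("protein binds dna upstream"))  # -> ['GO:binding', 'GO:dna_binding']
```

Because prediction descends only through positive parents, the output label set is always consistent with the graph, which is also why near-miss predictions tend to be neighbors of the true annotation.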

  19. Quantitative measurement of retinal ganglion cell populations via histology-based random forest classification.

    Science.gov (United States)

    Hedberg-Buenz, Adam; Christopher, Mark A; Lewis, Carly J; Fernandes, Kimberly A; Dutca, Laura M; Wang, Kai; Scheetz, Todd E; Abràmoff, Michael D; Libby, Richard T; Garvin, Mona K; Anderson, Michael G

    2016-05-01

    The inner surface of the retina contains a complex mixture of neurons, glia, and vasculature, including retinal ganglion cells (RGCs), the final output neurons of the retina and primary neurons that are damaged in several blinding diseases. The goal of the current work was two-fold: to assess the feasibility of using computer-assisted detection of nuclei and random forest classification to automate the quantification of RGCs in hematoxylin/eosin (H&E)-stained retinal whole-mounts; and if possible, to use the approach to examine how nuclear size influences disease susceptibility among RGC populations. To achieve this, data from RetFM-J, a semi-automated ImageJ-based module that detects, counts, and collects quantitative data on nuclei of H&E-stained whole-mounted retinas, were used in conjunction with a manually curated set of images to train a random forest classifier. To test performance, computer-derived outputs were compared to previously published features of several well-characterized mouse models of ophthalmic disease and their controls: normal C57BL/6J mice; Jun-sufficient and Jun-deficient mice subjected to controlled optic nerve crush (CONC); and DBA/2J mice with naturally occurring glaucoma. The result of these efforts was development of RetFM-Class, a command-line-based tool that uses data output from RetFM-J to perform random forest classification of cell type. Comparative testing revealed that manual and automated classifications by RetFM-Class correlated well, with 83.2% classification accuracy for RGCs. Automated characterization of C57BL/6J retinas predicted 54,642 RGCs per normal retina, and identified a 48.3% Jun-dependent loss of cells at 35 days post CONC and a 71.2% loss of RGCs among 16-month-old DBA/2J mice with glaucoma. Output from automated analyses was used to compare nuclear area among large numbers of RGCs from DBA/2J mice (n = 127,361). In aged DBA/2J mice with glaucoma, RetFM-Class detected a decrease in median and mean nucleus size
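
At its core, a random forest classifies each nucleus by majority vote over many decision trees built on nuclear features. A toy sketch of that voting scheme, with invented features and thresholds (single decision stumps standing in for full trees):

```python
# Minimal sketch of random-forest-style majority voting over nuclear
# features. Feature names and thresholds are invented for illustration;
# RetFM-Class trains real trees on curated H&E nucleus data.
def stump_area(f):   return "RGC" if f["area"] > 50 else "other"
def stump_round(f):  return "RGC" if f["roundness"] > 0.8 else "other"
def stump_stain(f):  return "RGC" if f["stain"] < 0.4 else "other"

def forest_vote(features, stumps=(stump_area, stump_round, stump_stain)):
    votes = [s(features) for s in stumps]
    return max(set(votes), key=votes.count)  # majority label wins

nucleus = {"area": 62.0, "roundness": 0.9, "stain": 0.7}
print(forest_vote(nucleus))  # -> RGC
```

Here two of three stumps vote "RGC", so the ensemble labels the nucleus an RGC even though one feature disagrees; the real classifier does the same over hundreds of trees and many morphometric features.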

  20. Deep Neural Networks Based Recognition of Plant Diseases by Leaf Image Classification.

    Science.gov (United States)

    Sladojevic, Srdjan; Arsenovic, Marko; Anderla, Andras; Culibrk, Dubravko; Stefanovic, Darko

    2016-01-01

    The latest generation of convolutional neural networks (CNNs) has achieved impressive results in the field of image classification. This paper is concerned with a new approach to the development of a plant disease recognition model, based on leaf image classification, by the use of deep convolutional networks. The novel way of training and the methodology used facilitate a quick and easy system implementation in practice. The developed model is able to recognize 13 different types of plant diseases and distinguish diseased from healthy leaves, with the ability to distinguish plant leaves from their surroundings. To our knowledge, this method for plant disease recognition is proposed for the first time. All essential steps required for implementing this disease recognition model are fully described throughout the paper, starting from gathering images in order to create a database, assessed by agricultural experts. Caffe, a deep learning framework developed by the Berkeley Vision and Learning Center, was used to perform the deep CNN training. The experimental results on the developed model achieved precision between 91% and 98% for separate class tests, on average 96.3%. PMID:27418923
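
The reported figures ("precision between 91% and 98%, on average 96.3%") come from per-class precision and its macro-average. A minimal sketch of that calculation, with invented per-class true/false positive counts (the study's confusion matrix is not given in this abstract):

```python
# Per-class precision and macro-average, as used for the reported
# per-class and average results. The (tp, fp) counts are invented.
def precision(tp, fp):
    return tp / (tp + fp)

per_class = {"rust": (95, 5), "mildew": (98, 2), "blight": (91, 9)}
precs = {c: precision(tp, fp) for c, (tp, fp) in per_class.items()}
macro = sum(precs.values()) / len(precs)
print(f"{min(precs.values()):.2f}-{max(precs.values()):.2f}, macro {macro:.3f}")
# -> 0.91-0.98, macro 0.947
```

Precision is computed independently per disease class and then averaged with equal class weight, so rare diseases count as much as common ones.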