WorldWideScience

Sample records for based classification model

  1. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases......, the classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...... datasets. Our model also outperforms A Decision Cluster Classification (ADCC) and the Decision Cluster Forest Classification (DCFC) models on the Reuters-21578 dataset....
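
The cluster-then-classify idea in this abstract can be sketched as follows. This is a minimal illustration with synthetic two-class data and scikit-learn stand-ins (KMeans plus logistic regression), not the authors' actual model:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic stand-in for text feature vectors of two topics.
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(3, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

# Step 1: partition the training examples into clusters.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Step 2: train a simpler classifier on each cluster separately.
classifiers = {}
for c in range(km.n_clusters):
    mask = km.labels_ == c
    if len(np.unique(y[mask])) == 1:          # degenerate cluster: one class only
        classifiers[c] = int(y[mask][0])
    else:
        classifiers[c] = LogisticRegression().fit(X[mask], y[mask])

def predict(x):
    # Route the example to its cluster's classifier.
    c = int(km.predict(x.reshape(1, -1))[0])
    clf = classifiers[c]
    return clf if isinstance(clf, int) else int(clf.predict(x.reshape(1, -1))[0])
```

Each per-cluster classifier sees fewer examples in a more homogeneous region of feature space, which is the simplification the abstract describes.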

  2. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Roč. 40, č. 3 (2004), s. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  3. Polarimetric SAR image classification based on discriminative dictionary learning model

    Science.gov (United States)

    Sang, Cheng Wei; Sun, Hong

    2018-03-01

    Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimensional nonlinear mapping problem, and sparse representations based on learned overcomplete dictionaries have shown great potential for solving it. The overcomplete dictionary plays an important role in PolSAR image classification; however, in complex PolSAR scenes, features shared by different classes weaken the discrimination of the learned dictionary and thereby degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model that enhances the discrimination of the dictionary. The overcomplete dictionary learned by the proposed model is more discriminative and well suited for PolSAR classification.

  4. Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Yidong Tang

    2016-01-01

    The sparse representation based classifier (SRC) and its kernel version (KSRC) have been employed for hyperspectral image (HSI) classification. However, the state-of-the-art SRC often aims at extended surface objects with linear mixture in smooth scenes and assumes that the number of classes is given. Considering small targets with complex backgrounds, a sparse representation based binary hypothesis (SRBBH) model is established in this paper. In this model, a query pixel is represented in two ways: by the background dictionary and by the union dictionary. The background dictionary is composed of samples selected from the local dual concentric window centered at the query pixel. Thus, for each pixel the classification issue becomes an adaptive multiclass classification problem, where only the number of desired classes is required. Furthermore, the kernel method is employed to improve the interclass separability. In kernel space, the coding vector is obtained by the kernel-based orthogonal matching pursuit (KOMP) algorithm. The query pixel can then be labeled by the characteristics of the coding vectors. Instead of directly using the reconstruction residuals, the different impacts that the background dictionary and union dictionary have on reconstruction are used for validation and classification, which enhances the discrimination and hence improves performance.
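
The binary-hypothesis idea (compare how well a background dictionary versus a union background-plus-target dictionary reconstructs a query pixel) can be sketched as below. This toy uses plain least-squares coding in place of the paper's kernel OMP, and random dictionaries in place of locally selected samples:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 20
background = rng.normal(size=(d, 8))   # hypothetical background atoms
target = rng.normal(size=(d, 4))       # hypothetical target atoms
union = np.hstack([background, target])

def residual(D, x):
    # Reconstruction residual under a dictionary (least squares stands in
    # for the sparse KOMP coding used in the paper).
    coef, *_ = np.linalg.lstsq(D, x, rcond=None)
    return np.linalg.norm(x - D @ coef)

# A pixel composed of target atoms: the union dictionary reconstructs it
# almost exactly, while the background dictionary cannot.
x = target @ rng.normal(size=4)
is_target = residual(background, x) > 2.0 * residual(union, x) + 1e-6
```

The decision compares the two hypotheses' residuals rather than class-wise residuals alone, mirroring the validation step the abstract describes.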

  5. Compositional Model Based Fisher Vector Coding for Image Classification.

    Science.gov (United States)

    Liu, Lingqiao; Wang, Peng; Shen, Chunhua; Wang, Lei; Hengel, Anton van den; Wang, Chao; Shen, Heng Tao

    2017-12-01

    Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC. To alleviate this limitation, in this work, we break the convention which assumes that a local feature is drawn from one of a few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as a linear combination of multiple key components, and the combination weight is a latent random variable. In doing so we greatly enhance the representative power of the generative model underlying FVC. To implement our idea, we design two particular generative models following this compositional approach. In our first model, the mean vector is sampled from the subspace spanned by a set of bases and the combination weight is drawn from a Laplace distribution. In our second model, we further assume that a local feature is composed of a discriminative part and a residual part. As a result, a local feature is generated by the linear combination of discriminative part bases and residual part bases. The decomposition of the discriminative and residual parts is achieved via the guidance of a pre-trained supervised coding method. By calculating the gradient vector of the proposed models, we derive two new Fisher vector coding strategies. The first is termed Sparse Coding-based Fisher Vector Coding (SCFVC) and can be used as the substitute of traditional GMM based FVC. The second is termed Hybrid Sparse Coding-based Fisher vector coding (HSCFVC) since it

  6. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Feature selection and description is a key factor in the classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation keeps the initial spatial structure and ensures that the neighborhood is taken into account. Based on a small number of component features, a k-nearest-neighbor classification is applied.
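
A minimal sketch of this pipeline follows: feature rasters stacked as a tensor, a feature-mode SVD standing in for the tensor decomposition, then k-NN on the component features. The raster size, synthetic data, and choice of three components are illustrative only:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
H, W, F = 8, 8, 6                          # toy raster: 8x8 pixels, 6 feature layers
tensor = rng.normal(size=(H, W, F))

# Feature-mode unfolding: one row per pixel, one column per feature raster.
unfolded = tensor.reshape(-1, F)

# SVD on the feature mode selects component features (an HOSVD-style step).
_, _, Vt = np.linalg.svd(unfolded, full_matrices=False)
components = unfolded @ Vt[:3].T           # keep 3 component features per pixel

labels = (unfolded[:, 0] > 0).astype(int)  # synthetic ground truth
knn = KNeighborsClassifier(n_neighbors=3).fit(components, labels)
```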

  7. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with a soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic...
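
The harmonic product spectrum step can be sketched in a few lines. The sample rate, test tone, and number of harmonics here are illustrative choices, not the paper's settings:

```python
import numpy as np

def hps_pitch(signal, sr, harmonics=3):
    """Estimate pitch by the harmonic product spectrum (minimal sketch)."""
    spec = np.abs(np.fft.rfft(signal))
    hps = spec.copy()
    for h in range(2, harmonics + 1):
        # Multiply by the spectrum downsampled by h; a tone's harmonics
        # then all line up at the fundamental's bin.
        n = len(spec) // h
        hps[:n] *= spec[::h][:n]
    peak = np.argmax(hps[1:]) + 1          # skip the DC bin
    return peak * sr / len(signal)

sr = 8000
t = np.arange(sr) / sr                     # 1 s of audio
tone = sum(np.sin(2 * np.pi * 220 * k * t) / k for k in range(1, 5))
```

For this 220 Hz tone with four harmonics, the estimator recovers the fundamental to within the FFT bin resolution.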

  8. Skin injury model classification based on shape vector analysis.

    Science.gov (United States)

    Röhrich, Emil; Thali, Michael; Schweitzer, Wolf

    2012-11-06

    Skin injuries can be crucial in judicial decision making. Forensic experts base their classification on subjective opinions. This study investigates whether known classes of simulated skin injuries are correctly classified statistically based on 3D surface models and derived numerical shape descriptors. Skin injury surface characteristics are simulated with plasticine. Six injury classes - abrasions, incised wounds, gunshot entry wounds, smooth and textured strangulation marks as well as patterned injuries - with 18 instances each are used for a k-fold cross validation with six partitions. Deformed plasticine models are captured with a 3D surface scanner. Mean curvature is estimated for each polygon surface vertex. Subsequently, distance distributions and derived aspect ratios, convex hulls, concentric spheres, hyperbolic points and Fourier transforms are used to generate 1284-dimensional shape vectors. Subsequent descriptor reduction maximizing the SNR (signal-to-noise ratio) results in an average of 41 descriptors (varying across k-folds). With a non-normal multivariate distribution of heteroskedastic data, the requirements for LDA (linear discriminant analysis) are not met. Thus, the shrinkage parameters of RDA (regularized discriminant analysis) are optimized, yielding the best performance with λ = 0.99 and γ = 0.001. The Receiver Operating Characteristic of a descriptive RDA yields an ideal Area Under the Curve of 1.0 for all six categories. Predictive RDA results in an average CRR (correct recognition rate) of 97.22% under a six-partition k-fold. Adding uniform noise within the range of one standard deviation degrades the average CRR to 71.3%. Digitized 3D surface shape data can be used to automatically classify idealized shape models of simulated skin injuries. Deriving some well established descriptors such as histograms, saddle shape of hyperbolic points or convex hulls with subsequent reduction of dimensionality while maximizing SNR seem to work well for the data at hand, as
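
The RDA regularization described here (blend each class covariance with the pooled covariance via λ, then shrink toward a scaled identity via γ) can be sketched as follows. The two toy classes and 4-D features are illustrative, not the study's 1284-dimensional shape vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.normal(0, 1, (18, 4))             # toy class 0 (e.g. abrasions)
X1 = rng.normal(2, 1, (18, 4))             # toy class 1 (e.g. incised wounds)
pooled = np.cov(np.vstack([X0, X1]).T)

def rda_cov(Xk, lam=0.99, gamma=0.001):
    # S_k(lam) = (1-lam)*S_k + lam*S_pooled, then shrink toward the identity.
    S = (1 - lam) * np.cov(Xk.T) + lam * pooled
    p = S.shape[0]
    return (1 - gamma) * S + gamma * (np.trace(S) / p) * np.eye(p)

def rda_score(x, Xk):
    # Gaussian discriminant score with the regularized covariance.
    S = rda_cov(Xk)
    diff = x - Xk.mean(axis=0)
    return -diff @ np.linalg.solve(S, diff) - np.log(np.linalg.det(S))

x = np.full(4, 2.0)                        # query near class 1's mean
label = int(rda_score(x, X1) > rda_score(x, X0))
```

With λ close to 1 and tiny γ, as in the study, each class covariance is dominated by the pooled estimate, which is what makes the estimator usable when per-class sample sizes are small.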

  9. Proposing a Hybrid Model Based on Robson's Classification for Better Impact on Trends of Cesarean Deliveries.

    Science.gov (United States)

    Hans, Punit; Rohatgi, Renu

    2017-06-01

    To construct a hybrid classification model for cesarean section (CS) deliveries based on woman-characteristics (Robson's classification with additional layers of indications for CS, keeping in view the low-resource settings available in India). This is a cross-sectional study conducted at Nalanda Medical College, Patna. All the women who delivered from January 2016 to May 2016 in the labor ward were included. Results obtained were compared with the values obtained for India from a secondary analysis of the WHO multi-country survey (2010-2011) by Joshua Vogel and colleagues, published in "The Lancet Global Health." The three classifications (indication-based, Robson's and the hybrid model) were applied to categorize the cesarean deliveries from the same sample of data, and a semiqualitative evaluation was done considering the main characteristics, strengths and weaknesses of each classification system. The total number of women delivered during the study period was 1462, out of which 471 were CS deliveries. The overall CS rate calculated for NMCH hospital in this period was 32.21% (p = 0.001). The hybrid model scored 23/23, while Robson's classification and the indication-based classification scored 21/23 and 10/23, respectively. A single study centre and referral bias are the limitations of the study. Given the flexibility of the classifications, we constructed a hybrid model based on the woman-characteristics system with additional layers of the other classification. Indication-based classification answers why, Robson classification answers on whom, while through our hybrid model we learn why and on whom cesarean deliveries are being performed.

  10. A unified model for context-based behavioural modelling and classification

    CSIR Research Space (South Africa)

    Dabrowski, JJ

    2015-11-01

    A Unified Model for Context-Based Behavioural Modelling and Classification. Joel Janek Dabrowski, Johan Pieter de Villiers (Department of Electrical, Electronic and Computer Engineering, University of Pretoria, 2 Lynnwood Rd, Pretoria, South Africa)... consists of tracked location data, ocean conditions, weather conditions, time of day, season of year, regional location data and vessel class ground truth. Each vessel in an environment is classified as either a pirate, transport or fishing vessel...

  11. The method of narrow-band audio classification based on universal noise background model

    Science.gov (United States)

    Rui, Rui; Bao, Chang-chun

    2013-03-01

    Audio classification is the basis of content-based audio analysis and retrieval. Conventional classification methods mainly depend on feature extraction at the audio-clip level, which increases the time required for classification. An approach for classifying a narrow-band audio stream based on frame-level feature extraction is presented in this paper. The audio signals are divided into speech, instrumental music, song with accompaniment and noise using the Gaussian mixture model (GMM). To cope with changing real-world environments, a universal noise background model (UNBM) for white noise, street noise, factory noise and car interior noise is built. In addition, three feature schemes are considered to optimize feature selection. The experimental results show that the proposed algorithm achieves high accuracy for audio classification, especially under each of the noise backgrounds used, and keeps the classification time below one second.
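
The per-class GMM scheme can be sketched with scikit-learn. The synthetic "speech" and "music" features below are placeholders for real frame-level features, and only two of the four classes are shown:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical frame-level feature vectors for two of the audio classes.
train = {"speech": rng.normal(0, 1, (200, 4)),
         "music": rng.normal(4, 1, (200, 4))}

# One GMM per audio class, fit by EM.
models = {name: GaussianMixture(n_components=2, random_state=0).fit(X)
          for name, X in train.items()}

def classify(frames):
    # Label a clip by the class whose GMM scores its frames highest
    # (average per-frame log-likelihood).
    return max(models, key=lambda name: models[name].score(frames))
```

A noise background model would be added the same way: one more GMM in the dictionary, trained on the pooled noise types.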

  12. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  13. Multigrades Classification Model of Magnesite Ore Based on SAE and ELM

    Directory of Open Access Journals (Sweden)

    Yachun Mao

    2017-01-01

    Magnesite is an important raw material for extracting magnesium metal and magnesium compounds, and the precision of its grade classification strongly influences the smelting process. Thus, it is increasingly important to determine the grade of magnesite quickly and accurately. In this paper, a method based on a stacked autoencoder (SAE) and an extreme learning machine (ELM) was established for the classification model of magnesite. The stacked autoencoder was first used to reduce the dimension of the magnesite spectrum data, and then a neural network model based on the extreme learning machine was adopted to classify the data. Two improved ELM models, the accuracy extreme learning machine (AELM) and the integrated accuracy ELM (IELM), were employed to build better classification models. The grade classification obtained through traditional methods such as chemical approaches, artificial methods and a BP neural network model was compared with that of this paper. Results showed that the classification model of magnesite ore based on the SAE and ELM is better in terms of speed and accuracy; thus, this paper provides a new way for the grade classification of magnesite ore.
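
The ELM step, a random hidden layer whose output weights are solved in closed form, can be sketched as below. The SAE dimensionality-reduction stage is omitted and the two-grade "spectra" are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, y_onehot, hidden=50):
    """Extreme learning machine: random hidden layer + least-squares output."""
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.pinv(H) @ y_onehot    # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy two-grade data standing in for magnesite spectra (hypothetical).
X = np.vstack([rng.normal(0, 1, (40, 8)), rng.normal(2, 1, (40, 8))])
y = np.array([0] * 40 + [1] * 40)
params = elm_fit(X, np.eye(2)[y])
acc = (elm_predict(X, *params) == y).mean()
```

The closed-form solve is what gives the ELM its speed advantage over iteratively trained networks such as BP.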

  14. AN ADABOOST OPTIMIZED CCFIS BASED CLASSIFICATION MODEL FOR BREAST CANCER DETECTION

    Directory of Open Access Journals (Sweden)

    CHANDRASEKAR RAVI

    2017-06-01

    Classification is a data mining technique used for building a prototype of the data behaviour, using which unseen data can be classified into one of the defined classes. Several researchers have proposed classification techniques, but most of them did not place much emphasis on misclassified instances and storage space. In this paper, a classification model is proposed that takes both into account. The classification model is efficiently developed using a tree structure to reduce the storage complexity and uses a single scan of the dataset. During the training phase, Class-based Closed Frequent ItemSets (CCFIS) are mined from the training dataset in the form of a tree structure. The classification model is developed using the CCFIS and a similarity measure based on the Longest Common Subsequence (LCS). Further, the Particle Swarm Optimization algorithm is applied to the generated CCFIS, which assigns weights to the itemsets and their associated classes. Most classifiers correctly classify the common instances but misclassify the rare instances. In view of that, the AdaBoost algorithm is used to boost the weights of the instances misclassified in the previous round so as to include them in the training phase and classify the rare instances. This improves the accuracy of the classification model. During the testing phase, the classification model is used to classify the instances of the test dataset. The Breast Cancer dataset from the UCI repository is used for the experiments. Experimental analysis shows that the proposed classification model outperforms the PSOAdaBoost-Sequence classifier by 7% and is superior to other approaches such as the Naïve Bayes classifier, Support Vector Machine classifier, Instance Based classifier, ID3 classifier, J48 classifier, etc.
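
The boosting step alone can be illustrated with scikit-learn's AdaBoost on the Wisconsin breast cancer data bundled with scikit-learn (which may differ from the exact UCI file used in the paper). The CCFIS mining and PSO weighting are not reproduced here:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Each boosting round re-weights the instances misclassified in the
# previous round, forcing later learners to focus on the rare/hard ones.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(Xtr, ytr)
acc = clf.score(Xte, yte)
```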

  15. Comparison of Enzymes / Non-Enzymes Proteins Classification Models Based on 3D, Composition, Sequences and Topological Indices

    OpenAIRE

    Munteanu, Cristian Robert

    2014-01-01

    Comparison of Enzymes / Non-Enzymes Proteins Classification Models Based on 3D, Composition, Sequences and Topological Indices, German Conference on Bioinformatics (GCB), Potsdam, Germany (September, 2007)

  16. Wearable-Sensor-Based Classification Models of Faller Status in Older Adults.

    Science.gov (United States)

    Howcroft, Jennifer; Lemaire, Edward D; Kofman, Jonathan

    2016-01-01

    Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor-based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6-month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, the Community Health Activities Model Program for Seniors questionnaire, and a six-minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling techniques for fall risk classification. Fall risk classification models developed for point-of-care environments should use support vector machines and neural networks, with a multi-sensor single-task gait assessment.
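
A sketch of the winning model type (a multi-layer perceptron) on synthetic stand-in gait features follows. The feature values and dimensions are invented and only the 76/24 class balance mimics the study; the real sensor data are not reproduced:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# 76 non-fallers and 24 fallers with 10 hypothetical gait features each.
X = np.vstack([rng.normal(0, 1, (76, 10)), rng.normal(1.5, 1, (24, 10))])
y = np.array([0] * 76 + [1] * 24)          # 0 = non-faller, 1 = faller

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)
train_acc = mlp.score(X, y)
```

In practice a held-out split or cross-validation, as in the study, is what the reported accuracy, F1, and MCC should come from, not training accuracy.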

  18. Mixture model-based atmospheric air mass classification: a probabilistic view of thermodynamic profiles

    Science.gov (United States)

    Pernin, Jérôme; Vrac, Mathieu; Crevoisier, Cyril; Chédin, Alain

    2017-04-01

    Air mass classification has become an important area in synoptic climatology, simplifying the complexity of the atmosphere by dividing the atmosphere into discrete similar thermodynamic patterns. However, the constant growth of atmospheric databases in both size and complexity implies the need to develop new adaptive classifications. Here, we propose a robust unsupervised and supervised classification methodology of a large thermodynamic dataset, on a global scale and over several years, into discrete air mass groups homogeneous in both temperature and humidity that also provides underlying probability laws. Temperature and humidity at different pressure levels are aggregated into a set of cumulative distribution function (CDF) values instead of classical ones. The method is based on a Gaussian mixture model and uses the expectation-maximization (EM) algorithm to estimate the parameters of the mixture. Spatially gridded thermodynamic profiles come from ECMWF reanalyses spanning the period 2000-2009. Different aspects are investigated, such as the sensitivity of the classification process to both temporal and spatial samplings of the training dataset. Comparisons of the classifications made either by the EM algorithm or by the widely used k-means algorithm show that the former can be viewed as a generalization of the latter. Moreover, the EM algorithm delivers, for each observation, the probabilities of belonging to each class, as well as the associated uncertainty. Finally, a decision tree is proposed as a tool for interpreting the different classes, highlighting the relative importance of temperature and humidity in the classification process.
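
The EM-fitted Gaussian mixture with per-observation membership probabilities can be sketched as follows. The 3-D synthetic profiles stand in for the CDF-aggregated temperature/humidity profiles:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two synthetic air-mass regimes in a 3-D thermodynamic feature space.
profiles = np.vstack([rng.normal(0, 1, (150, 3)), rng.normal(3, 1, (150, 3))])

# EM estimates the mixture parameters; unlike k-means, the fitted model
# also yields soft membership probabilities for every observation.
gmm = GaussianMixture(n_components=2, random_state=0).fit(profiles)
post = gmm.predict_proba(profiles)         # rows sum to 1
hard = post.argmax(axis=1)                 # k-means-like hard assignment
```

Taking the argmax of the posteriors recovers a hard partition, which is the sense in which the abstract calls EM a generalization of k-means.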

  19. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    At present, the clinical diagnosis of depression relies mainly on structured interviews by psychiatrists; the lack of objective diagnostic methods causes a high rate of misdiagnosis. In this paper, a method of depression recognition based on SVM and a mutation particle swarm optimization algorithm is proposed. To address the problem that the particle swarm optimization (PSO) algorithm easily becomes trapped in local optima, we propose a feedback mutation PSO algorithm (FBPSO) that balances local search and global exploration so that the parameters of the classification model are optimal. We compared the depression classification accuracy of different PSO mutation algorithms and found that the support vector machine (SVM) classifier based on the feedback mutation PSO algorithm achieves the highest accuracy. Our study provides an important reference for establishing auxiliary diagnostic tools for depression recognition in clinical diagnosis.
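
The mutation-PSO idea can be sketched on a toy objective. In the paper the objective would be the cross-validated accuracy of an SVM over its parameters, and the mutation rule below is a generic random perturbation, not the exact FBPSO feedback rule:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(p):
    # Stand-in objective; the paper minimizes SVM validation error instead.
    return np.sum((p - np.array([2.0, -1.0])) ** 2)

n, dim = 20, 2
pos = rng.uniform(-5, 5, (n, dim))
vel = np.zeros((n, dim))
pbest = pos.copy()
pbest_val = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()

for it in range(100):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    # Mutation: random perturbation helps particles escape local optima.
    mutate = rng.random(n) < 0.1
    pos[mutate] += rng.normal(0, 0.5, (int(mutate.sum()), dim))
    vals = np.array([fitness(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()
```

Because personal and global bests only update on improvement, the mutation noise cannot degrade the best solution found.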

  20. A model presented for classification of ECG signals based on Case-Based Reasoning

    Directory of Open Access Journals (Sweden)

    Elaheh Sayari

    2013-07-01

    Early detection of heart diseases/abnormalities can prolong life and enhance the quality of living through appropriate treatment; thus, classifying cardiac signals helps the immediate diagnosis of heart-beat type in cardiac patients. The present paper utilizes case-based reasoning (CBR) for the classification of ECG signals. Four types of ECG beats (normal beat, congestive heart failure beat, ventricular tachyarrhythmia beat and atrial fibrillation beat) obtained from the PhysioBank database were classified by the proposed CBR model. The main purpose of this article is to classify heart signals and diagnose the type of heart beat in cardiac patients; in the proposed CBR system, training and testing data have been used for diagnosing and classifying types of heart beat. The evaluation results show that the proposed model has high accuracy in classifying heart signals and supports clinical decisions for diagnosing the type of heart beat in cardiac patients, which has a high impact on computer-aided diagnosis of heart-beat type.
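
The retrieve-and-reuse core of CBR can be sketched as a nearest-case lookup. The case features and the two beat types below are synthetic placeholders for real ECG features:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature vectors for stored ECG "cases" of two beat types.
case_base = np.vstack([rng.normal(0, 1, (30, 6)), rng.normal(3, 1, (30, 6))])
case_labels = np.array(["normal"] * 30 + ["afib"] * 30)

def retrieve(query, k=5):
    """Retrieve the k most similar stored cases and reuse their majority label."""
    dist = np.linalg.norm(case_base - query, axis=1)
    nearest = case_labels[np.argsort(dist)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return str(values[counts.argmax()])
```

A full CBR cycle would also revise the proposed solution and retain the solved query as a new case; only the retrieval step is shown.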

  1. Latent classification models

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2005-01-01

    parametric family of distributions.  In this paper we propose a new set of models for classification in continuous domains, termed latent classification models. The latent classification model can roughly be seen as combining the Naïve Bayes model with a mixture of factor analyzers, thereby relaxing the assumptions...... classification model, and we demonstrate empirically that the accuracy of the proposed model is significantly higher than the accuracy of other probabilistic classifiers....

  2. Prediction of different types of liver diseases using rule based classification model.

    Science.gov (United States)

    Kumar, Yugal; Sahoo, G

    2013-01-01

    Diagnosing different types of liver diseases clinically is quite a hectic process because patients have to undergo a large number of independent laboratory tests. On the basis of the results and analysis of the laboratory tests, different liver diseases are classified. Hence, to simplify this complex process, we have developed a Rule Based Classification Model (RBCM) to predict different types of liver diseases. The proposed model is a combination of rules and different data mining techniques. The objective of this paper is to propose a rule-based classification model with machine learning techniques for the prediction of different types of liver diseases. A dataset was developed with twelve attributes that include the records of 583 patients, of whom 441 were male and the rest female. Support Vector Machine (SVM), Rule Induction (RI), Decision Tree (DT), Naive Bayes (NB) and Artificial Neural Network (ANN) data mining techniques with the K-fold cross-validation technique are used with the proposed model for the prediction of liver diseases. The performance of these data mining techniques is evaluated with accuracy, sensitivity, specificity and kappa parameters, and statistical techniques (ANOVA and the Chi-square test) are used to analyze the liver disease dataset and the independence of attributes. Out of 583 patients, 416 are affected by liver disease and the remaining 167 are healthy. The proposed model with the decision tree (DT) technique provides the best result among all techniques (RI, SVM, ANN and NB) on all parameters (Accuracy 98.46%, Sensitivity 95.7%, Specificity 95.28% and Kappa 0.983), while SVM exhibits the poorest performance (Accuracy 82.33%, Sensitivity 68.03%, Specificity 91.28% and Kappa 0.801). It is also found that the best performance of the model without rules (RI, Accuracy 82.68%, Sensitivity 86.34%, Specificity 90.51% and Kappa 0.619) is almost similar to the worst performance of the rule based classification model (SVM, Accuracy 82
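
The decision-tree-with-K-fold evaluation can be sketched as below. The synthetic dataset only mimics the 416/167 class split; the real 583-patient, twelve-attribute dataset and the rule base are not reproduced:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-in for the twelve-attribute liver dataset (583 patients).
X = np.vstack([rng.normal(0, 1, (416, 12)), rng.normal(1, 1, (167, 12))])
y = np.array([1] * 416 + [0] * 167)        # 1 = liver disease, 0 = healthy

dt = DecisionTreeClassifier(random_state=0)
scores = cross_val_score(dt, X, y, cv=10)  # 10-fold cross-validation
```

Sensitivity, specificity, and kappa can be computed the same way by passing the corresponding `scoring` argument or a confusion matrix per fold.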

  3. Application of a niche-based model for forest cover classification

    Directory of Open Access Journals (Sweden)

    Amici V

    2012-05-01

    In recent years, a surge of interest in biodiversity conservation has led to the development of new approaches to facilitate ecologically-based conservation policies and management plans. In particular, image classification and predictive distribution modeling applied to forest habitats constitute a crucial issue, as forests are the most widespread vegetation type and play a key role in ecosystem functioning. The general purpose of this study is therefore to develop a framework that, in the absence of large amounts of field data for large areas, allows selection of the most appropriate classification. In some cases a hard division of classes is required, especially as support to environmental policies; despite this, it is necessary to take into account the problems which derive from a crisp view of the ecological entities being mapped, since habitats are expected to be structurally complex and to vary continuously within a landscape. In this paper, a niche model (MaxEnt), generally used to estimate species/habitat distributions, has been applied to classify forest cover in a complex Mediterranean area and to estimate the probability distribution of four forest types, producing continuous maps of forest cover. Using the obtained models to validate a crisp classification highlighted that crisp classification, which is continuously used in landscape research and planning, is not free from drawbacks, as it shows a high degree of inner variability. The modeling approach followed by this study, taking into account the uncertainty proper to natural ecosystems and the use of environmental variables in land cover classification, may represent a useful approach for making field inventories more efficient and effective and for developing effective forest conservation policies.

  4. Object-Based Mangrove Species Classification Using Unmanned Aerial Vehicle Hyperspectral Images and Digital Surface Models

    Directory of Open Access Journals (Sweden)

    Jingjing Cao

    2018-01-01

    Mangroves are one of the most important coastal wetland ecosystems, and the compositions and distributions of mangrove species are essential for conservation and restoration efforts. Many studies have explored this topic using remote sensing images obtained by satellite-borne and airborne sensors, which are known to be efficient for monitoring the mangrove ecosystem. With improvements in carrier platforms and sensor technology, unmanned aerial vehicles (UAVs) with hyperspectral images of high resolution in both spectral and spatial domains have been used to monitor crops, forests, and other landscapes of interest. This study aims to classify mangrove species on Qi’ao Island using object-based image analysis techniques based on UAV hyperspectral images obtained from a commercial hyperspectral imaging sensor (UHD 185) onboard a UAV platform. First, image objects were obtained by segmenting the UAV hyperspectral image and the UAV-derived digital surface model (DSM) data. Second, spectral features, textural features, and vegetation indices (VIs) were extracted from the UAV hyperspectral image, and the UAV-derived DSM data were used to extract height information. Third, the classification and regression tree (CART) method was used to select bands, and the correlation-based feature selection (CFS) algorithm was employed for feature reduction. Finally, the objects were classified into different mangrove species and other land covers based on their spectral and spatial characteristic differences. The classification results showed that when considering the three features (spectral features, textural features, and hyperspectral VIs), the overall classification accuracies of the two classifiers used in this paper, i.e., k-nearest neighbor (KNN) and support vector machine (SVM), were 76.12% (Kappa = 0.73) and 82.39% (Kappa = 0.801), respectively. After incorporating tree height into the classification features, the accuracy of species classification

  5. [Eco-value level classification model of forest ecosystem based on modified projection pursuit technique].

    Science.gov (United States)

    Wu, Chengzhen; Hong, Wei; Hong, Tao

    2006-03-01

    To optimize the projection function and projection direction of the projection pursuit technique, simplify its realization process, and overcome its shortcomings of long calculation times and the difficulty of optimizing the projection direction and of computer programming, this paper presents a modified simplex method (MSM) and, based on it, puts forward an eco-value level classification model (EVLCM) of forest ecosystems, which can integrate a multidimensional classification index into a one-dimensional projection value, with a high projection value denoting a high ecosystem services value. Examples of forest ecosystems could be reasonably classified by the new model according to their projection values, suggesting that the EVLCM, driven directly by sample data of forest ecosystems, is simple, feasible, applicable, and easy to operate. The calculating time and the value of the projection function were 34% and 143%, respectively, of those obtained with the traditional projection pursuit technique. This model could be applied extensively to classify and estimate all kinds of non-linear and multidimensional data in ecology, biology, and regional sustainable development.
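The core idea of projection pursuit, as used above, is to collapse a multidimensional index into a one-dimensional projection value along a direction that maximizes a projection index. The sketch below uses plain random search over unit directions and the standard deviation of the projected values as the index; both are simplifying assumptions, not the paper's MSM optimizer or its projection function.

```python
import numpy as np

def best_projection(X, n_trials=2000, seed=1):
    """Search random unit directions and keep the one maximizing a
    simple projection index (here: the std of projected values, a
    stand-in for the paper's projection-pursuit index)."""
    rng = np.random.default_rng(seed)
    best_a, best_idx = None, -np.inf
    for _ in range(n_trials):
        a = rng.normal(size=X.shape[1])
        a /= np.linalg.norm(a)          # restrict to the unit sphere
        z = X @ a                       # one-dimensional projection
        idx = z.std()
        if idx > best_idx:
            best_idx, best_a = idx, a
    return best_a, best_idx

# Toy multidimensional eco-indicator matrix (rows = forest samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 0] *= 5.0                          # one indicator dominates the spread
a, idx = best_projection(X)
z = X @ a                               # projection values used for ranking
```

Samples would then be ranked (and binned into eco-value levels) by their projection value `z`, high values denoting high ecosystem services value.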

  6. Ligand and structure-based classification models for Prediction of P-glycoprotein inhibitors

    DEFF Research Database (Denmark)

    Klepsch, Freya; Poongavanam, Vasanthanathan; Ecker, Gerhard Franz

    2014-01-01

    The ABC transporter P-glycoprotein (P-gp) actively transports a wide range of drugs and toxins out of cells, and is therefore related to multidrug resistance and the ADME profile of therapeutics. Thus, development of predictive in silico models for the identification of P-gp inhibitors is of great......Score resulted in the correct prediction of 61 % of the external test set. This demonstrates that ligand-based models currently remain the methods of choice for accurately predicting P-gp inhibitors. However, structure-based classification offers information about possible drug/protein interactions, which helps...

  7. Generative topographic mapping-based classification models and their applicability domain: application to the biopharmaceutics Drug Disposition Classification System (BDDCS).

    Science.gov (United States)

    Gaspar, Héléna A; Marcou, Gilles; Horvath, Dragos; Arault, Alban; Lozano, Sylvain; Vayer, Philippe; Varnek, Alexandre

    2013-12-23

    Earlier (Kireeva et al. Mol. Inf. 2012, 31, 301-312), we demonstrated that generative topographic mapping (GTM) can be efficiently used both for data visualization and for building classification models in the initial D-dimensional space of molecular descriptors. Here, we describe the modeling in two-dimensional latent space for the four classes of the Biopharmaceutics Drug Disposition Classification System (BDDCS) involving VolSurf descriptors. Three new definitions of the applicability domain (AD) of models have been suggested: one class-independent AD, which considers the GTM likelihood, and two class-dependent ADs considering, respectively, either the predominant class in a given node of the map or informational entropy. The class entropy AD was found to be the most efficient for the BDDCS modeling. The predominant class AD can be directly visualized on GTM maps, which helps the interpretation of the model.
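The class-entropy applicability domain described above amounts to computing the Shannon entropy of the class distribution in each map node: a node dominated by one class has low entropy and can be trusted for prediction. A minimal sketch follows; the per-node class counts and the AD threshold of 1 bit are illustrative assumptions.

```python
import numpy as np

def node_class_entropy(class_counts):
    """Shannon entropy (bits) of the class distribution in one map node;
    low entropy means the node is dominated by a single class."""
    p = np.asarray(class_counts, dtype=float)
    p = p[p > 0]
    p /= p.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical counts of training compounds per BDDCS class in two nodes.
pure_node  = [18, 1, 0, 1]   # dominated by one class -> low entropy
mixed_node = [5, 5, 5, 5]    # maximally mixed        -> high entropy (2 bits)

h_pure = node_class_entropy(pure_node)
h_mixed = node_class_entropy(mixed_node)
in_domain = h_pure < 1.0     # example AD threshold (an assumption)
```

Test compounds mapped to high-entropy nodes would fall outside the applicability domain and their predictions be flagged as unreliable.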

  8. Unsupervised polarimetric SAR urban area classification based on model-based decomposition with cross scattering

    Science.gov (United States)

    Xiang, Deliang; Tang, Tao; Ban, Yifang; Su, Yi; Kuang, Gangyao

    2016-06-01

    Since it has been validated that cross-polarized scattering (HV) is caused not only by vegetation but also by rotated dihedrals, in this study we use rotated dihedral corner reflectors to form a cross-scattering matrix and propose an extended four-component model-based decomposition method for PolSAR data over urban areas. Unlike other urban-area decomposition techniques, which need to discriminate urban from natural areas before decomposition, the proposed method is applied to the PolSAR image directly. The building orientation angle is considered in this scattering matrix, making it flexible and adaptive in the decomposition. Therefore, we can separate the cross scattering of urban areas from the overall HV component. Further, the cross and helix scattering components are also compared. Then, using these decomposed scattering powers, buildings and natural areas can easily be discriminated from each other with a simple unsupervised K-means classifier. Moreover, buildings aligned and not aligned along the radar flight direction can also be distinguished clearly. Spaceborne RADARSAT-2 and airborne AIRSAR fully polarimetric SAR data are used to validate the performance of the proposed method. The cross-scattering power of oriented buildings is generated, leading to a better decomposition result for urban areas with respect to other state-of-the-art urban decomposition techniques. The decomposed scattering powers significantly improve the classification accuracy for urban areas.

  9. SAR Images Statistical Modeling and Classification Based on the Mixture of Alpha-Stable Distributions

    Directory of Open Access Journals (Sweden)

    Fangling Pu

    2013-05-01

    Full Text Available This paper proposes the mixture of Alpha-stable (MAS) distributions for modeling the statistical properties of Synthetic Aperture Radar (SAR) images in a supervised Markovian classification algorithm. Our work is motivated by the fact that natural scenes consist of various reflectors of different types that are typically concentrated within a small area, and SAR images generally exhibit sharp peaks, heavy tails, and even multimodal statistical properties, especially at high resolution. Unimodal distributions do not fit such statistical properties well, and thus a multimodal approach is necessary. Driven by the multimodality and impulsiveness of high-resolution SAR image histograms, we utilize the mixture of Alpha-stable distributions to describe such characteristics. A pseudo-simulated annealing (PSA) estimator based on Markov chain Monte Carlo (MCMC) is presented to efficiently estimate the model parameters of the mixture of Alpha-stable distributions. To validate the proposed PSA estimator, we apply it to simulated data and compare its performance to that of a state-of-the-art estimator. Finally, we exploit the MAS distributions and a Markovian context for SAR image classification. The effectiveness of the proposed classifier is demonstrated by experiments on TerraSAR-X images, which verifies the validity of the MAS distributions for the modeling and classification of SAR images.

  10. Using ELM-based weighted probabilistic model in the classification of synchronous EEG BCI.

    Science.gov (United States)

    Tan, Ping; Tan, Guan-Zheng; Cai, Zi-Xing; Sa, Wei-Ping; Zou, Yi-Qun

    2017-01-01

    Extreme learning machine (ELM) is an effective machine learning technique with a simple theory and fast implementation, which has recently gained increasing interest from various research fields. A new method that combines ELM with a probabilistic model method is proposed in this paper to classify electroencephalography (EEG) signals in a synchronous brain-computer interface (BCI) system. In the proposed method, the softmax function is used to convert the ELM output into a classification probability. The Chernoff error bound, deduced from the Bayesian probabilistic model in the training process, is adopted as the weight in the discriminant process. Since the proposed method makes use of the knowledge from all preceding training datasets, its discriminating performance improves cumulatively. In test experiments based on datasets from BCI competitions, the proposed method is compared with other classification methods, including linear discriminant analysis, support vector machine, ELM, and weighted probabilistic model methods. For comparison, the mutual information, classification accuracy, and information transfer rate are considered as the evaluation indicators for these classifiers. The results demonstrate that our method shows competitive performance against the other methods.
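The softmax step mentioned above, converting raw ELM output-node activations into class probabilities, can be sketched as follows; the four-class activation vector is a hypothetical example, not data from the paper.

```python
import numpy as np

def elm_output_to_prob(raw_outputs):
    """Convert raw ELM output-node activations to class probabilities
    with a numerically stable softmax."""
    z = np.asarray(raw_outputs, dtype=float)
    z = z - z.max()              # stability shift; softmax is shift-invariant
    e = np.exp(z)
    return e / e.sum()

# Hypothetical raw ELM outputs for one EEG trial, four candidate classes.
probs = elm_output_to_prob([2.0, 0.5, -1.0, 0.1])
pred = int(np.argmax(probs))     # predicted class index
```

In the paper these probabilities are then combined with Chernoff-bound-derived weights before the final decision; the sketch covers only the probability-conversion step.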

  11. A model for the evaluation of domain based classification of GPCR.

    Science.gov (United States)

    Kumari, Tannu; Pant, Bhaskar; Pardasani, Kamalraj Raj

    2009-10-11

    G-Protein Coupled Receptors (GPCR) are the largest family of membrane-bound receptors and play a vital role in various biological processes, with their amenability to drug intervention putting them in the spotlight for the pharmaceutical industry. Experimental methods are both time-consuming and expensive, so there is a need to develop a computational classification approach to expedite the drug discovery process. In the present study, a domain-based classification model has been developed by employing and evaluating various machine learning approaches such as Bagging, J48, Bayes net, and Naive Bayes. Various software tools are available for predicting domains, but the results and their accuracy for the same input vary across these tools, creating a dilemma in choosing any one of them. To address this problem, a simulation model has been developed using five well-known domain prediction tools to explore the best predicted result with maximum accuracy. The classifier is developed for classification up to 3 levels for class A. An accuracy of 98.59% by Naïve Bayes for level I, 92.07% by J48 for level II and 82.14% by Bagging for level III has been achieved.

  12. A physiologically-inspired model of numerical classification based on graded stimulus coding

    Directory of Open Access Journals (Sweden)

    John Pearson

    2010-01-01

    Full Text Available In most natural decision contexts, the process of selecting among competing actions takes place in the presence of informative, but potentially ambiguous, stimuli. Decisions about magnitudes—quantities like time, length, and brightness that are linearly ordered—constitute an important subclass of such decisions. It has long been known that perceptual judgments about such quantities obey Weber’s Law, wherein the just-noticeable difference in a magnitude is proportional to the magnitude itself. Current physiologically inspired models of numerical classification assume discriminations are made via a labeled line code of neurons selectively tuned for numerosity, a pattern observed in the firing rates of neurons in the ventral intraparietal area (VIP of the macaque. By contrast, neurons in the contiguous lateral intraparietal area (LIP signal numerosity in a graded fashion, suggesting the possibility that numerical classification could be achieved in the absence of neurons tuned for number. Here, we consider the performance of a decision model based on this analog coding scheme in a paradigmatic discrimination task—numerosity bisection. We demonstrate that a basic two-neuron classifier model, derived from experimentally measured monotonic responses of LIP neurons, is sufficient to reproduce the numerosity bisection behavior of monkeys, and that the threshold of the classifier can be set by reward maximization via a simple learning rule. In addition, our model predicts deviations from Weber Law scaling of choice behavior at high numerosity. Together, these results suggest both a generic neuronal framework for magnitude-based decisions and a role for reward contingency in the classification of such stimuli.

  13. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models

    DEFF Research Database (Denmark)

    Kheir, Rania Bou; Greve, Mogens Humlekrog; Bøcher, Peder Klith

    2010-01-01

    ) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The best constructed classification tree models (three in number) with the lowest misclassification error (ME......) and the lowest number of nodes (N) as well are: (i) the tree (T1) combining all of the parameters (ME = 29.5%; N = 54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME = 31.5%; N = 14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation

  14. A stylistic classification of Russian-language texts based on the random walk model

    Science.gov (United States)

    Kramarenko, A. A.; Nekrasov, K. A.; Filimonov, V. V.; Zhivoderov, A. A.; Amieva, A. A.

    2017-09-01

    A formal approach to text analysis is suggested that is based on the random walk model. The frequencies and reciprocal positions of the vowel letters are matched by a process of quasi-particle migration. Statistically significant differences in the migration parameters are found between texts of different functional styles, demonstrating the possibility of classifying texts with the suggested method. Five groups of texts are singled out that can be distinguished from one another by the parameters of the quasi-particle migration process.

  15. A Classification Model and an Open E-Learning System Based on Intuitionistic Fuzzy Sets for Instructional Design Concepts

    Science.gov (United States)

    Güyer, Tolga; Aydogdu, Seyhmus

    2016-01-01

    This study suggests a classification model and an e-learning system based on this model for all instructional theories, approaches, models, strategies, methods, and technics being used in the process of instructional design that constitutes a direct or indirect resource for educational technology based on the theory of intuitionistic fuzzy sets…

  16. Hidden semi-Markov Model based earthquake classification system using Weighted Finite-State Transducers

    Directory of Open Access Journals (Sweden)

    M. Beyreuther

    2011-02-01

    Full Text Available Automatic earthquake detection and classification is required for efficient analysis of large seismic datasets. Such techniques are particularly important now because access to measures of ground motion is nearly unlimited and the target waveforms (earthquakes are often hard to detect and classify. Here, we propose to use models from speech synthesis which extend the double stochastic models from speech recognition by integrating a more realistic duration of the target waveforms. The method, which has general applicability, is applied to earthquake detection and classification. First, we generate characteristic functions from the time-series. The Hidden semi-Markov Models are estimated from the characteristic functions and Weighted Finite-State Transducers are constructed for the classification. We test our scheme on one month of continuous seismic data, which corresponds to 370 151 classifications, showing that incorporating the time dependency explicitly in the models significantly improves the results compared to Hidden Markov Models.

  17. Prospective identification of adolescent suicide ideation using classification tree analysis: Models for community-based screening.

    Science.gov (United States)

    Hill, Ryan M; Oosterhoff, Benjamin; Kaplow, Julie B

    2017-07-01

    Although a large number of risk markers for suicide ideation have been identified, little guidance has been provided to prospectively identify adolescents at risk for suicide ideation within community settings. The current study addressed this gap in the literature by utilizing classification tree analysis (CTA) to provide a decision-making model for screening adolescents at risk for suicide ideation. Participants were N = 4,799 youth (Mage = 16.15 years, SD = 1.63) who completed both Waves 1 and 2 of the National Longitudinal Study of Adolescent to Adult Health. CTA was used to generate a series of decision rules for identifying adolescents at risk for reporting suicide ideation at Wave 2. Findings revealed 3 distinct solutions with varying sensitivity and specificity for identifying adolescents who reported suicide ideation. Sensitivity of the classification trees ranged from 44.6% to 77.6%. The tree with greatest specificity and lowest sensitivity was based on a history of suicide ideation. The tree with moderate sensitivity and high specificity was based on depressive symptoms, suicide attempts or suicide among family and friends, and social support. The most sensitive but least specific tree utilized these factors and gender, ethnicity, hours of sleep, school-related factors, and future orientation. These classification trees offer community organizations options for instituting large-scale screenings for suicide ideation risk depending on the available resources and modality of services to be provided. This study provides a theoretically and empirically driven model for prospectively identifying adolescents at risk for suicide ideation and has implications for preventive interventions among at-risk youth. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
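The three classification trees above trade sensitivity against specificity. Those two screening metrics are computed from the confusion counts of a binary flagging rule, as in this sketch; the ten-adolescent example and the flagging rule are invented for illustration only.

```python
def screen_metrics(y_true, y_pred):
    """Sensitivity and specificity of a binary screening rule
    (1 = flagged at risk, 0 = not flagged)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)   # share of true cases that are flagged
    specificity = tn / (tn + fp)   # share of non-cases correctly passed
    return sensitivity, specificity

# Toy example: 10 adolescents, 4 of whom later report ideation.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
sens, spec = screen_metrics(y_true, y_pred)
```

A community organization would pick the tree (rule) whose sensitivity/specificity balance matches its resources, exactly the trade-off the study describes.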

  18. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models

    Directory of Open Access Journals (Sweden)

    Jin Dai

    2014-01-01

    Full Text Available The similarity between objects is a core research area of data mining. In order to reduce the interference of the uncertainty of natural language, a similarity measurement between normal cloud models is adopted for text classification research. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed. It can efficiently accomplish the conversion between qualitative concepts and quantitative data. Through the conversion from a text set to a text information table based on the VSM model, the qualitative text concepts extracted from the same category are jumped up into a whole-category concept. According to the cloud similarity between the test text and each category concept, the test text is assigned to the most similar category. Comparison among different text classifiers over different feature selection sets fully demonstrates that not only does CCJU-TC have a strong ability to adapt to different text features, but its classification performance is also better than that of the traditional classifiers.

  19. Steel column base classification

    OpenAIRE

    Jaspart, J.P.; Wald, F.; Weynand, K.; Gresnigt, A.M.

    2008-01-01

    The influence of the rotational characteristics of the column bases on the structural frame response is discussed and specific design criteria for stiffness classification into semi-rigid and rigid joints are derived. The particular case of an industrial portal frame is then considered. Peer reviewed

  20. Model for Detection and Classification of DDoS Traffic Based on Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    D. Peraković

    2017-06-01

    Full Text Available Detection of DDoS (Distributed Denial of Service) traffic is of great importance for protecting the availability of services and other information and communication resources. The research presented in this paper shows the application of artificial neural networks in the development of a detection and classification model for three types of DDoS attacks and legitimate network traffic. Simulation results of the developed model showed an accuracy of 95.6% in the classification of the pre-defined classes of traffic.

  1. Classification of Flying Insects with high performance using improved DTW algorithm based on hidden Markov model

    Directory of Open Access Journals (Sweden)

    S. Arif Abdul Rahuman

    Full Text Available ABSTRACT Insects play a significant role in human life: insects pollinate the major food crops consumed in the world, while insect pests consume and destroy major crops. Hence, to control disease and pests, research is ongoing in the area of entomology using chemical, biological, and mechanical approaches. The data relevant to flying insects often change over time, and classification of such data is a central issue; such time series mining tasks, together with classification, are critical nowadays. Most time series data mining algorithms rely on similarity search, so the time taken for similarity search is the bottleneck, and this neither produces accurate results nor performs well. In this paper, a novel classification method based on the dynamic time warping (DTW) algorithm is proposed. The standard DTW algorithm is deterministic and lacks the ability to model stochastic signals; it is therefore improved here by implementing nonlinear median filtering (NMF). The recognition accuracy of conventional DTW algorithms is lower than that of the hidden Markov model (HMM) under the same voice activity detection (VAD) and noise reduction, with running spectrum filtering (RSF) and dynamic range adjustment (DRA). NMF seeks the median distance of every reference in the time series data, and the recognition accuracy is much improved. In this research work, optical sensors are used to record the sound of insect flight, with invariance to interference from ambient sounds. The implementation of our tool includes two parts: an optical sensor to record the "sound" of insect flight, and software that leverages the sensor information to automatically detect and identify flying insects.
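The classic DTW distance underlying the method above can be written in a few lines of dynamic programming. This is the textbook algorithm, not the paper's NMF-improved variant, and the two "wingbeat envelope" sequences are made up for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences:
    the minimum cumulative |a[i]-b[j]| cost over all monotone alignments."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# Toy wingbeat-envelope stand-ins: same shape, shifted in time.
s1 = [0, 1, 2, 3, 2, 1, 0]
s2 = [0, 0, 1, 2, 3, 2, 1, 0]
d = dtw_distance(s1, s2)   # DTW absorbs the pure time shift
```

A 1-NN classifier under this distance, the usual time-series baseline, would assign a recorded signal to the class of its DTW-nearest reference.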

  2. Forecasting oral absorption across biopharmaceutics classification system classes with physiologically based pharmacokinetic models.

    Science.gov (United States)

    Hansmann, Simone; Darwich, Adam; Margolskee, Alison; Aarons, Leon; Dressman, Jennifer

    2016-12-01

    The aim of this study was (1) to determine how closely physiologically based pharmacokinetic (PBPK) models can predict oral bioavailability using a priori knowledge of drug-specific properties and (2) to examine the influence of the Biopharmaceutics Classification System (BCS) class on simulation success. The Simcyp Simulator, GastroPlus™ and GI-Sim were used. Compounds with published Biowaiver monographs (bisoprolol (BCS I), nifedipine (BCS II), cimetidine (BCS III), furosemide (BCS IV)) were selected to ensure the availability of accurate and reproducible data for all required parameters. Simulation success was evaluated with the average fold error (AFE) and absolute average fold error (AAFE). Parameter sensitivity analysis (PSA) with respect to selected parameters was performed. Plasma concentration-time profiles after intravenous administration were forecast within an AAFE ... BCS class. The reliability of literature permeability data was identified as a key issue in the accuracy of predicting oral drug absorption. For the four drugs studied, it appears that the forecasting accuracy of the PBPK models is related to the BCS class (BCS I > BCS II, BCS III > BCS IV). These results will need to be verified with additional drugs. © 2016 Royal Pharmaceutical Society.

  3. Diagnostic classification based on functional connectivity in chronic pain: model optimization in fibromyalgia and rheumatoid arthritis.

    Science.gov (United States)

    Sundermann, Benedikt; Burgmer, Markus; Pogatzki-Zahn, Esther; Gaubitz, Markus; Stüber, Christoph; Wessolleck, Erik; Heuft, Gereon; Pfleiderer, Bettina

    2014-03-01

    The combination of functional magnetic resonance imaging (fMRI) of the brain with multivariate pattern analysis (MVPA) has been proposed as a possible diagnostic tool. The goal of this investigation was to identify potential functional connectivity (FC) differences in the salience network (SN) and default mode network (DMN) between fibromyalgia syndrome (FMS), rheumatoid arthritis (RA), and healthy controls (HC), and to evaluate the diagnostic applicability of the derived pattern classification approaches. The resting period during an fMRI examination was retrospectively analyzed in women with FMS (n = 17), RA (n = 16), and HC (n = 17). FC was calculated for SN and DMN subregions. Classification accuracies of discriminative MVPA models were evaluated with cross-validation: (1) an inferential test of a single method, and (2) explorative model optimization. No inferentially tested model was able to classify subjects with statistically significant accuracy. However, the diagnostic ability for the differential diagnostic problem exhibited a trend toward significance (accuracy: 69.7%, P = .086). Optimized models in the explorative analysis reached accuracies of up to 73.5% (FMS vs. HC), 78.8% (RA vs. HC), and 78.8% (FMS vs. RA), whereas other models performed at or below chance level. Comparable support vector machine approaches performed above average for all three problems. The observed accuracies are not sufficient to reliably differentiate between FMS and RA for diagnostic purposes. However, some indirect evidence in support of the feasibility of this approach is provided. This exploratory analysis constitutes a fundamental model optimization effort to build on in further investigations. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.

  4. An improved model for sensible heat flux estimation based on landcover classification

    Science.gov (United States)

    Zhou, Ti; Xin, Xiaozhou; Jiao, Jingjun; Peng, Zhiqing

    2014-10-01

    Remote sensing (RS) has been recognized as the most feasible means to provide spatially distributed regional evapotranspiration (ET). However, classical RS flux algorithms (SEBS, S-SEBI, SEBAL, etc.) can hardly be used with coarser-resolution RS data from sensors like MODIS or AVHRR because they do not consider surface heterogeneity in mixed pixels, even though they are suitable for assessing surface fluxes with high-resolution RS data. A new model named FAFH is developed in this study to enhance the accuracy of flux estimation in mixed pixels based on high-resolution landcover classification data. The area fraction and relative sensible heat fraction of each heterogeneous land use type within a coarse-resolution pixel are calculated first, and then used for the weighted averaging of the modified sensible heat. The study is carried out in the core agricultural land of Zhangye, in the middle reaches of the Heihe River, based on the flux and landcover classification products of HJ-1B from our earlier work. The results indicate that FAFH increases the accuracy of sensible heat estimation by 5% in absolute terms and 10.64% in relative terms over the whole research area.
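The aggregation step described above, weighting each land-use type's sensible heat by its area fraction inside the coarse pixel, reduces to an area-weighted mean. The sketch below illustrates that arithmetic; the land-cover types, fractions, and flux values are hypothetical, and the paper's relative sensible heat fractions are simplified away.

```python
import numpy as np

# Hypothetical composition of one coarse (e.g. 1 km) pixel: area fractions
# from the fine-resolution classification, and per-type sensible heat flux
# H (W m^-2) from the high-resolution retrieval.
area_fraction = np.array([0.6, 0.3, 0.1])      # cropland, orchard, built-up
H_per_type    = np.array([120.0, 90.0, 200.0])

# FAFH-style aggregation: area-weighted mean over the mixed pixel.
H_coarse = float(np.sum(area_fraction * H_per_type))
# 0.6*120 + 0.3*90 + 0.1*200 = 119.0 W m^-2
```

Treating the pixel as homogeneous cropland would instead give 120 W m^-2; over many mixed pixels such biases accumulate, which is the error the model targets.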

  5. Dynamic Latent Classification Model

    DEFF Research Database (Denmark)

    Zhong, Shengtong; Martínez, Ana M.; Nielsen, Thomas Dyhre

    Monitoring a complex process often involves keeping an eye on hundreds or thousands of sensors to determine whether or not the process is under control. We have been working with dynamic data from an oil production facility in the North sea, where unstable situations should be identified as soon...... as possible. Motivated by this problem setting, we propose a generative model for dynamic classification in continuous domains. At each time point the model can be seen as combining a naive Bayes model with a mixture of factor analyzers (FA). The latent variables of the FA are used to capture the dynamics...... in the process as well as modeling dependences between attributes....

  6. In silico screening of estrogen-like chemicals based on different nonlinear classification models.

    Science.gov (United States)

    Liu, Huanxiang; Papa, Ester; Walker, John D; Gramatica, Paola

    2007-07-01

    Increasing concern is being shown by the scientific community, government regulators, and the public about endocrine-disrupting chemicals that are adversely affecting human and wildlife health through a variety of mechanisms. There is a great need for an effective means of rapidly assessing endocrine-disrupting activity, especially estrogen-simulating activity, because of the large number of such chemicals in the environment. In this study, quantitative structure activity relationship (QSAR) models were developed to quickly and effectively identify possible estrogen-like chemicals based on 232 structurally-diverse chemicals (training set) by using several nonlinear classification methodologies (least-square support vector machine (LS-SVM), counter-propagation artificial neural network (CP-ANN), and k nearest neighbour (kNN)) based on molecular structural descriptors. The models were externally validated with 87 chemicals (prediction set) not included in the training set. All three methods give satisfactory prediction results for both the training and prediction sets, and the most accurate model was obtained with the LS-SVM approach. In addition, our model was also applied to about 58,000 discrete organic chemicals; about 76% were predicted not to bind to the estrogen receptor. The obtained results indicate that the proposed QSAR models are robust, widely applicable, and could provide a feasible and practical tool for the rapid screening of potential estrogens.

  7. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Science.gov (United States)

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a recently developed machine learning method: the Support Vector Machine (SVM). A hybrid model that combines genetic algorithms and support vector machines is therefore suggested in such a way that, when using the SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933
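The GA-wrapper idea above evolves binary feature masks and scores each mask with a classifier. In the minimal sketch below, the SVM cross-validation fitness is replaced with a cheap stand-in (summed |correlation| of the selected features with the label, minus a size penalty); the data, penalty, and GA settings are all illustrative assumptions.

```python
import numpy as np

def ga_select(X, y, pop=30, gens=40, seed=0):
    """Tiny genetic algorithm over binary feature masks. Fitness is a
    stand-in (sum of |correlation| with y, minus a subset-size penalty)
    for the SVM cross-validation accuracy used in the paper."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]

    def fitness(mask):
        if mask.sum() == 0:
            return -1.0
        cors = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in np.where(mask)[0]]
        return float(np.sum(cors)) - 0.1 * mask.sum()   # prefer small subsets

    population = rng.integers(0, 2, size=(pop, n_feat)).astype(bool)
    for _ in range(gens):
        scores = np.array([fitness(m) for m in population])
        parents = population[np.argsort(scores)[::-1][: pop // 2]]  # elitism
        children = parents.copy()
        children ^= rng.random(children.shape) < 0.1    # bit-flip mutation
        population = np.vstack([parents, children])
    scores = np.array([fitness(m) for m in population])
    return population[np.argmax(scores)]

# Toy data: only features 0 and 1 carry signal about the class label.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200).astype(float)
X = rng.normal(size=(200, 6))
X[:, 0] += 2.0 * y
X[:, 1] -= 2.0 * y
best_mask = ga_select(X, y)      # should keep features 0 and 1
```

Swapping `fitness` for a cross-validated SVM score recovers the paper's hybrid scheme at the cost of far more computation per generation.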

  8. Prototype-based Models for the Supervised Learning of Classification Schemes

    Science.gov (United States)

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2017-06-01

    An introduction is given to the use of prototype-based models in supervised machine learning. The main concept of the framework is to represent previously observed data in terms of so-called prototypes, which reflect typical properties of the data. Together with a suitable, discriminative distance or dissimilarity measure, prototypes can be used for the classification of complex, possibly high-dimensional data. We illustrate the framework in terms of the popular Learning Vector Quantization (LVQ). Most frequently, standard Euclidean distance is employed as the distance measure. We discuss how LVQ can be equipped with more general dissimilarities. Moreover, we introduce relevance learning as a tool for the data-driven optimization of parameterized distances.
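The basic LVQ1 update the framework builds on is short enough to sketch directly: the winning prototype is attracted to a same-class sample and repelled by a different-class sample. The two-Gaussian toy data, learning rate, and prototype initialization below are assumptions for illustration.

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, lr=0.05, epochs=20, seed=0):
    """Basic LVQ1: move the winning prototype toward a sample of the same
    class and away from a sample of a different class."""
    rng = np.random.default_rng(seed)
    W = prototypes.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            d = np.linalg.norm(W - X[i], axis=1)   # Euclidean distances
            w = int(np.argmin(d))                  # winner takes all
            sign = 1.0 if proto_labels[w] == y[i] else -1.0
            W[w] += sign * lr * (X[i] - W[w])
    return W

def lvq1_predict(X, W, proto_labels):
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return proto_labels[np.argmin(d, axis=1)]

# Two Gaussian classes; one prototype per class, initialized off-center.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.3, (30, 2)),
               rng.normal([2, 2], 0.3, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
labels = np.array([0, 1])
W = lvq1_train(X, y, np.array([[0.5, 0.5], [1.5, 1.5]]), labels)
acc = float((lvq1_predict(X, W, labels) == y).mean())
```

Replacing the Euclidean norm here with a parameterized dissimilarity, and learning its parameters from the data, is exactly the relevance-learning extension the record describes.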

  9. Objects Classification by Learning-Based Visual Saliency Model and Convolutional Neural Network

    Directory of Open Access Journals (Sweden)

    Na Li

    2016-01-01

    Full Text Available Humans can easily classify different kinds of objects, whereas this is quite difficult for computers. As a hot and difficult problem, object classification has been receiving extensive interest, with broad prospects. Inspired by neuroscience, the concept of deep learning was proposed, and the convolutional neural network (CNN), as one of its methods, can be used to solve classification problems. But most deep learning methods, including CNN, ignore the human visual information processing mechanism at work when a person classifies objects. Therefore, in this paper, inspired by the complete process by which humans classify different kinds of objects, we put forward a new classification method which combines a visual attention model and a CNN. Firstly, we use the visual attention model to simulate the human visual selection mechanism. Secondly, we use the CNN to simulate how humans select features and to extract the local features of the selected areas. Finally, our classification method not only depends on those local features but also adds human semantic features to classify objects. Our classification method has apparent advantages from a biological standpoint. Experimental results demonstrate that our method improves classification efficiency significantly.

  10. A unified model for context-based behavioural modelling and classification.

    CSIR Research Space (South Africa)

    Dabrowski, JJ

    2015-11-01

    Full Text Available the continuous dynamics of an entity and incorporating various contextual elements that influence behaviour. The entity is classified according to its behaviour. Classification is expressed as a conditional probability of the entity class given its tracked...

  11. Churn classification model for local telecommunication company ...

    African Journals Online (AJOL)

    ... model based on the Rough Set Theory to classify customer churn. The results of the study show that the proposed Rough Set classification model outperforms the existing models and contributes to significant accuracy improvement. Keywords: customer churn; classification model; telecommunication industry; data mining;

  12. Assimilation of a knowledge base and physical models to reduce errors in passive-microwave classifications of sea ice

    Science.gov (United States)

    Maslanik, J. A.; Key, J.

    1992-01-01

    An expert system framework has been developed to classify sea ice types using satellite passive microwave data, an operational classification algorithm, spatial and temporal information, ice types estimated from a dynamic-thermodynamic model, output from a neural network that detects the onset of melt, and knowledge about season and region. The rule base imposes boundary conditions upon the ice classification, modifies parameters in the ice algorithm, determines a `confidence' measure for the classified data, and under certain conditions, replaces the algorithm output with model output. Results demonstrate the potential power of such a system for minimizing overall error in the classification and for providing non-expert data users with a means of assessing the usefulness of the classification results for their applications.

  13. Quantitative Outline-based Shape Analysis and Classification of Planetary Craterforms using Supervised Learning Models

    Science.gov (United States)

    Slezak, Thomas Joseph; Radebaugh, Jani; Christiansen, Eric

    2017-10-01

    The shapes of craterforms on planetary surfaces provide rich information about their origins and evolution. While morphology offers rich visual clues to geologic processes and properties, this information is difficult to communicate quantitatively. This study examines the morphology of craterforms using the quantitative outline-based shape methods of geometric morphometrics, commonly used in biology and paleontology. We examine and compare landforms on planetary surfaces using shape, a property of morphology that is invariant to translation, rotation, and size. We quantify the shapes of paterae on Io, martian calderas, terrestrial basaltic shield calderas, terrestrial ash-flow calderas, and lunar impact craters using elliptic Fourier analysis (EFA) and the Zahn and Roskies (Z-R) shape function, or tangent angle approach, to produce multivariate shape descriptors. These shape descriptors are subjected to multivariate statistical analysis, including canonical variate analysis (CVA), a multiple-comparison variant of discriminant analysis, to investigate the link between craterform shape and classification. Paterae on Io are most similar in shape to terrestrial ash-flow calderas, and the shapes of terrestrial basaltic shield volcanoes are most similar to martian calderas. The shapes of lunar impact craters, including simple, transitional, and complex morphologies, are classified with a 100% success rate in all models. Multiple CVA models effectively predict and classify different craterforms using shape-based identification and demonstrate significant potential for use in the analysis of planetary surfaces.
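
    The outline-based quantification can be illustrated with a simplified complex Fourier descriptor. This is not the full Kuhl-Giardina EFA used in the study: the closed outline is treated as a complex signal x + iy, and the magnitudes of its low harmonics, normalised by the first, give a translation-, scale- and rotation-invariant shape vector.

```python
import numpy as np

def fourier_shape_descriptor(contour, n_harmonics=8):
    """Invariant outline descriptor from positive and negative harmonics."""
    z = contour[:, 0] + 1j * contour[:, 1]
    coeffs = np.fft.fft(z) / len(z)
    # DC term (translation) is dropped; taking magnitudes discards rotation/phase
    harm = np.concatenate([coeffs[1:n_harmonics + 1], coeffs[-n_harmonics:]])
    return np.abs(harm) / np.abs(coeffs[1])   # first harmonic sets the scale

# sample a circle and an ellipse as closed outlines
t = np.linspace(0, 2 * np.pi, 128, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])
ellipse = np.column_stack([2 * np.cos(t), np.sin(t)])
d_circle = fourier_shape_descriptor(circle)
d_ellipse = fourier_shape_descriptor(ellipse)
d_big_circle = fourier_shape_descriptor(3.0 * circle + 5.0)  # shifted, scaled copy
```

    Descriptor vectors of this kind are what would then be fed to CVA or another discriminant analysis.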

  14. Interpretable exemplar-based shape classification using constrained sparse linear models.

    Science.gov (United States)

    Sigurdsson, Gunnar A; Yang, Zhen; Tran, Trac D; Prince, Jerry L

    2015-02-01

    Many types of diseases manifest themselves as observable changes in the shape of the affected organs. Using shape classification, we can look for signs of disease and discover relationships between diseases. We formulate the problem of shape classification in a holistic framework that utilizes a lossless scalar field representation and a non-parametric classification based on sparse recovery. This framework generalizes over certain classes of unseen shapes while using the full information of the shape, bypassing feature extraction. The output of the method is the class whose combination of exemplars most closely approximates the shape, and furthermore, the algorithm returns the most similar exemplars along with their similarity to the shape, which makes the result simple to interpret. Our results show that the method offers accurate classification between three cerebellar diseases and controls in a database of cerebellar ataxia patients. For reproducible comparison, promising results are presented on publicly available 2D datasets, including the ETH-80 dataset where the method achieves 88.4% classification accuracy.
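
    The exemplar-based idea is easy to prototype. The sketch below replaces the paper's constrained sparse recovery with a plain per-class least-squares fit (a simplification made for brevity); the class whose exemplars give the smallest reconstruction residual wins, and the returned coefficients show which exemplars contributed, which is what makes the result interpretable.

```python
import numpy as np

def exemplar_classify(train_X, train_y, x):
    """Pick the class whose exemplars best reconstruct x.

    Stand-in for the paper's sparse-recovery formulation: an ordinary
    least-squares fit per class, compared by reconstruction residual.
    """
    best = (None, np.inf, None)
    for c in np.unique(train_y):
        A = train_X[train_y == c].T                 # columns = class exemplars
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        resid = np.linalg.norm(A @ coef - x)
        if resid < best[1]:
            best = (c, resid, coef)                 # coef is the interpretable part
    return best

rng = np.random.default_rng(0)
# five 10-dimensional exemplars per class, centred at 0 and 5
X = np.vstack([rng.normal(m, 1.0, (5, 10)) for m in (0.0, 5.0)])
y = np.array([0] * 5 + [1] * 5)
query = X[2] + rng.normal(0, 0.1, 10)               # near a class-0 exemplar
cls, resid, coef = exemplar_classify(X, y, query)
```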

  15. BCDForest: a boosting cascade deep forest model towards the classification of cancer subtypes based on gene expression data.

    Science.gov (United States)

    Guo, Yang; Liu, Shuhui; Li, Zhanhuai; Shang, Xuequn

    2018-04-11

    The classification of cancer subtypes is of great importance to cancer diagnosis and therapy. Many supervised learning approaches have been applied to cancer subtype classification in the past few years, especially deep learning-based approaches. Recently, the deep forest model has been proposed as an alternative to deep neural networks that learns hyper-representations by using cascaded ensembles of decision trees. The deep forest model has been shown to perform competitively with, and sometimes better than, deep neural networks. However, the standard deep forest model may face overfitting and ensemble-diversity challenges when dealing with small-sample-size, high-dimensional biological data. In this paper, we propose a deep learning model, called BCDForest, to address cancer subtype classification on small-scale biological datasets; it can be viewed as a modification of the standard deep forest model. BCDForest differs from the standard deep forest model in two main ways. First, a multi-class-grained scanning method is proposed to train multiple binary classifiers to encourage ensemble diversity; meanwhile, the fitting quality of each classifier is considered in representation learning. Second, we propose a boosting strategy to emphasize more important features in the cascade forests, thus propagating the benefits of discriminative features among cascade layers to improve classification performance. Systematic comparison experiments on both microarray and RNA-Seq gene expression datasets demonstrate that our method consistently outperforms state-of-the-art methods in cancer subtype classification. The multi-class-grained scanning and boosting strategy in our model provide an effective solution to ease the overfitting challenge and improve the robustness of the deep forest model on small-scale data. Our model provides a useful approach to the classification of cancer subtypes.
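
    The cascade structure underlying the deep forest (without BCDForest's multi-class-grained scanning or boosting additions) can be sketched with scikit-learn. For brevity this sketch also skips the out-of-fold probability estimation that the full model uses to curb overfitting; the data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

def cascade_forest(X_tr, y_tr, X_te, n_layers=3, seed=0):
    """Each layer appends two forests' class-probability vectors to the raw
    features for the next layer; the last layer's averaged probabilities
    give the prediction."""
    aug_tr, aug_te = X_tr, X_te
    for layer in range(n_layers):
        rf = RandomForestClassifier(n_estimators=50, random_state=seed + layer).fit(aug_tr, y_tr)
        et = ExtraTreesClassifier(n_estimators=50, random_state=seed + layer).fit(aug_tr, y_tr)
        p_tr = np.hstack([rf.predict_proba(aug_tr), et.predict_proba(aug_tr)])
        p_te = np.hstack([rf.predict_proba(aug_te), et.predict_proba(aug_te)])
        aug_tr = np.hstack([X_tr, p_tr])   # raw features + layer's probabilities
        aug_te = np.hstack([X_te, p_te])
    n_cls = p_te.shape[1] // 2
    return ((p_te[:, :n_cls] + p_te[:, n_cls:]) / 2).argmax(axis=1)

X, y = make_classification(n_samples=400, n_features=20, n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
acc = (cascade_forest(X_tr, y_tr, X_te) == y_te).mean()
```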

  16. BClass: A Bayesian Approach Based on Mixture Models for Clustering and Classification of Heterogeneous Biological Data

    Directory of Open Access Journals (Sweden)

    Arturo Medrano-Soto

    2004-12-01

    Full Text Available Based on mixture models, we present a Bayesian method (called BClass) to classify biological entities (e.g., genes) when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability of each entry belonging to each element (group) in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities of each entry belonging to the groups. The number of groups in the analysis is controlled dynamically by rendering the groups as 'alive' and 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked) plots and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.

  17. Object based classification of high resolution data in urban areas considering digital surface models

    OpenAIRE

    Oczipka, Martin Eckhard

    2010-01-01

    Over the last couple of years more and more analogue airborne cameras were replaced by digital cameras. Digitally recorded image data have significant advantages over film-based data. Digital aerial photographs have a much better radiometric resolution, and image information can be acquired in shaded areas too. This information is essential for a stable and continuous classification, because no-data or unclassified areas should be kept as small as possible. Considering this technological progress, on...

  18. Classification of Polarimetric SAR Image Based on Support Vector Machine Using Multiple-Component Scattering Model and Texture Features

    Directory of Open Access Journals (Sweden)

    Lamei Zhang

    2010-01-01

    Full Text Available The classification of polarimetric SAR images based on the Multiple-Component Scattering Model (MCSM) and Support Vector Machine (SVM) is presented in this paper. MCSM is a potential decomposition method for general conditions. SVM is a popular tool for machine learning tasks involving classification, recognition, or detection. The scattering powers of the single-bounce, double-bounce, volume, helix, and wire scattering components are extracted from fully polarimetric SAR images. Combining the scattering powers from MCSM with texture features selected from the gray-level co-occurrence matrix (GCM), SVM is used for the classification of the polarimetric SAR image. We validate the proposed method using Danish EMISAR L-band fully polarimetric data of the Foulum Area, Denmark. The preliminary result indicates that this method can classify most of the areas correctly.
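
    The co-occurrence texture features mentioned above are simple to compute. The sketch below builds a grey-level co-occurrence matrix for one pixel offset and derives two classic features (contrast and homogeneity) on synthetic patches, since the EMISAR data are not reproduced here.

```python
import numpy as np

def glcm_features(img, levels=8, dx=1, dy=0):
    """Grey-level co-occurrence matrix and two Haralick-style features
    (contrast, homogeneity) for a single pixel offset (dx, dy)."""
    q = (img * levels / (img.max() + 1e-9)).astype(int).clip(0, levels - 1)
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for r in range(h - dy):
        for c in range(w - dx):
            glcm[q[r, c], q[r + dy, c + dx]] += 1   # count level pairs
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = (glcm * (i - j) ** 2).sum()
    homogeneity = (glcm / (1 + np.abs(i - j))).sum()
    return contrast, homogeneity

rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0, 1, 32), (32, 1))   # smooth gradient patch
noisy = rng.random((32, 32))                       # rough random patch
c_s, h_s = glcm_features(smooth)
c_n, h_n = glcm_features(noisy)
```

    In the paper's pipeline such texture features are stacked with the MCSM scattering powers to form the SVM's input vector.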

  19. A survey of supervised machine learning models for mobile-phone based pathogen identification and classification

    Science.gov (United States)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan

    2017-03-01

    Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time-consuming and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited settings.
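
    This kind of classifier shoot-out is easy to reproduce in spirit with scikit-learn. The Giardia cyst image database is not public, so a standard dataset stands in below; the classifier families, however, match those named in the abstract.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# stand-in dataset; the paper's labelled cyst images are not publicly available
X, y = load_breast_cancer(return_X_y=True)

models = {
    "cubic SVM": make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3)),
    "Gaussian SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "bagged trees": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                                      random_state=0),
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}
# mean 5-fold cross-validated accuracy per model family
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
```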

  20. Dictionary Learning Based Dimensionality Reduction for Classification

    OpenAIRE

    Schnass, Karin; Vandergheynst, Pierre

    2008-01-01

    In this article we present a signal model for classification based on a low dimensional dictionary embedded into the high dimensional signal space. We develop an alternate projection algorithm to find the embedding and the dictionary and finally test the classification performance of our scheme in comparison to Fisher’s LDA.

  1. Modeling Wood Fibre Length in Black Spruce (Picea mariana (Mill.) BSP) Based on Ecological Land Classification

    Directory of Open Access Journals (Sweden)

    Elisha Townshend

    2015-09-01

    Full Text Available Effective planning to optimize the forest value chain requires accurate and detailed information about the resource; however, estimates of the distribution of fibre properties on the landscape are largely unavailable prior to harvest. Our objective was to fit a model of tree-level average fibre length related to ecosite classification and other forest inventory variables depicted at the landscape scale. A series of black spruce increment cores were collected at breast height from trees in nine different ecosite groups within the boreal forest of northeastern Ontario, and processed using standard techniques for maceration and fibre length measurement. Regression tree analysis and random forests were used to fit hierarchical classification models and find the most important predictor variables for the response variable, area-weighted mean stem-level fibre length. Ecosite group was the best predictor in the regression tree. Longer mean fibre length was associated with more productive ecosites that supported faster growth. The explanatory power of the model on the fitted data was good; however, random forests simulations indicated poor generalizability. These results suggest the potential to develop localized models linking wood fibre length in black spruce to landscape-level attributes, and to improve the sustainability of forest management by identifying ideal locations to harvest wood with desirable fibre characteristics.
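
    The regression-tree / random-forest workflow can be sketched as follows. The field data behind the paper are not reproduced here, so a synthetic stand-in is used in which fibre length is driven mainly by an ecosite effect (treated as an ordered numeric code for simplicity) plus a weak age effect.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# synthetic stand-in for the increment-core data
rng = np.random.default_rng(0)
n = 300
ecosite = rng.integers(0, 9, n)                    # nine ecosite groups
age = rng.uniform(40, 120, n)                      # stand age, years
site_effect = np.linspace(2.8, 3.6, 9)[ecosite]    # mm; more productive -> longer
fibre_len = site_effect + 0.002 * age + rng.normal(0, 0.05, n)

X = np.column_stack([ecosite, age])
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, fibre_len)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, fibre_len)
# feature importances indicate which predictor drives the splits
importances = rf.feature_importances_
```

    On data generated this way, ecosite dominates the importance ranking, mirroring the paper's finding that ecosite group was the best predictor.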

  2. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach

    Directory of Open Access Journals (Sweden)

    Yulin Jian

    2017-06-01

    Full Text Available A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Differing from existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of the base kernels are regarded as external parameters of the single-hidden-layer feedforward neural networks (SLFNs). The combination coefficients of the base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian, polynomial, sigmoid, and wavelet kernels) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radial basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  3. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    Science.gov (United States)

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radial basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.
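
    The core of KELM with a composite kernel fits in a few lines: the output weights have the closed form beta = (I/C + K)^(-1) T, where K is the composite kernel matrix and T the one-hot targets. The kernel mixture weights below are fixed by hand purely for illustration, standing in for the coefficients that QPSO would optimize; the data are synthetic.

```python
import numpy as np

def composite_kernel(A, B, weights, gammas):
    """Weighted sum of Gaussian base kernels."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return sum(w * np.exp(-g * sq) for w, g in zip(weights, gammas))

def kelm_fit(X, y, weights, gammas, C=100.0):
    """Closed-form KELM output weights: beta = (I/C + K)^(-1) T."""
    T = np.eye(y.max() + 1)[y]                    # one-hot targets
    K = composite_kernel(X, X, weights, gammas)
    return np.linalg.solve(np.eye(len(X)) / C + K, T)

def kelm_predict(X_tr, beta, X, weights, gammas):
    return composite_kernel(X, X_tr, weights, gammas).dot(beta).argmax(axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.4, (40, 4)) for m in (0.0, 1.5, 3.0)])
y = np.repeat([0, 1, 2], 40)
w, g = [0.7, 0.3], [0.5, 2.0]   # illustrative combination coefficients
beta = kelm_fit(X, y, w, g)
acc = (kelm_predict(X, beta, X, w, g) == y).mean()
```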

  4. Spectral-spatial classification combined with diffusion theory based inverse modeling of hyperspectral images

    Science.gov (United States)

    Paluchowski, Lukasz A.; Bjorgan, Asgeir; Nordgaard, Håvard B.; Randeberg, Lise L.

    2016-02-01

    Hyperspectral imagery opens a new perspective for biomedical diagnostics and tissue characterization. High spectral resolution can give insight into the optical properties of skin tissue. At the same time, however, the amount of collected data presents a challenge when it comes to decomposition into clusters and extraction of useful diagnostic information. In this study, spectral-spatial classification and inverse diffusion modeling were applied to hyperspectral images obtained from a porcine burn model using a hyperspectral push-broom camera. The implemented method takes advantage of spatial and spectral information simultaneously, and provides information about the average optical properties within each cluster. The implemented algorithm allows mapping of the spectral and spatial heterogeneity of the burn injury as well as dynamic changes of spectral properties within the burn area. The combination of statistical and physics-informed tools allowed for initial separation of different burn wounds and further detailed characterization of the injuries at short post-injury times.

  5. Persistent pulmonary subsolid nodules: model-based iterative reconstruction for nodule classification and measurement variability on low-dose CT.

    Science.gov (United States)

    Kim, Hyungjin; Park, Chang Min; Kim, Seong Ho; Lee, Sang Min; Park, Sang Joon; Lee, Kyung Hee; Goo, Jin Mo

    2014-11-01

    To compare the pulmonary subsolid nodule (SSN) classification agreement and measurement variability between filtered back projection (FBP) and model-based iterative reconstruction (MBIR). Low-dose CTs were reconstructed using FBP and MBIR for 47 patients with 47 SSNs. Two readers independently classified SSNs into pure or part-solid ground-glass nodules, and measured the size of the whole nodule and solid portion twice on both reconstruction algorithms. Nodule classification agreement was analyzed using Cohen's kappa and compared between reconstruction algorithms using McNemar's test. Measurement variability was investigated using Bland-Altman analysis and compared with the paired t-test. Cohen's kappa for inter-reader SSN classification agreement was 0.541-0.662 on FBP and 0.778-0.866 on MBIR. Between the two readers, nodule classification was consistent in 79.8 % (75/94) with FBP and 91.5 % (86/94) with MBIR (p = 0.027). The inter-reader measurement variability range was -5.0 to 2.1 mm on FBP and -3.3 to 1.8 mm on MBIR for whole nodule size, and -6.5 to 0.9 mm on FBP and -5.5 to 1.5 mm on MBIR for solid portion size. Inter-reader measurement differences were significantly smaller on MBIR (p = 0.027, whole nodule; p = 0.011, solid portion). MBIR significantly improved SSN classification agreement and reduced measurement variability of both whole nodules and solid portions between readers. • Low-dose CT using the MBIR algorithm improves reproducibility in the classification of SSNs. • MBIR would enable more confident clinical planning according to the SSN type. • Reduced measurement variability on MBIR allows earlier detection of potentially malignant nodules.
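
    The two agreement statistics used in this study are standard and easy to compute. The reader labels and size differences below are made-up illustrations, not the study's data.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# hypothetical nodule classifications by two readers (1 = part-solid, 0 = pure GGN)
reader1 = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
reader2 = [0, 1, 1, 1, 0, 1, 0, 0, 0, 0]
kappa = cohen_kappa_score(reader1, reader2)   # chance-corrected agreement

# Bland-Altman 95% limits of agreement for paired size measurements (mm)
d = np.array([0.4, -0.6, 0.1, -0.3, 0.2, -0.5, 0.0, -0.2])  # reader1 - reader2
loa = (d.mean() - 1.96 * d.std(ddof=1), d.mean() + 1.96 * d.std(ddof=1))
```

    A narrower limits-of-agreement interval on one reconstruction algorithm is exactly what "reduced measurement variability" means here.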

  6. Deep learning-based fine-grained car make/model classification for visual surveillance

    Science.gov (United States)

    Gundogdu, Erhan; Parıldı, Enes Sinan; Solmaz, Berkan; Yücesoy, Veysel; Koç, Aykut

    2017-10-01

    Fine-grained object recognition is a challenging computer vision problem that has recently been addressed by utilizing deep Convolutional Neural Networks (CNNs). Nevertheless, the main disadvantage of classification methods relying on deep CNN models is the need for a considerably large amount of data. In addition, there is relatively little annotated data for real-world applications such as the recognition of car models in a traffic surveillance system. To this end, we concentrate on the classification of fine-grained car makes and/or models for visual scenarios with the help of two different domains. First, a large-scale dataset including approximately 900K images is constructed from a website that lists fine-grained car models. A state-of-the-art CNN model is trained on the constructed dataset according to its labels. The second domain is the set of images collected from a camera integrated into a traffic surveillance system. These images, numbering over 260K, are gathered by a special license plate detection method on top of a motion detection algorithm. An appropriately sized image patch is cropped from the region of interest provided by the detected license plate location. These images and their labels, covering more than 30 classes, are employed to fine-tune the CNN model already trained on the large-scale dataset described above. To fine-tune the network, the last two fully-connected layers are randomly initialized and the remaining layers are fine-tuned on the second dataset. In this work, the transfer of a model learned on a large dataset to a smaller one has been successfully performed by utilizing both the limited annotated data of the traffic field and a large-scale dataset with available annotations. Our experimental results on both the validation dataset and the real field show that the proposed methodology performs favorably against training the CNN model from scratch.

  7. A Classification Method Based on Polarimetric Entropy and GEV Mixture Model for Intertidal Area of PolSAR Image

    Directory of Open Access Journals (Sweden)

    She Xiaoqiang

    2017-10-01

    Full Text Available This paper proposes a classification method for the intertidal area using quad-polarimetric synthetic aperture radar data. A systematic comparison of four well-known multipolarization features is provided so that appropriate features can be selected based on the characteristics of the intertidal area. The analysis shows that the two most powerful multipolarization features are polarimetric entropy and anisotropy. Furthermore, through detailed analysis of the scattering mechanisms underlying the polarimetric entropy, the Generalized Extreme Value (GEV) distribution is employed to describe the statistical characteristics of the intertidal area, based on extreme value theory. A new classification method is then proposed by combining GEV mixture models with the EM algorithm. Finally, experiments are performed on Radarsat-2 quad-polarization data of the Dongtan intertidal area, Shanghai, to validate the method.
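
    A stripped-down version of the GEV-mixture idea can be run with SciPy. The sketch uses a hard ("classification") EM instead of the paper's full EM, and synthetic two-population data in place of Radarsat-2 backscatter.

```python
import numpy as np
from scipy.stats import genextreme

def gev_mixture_cem(x, n_iter=10):
    """Two-component GEV mixture via hard EM: alternate between fitting a
    GEV to each component's points and reassigning points by likelihood."""
    labels = (x > np.median(x)).astype(int)        # initial split at the median
    for _ in range(n_iter):
        params = [genextreme.fit(x[labels == k]) for k in (0, 1)]
        loglik = np.stack([genextreme.logpdf(x, *p) for p in params])
        labels = loglik.argmax(axis=0)             # hard assignment step
    return params, labels

# two synthetic populations (e.g. water vs. mudflat returns)
x = np.concatenate([genextreme.rvs(0.1, loc=0.0, scale=1.0, size=200, random_state=2),
                    genextreme.rvs(0.1, loc=6.0, scale=1.0, size=200, random_state=3)])
params, labels = gev_mixture_cem(x)
```

    The full (soft) EM would instead weight each point by its posterior responsibility, which requires a weighted GEV fit.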

  8. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high-accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  9. Persistent pulmonary subsolid nodules: model-based iterative reconstruction for nodule classification and measurement variability on low-dose CT

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Hyungjin; Kim, Seong Ho; Lee, Sang Min; Lee, Kyung Hee [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University Medical Research Center, Institute of Radiation Medicine, Seoul (Korea, Republic of); Park, Chang Min; Park, Sang Joon; Goo, Jin Mo [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University Medical Research Center, Institute of Radiation Medicine, Seoul (Korea, Republic of); Seoul National University, Cancer Research Institute, Seoul (Korea, Republic of)

    2014-11-15

    To compare the pulmonary subsolid nodule (SSN) classification agreement and measurement variability between filtered back projection (FBP) and model-based iterative reconstruction (MBIR). Low-dose CTs were reconstructed using FBP and MBIR for 47 patients with 47 SSNs. Two readers independently classified SSNs into pure or part-solid ground-glass nodules, and measured the size of the whole nodule and solid portion twice on both reconstruction algorithms. Nodule classification agreement was analyzed using Cohen's kappa and compared between reconstruction algorithms using McNemar's test. Measurement variability was investigated using Bland-Altman analysis and compared with the paired t-test. Cohen's kappa for inter-reader SSN classification agreement was 0.541-0.662 on FBP and 0.778-0.866 on MBIR. Between the two readers, nodule classification was consistent in 79.8 % (75/94) with FBP and 91.5 % (86/94) with MBIR (p = 0.027). The inter-reader measurement variability range was -5.0 to 2.1 mm on FBP and -3.3 to 1.8 mm on MBIR for whole nodule size, and -6.5 to 0.9 mm on FBP and -5.5 to 1.5 mm on MBIR for solid portion size. Inter-reader measurement differences were significantly smaller on MBIR (p = 0.027, whole nodule; p = 0.011, solid portion). MBIR significantly improved SSN classification agreement and reduced measurement variability of both whole nodules and solid portions between readers. (orig.)

  10. Enzymes/non-enzymes classification model complexity based on composition, sequence, 3D and topological indices.

    Science.gov (United States)

    Munteanu, Cristian Robert; González-Díaz, Humberto; Magalhães, Alexandre L

    2008-09-21

    The huge amount of new proteins that need fast enzymatic activity characterization creates demand for protein QSAR theoretical models. The protein parameters that can be used for an enzyme/non-enzyme classification include simpler indices such as composition, sequence, and connectivity indices, also called topological indices (TIs), as well as the computationally expensive 3D descriptors. A comparison of the 3D versus lower-dimension indices has not been reported with respect to their power to discriminate proteins according to enzyme action. A set of 966 proteins (enzymes and non-enzymes) whose structural characteristics are provided by PDB/DSSP files was analyzed with Python/Biopython scripts, STATISTICA and Weka. The list of indices includes, but is not restricted to, pure composition indices (residue fractions), DSSP secondary-structure protein composition and 3D indices (surface and access). We also used mixed indices such as composition-sequence indices (Chou's pseudo-amino acid compositions or coupling numbers), 3D-composition (surface fractions) and DSSP secondary-structure amino acid composition/propensities (obtained with our Prot-2S Web tool). In addition, we extend and test for the first time several classic TIs for Randic's protein sequence Star graphs using our Sequence to Star Graph (S2SG) Python application. All the indices were processed with general discriminant analysis (GDA) models, neural networks (NN) and machine learning (ML) methods, and the results are presented versus complexity, average Shannon information entropy (Sh) and data/method type. This study compares for the first time all these classes of indices to assess the ratios between model accuracy and index/model complexity in enzyme/non-enzyme discrimination. The use of different methods and complexities of data shows that one cannot establish a direct relation between the complexity and the accuracy of the model.
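
    As one concrete example of the simpler indices, the Shannon information entropy of a residue composition (the Sh measure the indices are plotted against) is a one-liner; the sequences below are illustrative.

```python
import numpy as np

def shannon_entropy(seq):
    """Shannon information entropy (bits) of a sequence's residue composition."""
    seq = seq.upper()
    counts = np.array([seq.count(a) for a in set(seq)], dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

h_low = shannon_entropy("AAAAAAAAAA")              # single residue type -> 0 bits
h_high = shannon_entropy("ACDEFGHIKLMNPQRSTVWY")   # uniform over the 20 residues
```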

  11. Building a model for disease classification integration in oncology, an approach based on the national cancer institute thesaurus.

    Science.gov (United States)

    Jouhet, Vianney; Mougin, Fleur; Bréchat, Bérénice; Thiessard, Frantz

    2017-02-07

    Identifying incident cancer cases within a population remains essential for scientific research in oncology. Data produced within electronic health records can be useful for this purpose. Due to the multiplicity of providers, heterogeneous terminologies such as ICD-10 and ICD-O-3 are used for recording oncology diagnoses. To enable disease identification based on these diagnoses, there is a need for integrating disease classifications in oncology. Our aim was to build a model integrating concepts involved in two disease classifications, namely ICD-10 (diagnosis) and ICD-O-3 (topography and morphology), despite their structural heterogeneity. Based on the NCIt, a "derivative" model for linking diagnoses and topography-morphology combinations was defined and built. ICD-O-3 and ICD-10 codes were then used to instantiate classes of the "derivative" model. Links between terminologies obtained through the model were then compared to mappings provided by the Surveillance, Epidemiology, and End Results (SEER) program. The model integrated 42% of neoplasm ICD-10 codes (excluding metastasis), 98% of ICD-O-3 morphology codes (excluding metastasis) and 68% of ICD-O-3 topography codes. For every code instantiating at least one class in the "derivative" model, comparison with the SEER mappings reveals that all mappings were actually available in the model as a link between the corresponding codes. We have proposed a method to automatically build a model for integrating ICD-10 and ICD-O-3 based on the NCIt. The resulting "derivative" model is a machine-understandable resource that enables an integrated view of these heterogeneous terminologies. The NCIt structure and the available relationships can help bridge disease classifications while taking into account their structural and granular heterogeneities. However, (i) inconsistencies exist within the NCIt, leading to misclassifications in the "derivative" model, and (ii) the "derivative" model only integrates a part of ICD-10 and ICD

  12. Feature Classification for Robust Shape-Based Collaborative Tracking and Model Updating

    Directory of Open Access Journals (Sweden)

    C. S. Regazzoni

    2008-09-01

Full Text Available A new collaborative tracking approach is introduced which takes advantage of classified features. The core of this tracker is a single tracker that is able to detect occlusions and classify the features contributing to localizing the object. Features are classified into four classes: good, suspicious, malicious, and neutral. Good features are estimated to be parts of the object with a high degree of confidence. Suspicious ones have a lower, yet still significantly high, degree of confidence of being part of the object. Malicious features are estimated to be generated by clutter, while neutral features carry too much uncertainty to be assigned to the tracked object. When there is no occlusion, the single tracker acts alone, and the feature classification module helps it to overcome distracters such as still objects or light clutter in the scene. When the bounding boxes of two or more tracked moving objects come close enough, the collaborative tracker is activated; it exploits the classified features to localize each object precisely and to update the objects' shape models by reassigning the classified features to the objects. The experimental results show successful tracking compared with a collaborative tracker that does not use classified features. Moreover, the updated object shape models are shown to be more precise.

  13. Emotion models for textual emotion classification

    Science.gov (United States)

    Bruna, O.; Avetisyan, H.; Holub, J.

    2016-11-01

This paper deals with textual emotion classification, which has gained attention in recent years. Emotion classification is used in user experience, product evaluation, national security, and tutoring applications. It attempts to detect the emotional content in the input text and, based on different approaches, establish what kind of emotional content is present, if any. Textual emotion classification is the most difficult modality to handle, since it relies mainly on linguistic resources and introduces many challenges in assigning text to an emotion represented by a proper model. A crucial part of each emotion detector is its emotion model. The focus of this paper is to introduce the emotion models used for classification. Categorical and dimensional models of emotion are explained, and some more advanced approaches are mentioned.

  14. Identifying Chinese Microblog Users With High Suicide Probability Using Internet-Based Profile and Linguistic Features: Classification Model.

    Science.gov (United States)

    Guan, Li; Hao, Bibo; Cheng, Qijin; Yip, Paul Sf; Zhu, Tingshao

    2015-01-01

Traditional offline assessment of suicide probability is time-consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has an advantage in its efficiency and its potential to reach hidden individuals, yet little research has focused on this specific field. The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Nine hundred and nine Chinese microblog users completed an Internet survey; those scoring one SD above the mean of the total Suicide Probability Scale (SPS) score, as well as one SD above the mean in each of the four subscale scores in the participant sample, were labeled as high-risk individuals, respectively. Profile and linguistic features were fed into the two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross-validation, in which both the training set and the test set were generated under the stratified random sampling rule from the whole sample. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, "Screening Efficiency", were adopted to evaluate model effectiveness. Classification performance was generally matched between SLR and RF. Given the best performance of the classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%.
Individuals in China with high suicide
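The evaluation protocol described above (1-SD labeling rule, stratified 5-fold cross-validation of SLR and RF) can be sketched as follows. This is a minimal illustration on synthetic data: the feature matrix, the latent risk proxy, and the model settings are assumptions, not the study's actual data or tuned models.

```python
# Sketch of the study's setup: label users scoring 1 SD above the mean as
# high-risk, then evaluate logistic regression and random forest with
# stratified 5-fold cross-validation. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(909, 20))            # 909 users, 20 profile/linguistic features
risk_score = X[:, 0] + 0.5 * X[:, 1]      # hypothetical latent risk proxy
y = (risk_score > risk_score.mean() + risk_score.std()).astype(int)  # 1-SD rule

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "SLR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
recall = {name: cross_val_score(m, X, y, cv=cv, scoring="recall").mean()
          for name, m in models.items()}
```

Recall is the metric the abstract emphasizes (retrieving over 70% of labeled high-risk individuals); precision and F1 can be obtained the same way via the `scoring` argument.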

  15. Structure based classification for bile salt export pump (BSEP) inhibitors using comparative structural modeling of human BSEP

    Science.gov (United States)

    Jain, Sankalp; Grandits, Melanie; Richter, Lars; Ecker, Gerhard F.

    2017-06-01

    The bile salt export pump (BSEP) actively transports conjugated monovalent bile acids from the hepatocytes into the bile. This facilitates the formation of micelles and promotes digestion and absorption of dietary fat. Inhibition of BSEP leads to decreased bile flow and accumulation of cytotoxic bile salts in the liver. A number of compounds have been identified to interact with BSEP, which results in drug-induced cholestasis or liver injury. Therefore, in silico approaches for flagging compounds as potential BSEP inhibitors would be of high value in the early stage of the drug discovery pipeline. Up to now, due to the lack of a high-resolution X-ray structure of BSEP, in silico based identification of BSEP inhibitors focused on ligand-based approaches. In this study, we provide a homology model for BSEP, developed using the corrected mouse P-glycoprotein structure (PDB ID: 4M1M). Subsequently, the model was used for docking-based classification of a set of 1212 compounds (405 BSEP inhibitors, 807 non-inhibitors). Using the scoring function ChemScore, a prediction accuracy of 81% on the training set and 73% on two external test sets could be obtained. In addition, the applicability domain of the models was assessed based on Euclidean distance. Further, analysis of the protein-ligand interaction fingerprints revealed certain functional group-amino acid residue interactions that could play a key role for ligand binding. Though ligand-based models, due to their high speed and accuracy, remain the method of choice for classification of BSEP inhibitors, structure-assisted docking models demonstrate reasonably good prediction accuracies while additionally providing information about putative protein-ligand interactions.

  16. Fuzzy One-Class Classification Model Using Contamination Neighborhoods

    Directory of Open Access Journals (Sweden)

    Lev V. Utkin

    2012-01-01

Full Text Available A fuzzy classification model is studied in the paper. It is based on the contaminated (robust) model, which produces fuzzy expected risk measures characterizing classification errors. Optimal classification parameters of the models are derived by minimizing the fuzzy expected risk. It is shown that an algorithm for computing the classification parameters reduces to a set of standard support vector machine tasks with weighted data points. Experimental results with synthetic data illustrate the proposed fuzzy model.

  17. Cry-Based Classification of Healthy and Sick Infants Using Adapted Boosting Mixture Learning Method for Gaussian Mixture Models

    Directory of Open Access Journals (Sweden)

    Hesam Farsaie Alaie

    2012-01-01

Full Text Available We make use of the information inside an infant's cry signal in order to identify the infant's psychological condition. Gaussian mixture models (GMMs) are applied to distinguish between healthy full-term and premature infants, and those with specific medical problems available in our cry database. A cry pattern for each pathological condition is created by using the adapted boosting mixture learning (BML) method to estimate mixture model parameters. In the first experiment, the test results demonstrate that the introduced adapted BML method for learning GMMs performs better than a conventional EM-based re-estimation algorithm used as a reference system in the multi-pathological classification task. This newborn cry-based diagnostic system (NCDS) extracted Mel-frequency cepstral coefficients (MFCCs) as the feature vector for cry patterns of newborn infants. In the binary classification experiment, the system discriminated a test infant's cry signal into one of two groups, healthy and pathological, based on MFCCs. The binary classifier achieved a true positive rate of 80.77% and a true negative rate of 86.96%, which show the ability of the system to correctly identify healthy and diseased infants, respectively.
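The core classification scheme above (one GMM per class, assign a sample to the class whose model gives it the higher likelihood) can be sketched briefly. This replaces the paper's adapted boosting mixture learning with plain EM and uses synthetic MFCC-like vectors, so it is an illustration of the decision rule only, not the authors' system.

```python
# One-GMM-per-class likelihood classifier, analogous to the cry-pattern models:
# fit a Gaussian mixture on each class's (synthetic) MFCC-like frames and
# assign a test frame to the class with the higher log-likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
healthy = rng.normal(loc=0.0, scale=1.0, size=(300, 13))   # 13 MFCCs per frame
sick = rng.normal(loc=5.0, scale=1.0, size=(300, 13))

gmm_h = GaussianMixture(n_components=2, random_state=1).fit(healthy)
gmm_s = GaussianMixture(n_components=2, random_state=1).fit(sick)

test = np.vstack([rng.normal(0.0, 1.0, (50, 13)), rng.normal(5.0, 1.0, (50, 13))])
truth = np.array([0] * 50 + [1] * 50)
pred = (gmm_s.score_samples(test) > gmm_h.score_samples(test)).astype(int)
accuracy = (pred == truth).mean()
```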

  18. Latent classification models

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2005-01-01

    of the \\NB classifier. In theproposed model the continuous attributes are described by amixture of multivariate Gaussians, where the conditionaldependencies among the attributes are encoded using latentvariables. We present algorithms for learning both the parametersand the structure of a latent...

  19. Mean – Variance parametric Model for the Classification based on Cries of Babies

    OpenAIRE

    Khalid Nazim S. A; Dr. M.B Sanjay Pande

    2010-01-01

Cry is a feature which makes an individual take care of the infant that has initiated it. It is equally understood that a cry prompts a person to take certain steps. In the present work, we have tried to implement a mathematical model which can classify a cry into its cluster or group based on certain parameters, according to which the cry is classified as normal or abnormal. To corroborate the methodology, we have taken 17 distinguishing features of cry. The implemented mathematical m...

  20. Deep feature extraction and combination for remote sensing image classification based on pre-trained CNN models

    Science.gov (United States)

    Chaib, Souleyman; Yao, Hongxun; Gu, Yanfeng; Amrani, Moussa

    2017-07-01

    Understanding a scene provided by Very High Resolution (VHR) satellite imagery has become a more and more challenging problem. In this paper, we propose a new method for scene classification based on different pre-trained Deep Features Learning Models (DFLMs). DFLMs are applied simultaneously to extract deep features from the VHR image scene, and then different basic operators are applied for features combination extracted with different pre-trained Convolutional Neural Networks (CNN) models. We conduct experiments on the public UC Merced benchmark dataset, which contains 21 different areal categories with sub-meter resolution. Experimental results demonstrate the effectiveness of the proposed method, as compared to several state-of-the-art methods.
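The feature-combination step at the heart of this method can be sketched as follows. Extracting real deep features would require a framework such as PyTorch and the pre-trained networks themselves, so synthetic vectors stand in for CNN activations here, and the operator names are illustrative assumptions.

```python
# Combining features from two "pre-trained CNNs" with basic operators
# (concatenation, sum, element-wise max) before a linear classifier,
# mirroring the paper's pipeline on synthetic stand-in features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n, n_classes = 210, 21                    # UC Merced: 21 scene categories
y = np.repeat(np.arange(n_classes), n // n_classes)
f_a = rng.normal(size=(n, 64)) + y[:, None] * 0.3   # stand-in for "CNN A" features
f_b = rng.normal(size=(n, 64)) + y[:, None] * 0.3   # stand-in for "CNN B" features

combos = {
    "concat": np.hstack([f_a, f_b]),
    "sum": f_a + f_b,
    "max": np.maximum(f_a, f_b),
}
scores = {op: cross_val_score(LogisticRegression(max_iter=2000), X, y, cv=3).mean()
          for op, X in combos.items()}
```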

  1. THE LOW BACKSCATTERING OBJECTS CLASSIFICATION IN POLSAR IMAGE BASED ON BAG OF WORDS MODEL USING SUPPORT VECTOR MACHINE

    Directory of Open Access Journals (Sweden)

    L. Yang

    2018-04-01

Full Text Available Due to the forward scattering and blocking of the radar signal, water, bare soil and shadow, termed low backscattering objects (LBOs), often present low backscattering intensity in polarimetric synthetic aperture radar (PolSAR) images. Because the LBOs give rise to similar backscattering intensity and polarimetric responses, spectral-based classifiers such as the Wishart method are inefficient for LBO classification. Although some polarimetric features have been exploited to relieve the confusion, backscattering features are still found to be unstable when the system noise floor varies in the range direction. This paper introduces a simple but effective scene classification method based on the Bag of Words (BoW) model using a Support Vector Machine (SVM) to discriminate the LBOs, without relying on any polarimetric features. In the proposed approach, square windows are first opened around the LBOs adaptively to determine the scene images, and Scale-Invariant Feature Transform (SIFT) points are then detected in the training and test scenes. The detected SIFT features are clustered using K-means to obtain cluster centers serving as the visual word list, and scene images are represented using word frequencies. Finally, an SVM is trained and used to predict new scenes as kinds of LBOs. The proposed method is evaluated on two AIRSAR data sets at C band and L band, including water, bare soil and shadow scenes. The experimental results illustrate the effectiveness of the scene-based method in distinguishing LBOs.
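The BoW pipeline described above (local descriptors, K-means vocabulary, word-frequency histograms, SVM) can be sketched compactly. SIFT extraction from PolSAR scenes is replaced here by synthetic 8-dimensional descriptors, so the descriptor statistics and vocabulary size are assumptions for illustration.

```python
# Bag-of-words scene pipeline: cluster local descriptors into a visual
# vocabulary, represent each scene as a word-frequency histogram, train an SVM.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(3)
def make_scene(center):                   # ~40 local descriptors per scene
    return rng.normal(loc=center, scale=1.0, size=(40, 8))

scenes = [make_scene(0.0) for _ in range(20)] + [make_scene(4.0) for _ in range(20)]
labels = np.array([0] * 20 + [1] * 20)    # e.g. water vs shadow scenes

vocab = KMeans(n_clusters=16, n_init=4, random_state=3).fit(np.vstack(scenes))
def histogram(desc):                      # word-frequency representation
    words = vocab.predict(desc)
    return np.bincount(words, minlength=16) / len(words)

X = np.array([histogram(s) for s in scenes])
clf = SVC(kernel="linear").fit(X, labels)
train_acc = clf.score(X, labels)
```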

  2. Model-based classification of CPT data and automated lithostratigraphic mapping for high-resolution characterization of a heterogeneous sedimentary aquifer.

    Directory of Open Access Journals (Sweden)

    Bart Rogiers

    Full Text Available Cone penetration testing (CPT is one of the most efficient and versatile methods currently available for geotechnical, lithostratigraphic and hydrogeological site characterization. Currently available methods for soil behaviour type classification (SBT of CPT data however have severe limitations, often restricting their application to a local scale. For parameterization of regional groundwater flow or geotechnical models, and delineation of regional hydro- or lithostratigraphy, regional SBT classification would be very useful. This paper investigates the use of model-based clustering for SBT classification, and the influence of different clustering approaches on the properties and spatial distribution of the obtained soil classes. We additionally propose a methodology for automated lithostratigraphic mapping of regionally occurring sedimentary units using SBT classification. The methodology is applied to a large CPT dataset, covering a groundwater basin of ~60 km2 with predominantly unconsolidated sandy sediments in northern Belgium. Results show that the model-based approach is superior in detecting the true lithological classes when compared to more frequently applied unsupervised classification approaches or literature classification diagrams. We demonstrate that automated mapping of lithostratigraphic units using advanced SBT classification techniques can provide a large gain in efficiency, compared to more time-consuming manual approaches and yields at least equally accurate results.
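Model-based clustering, the approach this paper favours over unsupervised alternatives, typically fits Gaussian mixtures and selects the number of soil classes by an information criterion. A minimal sketch on synthetic CPT-style measurements (cone resistance and friction ratio; the values and the BIC-based selection are illustrative assumptions, not the paper's fitted model):

```python
# Model-based clustering of CPT-like measurements, selecting the number of
# soil behaviour types by BIC. Two well-separated synthetic soil classes.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
sand = rng.normal([10.0, 0.5], 0.5, size=(200, 2))   # high resistance, low friction ratio
clay = rng.normal([2.0, 3.0], 0.5, size=(200, 2))    # low resistance, high friction ratio
X = np.vstack([sand, clay])

bics = {k: GaussianMixture(n_components=k, random_state=4).fit(X).bic(X)
        for k in range(1, 5)}
best_k = min(bics, key=bics.get)          # BIC picks the true number of classes
```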

  3. The attractor recurrent neural network based on fuzzy functions: An effective model for the classification of lung abnormalities.

    Science.gov (United States)

    Khodabakhshi, Mohammad Bagher; Moradi, Mohammad Hassan

    2017-05-01

The dynamics of the respiratory system are highly significant when it comes to the detection of lung abnormalities, which highlights the importance of presenting a reliable model for them. In this paper, we introduce a novel dynamic modelling method for the characterization of lung sounds (LS), based on the attractor recurrent neural network (ARNN). The ARNN structure allows the development of an effective LS model. Additionally, it has the capability to reproduce the distinctive features of the lung sounds using its formed attractors. Furthermore, a novel ARNN topology based on fuzzy functions (FFs-ARNN) is developed. Given the utility of recurrence quantification analysis (RQA) as a tool to assess the nature of complex systems, it was used to evaluate the performance of both the ARNN and the FFs-ARNN models. The experimental results demonstrate the effectiveness of the proposed approaches for multichannel LS analysis. In particular, a classification accuracy of 91% was achieved using FFs-ARNN with sequences of RQA features. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Quantitative structure-activity relationship modeling of polycyclic aromatic hydrocarbon mutagenicity by classification methods based on holistic theoretical molecular descriptors.

    Science.gov (United States)

    Gramatica, Paola; Papa, Ester; Marrocchi, Assunta; Minuti, Lucio; Taticchi, Aldo

    2007-03-01

Various polycyclic aromatic hydrocarbons (PAHs), ubiquitous environmental pollutants, are recognized mutagens and carcinogens. A homogeneous set of mutagenicity data (TA98 and TA100, +S9) for 32 benzocyclopentaphenanthrenes/chrysenes was modeled by the quantitative structure-activity relationship classification methods k-nearest neighbor and classification and regression tree, using theoretical holistic molecular descriptors. A genetic algorithm provided the selection of the best subset of variables for modeling mutagenicity. The models were validated by leave-one-out and leave-50%-out approaches and have good performance, with sensitivity and specificity in the range of 90-100%.

  5. High-order feature-based mixture models of classification learning predict individual learning curves and enable personalized teaching.

    Science.gov (United States)

    Cohen, Yarden; Schneidman, Elad

    2013-01-08

    Pattern classification learning tasks are commonly used to explore learning strategies in human subjects. The universal and individual traits of learning such tasks reflect our cognitive abilities and have been of interest both psychophysically and clinically. From a computational perspective, these tasks are hard, because the number of patterns and rules one could consider even in simple cases is exponentially large. Thus, when we learn to classify we must use simplifying assumptions and generalize. Studies of human behavior in probabilistic learning tasks have focused on rules in which pattern cues are independent, and also described individual behavior in terms of simple, single-cue, feature-based models. Here, we conducted psychophysical experiments in which people learned to classify binary sequences according to deterministic rules of different complexity, including high-order, multicue-dependent rules. We show that human performance on such tasks is very diverse, but that a class of reinforcement learning-like models that use a mixture of features captures individual learning behavior surprisingly well. These models reflect the important role of subjects' priors, and their reliance on high-order features even when learning a low-order rule. Further, we show that these models predict future individual answers to a high degree of accuracy. We then use these models to build personally optimized teaching sessions and boost learning.
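The class of models the authors describe, a reinforcement-learning-like mixture over features including high-order ones, can be illustrated with a toy learner on binary sequences. The feature set, the multiplicative update, and the weighted-majority rule below are illustrative assumptions in the spirit of the abstract, not the authors' fitted model.

```python
# A minimal reinforcement-learning-like mixture-of-features learner: each
# feature votes on the class, and weights of features that agreed with the
# feedback are reinforced. The true rule is high-order (pairwise XOR).
import numpy as np

rng = np.random.default_rng(5)
def features(seq):
    # three single-cue features plus one high-order (pairwise XOR) feature
    return np.array([seq[0], seq[1], seq[2], seq[0] ^ seq[1]], dtype=float)

rule = lambda seq: seq[0] ^ seq[1]        # deterministic high-order rule to learn
w = np.ones(4)                            # mixture weights over features

for _ in range(500):
    seq = rng.integers(0, 2, size=3)
    phi = features(seq)
    agree = (phi == rule(seq))            # features whose value matched feedback
    w[agree] *= 1.05                      # reinforce the agreeing features
    w /= w.sum()

top_feature = int(np.argmax(w))           # index 3 is the pairwise XOR feature
```

Because the XOR feature agrees with the feedback on every trial while single cues agree only about half the time, the mixture concentrates on the high-order feature, echoing the paper's finding that subjects rely on high-order features.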

  6. Production process parameter optimization with a new model based on a genetic algorithm and ABC classification method

    Directory of Open Access Journals (Sweden)

    Milan Eric

    2016-08-01

Full Text Available The difference between the production cost and the selling price of products may be viewed as a criterion that determines an organization's competitiveness and market success. In such circumstances, it is necessary to influence these factors in order to maximize this difference. The selling price of products, in modern market conditions, is a category that cannot be significantly affected, so organizations are left with one option: reducing production costs. This is both the motive and the imperative for every business organization. The key parameters that influence the costs of production, and therefore the competitiveness of organizations, are the parameters of the production machines and processes used to create products. To define optimal parameter values for production machines and processes that will reduce production costs and increase the competitiveness of production organizations, the authors have developed a new mathematical model. The model is based on applying the ABC classification method to classify production line processes by their costs and a genetic algorithm to find the optimal values of the production machine parameters used in these processes. It has been applied to three different modern production line processes; the costs obtained by applying the model have been compared with the real production costs.
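The ABC classification stage of the model, ranking processes by cost and cutting the cumulative cost share into A/B/C classes, can be sketched in a few lines. The process names, costs, and the 80%/95% cut-offs below are illustrative assumptions (the genetic-algorithm parameter search is not reproduced here).

```python
# ABC classification of production processes by cost share: processes covering
# the first ~80% of cumulative cost are class A, the next ~15% class B, the
# remainder class C.
costs = {"milling": 4200, "turning": 2600, "drilling": 900,
         "deburring": 700, "washing": 350, "packing": 250}

total = sum(costs.values())
classes, cumulative = {}, 0.0
for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
    cumulative += cost / total
    classes[name] = "A" if cumulative <= 0.80 else ("B" if cumulative <= 0.95 else "C")
```

With these numbers, milling and turning fall into class A, drilling and deburring into class B, and the two cheapest processes into class C; the optimization effort then concentrates on class A.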

  7. Quality prediction modeling for multistage manufacturing based on classification and association rule mining

    Directory of Open Access Journals (Sweden)

    Kao Hung-An

    2017-01-01

Full Text Available For manufacturing enterprises, product quality is a key factor in assessing production capability and increasing their core competence. To reduce external failure cost, much research and many methodologies have been introduced to improve process yield rate, such as TQC/TQM, the Shewhart Cycle, and Deming's 14 Points. Nowadays, impressive progress has been made in process monitoring and industrial data analysis because of the Industry 4.0 trend. Industries have started to utilize quality control (QC) methodology to lower inspection overhead and internal failure cost. Currently, the focus of QC is mostly on the inspection of single workstations and the final product; however, in multistage manufacturing, many factors (equipment, operators, parameters, etc.) can have cumulative and interactive effects on the final quality. When a failure occurs, it is difficult to restore the original settings for cause analysis. To address these problems, this research proposes a combination of principal component analysis (PCA) with classification and association rule mining algorithms to extract features representing the relationships between multiple workstations, predict final product quality, and analyze the root cause of product defects. The method is demonstrated on a semiconductor data set.
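The PCA-plus-classification part of the proposed combination can be sketched as a pipeline over multistage sensor data. The data, the defect mechanism, and the choice of a decision tree as the classifier are assumptions for illustration; the association-rule root-cause mining is omitted.

```python
# PCA feature extraction over multistage process measurements followed by a
# classifier predicting final product quality, on synthetic data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(6)
X = rng.normal(size=(500, 30))            # e.g. 3 workstations x 10 sensors each
defect = (X[:, 0] + X[:, 12] + 0.3 * rng.normal(size=500)) > 1.0
y = defect.astype(int)                    # 1 = defective final product

model = make_pipeline(PCA(n_components=10),
                      DecisionTreeClassifier(max_depth=4, random_state=6))
model.fit(X, y)
train_acc = model.score(X, y)
```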

  8. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

Gender identification is one of the major problems in speech analysis today: tracing gender from acoustic data such as pitch, median frequency, etc. Machine learning gives promising results for classification problems across research domains, and several performance metrics exist for evaluating algorithms in any area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics for gender classification from acoustic data: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM). The main criterion in evaluating any algorithm is its performance; in classification problems the misclassification rate must be low, which means the accuracy must be high. Location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
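The comparative setup, evaluating the five named classifiers under the same cross-validation, can be sketched briefly. The acoustic features here are synthetic stand-ins (the real study used statistics such as pitch and median frequency), and accuracy is shown as one of the eight metrics.

```python
# Five classifiers from the paper evaluated with identical 5-fold
# cross-validation on synthetic acoustic-style features.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
n = 300
y = rng.integers(0, 2, size=n)            # 0 = male, 1 = female
X = rng.normal(size=(n, 5)) + y[:, None] * 1.5   # mean shift between classes

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=7),
    "RF": RandomForestClassifier(n_estimators=50, random_state=7),
    "SVM": SVC(),
}
accuracy = {name: cross_val_score(m, X, y, cv=5).mean()
            for name, m in models.items()}
best = max(accuracy, key=accuracy.get)
```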

  9. Predicting student satisfaction with courses based on log data from a virtual learning environment – a neural network and classification tree model

    Directory of Open Access Journals (Sweden)

    Ivana Đurđević Babić

    2015-03-01

Full Text Available Student satisfaction with courses in academic institutions is an important issue and is recognized as a form of support in ensuring effective and quality education, as well as enhancing the student course experience. This paper investigates whether there is a connection between student satisfaction with courses and log data on student courses in a virtual learning environment. Furthermore, it explores whether a successful classification model for predicting student satisfaction with a course can be developed based on course log data, and compares the results obtained from the implemented methods. The research was conducted at the Faculty of Education in Osijek and included analysis of log data and course satisfaction on a sample of third- and fourth-year students. Multilayer Perceptron (MLP) networks with different activation functions, Radial Basis Function (RBF) neural networks, and classification tree models were developed, trained and tested in order to classify students into one of two categories of course satisfaction. Type I and type II errors and input variable importance were used for model comparison, alongside classification accuracy. The results indicate that a successful classification model can be created using the tested methods. The MLP model provides the highest average classification accuracy and the lowest rate of misclassification of students with a low level of course satisfaction, although a t-test for the difference in proportions showed that the difference in performance between the compared models is not statistically significant. Student involvement in forum discussions is recognized as a valuable predictor of student satisfaction with courses in all observed models.

  10. Sentiment classification technology based on Markov logic networks

    Science.gov (United States)

    He, Hui; Li, Zhigang; Yao, Chongchong; Zhang, Weizhe

    2016-07-01

With diverse online media emerging, there is growing concern with the sentiment classification problem. At present, text sentiment classification mainly utilizes supervised machine learning methods, which feature a certain domain dependency. On the basis of Markov logic networks (MLNs), this study proposed a cross-domain multi-task text sentiment classification method rooted in transfer learning. Through many-to-one knowledge transfer, labeled text sentiment classification knowledge was successfully transferred into other domains, and the precision of sentiment classification analysis in the text tendency domain was improved. The experimental results revealed the following: (1) the model based on an MLN demonstrated higher precision than the single individual learning plan model; (2) multi-task transfer learning based on Markov logic networks could acquire more knowledge than self-domain learning. The cross-domain text sentiment classification model could significantly improve the precision and efficiency of text sentiment classification.

  11. Beyond dysfunction and threshold-based classification: a multidimensional model of personality disorder diagnosis.

    Science.gov (United States)

    Bornstein, Robert F; Huprich, Steven K

    2011-06-01

An alternative dimensional model of personality disorder (PD) diagnosis that addresses several difficulties inherent in the current DSM conceptualization of PDs (excessive PD overlap and comorbidity, use of arbitrary thresholds to distinguish normal from pathological personality functioning, failure to capture variations in the adaptive value of PD symptoms, and inattention to the impact of situational influences on PD-related behaviors) is outlined. The model uses a set of diagnostician-friendly strategies to render PD diagnosis in three steps: (1) the diagnostician assigns every patient a single dimensional rating of overall level of personality dysfunction on a 50-point continuum; (2) the diagnostician assigns separate intensity and impairment ratings for each PD dimension (e.g., narcissism, avoidance, dependency); and (3) the diagnostician lists any personality traits, including PD-related traits, that enhance adaptation and functioning (e.g., histrionic theatricality, obsessive attention to detail). Advantages of the proposed model for clinicians and clinical researchers are discussed.

  12. Relating FIA data to habitat classifications via tree-based models of canopy cover

    Science.gov (United States)

    Mark D. Nelson; Brian G. Tavernia; Chris Toney; Brian F. Walters

    2012-01-01

    Wildlife species-habitat matrices are used to relate lists of species with abundance of their habitats. The Forest Inventory and Analysis Program provides data on forest composition and structure, but these attributes may not correspond directly with definitions of wildlife habitats. We used FIA tree data and tree crown diameter models to estimate canopy cover, from...

  13. Rapid Identification of Asteraceae Plants with Improved RBF-ANN Classification Models Based on MOS Sensor E-Nose.

    Science.gov (United States)

    Zou, Hui-Qin; Li, Shuo; Huang, Ying-Hua; Liu, Yong; Bauer, Rudolf; Peng, Lian; Tao, Ou; Yan, Su-Rong; Yan, Yong-Hong

    2014-01-01

Plants from the Asteraceae family are widely used as herbal medicines and food ingredients, especially in Asia. Therefore, authentication and quality control of these different Asteraceae plants are important for ensuring consumers' safety and efficacy. In recent decades, the electronic nose (E-nose) has been studied as an alternative approach. In this paper, we aim to develop a novel discriminative model by improving the radial basis function artificial neural network (RBF-ANN) classification model. Feature selection algorithms, including principal component analysis (PCA) and BestFirst + CfsSubsetEval (BC), were applied in the improvement of the RBF-ANN models. Results illustrate that in the improved RBF-ANN models with lower-dimensional data, classification accuracies (100%) remained the same as in the original model with higher-dimensional data. This is the first time feature selection methods have been introduced to obtain valuable information on which MOS sensors are more relevant; in this case, S1, S3, S4, S6, and S7 show better capability to distinguish these Asteraceae plants. This paper also gives insights for further research in this area, for instance, sensor array optimization and performance improvement of the classification model.

  14. SQL based cardiovascular ultrasound image classification.

    Science.gov (United States)

    Nandagopalan, S; Suryanarayana, Adiga B; Sudarshan, T S B; Chandrashekar, Dhanalakshmi; Manjunath, C N

    2013-01-01

This paper proposes a novel method to analyze and classify cardiovascular ultrasound echocardiographic images using a Naïve-Bayesian model via database OLAP-SQL. Efficient data mining algorithms based on a tightly-coupled model are used to extract features. Three algorithms are proposed for classification, namely the Naïve-Bayesian Classifier for Discrete variables (NBCD) with SQL, NBCD with OLAP-SQL, and the Naïve-Bayesian Classifier for Continuous variables (NBCC) using OLAP-SQL. The proposed model is trained with 207 patient images containing normal and abnormal categories. Of the three proposed algorithms, the highest classification accuracy, 96.59%, was achieved by NBCC, which is better than earlier methods.
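The idea of pushing the Naive Bayes counting into SQL, as in the tightly-coupled approach above, can be sketched with SQLite. The table, the single discrete feature, and the labels below are illustrative assumptions (the paper used OLAP-SQL over echocardiographic features), but the decision rule is standard Naive Bayes with Laplace smoothing.

```python
# Naive Bayes with class and feature counts computed by SQL aggregates.
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (wall_motion TEXT, label TEXT)")
conn.executemany("INSERT INTO scans VALUES (?, ?)", [
    ("normal", "healthy"), ("normal", "healthy"), ("normal", "healthy"),
    ("hypokinetic", "abnormal"), ("hypokinetic", "abnormal"), ("normal", "abnormal"),
])

def log_posterior(feature_value, label):
    (n_label,) = conn.execute(
        "SELECT COUNT(*) FROM scans WHERE label = ?", (label,)).fetchone()
    (n_match,) = conn.execute(
        "SELECT COUNT(*) FROM scans WHERE label = ? AND wall_motion = ?",
        (label, feature_value)).fetchone()
    (n_total,) = conn.execute("SELECT COUNT(*) FROM scans").fetchone()
    # log prior + log likelihood with Laplace smoothing to avoid log(0)
    return math.log(n_label / n_total) + math.log((n_match + 1) / (n_label + 2))

def classify(feature_value):
    return max(("healthy", "abnormal"),
               key=lambda lbl: log_posterior(feature_value, lbl))

prediction = classify("hypokinetic")
```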

  15. Models for concurrency: towards a classification

    DEFF Research Database (Denmark)

    Sassone, Vladimiro; Nielsen, Mogens; Winskel, Glynn

    1996-01-01

Models for concurrency can be classified with respect to three relevant parameters: behaviour/system, interleaving/noninterleaving, linear/branching time. When modelling a process, a choice concerning such parameters corresponds to choosing the level of abstraction of the resulting semantics. In this paper, we move a step towards a classification of models for concurrency based on the parameters above. Formally, we choose a representative of each of the eight classes of models obtained by varying the three parameters, and we study the formal relationships between them using the language of category

  16. Classification of research reactors and discussion of thinking of safety regulation based on the classification

    International Nuclear Information System (INIS)

    Song Chenxiu; Zhu Lixin

    2013-01-01

    Research reactors differ in reactor type, use, power level, design principle, operation mode, and safety performance, and they also differ significantly in how nuclear safety regulation applies to them. This paper introduces a classification of research reactors and discusses an approach to safety regulation based on that classification. (authors)

  17. Learning MRI-based classification models for MGMT methylation status prediction in glioblastoma.

    Science.gov (United States)

    Kanas, Vasileios G; Zacharaki, Evangelia I; Thomas, Ginu A; Zinn, Pascal O; Megalooikonomou, Vasileios; Colen, Rivka R

    2017-03-01

    O6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation has been shown to be associated with improved outcomes in patients with glioblastoma (GBM) and may be a predictive marker of sensitivity to chemotherapy. However, determination of MGMT promoter methylation status requires tissue obtained via surgical resection or biopsy. The aim of this study was to assess the ability of quantitative and qualitative imaging variables to predict MGMT methylation status noninvasively. A retrospective analysis of MR images from GBM patients was conducted. Multivariate prediction models were obtained by machine-learning methods and tested on data from The Cancer Genome Atlas (TCGA) database. The status of MGMT promoter methylation was predicted with an accuracy of up to 73.6%. Experimental analysis showed that the edema/necrosis volume ratio, tumor/necrosis volume ratio, edema volume, and tumor location and enhancement characteristics were the most significant variables with respect to MGMT promoter methylation status in GBM. The obtained results provide further evidence of an association between standard preoperative MRI variables and MGMT methylation status in GBM. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  18. General regression and representation model for classification.

    Directory of Open Access Journals (Sweden)

    Jianjun Qian

    Full Text Available Recently, regularized coding-based classification methods (e.g., SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take into account the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of prior information (e.g., the correlations between representation residuals and representation coefficients) and specific information (a weight matrix of image pixels) to enhance classification performance. GRR uses generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: the basic general regression and representation classifier (B-GRR) and the robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.
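
    The regularized-coding baseline that GRR extends (CRC with Tikhonov regularization) can be sketched as follows. This is generic collaborative-representation classification, not the full GRR model: it omits the learned residual correlations and pixel weights, and the dictionary is a toy:

```python
import numpy as np

def crc_classify(D, labels, y, lam=0.1):
    """Collaborative-representation classification: code y over dictionary D
    with a ridge (Tikhonov) penalty, then assign the class whose atoms give
    the smallest reconstruction residual."""
    n = D.shape[1]
    # alpha = argmin ||y - D a||^2 + lam ||a||^2  (closed-form solution)
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    best, best_r = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        r = np.linalg.norm(y - D[:, mask] @ alpha[mask])
        if r < best_r:
            best, best_r = c, r
    return best

D = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1]], float)  # 4 training atoms, 2 per class
labels = [0, 0, 1, 1]
y = np.array([1.0, 0.1, 0.9])       # test sample close to class-0 atoms
print(crc_classify(D, labels, y))   # 0
```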

  19. An Automatic Segmentation and Classification Framework Based on PCNN Model for Single Tooth in MicroCT Images.

    Directory of Open Access Journals (Sweden)

    Liansheng Wang

    Full Text Available Accurate segmentation and classification of the different anatomical structures of teeth from medical images plays an essential role in many clinical applications. Usually, the anatomical structures of teeth are manually labelled by experienced clinical doctors, which is time-consuming. However, automatic segmentation and classification is a challenging task because the anatomical structures and surroundings of the tooth in medical images are rather complex. Therefore, in this paper, we propose an effective framework designed to segment the tooth with a Selective Binary and Gaussian Filtering Regularized Level Set (GFRLS) method, improved by fully utilizing three-dimensional (3D) information, and to classify the tooth by employing an unsupervised-learning Pulse Coupled Neural Network (PCNN) model. To evaluate the proposed method, experiments were conducted on different datasets of mandibular molars; the results show that our method achieves better accuracy and robustness compared to four other state-of-the-art clustering methods.

  20. Nonlinear Inertia Classification Model and Application

    Directory of Open Access Journals (Sweden)

    Mei Wang

    2014-01-01

    Full Text Available The support vector machine (SVM) classification model overcomes the problem of limited sample sizes, but the kernel parameter and the penalty factor have great influence on the quality of the SVM model. Particle swarm optimization (PSO) is an evolutionary search algorithm based on swarm intelligence that is well suited for parameter optimization. Accordingly, a nonlinear inertia convergence classification model (NICCM) is proposed after the nonlinear inertia convergence PSO (NICPSO) is developed in this paper. The velocity of NICPSO is first defined as the weighted velocity of the inertia PSO, with the inertia factor chosen to be a nonlinear function. NICPSO is used to optimize the kernel parameter and penalty factor of the SVM. The NICCM classifier is then trained using the optimal penalty factor and the optimal kernel parameter taken from the optimal particle. Finally, NICCM is applied to classifying the normal state and fault states of an online power cable. Experiments show that the number of iterations the proposed NICPSO needs to reach the optimal position decreases from 15 to 5 compared with PSO; the training duration is decreased by 0.0052 s and the recognition precision is increased by 4.12% compared with SVM.
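
    The parameter-search idea can be sketched with a plain PSO whose inertia weight decays nonlinearly. The exact decay law of NICPSO is not given here, so the form below is an assumption, and a toy quadratic stands in for the SVM cross-validation error over (C, gamma):

```python
import numpy as np

rng = np.random.default_rng(1)

def pso_minimize(f, bounds, n=20, iters=60):
    """Plain PSO with a nonlinearly decaying inertia weight (assumed form),
    standing in for NICPSO tuning SVM's penalty factor C and kernel gamma."""
    dim = len(bounds)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    x = lo + rng.random((n, dim)) * (hi - lo)   # particle positions
    v = np.zeros((n, dim))                      # particle velocities
    pbest, pval = x.copy(), np.array([f(p) for p in x])
    g = pbest[pval.argmin()].copy()             # global best position
    for t in range(iters):
        w = 0.5 * (1 - t / iters) ** 2 + 0.4    # nonlinear inertia decay
        r1, r2 = rng.random((2, n, dim))
        v = w * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        better = fx < pval
        pbest[better], pval[better] = x[better], fx[better]
        g = pbest[pval.argmin()].copy()
    return g, pval.min()

# Toy stand-in for cross-validation error, minimal at C=10, gamma=0.5
err = lambda p: (p[0] - 10) ** 2 + 100 * (p[1] - 0.5) ** 2
best, val = pso_minimize(err, [(0.1, 100), (1e-3, 10)])
print(val < 25.0)  # True: the swarm settles near C=10, gamma=0.5
```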

  1. Cardiac arrhythmia classification using autoregressive modeling

    Directory of Open Access Journals (Sweden)

    Srinivasan Narayanan

    2002-11-01

    Full Text Available Abstract Background Computer-assisted arrhythmia recognition is critical for the management of cardiac disorders. Various techniques have been utilized to classify arrhythmias. Generally, these techniques classify only two or three arrhythmias or have significantly long processing times. A simpler autoregressive (AR) modeling technique is proposed to classify normal sinus rhythm (NSR) and various cardiac arrhythmias including atrial premature contraction (APC), premature ventricular contraction (PVC), supraventricular tachycardia (SVT), ventricular tachycardia (VT) and ventricular fibrillation (VF). Methods AR modeling was performed on ECG data from normal sinus rhythm as well as various arrhythmias. The AR coefficients were computed using Burg's algorithm and were classified using a generalized linear model (GLM) based algorithm in various stages. Results AR modeling showed that an order of four was sufficient for modeling the ECG signals. The accuracies of detecting NSR, APC, PVC, SVT, VT and VF ranged from 93.2% to 100% using the GLM-based classification algorithm. Conclusion The results show that AR modeling is useful for the classification of cardiac arrhythmias, with reasonably high accuracies. Further validation of the proposed technique will yield results acceptable for clinical implementation.
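
    An order-4 AR fit can be sketched as follows. A simple least-squares estimator stands in here for the Burg estimator used in the paper, and the ECG is replaced by a synthetic AR(2) signal whose known coefficients let the fit be checked:

```python
import numpy as np

def ar_coeffs(x, p=4):
    """Least-squares fit of an order-p AR model x[t] ~ sum_k a[k] * x[t-1-k]
    (a simple stand-in for Burg's algorithm used in the paper)."""
    x = np.asarray(x, float)
    # Column k holds x[t-1-k]; the target is x[t] for t = p .. N-1.
    X = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    a, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return a

# Synthetic AR(2) "signal": x[t] = 1.5 x[t-1] - 0.7 x[t-2] + noise
rng = np.random.default_rng(2)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + 0.1 * rng.normal()
a = ar_coeffs(x, p=4)
print(a[:2])  # close to the true coefficients 1.5 and -0.7
```

    In the paper's pipeline, the four fitted coefficients per ECG segment become the feature vector fed to the GLM classifier.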

  2. Brain extraction based on locally linear representation-based classification.

    Science.gov (United States)

    Huang, Meiyan; Yang, Wei; Jiang, Jun; Wu, Yao; Zhang, Yu; Chen, Wufan; Feng, Qianjin

    2014-05-15

    Brain extraction is an important procedure in brain image analysis. Although numerous brain extraction methods have been presented, improving on them remains challenging because brain MRI images exhibit complex characteristics, such as anatomical variability and intensity differences across different sequences and scanners. To address this problem, we present a Locally Linear Representation-based Classification (LLRC) method for brain extraction. A novel classification framework is derived by introducing the locally linear representation into the classical classification model. Under this framework, a common label fusion approach can be considered as a special case and thoroughly interpreted. Locality is important for calculating the fusion weights in LLRC; this factor also motivates the choice of Local Anchor Embedding, which is more applicable for solving the locally linear coefficients than other linear representation approaches. Moreover, LLRC supplies a way to learn the optimal classification scores of the training samples in the dictionary to obtain accurate classification. The International Consortium for Brain Mapping and the Alzheimer's Disease Neuroimaging Initiative databases were used to build a training dataset containing 70 scans. To evaluate the proposed method, we used four publicly available datasets (IBSR1, IBSR2, LPBA40, and ADNI3T, with a total of 241 scans). Experimental results demonstrate that the proposed method outperforms four common brain extraction methods (BET, BSE, GCUT, and ROBEX) and is comparable to BEaST overall, while being more accurate than BEaST on some datasets. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. Computerized Classification Testing with the Rasch Model

    Science.gov (United States)

    Eggen, Theo J. H. M.

    2011-01-01

    If classification in a limited number of categories is the purpose of testing, computerized adaptive tests (CATs) with algorithms based on sequential statistical testing perform better than estimation-based CATs (e.g., Eggen & Straetmans, 2000). In these computerized classification tests (CCTs), the Sequential Probability Ratio Test (SPRT) (Wald,…
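
    Wald's SPRT under the Rasch model can be sketched directly. The ability levels, error rates, and item difficulties below are illustrative, not taken from the cited work:

```python
import math

def rasch_p(theta, b):
    """Rasch model: probability of a correct response at ability theta
    on an item of difficulty b."""
    return 1 / (1 + math.exp(-(theta - b)))

def sprt_classify(responses, items, theta0=-0.5, theta1=0.5,
                  alpha=0.05, beta=0.05):
    """Wald's SPRT for pass/fail classification: accumulate the
    log-likelihood ratio of theta1 vs theta0 item by item and stop
    as soon as a boundary is crossed."""
    A = math.log((1 - beta) / alpha)    # upper bound -> classify as theta1
    B = math.log(beta / (1 - alpha))    # lower bound -> classify as theta0
    llr = 0.0
    for u, b in zip(responses, items):
        p1, p0 = rasch_p(theta1, b), rasch_p(theta0, b)
        llr += math.log(p1 / p0) if u else math.log((1 - p1) / (1 - p0))
        if llr >= A:
            return "pass"
        if llr <= B:
            return "fail"
    return "undecided"

# Ten correct answers on items of difficulty 0 push the ratio up quickly:
print(sprt_classify([1] * 10, [0.0] * 10))  # pass
```

    The appeal for adaptive testing is visible here: the decision is typically reached well before all items are administered.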

  4. Object Classification Using Substance Based Neural Network

    Directory of Open Access Journals (Sweden)

    P. Sengottuvelan

    2014-01-01

    Full Text Available Object recognition has seen tremendous growth in the field of image analysis. The required set of image objects is identified and retrieved on the basis of object recognition. In this paper, we propose a novel classification technique called substance based image classification (SIC) using a wavelet neural network. The foremost task of SIC is to remove the surrounding regions from an image to reduce the misclassified portion and to effectively reflect the shape of an object. First, the SIC system segments the input image. Next, to attain more accurate information from the extracted set of regions, the wavelet transform is applied to extract the configured set of features. Finally, using the neural network classifier model and LSEG segmentation, misclassified regions and background are removed from the given natural image. Moreover, to increase the accuracy of object classification, the SIC system removes the surrounding regions of the image. Performance evaluation reveals that the proposed SIC system reduces the occurrence of misclassification and reflects the exact shape of an object, improving results by approximately 10–15%.

  5. Application of a Hybrid Model Based on a Convolutional Auto-Encoder and Convolutional Neural Network in Object-Oriented Remote Sensing Classification

    Directory of Open Access Journals (Sweden)

    Wei Cui

    2018-01-01

    Full Text Available Variation in the format and classification requirements for remote sensing data makes establishing a standard remote sensing sample dataset difficult. As a result, few remote sensing deep neural network models have been widely accepted. We propose a hybrid deep neural network model based on a convolutional auto-encoder and a complementary convolutional neural network to solve this problem. The convolutional auto-encoder supports feature extraction and data dimension reduction of remote sensing data. The extracted features are input into the convolutional neural network and subsequently classified. Experimental results show that in the proposed model, the classification accuracy increases from 0.916 to 0.944, compared to a traditional convolutional neural network model; furthermore, the number of training runs is reduced from 40,000 to 22,000, and the number of labelled samples can be reduced by more than half, all while ensuring a classification accuracy of no less than 0.9, which suggests the effectiveness and feasibility of the proposed model.

  6. An Information Text Classification Algorithm Based on DBN

    Directory of Open Access Journals (Sweden)

    LU Shu-bao

    2017-04-01

    Full Text Available Aiming at the low classification accuracy and uneven performance of traditional text classification algorithms, a text classification algorithm based on deep learning is put forward. Deep belief networks have a very strong feature-learning ability: they can extract higher-level features from the original high-dimensional features, which can then be used to train the classification model. The TF-IDF formula is used to compute text feature values, and deep belief networks are used to construct the classifier. The experimental results show that, compared with commonly used classification algorithms such as the support vector machine, neural network, and extreme learning machine, the algorithm has higher accuracy and practicability, and it opens up new ideas for text classification research.
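
    The TF-IDF preprocessing step can be sketched as follows; the exact normalization used in the paper is not specified, so a standard tf * log(N/df) weighting is assumed:

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF vectors for a list of whitespace-tokenized documents:
    tf(t, d) * log(N / df(t)). (The paper's exact normalization may differ.)"""
    N = len(docs)
    df = Counter(t for d in docs for t in set(d.split()))
    out = []
    for d in docs:
        tf = Counter(d.split())
        out.append({t: (c / len(d.split())) * math.log(N / df[t])
                    for t, c in tf.items()})
    return out

docs = ["deep belief network", "deep text classification", "text mining"]
vecs = tfidf(docs)
# "deep" appears in 2 of 3 docs, so it is down-weighted relative to "belief":
print(vecs[0]["deep"] < vecs[0]["belief"])  # True
```

    These weight vectors are the inputs the DBN would learn higher-level features from.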

  7. Comparative analysis of tree classification models for detecting fusarium oxysporum f. sp cubense (TR4) based on multi soil sensor parameters

    Science.gov (United States)

    Estuar, Maria Regina Justina; Victorino, John Noel; Coronel, Andrei; Co, Jerelyn; Tiausas, Francis; Señires, Chiara Veronica

    2017-09-01

    Use of wireless sensor networks and smartphone integration to monitor environmental parameters surrounding plantations is made possible by readily available and affordable sensors. Providing low-cost monitoring devices would be beneficial, especially to small farm owners, in a developing country like the Philippines, where agriculture covers a significant portion of the labor market. This study discusses the integration of wireless soil sensor devices and smartphones to create an application that uses multidimensional analysis to detect the presence or absence of plant disease. Specifically, soil sensors collect soil quality parameters in a sink node, from which the smartphone collects data via Bluetooth. Given these, there is a need to develop a classification model on the mobile phone that will report the infection status of a soil. Though tree classification is the most appropriate approach for continuous parameter-based datasets, there is a need to determine whether tree models will produce coherent results. Soil sensor data residing on the phone is modeled using several variations of decision trees, namely: decision tree (DT), best-fit (BF) decision tree, functional tree (FT), Naive Bayes (NB) decision tree, J48, J48graft and LAD tree, where the decision tree approaches the problem by considering all sensor nodes as one. Results show that there are significant differences among soil sensor parameters, indicating variance in scores between the infected and uninfected sites. Furthermore, analysis of variance in accuracy, recall, precision and F1 measure scores shows homogeneity among the NBTree, J48graft and J48 tree classification models.

  8. An enhanced forest classification scheme for modeling vegetation-climate interactions based on national forest inventory data

    Science.gov (United States)

    Majasalmi, Titta; Eisner, Stephanie; Astrup, Rasmus; Fridman, Jonas; Bright, Ryan M.

    2018-01-01

    Forest management affects the distribution of tree species and the age class of a forest, shaping its overall structure and functioning and in turn the surface-atmosphere exchanges of mass, energy, and momentum. In order to attribute climate effects to anthropogenic activities like forest management, good accounts of forest structure are necessary. Here, using Fennoscandia as a case study, we make use of Fennoscandic National Forest Inventory (NFI) data to systematically classify forest cover into groups of similar aboveground forest structure. An enhanced forest classification scheme and related lookup table (LUT) of key forest structural attributes (i.e., maximum growing season leaf area index (LAImax), basal-area-weighted mean tree height, tree crown length, and total stem volume) was developed, and the classification was applied to multisource NFI (MS-NFI) maps from Norway, Sweden, and Finland. To provide a complete surface representation, our product was integrated with the European Space Agency Climate Change Initiative Land Cover (ESA CCI LC) map of present-day land cover (v.2.0.7). Comparison of the ESA LC and our enhanced LC products (https://doi.org/10.21350/7zZEy5w3) showed that forest extent differed notably between the two products (κ = 0.55, accuracy 0.64). To demonstrate the potential of our enhanced LC product to improve the description of the maximum growing season LAI (LAImax) of managed forests in Fennoscandia, we compared our LAImax map with reference LAImax maps created using the ESA LC product (and related cross-walking table) and PFT-dependent LAImax values used in three leading land models. Comparison of the LAImax maps showed that our product provides a spatially more realistic description of LAImax in managed Fennoscandian forests compared to reference maps. This study presents an approach to account for the transient nature of forest structural attributes due to human intervention in different land models.

  9. The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases

    DEFF Research Database (Denmark)

    Udatha, D.B.R.K. Gupta; Kouskoumvekaki, Irene; Olsson, Lisbeth

    2011-01-01

    and putative FAEs; nevertheless, the framework presented here is applicable to every poorly characterized enzyme family. 365 FAE-related sequences of fungal, bacterial and plantae origin were collected and they were clustered using Self Organizing Maps followed by k-means clustering into distinct groups based...... on amino acid composition and physico-chemical composition descriptors derived from the respective amino acid sequence. A Support Vector Machine model was subsequently constructed for the classification of new FAEs into the pre-assigned clusters. The model successfully recognized 98.2% of the training...

  10. Structural Equation Modeling of Classification Managers Based on the Communication Skills and Cultural Intelligence in Sport Organizations

    Directory of Open Access Journals (Sweden)

    Rasool NAZARI

    2015-03-01

    Full Text Available The purpose of this research was to develop a structural equation model for the classification of managers based on communication skills and cultural intelligence in Isfahan sports organizations. The study therefore used structural equation modeling. The statistical population comprised the provincial sports administrators, which according to official statistics numbered 550 people. A sample of 207 subjects was randomly selected, with Cochran's formula used to determine the sample size. The research instruments were a communication skills scale (reliability 0.81), a cultural intelligence scale (0.85), and a manager classification questionnaire (0.86). Descriptive and inferential statistical analyses were performed in SPSS and LISREL. The model relating communication skills, cultural intelligence, and the classification of athletic directors showed a good fit (RMSEA = 0.037, GFI = 0.902, AGFI = 0.910, NFI = 0.912). Proper planning to improve the communication skills and cultural intelligence of managers is therefore a prerequisite, and with such managers a clearer perspective for sport management can be expected.

  11. Hot complaint intelligent classification based on text mining

    Directory of Open Access Journals (Sweden)

    XIA Haifeng

    2013-10-01

    Full Text Available The complaint recognizer system plays an important role in ensuring the correct classification of hot complaints and improving the service quality of the telecommunications industry. Customer complaints in the telecommunications industry have the particularity that they must be handled within a limited time, which causes errors in the classification of hot complaints. This paper presents a model for intelligent classification of hot complaints based on text mining, which can classify a hot complaint into the correct level of the complaint navigation. Examples show that the model can classify complaint texts efficiently.

  12. Mechanistic Physiologically Based Pharmacokinetic Modeling of the Dissolution and Food Effect of a Biopharmaceutics Classification System IV Compound-The Venetoclax Story.

    Science.gov (United States)

    Emami Riedmaier, Arian; Lindley, David J; Hall, Jeffrey A; Castleberry, Steven; Slade, Russell T; Stuart, Patricia; Carr, Robert A; Borchardt, Thomas B; Bow, Daniel A J; Nijsen, Marjoleen

    2018-01-01

    Venetoclax, a selective B-cell lymphoma-2 inhibitor, is a biopharmaceutics classification system class IV compound. The aim of this study was to develop a physiologically based pharmacokinetic (PBPK) model to mechanistically describe absorption and disposition of an amorphous solid dispersion formulation of venetoclax in humans. A mechanistic PBPK model was developed incorporating measured amorphous solubility, dissolution, metabolism, and plasma protein binding. A middle-out approach was used to define permeability. Model predictions of oral venetoclax pharmacokinetics were verified against clinical studies of fed and fasted healthy volunteers, and clinical drug interaction studies with a strong CYP3A inhibitor (ketoconazole) and inducer (rifampicin). Model verification demonstrated accurate prediction of the observed food effect following a low-fat diet. Ratios of predicted versus observed Cmax and area under the curve of venetoclax were within 0.8- to 1.25-fold of observed ratios for strong CYP3A inhibitor and inducer interactions, indicating that the venetoclax elimination pathway was correctly specified. The verified venetoclax PBPK model is one of the first examples mechanistically capturing absorption, food effect, and exposure of an amorphous solid dispersion formulated compound. This model allows evaluation of untested drug-drug interactions, especially those primarily occurring in the intestine, and paves the way for future modeling of biopharmaceutics classification system IV compounds. Copyright © 2018 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  13. Use of topographic and climatological models in a geographical data base to improve Landsat MSS classification for Olympic National Park

    Science.gov (United States)

    Cibula, William G.; Nyquist, Maurice O.

    1987-01-01

    An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classification useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. To refine the level of classification, a geographic information system (GIS) approach was employed: topographic data and watershed boundary (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, meeting the project objective. This classification could then serve as input to a GIS which, combined with other ancillary data, could help answer park management questions in programs such as fire management.
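
    The boolean-overlay refinement can be sketched with two registered rasters; the class codes, elevation threshold, and tiny grids are invented for illustration:

```python
import numpy as np

landcover = np.array([[1, 1, 2],
                      [1, 2, 2]])          # spectral classes from MSS clustering
elevation = np.array([[300, 900, 300],
                      [900, 300, 900]])    # registered topographic layer, metres

# Boolean overlay: split every spectral class into low/high-elevation subclasses
refined = landcover * 2 + (elevation > 600)
print(len(np.unique(refined)))  # 4: two spectral classes became four subclasses
```

    The same pattern, applied with topographic and watershed layers, is how nine spectral classes can be refined into a larger set of ecologically distinct classes without re-running the spectral clustering.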

  14. Latent Classification Models for Binary Data

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2009-01-01

    the class of that instance. To relax this independence assumption, we have in previous work proposed a family of models, called latent classification models (LCMs). LCMs are defined for continuous domains and generalize the naive Bayes model by using latent variables to model class-conditional dependencies...... between the attributes. In addition to providing good classification accuracy, the LCM model has several appealing properties, including a relatively small parameter space making it less susceptible to over-fitting. In this paper we take a first-step towards generalizing LCMs to hybrid domains...... of different domains, including the problem of recognizing symbols in black and white images....

  15. Knowledge discovery from patients' behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services.

    Science.gov (United States)

    Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi

    2016-01-01

    The rapid growth of information technology (IT) creates competitive advantages in the health care industry. Nowadays, many hospitals try to build a successful customer relationship management (CRM) system to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information, and data mining techniques can be used to analyze this data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public-sector hospital in Iran, with the idea that patient loyalty differs from customer loyalty, to estimate customer lifetime value (CLV) for each patient. We used the Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and find target, potential, and loyal customers in order to implement a strengthened CRM. Two approaches are used for classification: first, the clustering result is used as the decision attribute in the classification process; second, the segmentation based on the CLV value of patients (estimated by RFML) is used as the decision attribute. Finally, the results of the CHAID algorithm show significant hidden rules and identify existing patterns among hospital consumers.
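
    The extended RFM scoring (RFML, with Length added) can be sketched as a weighted sum of min-max-scaled attributes. The equal weights and the four hypothetical patients below are assumptions, not the paper's values:

```python
import numpy as np

def rfml_scores(R, F, M, L, weights=(0.25, 0.25, 0.25, 0.25)):
    """Score patients on Recency, Frequency, Monetary, Length (the RFML
    extension of RFM). Each attribute is min-max scaled to [0, 1]; recency
    is inverted so recent visits score higher."""
    def scale(v):
        v = np.asarray(v, float)
        return (v - v.min()) / (v.max() - v.min())
    parts = [1 - scale(R), scale(F), scale(M), scale(L)]
    return sum(w * p for w, p in zip(weights, parts))

# Four hypothetical patients: days since visit, visit count, spend, tenure
R = [10, 200, 30, 365]
F = [12, 1, 6, 2]
M = [900, 50, 400, 80]
L = [24, 2, 12, 6]
scores = rfml_scores(R, F, M, L)
print(scores.argmax())  # 0: the recent, frequent, high-spend, long-tenure patient
```

    In the paper's pipeline, scores like these feed the CLV estimate, which in turn becomes the decision attribute for the CHAID classification.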

  17. Blood-based gene expression profiles models for classification of subsyndromal symptomatic depression and major depressive disorder.

    Science.gov (United States)

    Yi, Zhenghui; Li, Zezhi; Yu, Shunying; Yuan, Chengmei; Hong, Wu; Wang, Zuowei; Cui, Jian; Shi, Tieliu; Fang, Yiru

    2012-01-01

    Subsyndromal symptomatic depression (SSD) is a subtype of subthreshold depression that also leads to significant psychosocial functional impairment, as does major depressive disorder (MDD). Several studies have suggested that SSD is a transitory phenomenon in the depression spectrum and is thus considered a subtype of depression. However, the pathophysiology of depression remains largely obscure and studies on SSD are limited. The present study compared expression profiles and performed classification with leukocytes by using whole-genome cRNA microarrays among drug-free first-episode subjects with SSD, subjects with MDD, and matched controls (8 subjects in each group). Support vector machines (SVMs) were utilized for training and testing on candidate signature expression profiles from the signature selection step. First, we identified 63 differentially expressed SSD signatures in contrast to control (P…). To identify biomarkers for SSD and MDD together, we selected top gene signatures from each group of pairwise comparison results and merged them to generate better profiles that can clearly classify the SSD and MDD sets at the same time. In detail, we tried different combinations of signatures from the three pairwise comparison results and finally determined 48 gene expression signatures with 100% accuracy. Our findings suggest that SSD and MDD do not exhibit the same genomic signature in peripheral blood leukocytes, and that blood-cell-derived RNA of these 48 gene models may have significant value for performing diagnostic functions and classifying SSD, MDD, and healthy controls.

  18. Hierarchical system for content-based audio classification and retrieval

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The audio recordings are first classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of the temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of the audio signals. This first stage is called coarse-level audio classification and segmentation. Then, environmental sounds are classified into finer classes such as applause, rain, bird sounds, etc., which is called fine-level audio classification. This second stage is based on time-frequency analysis of audio signals and the use of the hidden Markov model (HMM) for classification. In the third stage, query-by-example audio retrieval is implemented, where similar sounds can be found according to an input sample audio. The way of modeling audio features with the hidden Markov model, the procedures of audio classification and retrieval, and the experimental results are described. It is shown that, with the proposed system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy higher than 90%. Examples of fine audio classification and audio retrieval with the proposed HMM-based method are also provided.
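
    The coarse stage can be sketched with two of the features named above, short-time energy and the average zero-crossing rate. The thresholds below are made up, and the real system also uses the fundamental frequency and temporal-curve analysis:

```python
import numpy as np

def zcr(frame):
    """Average zero-crossing rate of one frame."""
    return np.mean(np.abs(np.diff(np.sign(frame))) > 0)

def coarse_label(frame, energy_thresh=1e-3, zcr_thresh=0.25):
    """Toy coarse classifier: silence by low energy, then a zero-crossing-rate
    split between noise-like and tonal content. Thresholds are illustrative."""
    if np.mean(frame ** 2) < energy_thresh:
        return "silence"
    return "noise-like" if zcr(frame) > zcr_thresh else "tonal"

sr = 8000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220 * t)              # music-like pure tone
noise = 0.5 * np.random.default_rng(3).normal(size=sr)  # environmental-like noise
print(coarse_label(np.zeros(sr)), coarse_label(tone), coarse_label(noise))
# silence tonal noise-like
```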

  19. Classification

    Science.gov (United States)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.
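
    The train-then-predict loop described above can be sketched with a deliberately simple learner. The nearest-centroid classifier below is a minimal illustration, not the chapter's method; the 2-D points stand in for sunspot measurements and are made up.

```python
# Minimal supervised classification: "train" builds a model (one centroid
# per class) from labelled examples; "predict" maps unseen inputs to the
# class whose centroid is nearest.

def train(examples):
    """Build a per-class centroid model from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, x):
    """Assign the class whose centroid is nearest in Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: dist2(model[y]))

training = [([1.0, 1.0], "A"), ([1.2, 0.8], "A"),
            ([5.0, 5.0], "B"), ([4.8, 5.2], "B")]
model = train(training)
print(predict(model, [1.1, 0.9]))  # a point near class A's examples
```

    Any real learning algorithm replaces `train` with something richer, but the interface, examples in, model out, predictions from the model, is the same.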

  20. Inventory classification based on decoupling points

    Directory of Open Access Journals (Sweden)

    Joakim Wikner

    2015-01-01

    Full Text Available The ideal state of continuous one-piece flow may never be achieved. Still the logistics manager can improve the flow by carefully positioning inventory to buffer against variations. Strategies such as lean, postponement, mass customization, and outsourcing all rely on strategic positioning of decoupling points to separate forecast-driven from customer-order-driven flows. Planning and scheduling of the flow are also based on classification of decoupling points as master scheduled or not. A comprehensive classification scheme for these types of decoupling points is introduced. The approach rests on identification of flows as being either demand based or supply based. The demand or supply is then combined with exogenous factors, classified as independent, or endogenous factors, classified as dependent. As a result, eight types of strategic as well as tactical decoupling points are identified resulting in a process-based framework for inventory classification that can be used for flow design.

  1. SU-G-BRC-13: Model Based Classification for Optimal Position Selection for Left-Sided Breast Radiotherapy: Free Breathing, DIBH, Or Prone

    Energy Technology Data Exchange (ETDEWEB)

    Lin, H; Liu, T; Xu, X [Rensselaer Polytechnic Institute, Troy, NY (United States); Shi, C [Saint Vincent Medical Center, Bridgeport, CT (United States); Petillion, S; Kindts, I [University Hospitals Leuven, Leuven, Vlaams-Brabant (Belgium); Tang, X [Memorial Sloan Kettering Cancer Center, West Harrison, NY (United States)

    2016-06-15

    Purpose: There are clinical decision challenges in selecting the optimal treatment position for left-sided breast cancer patients: supine free breathing (FB), supine Deep Inspiration Breath Hold (DIBH), and prone free breathing (prone). Physicians often make the decision based on experience and trials, which might not always result in optimal OAR doses. We herein propose a mathematical model to predict the lowest OAR doses among these three positions, providing a quantitative tool for the corresponding clinical decision. Methods: Patients were scanned in the FB, DIBH, and prone positions under an IRB-approved protocol. Tangential beam plans were generated for each position, and OAR doses were calculated. The position with the lowest OAR doses is defined as the optimal position. The following features were extracted from each scan to build the model: heart, ipsilateral lung, and breast volume; in-field heart and ipsilateral lung volume; distance between heart and target; laterality of heart; and dose to heart and ipsilateral lung. Principal Components Analysis (PCA) was applied to remove the co-linearity of the input data and also to lower the data dimensionality. Feature selection, another method to reduce dimensionality, was applied as a comparison. A Support Vector Machine (SVM) was then used for classification. Thirty-seven patients' data were acquired; up to now, five patient plans were available. K-fold cross validation was used to validate the accuracy of the classifier model with a small training size. Results: The classification results and K-fold cross validation demonstrated that the model is capable of predicting the optimal position for patients. The accuracy of K-fold cross validation reached 80%. Compared to PCA, feature selection allows causal features of dose to be determined, which provides more clinical insight. Conclusion: The proposed classification system appeared to be feasible. We are generating plans for the rest of the 37 patient images, and more statistically significant
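
    The modelling chain the abstract describes (features, PCA for dimensionality reduction, SVM classification, K-fold validation) can be sketched with scikit-learn. The data below are synthetic stand-ins for the anatomical features, not patient data, and the binary label is a toy surrogate for the "optimal position" outcome.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 60 hypothetical patients, 8 features (volumes,
# distances, doses in the real study) and a toy two-class label.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# PCA removes co-linearity and lowers dimensionality before the SVM,
# mirroring the pipeline in the abstract.
clf = make_pipeline(StandardScaler(), PCA(n_components=4), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)   # K-fold cross validation, k=5
print(scores.mean())
```

    With only a handful of labelled plans, as in the study, cross validation is the main guard against an over-optimistic accuracy estimate.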

  2. Blood-Based Gene Expression Profiles Models for Classification of Subsyndromal Symptomatic Depression and Major Depressive Disorder

    Science.gov (United States)

    Yu, Shunying; Yuan, Chengmei; Hong, Wu; Wang, Zuowei; Cui, Jian; Shi, Tieliu; Fang, Yiru

    2012-01-01

    Subsyndromal symptomatic depression (SSD) is a subtype of subthreshold depression that also leads to significant psychosocial functional impairment, similar to major depressive disorder (MDD). Several studies have suggested that SSD is a transitory phenomenon in the depression spectrum and is thus considered a subtype of depression. However, the pathophysiology of depression remains largely obscure, and studies on SSD are limited. The present study compared expression profiles and performed classification with leukocytes using whole-genome cRNA microarrays among drug-free first-episode subjects with SSD, MDD, and matched controls (8 subjects in each group). Support vector machines (SVMs) were utilized for training and testing on candidate signature expression profiles from the signature selection step. Firstly, we identified 63 differentially expressed SSD signatures in contrast to control (P ≤ 5.0E-4) and 30 differentially expressed MDD signatures in contrast to control, respectively. Then, 123 gene signatures were identified with significantly differential expression levels between SSD and MDD. Secondly, in order to conduct priority selection of biomarkers for SSD and MDD together, we selected top gene signatures from each group of pair-wise comparison results and merged the signatures to generate better profiles used to clearly classify the SSD and MDD sets at the same time. In detail, we tried different combinations of signatures from the three pair-wise comparison results and finally determined 48 gene expression signatures with 100% accuracy. Our findings suggest that SSD and MDD do not exhibit the same expressed genome signature in peripheral blood leukocytes, and blood cell-derived RNA of these 48 gene models may have significant value for performing diagnostic functions and classifying SSD, MDD, and healthy controls. PMID:22348066

  3. Classification of tumor based on magnetic resonance (MR) brain images using wavelet energy feature and neuro-fuzzy model

    Science.gov (United States)

    Damayanti, A.; Werdiningsih, I.

    2018-03-01

    The brain is the organ that coordinates all the activities that occur in our bodies, and small abnormalities in the brain will affect body activity. A brain tumor is a mass formed as a result of abnormal and uncontrolled cell growth in the brain. MRI is a non-invasive medical test that helps doctors diagnose and treat medical conditions. Correct classification of brain tumors supports the right decision and the correct treatment during the care of a brain tumor patient. In this study, classification was performed to determine the type of brain disease, namely Alzheimer's, glioma, carcinoma, and normal, using energy coefficients and ANFIS. The stages in classifying MR brain images are feature extraction, feature reduction, and classification. The result of feature extraction is an approximation vector for each wavelet decomposition level. Feature reduction reduces the features using the energy coefficients of the approximation vector; with 100 energy coefficients per feature, the reduced feature is a 1 x 52 pixel vector. This vector is the input for classification using ANFIS with Fuzzy C-Means and FLVQ clustering and LM back-propagation. The recognition success rate for MR brain images using ANFIS-FLVQ, ANFIS, and LM back-propagation was obtained at 100%.
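
    The wavelet-energy feature idea can be illustrated with a 1-D Haar transform standing in for the study's (unspecified) mother wavelet: the energy of the detail coefficients at each decomposition level becomes one reduced feature. A minimal sketch on a toy signal:

```python
import math

def haar_level(signal):
    """One Haar decomposition step: approximation and detail coefficients."""
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal), 2)]
    return approx, detail

def energy_features(signal, levels):
    """Return the energy of the detail coefficients at each level."""
    feats = []
    for _ in range(levels):
        signal, detail = haar_level(signal)   # recurse on the approximation
        feats.append(sum(d * d for d in detail))
    return feats

sig = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
feats = energy_features(sig, 2)
print(feats)
```

    In the study the analogous energy coefficients compress each image's wavelet decomposition into the short vector fed to the ANFIS classifier.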

  4. Robustness analysis of a green chemistry-based model for the classification of silver nanoparticles synthesis processes

    Science.gov (United States)

    This paper proposes a robustness analysis based on Multiple Criteria Decision Aiding (MCDA). The ensuing model was used to assess the implementation of green chemistry principles in the synthesis of silver nanoparticles. Its recommendations were also compared to an earlier develo...

  5. Global Optimization Ensemble Model for Classification Methods

    Directory of Open Access Journals (Sweden)

    Hina Anwar

    2014-01-01

    Full Text Available Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, each with its own advantages and drawbacks. Some basic issues affect the accuracy of a classifier when solving a supervised learning problem, such as the bias-variance tradeoff, the dimensionality of the input space, and noise in the input data. These problems all affect classifier accuracy and are the reason there is no globally optimal method for classification, nor any generalized improvement method that can increase the accuracy of any classifier while addressing all of the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30%, depending upon the algorithm complexity.
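
    The core ensemble idea, combining several imperfect classifiers so the combination can beat any single member, can be sketched with a majority vote over toy threshold rules. The stumps below are illustrative and are not the GMC method itself.

```python
from collections import Counter

def make_stump(index, threshold):
    """A weak classifier: thresholds a single feature."""
    return lambda x: 1 if x[index] > threshold else 0

# Three hypothetical weak learners, each looking at one feature.
classifiers = [make_stump(0, 0.5), make_stump(1, 0.5), make_stump(2, 0.5)]

def ensemble_predict(x):
    """Hard-vote fusion: the class predicted by most members wins."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

print(ensemble_predict([0.9, 0.1, 0.8]))  # two of three stumps vote 1
```

    Even this crude vote tolerates one wrong member per prediction, which is the intuition behind ensembling as an accuracy-improvement strategy.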

  6. Classification of breast mass lesions using model-based analysis of the characteristic kinetic curve derived from fuzzy c-means clustering.

    Science.gov (United States)

    Chang, Yeun-Chung; Huang, Yan-Hao; Huang, Chiun-Sheng; Chang, Pei-Kang; Chen, Jeon-Hor; Chang, Ruey-Feng

    2012-04-01

    The purpose of this study is to evaluate the diagnostic efficacy of the representative characteristic kinetic curve of dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) extracted by fuzzy c-means (FCM) clustering for the discrimination of benign and malignant breast tumors using a novel computer-aided diagnosis (CAD) system. The research data set comprised DCE-MRIs of 132 solid breast masses with definite histopathologic diagnoses (63 benign and 69 malignant). At first, the tumor region was automatically segmented using the region growing method based on the integrated color map formed by the combination of the kinetic and area-under-curve color maps. Then, FCM clustering was used to identify the time-signal curve with the larger initial enhancement inside the segmented region as the representative kinetic curve, and the parameters of the Tofts pharmacokinetic model for the representative kinetic curve were compared with conventional curve analysis (maximal enhancement, time to peak, uptake rate and washout rate) for each mass. The results were analyzed with a receiver operating characteristic curve and Student's t test to evaluate the classification performance. Accuracy, sensitivity, specificity, positive predictive value and negative predictive value of the combined model-based parameters of the kinetic curve extracted by FCM clustering were 86.36% (114/132), 85.51% (59/69), 87.30% (55/63), 88.06% (59/67) and 84.62% (55/65), better than those from conventional curve analysis. The A(Z) value was 0.9154 for the Tofts model-based parametric features, better than that for conventional curve analysis (0.8673), for discriminating malignant and benign lesions. In conclusion, model-based analysis of the characteristic kinetic curve of a breast mass derived from FCM clustering provides effective lesion classification. This approach has potential in the development of a CAD system for DCE breast MRI. Copyright © 2012 Elsevier Inc.
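
    The fuzzy c-means step can be sketched directly from the standard FCM update equations (fuzzifier m = 2). The "curves" below are short toy vectors rather than DCE-MRI time-signal data, and picking a representative curve from the cluster memberships is left implicit.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=50, seed=0):
    """Fuzzy c-means: returns cluster centers and the membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                       # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        # distances of every sample to every center (small epsilon avoids /0)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))        # standard membership update
        U = inv / inv.sum(axis=0)
    return centers, U

# Toy "kinetic curves": two with low and two with high enhancement.
curves = np.array([[1.0, 1.1, 1.0], [0.9, 1.0, 1.1],
                   [5.0, 5.2, 5.1], [5.1, 4.9, 5.0]])
centers, U = fcm(curves, c=2)
labels = U.argmax(axis=0)
print(labels)
```

    In the study the cluster whose curve shows the larger initial enhancement is then taken as the representative kinetic curve for the mass.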

  7. Classification of Event-Related Potentials Associated with Response Errors in Actors and Observers Based on Autoregressive Modeling

    NARCIS (Netherlands)

    Vasios, C.E.; Ventouras, E.M.; Matsopoulos, G.K.; Karanasiou, I.; Asvestas, P.; Uzunoglu, N.K.; Schie, H.T. van; Bruijn, E.R.A. de

    2009-01-01

    Event-Related Potentials (ERPs) provide non-invasive measurements of the electrical activity on the scalp related to the processing of stimuli and preparation of responses by the brain. In this paper an ERP-signal classification method is proposed for discriminating between ERPs of correct and

  8. Classification using Hierarchical Naive Bayes models

    DEFF Research Database (Denmark)

    Langseth, Helge; Dyhre Nielsen, Thomas

    2006-01-01

    Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe......, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models...
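
    The independence assumption that Hierarchical Naive Bayes models relax can be made concrete with a plain Naive Bayes classifier, which treats every attribute as independent given the class. A sketch on toy categorical data (attribute values and counts are illustrative; Laplace smoothing handles unseen values):

```python
import math
from collections import defaultdict

def train_nb(examples):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = defaultdict(int)
    attr_counts = defaultdict(int)     # key: (class, attr_index, value)
    for attrs, label in examples:
        class_counts[label] += 1
        for i, v in enumerate(attrs):
            attr_counts[(label, i, v)] += 1
    return class_counts, attr_counts

def predict_nb(model, attrs):
    """Pick the class maximizing log P(class) + sum_i log P(attr_i | class)."""
    class_counts, attr_counts = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for c, n in class_counts.items():
        lp = math.log(n / total)
        for i, v in enumerate(attrs):
            lp += math.log((attr_counts[(c, i, v)] + 1) / (n + 2))  # Laplace
        if lp > best_lp:
            best, best_lp = c, lp
    return best

data = [(("sunny", "hot"), "no"), (("sunny", "mild"), "no"),
        (("rain", "mild"), "yes"), (("rain", "cool"), "yes")]
model = train_nb(data)
print(predict_nb(model, ("rain", "mild")))
```

    The per-attribute factorisation in `predict_nb` is exactly the independence statement that the latent variables of Hierarchical Naive Bayes models are introduced to relax.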

  9. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    Science.gov (United States)

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto Protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI), were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The three best classification tree models, having the lowest misclassification error (ME) and the fewest nodes (N), are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39).
The produced SOC maps at 1:50,000 cartographic scale using these trees match closely, with coincidence values equal to 90.5% (Map T1

  10. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  11. Risk-based classification system of nanomaterials

    International Nuclear Information System (INIS)

    Tervonen, Tommi; Linkov, Igor; Figueira, Jose Rui; Steevens, Jeffery; Chappell, Mark; Merad, Myriam

    2009-01-01

    Various stakeholders are increasingly interested in the potential toxicity and other risks associated with nanomaterials throughout the different stages of a product's life cycle (e.g., development, production, use, disposal). Risk assessment methods and tools developed and applied to chemical and biological materials may not be readily adaptable for nanomaterials because of the current uncertainty in identifying the relevant physico-chemical and biological properties that adequately describe the materials. Such uncertainty is further driven by the substantial variations in the properties of the original material due to variable manufacturing processes employed in nanomaterial production. To guide scientists and engineers in nanomaterial research and application as well as to promote the safe handling and use of these materials, we propose a decision support system for classifying nanomaterials into different risk categories. The classification system is based on a set of performance metrics that measure both the toxicity and physico-chemical characteristics of the original materials, as well as the expected environmental impacts through the product life cycle. Stochastic multicriteria acceptability analysis (SMAA-TRI), a formal decision analysis method, was used as the foundation for this task. This method allowed us to cluster various nanomaterials in different ecological risk categories based on our current knowledge of nanomaterial physico-chemical characteristics, variation in produced material, and best professional judgments. SMAA-TRI uses Monte Carlo simulations to explore all feasible values for weights, criteria measurements, and other model parameters to assess the robustness of nanomaterial grouping for risk management purposes.

  12. Chinese Sentence Classification Based on Convolutional Neural Network

    Science.gov (United States)

    Gu, Chengwei; Wu, Ming; Zhang, Chuang

    2017-10-01

    Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional methods based on machine learning, such as the Naive Bayesian model, cannot take high-level features into consideration. A neural network for sentence classification can make use of contextual information to achieve better results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences. Most importantly, we propose a novel architecture of Convolutional Neural Network (CNN) for Chinese sentence classification. In particular, whereas most previous methods use a softmax classifier for prediction, we embed a linear support vector machine in place of softmax in the deep neural network model, minimizing a margin-based loss to get a better result, and we use tanh as the activation function instead of ReLU. The CNN model improves the results of Chinese sentence classification tasks. Experimental results on a Chinese news title database validate the effectiveness of our model.

  13. Linear dynamic models for classification of single-trial EEG.

    Science.gov (United States)

    Samdin, S Balqis; Ting, Chee-Ming; Salleh, Sh-Hussain; Ariff, A K; Mohd Noor, A B

    2013-01-01

    This paper investigates the use of linear dynamic models (LDMs) to improve classification of single-trial EEG signals. Existing dynamic classification of EEG uses discrete-state hidden Markov models (HMMs) based on a piecewise-stationary assumption, which is inadequate for modeling the highly non-stationary dynamics underlying EEG. The continuous hidden states of LDMs can better describe this continuously changing characteristic of EEG, and thus improve the classification performance. We consider two examples of LDMs: a simple local level model (LLM) and a time-varying autoregressive (TVAR) state-space model. AR parameters and band power are used as features. Parameter estimation of the LDMs is performed using the expectation-maximization (EM) algorithm. We also investigate different covariance modeling of Gaussian noises in LDMs for EEG classification. The experimental results on two-class motor-imagery classification show that both types of LDMs outperform the HMM baseline, with the best relative accuracy improvement of 14.8% achieved by the LLM with full covariance for Gaussian noises. This may be because LDMs offer more flexibility in fitting the underlying dynamics of EEG.
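
    The local level model mentioned above is the simplest LDM: the hidden state follows a random walk and the observation adds noise. A minimal Kalman filter for it is sketched below, with the noise variances q and r fixed by hand rather than estimated by EM as in the paper, and toy observations in place of EEG features.

```python
def kalman_local_level(ys, q=0.1, r=1.0):
    """Filter a local level model: x_t = x_{t-1} + w_t,  y_t = x_t + v_t."""
    x, p = ys[0], 1.0            # initial state estimate and its variance
    levels = [x]
    for y in ys[1:]:
        p = p + q                # predict: random-walk transition adds q
        k = p / (p + r)          # Kalman gain
        x = x + k * (y - x)      # update with the new observation
        p = (1 - k) * p
        levels.append(x)
    return levels

# Toy observations: a noisy level that jumps from ~0 to ~2 halfway through.
obs = [0.0, 0.2, -0.1, 0.15, 0.05, 2.0, 2.1, 1.9, 2.05, 2.0]
est = kalman_local_level(obs)
print(round(est[-1], 2))
```

    The filtered level tracks the jump smoothly instead of switching between discrete states, which is the qualitative advantage over an HMM that the abstract argues for.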

  14. Music genre classification via likelihood fusion from multiple feature models

    Science.gov (United States)

    Shiu, Yu; Kuo, C.-C. J.

    2005-01-01

    Music genre provides an efficient way to index songs in a music database, and can be used as an effective means to retrieve music of a similar type, i.e. content-based music retrieval. A new two-stage scheme for music genre classification is proposed in this work. At the first stage, we examine a couple of different features, construct their corresponding parametric models (e.g. GMM and HMM) and compute their likelihood functions to yield soft classification results. In particular, the timbre, rhythm and temporal variation features are considered. Then, at the second stage, these soft classification results are integrated to produce a hard decision for final music genre classification. Experimental results are given to demonstrate the performance of the proposed scheme.
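
    The two-stage idea, per-feature soft likelihoods fused into one hard decision, can be sketched with 1-D Gaussians standing in for the GMM/HMM feature models. All genre names and parameters below are invented for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Likelihood of x under a 1-D Gaussian feature model."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

# Hypothetical per-genre models for two features (mean, std-dev).
models = {
    "rock":      {"timbre": (0.8, 0.2), "rhythm": (120, 10)},
    "classical": {"timbre": (0.3, 0.2), "rhythm": (90, 15)},
}

def classify(timbre, rhythm):
    """Stage 1: per-feature likelihoods; stage 2: fuse and take the max."""
    scores = {}
    for genre, m in models.items():
        scores[genre] = (gaussian_pdf(timbre, *m["timbre"]) *
                         gaussian_pdf(rhythm, *m["rhythm"]))
    return max(scores, key=scores.get)

print(classify(0.75, 118))
```

    Multiplying likelihoods assumes the feature models are independent; the paper's fusion stage plays the same combining role over GMM and HMM outputs.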

  15. Geomorphology Classification of Shandong Province Based on Digital Elevation Model in the 1 Arc-second Format of Shuttle Radar Topography Mission Data

    Science.gov (United States)

    Fu, Jundong; Zhang, Guangcheng; Wang, Lei; Xia, Nuan

    2018-01-01

    Based on the digital elevation model in the 1 arc-second format of Shuttle Radar Topography Mission data, using the window analysis and mean change point analysis of geographic information system (GIS) technology, programmed with Python modules, this study automatically extracted and calculated the geomorphic elements of Shandong Province. Mean change point analysis determined the best statistical window for quantifying the relief amplitude of the study area. According to the Chinese landscape classification standard, the landscape types of Shandong Province were divided into 8 classes: low-altitude plain, medium-altitude plain, low-altitude platform, medium-altitude platform, low-altitude hills, medium-altitude hills, low-relief mountain and medium-relief mountain, whose percentages of Shandong Province's total area are 12.72%, 0.01%, 36.38%, 0.24%, 17.26%, 15.64%, 11.1% and 6.65%, respectively. The extracted landforms basically agree with the overall terrain of Shandong Province, and the study can quantitatively and scientifically provide a reference for the classification of landforms in the province.
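
    The window analysis behind relief amplitude (the elevation range inside a moving window, used to separate plains from mountains) can be sketched on a toy 1-D elevation profile. Real work uses 2-D SRTM grids and the mean change point method to pick the window size; the 30 m plain threshold below only loosely follows the spirit of the Chinese landform standard.

```python
def relief_amplitude(elev, window):
    """Max-minus-min elevation inside a moving window centred on each cell."""
    half = window // 2
    out = []
    for i in range(len(elev)):
        lo, hi = max(0, i - half), min(len(elev), i + half + 1)
        out.append(max(elev[lo:hi]) - min(elev[lo:hi]))
    return out

profile = [10, 12, 11, 50, 200, 400, 380, 60, 15, 12]   # metres, toy data
ra = relief_amplitude(profile, 3)
landform = ["plain" if r < 30 else "mountain" for r in ra]
print(landform)
```

    Replacing the 1-D slice with a 2-D window over a DEM raster, and the single threshold with the standard's full altitude/relief class table, gives the automated classification described above.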

  16. An assessment of European synoptic variability in Hadley Centre Global Environmental models based on an objective classification of weather regimes

    Energy Technology Data Exchange (ETDEWEB)

    James, P.M. [Met Office, Hadley Centre for Climate Prediction and Research, Exeter (United Kingdom)

    2006-08-15

    The frequency of occurrence of persistent synoptic-scale weather patterns over the European and North-East Atlantic regions is examined in a hierarchy of climate model simulations and compared to observational re-analysed data. A new objective method, employing pattern correlation techniques, has been constructed for classifying daily-mean mean-sea-level pressure and 500 hPa geopotential height fields with respect to a set of 29 European weather regime types, based on the widely known subjective Grosswetterlagen (GWL) system of the German Weather Service. The objective method is described and applied initially to ERA40 and NCEP re-analysis data. While the resulting daily Objective-GWL catalogue shows some systematic differences with respect to the subjectively-derived original GWL series, the method is shown to be sufficiently robust for application to climate model output. Ensemble runs from the most recent development of the Hadley Centre's Global Environmental model, HadGEM1, in atmosphere-only, coupled and climate change scenario modes are analysed with regards to European synoptic variability. All simulations successfully exhibit a wide spread of GWL occurrences across all regime types, but some systematic differences in mean GWL frequencies are seen in spite of significant levels of interdecadal variability. These differences provide a basis for estimating local anomalies of surface temperature and precipitation over Europe, which would result from circulation changes alone, in each climate simulation. Comparison to observational re-analyses shows a clear and significant improvement in the simulation of realistic European synoptic variability with the development and resolution of the atmosphere-only models. (orig.)

  17. LAND COVER CLASSIFICATION OF MULTI-SENSOR IMAGES BY DECISION FUSION USING WEIGHTS OF EVIDENCE MODEL

    Directory of Open Access Journals (Sweden)

    P. Li

    2012-07-01

    Full Text Available This paper proposes a novel method of decision fusion based on the weights of evidence model (WOE). The class probabilities obtained from classifying each separate dataset were fused using WOE to produce a posterior probability for each class, and the final classification was obtained by maximum probability. The proposed method was evaluated on land cover classification using two examples. The results showed that the proposed method effectively combined multi-sensor data in land cover classification and obtained higher classification accuracy than the use of single-source data. The weights of evidence model thus provides an effective decision fusion method for improved land cover classification using multi-sensor data.
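
    Weights-of-evidence fusion can be sketched in its log-odds form: each source's evidence contributes a weight (a log likelihood ratio), and the weights are summed with the prior log-odds to give a fused posterior. All probabilities below are hypothetical, not taken from the paper.

```python
import math

def weight(p_e_given_c, p_e_given_not_c):
    """Weight of evidence: log of the likelihood ratio for the class."""
    return math.log(p_e_given_c / p_e_given_not_c)

# Two hypothetical sensors each report evidence for the class "water".
prior_odds = 0.25 / 0.75                 # assumed prior P(water) = 0.25
w_optical = weight(0.9, 0.2)             # optical classifier said "water"
w_radar = weight(0.7, 0.3)               # radar classifier said "water"

# Fusion: add the weights to the prior log-odds, then invert to a probability.
posterior_log_odds = math.log(prior_odds) + w_optical + w_radar
posterior = 1 / (1 + math.exp(-posterior_log_odds))
print(round(posterior, 3))
```

    Computing such a posterior for every class and taking the maximum reproduces the decision rule described in the abstract.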

  18. Uncovering Arabidopsis membrane protein interactome enriched in transporters using mating-based split ubiquitin assays and classification models

    Directory of Open Access Journals (Sweden)

    Jin eChen

    2012-06-01

    Full Text Available High-throughput data are a double-edged sword: with the benefit of a large amount of data comes an associated cost of noise. To increase the reliability and scalability of high-throughput protein interaction data generation, we tested the efficacy of classification to enrich potential protein-protein interactions (pPPIs). We applied this method to identify interactions among Arabidopsis membrane proteins enriched in transporters, and validated the method with multiple retests. Classification improved the quality of the ensuing interaction network and was effective in reducing the search space and increasing the true positive rate. The final network of 541 interactions among 239 proteins (of which 179 are transporters) is the first protein interaction network enriched in membrane transporters reported for any organism. This network has topological attributes similar to other published protein interaction networks. It also extends and fills gaps in currently available biological networks in plants and allows the building of a number of hypotheses about processes and mechanisms involving signal transduction and transport systems.

  19. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2013-01-01

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using deep belief network (DBN). DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer by layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with testing sensory dataset. Health diagnosis using DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications: aircraft engine health diagnosis and electric power transformer health diagnosis are employed to demonstrate the efficacy of the proposed approach

  20. Identification of candidate categories of the International Classification of Functioning Disability and Health (ICF) for a Generic ICF Core Set based on regression modelling

    Directory of Open Access Journals (Sweden)

    Üstün Bedirhan T

    2006-07-01

    Full Text Available Abstract Background The International Classification of Functioning, Disability and Health (ICF) is the framework developed by WHO to describe functioning and disability at both the individual and population levels. While condition-specific ICF Core Sets are useful, a Generic ICF Core Set is needed to describe and compare problems in functioning across health conditions. Methods The aims of the multi-centre, cross-sectional study presented here were: (a) to propose a method to select ICF categories when a large amount of ICF-based data have to be handled, and (b) to identify candidate ICF categories for a Generic ICF Core Set by examining their explanatory power in relation to item one of the SF-36. The data were collected from 1039 patients using the ICF checklist, the SF-36 and a Comorbidity Questionnaire. ICF categories to be entered in an initial regression model were selected following systematic steps in accordance with the ICF structure. Based on the initial regression model, additional models were designed by systematically substituting the ICF categories included in it with ICF categories with which they were highly correlated. Results Fourteen different regression models were performed. The variance accounted for by the models ranged from 22.27% to 24.0%. The ICF category that explained the highest amount of variance in all the models was sensation of pain. In total, thirteen candidate ICF categories for a Generic ICF Core Set were proposed. Conclusion The selection strategy based on the ICF structure and the examination of the best possible alternative models does not provide a final answer about which ICF categories must be considered, but it leads to a selection of suitable candidates which needs further consideration and comparison with the results of other selection strategies in developing a Generic ICF Core Set.

  1. Robust Seismic Normal Modes Computation in Radial Earth Models and A Novel Classification Based on Intersection Points of Waveguides

    Science.gov (United States)

    Ye, J.; Shi, J.; De Hoop, M. V.

    2017-12-01

    We develop a robust algorithm to compute seismic normal modes in a spherically symmetric, non-rotating Earth. A well-known problem is the cross-contamination of modes near "intersections" of dispersion curves for separate waveguides. Our novel computational approach completely avoids artificial degeneracies by guaranteeing orthonormality among the eigenfunctions. We extend Wiggins' and Buland's work, and reformulate the Sturm-Liouville problem as a generalized eigenvalue problem with the Rayleigh-Ritz Galerkin method. A special projection operator incorporating the gravity terms proposed by de Hoop and a displacement/pressure formulation are utilized in the fluid outer core to project out the essential spectrum. Moreover, the weak variational form enables us to achieve high accuracy across the solid-fluid boundary, especially for Stoneley modes, which have exponentially decaying behavior. We also employ the mixed finite element technique to avoid spurious pressure modes arising from discretization schemes and a numerical inf-sup test is performed following Bathe's work. In addition, the self-gravitation terms are reformulated to avoid computations outside the Earth, thanks to the domain decomposition technique. Our package enables us to study the physical properties of intersection points of waveguides. According to Okal's classification theory, the group velocities should be continuous within a branch of the same mode family. However, we have found that there will be a small "bump" near intersection points, which is consistent with Miropol'sky's observation. In fact, we can loosely regard Earth's surface and the CMB as independent waveguides. For those modes that are far from the intersection points, their eigenfunctions are localized in the corresponding waveguides. However, those that are close to intersection points will have physical features of both waveguides, which means they cannot be classified in either family. Our results improve on Okal
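
    The Rayleigh-Ritz Galerkin discretization described above turns the Sturm-Liouville problem into a generalized eigenvalue problem A v = w B v. As a toy illustration of that algebraic structure only (not of the authors' solver), here is a stdlib-only Python sketch for the symmetric 2x2 case, normalizing eigenvectors to be B-orthonormal in the same way the abstract requires orthonormality of the eigenfunctions:

```python
from math import sqrt

def generalized_eig_2x2(A, B):
    """Solve the symmetric 2x2 generalized eigenproblem A v = w B v
    (B symmetric positive definite) via det(A - w B) = 0, returning
    eigenvalues and eigenvectors normalized so that v^T B v = 1."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    # det(A - wB) expands to c2*w^2 + c1*w + c0
    c2 = b11 * b22 - b12 * b12
    c1 = -(a11 * b22 + a22 * b11 - 2.0 * a12 * b12)
    c0 = a11 * a22 - a12 * a12
    disc = sqrt(c1 * c1 - 4.0 * c2 * c0)
    eigvals = sorted([(-c1 - disc) / (2.0 * c2), (-c1 + disc) / (2.0 * c2)])
    eigvecs = []
    for w in eigvals:
        # From the first row of (A - wB) v = 0 (assumes the row is nondegenerate).
        v1, v2 = -(a12 - w * b12), a11 - w * b11
        # B-normalization: the discrete analogue of orthonormal eigenfunctions.
        nrm = sqrt(v1 * (b11 * v1 + b12 * v2) + v2 * (b12 * v1 + b22 * v2))
        eigvecs.append((v1 / nrm, v2 / nrm))
    return eigvals, eigvecs

A = ((2.0, 1.0), (1.0, 3.0))
B = ((1.0, 0.0), (0.0, 1.0))
vals, vecs = generalized_eig_2x2(A, B)
print([round(v, 6) for v in vals])  # [1.381966, 3.618034]
```

    For B = I this reduces to the ordinary eigenproblem; in the real solver B is the Galerkin mass matrix, and enforcing B-orthonormality is what prevents the cross-contamination of nearly degenerate modes.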

  2. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories into up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which is missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is briefly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  3. Autonomous classification models in ubiquitous environments

    OpenAIRE

    Abad Arranz, Miguel Angel

    2016-01-01

    The stream-mining approach is defined as a set of cutting-edge techniques designed to process streams of data in real time, in order to extract knowledge. In the particular case of classification, stream mining has to adapt its behaviour to the volatile underlying data distributions, a phenomenon that has been called concept drift. Moreover, it is important to note that concept drift may lead to situations where predictive models become invalid and therefore have to be updated to represent the actual concepts...
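
    Detecting when a model has become invalid is commonly explained via error-rate monitoring. Below is a minimal, stdlib-only Python sketch in the spirit of the classic DDM detector of Gama et al.; the thresholds, the `DriftMonitor` name and the toy error stream are illustrative assumptions, not the thesis's own method:

```python
class DriftMonitor:
    """Minimal error-rate drift detector in the spirit of DDM (Gama et al.):
    track the online error rate p and its standard deviation s, remember the
    best p_min + s_min seen so far, and warn / signal drift when p + s
    exceeds it by 2 / 3 standard deviations. A sketch only."""
    def __init__(self, warn_k=2.0, drift_k=3.0):
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")
        self.warn_k, self.drift_k = warn_k, drift_k

    def update(self, is_error):
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n
        s = (p * (1.0 - p) / self.n) ** 0.5
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_k * self.s_min:
            return "drift"
        if p + s > self.p_min + self.warn_k * self.s_min:
            return "warning"
        return "stable"

# An accurate stretch followed by a burst of errors (simulated concept drift).
monitor = DriftMonitor()
stream = [1] + [0] * 39 + [1] * 20
states = [monitor.update(err) for err in stream]
print(states[39], states[-1])  # stable drift
```

    Once the monitor signals drift, the predictive model would be retrained on recent examples, which is the update step the abstract alludes to.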

  4. Analysis of composition-based metagenomic classification.

    Science.gov (United States)

    Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro

    2012-01-01

    An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent genuinely similar metagenomic fragments differently, also leading to low configuration scores. Regarding the similarity measure, in
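
    The n-mer encoding and similarity scoring described above can be sketched in a few lines of stdlib Python; the DNA fragments and the choice of cosine similarity are illustrative assumptions (the study compares several similarity measures):

```python
from collections import Counter
from math import sqrt

def nmer_vector(seq, n):
    """Relative-frequency vector of overlapping n-mers in a sequence."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def cosine_similarity(u, v):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(a * a for a in u.values())) * sqrt(sum(b * b for b in v.values()))
    return dot / norm if norm else 0.0

# Two fragments differing in a single base: longer n-mers separate them more.
frag_a = "ACGTACGTGGCA"
frag_b = "ACGTACGTGGCT"
for n in (2, 4, 8):
    print(n, round(cosine_similarity(nmer_vector(frag_a, n), nmer_vector(frag_b, n)), 3))
```

    As n grows the frequency vectors become sparser and the similarity between near-identical fragments drops, which mirrors the trade-off the abstract describes for large n.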

  5. Surface electromyogram signals classification based on bispectrum.

    Science.gov (United States)

    Orosco, Eugenio; Lopez, Natalia; Soria, Carlos; di Sciascio, Fernando

    2010-01-01

    In this paper, the bispectrum is used to classify human arm movements and control a robotic arm based on surface electromyogram (sEMG) signals from the upper limb. We use the bispectrum, based on the third-order cumulant, to parameterize sEMG signals and classify elbow flexion and extension, forearm pronation and supination, and rest states with an artificial neural network (ANN). Finally, a robotic manipulator is controlled based on the classification and the parameters extracted from the signals. The whole process runs in real time under the QNX® operating system.
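
    The third-order cumulant that underlies the bispectrum is straightforward to estimate. A stdlib-only Python sketch (the synthetic signal and lag range are illustrative, not the authors' implementation):

```python
def third_order_cumulant(x, max_lag):
    """Estimate the third-order cumulant C3(i, j) = E[x(n) x(n+i) x(n+j)]
    of a signal after mean removal; the bispectrum is the 2-D Fourier
    transform of this cumulant."""
    n = len(x)
    mu = sum(x) / n
    x = [v - mu for v in x]  # enforce zero mean
    c3 = {}
    for i in range(max_lag + 1):
        for j in range(max_lag + 1):
            upper = n - max(i, j)
            c3[(i, j)] = sum(x[k] * x[k + i] * x[k + j] for k in range(upper)) / n
    return c3

# Toy sEMG-like signal; the flattened cumulant values could feed an ANN.
import random
random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(256)]
features = third_order_cumulant(signal, max_lag=4)
print(len(features))  # 25 lag pairs
```

    Note the symmetry C3(i, j) = C3(j, i), which roughly halves the number of independent features passed to the classifier.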

  6. Mechanism-based drug exposure classification in pharmacoepidemiological studies

    NARCIS (Netherlands)

    Verdel, B.M.

    2010-01-01

    Mechanism-based classification of drug exposure in pharmacoepidemiological studies In pharmacoepidemiology and pharmacovigilance, the relation between drug exposure and clinical outcomes is crucial. Exposure classification in pharmacoepidemiological studies is traditionally based on

  7. A multifactorial likelihood model for MMR gene variant classification incorporating probabilities based on sequence bioinformatics and tumor characteristics: a report from the Colon Cancer Family Registry.

    Science.gov (United States)

    Thompson, Bryony A; Goldgar, David E; Paterson, Carol; Clendenning, Mark; Walters, Rhiannon; Arnold, Sven; Parsons, Michael T; Walsh, Michael D; Gallinger, Steven; Haile, Robert W; Hopper, John L; Jenkins, Mark A; Lemarchand, Loic; Lindor, Noralane M; Newcomb, Polly A; Thibodeau, Stephen N; Young, Joanne P; Buchanan, Daniel D; Tavtigian, Sean V; Spurdle, Amanda B

    2013-01-01

    Mismatch repair (MMR) gene sequence variants of uncertain clinical significance are often identified in suspected Lynch syndrome families, and this constitutes a challenge for both researchers and clinicians. Multifactorial likelihood model approaches provide a quantitative measure of MMR variant pathogenicity, but first require input of likelihood ratios (LRs) for different MMR variation-associated characteristics from appropriate, well-characterized reference datasets. Microsatellite instability (MSI) and somatic BRAF tumor data for unselected colorectal cancer probands of known pathogenic variant status were used to derive LRs for tumor characteristics using the Colon Cancer Family Registry (CFR) resource. These tumor LRs were combined with variant segregation within families, and estimates of prior probability of pathogenicity based on sequence conservation and position, to analyze 44 unclassified variants identified initially in Australasian Colon CFR families. In addition, in vitro splicing analyses were conducted on the subset of variants based on bioinformatic splicing predictions. The LR in favor of pathogenicity was estimated to be ~12-fold for a colorectal tumor with a BRAF mutation-negative MSI-H phenotype. For 31 of the 44 variants, the posterior probabilities of pathogenicity were such that altered clinical management would be indicated. Our findings provide a working multifactorial likelihood model for classification that carefully considers mode of ascertainment for gene testing. © 2012 Wiley Periodicals, Inc.
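
    The multifactorial combination itself is a simple Bayes calculation on the odds scale. A stdlib-only Python sketch of the general arithmetic (the neutral prior of 0.5 is an illustrative assumption; only the ~12-fold tumor LR comes from the abstract):

```python
def posterior_probability(prior, likelihood_ratios):
    """Multifactorial combination on the odds scale:
    posterior odds = prior odds * product of independent likelihood ratios,
    converted back to a probability. A sketch of the general calculation,
    not the calibrated model of the paper."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

# Illustrative only: a neutral prior of 0.5 combined with the ~12-fold
# LR reported for a BRAF mutation-negative MSI-H colorectal tumor.
p = posterior_probability(0.5, [12.0])
print(round(p, 3))  # 0.923
```

    In the actual model the prior comes from sequence conservation and position, and further LRs (e.g. from segregation) multiply in the same way before clinical thresholds are applied.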

  8. Quality-based Multimodal Classification Using Tree-Structured Sparsity

    Science.gov (United States)

    2014-03-08

    ASI Series F, Computer and Systems Sciences, 163:446–456, 1999. [7] D. Hall and J. Llinas. An introduction to multisensor data fusion. Proceedings of... advantages of information fusion based on sparsity models for multimodal classification. Among several sparsity models, tree-structured sparsity provides... rithm is proposed to solve the optimization problem, which is an efficient tool for feature-level fusion among either homogeneous or heterogeneous

  9. [Automatic classification method of star spectrum data based on classification pattern tree].

    Science.gov (United States)

    Zhao, Xu-Jun; Cai, Jiang-Hui; Zhang, Ji-Fu; Yang, Hai-Feng; Ma, Yang

    2013-10-01

    Frequent patterns, which appear frequently in a data set, play an important role in data mining. For stellar spectrum classification tasks, a classification rule mining method based on a classification pattern tree is presented on the basis of frequent patterns. The procedure is as follows. Firstly, a new tree structure, i.e., the classification pattern tree, is introduced, based on the different frequencies of stellar spectral attributes in the database and their different importance for classification. The related concepts and the construction method of the classification pattern tree are also described in this paper. Then, the characteristics of the stellar spectrum are mapped to the classification pattern tree. Two traversal modes, top-down and bottom-up, are used to traverse the classification pattern tree and extract the classification rules. Meanwhile, the concept of pattern capability is introduced to adjust the number of classification rules and improve the construction efficiency of the classification pattern tree. Finally, the SDSS (Sloan Digital Sky Survey) stellar spectral data provided by the National Astronomical Observatory are used to verify the accuracy of the method. The results show that a higher classification accuracy has been achieved.
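
    The construction step can be pictured with a stdlib-only Python sketch in the spirit of an FP-tree: each record's attributes are reordered by global frequency so that frequent attributes sit near the root. The discretized spectral attributes below are hypothetical, and a real classification pattern tree additionally attaches class labels and pattern-capability information:

```python
from collections import Counter

class Node:
    def __init__(self, item):
        self.item, self.count, self.children = item, 0, {}

def build_pattern_tree(records):
    """Build a prefix tree over attribute values, with items in each record
    reordered by their global frequency (frequent items near the root)."""
    freq = Counter(item for rec in records for item in rec)
    root = Node(None)
    for rec in records:
        node = root
        for item in sorted(rec, key=lambda it: (-freq[it], it)):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root

# Toy discretized spectral-attribute records (hypothetical feature names).
records = [{"Halpha:strong", "color:blue"},
           {"Halpha:strong", "color:blue", "CaII:weak"},
           {"Halpha:strong", "color:red"}]
tree = build_pattern_tree(records)
top = tree.children["Halpha:strong"]
print(top.count)  # 3: the most frequent attribute sits directly under the root
```

    Top-down and bottom-up traversals of such a tree then enumerate the candidate rule antecedents.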

  10. ICF-based classification and measurement of functioning.

    Science.gov (United States)

    Stucki, G; Kostanjsek, N; Ustün, B; Cieza, A

    2008-09-01

    If we aim towards a comprehensive understanding of human functioning and the development of comprehensive programs to optimize the functioning of individuals and populations, we need to develop suitable measures. The approval of the International Classification of Functioning, Disability and Health (ICF) in 2001 by the 54th World Health Assembly as the first universally shared model and classification of functioning, disability and health therefore marks an important step in the development of measurement instruments and ultimately in our understanding of functioning, disability and health. The acceptance and use of the ICF as a reference framework and classification has been facilitated by its development in a worldwide, comprehensive consensus process and by the increasing evidence regarding its validity. However, the broad acceptance and use of the ICF as a reference framework and classification will also depend on the resolution of conceptual and methodological challenges relevant to the classification and measurement of functioning. This paper therefore first describes how the ICF categories can serve as building blocks for the measurement of functioning, and then the current state of the development of ICF-based practical tools and international standards such as the ICF Core Sets. Finally, it illustrates how to map the world of measures to the ICF and vice versa, the methodological principles relevant to the transformation of information obtained with a clinical test or a patient-oriented instrument to the ICF, and the development of ICF-based clinical and self-reported measurement instruments.

  11. Reducing Spatial Data Complexity for Classification Models

    International Nuclear Information System (INIS)

    Ruta, Dymitr; Gabrys, Bogdan

    2007-01-01

    Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling up to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is the proposition of a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. On the example of the Parzen Labelled Data Compressor (PLDC) we demonstrate a simulatory data condensation process directly inspired by electrostatic field interaction, where the data are moved and merged following the attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result we achieved a model that reduces labelled datasets much further than any competitive approach, yet with maximum retention of the original class densities and hence the classification performance.
PLDC leaves the reduced dataset with the soft accumulative class weights allowing for efficient online updates and as shown in a series of experiments if coupled with Parzen Density Classifier (PDC) significantly outperforms competitive data condensation methods in terms of classification performance at the
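
    The classification stage can be pictured with a stdlib-only Python sketch of a weighted Parzen (kernel density) classifier in the spirit of PDC; the 1-D values, bandwidth and soft weights below are toy assumptions, and the electrostatic condensation process itself is not reproduced:

```python
from collections import defaultdict
from math import exp, pi, sqrt

def parzen_classify(x, labelled, h=0.5):
    """Classify x by comparing weighted Parzen (Gaussian-kernel) density
    estimates per class; `labelled` holds (value, label, weight) triples,
    where the weights play the role of accumulated soft class weights
    on a condensed dataset."""
    scores = defaultdict(float)
    norm = 1.0 / (h * sqrt(2.0 * pi))
    for value, label, weight in labelled:
        scores[label] += weight * norm * exp(-0.5 * ((x - value) / h) ** 2)
    return max(scores, key=scores.get)

# A condensed dataset: two merged prototypes per class, with soft weights.
data = [(0.0, "A", 3.0), (1.0, "A", 2.0), (4.0, "B", 2.0), (5.0, "B", 3.0)]
print(parzen_classify(0.5, data), parzen_classify(4.5, data))
```

    Because the weights enter the density sums additively, merging points or absorbing a new labelled example is a cheap weight update, which is what makes efficient online updates possible.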

  12. Nominated Texture Based Cervical Cancer Classification

    Directory of Open Access Journals (Sweden)

    Edwin Jayasingh Mariarputham

    2015-01-01

    Full Text Available Accurate classification of Pap smear images is a challenging task in medical image processing. It can be improved in two ways: one is by selecting suitable, well-defined specific features, and the other is by selecting the best classifier. This paper presents a nominated texture based cervical cancer (NTCC) classification system which classifies Pap smear images into any one of seven classes. This is achieved by extracting well-defined texture features and selecting the best classifier. Seven sets of texture features (24 features) are extracted, which include relative size of nucleus and cytoplasm, dynamic range and first four moments of intensities of nucleus and cytoplasm, relative displacement of nucleus within the cytoplasm, gray level co-occurrence matrix, local binary pattern histogram, Tamura features, and edge orientation histogram. Several types of support vector machine (SVM) and neural network (NN) classifiers are used for the classification. The performance of the NTCC algorithm is tested and compared to other algorithms on the public image database of Herlev University Hospital, Denmark, with 917 Pap smear images. The SVM output is found to be the best for most of the classes, with good results for the remaining classes.

  13. Remote Sensing Image Classification Based on Stacked Denoising Autoencoder

    Directory of Open Access Journals (Sweden)

    Peng Liang

    2017-12-01

    Full Text Available Focused on the issue that conventional remote sensing image classification methods have run into a bottleneck in accuracy, a new remote sensing image classification method inspired by deep learning is proposed, based on the Stacked Denoising Autoencoder. First, the deep network model is built by stacking layers of Denoising Autoencoders. Then, with noised input, the unsupervised greedy layer-wise training algorithm is used to train each layer in turn to obtain more robust representations; the features are then refined by supervised learning with a Back Propagation (BP) neural network, and the whole network is optimized by error back propagation. Finally, Gaofen-1 satellite (GF-1) remote sensing data are used for evaluation, and the total accuracy and kappa accuracy reach 95.7% and 0.955, respectively, which are higher than those of the Support Vector Machine and the Back Propagation neural network. The experimental results show that the proposed method can effectively improve the accuracy of remote sensing image classification.

  14. Directional wavelet based features for colonic polyp classification.

    Science.gov (United States)

    Wimmer, Georg; Tamaki, Toru; Tischendorf, J J W; Häfner, Michael; Yoshida, Shigeto; Tanaka, Shinji; Uhl, Andreas

    2016-07-01

    -of-the-art methods on most of the databases. We will also show that the Weibull distribution is better suited to model the subband coefficient distribution than other commonly used probability distributions like the Gaussian distribution and the generalized Gaussian distribution. So this work gives a reasonable summary of wavelet based methods for colonic polyp classification and the huge amount of endoscopic polyp databases used for our experiments assures a high significance of the achieved results. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  15. Discriminative Analysis of Different Grades of Gaharu (Aquilaria malaccensis Lamk. via 1H-NMR-Based Metabolomics Using PLS-DA and Random Forests Classification Models

    Directory of Open Access Journals (Sweden)

    Siti Nazirah Ismail

    2017-09-01

    Full Text Available Gaharu (agarwood, Aquilaria malaccensis Lamk.) is a valuable tropical rainforest product traded internationally for its distinctive fragrance. It is not only popular as incense and in perfumery, but also favored in traditional medicine due to its sedative, carminative, cardioprotective and analgesic effects. The current study addresses the chemical differences and similarities between gaharu samples of different grades, obtained commercially, using 1H-NMR-based metabolomics. Two classification models, partial least squares-discriminant analysis (PLS-DA) and Random Forests, were developed to classify the gaharu samples on the basis of their chemical constituents. The gaharu samples could be reclassified into a ‘high grade’ group (samples A, B and D), characterized by high contents of kusunol, jinkohol, and 10-epi-γ-eudesmol; an ‘intermediate grade’ group (samples C, F and G), dominated by fatty acid and vanillic acid; and a ‘low grade’ group (samples E and H), which had higher contents of aquilarone derivatives and phenylethyl chromones. The results showed that 1H-NMR-based metabolomics can be a potential method to grade the quality of gaharu samples on the basis of their chemical constituents.

  16. Genome-Based Taxonomic Classification of Bacteroidetes

    Science.gov (United States)

    Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N.; Woyke, Tanja; Kyrpides, Nikos C.; Klenk, Hans-Peter; Göker, Markus

    2016-01-01

    The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved. PMID:28066339

  17. A Model for Classification Secondary School Student Enrollment Approval Based on E-Learning Management System and E-Games

    OpenAIRE

    Hany Mohamed El-katary; Essam M. Ramzy Hamed; Safaa Sayed Mahmoud

    2016-01-01

    The student is the key to the educational process, where students’ creativity and interactions are strongly encouraged. There are many tools embedded in Learning Management Systems (LMS) that serve as goal evaluations of learners. A problem that has recently appeared is that the assessment process is not always fair or accurate in classifying students according to accumulated knowledge. Therefore, there is a need to apply a new model for better decision making for students’ enrollment and assessme...

  18. Classification of archaeological pieces into their respective stratum by a chemometric model based on the soil concentration of 25 selected elements

    International Nuclear Information System (INIS)

    Carrero, J.A.; Goienaga, N.; Fdez-Ortiz de Vallejuelo, S.; Arana, G.; Madariaga, J.M.

    2010-01-01

    The aim of this work was to demonstrate that an archaeological ceramic piece has remained buried underground in the same stratum for centuries without being removed. For this purpose, a chemometric model based on the Principal Component Analysis, Soft Independent Modelling of Class Analogy and Linear Discriminant Analysis classification techniques was created from the concentrations of selected elements in both the soil of the stratum and the soil adhered to the ceramic piece. Ceramic pieces from four different stratigraphic units, coming from a Roman archaeological site in Alava (North of Spain), and their respective stratum soils were collected. The soil adhered to the ceramic pieces was removed and treated in the same way as the soil from its respective stratum. The digestion was carried out following the US Environmental Protection Agency EPA 3051A method. A total of 54 elements were determined in the extracts by a rapid screening inductively coupled plasma mass spectrometry method. After rejecting the major elements and those which could have changed from the original composition of the soils (migration or retention from/to the buried objects), the following 25 elements were finally taken into account to construct the model: Li, V, Co, As, Y, Nb, Sn, Ba, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Au, Th and U. A total of 33 subsamples were treated from 10 soils belonging to 4 different stratigraphic units. The final model groups and discriminates the samples into four groups, according to the stratigraphic unit, with both the stratum soils and the soils adhered to the pieces falling in the same group.
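
    As a deliberately simplified stand-in for the PCA/SIMCA/LDA pipeline, the grouping idea can be sketched as nearest-centroid assignment on autoscaled element concentrations. The concentrations and stratigraphic-unit labels below are toy assumptions:

```python
from math import dist  # Python 3.8+
from statistics import mean, pstdev

def nearest_stratum(sample, train):
    """Assign a soil sample (vector of element concentrations) to the
    stratigraphic unit whose centroid is nearest after autoscaling.
    A simplified stand-in for the paper's PCA/SIMCA/LDA model."""
    cols = list(zip(*[x for x, _ in train]))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    scale = lambda x: [(v - m) / s for v, m, s in zip(x, mu, sd)]
    centroids = {}
    for x, unit in train:
        centroids.setdefault(unit, []).append(scale(x))
    return min(centroids,
               key=lambda u: dist(scale(sample),
                                  [mean(c) for c in zip(*centroids[u])]))

# Toy concentrations (e.g. Li, V, Co) for two strata; units are arbitrary.
train = [([12.0, 30.0, 5.0], "UE-1"), ([11.0, 32.0, 6.0], "UE-1"),
         ([25.0, 10.0, 2.0], "UE-2"), ([27.0, 11.0, 1.0], "UE-2")]
print(nearest_stratum([11.5, 31.0, 5.5], train))  # UE-1
```

    The autoscaling step matters: without it, high-concentration elements would dominate the distance, which is the same reason the paper restricts the model to carefully selected trace elements.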

  19. Classification of Lactococcus lactis cell envelope proteinase based on gene sequencing, peptides formed after hydrolysis of milk, and computer modeling

    DEFF Research Database (Denmark)

    Børsting, Mette Winther; Qvist, K.B.; Brockmann, E.

    2015-01-01

    Lactococcus lactis strains depend on a proteolytic system for growth in milk to release essential AA from casein. The cleavage specificities of the cell envelope proteinase (CEP) can vary between strains and environments and with whether the enzyme is released or bound to the cell wall. Thirty-eight Lc. lactis strains were grouped according to their CEP AA sequences and according to identified peptides after hydrolysis of milk. Finally, AA positions in the substrate binding region were suggested by the use of a new CEP template based on the Streptococcus C5a CEP. Aligning the CEP AA sequences of 38 strains...

  20. Cirrhosis Classification Based on Texture Classification of Random Features

    Directory of Open Access Journals (Sweden)

    Hui Liu

    2014-01-01

    Full Text Available Accurate staging of hepatic cirrhosis is important for investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and assist them in choosing a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. So in this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not yet meet the clinical needs for cirrhosis, and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. So, extracting texture features is the primary task. Compared with typical gray level co-occurrence matrix (GLCM) features, texture classification from random features provides an effective alternative, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF does not need strong assumptions except the sparse character of the image, contains sufficient texture information, comprises a concise and effective process, and makes case decisions with high accuracy. Experimental results also illustrate its satisfying performance, which is compared with a typical NN using GLCM features.

  1. Energy-efficiency based classification of the manufacturing workstation

    Science.gov (United States)

    Frumuşanu, G.; Afteni, C.; Badea, N.; Epureanu, A.

    2017-08-01

    EU Directive 92/75/EC established for the first time an energy consumption labelling scheme, further implemented by several other directives. As a consequence, nowadays many products (e.g. home appliances, tyres, light bulbs, houses) have an EU Energy Label when offered for sale or rent. Several energy consumption models of manufacturing equipment have also been developed. This paper proposes an energy-efficiency-based classification of the manufacturing workstation, aiming to characterize its energetic behaviour. The concept of energy efficiency of the manufacturing workstation is defined. On this basis, a classification methodology has been developed. It covers specific criteria and their evaluation modalities, together with the definition and delimitation of energy efficiency classes. The position of the energy class is defined by the amount of energy needed by the workstation at the middle point of its operating domain, while its extension is determined by the value of the first coefficient of the Taylor series that approximates the dependence between the energy consumption and the chosen parameter of the working regime. The main domain of interest for this classification appears to be the optimization of manufacturing activity planning and programming. A case study regarding the classification of an actual lathe from the energy-efficiency point of view, based on two different approaches (analytical and numerical), is also included.
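
    The position/extension idea can be sketched in stdlib Python: evaluate the energy model at the midpoint of the operating domain for the class position, and its first-order Taylor coefficient (slope) there for the class extension. The class thresholds, labels and the linear lathe model are illustrative assumptions, not values from the paper:

```python
def energy_class(energy_model, lo, hi, thresholds=(1.0, 2.0, 4.0)):
    """Class position: energy needed at the middle of the operating domain
    [lo, hi]; class extension: first-order Taylor coefficient (slope) of the
    energy vs. working-regime parameter there, by central difference.
    Thresholds and labels are hypothetical class boundaries."""
    mid = 0.5 * (lo + hi)
    eps = 1e-6 * (hi - lo)
    e_mid = energy_model(mid)
    slope = (energy_model(mid + eps) - energy_model(mid - eps)) / (2.0 * eps)
    for label, bound in zip("ABCD", thresholds):
        if e_mid < bound:
            return label, e_mid, slope
    return "D", e_mid, slope

# Hypothetical lathe energy model E(v) over a cutting-speed domain [50, 250].
label, e_mid, slope = energy_class(lambda v: 0.5 + 0.004 * v, 50.0, 250.0)
print(label, round(e_mid, 2), round(slope, 4))  # B 1.1 0.004
```

    Separating position (midpoint consumption) from extension (slope) lets a planner distinguish two workstations with equal nominal consumption but different sensitivity to the working regime.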

  2. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Full Text Available Abstract Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machine (SVM) methods, and produce highly interpretable models.
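
    The interpretability claim is easiest to see in code: each mined rule is a readable "substructure set implies activity class" statement with explicit support and confidence. Below is a stdlib-only, deliberately simplified CBA-style sketch (real CPAR/CMAR add pruning, rule ranking and multiple-rule voting; the substructure descriptors and activity labels are hypothetical):

```python
from itertools import combinations

def mine_rules(records, min_support=0.4, min_confidence=0.7):
    """Mine class association rules {features} -> class from labelled
    records, keeping rules above support/confidence thresholds."""
    n = len(records)
    rules, seen = [], set()
    for size in (1, 2):
        for features, _ in records:
            for antecedent in combinations(sorted(features), size):
                if antecedent in seen:
                    continue
                seen.add(antecedent)
                cover = [cls for feats, cls in records if set(antecedent) <= feats]
                support = len(cover) / n
                if support < min_support:
                    continue
                for cls in set(cover):
                    confidence = cover.count(cls) / len(cover)
                    if confidence >= min_confidence:
                        rules.append((antecedent, cls, support, confidence))
    # Highest confidence first, then support; antecedent breaks ties.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0]))

def classify(features, rules, default="inactive"):
    """Predict with the best-ranked rule whose antecedent matches."""
    for antecedent, cls, _, _ in rules:
        if set(antecedent) <= features:
            return cls
    return default

# Hypothetical binary substructure descriptors for a toy activity dataset.
data = [({"ring", "amine"}, "active"), ({"ring", "amine"}, "active"),
        ({"ring"}, "inactive"), ({"halogen"}, "inactive"),
        ({"ring", "halogen"}, "inactive")]
rules = mine_rules(data)
print(classify({"ring", "amine", "halogen"}, rules))  # active
```

    Each tuple in `rules` can be read off directly as a medicinal-chemistry hypothesis, which is the interpretability advantage over SVM-style black boxes noted in the abstract.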

  3. Histological image classification using biologically interpretable shape-based features

    International Nuclear Information System (INIS)

    Kothari, Sonal; Phan, John H; Young, Andrew N; Wang, May D

    2013-01-01

    Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions.
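
    Fourier shape descriptors themselves are compact to compute: take the DFT of the boundary treated as a complex sequence, then discard the components that encode pose. A stdlib-only Python sketch of the general technique (the square boundary is illustrative, not the paper's segmentation output):

```python
import cmath

def fourier_shape_descriptors(boundary, k=8):
    """Fourier shape descriptors of a closed boundary: DFT of the complex
    point sequence; dropping the DC term removes translation, taking
    magnitudes removes rotation/starting point, dividing by |F1| removes
    scale. Assumes |F1| != 0, which holds for simple closed boundaries."""
    n = len(boundary)
    z = [complex(x, y) for x, y in boundary]
    coeffs = [sum(z[t] * cmath.exp(-2j * cmath.pi * u * t / n) for t in range(n))
              for u in range(k + 1)]
    mags = [abs(c) for c in coeffs]
    return [m / mags[1] for m in mags[2:]]

# An 8-point square boundary, then the same square scaled and translated:
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
moved = [(10 + 3 * x, 7 + 3 * y) for x, y in square]
d1 = fourier_shape_descriptors(square, k=4)
d2 = fourier_shape_descriptors(moved, k=4)
print(all(abs(a - b) < 1e-9 for a, b in zip(d1, d2)))  # True
```

    The invariances are what make such descriptors biologically interpretable: two nuclei of the same shape yield the same feature vector regardless of where, how large, or at what orientation they appear in the image.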

  4. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today's industry and academia. Text classification is a very important technology in data mining, and there are many mature algorithms for it: KNN, NB, AB, SVM, decision trees and other classification methods all show good classification performance. The Support Vector Machine (SVM) is a well-established classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, using support vector machines to classify Chinese text and to bring academic research and practical application together.
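A minimal way to see what "an SVM learns the decision boundary" means in practice is the Pegasos stochastic sub-gradient trainer for a linear SVM. The toy term-count vectors below are invented stand-ins for text features, which in the study would come from a real Chinese corpus:

```python
import random

def pegasos_train(X, y, lam=0.01, epochs=200, seed=0):
    """Minimal Pegasos (stochastic sub-gradient) trainer for a linear SVM.

    X: list of dense feature vectors (e.g. term frequencies), y in {-1, +1}.
    """
    rng = random.Random(seed)
    dim = len(X[0])
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # sub-gradient step on hinge loss + L2 regularizer
            w = [(1 - eta * lam) * wj for wj in w]
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# hypothetical term-count vectors for two topics: [topic-A terms, topic-B terms]
X = [[3, 0], [4, 1], [2, 0], [0, 3], [1, 4], [0, 2]]
y = [1, 1, 1, -1, -1, -1]
w = pegasos_train(X, y)
print([predict(w, x) for x in X])  # separable toy data: labels are recovered
```

In a real text pipeline the dense vectors would be replaced by sparse TF-IDF features, but the decision rule sign(w·x) is unchanged.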

  5. Contextual segment-based classification of airborne laser scanner data

    NARCIS (Netherlands)

    Vosselman, George; Coenen, Maximilian; Rottensteiner, Franz

    2017-01-01

    Classification of point clouds is needed as a first step in the extraction of various types of geo-information from point clouds. We present a new approach to contextual classification of segmented airborne laser scanning data. Potential advantages of segment-based classification are easily offset

  6. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    Science.gov (United States)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    Ionosphere is an anisotropic, inhomogeneous, time-varying and spatio-temporally dispersive medium whose parameters can almost always be estimated only through indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify due to the extensive range in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to identify the regional effects of global geomagnetic storms and to classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, will be used to model the regional and local variability that differs from global activity, along with solar and geomagnetic indices. In this work, for the automated classification of the regional disturbances, a classification technique based on Support Vector Machines (SVMs), a robust machine learning method that has found widespread use, is proposed. SVM is a supervised learning model for classification, with an associated learning algorithm that analyzes the data and recognizes patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher-dimensional feature spaces. Performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. As a result of implementing the developed classification

  7. Classification of customer lifetime value models using Markov chain

    Science.gov (United States)

    Permana, Dony; Pasaribu, Udjianna S.; Indratno, Sapto W.; Suprayogi

    2017-10-01

    A firm’s potential reward from a customer in future time can be determined by customer lifetime value (CLV). There are several mathematical methods to calculate it; one is a Markov chain stochastic model. Here, a customer is assumed to pass through a number of states, with transitions between the states following the Markov property. Given a set of states for a customer and the relationships between those states, we can build Markov models that describe the customer's behavior. In these models, CLV is defined as a vector containing the CLV of a customer in each initial state. In this paper we present a classification of Markov models for calculating CLV. Starting from a two-state customer model, we develop models with more states, each development addressing weaknesses of the previous model. The final models can be expected to describe the real behavior of a firm's customers.
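The CLV-as-a-vector idea can be made concrete with a small sketch: given a transition matrix, per-state rewards, and a discount rate, accumulate expected discounted rewards over a finite horizon. The two-state active/churned model and all its numbers are illustrative, not taken from the paper:

```python
def clv(P, rewards, discount=0.1, horizon=20):
    """Customer lifetime value per starting state for a Markov chain model.

    CLV_s = sum over t = 0..horizon of [P^t r]_s / (1 + discount)^t,
    where P is the state-transition matrix and r the per-period reward.
    """
    n = len(P)
    value = [0.0] * n
    # dist_rows[s] holds row s of P^t; start with P^0 = identity
    dist_rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for t in range(horizon + 1):
        factor = (1 + discount) ** t
        for s in range(n):
            value[s] += sum(dist_rows[s][j] * rewards[j] for j in range(n)) / factor
        # advance one step: dist_rows <- dist_rows @ P
        dist_rows = [[sum(row[k] * P[k][j] for k in range(n)) for j in range(n)]
                     for row in dist_rows]
    return value

# hypothetical two-state model: state 0 = active customer, state 1 = churned
P = [[0.8, 0.2],   # active stays active with probability 0.8
     [0.0, 1.0]]   # churned is absorbing
rewards = [100.0, 0.0]
print(clv(P, rewards))  # CLV starting active (~366) vs. starting churned (0)
```

Adding states (e.g. occasional vs. loyal customers) only enlarges `P` and `rewards`; the CLV vector computation is identical, which is what makes the family of models easy to extend.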

  8. G0-WISHART Distribution Based Classification from Polarimetric SAR Images

    Science.gov (United States)

    Hu, G. C.; Zhao, Q. H.

    2017-09-01

    Enormous scientific and technical developments have been carried out over recent decades to further improve remote sensing, particularly the Polarimetric Synthetic Aperture Radar (PolSAR) technique, so classification methods based on PolSAR images have received much attention from scholars and related institutions around the world. The multilook polarimetric G0-Wishart model is a flexible model that describes homogeneous, heterogeneous and extremely heterogeneous regions in the image. Moreover, the polarimetric G0-Wishart distribution does not involve the modified Bessel function of the second kind; it is a simple statistical distribution model with few parameters. To prove its feasibility, a classification process has been tested on a fully polarimetric Synthetic Aperture Radar (SAR) image. First, multilook polarimetric SAR data processing and a speckle filter are applied to reduce the influence of speckle on the classification result. The image is initially classified into sixteen classes by H/A/α decomposition. The ICM algorithm then classifies features based on the G0-Wishart distance. Qualitative and quantitative results show that the proposed method can classify polarimetric SAR data effectively and efficiently.

  9. A classification model for non-alcoholic steatohepatitis (NASH) using confocal Raman micro-spectroscopy

    Science.gov (United States)

    Yan, Jie; Yu, Yang; Kang, Jeon Woong; Tam, Zhi Yang; Xu, Shuoyu; Fong, Eliza Li Shan; Singh, Surya Pratap; Song, Ziwei; Tucker Kellogg, Lisa; So, Peter; Yu, Hanry

    2017-07-01

    We combined Raman micro-spectroscopy and machine learning techniques to develop a classification model based on a well-established non-alcoholic steatohepatitis (NASH) mouse model, using spectrum pre-processing, biochemical component analysis (BCA) and logistic regression.

  10. Classification of very high resolution SAR images of urban areas by dictionary-based mixture models, copulas, and Markov random fields using textural features

    Science.gov (United States)

    Voisin, Aurélie; Moser, Gabriele; Krylov, Vladimir A.; Serpico, Sebastiano B.; Zerubia, Josiane

    2010-10-01

    This paper addresses the problem of the classification of very high resolution (VHR) SAR amplitude images of urban areas. The proposed supervised method combines a finite mixture technique to estimate class-conditional probability density functions, Bayesian classification, and Markov random fields (MRFs). Textural features, such as those extracted by the grey-level co-occurrence method, are also integrated in the technique, as they help improve the discrimination of urban areas. Copulas are applied to estimate bivariate joint class-conditional statistics, merging the marginal distributions of both textural and SAR amplitude features. The resulting joint distribution estimates are plugged into a hidden MRF model, endowed with a modified Metropolis dynamics scheme for energy minimization. Experimental results with COSMO-SkyMed and TerraSAR-X images demonstrate the accuracy of the proposed method, also as compared with previous contextual classifiers.

  11. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem

    2008-01-01

    This paper presents a method for improving airway tree segmentation using vessel orientation information. We use the fact that an airway branch is always accompanied by an artery, with both structures having similar orientations. This work is based on a voxel classification airway segmentation...... method proposed previously. The probability of a voxel belonging to the airway, from the voxel classification method, is augmented with an orientation similarity measure as a criterion for region growing. The orientation similarity measure of a voxel indicates how similar the orientation...... of the surroundings of a voxel, estimated based on a tube model, is to that of a neighboring vessel. The proposed method is tested on 20 CT images from different subjects selected randomly from a lung cancer screening study. Length of the airway branches from the results of the proposed method are significantly...

  12. Soil classification basing on the spectral characteristics of topsoil samples

    Science.gov (United States)

    Liu, Huanjun; Zhang, Xiaokang; Zhang, Xinle

    2016-04-01

    Soil taxonomy plays an important role in soil utility and management, but China has only a coarse soil map created from 1980s data. New technology, e.g. spectroscopy, could simplify soil classification. This study attempts to classify soils based on the spectral characteristics of topsoil samples. 148 topsoil samples of typical soils, including Black soil, Chernozem, Blown soil and Meadow soil, were collected from the Songnen plain, Northeast China, and their laboratory spectral reflectance in the visible and near-infrared region (400-2500 nm) was processed with weighted moving average, resampling, and continuum removal. Spectral indices were extracted from the soil spectral characteristics, including the second absorption position of the spectral curve, the area of the first absorption vale, and the slope of the spectral curve at 500-600 nm and 1340-1360 nm. K-means clustering and a decision tree were then used to build soil classification models. The results indicated that 1) the second absorption positions of Black soil and Chernozem were located at 610 nm and 650 nm respectively; 2) the spectral curve of Meadow soil is similar to that of its adjacent soil, which could be due to soil erosion; 3) the decision tree model showed higher classification accuracy, with accuracies for Black soil, Chernozem, Blown soil and Meadow soil of 100%, 88%, 97% and 50% respectively; the accuracy for Blown soil could be increased to 100% by adding one more spectral index (the area of the first two vales) to the model, which shows that the model could be used for soil classification and soil mapping in the near future.
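Indices like these lend themselves to decision-tree-style rules. The sketch below hard-codes illustrative thresholds around the reported absorption positions (610 nm for Black soil, 650 nm for Chernozem); the remaining splits and all threshold values are invented, whereas the study learns its actual tree from the 148 samples:

```python
def classify_topsoil(second_absorption_nm, first_vale_area, slope_500_600):
    """Toy decision-tree-style classifier over spectral indices.

    The 610/650 nm absorption positions come from the abstract; every
    threshold and the lower splits are made-up placeholders.
    """
    if second_absorption_nm < 630:          # Black soil absorbs near 610 nm
        return "Black soil"
    if second_absorption_nm < 680:          # Chernozem absorbs near 650 nm
        return "Chernozem"
    # remaining classes split on invented curve-shape features
    if slope_500_600 > 0.05 and first_vale_area > 1.0:
        return "Meadow soil"
    return "Blown soil"

print(classify_topsoil(610, 0.8, 0.02))  # -> Black soil
```

A learned tree has the same if/else structure, which is why decision trees pair so naturally with interpretable spectral indices.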

  13. Partial imputation to improve predictive modelling in insurance risk classification using a hybrid positive selection algorithm and correlation-based feature selection

    CSIR Research Space (South Africa)

    Duma, M

    2013-09-01

    Full Text Available model is built using libSVM 2.91, a library tool for SVMs designed by Chih-Chung Chang and Chih-Jen Lin. The model also used the radial basis function (RBF) as the kernel function (KF) and the gamma parameter was set to 0.005 (derived by trial...., Dancing with dirty data: methods for exploring and cleaning data. In Casualty Actuarial Society Forum, Casualty Actuarial Society, Virginia, USA, pp. 198–254. 2. Peng, Y. and Kou, G., A comparative study of classification methods in financial risk...

  14. Overfitting Reduction of Text Classification Based on AdaBELM

    Directory of Open Access Journals (Sweden)

    Xiaoyue Feng

    2017-07-01

    Full Text Available Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can be quantitatively measured and identified. The newly proposed model can achieve high performance on multi-class text classification. To evaluate the generalizability of the new model, we designed experiments based on three datasets, i.e., the 20 Newsgroups, Reuters-21578, and BioMed corpora, which represent balanced, unbalanced, and real application data, respectively. Experiment results demonstrate that AdaBELM can reduce overfitting and outperform classical ELM, decision tree, random forests, and AdaBoost on all three text-classification datasets; for example, it can achieve 62.2% higher accuracy than ELM. Therefore, the proposed model has good generalizability.
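The base learner here, the extreme learning machine, is easy to sketch: a random, untrained hidden layer plus output weights solved in closed form by least squares. This is plain ELM, not AdaBELM, and the toy dataset is invented:

```python
import numpy as np

def elm_train(X, y, n_hidden=30, seed=0):
    """Minimal extreme learning machine: random hidden layer, output weights
    solved by least squares (a sketch of the base learner, not AdaBELM)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                  # closed-form output weights
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.sign(np.tanh(X @ W + b) @ beta)

# hypothetical toy data: label = sign of x1 + x2
X = np.array([[1., 2.], [2., 1.], [0.5, 0.5], [3., 0.],
              [-1., -2.], [-2., -1.], [-0.5, -0.5], [-3., 0.]])
y = np.array([1., 1., 1., 1., -1., -1., -1., -1.])
model = elm_train(X, y)
acc = (elm_predict(model, X) == y).mean()
print(acc)  # training accuracy; with 30 hidden units on 8 points this is 1.0
```

Because only `beta` is fitted, ELM trains in one linear solve, which is also why it overfits easily on high-dimensional sparse data, the problem AdaBELM targets.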

  15. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) are among the most successful algorithms for classification problems. An SVM learns the decision boundary from two classes (for binary classification) of training points. However, sometimes there are less meaningful samples among the training points, corrupted by noise or misplaced on the wrong side, called outliers. These outliers affect the margin and the classification performance, and the machine would do better to discard them. SVM as a popular and widely used cl...

  16. Structure-based classification and ontology in chemistry

    Directory of Open Access Journals (Sweden)

    Hastings Janna

    2012-04-01

    Full Text Available Abstract Background Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. Results We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. Conclusion Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational

  17. Model sparsity and brain pattern interpretation of classification models in neuroimaging

    DEFF Research Database (Denmark)

    Rasmussen, Peter Mondrup; Madsen, Kristoffer Hougaard; Churchill, Nathan W

    2012-01-01

    Interest is increasing in applying discriminative multivariate analysis techniques to the analysis of functional neuroimaging data. Model interpretation is of great importance in the neuroimaging context, and is conventionally based on a ‘brain map’ derived from the classification model. In this ...

  18. Classification models for the prediction of clinicians' information needs.

    Science.gov (United States)

    Del Fiol, Guilherme; Haug, Peter J

    2009-02-01

    Clinicians face numerous information needs during patient care activities and most of these needs are not met. Infobuttons are information retrieval tools that help clinicians to fulfill their information needs by providing links to on-line health information resources from within an electronic medical record (EMR) system. The aim of this study was to produce classification models based on medication infobutton usage data to predict the medication-related content topics (e.g., dose, adverse effects, drug interactions, patient education) that a clinician is most likely to choose while entering medication orders in a particular clinical context. We prepared a dataset with 3078 infobutton sessions and 26 attributes describing characteristics of the user, the medication, and the patient. In these sessions, users selected one out of eight content topics. Automatic attribute selection methods were then applied to the dataset to eliminate redundant and useless attributes. The reduced dataset was used to produce nine classification models from a set of state-of-the-art machine learning algorithms. Finally, the performance of the models was measured and compared. Area under the ROC curve (AUC) and agreement (kappa) between the content topics predicted by the models and those chosen by clinicians in each infobutton session. The performance of the models ranged from 0.49 to 0.56 (kappa). The AUC of the best model ranged from 0.73 to 0.99. The best performance was achieved when predicting choice of the adult dose, pediatric dose, patient education, and pregnancy category content topics. The results suggest that classification models based on infobutton usage data are a promising method for the prediction of content topics that a clinician would choose to answer patient care questions while using an EMR system.

  19. A Library Book Intelligence Classification System based on Multi-agent

    Science.gov (United States)

    Pengfei, Guo; Liangxian, Du; Junxia, Qi

    This paper introduces the concept of artificial intelligence into the administrative system of the library, and gives a model of a multi-agent robot system for book classification. The intelligent robot can recognize books' barcodes automatically; a classification algorithm is given according to the Chinese library classification scheme. The algorithm can calculate the exact position of each book and relate it to all similar books, so that the robot can shelve all books of a kind in one pass without turning back.

  20. A model for anomaly classification in intrusion detection systems

    Science.gov (United States)

    Ferreira, V. O.; Galhardi, V. V.; Gonçalves, L. B. L.; Silva, R. C.; Cansian, A. M.

    2015-09-01

    Intrusion Detection Systems (IDS) are traditionally divided into two types according to the detection methods they employ, namely (i) misuse detection and (ii) anomaly detection. Anomaly detection has been widely used and its main advantage is the ability to detect new attacks. However, the analysis of the anomalies generated can become expensive, since they often carry no clear information about the malicious events they represent. In this context, this paper presents a model for automated classification of alerts generated by an anomaly-based IDS. The main goal is either to classify the detected anomalies into well-defined attack taxonomies or to identify whether an anomaly is a false positive misclassified by the IDS. Some common attacks against computer networks were considered, and we achieved important results that can equip security analysts with better resources for their analyses.

  1. Using Discrete Loss Functions and Weighted Kappa for Classification: An Illustration Based on Bayesian Network Analysis

    Science.gov (United States)

    Zwick, Rebecca; Lenaburg, Lubella

    2009-01-01

    In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
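Weighted kappa for this kind of agreement between predicted and true categories can be computed directly from the confusion matrix. A small pure-Python sketch with the standard quadratic and linear weighting schemes (the example ratings are invented):

```python
def weighted_kappa(a, b, n_categories, weight="quadratic"):
    """Cohen's weighted kappa between two raters (or classifier vs. truth).

    Disagreement weights: w_ij = ((i - j)/(k - 1))**2 for the quadratic
    scheme, |i - j|/(k - 1) for the linear one.
    """
    k = n_categories
    obs = [[0.0] * k for _ in range(k)]
    for i, j in zip(a, b):
        obs[i][j] += 1
    n = len(a)
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            d = abs(i - j) / (k - 1)
            w = d * d if weight == "quadratic" else d
            num += w * obs[i][j]
            den += w * row[i] * col[j] / n   # expected counts under independence
    return 1.0 - num / den

print(weighted_kappa([0, 1, 2, 2], [0, 1, 2, 2], 3))  # perfect agreement -> 1.0
```

Because near-miss classifications incur small weights, weighted kappa rewards a classifier for being "close" on ordered categories, which is exactly the behavior a discrete loss function encodes.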

  2. Mathematical programming models for classification problems with applications to credit scoring

    OpenAIRE

    Falangis, Konstantinos

    2013-01-01

    Mathematical programming (MP) can be used for developing classification models for the two–group classification problem. An MP model can be used to generate a discriminant function that separates the observations in a training sample of known group membership into the specified groups optimally in terms of a group separation criterion. The simplest models for MP discriminant analysis are linear programming models in which the group separation measure is generally based on th...

  3. Topological modeling and classification of mammographic microcalcification clusters.

    Science.gov (United States)

    Chen, Zhili; Strange, Harry; Oliver, Arnau; Denton, Erika R E; Boggis, Caroline; Zwiggelaar, Reyer

    2015-04-01

    The presence of microcalcification clusters is a primary sign of breast cancer; however, it is difficult and time consuming for radiologists to classify microcalcifications as malignant or benign. In this paper, a novel method for the classification of microcalcification clusters in mammograms is proposed. The topology/connectivity of individual microcalcifications is analyzed within a cluster using multiscale morphology. This is distinct from existing approaches that tend to concentrate on the morphology of individual microcalcifications and/or global (statistical) cluster features. A set of microcalcification graphs are generated to represent the topological structure of microcalcification clusters at different scales. Subsequently, graph theoretical features are extracted, which constitute the topological feature space for modeling and classifying microcalcification clusters. k-nearest-neighbors-based classifiers are employed for classifying microcalcification clusters. The validity of the proposed method is evaluated using two well-known digitized datasets (MIAS and DDSM) and a full-field digital dataset. High classification accuracies (up to 96%) and good ROC results (area under the ROC curve up to 0.96) are achieved. A full comparison with related publications is provided, which includes a direct comparison. The results indicate that the proposed approach is able to outperform the current state-of-the-art methods. Significance: This study shows that topology modeling is an important tool for microcalcification analysis not only because of the improved classification accuracy but also because the topological measures can be linked to clinical understanding.
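The multiscale topology idea can be sketched by connecting points whose pairwise distance falls below each scale threshold and reading off simple graph statistics. The features and the point cluster below are illustrative; the paper's graph-theoretical feature set is richer:

```python
import math
from itertools import combinations

def multiscale_graph_features(points, scales):
    """Topological features of a point cluster at several scales.

    At each scale s, connect every pair of points closer than s and record
    (number of edges, number of connected components) -- a much simplified
    stand-in for graph-based microcalcification features.
    """
    def find(parent, x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    features = []
    for s in scales:
        parent = list(range(len(points)))
        edges = 0
        for i, j in combinations(range(len(points)), 2):
            if math.dist(points[i], points[j]) < s:
                edges += 1
                ri, rj = find(parent, i), find(parent, j)
                if ri != rj:
                    parent[ri] = rj
        components = len({find(parent, i) for i in range(len(points))})
        features.extend([edges, components])
    return features

# hypothetical cluster: two tight groups of microcalcifications
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
print(multiscale_graph_features(pts, scales=[1.5, 20]))  # [4, 2, 10, 1]
```

Feature vectors of this kind, one entry per scale, are what a k-nearest-neighbors classifier would then compare across malignant and benign clusters.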

  4. Knowledge base image classification using P-trees

    Science.gov (United States)

    Seetha, M.; Ravi, G.

    2010-02-01

    Image classification is the process of assigning classes to the pixels in remotely sensed images and is important for GIS applications, since a classified image is much easier to incorporate than the original unclassified image. To resolve misclassification in a traditional parametric classifier like the Maximum Likelihood Classifier, a neural network classifier is implemented using the back propagation algorithm. Extra spectral and spatial knowledge acquired from ancillary information is required to improve the accuracy and remove spectral confusion. To build the knowledge base automatically, this paper explores a non-parametric decision tree classifier to extract knowledge from the spatial data in the form of classification rules. A new method is proposed using a data structure called the Peano Count Tree (P-tree) for decision tree classification. The Peano Count Tree is a spatial data organization that provides a lossless compressed representation of a spatial data set and facilitates more efficient classification than other data mining techniques. The accuracy is assessed using overall accuracy, User's accuracy and Producer's accuracy for the image classification methods of Maximum Likelihood Classification, neural network classification using back propagation, Knowledge Base Classification, Post Classification and the P-tree classifier. The results reveal that the knowledge extracted from the decision tree classifier and the P-tree data structure in the proposed approach removes the problem of spectral confusion to a great extent. It is ascertained that the P-tree classifier surpasses the other classification techniques.
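A minimal Peano count tree over a binary bit matrix shows the quadrant-count structure: each node stores the 1-bit count of its quadrant, and pure quadrants stop the recursion, which is what makes the representation lossless yet compressed. The bit matrix is a made-up example:

```python
def ptree(bits):
    """Build a Peano count tree for a square binary matrix (side = power of 2).

    Each node stores the count of 1-bits in its quadrant; pure-1 and pure-0
    quadrants become leaves instead of recursing further.
    """
    n = len(bits)
    total = sum(sum(row) for row in bits)
    if total == 0 or total == n * n or n == 1:
        return {"count": total, "size": n}          # pure quadrant: leaf
    h = n // 2
    # quadrant order: top-left, top-right, bottom-left, bottom-right
    quads = [[row[c:c + h] for row in bits[r:r + h]]
             for r in (0, h) for c in (0, h)]
    return {"count": total, "size": n, "children": [ptree(q) for q in quads]}

bits = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
tree = ptree(bits)
print(tree["count"])  # 7 one-bits in total
```

Counts for predicates like "band value above threshold" can be read off any quadrant without touching the raw pixels, which is what speeds up decision tree induction over P-trees.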

  5. Integrating Globality and Locality for Robust Representation Based Classification

    Directory of Open Access Journals (Sweden)

    Zheng Zhang

    2014-01-01

    Full Text Available The representation based classification method (RBCM) has shown huge potential for face recognition since it first emerged. The linear regression classification (LRC) method and the collaborative representation classification (CRC) method are two well-known RBCMs. LRC and CRC exploit the training samples of each class and all the training samples, respectively, to represent the testing sample, and subsequently conduct classification on the basis of the representation residual. The LRC method can be viewed as a “locality representation” method because it uses only the training samples of each class to represent the testing sample, so it cannot embody the effectiveness of “globality representation.” Conversely, the CRC method does not enjoy the locality benefit of the general RBCM. Thus we propose to integrate CRC and LRC to perform more robust representation based classification. The experimental results on benchmark face databases substantially demonstrate that the proposed method achieves high classification accuracy.

  6. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Full Text Available Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially ventilated at constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume) consisting of a series of time bins, each with spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes, dictated by the test response) in a given model and determining which of the three probabilities was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
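The JPBM as described reduces to a small computation: estimate per-bin spike probabilities from training responses, then score a test train by its joint log-probability under each model. The sketch below follows that description with invented spike trains; details such as the discretization and alignment strategies are omitted:

```python
import math

def fit_model(trains, n_bins):
    """Per-bin spike probability model estimated from one class's trains."""
    return [sum(tr[b] for tr in trains) / len(trains) for b in range(n_bins)]

def joint_log_prob(model, train, eps=1e-3):
    """Log joint probability of a binary spike train under a bin-wise model."""
    lp = 0.0
    for p, spike in zip(model, train):
        p = min(max(p, eps), 1 - eps)   # clip so the log stays finite
        lp += math.log(p if spike else 1 - p)
    return lp

def classify_spike_train(models, train):
    """Assign the train to the model giving the highest joint probability."""
    return max(models, key=lambda label: joint_log_prob(models[label], train))

# invented discretized responses (1 = spike in bin) for two stimulus volumes
low  = [[1, 0, 0, 1, 0], [1, 0, 0, 1, 1], [1, 0, 0, 0, 1]]
high = [[1, 1, 1, 1, 0], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]
models = {"low": fit_model(low, 5), "high": fit_model(high, 5)}
print(classify_spike_train(models, [1, 0, 0, 1, 1]))  # -> low
```

Working in log space keeps the product of per-bin probabilities numerically stable as the number of bins grows.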

  7. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    Science.gov (United States)

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
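The AUC used for validation above has a simple rank-based reading: it is the probability that a randomly chosen positive site outscores a randomly chosen negative one (the Mann-Whitney U statistic, rescaled). A minimal sketch with invented validation scores:

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparisons: the probability
    that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# hypothetical model scores at validation spring / non-spring locations
springs     = [0.9, 0.8, 0.75, 0.6]
non_springs = [0.7, 0.4, 0.3, 0.2]
print(auc(springs, non_springs))  # 15/16 = 0.9375
```

Numbers such as the reported 0.8103 for BRT mean that in roughly 81% of spring/non-spring pairs the model ranks the spring location higher.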

  8. Inter Genre Similarity Modelling For Automatic Music Genre Classification

    OpenAIRE

    Bagci, Ulas; Erzin, Engin

    2009-01-01

    Music genre classification is an essential tool for music information retrieval systems and it has been finding critical applications in various media platforms. Two important problems of the automatic music genre classification are feature extraction and classifier design. This paper investigates inter-genre similarity modelling (IGS) to improve the performance of automatic music genre classification. Inter-genre similarity information is extracted over the mis-classified feature population....

  9. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

    KAUST Repository

    Najibi, Seyed Morteza

    2017-02-08

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric splines, which are more efficient than existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of the adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective on two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.

  10. Probabilistic grammatical model for helix‐helix contact site classification

    Science.gov (United States)

    2013-01-01

    Background Hidden Markov Models power many state‐of‐the‐art tools in the field of protein bioinformatics. While excelling in their tasks, these methods of protein analysis do not directly convey information on medium‐ and long‐range residue‐residue interactions. This requires an expressive power of at least context‐free grammars. However, application of more powerful grammar formalisms to protein analysis has been surprisingly limited. Results In this work, we present a probabilistic grammatical framework for problem‐specific protein languages and apply it to classification of transmembrane helix‐helix pair configurations. The core of the model consists of a probabilistic context‐free grammar, automatically inferred by a genetic algorithm from only a generic set of expert‐based rules and positive training samples. The model was applied to produce sequence‐based descriptors of four classes of transmembrane helix‐helix contact site configurations. The best classifier reached an AUC ROC of 0.70. The analysis of grammar parse trees revealed their ability to represent structural features of helix‐helix contact sites. Conclusions We demonstrated that our probabilistic context‐free framework for analysis of protein sequences outperforms the state of the art in the task of helix‐helix contact site classification, without necessarily modeling long‐range dependencies between interacting residues. A significant feature of our approach is that grammar rules and parse trees are human‐readable, so they can provide biologically meaningful information for molecular biologists. PMID:24350601

  11. Scanpath modeling and classification with hidden Markov models.

    Science.gov (United States)

    Coutrot, Antoine; Hsiao, Janet H; Chan, Antoni B

    2018-02-01

    How people look at visual information reveals fundamental information about them: their interests and their states of mind. Previous studies have shown that the scanpath, i.e., the sequence of eye movements made by an observer exploring a visual stimulus, can be used to infer observer-related (e.g., task at hand) and stimuli-related (e.g., image semantic category) information. However, eye movements are complex signals and many of these studies rely on limited gaze descriptors and bespoke datasets. Here, we provide a turnkey method for scanpath modeling and classification. This method relies on variational hidden Markov models (HMMs) and discriminant analysis (DA). HMMs encapsulate the dynamic and individualistic dimensions of gaze behavior, allowing DA to capture systematic patterns diagnostic of a given class of observers and/or stimuli. We test our approach on two very different datasets. First, we use fixations recorded while viewing 800 static natural scene images, and infer an observer-related characteristic: the task at hand. We achieve an average correct classification rate of 55.9% (chance = 33%). We show that correct classification rates positively correlate with the number of salient regions present in the stimuli. Second, we use eye positions recorded while viewing 15 conversational videos, and infer a stimulus-related characteristic: the presence or absence of the original soundtrack. We achieve an average correct classification rate of 81.2% (chance = 50%). HMMs make it possible to integrate bottom-up, top-down, and oculomotor influences into a single model of gaze behavior. This synergistic approach between behavior and machine learning will open new avenues for simple quantification of gaze behavior. We release SMAC with HMM, a Matlab toolbox freely available to the community under an open-source license agreement.
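
    As an illustration of the likelihood-based side of this approach, the sketch below is a hypothetical, numpy-only toy: scanpaths are reduced to sequences of discrete regions of interest, one discrete HMM per observer class is specified by hand (the study learns variational HMMs from gaze coordinates; SMAC with HMM is the authors' actual toolbox), and each sequence is assigned to the class whose HMM gives it the higher forward-algorithm likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a)))."""
    m = np.max(a, axis=axis, keepdims=True)
    s = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(s) if axis is not None else s.item()

def sample_hmm(pi, A, B, T):
    """Draw one observation sequence of length T from a discrete HMM."""
    obs, s = [], rng.choice(len(pi), p=pi)
    for _ in range(T):
        obs.append(rng.choice(B.shape[1], p=B[s]))
        s = rng.choice(len(pi), p=A[s])
    return np.array(obs)

def loglik(obs, pi, A, B):
    """log P(obs | HMM) via the forward algorithm in log space."""
    lpi, lA, lB = np.log(pi), np.log(A), np.log(B)
    alpha = lpi + lB[:, obs[0]]
    for o in obs[1:]:
        alpha = logsumexp(alpha[:, None] + lA, axis=0) + lB[:, o]
    return logsumexp(alpha)

pi = np.array([0.5, 0.5])
B = np.array([[0.8, 0.1, 0.1],     # hidden state 0 mostly fixates ROI 0
              [0.1, 0.1, 0.8]])    # hidden state 1 mostly fixates ROI 2
A_dwell = np.array([[0.9, 0.1], [0.1, 0.9]])   # "focused" observers dwell
A_switch = np.array([[0.2, 0.8], [0.8, 0.2]])  # "scanning" observers alternate

models = [(pi, A_dwell, B), (pi, A_switch, B)]
data = [(sample_hmm(*models[c], T=100), c) for c in (0, 1) for _ in range(20)]

# Classify each scanpath by the model with the higher likelihood.
pred = [int(np.argmax([loglik(o, *m) for m in models])) for o, _ in data]
acc = np.mean([p == y for p, (_, y) in zip(pred, data)])
print(f"accuracy: {acc:.2f}")
```

Because both classes share the same emission matrix, the separation here comes entirely from the transition dynamics, which is the dimension of gaze behavior HMMs are meant to capture.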

  12. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Existing material classification schemes aim to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers material quality characteristics, quality cost, and quality influence. The Analytic Hierarchy Process supports feature selection and classification decisions. We use an improved Kraljic Portfolio Matrix to establish a three-dimensional classification model. The aircraft materials can be divided into eight types, including general, key, risk, and leveraged types. To improve the classification accuracy for the various materials, the Support Vector Machine (SVM) algorithm is introduced. Finally, we compare SVM and a BP neural network in this application. The results show that the SVM algorithm is more efficient and accurate, and that quality-oriented material classification is valuable.

  13. Music Genre Classification using the multivariate AR feature integration model

    DEFF Research Database (Denmark)

    Ahrendt, Peter; Meng, Anders

    2005-01-01

    Music genre classification systems are normally built as a feature extraction module followed by a classifier. The features are often short-time features with time frames of 10-30 ms, although several characteristics of music require larger time scales. Thus, larger time frames are needed to take...... informative decisions about musical genre. For the MIREX music genre contest, several authors derive long-time features based either on statistical moments and/or temporal structure in the short-time features. In our contribution we model a segment (1.2 s) of short-time features (texture) using a multivariate...

  14. Simulation and classification of power quality events based on ...

    African Journals Online (AJOL)

    In this paper, a mathematical modeling method is used to simulate power quality events. Given the excellent performance of neural networks in classification and pattern recognition tasks, a multilayer neural network is used for event classification. The STFT and Discrete Wavelet Transform are used for ...

  15. Correlation-based linear discriminant classification for gene expression data.

    Science.gov (United States)

    Pan, M; Zhang, J

    2017-01-23

    Microarray gene expression technology provides a systematic approach to patient classification. However, microarray data pose a great computational challenge owing to their large dimensionality, small sample sizes, and potential correlations among genes. A recent study has shown that gene-gene correlations have a positive effect on the accuracy of classification models, in contrast to some previous results. In this study, a recently developed correlation-based classifier, the ensemble of random subspace (RS) Fisher linear discriminants (FLDs), was utilized. The impact of gene-gene correlations on the performance of this classifier and other classifiers was studied using simulated datasets and real datasets. A cross-validation framework was used to evaluate the performance of each classifier using the simulated datasets or real datasets, and misclassification rates (MRs) were computed. Using the simulated data, the average MRs of the correlation-based classifiers decreased as the correlations increased when there were more correlated genes. Using real data, the correlation-based classifiers outperformed the non-correlation-based classifiers, especially when the gene-gene correlations were high. The ensemble RS-FLD classifier is a potential state-of-the-art computational method. The correlation-based ensemble RS-FLD classifier was effective and benefited from gene-gene correlations, particularly when the correlations were high.
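
    A rough stand-in for the ensemble of random subspace Fisher linear discriminants can be assembled from off-the-shelf scikit-learn parts; this is not the authors' implementation, and the synthetic data below merely mimics correlated "genes" via redundant features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic "expression" data: 10 informative genes plus 20 correlated
# (redundant) ones, far fewer samples than a real study would want.
X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                           n_redundant=20, random_state=0)

# Random subspace ensemble: each FLD sees a random half of the genes;
# samples are not bootstrapped (subspace method, not bagging proper).
rs_fld = BaggingClassifier(LinearDiscriminantAnalysis(), n_estimators=30,
                           max_features=0.5, bootstrap=False, random_state=0)

scores = cross_val_score(rs_fld, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

Restricting each base discriminant to a feature subspace is what lets the ensemble exploit gene-gene correlations while keeping each FLD's covariance estimate well-conditioned.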

  16. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and yet unexplored. We have successfully solved cricket video classification problem using a six level hierarchical structure. The first level performs event detection based on audio energy and Zero Crossing Rate (ZCR) of short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
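
    The level-1 gate described above (short-time energy plus zero-crossing rate) can be sketched as follows; the signal, frame sizes, and threshold are all invented for illustration and are not the authors' settings.

```python
import numpy as np

def short_time_features(x, frame_len=256, hop=128):
    """Per-frame short-time energy and zero-crossing rate of a 1-D signal."""
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.stack([x[i:i + frame_len] for i in starts])
    energy = np.sum(frames ** 2, axis=1)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return energy, zcr

rng = np.random.default_rng(1)
# Quiet crowd murmur followed by a loud burst (a hypothetical "event").
signal = np.concatenate([0.01 * rng.standard_normal(4096),
                         0.5 * rng.standard_normal(4096)])
energy, zcr = short_time_features(signal)
is_event = energy > energy.mean()   # simple level-1 gate on audio energy
print(f"{is_event.sum()} of {len(is_event)} frames flagged as event")
```

Frames passing this gate would then be handed to the video-based HMM-DP levels of the hierarchy.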

  17. Point Based Emotion Classification Using SVM

    OpenAIRE

    Swinkels, Wout

    2016-01-01

    The detection of emotions is a hot topic in the area of computer vision. Emotions are based on subtle changes in the face that are intuitively detected and interpreted by humans. Detecting these subtle changes, based on mathematical models, is a great challenge in the area of computer vision. In this thesis a new method is proposed to achieve state-of-the-art emotion detection performance. This method is based on facial feature points to monitor subtle changes in the face. Therefore the c...

  18. A novel neural network based image reconstruction model with scale and rotation invariance for target identification and classification for Active millimetre wave imaging

    Science.gov (United States)

    Agarwal, Smriti; Bisht, Amit Singh; Singh, Dharmendra; Pathak, Nagendra Prasad

    2014-12-01

    Millimetre wave (MMW) imaging is gaining tremendous interest among researchers, with potential applications in security checks, standoff personal screening, automotive collision avoidance, and more. Current state-of-the-art imaging techniques, viz. microwave and X-ray imaging, suffer from lower resolution and harmful ionizing radiation, respectively. In contrast, MMW imaging operates at lower power and is non-ionizing, hence medically safe. Despite these favourable attributes, MMW imaging faces various challenges: it is still a relatively unexplored area and lacks a suitable imaging methodology for extracting complete target information. In view of these challenges, an MMW active imaging radar system at 60 GHz was designed for standoff imaging applications. A C-scan (horizontal and vertical scanning) methodology was developed that provides a cross-range resolution of 8.59 mm. The paper further details a suitable target identification and classification methodology. For identification of regular-shaped targets, a mean-standard deviation based segmentation technique was formulated and validated using a different target shape. For classification, a probability density function based target material discrimination methodology was proposed and validated on a different dataset. Lastly, a novel artificial neural network based, scale- and rotation-invariant image reconstruction methodology is proposed to counter distortions in the image caused by noise, rotation, or scale variations. The designed neural network, once trained with sample images, automatically takes care of these deformations and successfully reconstructs the corrected image for the test targets. The techniques developed in this paper are tested and validated using four regular shapes, viz. rectangle, square, triangle and circle.

  19. A Curvature Based Adaptive Neighborhood for Individual Point Cloud Classification

    Science.gov (United States)

    He, E.; Chen, Q.; Wang, H.; Liu, X.

    2017-09-01

    As a key step in 3D scene analysis, point cloud classification has gained a great deal of attention in the past few years. Due to the uneven density, noise, and missing data in point clouds, automatically classifying a point cloud with high precision is a very challenging task. The point cloud classification process typically includes the extraction of neighborhood-based statistical information followed by machine learning algorithms. However, the robustness of the neighborhood is limited by the density and curvature of the point cloud, which leads to label noise in the classification results. In this paper, we propose a curvature based adaptive neighborhood for individual point cloud classification. Our main improvement is the curvature based adaptive neighborhood method, which derives a better local neighborhood for each 3D point and enhances the separability of features. Experimental results on the Oakland benchmark dataset show that the proposed method can effectively improve the classification accuracy of point clouds.
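
    A common way to obtain the per-point curvature that could drive such an adaptive neighborhood is the surface-variation measure from the eigenvalues of the local covariance matrix; the sketch below is an assumption-laden toy (brute-force neighbor search, invented point sets), not the authors' pipeline.

```python
import numpy as np

def surface_variation(points, k=12):
    """Curvature proxy per point: lambda_min / (l0 + l1 + l2), computed
    from the covariance of each point's k nearest neighbours (flat -> 0)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    knn = np.argsort(dists, axis=1)[:, :k]
    curv = np.empty(len(points))
    for i, idx in enumerate(knn):
        lam = np.linalg.eigvalsh(np.cov(points[idx].T))  # ascending order
        curv[i] = max(lam[0], 0.0) / lam.sum()
    return curv

rng = np.random.default_rng(0)
plane = np.c_[rng.uniform(0, 1, (200, 2)), np.zeros(200)]  # flat patch
blob = rng.uniform(0, 1, (200, 3))                         # volumetric clutter
flat_c, blob_c = surface_variation(plane), surface_variation(blob)
print(f"plane curvature ~ {flat_c.mean():.3f}, blob ~ {blob_c.mean():.3f}")
```

An adaptive rule in this spirit would shrink the neighborhood where the curvature value is high and grow it over flat regions before extracting the classification features.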

  20. Calibration of a Plastic Classification System with the Ccw Model

    International Nuclear Information System (INIS)

    Barcala Riveira, J. M.; Fernandez Marron, J. L.; Alberdi Primicia, J.; Navarrete Marin, J. J.; Oller Gonzalez, J. C.

    2003-01-01

    This document describes the calibration of a plastic Classification system with the Ccw model (Classification by Quantum's built with Wavelet Coefficients). The method is applied to spectra of plastics usually present in domestic wastes. Obtained results are showed. (Author) 16 refs

  1. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Landform classification is a necessary task for various fields of landscape and regional planning, for example landscape evaluation, erosion studies, and hazard prediction. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). Based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classifier is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. The GLCM is then computed to build the knowledge base for classification. The classification result was checked against the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.
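
    The random-forest importance-ranking step can be sketched with scikit-learn; the "terrain factors" below are synthetic stand-ins, not factors actually derived from the 1 km DEM.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in for per-object terrain factors (slope, relief, roughness, ...)
# across several landform classes; the real study extracts these from a DEM.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranking = np.argsort(rf.feature_importances_)[::-1]
print("factors ranked by importance:", ranking)
```

In the proposed workflow the top-ranked factors would then serve as the thresholds for multi-scale segmentation before the GLCM knowledge base is built.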

  2. A proposed data base system for detection, classification and ...

    African Journals Online (AJOL)

    A proposed data base system for detection, classification and location of fault on electricity company of Ghana electrical distribution system. Isaac Owusu-Nyarko, Mensah-Ananoo Eugine. No abstract. Keywords: database, classification of fault, power, distribution system, SCADA, ECG.

  3. Comparison Between Revised Atlanta Classification and Determinant-Based Classification for Acute Pancreatitis in Intensive Care Medicine. Why Do Not Use a Modified Determinant-Based Classification?

    Science.gov (United States)

    Zubia-Olaskoaga, Felix; Maraví-Poma, Enrique; Urreta-Barallobre, Iratxe; Ramírez-Puerta, María-Rosario; Mourelo-Fariña, Mónica; Marcos-Neira, María-Pilar

    2016-05-01

    To compare the classification performance of the Revised Atlanta Classification, the Determinant-Based Classification, and a new modified Determinant-Based Classification according to observed mortality and morbidity. A prospective multicenter observational study conducted over a 1-year period. Forty-six international ICUs (Epidemiology of Acute Pancreatitis in Intensive Care Medicine study). Patients admitted to an ICU with acute pancreatitis and at least one organ failure. The modified Determinant-Based Classification included four categories: in group 1, patients with transient organ failure and without local complications; in group 2, patients with transient organ failure and local complications; in group 3, patients with persistent organ failure and without local complications; and in group 4, patients with persistent organ failure and local complications. A total of 374 patients were included (mortality rate of 28.9%). When the modified Determinant-Based Classification was applied, patients in group 1 presented low mortality (2.26%) and morbidity (5.38%), patients in group 2 presented low mortality (6.67%) and high morbidity (60.71%), patients in group 3 presented high mortality (41.46%) and low morbidity (8.33%), and patients in group 4 presented high mortality (59.09%) and high morbidity (88.89%). The area under the receiver operating characteristic curve of the modified Determinant-Based Classification for mortality was 0.81 (95% CI, 0.77-0.85), with significant differences in comparison to both the Revised Atlanta Classification (0.77; 95% CI, 0.73-0.81) and the Determinant-Based Classification (0.77; 95% CI, 0.72-0.81). For morbidity, the area under the curve of the modified Determinant-Based Classification was 0.80 (95% CI, 0.73-0.86), with significant differences in comparison to the Revised Atlanta Classification (0.63, 95% CI, 0.57-0.70) but not the Determinant-Based Classification (0.81, 95% CI, 0.74-0.88; nonsignificant).
Modified Determinant-Based Classification identified four groups with different clinical presentation in patients with acute pancreatitis in

  4. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key-indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting with the Fisher discriminant. Then, a customer rating model, in which the number of customers across ratings follows a normal distribution, is constructed. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify customers as default or non-default. Empirical results using data from 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach helps banks and bond investors locate qualified customers.
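
    A two-stage sketch in the same spirit, with an L1-penalised logistic regression standing in for the paper's significance-based indicator selection (an assumption, not their exact procedure) and a Fisher discriminant scoring customers on the kept indicators:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Imbalanced toy "customer" data: roughly 10% defaults.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           weights=[0.9], random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

# Stage 1: sparse logistic regression keeps only discriminant indicators.
sel = LogisticRegression(penalty="l1", solver="liblinear", C=1.0,
                         class_weight="balanced").fit(Xtr, ytr)
keep = np.abs(sel.coef_[0]) > 1e-6

# Stage 2: a Fisher discriminant (equal priors, to respect the imbalance)
# scores customers on the kept indicators.
fld = LinearDiscriminantAnalysis(priors=[0.5, 0.5]).fit(Xtr[:, keep], ytr)
bacc = balanced_accuracy_score(yte, fld.predict(Xte[:, keep]))
print(f"kept {keep.sum()} indicators, balanced accuracy {bacc:.2f}")
```

Balanced accuracy is reported because plain accuracy is misleading when 90% of customers are non-default.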

  5. ISBDD Model for Classification of Hyperspectral Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    Na Li

    2018-03-01

    The diverse density (DD) algorithm was proposed to handle the problem of low classification accuracy when training samples contain interference such as mixed pixels. The DD algorithm can learn a feature vector from training bags, which comprise instances (pixels). However, the feature vector learned by the DD algorithm cannot always effectively represent one type of ground cover. To handle this problem, an instance space-based diverse density (ISBDD) model that employs a novel training strategy is proposed in this paper. In the ISBDD model, DD values of each pixel are computed instead of learning a feature vector, and as a result, the pixel can be classified according to its DD values. Airborne hyperspectral data collected by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor and the Push-broom Hyperspectral Imager (PHI) are applied to evaluate the performance of the proposed model. Results show that the overall classification accuracy of the ISBDD model on the AVIRIS and PHI images is up to 97.65% and 89.02%, respectively, while the kappa coefficient is up to 0.97 and 0.88, respectively.

  6. Deep neural network and noise classification-based speech enhancement

    Science.gov (United States)

    Shi, Wenhua; Zhang, Xiongwei; Zou, Xia; Han, Wei

    2017-07-01

    In this paper, a speech enhancement method using noise classification and Deep Neural Network (DNN) was proposed. Gaussian mixture model (GMM) was employed to determine the noise type in speech-absent frames. DNN was used to model the relationship between noisy observation and clean speech. Once the noise type was determined, the corresponding DNN model was applied to enhance the noisy speech. GMM was trained with mel-frequency cepstrum coefficients (MFCC) and the parameters were estimated with an iterative expectation-maximization (EM) algorithm. Noise type was updated by spectrum entropy-based voice activity detection (VAD). Experimental results demonstrate that the proposed method could achieve better objective speech quality and smaller distortion under stationary and non-stationary conditions.
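
    The noise-classification front-end can be sketched with one GMM per noise type; the 13-dimensional "MFCC" features and noise names below are invented for illustration, and the DNN enhancement stage is omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy 13-dim "MFCC" features for two hypothetical noise types.
train = {"white": rng.normal(0.0, 1.0, (500, 13)),
         "babble": rng.normal(2.0, 1.0, (500, 13))}

# One GMM per noise type, fitted by EM, as in the classification front-end.
gmms = {name: GaussianMixture(n_components=4, random_state=0).fit(feats)
        for name, feats in train.items()}

def classify_noise(frames):
    """Pick the noise type whose GMM gives the highest mean log-likelihood;
    the matching DNN enhancement model would then be selected."""
    return max(gmms, key=lambda name: gmms[name].score(frames))

detected = classify_noise(rng.normal(2.0, 1.0, (50, 13)))
print(f"detected noise type: {detected}")
```

In the full system this decision would only be taken on speech-absent frames flagged by the entropy-based VAD.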

  7. Automated Glioblastoma Segmentation Based on a Multiparametric Structured Unsupervised Classification

    Science.gov (United States)

    Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V.; Robles, Montserrat; Aparici, F.; Martí-Bonmatí, L.; García-Gómez, Juan M.

    2015-01-01

    Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. We therefore propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. As non-structured algorithms, we evaluated K-means, Fuzzy K-means and the Gaussian Mixture Model (GMM), whereas as structured classification algorithms we evaluated the Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves on the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation. PMID:25978453

  9. Validation of the Consensus-Definition for Cancer Cachexia and evaluation of a classification model--a study based on data from an international multicentre project (EPCRC-CSA).

    Science.gov (United States)

    Blum, D; Stene, G B; Solheim, T S; Fayers, P; Hjermstad, M J; Baracos, V E; Fearon, K; Strasser, F; Kaasa, S

    2014-08-01

    Weight loss limits cancer therapy, quality of life and survival. Common diagnostic criteria and a framework for a classification system for cancer cachexia were recently agreed upon by international consensus. Specific assessment domains (stores, intake, catabolism and function) were proposed. The aim of this study is to validate these diagnostic criteria (two groups: model 1) and examine a four-group classification system (model 2) with respect to these domains as well as survival. Data from an international sample of patients with advanced cancer (N = 1070) were analysed. In model 1, the diagnostic criteria for cancer cachexia [weight loss/body mass index (BMI)] were used. Model 2 classified patients into four groups, 0-III, according to weight loss/BMI as a framework for cachexia stages. The cachexia domains, survival and sociodemographic/medical variables were compared across models. Eight hundred and sixty-one patients were included. Model 1 consisted of 399 cachectic and 462 non-cachectic patients. Cachectic patients had significantly higher levels of inflammation, lower nutritional intake and performance status, and shorter survival. In model 2, differences were not consistent: appetite loss did not differ between groups III and IV, and performance status did not differ between groups 0 and I. Survival was shorter in groups II and III compared with the other groups. By adding other cachexia domains to the model, survival differences were demonstrated. The diagnostic criteria based on weight loss and BMI distinguish between cachectic and non-cachectic patients concerning all domains (intake, catabolism and function) and are associated with survival. In order to guide cachexia treatment, a four-group classification model needs additional domains to discriminate between cachexia stages. © The Author 2014. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  10. Chemometric classification of casework arson samples based on gasoline content.

    Science.gov (United States)

    Sinkov, Nikolai A; Sandercock, P Mark L; Harynuk, James J

    2014-02-01

    Detection and identification of ignitable liquids (ILs) in arson debris is a critical part of arson investigations. The challenge of this task is due to the complex and unpredictable chemical nature of arson debris, which also contains pyrolysis products from the fire. ILs, most commonly gasoline, are complex chemical mixtures containing hundreds of compounds that will be consumed or otherwise weathered by the fire to varying extents depending on factors such as temperature, air flow, the surface on which IL was placed, etc. While methods such as ASTM E-1618 are effective, data interpretation can be a costly bottleneck in the analytical process for some laboratories. In this study, we address this issue through the application of chemometric tools. Prior to the application of chemometric tools such as PLS-DA and SIMCA, issues of chromatographic alignment and variable selection need to be addressed. Here we use an alignment strategy based on a ladder consisting of perdeuterated n-alkanes. Variable selection and model optimization was automated using a hybrid backward elimination (BE) and forward selection (FS) approach guided by the cluster resolution (CR) metric. In this work, we demonstrate the automated construction, optimization, and application of chemometric tools to casework arson data. The resulting PLS-DA and SIMCA classification models, trained with 165 training set samples, have provided classification of 55 validation set samples based on gasoline content with 100% specificity and sensitivity. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  11. Pixel classification based color image segmentation using quaternion exponent moments.

    Science.gov (United States)

    Wang, Xiang-Yang; Wu, Zhi-Fang; Chen, Liang; Zheng, Hong-Liang; Yang, Hong-Ying

    2016-02-01

    Image segmentation remains an important, but hard-to-solve, problem since it appears to be application dependent, with usually no a priori information available regarding the image structure. In recent years, many image segmentation algorithms have been developed, but they are often very complex and some undesired results occur frequently. In this paper, we propose a pixel classification based color image segmentation using quaternion exponent moments. Firstly, the pixel-level image feature is extracted based on quaternion exponent moments (QEMs), which can effectively capture the image pixel content by considering the correlation between different color channels. Then, the pixel-level image feature is used as input to a twin support vector machines (TSVM) classifier, and the TSVM model is trained by selecting the training samples with Arimoto entropy thresholding. Finally, the color image is segmented with the trained TSVM model. The proposed scheme has the following advantages: (1) the effective QEMs are introduced to describe color image pixel content, considering the correlation between different color channels; and (2) the excellent TSVM classifier is utilized, which has lower computation time and higher classification accuracy. Experimental results show that our proposed method has very promising segmentation performance compared with the state-of-the-art segmentation approaches recently proposed in the literature. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Models for warehouse management: Classification and examples

    NARCIS (Netherlands)

    van den Berg, J.P.; van den Berg, J.P.; Zijm, Willem H.M.

    1999-01-01

    In this paper we discuss warehousing systems and present a classification of warehouse management problems. We start with a typology and a brief description of several types of warehousing systems. Next, we present a hierarchy of decision problems encountered in setting up warehousing systems,

  13. A new incomplete pattern classification method based on evidential reasoning.

    Science.gov (United States)

    Liu, Zhun-Ga; Pan, Quan; Mercier, Gregoire; Dezert, Jean

    2015-04-01

    The classification of incomplete patterns is a very challenging task because an object (incomplete pattern) with different possible estimations of the missing values may yield distinct classification results. The uncertainty (ambiguity) of the classification is mainly caused by the lack of information about the missing data. A new prototype-based credal classification (PCC) method is proposed to deal with incomplete patterns within the belief function framework classically used in the evidential reasoning approach. The class prototypes obtained from the training samples are used, one by one, to estimate the missing values. Typically, in a c-class problem, one has to deal with c prototypes, which yield c estimations of the missing values. The edited patterns based on each possible estimation are then classified by a standard classifier, so at most c distinct classification results can be obtained for an incomplete pattern. Because all these distinct classification results are potentially admissible, we propose to combine them to obtain the final classification of the incomplete pattern. A new credal combination method is introduced for this purpose; it is able to characterize the inherent uncertainty due to the possibly conflicting results delivered by the different estimations of the missing values. Incomplete patterns that are very difficult to classify into a specific class are reasonably and automatically committed by the PCC method to proper meta-classes in order to reduce errors. The effectiveness of the PCC method has been tested through four experiments with artificial and real data sets.
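The prototype-imputation step can be illustrated in a few lines. The data, the nearest-prototype classifier, and the per-prototype "votes" are all illustrative stand-ins: the actual PCC fuses the c results with belief functions and meta-classes rather than simple voting.

```python
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.normal([0.0, 0.0], 0.3, (50, 2))   # class 0 training samples
X1 = rng.normal([3.0, 3.0], 0.3, (50, 2))   # class 1 training samples
prototypes = np.stack([X0.mean(0), X1.mean(0)])  # one prototype per class

def nearest_class(x):
    return int(np.linalg.norm(prototypes - x, axis=1).argmin())

def votes_for(observed_x):
    """Incomplete pattern: 1st feature observed, 2nd missing.
    Impute the missing feature with each class prototype and classify
    each edited pattern, giving one result per prototype."""
    return [nearest_class(np.array([observed_x, p[1]])) for p in prototypes]

print(votes_for(4.0))  # both edited patterns agree: clearly class 1
print(votes_for(2.0))  # edited patterns disagree: PCC would use a meta-class
```

The second call shows exactly the situation the abstract describes: different estimations of the missing value lead to conflicting classifications, which is the uncertainty PCC captures credally instead of forcing a single class.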

  14. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  15. Customer classification in banking system of Iran based on the credit risk model using multi-criteria decision-making models

    Directory of Open Access Journals (Sweden)

    Khalil Khalili

    2015-11-01

    Full Text Available One of the most important factors in the survival of financial institutes and banks in today's competitive markets is to create a balance between resources and consumption and to keep the circulation of money healthy. According to the experience of recent financial crises around the world, the lack of appropriate management of bank demands can be considered one of the main causes of such crises. The objective of the present study is to identify and classify customers according to credit risk using predictive decision models. The research is a survey employing field study for data collection. The theoretical framework was assembled through library research, and the data were collected in two ways: a questionnaire and real customers' financial records. The questionnaire data were analyzed with the analytical hierarchy process, and the customers' financial data with the TOPSIS method. The population of the study comprised the files of real customers in one branch of Refah Kargaran Bank in the city of Tabriz, Iran. Of 800 files, 140 were complete, and using Morgan's table, 103 files were investigated. The final model was presented; with 95% probability, when a new customer's data are entered, the model is capable of accurately identifying the customer's degree of risk.

  16. Learning classification models with soft-label information.

    Science.gov (United States)

    Nguyen, Quang; Valizadegan, Hamed; Hauskrecht, Milos

    2014-01-01

    Learning of classification models in medicine often relies on data labeled by a human expert. Since labeling of clinical data may be time-consuming, finding ways of alleviating the labeling costs is critical for our ability to automatically learn such models. In this paper we propose a new machine learning approach that is able to learn improved binary classification models more efficiently by refining the binary class information in the training phase with soft labels that reflect how strongly the human expert feels about the original class labels. Two types of methods that can learn improved binary classification models from soft labels are proposed. The first relies on probabilistic/numeric labels, the other on ordinal categorical labels. We study and demonstrate the benefits of these methods for learning an alerting model for heparin-induced thrombocytopenia. The experiments are conducted on the data of 377 patient instances labeled by three different human experts. The methods are compared using the area under the receiver operating characteristic curve (AUC) score. Our AUC results show that the new approach is capable of learning classification models more efficiently compared to traditional learning methods. The improvement in AUC is most remarkable when the number of examples we learn from is small. A new classification learning framework that lets us learn from auxiliary soft-label information provided by a human expert is a promising new direction for learning classification models from expert labels, reducing the time and cost needed to label data.

  17. Classification images and bubbles images in the generalized linear model.

    Science.gov (United States)

    Murray, Richard F

    2012-07-09

    Classification images and bubbles images are psychophysical tools that use stimulus noise to investigate what features people use to make perceptual decisions. Previous work has shown that classification images can be estimated using the generalized linear model (GLM), and here I show that this is true for bubbles images as well. Expressing the two approaches in terms of a single statistical model clarifies their relationship to one another, makes it possible to measure classification images and bubbles images simultaneously, and allows improvements developed for one method to be used with the other.
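The GLM view of a classification image can be reproduced in a small simulation: a model observer responds according to a hidden template plus internal noise, and logistic regression of the responses on the stimulus noise fields recovers that template. The observer model, the 16-pixel scale, and the plain gradient-ascent fit are illustrative choices, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, d = 4000, 16                    # a 16-pixel "image" for speed
template = np.zeros(d)
template[5:9] = 1.0                       # the observer's hidden template

noise = rng.normal(0, 1, (n_trials, d))   # stimulus noise fields
resp = (noise @ template + rng.normal(0, 1, n_trials) > 0).astype(float)

# GLM estimate: logistic regression of responses on the noise fields,
# fitted here by plain gradient ascent (no external libraries).
w = np.zeros(d)
for _ in range(200):
    p = 1 / (1 + np.exp(-noise @ w))
    w += 0.01 * noise.T @ (resp - p) / n_trials

# The recovered weights should be large only at the template pixels.
corr = np.corrcoef(w, template)[0, 1]
print(round(corr, 2))
```

The weight vector `w` is the classification image: it shows which stimulus locations drove the observer's decisions, and its high correlation with the true template is what the GLM formulation guarantees in this linear-observer setting.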

  18. A NEW WASTE CLASSIFYING MODEL: HOW WASTE CLASSIFICATION CAN BECOME MORE OBJECTIVE?

    Directory of Open Access Journals (Sweden)

    Burcea Stefan Gabriel

    2015-07-01

    Full Text Available The waste management specialist must be able to identify and analyze waste generation sources and to propose proper solutions to prevent waste generation and encourage waste minimisation. In certain situations, such as implementing an integrated waste management system and configuring waste collection methods and capacities, practitioners face the challenge of classifying the generated waste. This tends to be all the more demanding as the literature does not provide a coherent system of criteria for an objective waste classification process. Waste incineration will no doubt lead to a different waste classification than waste composting or mechanical and biological treatment. The main question, then, is which classification criteria can be used to achieve an objective waste classification? The article provides a short critical literature review of the existing waste classification criteria and suggests that the literature cannot provide a unitary waste classification system that is unanimously accepted and assumed by theorists and practitioners. There are various classification criteria and interesting perspectives in the literature regarding waste classification, but the most common criteria by which specialists classify waste into classes, categories and types are the generation source, physical and chemical features, aggregation state, origin or derivation, hazardous degree, etc. The traditional criteria divide waste into various categories, subcategories and types; such an approach is conjectural, because the criteria used inevitably differ significantly according to the context in which the classification is required; hence the need to standardize waste classification systems. For the first part of the article, the indirect observation research method was used, analyzing the literature and the various

  19. Wavelet transform based power quality events classification using ...

    African Journals Online (AJOL)

    Power quality (PQ) events are classified using wavelet transform (WT) energy features with artificial neural network (ANN) and SVM classifiers. The proposed scheme utilizes wavelet-based feature extraction to produce the features used by the classifiers. Six different PQ events are considered in ...

  20. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of their missing values. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by the neighborhood hypergraph. Third, under the guidance of rough sets, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository and compared with existing methods such as C4.5, SVM, Naive Bayes, and KNN. The experimental results show that the proposed algorithm performs better in terms of Precision, Recall, AUC, and F-measure.

  1. Automated detection and classification for craters based on geometric matching

    Science.gov (United States)

    Chen, Jian-qing; Cui, Ping-yuan; Cui, Hui-tao

    2011-08-01

    Crater detection and classification are critical elements of planetary mission preparation and landing site selection. This paper presents a methodology for the automated detection and matching of craters in images of planetary surfaces such as the Moon, Mars and asteroids. Because craters are usually bowl-shaped depressions, they can be represented as circles or circular arcs during the landing phase. Based on the hypothesis that detected crater edges are related to craters in a template by translation, rotation and scaling, the proposed matching method uses circles to fit crater edges and aligns circular arc edges from the image of the target body with circular features contained in a model. The approach includes edge detection, edge grouping, reference point detection and geometric circle model matching. Finally, we simulate a planetary surface to test the reasonableness and effectiveness of the proposed method.
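The circle-fitting step can be sketched with an algebraic least-squares (Kasa) fit, one common choice for fitting a circle to detected edge points. The paper's full pipeline (edge detection, grouping, matching) is not reproduced here, and the data below are synthetic.

```python
import numpy as np

def fit_circle(x, y):
    """Kasa fit: solve x^2 + y^2 + a*x + b*y + c = 0 in least squares;
    the centre is (-a/2, -b/2) and the radius sqrt(a^2/4 + b^2/4 - c)."""
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    a, b, c = np.linalg.lstsq(A, rhs, rcond=None)[0]
    cx, cy = -a / 2, -b / 2
    r = np.sqrt(cx**2 + cy**2 - c)
    return cx, cy, r

# Noisy points on a 270-degree arc, like a partly shadowed crater rim.
rng = np.random.default_rng(0)
t = rng.uniform(0, 1.5 * np.pi, 80)
x = 10 + 4 * np.cos(t) + rng.normal(0, 0.05, 80)
y = -3 + 4 * np.sin(t) + rng.normal(0, 0.05, 80)

cx, cy, r = fit_circle(x, y)
print(round(cx, 1), round(cy, 1), round(r, 1))  # close to (10, -3, 4)
```

A fit like this recovers centre and radius even from a partial arc, which is why circle models work for craters that are only partly illuminated or partly inside the image.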

  2. Maximum-margin based representation learning from multiple atlases for Alzheimer's disease classification.

    Science.gov (United States)

    Min, Rui; Cheng, Jian; Price, True; Wu, Guorong; Shen, Dinggang

    2014-01-01

    In order to establish the correspondences between different brains for comparison, spatial normalization based morphometric measurements have been widely used in the analysis of Alzheimer's disease (AD). In the literature, different subjects are often compared in one atlas space, which may be insufficient in revealing complex brain changes. In this paper, instead of deploying one atlas for feature extraction and classification, we propose a maximum-margin based representation learning (MMRL) method to learn the optimal representation from multiple atlases. Unlike traditional methods that perform the representation learning separately from the classification, we propose to learn the new representation jointly with the classification model, which is more powerful in discriminating AD patients from normal controls (NC). We evaluated the proposed method on the ADNI database, and achieved 90.69% for AD/NC classification and 73.69% for p-MCI/s-MCI classification.

  3. Logistic Regression-Based Trichotomous Classification Tree and Its Application in Medical Diagnosis.

    Science.gov (United States)

    Zhu, Yanke; Fang, Jiqian

    2016-11-01

    The classification tree is a valuable methodology for predictive modeling and data mining. However, existing classification trees ignore the fact that there may be a subset of individuals who cannot be well classified from the given set of predictor variables and who may therefore be classified with a higher error rate; moreover, most existing classification trees do not use combinations of variables at each step. An algorithm for a logistic regression-based trichotomous classification tree (LRTCT) is proposed that employs a trichotomous tree structure and linear combinations of predictor variables in the recursive partitioning process. Compared with the widely used classification and regression tree through applications on a series of simulated data and 2 real data sets, the LRTCT performed better in several aspects and does not require excessively complicated calculations.
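The trichotomous idea (route confidently classified cases to class leaves, keep an "uncertain" middle branch for further splitting) can be sketched for a single node. The logistic fit, the 0.3/0.7 thresholds, and the data are illustrative assumptions, not the authors' LRTCT algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
X = rng.normal(0, 1, (n, 2))
y = (X[:, 0] + X[:, 1] + rng.normal(0, 0.8, n) > 0).astype(float)

# One node: logistic regression on a linear combination of ALL
# predictors, fitted by plain gradient ascent (no external libraries).
w = np.zeros(2)
for _ in range(500):
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.1 * X.T @ (y - p) / n

# Trichotomous routing: confident class 0, confident class 1, or an
# "uncertain" middle branch (2) that would be split again recursively.
p = 1 / (1 + np.exp(-X @ w))
branch = np.where(p < 0.3, 0, np.where(p > 0.7, 1, 2))
for b in (0, 1, 2):
    print(b, (branch == b).sum(), round(y[branch == b].mean(), 2))
```

The printout shows the point of the middle branch: the two confident branches are nearly pure, while the cases that cannot yet be well classified are held back instead of being forced into a class at a high error rate.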

  4. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models.

    Science.gov (United States)

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
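The ME, MAE and RMSE criteria used above to compare the models can be written out directly. The sketch below applies them to a simple linear fit versus a mean-only baseline on synthetic data; the actual study fits SLR, CART and RF to real soil samples.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)                      # a single predictor
cd = 0.1 + 0.5 * x + rng.normal(0, 0.05, 200)   # synthetic "Cd", mg/kg

train, test = slice(0, 150), slice(150, 200)    # calibration / validation
slope, intercept = np.polyfit(x[train], cd[train], 1)

def metrics(pred, obs):
    """Mean error, mean absolute error, root mean squared error."""
    err = pred - obs
    return err.mean(), np.abs(err).mean(), np.sqrt((err ** 2).mean())

linear = metrics(intercept + slope * x[test], cd[test])
naive = metrics(np.full(50, cd[train].mean()), cd[test])
print([round(v, 3) for v in linear])   # ME, MAE, RMSE of the linear fit
print([round(v, 3) for v in naive])    # the mean-only baseline does worse
```

ME measures bias (it can be near zero even for a poor model because errors cancel), while MAE and RMSE measure spread, which is why the abstract reports all three for each model.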

  6. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  7. Formalization of the classification pattern: survey of classification modeling in information systems engineering.

    Science.gov (United States)

    Partridge, Chris; de Cesare, Sergio; Mitchell, Andrew; Odell, James

    2018-01-01

    Formalization is becoming more common in all stages of the development of information systems, as a better understanding of its benefits emerges. Classification systems are ubiquitous, no more so than in domain modeling. The classification pattern that underlies these systems provides a good case study of the move toward formalization in part because it illustrates some of the barriers to formalization, including the formal complexity of the pattern and the ontological issues surrounding the "one and the many." Powersets are a way of characterizing the (complex) formal structure of the classification pattern, and their formalization has been extensively studied in mathematics since Cantor's work in the late nineteenth century. One can use this formalization to develop a useful benchmark. There are various communities within information systems engineering (ISE) that are gradually working toward a formalization of the classification pattern. However, for most of these communities, this work is incomplete, in that they have not yet arrived at a solution with the expressiveness of the powerset benchmark. This contrasts with the early smooth adoption of powerset by other information systems communities to, for example, formalize relations. One way of understanding the varying rates of adoption is recognizing that the different communities have different historical baggage. Many conceptual modeling communities emerged from work done on database design, and this creates hurdles to the adoption of the high level of expressiveness of powersets. Another relevant factor is that these communities also often feel, particularly in the case of domain modeling, a responsibility to explain the semantics of whatever formal structures they adopt. 
This paper aims to make sense of the formalization of the classification pattern in ISE and surveys its history through the literature, starting from the relevant theoretical works of the mathematical literature and gradually shifting focus

  8. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Science.gov (United States)

    Groenendyk, Derek G; Ferré, Ty P A; Thorp, Kelly R; Rice, Amy K

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that control ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape
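The core move, clustering soils by their simulated hydrologic response rather than by texture, can be sketched with a tiny k-means on hypothetical exponential drainage curves standing in for HYDRUS-1D output.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 25)

# 30 "soils": 15 fast drainers (rate ~1.0) and 15 slow drainers (~0.2);
# each soil's "hydrologic response" is an exponential drainage curve.
rates = np.concatenate([rng.normal(1.0, 0.1, 15), rng.normal(0.2, 0.03, 15)])
curves = np.exp(-np.outer(rates, t))     # one response curve per soil

def kmeans(X, k=2, iters=10):
    centers = X[[0, -1]].copy()          # simple deterministic init for the sketch
    for _ in range(iters):
        lab = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(1)
        centers = np.stack([X[lab == j].mean(0) for j in range(k)])
    return lab

labels = kmeans(curves)
print(labels)  # the two behavioural groups separate cleanly
```

Because the distance is computed between whole response curves, soils end up grouped by how they behave under the drainage scenario, which is exactly the property a texture class cannot guarantee.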

  9. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Directory of Open Access Journals (Sweden)

    Derek G Groenendyk

    Full Text Available Soils lie at the interface between the atmosphere and the subsurface and are a key component that control ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. 
The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization

  10. Vertebrae classification models - Validating classification models that use morphometrics to identify ancient salmonid (Oncorhynchus spp.) vertebrae to species

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Using morphometric characteristics of modern salmonid (Oncorhynchus spp.) vertebrae, we have developed classification models to identify salmonid vertebrae to the...

  11. A Soft Intelligent Risk Evaluation Model for Credit Scoring Classification

    Directory of Open Access Journals (Sweden)

    Mehdi Khashei

    2015-09-01

    Full Text Available Risk management is one of the most important branches of business and finance. Classification models are the most popular and widely used analytical group of data mining approaches that can greatly help financial decision makers and managers to tackle credit risk problems. However, the literature clearly indicates that, despite proposing numerous classification models, credit scoring is often a difficult task. On the other hand, there is no universal credit-scoring model in the literature that can be accurately and explanatorily used in all circumstances. Therefore, the research for improving the efficiency of credit-scoring models has never stopped. In this paper, a hybrid soft intelligent classification model is proposed for credit-scoring problems. In the proposed model, the unique advantages of the soft computing techniques are used in order to modify the performance of the traditional artificial neural networks in credit scoring. Empirical results of Australian credit card data classifications indicate that the proposed hybrid model outperforms its components, and also other classification models presented for credit scoring. Therefore, the proposed model can be considered as an appropriate alternative tool for binary decision making in business and finance, especially in high uncertainty conditions.

  12. Resampling methods for evaluating classification accuracy of wildlife habitat models

    Science.gov (United States)

    Verbyla, David L.; Litvaitis, John A.

    1989-11-01

    Predictive models of wildlife-habitat relationships often have been developed without being tested. The apparent classification accuracy of such models can be optimistically biased and misleading. Data resampling methods exist that yield a more realistic estimate of model classification accuracy. These methods are simple and require no new sample data. We illustrate these methods (cross-validation, jackknife resampling, and bootstrap resampling) with computer simulation to demonstrate the increase in precision of the estimate. The bootstrap method is then applied to field data as a technique for model comparison. We recommend that biologists use some resampling procedure to evaluate wildlife habitat models prior to field evaluation.
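The bootstrap procedure the authors recommend can be sketched directly: refit the model on bootstrap resamples, score each refit on the cases left out of its resample, and average. The one-parameter threshold "habitat model" and the data below are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120
x = np.concatenate([rng.normal(0, 1, 60), rng.normal(2, 1, 60)])
y = np.array([0] * 60 + [1] * 60)       # absent / present

def fit_threshold(x, y):
    # Midpoint between class means: a one-parameter "habitat model".
    return (x[y == 0].mean() + x[y == 1].mean()) / 2

# Apparent accuracy: the model is scored on its own training data.
apparent = ((x > fit_threshold(x, y)) == y).mean()

# Bootstrap: refit on resamples, score on the out-of-bag cases.
oob_scores = []
for _ in range(200):
    idx = rng.choice(n, n, replace=True)
    oob = np.setdiff1d(np.arange(n), idx)
    thr = fit_threshold(x[idx], y[idx])
    oob_scores.append(((x[oob] > thr) == y[oob]).mean())

print(round(apparent, 2), round(float(np.mean(oob_scores)), 2))
```

The out-of-bag average is the more honest figure: every case used for scoring was withheld from the model that scores it, which removes the optimistic bias of the apparent accuracy without collecting any new field data.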

  13. Cancer classification based on gene expression using neural networks.

    Science.gov (United States)

    Hu, H P; Niu, Z J; Bai, Y P; Tan, X H

    2015-12-21

    Based on gene expression, we have classified 53 colon cancer patients with UICC II into two groups: relapse and no relapse. Samples were taken from each patient, and gene information was extracted. From the 53 samples examined, 500 genes were considered informative through analyses by S-Kohonen, BP, and SVM neural networks. The classification accuracy obtained by the S-Kohonen neural network reaches 91%, more accurate than classification by the BP and SVM neural networks. The results show that the S-Kohonen neural network is better suited for this classification task and has demonstrable feasibility and validity compared with the BP and SVM neural networks.

  14. Over-the-Air Deep Learning Based Radio Signal Classification

    Science.gov (United States)

    O'Shea, Timothy James; Roy, Tamoghna; Clancy, T. Charles

    2018-02-01

    We conduct an in-depth study of the performance of deep learning based radio signal classification for radio communications signals. We consider a rigorous baseline method using higher order moments with strong boosted gradient tree classification, and compare performance between the two approaches across a range of configurations and channel impairments. We consider the effects of carrier frequency offset, symbol rate, and multi-path fading in simulation, and conduct over-the-air measurements of radio classification performance in the lab using software radios, comparing performance and training strategies for both. Finally, we conclude with a discussion of remaining problems and design considerations for using such techniques.

  15. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,”“categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  16. Hyperspectral image classification through bilayer graph-based learning.

    Science.gov (United States)

    Gao, Yue; Ji, Rongrong; Cui, Peng; Dai, Qionghai; Hua, Gang

    2014-07-01

    Hyperspectral image classification with a limited number of labeled pixels is a challenging task. In this paper, we propose a bilayer graph-based learning framework to address this problem. For graph-based classification, how to establish the neighboring relationships among pixels from the high-dimensional features is the key to a successful classification. Our graph learning algorithm contains two layers. The first layer constructs a simple graph, where each vertex denotes one pixel and the edge weight encodes the similarity between two pixels. Unsupervised learning is then conducted to estimate the grouping relations among different pixels. These relations are subsequently fed into the second layer to form a hypergraph structure, on top of which semisupervised transductive learning is conducted to obtain the final classification results. Our experiments on three data sets demonstrate the merits of the proposed approach, which compares favorably with the state of the art.
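The first-layer idea (a similarity graph over pixels, then transductive spreading of the few available labels) can be sketched with a single Gaussian-similarity graph. The paper's bilayer/hypergraph construction is not reproduced, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal([0, 0], 0.3, (40, 2))      # "pixels" of class 0
B = rng.normal([2, 2], 0.3, (40, 2))      # "pixels" of class 1
X = np.vstack([A, B])

# Simple graph: Gaussian similarity between feature vectors,
# row-normalised into a transition matrix.
d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)
P = W / W.sum(1, keepdims=True)

F = np.zeros((80, 2))                     # per-node class scores
F[0, 0] = F[40, 1] = 1.0                  # only two labelled pixels
for _ in range(100):
    F = P @ F                             # spread labels over the graph
    F[0] = [1, 0]
    F[40] = [0, 1]                        # clamp the labelled pixels

labels = F.argmax(1)
print(labels)
```

With just one labelled pixel per class, the graph structure alone carries the labels to every other pixel, which is why graph-based transduction suits the limited-label regime the abstract describes.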

  17. Tweet-based Target Market Classification Using Ensemble Method

    Directory of Open Access Journals (Sweden)

    Muhammad Adi Khairul Anshary

    2016-09-01

    Full Text Available Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end result of data mining is a set of learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract features. Classification models were constructed to manipulate the training data using two ensemble methods (bagging and boosting). To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results on the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). On the other hand, for sentiment classification, the ensemble methods were not successful in increasing the accuracy of CART. The results of this study may be taken into consideration by companies who approach their customers through social media, especially Twitter.
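A minimal sketch of the bagging idea used in the study: train each base model on a bootstrap sample and predict by majority vote. A one-feature decision stump stands in for CART here; the actual experiments used Weka's implementations.

```python
import random

def stump_fit(X, y):
    """Fit a one-feature threshold classifier (a minimal stand-in for CART)."""
    best = None
    for f in range(len(X[0])):
        cands = sorted({row[f] for row in X})
        cands.append(cands[-1] + 1.0)  # allows an "always predict 0" stump
        for t in cands:
            acc = sum((1 if row[f] >= t else 0) == yy
                      for row, yy in zip(X, y))
            if best is None or acc > best[0]:
                best = (acc, f, t)
    _, f, t = best
    return lambda row: 1 if row[f] >= t else 0

def bagging(X, y, n_models=5, seed=0):
    """Bagging: bootstrap-resample the training set, vote over the stumps."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(stump_fit([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: int(sum(m(row) for m in models) > n_models / 2)

X = [[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]]
y = [0, 0, 0, 1, 1, 1]
clf = bagging(X, y)
```

Boosting differs in that it reweights training examples toward previously misclassified ones instead of resampling uniformly.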

  18. Classification rates: non‐parametric versus parametric models using ...

    African Journals Online (AJOL)

    The local likelihood technique was used to fit the data sets. The same sets of data were modeled using a parametric logit, and the abilities of the two models to correctly predict the binary outcome were compared. The results obtained showed that non‐parametric estimation gives a better prediction rate (classification ratio) for ...

  19. Co-occurrence Models in Music Genre Classification

    DEFF Research Database (Denmark)

    Ahrendt, Peter; Goutte, Cyril; Larsen, Jan

    2005-01-01

    Music genre classification has been investigated using many different methods, but most of them build on probabilistic models of feature vectors x_r which only represent the short time segment with index r of the song. Here, three different co-occurrence models are proposed which instead consider...... genre data set with a variety of modern music. The basis was a so-called AR feature representation of the music. Besides the benefit of having proper probabilistic models of the whole song, the lowest classification test errors were found using one of the proposed models.

  20. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and Natural Language Processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under the machine learning paradig...

  1. Combining Unsupervised and Supervised Classification to Build User Models for Exploratory Learning Environments

    Science.gov (United States)

    Amershi, Saleema; Conati, Cristina

    2009-01-01

    In this paper, we present a data-based user modeling framework that uses both unsupervised and supervised classification to build student models for exploratory learning environments. We apply the framework to build student models for two different learning environments and using two different data sources (logged interface and eye-tracking data).…

  2. Content-based classification and retrieval of audio

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    An on-line audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of feature extraction, classification and segmentation procedures, and experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. Outlines of further classification of audio into finer types and a query-by-example audio retrieval system on top of the coarse classification are also introduced.
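Two of the extracted features, short-time energy and zero-crossing rate, can be sketched per frame as follows (the frame length here is arbitrary, chosen only for illustration):

```python
def short_time_features(signal, frame=4):
    """Per-frame energy and zero-crossing rate, two of the features the
    system above extracts from its temporal curves."""
    feats = []
    for start in range(0, len(signal) - frame + 1, frame):
        w = signal[start:start + frame]
        energy = sum(x * x for x in w) / frame
        # Count sign changes between consecutive samples.
        zcr = sum(w[i] * w[i - 1] < 0 for i in range(1, frame)) / (frame - 1)
        feats.append((energy, zcr))
    return feats

# An alternating signal crosses zero at every sample; a constant one never does.
alt = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
flat = [0.5] * 8
fa = short_time_features(alt)
ff = short_time_features(flat)
```

Speech typically shows high variance in both curves across frames, while music and steady environmental sounds are smoother, which is what the threshold-based heuristics exploit.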

  3. A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model.

    Science.gov (United States)

    Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F

    2014-06-01

    To ensure appropriate soundscape management in urban environments, the urban-planning authorities need a range of tools that enable such a task to be performed. An essential step during the management of urban areas from a sound standpoint should be the evaluation of the soundscape in such an area. In this sense, it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step to evaluate it, providing a basis for designing or adapting it to match people's expectations as well. In this sense, this work proposes a model for automatic classification of urban soundscapes. This model is intended for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. Thus, this classification model is proposed to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented to develop the classification model. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). © 2013 Elsevier B.V. All rights reserved.

  4. Classification of Types of Stuttering Symptoms Based on Brain Activity

    Science.gov (United States)

    Jiang, Jing; Lu, Chunming; Peng, Danling; Zhu, Chaozhe; Howell, Peter

    2012-01-01

    Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type. PMID:22761887

  5. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Full Text Available Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  6. Twitter classification model: the ABC of two million fitness tweets.

    Science.gov (United States)

    Vickey, Theodore A; Ginis, Kathleen Martin; Dabrowski, Maciej

    2013-09-01

    The purpose of this project was to design and test data collection and management tools that can be used to study the use of mobile fitness applications and social networking within the context of physical activity. This project was conducted over a 6-month period and involved collecting publicly shared Twitter data from five mobile fitness apps (Nike+, RunKeeper, MyFitnessPal, Endomondo, and dailymile). During that time, over 2.8 million tweets were collected, processed, and categorized using an online tweet collection application and a customized JavaScript. Using grounded theory, a classification model was developed to categorize and understand the types of information being shared by application users. Our data show that by tracking mobile fitness app hashtags, a wealth of information can be gathered, including but not limited to daily use patterns, exercise frequency, location-based workouts, and overall workout sentiment.

  7. Estimating classification images with generalized linear and additive models.

    Science.gov (United States)

    Knoblauch, Kenneth; Maloney, Laurence T

    2008-12-22

    Conventional approaches to modeling classification image data can be described in terms of a standard linear model (LM). We show how the problem can be characterized as a Generalized Linear Model (GLM) with a Bernoulli distribution. We demonstrate via simulation that this approach is more accurate in estimating the underlying template in the absence of internal noise. With increasing internal noise, however, the advantage of the GLM over the LM decreases and GLM is no more accurate than LM. We then introduce the Generalized Additive Model (GAM), an extension of GLM that can be used to estimate smooth classification images adaptively. We show that this approach is more robust to the presence of internal noise, and finally, we demonstrate that GAM is readily adapted to estimation of higher order (nonlinear) classification images and to testing their significance.
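A minimal sketch of the GLM approach described above: with a Bernoulli response and a logit link, the classification image is the weight vector of a logistic regression. Here it is fit by plain stochastic gradient ascent on simulated observer data; this is an illustration of the idea, not the authors' estimation code.

```python
import math
import random

def fit_glm_bernoulli(X, y, lr=0.1, epochs=100):
    """Estimate a classification image as the weights of a Bernoulli GLM
    (logistic regression), fit by stochastic gradient ascent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for row, yy in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(row):
                w[i] += lr * (yy - p) * xi  # gradient of the log-likelihood
    return w

# Simulate an observer whose internal template weights only the first pixel.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(400)]
y = [1 if row[0] > 0 else 0 for row in X]
w = fit_glm_bernoulli(X, y)
```

The recovered weight vector should be dominated by the first component, mirroring the simulated template; the GAM extension replaces the linear predictor with smooth functions of the inputs.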

  8. LAND COVER CLASSIFICATION FROM FULL-WAVEFORM LIDAR DATA BASED ON SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    M. Zhou

    2016-06-01

    Full Text Available In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in the Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and backscattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in the Miyun area as ground, trees, buildings and farmland. The classification results for these four types of land cover were obtained based on the ground truth information according to the CCD image data of the Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, they were compared with those of the Artificial Neural Networks (ANNs) method, and it was shown that the SVM method could achieve better classification results.
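One of the waveform features above, the FWHM, follows directly from the Gaussian decomposition: for a Gaussian echo component with standard deviation σ, FWHM = 2·sqrt(2 ln 2)·σ. A small sketch verifying this numerically:

```python
import math

def gauss(t, amp, mu, sigma):
    """A decomposed Gaussian echo component of the return waveform."""
    return amp * math.exp(-((t - mu) ** 2) / (2 * sigma ** 2))

def fwhm(sigma):
    """Full Width at Half Maximum of a Gaussian: 2*sqrt(2 ln 2)*sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma

sigma = 1.5
half_width = fwhm(sigma) / 2
# At mu +/- FWHM/2 the waveform equals exactly half its peak amplitude.
value = gauss(half_width, amp=2.0, mu=0.0, sigma=sigma)
```

Each decomposed component thus yields one (distance, intensity, FWHM, cross-section) feature tuple for the SVM.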

  9. Combining a generic process-based productivity model and a classification method to predict the presence and absence of species in the Pacific Northwest, U.S.A.

    Science.gov (United States)

    Nicholas C. Coops; Richard H. Waring; Todd A. Schroeder

    2009-01-01

    Although long-lived tree species experience considerable environmental variation over their life spans, their geographical distributions reflect sensitivity mainly to mean monthly climatic conditions.We introduce an approach that incorporates a physiologically based growth model to illustrate how a half-dozen tree species differ in their responses to monthly variation...

  10. Density-based unsupervised classification for remote sensing

    NARCIS (Netherlands)

    C.H.M. van Kemenade; J.A. La Poutré (Han); R.J. Mokken

    1998-01-01

    textabstractMost image classification methods are supervised and use a parametric model of the classes that have to be detected. The models of the different classes are trained by means of a set of training regions that usually have to be marked and classified by a human interpreter. Unsupervised

  11. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Full Text Available Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol, its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA to improve reliability, and their analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  12. Purchase-oriented Classification Model of the Spare Parts of Agricultural Machinery

    OpenAIRE

    Zhang, Li-guo

    2011-01-01

    Based on the classification of spare parts and the research results of the demand of spare parts, a three-dimensional classification model of spare parts of agricultural machinery is established, which includes the application axis sorted by technical characteristics, the cost axis classified by ABC method, and the demand axis classified by the demand of the spare parts of agricultural machinery. These dimension axes represent different factors, and the application of factors in purchase is a...

  13. The Study of Land Use Classification Based on SPOT6 High Resolution Data

    OpenAIRE

    Wu Song; Jiang Qigang

    2016-01-01

    A method is presented for the rapid classification and extraction of land-use types in agricultural areas, based on SPOT6 high resolution remote sensing data and the good nonlinear classification ability of support vector machines. The results show that the SPOT6 high resolution remote sensing data enable efficient land classification; the overall classification accuracy reached 88.79% and the Kappa factor is 0.8632, which means that the classif...

  14. Conceptualising Business Models: Definitions, Frameworks and Classifications

    Directory of Open Access Journals (Sweden)

    Erwin Fielt

    2013-12-01

    Full Text Available The business model concept is gaining traction in different disciplines but is still criticized for being fuzzy and vague and lacking consensus on its definition and compositional elements. In this paper we set out to advance our understanding of the business model concept by addressing three areas of foundational research: business model definitions, business model elements, and business model archetypes. We define a business model as a representation of the value logic of an organization in terms of how it creates and captures customer value. This abstract and generic definition is made more specific and operational by the compositional elements that need to address the customer, value proposition, organizational architecture (firm and network level) and economics dimensions. Business model archetypes complement the definition and elements by providing a more concrete and empirical understanding of the business model concept. The main contributions of this paper are (1) explicitly including the customer value concept in the business model definition and focussing on value creation, (2) presenting four core dimensions that business model elements need to cover, (3) arguing for flexibility by adapting and extending business model elements to cater for different purposes and contexts (e.g. technology, innovation, strategy), (4) stressing a more systematic approach to business model archetypes by using business model elements for their description, and (5) suggesting to use business model archetype research for the empirical exploration and testing of business model elements and their relationships.

  15. A Spectral Signature Shape-Based Algorithm for Landsat Image Classification

    Directory of Open Access Journals (Sweden)

    Yuanyuan Chen

    2016-08-01

    Full Text Available Land-cover datasets are crucial for earth system modeling and human-nature interaction research at local, regional and global scales. They can be obtained from remotely sensed data using image classification methods. However, in processes of image classification, spectral values have received considerable attention for most classification methods, while the spectral curve shape has seldom been used because it is difficult to quantify. This study presents a classification method based on the observation that the spectral curve is composed of segments and certain extreme values. The presented classification method quantifies the spectral curve shape and makes full use of the spectral shape differences among land covers to classify remotely sensed images. Using this method, classification maps from TM (Thematic Mapper) data were obtained with an overall accuracy of 0.834 and 0.854 for two respective test areas. The approach presented in this paper, which differs from previous image classification methods that were mostly concerned with spectral “value” similarity characteristics, emphasizes the “shape” similarity characteristics of the spectral curve. Moreover, this study will be helpful for classification research on hyperspectral and multi-temporal images.
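One way to make the "spectral curve shape" idea concrete is to encode each curve as the sequence of its segment slope signs and match against reference shapes. This is an illustrative encoding in the spirit of the paper's segment/extreme-value description, not its actual algorithm; the curves and land-cover labels below are hypothetical.

```python
def shape_code(spectrum):
    """Encode a spectral curve's shape as per-segment slope signs:
    +1 rising, -1 falling, 0 flat."""
    return tuple((b > a) - (b < a) for a, b in zip(spectrum, spectrum[1:]))

def classify(spectrum, library):
    """Assign the label whose reference curve shares the same shape code."""
    return library.get(shape_code(spectrum), "unknown")

# Hypothetical reference curves keyed by their shape codes.
library = {
    shape_code([0.1, 0.3, 0.6, 0.5]): "vegetation",
    shape_code([0.4, 0.3, 0.2, 0.1]): "water",
}
label = classify([0.2, 0.4, 0.7, 0.6], library)
```

Note the test curve has different band values from the reference yet the same shape, which is exactly the invariance a shape-based method aims for.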

  16. Latent Partially Ordered Classification Models and Normal Mixtures

    Science.gov (United States)

    Tatsuoka, Curtis; Varadi, Ferenc; Jaeger, Judith

    2013-01-01

    Latent partially ordered sets (posets) can be employed in modeling cognitive functioning, such as in the analysis of neuropsychological (NP) and educational test data. Posets are cognitively diagnostic in the sense that classification states in these models are associated with detailed profiles of cognitive functioning. These profiles allow for…

  17. SEMIPARAMETRIC VERSUS PARAMETRIC CLASSIFICATION MODELS - AN APPLICATION TO DIRECT MARKETING

    NARCIS (Netherlands)

    Bult

    In this paper we are concerned with estimation of a classification model using semiparametric and parametric methods. Benefits and limitations of semiparametric models in general, and of Manski's maximum score method in particular, are discussed. The maximum score method yields consistent estimates

  18. Underwater Signal Modeling for Subsurface Classification Using Computational Intelligence.

    Science.gov (United States)

    Setayeshi, Saeed

    In the thesis a method for underwater layered media (UWLM) modeling is proposed, and a simple nonlinear structure for implementing this model is designed, based on the behaviour of its characteristics and the propagation of the acoustic signal in the media, accounting for attenuation effects. The model, which responds to acoustic input, is employed to test the ability of artificial intelligence classifiers. Neural network models, the basic principles of the back-propagation algorithm, and the Hopfield model of associative memories are reviewed; they are employed to use min-max amplitude ranges of a signal reflected from UWLM, accounting for attenuation effects, to define the classes of the synthetic data, detect its peak features and estimate parameters of the media. It has been found that there is a correlation between the number of layers in the media and the optimum number of nodes in the hidden layer of the neural networks. The integration of the results of the neural networks that classify and detect underwater layered media acoustic signals based on attenuation effects, to prove the correspondence between the peak points and decay values, has introduced a powerful tool for UWLM identification. The methods appear to have applications in replacing the original system, for parameter estimation and output prediction in system identification by the proposed networks. The results of computerized simulation of UWLM modeling in conjunction with the proposed neural network training process are given. Fuzzy sets are an idea that allows representing and manipulating inexact concepts; the fuzzy min-max pattern classification method and the learning and recalling algorithms for fuzzy neural network implementation are explained in this thesis. A fuzzy neural network that uses peak amplitude ranges to define classes is proposed and evaluated for UWLM pattern recognition. It is demonstrated to be able to classify the layered media data sets, and can distinguish between the peak points
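The fuzzy min-max membership idea referenced above (in the style of Simpson's fuzzy min-max classifier) can be sketched as a hyperbox membership function: full membership inside the box, decaying with distance outside it. The decay parameter and the box below are illustrative values, not taken from the thesis.

```python
def membership(x, vmin, vmax, gamma=4.0):
    """Degree to which point x belongs to the hyperbox [vmin, vmax];
    gamma controls how fast membership decays outside the box."""
    m = 1.0
    for xi, lo, hi in zip(x, vmin, vmax):
        below = max(0.0, lo - xi)   # distance below the box on this dim
        above = max(0.0, xi - hi)   # distance above the box on this dim
        m = min(m, max(0.0, 1.0 - gamma * (below + above)))
    return m

# A hyperbox spanning a min-max amplitude range for one class.
box_lo, box_hi = (0.2, 0.2), (0.5, 0.6)
inside = membership((0.3, 0.4), box_lo, box_hi)   # fully inside the box
outside = membership((0.6, 0.4), box_lo, box_hi)  # 0.1 outside on dim 0
```

During learning, hyperboxes are expanded (and contracted on overlap) to cover each class's min-max amplitude ranges; classification picks the class with the highest membership.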

  19. Mathematical model for classification of EEG signals

    Science.gov (United States)

    Ortiz, Victor H.; Tapia, Juan J.

    2015-09-01

    A mathematical model to filter and classify brain signals from a brain machine interface is developed. The mathematical model classifies the signals from the different lobes of the brain to differentiate the alpha, beta, gamma and theta signals, as well as the signals related to vision, speech, and orientation. The model further eliminates noise signals that occur during signal acquisition. This mathematical model can be used on different platform interfaces for the rehabilitation of physically handicapped persons.
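Classification into the named rhythms can be sketched as a mapping from a signal's dominant frequency to conventional EEG band boundaries. Boundary values vary slightly across the literature; these are common textbook ranges (delta is included for completeness, though the abstract lists only alpha, beta, gamma and theta).

```python
def band_label(freq_hz):
    """Map a dominant frequency (Hz) to a conventional EEG band."""
    if freq_hz < 4:
        return "delta"
    if freq_hz < 8:
        return "theta"
    if freq_hz < 13:
        return "alpha"
    if freq_hz < 30:
        return "beta"
    return "gamma"

labels = [band_label(f) for f in (6.0, 10.0, 20.0, 40.0)]
```

In a full pipeline, the dominant frequency would come from a filtered spectral estimate (e.g. the peak of a periodogram) rather than being supplied directly.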

  20. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...... between the branch feature vectors representing those trees. Hereby, localized information in the branches is collectively used in classification and variations in feature values across the tree are taken into account. An approximate anatomical correspondence between matched branches can be achieved...... by including anatomical features in the branch feature vectors. The proposed approach is applied to classify airway trees in computed tomography images of subjects with and without chronic obstructive pulmonary disease (COPD). Using the wall area percentage (WA%), a common measure of airway abnormality in COPD...
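The pair-wise dissimilarity described above can be sketched as a minimum-cost linear assignment between branch feature vectors. This toy version brute-forces the assignment over permutations (a real implementation would use the Hungarian algorithm) and assumes both trees have the same number of branches.

```python
from itertools import permutations

def tree_dissimilarity(branches_a, branches_b):
    """Minimum-cost one-to-one matching between the branch feature vectors
    of two trees (brute force; illustrative squared-Euclidean branch cost)."""
    def cost(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        sum(cost(a, branches_b[j]) for a, j in zip(branches_a, perm))
        for perm in permutations(range(len(branches_b)))
    )

# Two trees with nearly identical branches, listed in different orders.
t1 = [(1.0, 0.0), (0.0, 1.0)]
t2 = [(0.0, 1.1), (1.0, 0.1)]
d = tree_dissimilarity(t1, t2)
```

Because the assignment finds the best branch correspondence, the measure stays small for anatomically similar trees regardless of branch ordering; the resulting dissimilarity matrix then feeds a standard dissimilarity-based classifier.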

  1. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System.

    Directory of Open Access Journals (Sweden)

    Vessela Krasteva

    Full Text Available This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes computationally-efficient classification of SVB-beats, using simple correlation threshold criterion for finding close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part from VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3-6.8% points, with the advantage of easy model complexity configuration by pruning the tree, which consists of easily interpretable 'if-then' rules.

  2. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System.

    Science.gov (United States)

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes computationally-efficient classification of SVB-beats, using simple correlation threshold criterion for finding close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part from VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3-6.8% points, with the advantage of easy model complexity configuration by pruning the tree, which consists of easily interpretable 'if-then' rules.
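Stage 1's correlation-threshold criterion can be sketched as follows; the template, beats and threshold value below are illustrative, not taken from the paper.

```python
def correlation(beat, template):
    """Pearson correlation between a beat and the reference template."""
    n = len(beat)
    mb = sum(beat) / n
    mt = sum(template) / n
    num = sum((b - mb) * (t - mt) for b, t in zip(beat, template))
    den = (sum((b - mb) ** 2 for b in beat)
           * sum((t - mt) ** 2 for t in template)) ** 0.5
    return num / den

def stage1(beat, template, threshold=0.98):
    """Accept as SVB on a close template match, otherwise pass the beat on
    to Stage 2 for feature-based refinement (threshold is illustrative)."""
    return "SVB" if correlation(beat, template) >= threshold else "refine"

template = [0.0, 1.0, 5.0, 1.0, 0.0]
match = stage1([0.1, 1.1, 5.1, 1.1, 0.1], template)  # offset copy of template
odd = stage1([0.0, 5.0, 1.0, 0.0, 1.0], template)    # different morphology
```

Because most beats in a recording match the predominant normal morphology, this cheap first stage lets the 20-feature Stage-2 classifiers run only on the minority of non-matched beats.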

  3. a Classification Algorithm for Hyperspectral Data Based on Synergetics Theory

    Science.gov (United States)

    Cerra, D.; Mueller, R.; Reinartz, P.

    2012-07-01

    This paper presents a new classification methodology for hyperspectral data based on synergetics theory, which describes the spontaneous formation of patterns and structures in a system through self-organization. We introduce a representation for hyperspectral data in which a spectrum can be projected in a space spanned by a set of user-defined prototype vectors, which belong to some classes of interest. Each test vector is attracted by a final state associated to a prototype, and can thus be classified. As typical synergetics-based systems have the drawback of a rigid training step, we modify the training step to allow the selection of user-defined training areas, used to weight the prototype vectors through attention parameters and to produce a more accurate classification map through majority voting of independent classifications. Results are comparable to state-of-the-art classification methodologies, both general and specific to hyperspectral data, and, as each classification is based on a single training sample per class, the proposed technique would be particularly effective in tasks where only a small training dataset is available.
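A heavily simplified, nearest-prototype reading of the projection step described above: project the test spectrum onto user-defined class prototypes, scale by attention parameters, and let the largest overlap win. The full synergetic computer runs attractor dynamics to a final state; this sketch keeps only the weighted-projection decision, and all vectors and weights are hypothetical.

```python
def classify(test_vec, prototypes, attention):
    """Attention-weighted cosine overlap with class prototypes; the class
    whose prototype 'attracts' the test spectrum most strongly wins."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def norm(a):
        return dot(a, a) ** 0.5

    scores = {
        label: attention.get(label, 1.0)
               * dot(test_vec, p) / (norm(test_vec) * norm(p))
        for label, p in prototypes.items()
    }
    return max(scores, key=scores.get)

# Hypothetical 3-band prototype spectra and uniform attention parameters.
prototypes = {"soil": [1.0, 0.2, 0.1], "water": [0.1, 0.2, 1.0]}
attention = {"soil": 1.0, "water": 1.0}
label = classify([0.9, 0.3, 0.2], prototypes, attention)
```

In the paper's scheme, the attention parameters are learned from user-defined training areas, and several independent classifications are combined by majority vote.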

  4. Classification of consumers based on perceptions

    DEFF Research Database (Denmark)

    Høg, Esben; Juhl, Hans Jørn; Poulsen, Carsten Stig

    1999-01-01

    This paper reports some results from a recent Danish study of fish consumption. One purpose of the study was to identify consumer segments according to their perceptions of fish in comparison with other food categories. We present a model, which has the capabilities to determine the number...... of segments and putting in order of priority the alternatives examined. The model allows for ties, i.e. the consumer's expression of no preference among alternatives. The parameters in the model are estimated simultaneously by the method of maximum likelihood. The approach is illustrated using data from...

  5. Single image super-resolution based on image patch classification

    Science.gov (United States)

    Xia, Ping; Yan, Hua; Li, Jing; Sun, Jiande

    2017-06-01

    This paper proposes a single-image super-resolution algorithm based on image patch classification and sparse representation, in which gradient information is used to classify image patches into three classes so as to reflect the differences between the types of patches. Compared with other classification algorithms, the gradient-based algorithm is simpler and more effective. Each class is learned to obtain a corresponding sub-dictionary, and a high-resolution image patch can be reconstructed from that dictionary and the sparse-representation coefficients of the corresponding class of image patches. The experiments demonstrate that the proposed algorithm performs better than the comparison algorithms.
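Gradient-based patch classification can be sketched in a few lines. The thresholds and class names below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def classify_patch(patch, t_low=5.0, t_high=20.0):
    """Classify a patch by its mean gradient magnitude: nearly constant
    patches are 'smooth', strong discontinuities 'edge', the rest
    'texture'. Thresholds are illustrative only."""
    gy, gx = np.gradient(patch.astype(float))
    g = float(np.mean(np.hypot(gx, gy)))
    if g < t_low:
        return "smooth"
    return "edge" if g >= t_high else "texture"

flat = np.zeros((8, 8))
step = np.zeros((8, 8))
step[:, 4:] = 255.0   # vertical step edge
```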

  6. A Dirichlet process mixture model for brain MRI tissue classification.

    Science.gov (United States)

    Ferreira da Silva, Adelino R

    2007-04-01

    Accurate classification of magnetic resonance images according to tissue type or region of interest has become a critical requirement in diagnosis, treatment planning, and cognitive neuroscience. Several authors have shown that finite mixture models give excellent results in the automated segmentation of MR images of the human normal brain. However, performance and robustness of finite mixture models deteriorate when the models have to deal with a variety of anatomical structures. In this paper, we propose a nonparametric Bayesian model for tissue classification of MR images of the brain. The model, known as Dirichlet process mixture model, uses Dirichlet process priors to overcome the limitations of current parametric finite mixture models. To validate the accuracy and robustness of our method we present the results of experiments carried out on simulated MR brain scans, as well as on real MR image data. The results are compared with similar results from other well-known MRI segmentation methods.

  7. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    The ever-increasing generation of data confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data-stream context, data arrive continuously at high speed; the algorithms developed for this context must therefore be efficient in memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories; both are supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent in data-stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding-window approach to provide concept-drift detection and a forgetting mechanism. To test the classifier, several experiments were performed using different data-stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
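The sliding-window idea (old examples forgotten as new ones arrive) can be sketched independently of the Gamma classifier itself. The 1-NN learner below is a deliberate simplification, not the DS-Gamma model:

```python
from collections import deque

class SlidingWindowClassifier:
    """Sketch of a stream learner with a fixed-size sliding window:
    appending beyond the window's capacity silently discards the oldest
    example, acting as the forgetting mechanism under concept drift."""

    def __init__(self, window_size=100):
        self.window = deque(maxlen=window_size)  # (features, label) pairs

    def learn(self, x, y):
        self.window.append((x, y))

    def predict(self, x):
        # 1-nearest-neighbour over the current window (squared Euclidean)
        _, label = min(self.window,
                       key=lambda item: sum((a - b) ** 2
                                            for a, b in zip(item[0], x)))
        return label

clf = SlidingWindowClassifier(window_size=50)
for _ in range(50):
    clf.learn((1.0,), "A")        # initial concept
before_drift = clf.predict((1.0,))
for _ in range(50):
    clf.learn((1.0,), "B")        # concept drifts: same input, new label
after_drift = clf.predict((1.0,))
```

Once the window fills with post-drift examples, the old concept has been fully forgotten and predictions follow the new labeling.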

  8. Estimation of Compaction Parameters Based on Soil Classification

    Science.gov (United States)

    Lubis, A. S.; Muis, Z. A.; Hastuty, I. P.; Siregar, I. M.

    2018-02-01

    Factors that must be considered in soil compaction works are the type of soil material, field control, maintenance and availability of funds. These problems raise the question of how to estimate the density of the soil with an implementation system that is proper, fast and economical. This study aims to estimate the compaction parameters, i.e. the maximum dry unit weight (γdmax) and the optimum water content (wopt), based on soil classification. Each of the 30 samples was tested for its index properties and compaction behaviour. All data from the laboratory tests were used to estimate the compaction parameter values by linear regression and by the Goswami model. The soil types were A-4, A-6 and A-7 according to AASHTO, and SC, SC-SM and CL based on USCS. By linear regression, the estimates are γdmax* = 1.862 - 0.005*FINES - 0.003*LL and wopt* = -0.607 + 0.362*FINES + 0.161*LL. By the Goswami model (with equation Y = m*logG + k), the coefficients are m = -0.376 and k = 2.482 for γdmax*, and m = 21.265 and k = -32.421 for wopt*. For both of these equations a 95% confidence interval was obtained.
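The two regression estimators and the Goswami form quoted above can be written directly. Treat this as a sketch of the reported fits (inputs in percent, as in the study), not a validated design tool:

```python
import math

def estimate_compaction(fines, ll):
    """Linear-regression estimates quoted above.
    fines: fines content (%); ll: liquid limit (%).
    Returns (gamma_dmax_estimate, w_opt_estimate)."""
    gamma_dmax = 1.862 - 0.005 * fines - 0.003 * ll   # maximum dry unit weight
    w_opt = -0.607 + 0.362 * fines + 0.161 * ll       # optimum water content (%)
    return gamma_dmax, w_opt

def goswami(G, m, k):
    """Goswami model Y = m*log10(G) + k, with the reported coefficients
    m=-0.376, k=2.482 for gamma_dmax and m=21.265, k=-32.421 for w_opt."""
    return m * math.log10(G) + k

g, w = estimate_compaction(fines=50.0, ll=30.0)   # g = 1.522, w = 22.323
```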

  9. Superpixel-based classification of gastric chromoendoscopy images

    Science.gov (United States)

    Boschetto, Davide; Grisan, Enrico

    2017-03-01

    Chromoendoscopy (CH) is a gastroenterology imaging modality that involves staining tissues with methylene blue, which reacts with the internal walls of the gastrointestinal tract, improving the visual contrast of mucosal surfaces and thus enhancing a doctor's ability to screen for precancerous lesions or early cancer. This technique helps identify areas that can be targeted for biopsy or treatment; in this work we focus on gastric cancer detection. Gastric chromoendoscopy for cancer detection has several taxonomies available, one of which classifies CH images into three classes (normal, metaplasia, dysplasia) based on the color, shape and regularity of pit patterns. Computer-assisted diagnosis is desirable to improve the reliability of tissue classification and abnormality detection. However, traditional computer-vision methodologies, mainly segmentation, do not translate well to the specific visual characteristics of gastroenterology imaging. We propose a first unsupervised segmentation via superpixels, which group pixels into perceptually meaningful atomic regions that replace the rigid structure of the pixel grid. For each superpixel, a set of features is extracted and fed to a random-forest-based classifier, which computes a model used to predict the class of each superpixel. The average general accuracy of our model is 92.05% in the pixel domain (86.62% in the superpixel domain), while detection accuracies on the normal and abnormal classes are 85.71% and 95%, respectively. Finally, the class of the whole image can be predicted through a majority vote over the superpixels' predicted classes.
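The gap between pixel-domain and superpixel-domain accuracy arises because superpixels have unequal sizes, so large superpixels weigh more in the pixel domain. A small numeric illustration with invented sizes and labels:

```python
import numpy as np

# Hypothetical example: three superpixels of unequal size; the large,
# correctly classified one dominates the pixel-domain score.
sp_sizes = np.array([100, 100, 800])    # pixels in each superpixel
sp_pred  = np.array([0, 1, 1])          # predicted class per superpixel
sp_true  = np.array([1, 1, 1])          # ground truth per superpixel

correct = sp_pred == sp_true
superpixel_acc = correct.mean()                          # 2/3
pixel_acc = (sp_sizes * correct).sum() / sp_sizes.sum()  # 900/1000 = 0.9
```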

  10. A Classification of PLC Models and Applications

    NARCIS (Netherlands)

    Mader, Angelika H.; Boel, R.; Stremersch, G.

    In the past years there is an increasing interest in analysing PLC applications with formal methods. The first step to this end is to get formal models of PLC applications. Meanwhile, various models for PLCs have already been introduced in the literature. In our paper we discuss several

  11. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    This paper deals with classification of human gait types based on the notion that different gait types are in fact different types of locomotion, i.e., running is not simply walking done faster. We present the duty-factor, which is a descriptor based on this notion. The duty-factor is independent...... on the speed of the human, the camera setup etc. and hence a robust descriptor for gait classification. The duty-factor is basically a matter of measuring the ground support of the feet with respect to the stride. We estimate this by comparing the incoming silhouettes to a database of silhouettes with known...... ground support. Silhouettes are extracted using the Codebook method and represented using Shape Contexts. The matching with database silhouettes is done using the Hungarian method. While manually estimated duty-factors show a clear classification, the presented system contains misclassifications due
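The core quantity is simple enough to state in code. A sketch, where the 0.5 walking/running boundary is the standard biomechanical rule of thumb rather than necessarily the paper's exact threshold:

```python
def duty_factor(support_frames, stride_frames):
    """Fraction of the stride during which a foot is in ground contact."""
    return support_frames / stride_frames

def classify_gait(df):
    """Rule of thumb: duty-factor > 0.5 implies walking (double-support
    phases exist), <= 0.5 implies running (airborne phases exist)."""
    return "walking" if df > 0.5 else "running"

walk = classify_gait(duty_factor(18, 30))   # 0.6 -> "walking"
run = classify_gait(duty_factor(12, 30))    # 0.4 -> "running"
```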

  12. Classification Model for Damage Localization in a Plate Structure

    Science.gov (United States)

    Janeliukstis, R.; Ruchevskis, S.; Chate, A.

    2018-01-01

    The present study is devoted to the problem of damage localization by means of data classification. The commercial ANSYS finite-element program was used to build a model of a cantilevered composite plate equipped with numerous strain sensors. The plate was divided into zones and, for data classification purposes, each zone housed several points at which a point mass of 5% or 10% of the plate mass was applied. At each of these points, a numerical modal analysis was performed, from which the first few natural frequencies and strain readings were extracted. The strain data for every point were the input to a classification procedure involving k-nearest neighbors and decision trees. The classification model was trained and optimized by fine-tuning the key parameters of both algorithms. Finally, two new query points were simulated and subjected to classification, i.e., assigned the label of one of the zones of the plate, thus localizing these points. Damage localization results were compared for both algorithms and were found to be in good agreement with the actual application positions of the point load.
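The zone-labelling step can be sketched as a plain k-nearest-neighbours vote over labelled feature vectors. The feature values and zone names below are made up for illustration:

```python
from collections import Counter

def knn_zone(query, training, k=3):
    """Assign the query to the majority zone among its k nearest training
    points (squared Euclidean distance over the feature vectors)."""
    nearest = sorted(training,
                     key=lambda item: sum((a - b) ** 2
                                          for a, b in zip(item[0], query)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical training set: two zones with well-separated feature clusters.
training = [((0.0, 0.0), "zone1"), ((0.0, 1.0), "zone1"), ((1.0, 0.0), "zone1"),
            ((5.0, 5.0), "zone2"), ((5.0, 6.0), "zone2"), ((6.0, 5.0), "zone2")]
zone = knn_zone((0.2, 0.2), training)   # -> "zone1"
```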

  13. Additive mathematical model of materials aeration classification in separators

    Directory of Open Access Journals (Sweden)

    V. Ya. Potapov

    2017-09-01

    The article confirms the mechanism of aeration classification in drum-shelf friction separators for materials whose components differ over a wide range of "sailage", with the aim of increasing separator efficiency and the quality of finished products in technologies for separating the components of ore and non-ore materials. Using the aerodynamics of bodies of arbitrary shape in a directed air flow, a mathematical model of the aeration classification of material particles is obtained, depending on their physical properties, unified by an integral criterion of "sailage", and on controlled airflow parameters, with separate accounting of the influence of particle velocity and flow. Equations are obtained for calculating the geometric parameters of the aeration-classification unit of a drum-shelf friction separator as functions of the integral criterion of "sailage", determined by the shape, size and density of the initial raw material and the air viscosity, providing maximum quality of stratification of the feedstock and, as a result, increased production efficiency and quality of the separated material. The efficiency of aeration classification with the use of a controlled air flow is confirmed, with sufficient convergence of experimental and calculated data. The additive mathematical model has confirmed the high efficiency of aeration classification in drum-type friction separators for improving the quality of stratification of raw materials whose components differ over a wide range of "sailage".

  14. Classification of consumers based on perceptions

    DEFF Research Database (Denmark)

    Høg, Esben; Juhl, Hans Jørn; Poulsen, Carsten Stig

    1999-01-01

    This paper reports some results from a recent Danish study of fish consumption. One major purpose of the study was to identify consumer segments according to their perceptions of fish in comparison with other food categories. We present a model which has the capabilities to determine the number o...

  15. Road geometry classification by adaptive shape models

    NARCIS (Netherlands)

    Álvarez, J.M.; Gevers, T.; Diego, F.; López, A.M.

    2012-01-01

    Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect of the

  16. Habitat classification modeling with incomplete data: Pushing the habitat envelope

    Science.gov (United States)

    Zarnetske, P.L.; Edwards, T.C.; Moisen, Gretchen G.

    2007-01-01

    Habitat classification models (HCMs) are invaluable tools for species conservation, land-use planning, reserve design, and metapopulation assessments, particularly at broad spatial scales. However, species occurrence data are often lacking and typically limited to presence points at broad scales. This lack of absence data precludes the use of many statistical techniques for HCMs. One option is to generate pseudo-absence points so that the many available statistical modeling tools can be used. Traditional techniques generate pseudo-absence points at random across broadly defined species ranges, often failing to include biological knowledge concerning the species-habitat relationship. We incorporated biological knowledge of the species-habitat relationship into pseudo-absence points by creating habitat envelopes that constrain the region from which points were randomly selected. We define a habitat envelope as an ecological representation of a species, or species feature's (e.g., nest) observed distribution (i.e., realized niche) based on a single attribute, or the spatial intersection of multiple attributes. We created HCMs for Northern Goshawk (Accipiter gentilis atricapillus) nest habitat during the breeding season across Utah forests with extant nest presence points and ecologically based pseudo-absence points using logistic regression. Predictor variables were derived from 30-m USDA Landfire and 250-m Forest Inventory and Analysis (FIA) map products. These habitat-envelope-based models were then compared to null envelope models, which use traditional practices for generating pseudo-absences. Models were assessed for fit and predictive capability using metrics such as kappa, threshold-independent receiver operating characteristic (ROC) plots, adjusted deviance (Dadj^2), and cross-validation, and were also assessed for ecological relevance. For all cases, habitat-envelope-based models outperformed null envelope models and were more ecologically relevant, suggesting
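A minimal numeric sketch of envelope-constrained pseudo-absence generation as the abstract describes it (candidate cells restricted by the envelope rather than drawn across the whole range); the grid, mask and sample size are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def pseudo_absences(envelope_mask, n):
    """Draw n random cells from the study area, restricted to cells outside
    the habitat envelope (mask == False), instead of sampling the whole
    species range at random as the traditional approach does."""
    rows, cols = np.where(~envelope_mask)
    idx = rng.choice(len(rows), size=n, replace=False)
    return np.stack([rows[idx], cols[idx]], axis=1)

# Hypothetical 10x10 study area with a 4x4 habitat envelope in one corner.
mask = np.zeros((10, 10), dtype=bool)
mask[:4, :4] = True
points = pseudo_absences(mask, 20)
```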

  17. A framework of mixed variables classification in the presence of outliers: A robust location model

    Science.gov (United States)

    Hamid, Hashibah binti

    2017-11-01

    The location model stands out as an excellent tool for mixed-variable classification among other existing approaches such as kernel-based non-parametric classification, logistic discrimination and linear discriminant analysis. However, the presence of outliers affects the estimation of population parameters, leaving the model unable to provide an adequate statistical model and interpretation. In other words, outliers can distort not only parameter estimation but also lead to poor classification performance. This article therefore aims to develop a new framework for the location model through the integration of a robust technique with the classical location model for mixed variables in the presence of outliers, for the purpose of robust classification. The developed framework produces a new location model that can be used as an alternative approach for classification tasks by academicians and practitioners in future applications, especially when they face outliers in their data samples. We hope that the new location model will perform better on both contaminated and non-contaminated datasets.

  18. A thyroid nodule classification method based on TI-RADS

    Science.gov (United States)

    Wang, Hao; Yang, Yang; Peng, Bo; Chen, Qin

    2017-07-01

    The Thyroid Imaging Reporting and Data System (TI-RADS) is a valuable tool for differentiating benign from malignant thyroid nodules. In the clinic, doctors use TI-RADS to grade the degree of benignity or malignancy in terms of different classes; the classification represents the degree of malignancy of a thyroid nodule. As a classification standard, TI-RADS can guide the ultrasound physician to examine thyroid nodules more accurately and reliably. In this paper, we aim to classify thyroid nodules with the help of TI-RADS. To this end, four ultrasound signs, i.e., cystic/solid composition, echo pattern, boundary features and calcification of thyroid nodules, are extracted and converted into feature vectors. Then a semi-supervised fuzzy C-means ensemble (SS-FCME) model is applied to obtain the classification results. The experimental results demonstrate that the proposed method can help doctors diagnose thyroid nodules effectively.

  19. Robust point cloud classification based on multi-level semantic relationships for urban scenes

    Science.gov (United States)

    Zhu, Qing; Li, Yuan; Hu, Han; Wu, Bo

    2017-07-01

    The semantic classification of point clouds is a fundamental part of three-dimensional urban reconstruction. For datasets with high spatial resolution but significantly more noise, a general trend is to exploit more contextual information to counter the decreased discriminative power of individual features. However, previous work adopting contextual information is either too restrictive or limited to a small region. In this paper, we propose a point-cloud classification method based on multi-level semantic relationships, including point-homogeneity, supervoxel-adjacency and class-knowledge constraints, which is more versatile and incrementally propagates classification cues from individual points to the object level, formulating them as a graphical model. The point-homogeneity constraint clusters points with similar geometric and radiometric properties into regular-shaped supervoxels that correspond to the vertices in the graphical model. The supervoxel-adjacency constraint contributes to the pairwise interactions by providing explicit adjacency relationships between supervoxels. The class-knowledge constraint operates at the object level based on semantic rules, guaranteeing the classification correctness of supervoxel clusters at that level. International Society for Photogrammetry and Remote Sensing (ISPRS) benchmark tests have shown that the proposed method achieves state-of-the-art performance, with an average per-area completeness and correctness of 93.88% and 95.78%, respectively. The evaluation on classification of photogrammetric point clouds and DSMs generated from aerial imagery confirms the method's reliability in several challenging urban scenes.

  20. Sequential Classification of Palm Gestures Based on A* Algorithm and MLP Neural Network for Quadrocopter Control

    Directory of Open Access Journals (Sweden)

    Wodziński Marek

    2017-06-01

    This paper presents an alternative approach to sequential data classification based on traditional machine-learning algorithms (neural networks, principal component analysis, a multivariate Gaussian anomaly detector) and on finding the shortest path in a directed acyclic graph using the A* algorithm with a regression-based heuristic. Palm gestures were used as an example of sequential data, and a quadrocopter was the controlled object. The study includes the creation of a conceptual model and the practical construction of a system using the GPU to ensure real-time operation. The results present the classification accuracy of the chosen gestures and compare the computation time of the CPU- and GPU-based solutions.

  1. Fuzzy modeling of farmers' knowledge for land suitability classification

    NARCIS (Netherlands)

    Sicat, R.S.; Carranza, E.J.M.; Nidumolu, U.B.

    2005-01-01

    In a case study, we demonstrate fuzzy modeling of farmers' knowledge (FK) for agricultural land suitability classification using GIS. Capture of FK was through rapid rural participatory approach. The farmer respondents consider, in order of decreasing importance, cropping season, soil color, soil

  2. Classification of integrable discrete Klein-Gordon models

    International Nuclear Information System (INIS)

    Habibullin, Ismagil T; Gudkova, Elena V

    2011-01-01

    The Lie algebraic integrability test is applied to the problem of classification of integrable Klein-Gordon-type equations on quad graphs. The list of equations passing the test is presented, containing several well-known integrable models. A new integrable example is found; its higher symmetry is presented.

  3. A multi-class classification MCLP model with particle swarm ...

    Indian Academy of Sciences (India)

    A M Viswa Bharathy

    A multi-class classification MCLP model with particle swarm optimization for network intrusion detection. A M VISWA BHARATHY1,* and A MAHABUB BASHA2. 1 Anna University, Chennai 600025, India. 2 Department of Computer Science and Engineering, K.S.R. College of Engineering, Tiruchengode 637215,. India.

  4. A Multi-Dimensional Classification Model for Scientific Workflow Characteristics

    Energy Technology Data Exchange (ETDEWEB)

    Ramakrishnan, Lavanya; Plale, Beth

    2010-04-05

    Workflows have been used to model repeatable tasks or operations in manufacturing, business processes, and software. In recent years, workflows are increasingly used for the orchestration of science discovery tasks that use distributed resources and web-services environments through resource models such as grid and cloud computing. Workflows have disparate requirements and constraints that affect how they might be managed in distributed environments. In this paper, we present a multi-dimensional classification model illustrated by workflow examples obtained through a survey of scientists from different domains, including bioinformatics and biomedicine, weather and ocean modeling, and astronomy, detailing their data and computational requirements. The survey results and classification model contribute to a high-level understanding of scientific workflows.

  5. Determinants of molecular marker based classification of rice (Oryza ...

    African Journals Online (AJOL)

    mr devi singh

    2015-01-07

    The present communication aims to identify determinants of molecular-marker-based classification of rice (Oryza sativa L.) germplasm using the available data from an experiment conducted for the development of molecular fingerprints of diverse varieties of Basmati and non-Basmati rice adapted to

  6. Polarity classification using structure-based vector representations of text

    NARCIS (Netherlands)

    Hogenboom, Alexander; Frasincar, Flavius; de Jong, Franciska M.G.; Kaymak, Uzay

    The exploitation of structural aspects of content is becoming increasingly popular in rule-based polarity classification systems. Such systems typically weight the sentiment conveyed by text segments in accordance with these segments' roles in the structure of a text, as identified by deep

  7. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is

  8. Airborne LIDAR Points Classification Based on Tensor Sparse Representation

    Science.gov (United States)

    Li, N.; Pfeifer, N.; Liu, C.

    2017-09-01

    The common statistical methods for supervised classification usually require a large amount of training data to achieve reasonable results, which is time-consuming and inefficient. This paper proposes a tensor sparse representation classification (SRC) method for airborne LiDAR points. The LiDAR points are represented as tensors to keep their attributes in spatial space. Only a small amount of training data is then used for dictionary learning, and the sparse tensor is calculated with a tensor OMP algorithm. The point label is determined by the minimal reconstruction residual. Experiments on real LiDAR points show that objects can be successfully distinguished by this algorithm.
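The "minimal reconstruction residual" decision rule can be illustrated without the tensor machinery. Here least-squares reconstruction stands in for OMP, and the class dictionaries are toy examples:

```python
import numpy as np

def min_residual_class(x, dictionaries):
    """Reconstruct x from each class dictionary and return the index of the
    class with the smallest reconstruction residual."""
    residuals = []
    for D in dictionaries:                      # D: (features, atoms)
        coef, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(x - D @ coef))
    return int(np.argmin(residuals))

# Toy dictionaries: class 0 spans the x-axis, class 1 the y-axis.
D0 = np.array([[1.0], [0.0], [0.0]])
D1 = np.array([[0.0], [1.0], [0.0]])
label = min_residual_class(np.array([2.0, 0.1, 0.0]), [D0, D1])   # -> 0
```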

  9. Fully automated classification of bone marrow infiltration in low-dose CT of patients with multiple myeloma based on probabilistic density model and supervised learning.

    Science.gov (United States)

    Martínez-Martínez, Francisco; Kybic, Jan; Lambert, Lukáš; Mecková, Zuzana

    2016-04-01

    This paper presents a fully automated method for the identification of bone marrow infiltration in femurs in low-dose CT of patients with multiple myeloma. We automatically find the femurs and the bone marrow within them. In the next step, we create a probabilistic, spatially dependent density model of normal tissue. At test time, we detect unexpectedly high density voxels which may be related to bone marrow infiltration, as outliers to this model. Based on a set of global, aggregated features representing all detections from one femur, we classify the subjects as being either healthy or not. This method was validated on a dataset of 127 subjects with ground truth created from a consensus of two expert radiologists, obtaining an AUC of 0.996 for the task of distinguishing healthy controls and patients with bone marrow infiltration. To the best of our knowledge, no other automatic image-based method for this task has been published before. Copyright © 2016 Elsevier Ltd. All rights reserved.
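The outlier-detection step lends itself to a compact sketch: compare each voxel's density against a spatial model of normal tissue and flag only unexpectedly high values. The numbers and the z-score formulation below are invented for illustration, not the paper's exact model:

```python
import numpy as np

def flag_high_density(values, model_mean, model_std, z_thresh=3.0):
    """Flag voxels whose density exceeds the normal-tissue model by more
    than z_thresh standard deviations (one-sided: only unexpectedly HIGH
    densities, which may relate to bone marrow infiltration)."""
    z = (values - model_mean) / model_std
    return z > z_thresh

densities = np.array([100.0, 105.0, 95.0, 160.0])   # hypothetical values
flags = flag_high_density(densities, model_mean=100.0, model_std=10.0)
```

In the paper's pipeline, per-femur aggregate features over such detections would then feed a supervised classifier.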

  10. Heartbeat classification system based on neural networks and dimensionality reduction

    Directory of Open Access Journals (Sweden)

    Rodolfo de Figueiredo Dalvi

    Introduction: This paper presents a complete approach for the automatic classification of heartbeats to assist experts in the diagnosis of typical arrhythmias, such as right bundle branch block, left bundle branch block, premature ventricular beats, premature atrial beats and paced beats. Methods: A pre-processing step was performed on the electrocardiograms (ECG) for baseline removal. Next, a QRS-complex detection algorithm was implemented to detect the heartbeats, which contain the primary information employed in the classification approach. Next, ECG segmentation was performed, by which a set of features based on the RR interval and the beat waveform morphology were extracted from the ECG signal. The size of the feature vector was reduced by principal component analysis. Finally, the reduced feature vector was employed as the input to an artificial neural network. Results: Our approach was tested on the Massachusetts Institute of Technology arrhythmia database. The classification performance on a test set of 18 ECG records of 30 min each achieved an accuracy of 96.97%, a sensitivity of 95.05%, a specificity of 90.88%, a positive predictive value of 95.11%, and a negative predictive value of 92.7%. Conclusion: The proposed approach achieved high accuracy for classifying ECG heartbeats and could be used to assist cardiologists in telecardiology services. The main contribution of our classification strategy is in the feature-selection step, which reduced classification complexity without major changes in performance.
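The dimensionality-reduction step can be sketched with an SVD-based PCA. The data are synthetic; the real pipeline's feature counts are not reproduced here:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centred feature vectors onto their leading principal
    components, shrinking the vectors fed to the neural network."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
features = rng.normal(size=(50, 10))   # 50 beats x 10 raw features (synthetic)
reduced = pca_reduce(features, 3)      # 50 beats x 3 components
```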

  11. Catchment classification and model parameter transfer with a view to regionalisation

    Science.gov (United States)

    Ley, Rita; Hellebrand, Hugo; Casper, Markus C.

    2013-04-01

    Physiographic and climatic catchment characteristics are responsible for catchment response behaviour, whereas hydrological model parameters describe catchment properties in such a way as to transform input data (here: precipitation, evaporation) into runoff, hence describing the response behaviour of a catchment. In this respect, model parameters can thus be seen as catchment descriptors. A third catchment descriptor is runoff behaviour, depicted by indices derived from event runoff coefficients and Flow Duration Curves. In an ongoing research project funded by the Deutsche Forschungsgemeinschaft (DFG), we investigate the interdependencies of these three catchment descriptors for catchment classification with a view to regionalisation. The study area comprises about 80 meso-scale catchments in western Germany. These catchments are classified by Self-Organising Maps (SOM) based on a) runoff behaviour and b) physical and climatic properties. The two classifications show an overlap of about 80% for all catchments and indicate a direct connection between the two descriptors for a majority of the catchments. Next, all catchments are calibrated with a simple and parsimonious conceptual model stemming from the Superflex model framework. In this study we test the interdependencies between the classification and the calibrated model parameters by parameter transfer within and between the classes established by the SOM. The model simulates total discharge, given observed precipitation and pre-estimated potential evaporation. Simulations with a few catchments show encouraging results: all simulations with the calibrated model show a good fit, indicated by Nash-Sutcliffe coefficients of about 0.8. Most simulations of runoff time series for catchments with parameter sets belonging to their own class also display good performance, while runoff simulated with model parameter sets from other classes displays significantly lower performance. This indicates that there is a
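The Nash-Sutcliffe coefficient used above as the goodness-of-fit measure is easy to state in code (the series below are made up to show the two reference values):

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 means a perfect fit, 0 means the model
    is no better than predicting the mean of the observations."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe(obs, obs)                     # 1.0
mean_only = nash_sutcliffe(obs, [2.5, 2.5, 2.5, 2.5])  # 0.0
```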

  12. The Study of Meanings of Motifs on Artifacts Discovered from Archeological Sites of Gilan Province and their Classification based on Dumézil’s Trifunctional Model

    Directory of Open Access Journals (Sweden)

    Ziba Kazempoor

    2017-09-01

    Gilan is one of the historical centers that has drawn the attention of archeological and art researchers because unique artistic works have been discovered at its archeological sites. The present study reviews the meanings of motifs on archeological artifacts discovered in Gilan and classifies the artifacts according to Dumézil's trifunctional model. The study aims to increase awareness of the form and content of the designs on the historical works of Gilan and to detail the historical background of the province. To do this, the motifs are classified and their symbolic meanings are detailed. Finally, the three functions of the sacral, the martial and the economic, which constitute Dumézil's trifunctional model for the social and cultural structure of Indo-European tribes, are used to study the artifacts discovered in Gilan. In general, the motifs on the archeological artifacts of Gilan can be divided into four groups, plant, animal, human and abstract, which can be analyzed through Dumézil's trifunctional model. Mixed animal motifs relate to the first function and show a sacral role. Artifacts on which war and hunting scenes are represented concern Dumézil's second function, namely the martial. Plant and animal motifs, and some human-figure designs such as figures of historical women, relate to notions of life and living that can be categorized under the economic function.

  13. QSAR models for oxidation of organic micropollutants in water based on ozone and hydroxyl radical rate constants and their chemical classification

    KAUST Repository

    Sudhakaran, Sairam

    2013-03-01

    Ozonation is an oxidation process for the removal of organic micropollutants (OMPs) from water, and the chemical reaction is governed by second-order kinetics. An advanced oxidation process (AOP), wherein hydroxyl radicals (OH radicals) are generated, is more effective in removing a wider range of OMPs from water than direct ozonation. Second-order rate constants (kOH and kO3) are good indices to estimate the oxidation efficiency, where higher rate constants indicate more rapid oxidation. In this study, quantitative structure-activity relationship (QSAR) models for O3 and AOP processes were developed, and the rate constants kOH and kO3 were predicted based on target compound properties. The kO3 and kOH values ranged from 5 × 10⁻⁴ to 10⁵ M⁻¹ s⁻¹ and from 0.04 × 10⁹ to 18 × 10⁹ M⁻¹ s⁻¹, respectively. Several molecular descriptors which potentially influence O3 and OH radical oxidation were identified and studied. The QSAR-defining descriptors were double bond equivalence (DBE), ionisation potential (IP), electron affinity (EA) and the weakly-polar component of solvent accessible surface area (WPSA), and the chemical and statistical significance of these descriptors is discussed. Multiple linear regression was used to build the QSAR models, resulting in a high goodness of fit, r² (> 0.75). The models were validated by internal and external validation along with residual plots. © 2012 Elsevier Ltd.
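
A minimal sketch of the multiple-linear-regression step behind such a QSAR model; the descriptor matrix below is random stand-in data (its columns merely playing the roles of DBE, IP, EA and WPSA), not the study's compounds or rate constants:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Hypothetical descriptor matrix: 40 compounds x 4 descriptors.
X = rng.normal(size=(40, 4))
# Synthetic log-rate-constant values from a known linear rule plus noise.
log_k = X @ np.array([0.8, -0.5, 0.3, 0.2]) + rng.normal(scale=0.1, size=40)

model = LinearRegression().fit(X, log_k)
r2 = model.score(X, log_k)  # goodness of fit on the training data
```

With real QSAR data, the r² > 0.75 threshold reported above would be checked on both the training set and an external validation set.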

  14. Radar Target Classification using Recursive Knowledge-Based Methods

    DEFF Research Database (Denmark)

    Jochumsen, Lars Wurtz

    The topic of this thesis is target classification of radar tracks from a 2D mechanically scanning coastal surveillance radar. The measurements provided by the radar are position data and therefore the classification is mainly based on kinematic data, which is deduced from the position. The target...... been terminated. Therefore, an update of the classification results must be made for each measurement of the target. The data for this work are collected throughout the PhD and are both collected from radars and other sensors such as GPS....... classes used in this work are classes, which are normal for coastal surveillance e.g.~ships, helicopters, birds etc. The classifier must be recursive as all data of a track is not present at any given moment. If all data were available, it would be too late to classify the track, as the track would have...

  15. Torrent classification - Base of rational management of erosive regions

    International Nuclear Information System (INIS)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta

    2008-01-01

    A complex methodology for torrents, erosion, and the associated calculations was developed during the second half of the twentieth century in Serbia: the 'Erosion Potential Method'. One of the modules of that complex method is focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentiality'. This torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper presents the procedure and the method of torrent classification.

  16. Apple Shape Classification Method Based on Wavelet Moment

    Directory of Open Access Journals (Sweden)

    Jiangsheng Gui

    2014-09-01

    Full Text Available Shape is not only an important indicator for assessing the grade of an apple, but also an important factor in increasing the value of the apple. In order to improve the apple shape classification accuracy rate, an approach for apple shape sorting based on wavelet moments was proposed. The image was first subjected to a normalization process using its regular moments to obtain scale and translation invariance; the rotation-invariant wavelet moment features were then extracted from the scale- and translation-normalized images, and cluster analysis was used to perform the shape classification. This method performs better than traditional approaches such as Fourier descriptors and Zernike moments because wavelet moments provide both a time-domain and a frequency-domain window, which was verified by experiments. With our method, the classification accuracies for normal fruit shape, mild deformity and severe deformity are 86.21 %, 85.82 % and 90.81 %, respectively.
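
The clustering step at the end of the pipeline can be sketched like this; the synthetic three-blob features below only stand in for wavelet-moment vectors of the three shape grades and are not the paper's data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Hypothetical rotation-invariant moment features for three shape grades,
# synthesised as three well-separated clusters of 30 samples each.
features = np.vstack([
    rng.normal(loc=c, scale=0.05, size=(30, 3))
    for c in ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0])
])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
```

Each apple image would then be assigned the grade of the cluster its feature vector falls into.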

  17. Hyperspectral image classification based on volumetric texture and dimensionality reduction

    Science.gov (United States)

    Su, Hongjun; Sheng, Yehua; Du, Peijun; Chen, Chen; Liu, Kui

    2015-06-01

    A novel approach using volumetric texture and reduced spectral features is presented for hyperspectral image classification. In this approach, the volumetric textural features are extracted by volumetric gray-level co-occurrence matrices (VGLCM). The spectral features are extracted by minimum estimated abundance covariance (MEAC) and linear prediction (LP)-based band selection, and by a semi-supervised k-means (SKM) clustering method with deletion of the worst cluster (SKMd) band-clustering algorithm. Moreover, four feature combination schemes were designed for hyperspectral image classification using spectral and textural features. It has been proven that the proposed method using VGLCM outperforms the gray-level co-occurrence matrices (GLCM) method, and the experimental results indicate that combining spectral information with volumetric textural features leads to improved classification performance in hyperspectral imagery.

  18. Classification of the Regional Ionospheric Disturbance Based on Machine Learning Techniques

    Science.gov (United States)

    Terzi, Merve Begum; Arikan, Orhan; Karatay, Secil; Arikan, Feza; Gulyaeva, Tamara

    2016-08-01

    In this study, Total Electron Content (TEC) estimated from GPS receivers is used to model the regional and local variability that differs from global activity, along with solar and geomagnetic indices. For the automated classification of regional disturbances, a classification technique based on a robust machine learning method that has found widespread use, the Support Vector Machine (SVM), is proposed. The performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. By applying the developed classification technique to Global Ionospheric Map (GIM) TEC data, which is provided by the NASA Jet Propulsion Laboratory (JPL), it is shown that SVM can be a suitable learning method to detect anomalies in TEC variations.
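
The SVM classification step can be sketched with scikit-learn's SVC; the quiet-day and disturbed-day feature vectors below are synthetic stand-ins for real TEC-derived features:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Hypothetical 5-dimensional feature vectors for two classes:
# quiet-day versus disturbed-day TEC variation.
quiet = rng.normal(loc=0.0, scale=1.0, size=(100, 5))
disturbed = rng.normal(loc=3.0, scale=1.0, size=(100, 5))
X = np.vstack([quiet, disturbed])
y = np.array([0] * 100 + [1] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)       # RBF-kernel SVM
accuracy = clf.score(X_te, y_te)              # held-out accuracy
```

Real TEC anomaly detection would of course involve careful feature design and class imbalance handling rather than well-separated synthetic blobs.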

  19. Deep learning for EEG-Based preference classification

    Science.gov (United States)

    Teo, Jason; Hou, Chew Lin; Mountstephens, James

    2017-10-01

    Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively identify yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored area of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically-induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches, including deep neural networks, we attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task, particularly on a highly challenging dataset with large inter- and intra-subject variability.
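
A toy sketch of the training idea with a small feedforward network (scikit-learn's MLPClassifier standing in for the study's deep models; the EEG feature vectors are synthetic):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(7)
# Hypothetical 10-dimensional EEG feature vectors (e.g. band powers)
# for "like" versus "dislike" trials.
like = rng.normal(loc=0.0, size=(80, 10))
dislike = rng.normal(loc=1.5, size=(80, 10))
X = np.vstack([like, dislike])
y = np.array([0] * 80 + [1] * 80)

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000,
                    random_state=0).fit(X, y)
train_acc = clf.score(X, y)
```

With real EEG data, the large inter-subject variability mentioned above would make held-out cross-subject evaluation the meaningful test, not training accuracy.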

  20. EFFECT OF PANSHARPENED IMAGE ON SOME OF PIXEL BASED AND OBJECT BASED CLASSIFICATION ACCURACY

    Directory of Open Access Journals (Sweden)

    P. Karakus

    2016-06-01

    Full Text Available Classification is the most important method for determining the type of crop in a region for agricultural planning. There are two types of classification: the first is pixel-based and the other is the object-based classification method. While pixel-based classification methods are based on the information in each pixel, the object-based classification method is based on objects, or image objects, formed by combining information from a set of similar pixels. A multispectral image has a higher spectral resolution than a panchromatic image, while a panchromatic image has a higher spatial resolution than a multispectral image. Pan-sharpening is the process of merging high-spatial-resolution panchromatic and high-spectral-resolution multispectral imagery to create a single high-resolution color image. The aim of the study was to compare the classification accuracy achievable with a pan-sharpened image. In this study, a SPOT 5 image dated April 2013 was used: the 5 m panchromatic image and the 10 m multispectral image were pan-sharpened. Four different classification methods were investigated: maximum likelihood, decision tree and support vector machine at the pixel level, and the object-based classification method. The SPOT 5 pan-sharpened image was used to classify sunflowers and corn in a study site located in the Kadirli region of Osmaniye, Turkey. The effects of the pan-sharpened image on classification results were also examined. Accuracy assessment showed that the object-based classification yielded better overall accuracy values than the others. The results indicate that these classification methods can be used for identifying sunflowers and corn and estimating crop areas.

  1. Effect of Pansharpened Image on Some of Pixel Based and Object Based Classification Accuracy

    Science.gov (United States)

    Karakus, P.; Karabork, H.

    2016-06-01

    Classification is the most important method for determining the type of crop in a region for agricultural planning. There are two types of classification: the first is pixel-based and the other is the object-based classification method. While pixel-based classification methods are based on the information in each pixel, the object-based classification method is based on objects, or image objects, formed by combining information from a set of similar pixels. A multispectral image has a higher spectral resolution than a panchromatic image, while a panchromatic image has a higher spatial resolution than a multispectral image. Pan-sharpening is the process of merging high-spatial-resolution panchromatic and high-spectral-resolution multispectral imagery to create a single high-resolution color image. The aim of the study was to compare the classification accuracy achievable with a pan-sharpened image. In this study, a SPOT 5 image dated April 2013 was used: the 5 m panchromatic image and the 10 m multispectral image were pan-sharpened. Four different classification methods were investigated: maximum likelihood, decision tree and support vector machine at the pixel level, and the object-based classification method. The SPOT 5 pan-sharpened image was used to classify sunflowers and corn in a study site located in the Kadirli region of Osmaniye, Turkey. The effects of the pan-sharpened image on classification results were also examined. Accuracy assessment showed that the object-based classification yielded better overall accuracy values than the others. The results indicate that these classification methods can be used for identifying sunflowers and corn and estimating crop areas.

  2. Improving the classification of plant functional types in dynamic vegetation modelling for the Neogene

    Science.gov (United States)

    Favre, Eric; François, Louis; Utescher, Torsten; Suc, Jean-Pierre; Henrot, Alexandra-Jane; Huang, Kangyou; Chedaddi, Rachid

    2010-05-01

    The simulation of paleovegetation with dynamic vegetation models requires an appropriate definition of plant functional types (PFTs). For time periods several million years old, such as the Miocene, analogue species must be defined and then classified into PFTs. Parameters for each plant type must then be evaluated from the known present distribution of these analogue species. When the purpose is the validation of model results against available paleovegetation data, it is important that the same PFT classification is used both in the model and in the data. This allows a detailed comparison to be performed at the PFT level, which is much more robust than the rough comparison at the biome level often used to validate vegetation model simulations. In this contribution, we present the latest version of the PFT classification developed in the CARAIB vegetation model for the Neogene vegetation. Several new geographic distributions have been added to those used in the previous version of the classification. This adapted classification, with the addition of shrubs and the refinement of tropical and subtropical types, is based on plant types currently present in Europe, on the distributions of analogue species in southeastern Asia, and on a set of other species distributions from other regions around the world. The resulting classification involves 26 groups and allows more precise modelling of the subtropical and tropical types present in Europe during early periods of the Neogene, such as the Middle Miocene.

  3. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper, including an Application Specific Instruction Set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.
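
The general grouping idea behind such algorithms, preprocessing rules into groups keyed on a cheap field so each packet is matched only against its own group rather than the whole rule set, can be illustrated with a toy sketch (the rule fields and grouping key here are invented for illustration and are not GBSA's actual data structures):

```python
from collections import defaultdict

# Toy rule set: match on destination port and protocol.
rules = [
    {"port": 80, "proto": "tcp", "action": "allow"},
    {"port": 80, "proto": "udp", "action": "deny"},
    {"port": 22, "proto": "tcp", "action": "allow"},
    {"port": 53, "proto": "udp", "action": "allow"},
]

# Preprocessing: bucket rules by port, so lookup touches one bucket only.
groups = defaultdict(list)
for rule in rules:
    groups[rule["port"]].append(rule)

def classify(packet):
    for rule in groups.get(packet["port"], []):  # search one group only
        if rule["proto"] == packet["proto"]:
            return rule["action"]
    return "default-deny"

action = classify({"port": 80, "proto": "udp"})
```

The preprocessing cost is paid once per rule-set update, which matches the preprocessing/classification speed split reported in the abstract.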

  4. The high-density lipoprotein-adjusted SCORE model worsens SCORE-based risk classification in a contemporary population of 30 824 Europeans

    DEFF Research Database (Denmark)

    Mortensen, Martin B; Afzal, Shoaib; Nordestgaard, Børge G

    2015-01-01

    AIMS: Recent European guidelines recommend to include high-density lipoprotein (HDL) cholesterol in risk assessment for primary prevention of cardiovascular disease (CVD), using a SCORE-based risk model (SCORE-HDL). We compared the predictive performance of SCORE-HDL with SCORE in an independent.......8 years of follow-up, 339 individuals died of CVD. In the SCORE target population (age 40-65; n = 30,824), fewer individuals were at baseline categorized as high risk (≥5% 10-year risk of fatal CVD) using SCORE-HDL compared with SCORE (10 vs. 17% in men, 1 vs. 3% in women). SCORE-HDL did not improve...... discrimination of future fatal CVD, compared with SCORE, but decreased the detection rate (sensitivity) of the 5% high-risk threshold from 42 to 26%, yielding a negative net reclassification index (NRI) of -12%. Importantly, using SCORE-HDL, the sensitivity was zero among women. Both SCORE and SCORE...

  5. A Classification System for Hospital-Based Infection Outbreaks

    Directory of Open Access Journals (Sweden)

    Paul S. Ganney

    2010-01-01

    Full Text Available Outbreaks of infection within semi-closed environments such as hospitals, whether inherent in the environment (such as Clostridium difficile (C.Diff) or Methicillin-resistant Staphylococcus aureus (MRSA)) or imported from the wider community (such as Norwalk-like viruses (NLVs)), are difficult to manage. As part of our work on modelling such outbreaks, we have developed a classification system to describe the impact of a particular outbreak upon an organization. This classification system may then be used in comparing appropriate computer models to real outbreaks, as well as in comparing different real outbreaks in, for example, the comparison of differing management and containment techniques and strategies. Data from NLV outbreaks in the Hull and East Yorkshire Hospitals NHS Trust (the Trust) over several previous years are analysed and classified, both for infection within staff (where the end of infection date may not be known) and within patients (where it generally is known). A classification system consisting of seven elements is described, along with a goodness-of-fit method for comparing a new classification to previously known ones, for use in evaluating a simulation against history and thereby determining how ‘realistic’ (or otherwise) it is.

  6. A classification system for hospital-based infection outbreaks.

    Science.gov (United States)

    Ganney, Paul S; Madeo, Maurice; Phillips, Roger

    2010-12-01

    Outbreaks of infection within semi-closed environments such as hospitals, whether inherent in the environment (such as Clostridium difficile (C.Diff) or Methicillin-resistant Staphylococcus aureus (MRSA) or imported from the wider community (such as Norwalk-like viruses (NLVs)), are difficult to manage. As part of our work on modelling such outbreaks, we have developed a classification system to describe the impact of a particular outbreak upon an organization. This classification system may then be used in comparing appropriate computer models to real outbreaks, as well as in comparing different real outbreaks in, for example, the comparison of differing management and containment techniques and strategies. Data from NLV outbreaks in the Hull and East Yorkshire Hospitals NHS Trust (the Trust) over several previous years are analysed and classified, both for infection within staff (where the end of infection date may not be known) and within patients (where it generally is known). A classification system consisting of seven elements is described, along with a goodness-of-fit method for comparing a new classification to previously known ones, for use in evaluating a simulation against history and thereby determining how 'realistic' (or otherwise) it is.

  7. Semi-supervised Learning for Classification of Polarimetric SAR Images Based on SVM-Wishart

    Directory of Open Access Journals (Sweden)

    Hua Wen-qiang

    2015-02-01

    Full Text Available In this study, we propose a new semi-supervised classification method for Polarimetric SAR (PolSAR) images, aimed at handling the problem of a small training set. First, considering the scattering characteristics of PolSAR data, the method extracts multiple scattering features using a target decomposition approach. Then, a semi-supervised learning model is established based on a co-training framework and the Support Vector Machine (SVM). Both labeled and unlabeled data are utilized in this model to obtain high classification accuracy. Third, a recovery scheme based on the Wishart classifier is proposed to improve the classification performance. The experiments conducted in this study make it evident that the proposed method performs more effectively than other traditional methods when the training set is small.

  8. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    Full Text Available In this paper, aiming at the characteristics of Chinese text classification, ICTCLAS (the Chinese lexical analysis system of the Chinese Academy of Sciences) is used for document segmentation, as well as for data cleaning and stop-word filtering; information gain and document frequency feature selection algorithms are used for document feature selection. On this basis, a text classifier is implemented based on the Naive Bayes algorithm, and experiments and analysis of the system are carried out on a Chinese corpus from Fudan University.
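
The pipeline the abstract describes (segmentation, feature extraction, Naive Bayes) can be sketched with scikit-learn; the tiny English corpus below merely stands in for the segmented Fudan corpus, and ICTCLAS segmentation is replaced by whitespace tokenisation:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy pre-segmented documents with topic labels.
docs = [
    "stock market shares rise",
    "team wins the final match",
    "market trading volume falls",
    "player scores in the match",
]
labels = ["finance", "sports", "finance", "sports"]

# Bag-of-words term counts feeding a multinomial Naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(docs, labels)
prediction = clf.predict(["shares fall in the market"])[0]
```

In the real system, feature selection by information gain or document frequency would prune the vocabulary before the Naive Bayes step.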

  9. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs

    Science.gov (United States)

    Haaf, Ezra; Barthel, Roland

    2016-04-01

    When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, in which information is transferred from similar, but well-explored and better understood, systems to poorly described ones. The concept is based on the central hypothesis that similar systems react similarly to the same inputs, and vice versa. It is conceptually related to PUB (Prediction in Ungauged Basins), where the organization of systems and processes by quantitative methods is used to improve understanding and prediction. Furthermore, using the framework, it is expected that regional conceptual and numerical models can be checked or enriched by ensemble-generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches to grouping hydrographs, mostly based on a similarity measure, which have previously only been used in local-scale studies, can be found in the literature. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences are shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition. Additionally, we show examples of classes
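
One of the feature families mentioned, coefficients of the discrete Fourier transform, can be sketched as a hydrograph descriptor like this; the sine series below are artificial stand-ins for groundwater hydrographs, not the study's dataset:

```python
import numpy as np

def dft_features(series, n_coeff=4):
    """Global features: magnitudes of the first few DFT coefficients of
    the mean-removed series, normalised by series length."""
    series = np.asarray(series, dtype=float)
    spectrum = np.abs(np.fft.rfft(series - series.mean()))
    return spectrum[1:n_coeff + 1] / len(series)

t = np.linspace(0, 4 * np.pi, 256, endpoint=False)
seasonal = np.sin(t)                 # slow, seasonal-style dynamics
flashy = np.sin(8 * t)               # fast, flashy dynamics
seasonal_too = 1.3 * np.sin(t) + 5.0 # similar dynamics, different level

# Hydrographs with similar dynamics should be close in feature space.
d_similar = np.linalg.norm(dft_features(seasonal) - dft_features(seasonal_too))
d_different = np.linalg.norm(dft_features(seasonal) - dft_features(flashy))
```

Clustering such feature vectors is one way to group hydrographs without aligning the series point by point.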

  10. Protein superfamily classification using fuzzy rule-based classifier.

    Science.gov (United States)

    Mansoori, Eghbal G; Zolghadri, Mansoor J; Katebi, Seraj D

    2009-03-01

    In this paper, we have proposed a fuzzy rule-based classifier for assigning amino acid sequences to different superfamilies of proteins. While the most popular methods for protein classification rely on sequence alignment, our approach is alignment-free and thus more human-readable. Like other alignment-independent methods, it accounts for the distribution of contiguous patterns of n amino acids (n-grams) in the sequences as features. Our approach first extracts a large number of features from a set of training sequences, then selects only the best of them using a proposed feature ranking method. Thereafter, using these features, a novel steady-state genetic algorithm for extracting fuzzy classification rules from data is used to generate a compact set of interpretable fuzzy rules. The generated rules are simple and human-understandable, so biologists can utilize them for classification purposes, or incorporate their expertise to interpret or even modify them. To evaluate the performance of our fuzzy rule-based classifier, we have compared it with the conventional non-fuzzy C4.5 algorithm, besides some other fuzzy classifiers. This comparative study is conducted by classifying the protein sequences of five superfamily classes, downloaded from a public domain database. The obtained results show that the generated fuzzy rules are more interpretable, with an acceptable improvement in classification accuracy.
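
The n-gram feature idea can be sketched in a few lines; the sequence and the choice n = 2 below are arbitrary examples, not the paper's feature set or its ranking method:

```python
from collections import Counter

def ngram_features(sequence, n=2):
    """Relative frequencies of contiguous n-grams of amino acids."""
    grams = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()}

# Toy amino acid sequence (one-letter codes).
feats = ngram_features("MKVLAAGL", n=2)
```

Each sequence thus becomes a sparse frequency vector, on which feature ranking and rule extraction can operate without any alignment.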

  11. Segmentation Based Fuzzy Classification of High Resolution Images

    Science.gov (United States)

    Rao, Mukund; Rao, Suryaprakash; Masser, Ian; Kasturirangan, K.

    Information extraction from satellite images is the process of delineating entities in the image which pertain to some feature on the earth; by associating an attribute with these entities, a classification of the image is obtained. Classification is a common technique for extracting information from remote sensing data and, by and large, the common classification techniques mainly exploit the spectral characteristics of remote sensing images and attempt to detect patterns in spectral information to classify images. These are based on a per-pixel analysis of the spectral information: "clustering" or "grouping" of pixels is done to generate meaningful thematic information. Most classification techniques apply statistical pattern recognition to image spectral vectors to "label" each pixel with appropriate class information from a set of training information. Segmentation, on the other hand, is not new, but it is still seldom used in the processing of remotely sensed data. Although there has been a lot of development in the segmentation of grey-tone images in this field and in others, like robotic vision, there has been little progress in the segmentation of colour or multi-band imagery. Especially within the last two years many new segmentation algorithms as well as applications have been developed, but not all of them lead to qualitatively convincing results while being robust and operational. One reason is that the segmentation of an image into a given number of regions is a problem with a huge number of possible solutions. Newer algorithms based on a fractal approach could eventually revolutionize image processing of remotely sensed data. The paper looks at applying spatial concepts to image processing, paving the way to algorithmically formulating some more advanced aspects of cognition and inference. In GIS-based spatial analysis, vector-based tools have already been able to support advanced tasks generating new knowledge. By identifying objects (as segmentation results) from

  12. A new Multiple ANFIS model for classification of hemiplegic gait.

    Science.gov (United States)

    Yardimci, A; Asilkan, O

    2014-01-01

    A neuro-fuzzy system is a combination of a neural network and a fuzzy system in which neural network learning algorithms are used to determine the parameters of the fuzzy system. This paper describes the application of a multiple adaptive neuro-fuzzy inference system (MANFIS) model with a hybrid learning algorithm for the classification of hemiplegic gait acceleration (HGA) signals. Decision making was performed in two stages: feature extraction using wavelet transforms (WT), and the ANFIS trained with the backpropagation gradient descent method in combination with the least squares method. The performance of the ANFIS model was evaluated in terms of training performance and classification accuracy, and the results confirmed that the proposed ANFIS model has potential for classifying the HGA signals.

  13. Multiple Sclerosis and Employment: A Research Review Based on the International Classification of Function

    Science.gov (United States)

    Frain, Michael P.; Bishop, Malachy; Rumrill, Phillip D., Jr.; Chan, Fong; Tansey, Timothy N.; Strauser, David; Chiu, Chung-Yi

    2015-01-01

    Multiple sclerosis (MS) is an unpredictable, sometimes progressive chronic illness affecting people in the prime of their working lives. This article reviews the effects of MS on employment based on the World Health Organization's International Classification of Functioning, Disability and Health model. Correlations between employment and…

  14. Music Genre Classification using an Auditory Memory Model

    DEFF Research Database (Denmark)

    Jensen, Kristoffer

    2011-01-01

    Audio feature estimation is potentially improved by including higher-level models. One such model is the Auditory Short Term Memory (STM) model. A new paradigm of audio feature estimation is obtained by adding the influence of notes in the STM. These notes are identified when the perceptual...... results, and an initial experiment with sensory dissonance has been undertaken with good results. The parameters obtained from the auditory memory model, along with the dissonance measure, are shown here to be of interest in genre classification....

  15. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    Science.gov (United States)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classification of objects such as humans and animals with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques: a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust features (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to the SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has a slightly degraded performance but almost an order of magnitude improvement in computation time, enabling real-time implementation.

  16. Comparing the performance of flat and hierarchical Habitat/Land-Cover classification models in a NATURA 2000 site

    Science.gov (United States)

    Gavish, Yoni; O'Connell, Jerome; Marsh, Charles J.; Tarantino, Cristina; Blonda, Palma; Tomaselli, Valeria; Kunin, William E.

    2018-02-01

    The increasing need for high quality Habitat/Land-Cover (H/LC) maps has triggered considerable research into novel machine-learning based classification models. In many cases, H/LC classes follow pre-defined hierarchical classification schemes (e.g., CORINE), in which fine H/LC categories are thematically nested within more general categories. However, none of the existing machine-learning algorithms account for this pre-defined hierarchical structure. Here we introduce a novel Random Forest (RF) based application of hierarchical classification, which fits a separate local classification model in every branching point of the thematic tree, and then integrates all the different local models into a single global prediction. We applied the hierarchical RF approach in a NATURA 2000 site in Italy, using two land-cover (CORINE, FAO-LCCS) and one habitat classification scheme (EUNIS) that differ from one another in the shape of the class hierarchy. For all 3 classification schemes, both the hierarchical model and a flat model alternative provided accurate predictions, with kappa values mostly above 0.9 (despite using only 2.2-3.2% of the study area as training cells). The flat approach slightly outperformed the hierarchical models when the hierarchy was relatively simple, while the hierarchical model worked better under more complex thematic hierarchies. Most misclassifications came from habitat pairs that are thematically distant yet spectrally similar. In 2 out of 3 classification schemes, the additional constraints of the hierarchical model resulted in fewer such serious misclassifications relative to the flat model. The hierarchical model also provided valuable information on variable importance, which can shed light on "black-box" machine-learning algorithms like RF. We suggest various ways by which hierarchical classification models can increase the accuracy and interpretability of H/LC classification maps.
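    The local-models-per-branching-point idea can be sketched with scikit-learn. The two-level class hierarchy, feature centres, and data below are hypothetical toy stand-ins (not the CORINE/FAO-LCCS/EUNIS schemes or the NATURA 2000 data); the sketch only illustrates fitting one root forest over coarse classes plus one local forest per branch, then chaining them into a global prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical two-level thematic tree: coarse class -> fine classes.
HIERARCHY = {"forest": ["broadleaf", "conifer"], "wetland": ["marsh", "bog"]}
FINE_TO_COARSE = {f: c for c, fs in HIERARCHY.items() for f in fs}

rng = np.random.default_rng(1)
fine_labels = np.array(["broadleaf", "conifer", "marsh", "bog"] * 50)
# Toy spectral features: one cluster centre per fine class.
centers = {"broadleaf": [0, 0], "conifer": [0, 3], "marsh": [3, 0], "bog": [3, 3]}
X = np.array([centers[f] for f in fine_labels], dtype=float)
X += 0.3 * rng.standard_normal(X.shape)

# One local model at the root (coarse classes), one per internal node (fine classes).
coarse = np.array([FINE_TO_COARSE[f] for f in fine_labels])
root_rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, coarse)
local_rfs = {}
for c in HIERARCHY:
    mask = coarse == c
    local_rfs[c] = RandomForestClassifier(n_estimators=50, random_state=0).fit(
        X[mask], fine_labels[mask])

def predict_hierarchical(x):
    """Route a sample down the tree: root model picks the branch, local model the leaf."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    branch = root_rf.predict(x)[0]
    return local_rfs[branch].predict(x)[0]

print(predict_hierarchical([0, 3.1]))
```

A flat alternative would fit a single forest directly on the fine labels; the hierarchical version constrains each prediction to be consistent with its coarse branch.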

  17. Classification of Hearing Loss Disorders Using Teoae-Based Descriptors

    Science.gov (United States)

    Hatzopoulos, Stavros Dimitris

    Transiently Evoked Otoacoustic Emissions (TEOAE) are signals produced by the cochlea upon stimulation by an acoustic click. Within the context of this dissertation, it was hypothesized that the relationship between the TEOAEs and the functional status of the OHCs provided an opportunity for designing a TEOAE-based clinical procedure that could be used to assess cochlear function. To understand the nature of the TEOAE signals in the time and the frequency domain several different analyses were performed. Using normative Input-Output (IO) curves, short-time FFT analyses and cochlear computer simulations, it was found that for optimization of the hearing loss classification it is necessary to use a complete 20 ms TEOAE segment. It was also determined that various 2-D filtering methods (median and averaging filtering masks, LP-FFT) used to enhance the TEOAE S/N offered minimal improvement (less than 6 dB per stimulus level). Higher S/N improvements resulted in TEOAE sequences that were over-smoothed. The final classification algorithm was based on a statistical analysis of raw FFT data and when applied to a sample set of clinically obtained TEOAE recordings (from 56 normal and 66 hearing-loss subjects) correctly identified 94.3% of the normal and 90% of the hearing loss subjects, at the 80 dB SPL stimulus level. To enhance the discrimination between the conductive and the sensorineural populations, data from the 68 dB SPL stimulus level were used, which yielded a normal classification of 90.2%, a hearing loss classification of 87.5% and a conductive-sensorineural classification of 87%. Among the hearing-loss populations the best discrimination was obtained in the group of otosclerosis and the worst in the group of acute acoustic trauma.

  18. A novel hybrid classification model of genetic algorithms, modified k-Nearest Neighbor and developed backpropagation neural network.

    Directory of Open Access Journals (Sweden)

    Nader Salari

    Full Text Available Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is to benefit from the synergies obtained by combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. 
The substantial findings of the comprehensive comparative study revealed that

  19. A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network

    Science.gov (United States)

    Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy

    2014-01-01

    Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is to benefit from the synergies obtained by combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. 
The substantial findings of the comprehensive comparative study revealed that performance of the

  20. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Abstract Power quality disturbances (PQDs) cause serious problems in the reliability, safety, and economy of the power system network. In order to improve electric power quality, the detection and classification of PQDs must identify the type of transient fault. A software analysis based on the wavelet transform with a multiresolution analysis (MRA) algorithm and two feed-forward neural networks, probabilistic (PNN) and multilayer feed-forward (MLFF), for automatic classification of eight types of PQ signals (flicker, harmonics, sag, swell, impulse, fluctuation, notch, and oscillatory) will be presented. The wavelet family Db4 is chosen in this system to calculate the values of detailed energy distributions as input features for classification because it can perform well in detecting and localizing various types of PQ disturbances. This technique classifies the types of PQD events. The classifiers classify and identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently. Therefore, it can be seen that PNN has better classification accuracy than MLFF.

  1. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
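    A univariate logistic-regression model with ROC-AUC evaluation, of the kind compared across zones above, can be sketched with scikit-learn. The data below are synthetic stand-ins for the T2nSI and DCE-nSI parameters (no patient data or published effect sizes are reproduced), and the helper name univariate_auc is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 200
# Synthetic stand-ins for univariate mp-MRI parameters: cancerous voxels
# shift the normalised T2 signal down and the early-enhancement signal up.
y = rng.integers(0, 2, n)
t2_nsi = rng.normal(loc=1.0 - 0.4 * y, scale=0.3)    # lower T2nSI in cancer
dce_nsi = rng.normal(loc=1.0 + 0.5 * y, scale=0.3)   # higher DCE-nSI in cancer

def univariate_auc(feature, labels):
    """Fit a one-parameter LR model and report its ROC-AUC on the same cohort."""
    X = feature.reshape(-1, 1)
    model = LogisticRegression().fit(X, labels)
    return roc_auc_score(labels, model.predict_proba(X)[:, 1])

print(f"T2nSI model AUC:   {univariate_auc(t2_nsi, y):.2f}")
print(f"DCE-nSI model AUC: {univariate_auc(dce_nsi, y):.2f}")
```

The paper's leave-one-out and inter-zonal validation would apply the model fitted on one cohort to another; the sketch only shows the fit-and-score step.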

  2. Automatic age and gender classification using supervised appearance model

    Science.gov (United States)

    Bukar, Ali Maina; Ugail, Hassan; Connah, David

    2016-11-01

    Age and gender classification are two important problems that recently gained popularity in the research community, due to their wide range of applications. Research has shown that both age and gender information are encoded in the face shape and texture, hence the active appearance model (AAM), a statistical model that captures shape and texture variations, has been one of the most widely used feature extraction techniques for the aforementioned problems. However, AAM suffers from some drawbacks, especially when used for classification. This is primarily because principal component analysis (PCA), which is at the core of the model, works in an unsupervised manner, i.e., PCA dimensionality reduction does not take into account how the predictor variables relate to the response (class labels). Rather, it explores only the underlying structure of the predictor variables, thus, it is no surprise if PCA discards valuable parts of the data that represent discriminatory features. Toward this end, we propose a supervised appearance model (sAM) that improves on AAM by replacing PCA with partial least-squares regression. This feature extraction technique is then used for the problems of age and gender classification. Our experiments show that sAM has better predictive power than the conventional AAM.

  3. A Pruning Neural Network Model in Credit Classification Analysis

    Directory of Open Access Journals (Sweden)

    Yajiao Tang

    2018-01-01

    Full Text Available Nowadays, credit classification models are widely applied because they can help financial decision-makers to handle credit classification issues. Among them, artificial neural networks (ANNs) have been widely accepted as convincing methods in the credit industry. In this paper, we propose a pruning neural network (PNN) and apply it to solve the credit classification problem using the well-known Australian and Japanese credit datasets. The model is inspired by the synaptic nonlinearity of a dendritic tree in a biological neural model, and it is trained by an error back-propagation algorithm. The model is capable of realizing a neuronal pruning function by removing superfluous synapses and useless dendrites, forming a tidy dendritic morphology at the end of learning. Furthermore, we utilize logic circuits (LCs) to simulate the dendritic structures successfully, which allows PNN to be implemented effectively in hardware. The statistical results of our experiments have verified that PNN obtains superior performance in comparison with other classical algorithms in terms of accuracy and computational efficiency.

  4. Three-Class Mammogram Classification Based on Descriptive CNN Features

    Directory of Open Access Journals (Sweden)

    M. Mohsin Jadoon

    2017-01-01

    Full Text Available In this paper, a novel classification technique for a large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed into their four subbands by means of the two-dimensional discrete wavelet transform (2D-DWT), while in the second method the discrete curvelet transform (DCT) is used. In both methods, dense scale invariant features (DSIFT) for all subbands are extracted. An input data matrix containing these subband features of all the mammogram patches is created and processed as input to a convolutional neural network (CNN). A softmax layer and a support vector machine (SVM) layer are used to train the CNN for classification. The proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rates of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.

  5. Three-Class Mammogram Classification Based on Descriptive CNN Features.

    Science.gov (United States)

    Jadoon, M Mohsin; Zhang, Qianni; Haq, Ihsan Ul; Butt, Sharjeel; Jadoon, Adeel

    2017-01-01

    In this paper, a novel classification technique for a large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed into their four subbands by means of the two-dimensional discrete wavelet transform (2D-DWT), while in the second method the discrete curvelet transform (DCT) is used. In both methods, dense scale invariant features (DSIFT) for all subbands are extracted. An input data matrix containing these subband features of all the mammogram patches is created and processed as input to a convolutional neural network (CNN). A softmax layer and a support vector machine (SVM) layer are used to train the CNN for classification. The proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rates of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.
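    The first step of the CNN-DW pipeline, decomposing an image into its four 2D-DWT subbands, can be sketched directly in NumPy with a one-level Haar transform (the records do not state which wavelet the 2D-DWT uses; Haar is the simplest choice, and the 'patch' below is synthetic, not a mammogram).

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar transform: returns the four subbands (LL, LH, HL, HH)."""
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2].astype(float)
    # Rows: average / difference of adjacent pixel pairs.
    lo_r = (img[:, 0::2] + img[:, 1::2]) / 2
    hi_r = (img[:, 0::2] - img[:, 1::2]) / 2
    # Columns: the same split applied to each row-filtered half.
    ll = (lo_r[0::2] + lo_r[1::2]) / 2
    lh = (lo_r[0::2] - lo_r[1::2]) / 2
    hl = (hi_r[0::2] + hi_r[1::2]) / 2
    hh = (hi_r[0::2] - hi_r[1::2]) / 2
    return ll, lh, hl, hh

# A toy 'patch': a smooth horizontal gradient plus a sharp vertical edge.
patch = np.tile(np.linspace(0, 1, 64), (64, 1))
patch[:, 32:] += 1.0

ll, lh, hl, hh = haar_dwt2(patch)
print([b.shape for b in (ll, lh, hl, hh)])  # four half-resolution subbands
```

In the paper each subband would then feed the DSIFT extraction stage; here only the decomposition itself is shown.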

  6. Classification of e-government documents based on cooperative expression of word vectors

    Science.gov (United States)

    Fu, Qianqian; Liu, Hao; Wei, Zhiqiang

    2017-03-01

    Effective document classification is a powerful technique for handling the huge amount of e-government documents automatically instead of processing them manually. The word-to-vector (word2vec) model, which converts semantic words into low-dimensional vectors, can be successfully employed to classify e-government documents. In this paper, we propose the cooperative expression of word vectors (Co-word-vector), whose multi-granularity integration explores the possibility of modeling documents in the semantic space. Meanwhile, we also improve the weighted continuous bag-of-words model based on the word2vec model and the distributed representation of topic words based on the LDA model. Furthermore, combining the two levels of word representation, the experimental results show that our proposed method outperforms the traditional method on e-government document classification.
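    The basic word2vec-based document representation underlying this approach (average the vectors of a document's words, then compare documents in the vector space) can be sketched without an external model. The tiny embedding table and category documents below are hypothetical stand-ins for vectors learned from an e-government corpus.

```python
import numpy as np

# Hypothetical pretrained word vectors (in practice these would come from a
# word2vec model trained on the document corpus).
EMBEDDINGS = {
    "tax":     np.array([1.0, 0.0, 0.2]),
    "budget":  np.array([0.9, 0.1, 0.1]),
    "permit":  np.array([0.0, 1.0, 0.1]),
    "license": np.array([0.1, 0.9, 0.0]),
}

def doc_vector(tokens):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

finance_doc = doc_vector("annual tax budget report".split())
permits_doc = doc_vector("permit and license office".split())
query = doc_vector(["tax"])

# A nearest-centroid decision over document vectors.
print("finance" if cosine(query, finance_doc) > cosine(query, permits_doc)
      else "permits")  # → finance
```

The paper's Co-word-vector combines such averaged word2vec representations with LDA topic-word distributions; only the plain averaging level is shown here.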

  7. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on support vector machine (SVM) and gradient evolution (GE) algorithms. The SVM algorithm has been widely used in classification; however, its results are significantly influenced by its parameters. Therefore, this paper proposes an improvement of the SVM algorithm which can find the best SVM parameters automatically. The proposed algorithm employs a GE algorithm to determine the SVM parameters, acting as a global optimizer in finding the best parameters to be used by the SVM algorithm. The proposed GE-SVM algorithm is verified using benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than the other algorithms tested in this paper.
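    A minimal sketch of the parameter-tuning loop: a small population of (C, gamma) candidates is scored by cross-validated SVM accuracy and pulled toward the current best. The update rule below is a simplified stand-in for the gradient evolution operators, and the population size, step sizes, and dataset (iris) are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

def score(log_c, log_gamma):
    """Fitness: cross-validated accuracy of an RBF-SVM with the candidate parameters."""
    clf = SVC(C=10.0**log_c, gamma=10.0**log_gamma)
    return cross_val_score(clf, X, y, cv=3).mean()

# Population-based search in log10 parameter space (a simplified stand-in
# for the gradient evolution optimizer described in the paper).
pop = rng.uniform([-2, -4], [3, 1], size=(8, 2))   # columns: log10 C, log10 gamma
best, best_fit = pop[0].copy(), score(*pop[0])
for _ in range(5):
    for i, cand in enumerate(pop):
        # Move each candidate toward the best, plus a small random perturbation.
        trial = cand + 0.5 * (best - cand) + 0.2 * rng.standard_normal(2)
        if score(*trial) >= score(*cand):
            pop[i] = trial
    fits = [score(*p) for p in pop]
    if max(fits) > best_fit:
        best, best_fit = pop[int(np.argmax(fits))].copy(), max(fits)

print(f"best C=10^{best[0]:.2f}, gamma=10^{best[1]:.2f}, CV accuracy={best_fit:.3f}")
```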

  8. Motion Pattern-Based Video Classification and Retrieval

    Directory of Open Access Journals (Sweden)

    Ma Yu-Fei

    2003-01-01

    Full Text Available Today's content-based video retrieval technologies are still far from meeting human requirements. A fundamental reason is the lack of a content representation able to bridge the gap between visual features and semantic conception in video. In this paper, we propose a motion pattern descriptor, motion texture, that characterizes motion in a generic way. With this representation, we design a semantic classification scheme to effectively map video clips to semantic categories. Support vector machines (SVMs) are used as the classifiers. In addition, this scheme also significantly improves the performance of motion-based shot retrieval due to the comprehensiveness and effectiveness of the motion pattern descriptor and the semantic classification capability, as shown by experimental evaluations.

  9. A Novel Computer Virus Propagation Model under Security Classification

    Directory of Open Access Journals (Sweden)

    Qingyi Zhu

    2017-01-01

    Full Text Available In reality, some computers have a specific security classification. For the sake of safety and cost, the security level of computers will be upgraded as threats in the network increase. Here we assume that there exists a threshold value which determines when countermeasures should be taken to raise the security of a fraction of computers with a low security level. In some specific realistic environments, the propagation network can be regarded as fully interconnected. Inspired by these facts, this paper presents a novel computer virus dynamics model that considers the impact of security classification in a fully interconnected network. Using the theory of dynamic stability, the existence of equilibria and the stability conditions are analysed and proved, and the optimal threshold value is given analytically. Some numerical experiments are then performed to justify the model, and some discussions and antivirus measures are given.

  10. Genre classification using chords and stochastic language models

    OpenAIRE

    Pérez Sancho, Carlos; Rizo Valero, David; Iñesta Quereda, José Manuel

    2009-01-01

    Music genre meta-data is of paramount importance for the organisation of music repositories. People use genre in a natural way when entering a music store or looking into music collections. Automatic genre classification has become a popular topic in music information retrieval research, both with digital audio and symbolic data. This work focuses on the symbolic approach, bringing to music cognition some technologies, like the stochastic language models, already successfully applied to text ...

  11. Yarn-dyed fabric defect classification based on convolutional neural network

    Science.gov (United States)

    Jing, Junfeng; Dong, Amei; Li, Pengfei; Zhang, Kaibing

    2017-09-01

    Considering that manual inspection of yarn-dyed fabric can be time-consuming and inefficient, we propose a yarn-dyed fabric defect classification method using a convolutional neural network (CNN) based on a modified AlexNet. CNN shows a powerful ability to perform feature extraction and fusion by simulating the learning mechanism of the human brain. The local response normalization layers in AlexNet are replaced by batch normalization layers, which can enhance both the computational efficiency and the classification accuracy. In the training process of the network, the characteristics of the defect are extracted step by step and the essential features of the image can be obtained from the fusion of the edge details with several convolution operations. Then the max-pooling layers, the dropout layers, and the fully connected layers are employed in the classification model to reduce the computation cost and extract more precise features of the defective fabric. Finally, the results of the defect classification are predicted by the softmax function. The experimental results show promising performance with an acceptable average classification rate and strong robustness on yarn-dyed fabric defect classification.

  12. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    Science.gov (United States)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the establishment of the decision tree in terms of removing irrelevant features, which leads to a major problem in the decision tree classification process called over-fitting, resulting from noisy data and irrelevant features. In turn, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is one of the important issues in classification models; it is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high-dimensional data to low-dimensional data with non-correlated attributes. In this research, we propose a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from the UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework robustly enhances classification accuracy, reaching an accuracy rate of 90.70%.
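    The PCA-then-tree pipeline can be sketched with scikit-learn. C4.5 itself is not available there, so an entropy-criterion decision tree is used as the closest stand-in, and the breast-cancer dataset replaces the cervical-cancer data used in the paper; the component count and scaling step are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# C4.5 is not in scikit-learn; an entropy-criterion CART tree is the
# closest readily available stand-in.
tree_only = DecisionTreeClassifier(criterion="entropy", random_state=0)
pca_tree = Pipeline([
    ("scale", StandardScaler()),       # PCA is scale-sensitive
    ("pca", PCA(n_components=5)),      # keep 5 non-correlated components
    ("tree", DecisionTreeClassifier(criterion="entropy", random_state=0)),
])

for name, model in [("tree alone", tree_only), ("PCA + tree", pca_tree)]:
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```

Whether the PCA step helps depends on the dataset; the sketch only shows how the reduction is wired in front of the tree.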

  13. Classification of EEG Signals Based on Pattern Recognition Approach.

    Science.gov (United States)

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.

  14. Classification of EEG Signals Based on Pattern Recognition Approach

    Directory of Open Access Journals (Sweden)

    Hafeez Ullah Amin

    2017-11-01

    Full Text Available Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
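    The relative wavelet energy feature used in the two records above can be computed with a plain multi-level Haar decomposition in NumPy (no wavelet library assumed): the energy of each detail band plus the final approximation, normalised to sum to one. The test signal below is a synthetic slow oscillation, not EEG.

```python
import numpy as np

def haar_step(x):
    """One analysis step: approximation and detail halves."""
    x = x[: len(x) // 2 * 2]
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def relative_wavelet_energy(signal, levels=5):
    """Energy of each detail band D1..Dn plus the final approximation An,
    normalised so the feature vector sums to 1."""
    energies = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(np.sum(detail**2))      # D1 (finest) first
    energies.append(np.sum(approx**2))          # An last
    energies = np.array(energies)
    return energies / energies.sum()

# A slow-oscillation 'epoch' concentrates its energy in the coarse bands,
# so the finest detail band D1 carries almost none of it.
t = np.linspace(0, 2, 256)
rwe = relative_wavelet_energy(np.sin(2 * np.pi * 2 * t))
print(np.round(rwe, 3))
```

In the papers these normalised energies are then standardised and fed to FDR/PCA before classification; only the feature computation is sketched here.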

  15. A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

    Directory of Open Access Journals (Sweden)

    Yong Wang

    2016-02-01

    Full Text Available Currently, with the rapid increase of data scales in network traffic classification, how to select traffic features efficiently is becoming a big challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, their execution time was still unsatisfactory owing to the numerous iterative computations during processing. To address this issue, an efficient feature selection method for network traffic based on a new parallel computing framework called Spark is proposed in this paper. In our approach, the complete feature set is first preprocessed based on the Fisher score, and a sequential forward search strategy is employed for the subsets. The optimal feature subset is then selected through the continuous iterations of the Spark computing framework. The implementation demonstrates that, while maintaining classification accuracy, our method reduces the time cost of modeling and classification and significantly improves the execution efficiency of feature selection.
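    The two-stage selection described above (Fisher-score pre-ranking followed by a sequential forward search) can be sketched locally in NumPy/scikit-learn. The Spark parallelisation itself is not reproduced, and the wine dataset and kNN evaluator are illustrative stand-ins for the network-traffic data and classifier.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)

def fisher_score(X, y):
    """Per-feature ratio of between-class to within-class variance."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    num = sum((y == c).sum() * (X[y == c].mean(axis=0) - overall) ** 2
              for c in classes)
    den = sum((y == c).sum() * X[y == c].var(axis=0) for c in classes)
    return num / (den + 1e-12)

# Step 1: pre-rank the features by Fisher score.
ranked = np.argsort(fisher_score(X, y))[::-1]

# Step 2: sequential forward search over the ranked candidates, keeping a
# feature only if it improves cross-validated accuracy.
selected, best_acc = [], 0.0
for f in ranked:
    trial = selected + [int(f)]
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=5),
                          X[:, trial], y, cv=5).mean()
    if acc > best_acc:
        selected, best_acc = trial, acc

print(f"selected features {selected}, CV accuracy {best_acc:.3f}")
```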

  16. Object-Based Classification as an Alternative Approach to the Traditional Pixel-Based Classification to Identify Potential Habitat of the Grasshopper Sparrow

    Science.gov (United States)

    Jobin, Benoît; Labrecque, Sandra; Grenier, Marcelle; Falardeau, Gilles

    2008-01-01

    The traditional method of identifying wildlife habitat distribution over large regions consists of pixel-based classification of satellite images into a suite of habitat classes used to select suitable habitat patches. Object-based classification is a newer method that can achieve the same objective based on the segmentation of spectral bands of the image, creating polygons that are homogeneous with regard to spatial or spectral characteristics. The segmentation algorithm does not rely solely on the single pixel value, but also on shape, texture, and pixel spatial continuity. Object-based classification is a knowledge-based process in which an interpretation key is developed using ground control points, and objects are assigned to specific classes according to threshold values of determined spectral and/or spatial attributes. We developed a model using the eCognition software to identify suitable habitats for the Grasshopper Sparrow, a rare and declining species found in southwestern Québec. The model was developed in a region with known breeding sites and applied to other images covering adjacent regions where potential breeding habitats may be present. We were successful in locating potential habitats in areas where dairy farming prevailed but failed in an adjacent region covered by a distinct Landsat scene and dominated by annual crops. We discuss the added value of this method, such as the possibility of using the contextual information associated with objects and the ability to eliminate unsuitable areas in the segmentation and land cover classification processes, as well as technical and logistical constraints. A series of recommendations on the use of this method and on conservation issues of Grasshopper Sparrow habitat is also provided.

  17. An Event Group Based Classification Framework for Multi-variate Sequential Data

    Directory of Open Access Journals (Sweden)

    Chao Sun

    2017-11-01

    Full Text Available Decision tree algorithms were not traditionally considered for sequential data classification, mostly because feature generation needs to be integrated with the modelling procedure in order to avoid a localisation problem. This paper presents an Event Group Based Classification (EGBC) framework that utilises an X-of-N (XoN) decision tree algorithm to avoid the feature generation issue during classification of sequential data. In this method, features are generated independently based on the characteristics of the sequential data. Subsequently, an XoN decision tree is utilised to select and aggregate useful features from various temporal and other dimensions (as event groups) for optimised classification. This makes the EGBC framework adaptive to sequential data of differing dimensions, robust to missing data, and accommodating to either numeric or nominal data types. The comparatively improved outcomes from applying this method are demonstrated in two distinct areas – a text-based language identification task and a honeybee dance behaviour classification problem. A motivating industrial problem – hot metal temperature prediction – is also considered with the EGBC framework, in order to address significant real-world demands.
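The X-of-N test at the heart of the framework can be illustrated as follows (the attribute names below are hypothetical; in a real EGBC tree the N conditions and the threshold X are learned, not hand-written):

```python
def xon_match_count(instance, conditions):
    """Count how many of the N attribute-value conditions hold for an instance.
    Missing attributes simply fail to match, which keeps the test robust to gaps."""
    return sum(1 for attr, value in conditions if instance.get(attr) == value)

def xon_branch(instance, conditions, threshold):
    """An X-of-N test node: take the 'true' branch when at least X (threshold)
    of the N conditions hold."""
    return xon_match_count(instance, conditions) >= threshold

# Hypothetical event-group features extracted from a sequence.
event = {'hour': 'morning', 'sensor': 'A', 'level': 'high'}
tests = [('hour', 'morning'), ('sensor', 'B'), ('level', 'high')]
print(xon_branch(event, tests, 2))
```

Because the node aggregates several weak conditions instead of splitting on a single attribute, a tree of such nodes can combine evidence from different temporal dimensions, which is the property the abstract highlights.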

  18. Classification of complex networks based on similarity of topological network features

    Science.gov (United States)

    Attar, Niousha; Aliakbary, Sadegh

    2017-09-01

    Over the past few decades, networks have been widely used to model real-world phenomena. Real-world networks exhibit nontrivial topological characteristics and therefore, many network models are proposed in the literature for generating graphs that are similar to real networks. Network models reproduce nontrivial properties such as long-tail degree distributions or high clustering coefficients. In this context, we encounter the problem of selecting the network model that best fits a given real-world network. The need for a model selection method reveals the network classification problem, in which a target-network is classified into one of the candidate network models. In this paper, we propose a novel network classification method which is independent of the network size and employs an alignment-free metric of network comparison. The proposed method is based on supervised machine learning algorithms and utilizes the topological similarities of networks for the classification task. The experiments show that the proposed method outperforms state-of-the-art methods with respect to classification accuracy, time efficiency, and robustness to noise.
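A toy version of the alignment-free idea, comparing networks through size-independent topological features rather than node correspondence, might look like the sketch below (the paper's actual feature set and supervised classifier are richer; the two features here are purely illustrative):

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph,
    given as a dict mapping each node to a set of its neighbours."""
    coeffs = []
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

def feature_vector(adj):
    """A small size-independent feature vector: mean degree and
    average clustering coefficient."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return (sum(degrees) / len(degrees), clustering_coefficient(adj))

triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
path = {1: {2}, 2: {1, 3}, 3: {2}}
print(feature_vector(triangle), feature_vector(path))
```

Classification would then reduce to comparing such vectors (e.g., nearest centroid per candidate network model) in feature space, with no node-to-node alignment required.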

  19. Cancer Pain: A Critical Review of Mechanism-based Classification and Physical Therapy Management in Palliative Care.

    Science.gov (United States)

    Kumar, Senthil P

    2011-05-01

    Mechanism-based classification and physical therapy management of pain are essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of the mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms, and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides an understanding of the patient's symptoms and treatment responses. Existing evidence suggests that five mechanisms - central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive, and cognitive-affective - operate in patients with cancer pain. A summary of studies showing evidence for physical therapy treatment methods for cancer pain follows, with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient's findings, using a biopsychosocial model of pain.

  2. Feature selection for elderly faller classification based on wearable sensors.

    Science.gov (United States)

    Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D

    2017-05-30

    Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6-month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation-based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean, and standard deviation of posterior acceleration; maximum, mean, and standard deviation of anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. Feature selection provided models with smaller feature sets.
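Of the three selection algorithms named above, CFS is the easiest to state compactly. Its merit function is sketched below (this is the standard CFS formulation, not code from this study; the correlation averages would come from the gait data):

```python
import math

def cfs_merit(k, avg_feature_class_corr, avg_feature_feature_corr):
    """Correlation-based feature selection merit of a k-feature subset:
    rewards features correlated with the class, penalizes features
    correlated with each other."""
    return (k * avg_feature_class_corr) / math.sqrt(
        k + k * (k - 1) * avg_feature_feature_corr)

# Five relevant, mutually uncorrelated features beat five relevant but
# redundant ones.
print(cfs_merit(5, 0.4, 0.0) > cfs_merit(5, 0.4, 0.8))
```

Subset search (e.g., best-first over candidate feature sets) then simply maximizes this merit, which is how redundant accelerometer channels get pruned.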

  3. A Characteristics-Based Approach to Radioactive Waste Classification in Advanced Nuclear Fuel Cycles

    Science.gov (United States)

    Djokic, Denia

    The radioactive waste classification system currently used in the United States primarily relies on a source-based framework. This has led to numerous issues, such as wastes that are not categorized by their intrinsic risk, or wastes that do not fall under any category within the framework and therefore lack a legal imperative for responsible management. Furthermore, in the possible case that advanced fuel cycles were to be deployed in the United States, the shortcomings of the source-based classification system would be exacerbated: advanced fuel cycles implement processes such as the separation of used nuclear fuel, which introduce new waste streams of varying characteristics. To manage and dispose of these potential new wastes properly, development of a classification system that would assign an appropriate level of management to each type of waste based on its physical properties is imperative. This dissertation explores how characteristics of wastes generated from potential future nuclear fuel cycles could be coupled with a characteristics-based classification framework. A static mass flow model developed under the Department of Energy's Fuel Cycle Research & Development program, called the Fuel-cycle Integration and Tradeoffs (FIT) model, was used to calculate the composition of waste streams resulting from different nuclear fuel cycle choices: two modified open fuel cycle cases (recycle in a MOX reactor) and two continuous-recycle fast reactor cases (oxide- and metal-fueled fast reactors). This analysis focuses on the impact of waste heat load on waste classification practices, although future work could involve coupling waste heat load with metrics of radiotoxicity and longevity. The value of separating heat-generating fission products and actinides in different fuel cycles, and how it could inform long- and short-term disposal management, is discussed. It is shown that the benefits of reducing the short-term fission

  4. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than by underlying molecular or genomic aberrations. However, the increase of information on molecular changes, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as manifold advances in molecular biology and high-throughput sequencing techniques, has inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaptation. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, and by using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to a higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease-specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  5. AIRBORNE LIDAR POINTS CLASSIFICATION BASED ON TENSOR SPARSE REPRESENTATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2017-09-01

    Full Text Available The common statistical methods for supervised classification usually require a large amount of training data to achieve reasonable results, which is time consuming and inefficient. This paper proposes a tensor sparse representation classification (SRC) method for airborne LiDAR points. The LiDAR points are represented as tensors to retain their attributes in spatial space. Only a small amount of training data is then used for dictionary learning, and the sparse tensor is calculated with a tensor orthogonal matching pursuit (OMP) algorithm. The point label is determined by the minimal reconstruction residual. Experiments carried out on real LiDAR points show that objects can be successfully distinguished by this algorithm.
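The minimal-residual decision rule can be illustrated with a drastically simplified, vector (non-tensor) sketch that projects onto a single atom per class instead of running tensor OMP (the class names and dictionary contents below are made up for illustration):

```python
import math

def _residual(signal, atom):
    """Residual norm after projecting the signal onto one atom
    (least-squares coefficient for a single-atom dictionary)."""
    norm_sq = sum(a * a for a in atom) or 1.0
    coeff = sum(s * a for s, a in zip(signal, atom)) / norm_sq
    return math.sqrt(sum((s - coeff * a) ** 2 for s, a in zip(signal, atom)))

def src_classify(signal, class_dictionaries):
    """Assign the label whose dictionary reconstructs the signal with the
    smallest residual - the SRC decision rule in miniature."""
    best_label, best_res = None, float('inf')
    for label, atoms in class_dictionaries.items():
        res = min(_residual(signal, atom) for atom in atoms)
        if res < best_res:
            best_label, best_res = label, res
    return best_label

dictionaries = {'ground': [[1.0, 1.0, 0.0]], 'vegetation': [[0.0, 0.0, 1.0]]}
print(src_classify([2.0, 2.0, 0.1], dictionaries))
```

The real method solves a sparse coding problem over a learned tensor dictionary, but the final step, picking the class with minimal reconstruction residual, is exactly this comparison.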

  6. Semantic analysis based forms information retrieval and classification

    Science.gov (United States)

    Saba, Tanzila; Alqahtani, Fatimah Ayidh

    2013-09-01

    Data entry forms are employed in all types of enterprises to collect hundreds of customers' records on a daily basis. The information is filled in manually by the customers. Hence, it is laborious and time consuming to use human operators to transfer this customer information into computers manually. Additionally, it is expensive, and human errors might cause serious flaws. The automatic interpretation of scanned forms has facilitated many real applications from the speed and accuracy point of view, such as keyword spotting, sorting of postal addresses, script matching, and writer identification. This research deals with different strategies to extract customer information from these scanned forms for interpretation and classification. Accordingly, extracted information is segmented into characters for classification and finally stored as records in databases for further processing. This paper presents a detailed discussion of these semantic-based analysis strategies for forms processing. Finally, new directions are also recommended for future research.

  7. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    Science.gov (United States)

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, recently the employability and sufficiency of the ACR criteria have been under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, ACR had to update its criteria, announced in 1990, 2010, and 2011. The proposed rule-based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity, and sleep disturbance level, which were deemed important in FMS diagnosis. It was observed that the fuzzy predictor was generally 95.56% consistent with at least one of the specialists who were not creators of the fuzzy rule base. Thus, in diagnosis classification, where the severity of FMS was classified as well, consistent findings were obtained from the comparison of the interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base which could eliminate the shortcomings of the 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification; the probability of occurrence and severity were also classified. In addition, those who were not suffering from FMS were also identified.
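A toy rule-based fuzzy classifier in the spirit described above might look as follows. The membership bounds and the two rules are invented for illustration only; the study's rule base uses five inputs and expert-derived rules:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fms_severity(tender_points, pain_severity):
    """Two hypothetical rules: min models AND; the class with the strongest
    firing rule wins. Membership bounds are illustrative placeholders."""
    high_tp = tri(tender_points, 8, 18, 19)    # 'many tender points'
    high_pain = tri(pain_severity, 4, 10, 11)  # 'severe pain' (0-10 scale)
    low_tp = tri(tender_points, -1, 0, 11)     # 'few tender points'
    severe = min(high_tp, high_pain)           # R1: high TP AND high pain -> severe
    mild = min(low_tp, 1.0 - high_pain)        # R2: low TP AND not high pain -> mild
    return 'severe' if severe >= mild else 'mild'

print(fms_severity(16, 9), fms_severity(4, 2))
```

The appeal of this structure, as the abstract argues, is that the rules remain readable and auditable by clinicians, and graded severity falls out of the membership degrees rather than a hard threshold.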

  8. INTENSITY- AND TIME COURSE-BASED CLASSIFICATIONS OF OXIDATIVE STRESSES

    Directory of Open Access Journals (Sweden)

    Volodymyr Lushchak

    2015-05-01

    Full Text Available In living organisms, production of reactive oxygen species (ROS) is counterbalanced by their elimination and/or prevention of formation, which in concert can typically maintain a steady-state (stationary) ROS level. However, this balance may be disturbed and lead to elevated ROS levels and enhanced damage to biomolecules. Since 1985, when H. Sies first introduced the definition of oxidative stress, this area has become one of the hot topics in biology and, to date, many details related to ROS-induced damage to cellular components, ROS-based signaling, cellular responses and adaptation have been disclosed. However, some basal oxidative damage always occurs under unstressed conditions, and in many experimental studies it is difficult to show definitely that oxidative stress is indeed induced by the stressor. Therefore, researchers usually experience substantial difficulties in the correct interpretation of oxidative stress development. For example, in many cases an increase or decrease in the activity of antioxidant and related enzymes is interpreted as evidence of oxidative stress. Careful selection of specific biomarkers (ROS-modified targets) may be very helpful. To avoid these sorts of problems, I propose several classifications of oxidative stress based on its time course and intensity. The time-course classification includes acute and chronic stresses. In the intensity-based classification, I propose to discriminate four zones of function in the relationship between “Dose/concentration of inducer” and the measured “Endpoint”: I – basal oxidative stress zone (BOS); II – low intensity oxidative stress (LOS); III – intermediate intensity oxidative stress (IOS); IV – high intensity oxidative stress (HOS). The proposed classifications may be helpful to describe experimental data where oxidative stress is induced and to systematize it based on its time course and intensity. Perspective directions of investigations in the field include
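The four intensity zones could be operationalized as a simple threshold mapping over the dose-endpoint relationship. The numeric thresholds below are purely hypothetical placeholders (the article defines the zones conceptually, not with these numbers):

```python
def oxidative_stress_zone(endpoint, baseline, thresholds=(1.2, 2.0, 4.0)):
    """Map a measured endpoint, expressed relative to its unstressed baseline,
    onto the four proposed intensity zones. Threshold values are invented
    placeholders for illustration."""
    ratio = endpoint / baseline
    if ratio < thresholds[0]:
        return 'BOS'   # basal oxidative stress
    if ratio < thresholds[1]:
        return 'LOS'   # low intensity oxidative stress
    if ratio < thresholds[2]:
        return 'IOS'   # intermediate intensity oxidative stress
    return 'HOS'       # high intensity oxidative stress

print(oxidative_stress_zone(3.0, 1.0))
```

In practice the thresholds would be calibrated per biomarker (per ROS-modified target), which is exactly the biomarker-selection problem the abstract emphasizes.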

  9. Land Covers Classification Based on Random Forest Method Using Features from Full-Waveform LIDAR Data

    Science.gov (United States)

    Ma, L.; Zhou, M.; Li, C.

    2017-09-01

    In this study, a Random Forest (RF) based land cover classification method is presented to predict the land cover types in the Miyun area. The returned full waveforms, which were acquired by a LiteMapper 5600 airborne LiDAR system, were processed, including waveform filtering, waveform decomposition, and feature extraction. The commonly used features of distance, intensity, Full Width at Half Maximum (FWHM), skewness, and kurtosis were extracted. These waveform features were used as attributes of the training data for generating the RF prediction model. The RF prediction model was applied to predict the land cover types in the Miyun area as trees, buildings, farmland, and ground. The classification results for these four land cover types were evaluated against ground truth information acquired from CCD image data of the same region. The RF classification results were compared with those of an SVM method and showed better performance. The RF classification accuracy reached 89.73% and the classification Kappa was 0.8631.
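One of the waveform features listed, FWHM, can be computed from a decomposed echo roughly as follows (a generic sketch with linear interpolation at the half-maximum crossings; not the LiteMapper processing chain, which fits Gaussian components first):

```python
def fwhm(samples, dt=1.0):
    """Full width at half maximum of a sampled echo pulse, with linear
    interpolation at the leading and trailing half-max crossings.
    dt is the sample spacing (e.g., in nanoseconds)."""
    peak = max(samples)
    half = peak / 2.0
    above = [i for i, s in enumerate(samples) if s >= half]
    i0, i1 = above[0], above[-1]
    left = (i0 - (samples[i0] - half) / (samples[i0] - samples[i0 - 1])
            if i0 > 0 else float(i0))
    right = (i1 + (samples[i1] - half) / (samples[i1] - samples[i1 + 1])
             if i1 < len(samples) - 1 else float(i1))
    return (right - left) * dt

# A symmetric triangular pulse of peak 4: half max is 2, width is 4 samples.
print(fwhm([0, 1, 2, 3, 4, 3, 2, 1, 0]))
```

Per-echo features like this one, together with intensity, skewness, and kurtosis, form the attribute vector each RF tree splits on.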

  10. Hierarchical Terrain Classification Based on Multilayer Bayesian Network and Conditional Random Field

    Directory of Open Access Journals (Sweden)

    Chu He

    2017-01-01

    Full Text Available This paper presents a hierarchical classification approach for Synthetic Aperture Radar (SAR) images. The Conditional Random Field (CRF) and Bayesian Network (BN) are employed to incorporate prior knowledge into this approach for facilitating SAR image classification. (1) A multilayer region pyramid is constructed based on multiscale oversegmentation, and then CRF is used to model the spatial relationships among the extracted regions within each layer of the region pyramid; boundary prior knowledge is exploited and integrated into the CRF model as a strengthened constraint to improve classification performance near the boundaries. (2) A multilayer BN is applied to establish the causal connections between adjacent layers of the constructed region pyramid, where the classification probabilities of sub-regions in the lower layer, conditioned on their parent regions in the upper layer, are used as adjacent links. More contextual information is taken into account in this framework, which benefits classification performance. Several experiments are conducted on real ESAR and TerraSAR data, and the results show that the proposed method achieves better classification accuracy.

  11. Classification of Suicide Attempts through a Machine Learning Algorithm Based on Multiple Systemic Psychiatric Scales

    Directory of Open Access Journals (Sweden)

    Jihoon Oh

    2017-09-01

    Full Text Available Classification and prediction of suicide attempts in high-risk groups is important for preventing suicide. The purpose of this study was to investigate whether the information from multiple clinical scales has classification power for identifying actual suicide attempts. Patients with depression and anxiety disorders (N = 573) were included, and each participant completed 31 self-report psychiatric scales and questionnaires about their history of suicide attempts. We then trained an artificial neural network classifier with 41 variables (31 psychiatric scales and 10 sociodemographic elements) and ranked the contribution of each variable to the classification of suicide attempts. To evaluate the clinical applicability of our model, we measured classification performance with top-ranked predictors. Our model had an overall accuracy of 93.7% for 1-month, 90.8% for 1-year, and 87.4% for lifetime suicide attempt detection. The area under the receiver operating characteristic curve (AUROC) was highest for 1-month suicide attempt detection (0.93), followed by lifetime (0.89) and 1-year detection (0.87). Among all variables, the Emotion Regulation Questionnaire had the highest contribution, and the positive and negative characteristics of the scales contributed similarly to classification performance. Performance on suicide attempt classification was largely maintained when we used only the top five ranked variables for training (AUROC: 1-month, 0.75; 1-year, 0.85; lifetime, 0.87). Our findings indicate that information from self-report clinical scales can be useful for the classification of suicide attempts. Based on the reliable performance of the top five predictors alone, this machine learning approach could help clinicians identify high-risk patients in clinical settings.
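The AUROC figures reported above can be computed without plotting any ROC curve, via the rank-sum (Mann-Whitney U) formulation: the probability that a randomly chosen positive case scores above a randomly chosen negative case. A generic sketch, not the study's code:

```python
def auroc(labels, scores):
    """AUROC via the rank-sum formulation: fraction of positive-negative
    pairs in which the positive case receives the higher score
    (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A classifier that ranks every attempter above every non-attempter gets 1.0.
print(auroc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.2]))
```

This is the metric behind statements like "AUROC 0.93 for 1-month detection": the network's output scores for attempters versus non-attempters are compared pairwise.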

  12. Building an asynchronous web-based tool for machine learning classification.

    Science.gov (United States)

    Weber, Griffin; Vinterbo, Staal; Ohno-Machado, Lucila

    2002-01-01

    Various unsupervised and supervised learning methods including support vector machines, classification trees, linear discriminant analysis and nearest neighbor classifiers have been used to classify high-throughput gene expression data. Simpler and more widely accepted statistical tools have not yet been used for this purpose, hence proper comparisons between classification methods have not been conducted. We developed free software that implements logistic regression with stepwise variable selection as a quick and simple method for initial exploration of important genetic markers in disease classification. To implement the algorithm and allow our collaborators in remote locations to evaluate and compare its results against those of other methods, we developed a user-friendly asynchronous web-based application with a minimal amount of programming using free, downloadable software tools. With this program, we show that classification using logistic regression can perform as well as other more sophisticated algorithms, and it has the advantages of being easy to interpret and reproduce. By making the tool freely and easily available, we hope to promote the comparison of classification methods. In addition, we believe our web application can be used as a model for other bioinformatics laboratories that need to develop web-based analysis tools in a short amount of time and on a limited budget.

  13. Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification.

    Science.gov (United States)

    Diamant, Idit; Klang, Eyal; Amitai, Michal; Konen, Eli; Goldberger, Jacob; Greenspan, Hayit

    2017-06-01

    We present a novel variant of the bag-of-visual-words (BoVW) method for automated medical image classification. Our approach improves the BoVW model by learning a task-driven dictionary of the most relevant visual words per task using a mutual information-based criterion. Additionally, we generate relevance maps to visualize and localize the decision of the automatic classification algorithm. These maps demonstrate how the algorithm works and show the spatial layout of the most relevant words. We applied our algorithm to three different tasks: chest x-ray pathology identification (of four pathologies: cardiomegaly, enlarged mediastinum, right consolidation, and left consolidation), liver lesion classification into four categories in computed tomography (CT) images, and classification of benign/malignant clusters of microcalcifications (MCs) in breast mammograms. Validation was conducted on three datasets: 443 chest x-rays, 118 portal-phase CT images of liver lesions, and 260 mammography MCs. The proposed method improves on the classical BoVW method for all tested applications. For chest x-rays, an area under the curve of 0.876 was obtained for enlarged mediastinum identification, compared to 0.855 using classical BoVW (with p-value 0.01). For MC classification, a significant improvement of 4% was achieved using our new approach (with p-value = 0.03). For liver lesion classification, improvements of 6% in sensitivity and 2% in specificity were obtained (with p-value 0.001). We demonstrated that classification based on an informative selected set of words results in significant improvement. Our new BoVW approach shows promising results in clinically important domains. Additionally, it can discover relevant parts of images for the task at hand without explicit annotations for the training data. This can provide computer-aided support for medical experts in challenging image analysis tasks.
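The mutual information-based selection criterion can be sketched for the simplest case: a binary visual-word occurrence indicator against a binary class label (a generic MI computation for illustration; the paper's exact criterion and word statistics are richer):

```python
import math

def mutual_information(word_present, labels):
    """Mutual information (in bits) between a binary visual-word indicator
    and a binary class label, estimated from co-occurrence counts."""
    n = len(labels)
    mi = 0.0
    for w in (0, 1):
        for c in (0, 1):
            p_wc = sum(1 for x, y in zip(word_present, labels)
                       if x == w and y == c) / n
            p_w = sum(1 for x in word_present if x == w) / n
            p_c = sum(1 for y in labels if y == c) / n
            if p_wc > 0:
                mi += p_wc * math.log2(p_wc / (p_w * p_c))
    return mi

# A word that appears exactly in the pathological images carries 1 bit;
# a word independent of the label carries none.
print(mutual_information([1, 1, 0, 0], [1, 1, 0, 0]),
      mutual_information([1, 0, 1, 0], [1, 1, 0, 0]))
```

Ranking candidate words by such a score and keeping the top ones is what makes the dictionary "task-driven": the retained words are those most informative about the diagnostic label.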

  14. Radiographic classification for fractures of the fifth metatarsal base

    Energy Technology Data Exchange (ETDEWEB)

    Mehlhorn, Alexander T.; Zwingmann, Joern; Hirschmueller, Anja; Suedkamp, Norbert P.; Schmal, Hagen [University of Freiburg Medical Center, Department of Orthopaedic Surgery, Freiburg (Germany)

    2014-04-15

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk of secondary displacement, a new classification was developed. A cohort of 95 healthy, sportive, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primarily displaced, and secondarily displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off, and fractures with a more medial joint entry of the fracture line at the MTB5, are at higher risk of secondary displacement. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement < 2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). Secondary displacement of fractures was confirmed by all examiners in 100 % of cases. Radiomorphometric data may identify MTB5 fractures at risk of secondary displacement. Based on this, a reliable classification was developed. (orig.)
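The resulting six-type scheme is simple enough to express directly. The sketch below mirrors the classification as described in the abstract; the 'entry third' input is assumed to have already been measured from the radiograph:

```python
def classify_mtb5_fracture(entry_third, displacement_mm):
    """Map the joint-entry location of the fracture line and the measured
    displacement onto the six proposed MTB5 fracture types.

    entry_third: 'lateral', 'middle', or 'medial' third of the MTB5
                 articular surface (type I, II, III respectively).
    displacement_mm: A-subtype if < 2 mm, B-subtype if >= 2 mm.
    """
    base = {'lateral': 'I', 'middle': 'II', 'medial': 'III'}[entry_third]
    return base + ('A' if displacement_mm < 2.0 else 'B')

print(classify_mtb5_fracture('medial', 2.5))
```

Per the radiomorphometric findings, more medial entry (higher type number) carries a higher risk of secondary displacement, so the type label itself encodes the clinically relevant risk gradient.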

  15. Image Classification Based on Convolutional Denoising Sparse Autoencoder

    Directory of Open Access Journals (Sweden)

    Shuangshuang Chen

    2017-01-01

    Full Text Available Image classification aims to group images into corresponding semantic categories. Due to the difficulties of interclass similarity and intraclass variability, it is a challenging issue in computer vision. In this paper, an unsupervised feature learning approach called convolutional denoising sparse autoencoder (CDSAE) is proposed based on the theory of the visual attention mechanism and deep learning methods. Firstly, a saliency detection method is utilized to obtain training samples for unsupervised feature learning. Next, these samples are sent to the denoising sparse autoencoder (DSAE), followed by a convolutional layer and a local contrast normalization layer. Generally, prior knowledge in a specific task is helpful for the task solution. Therefore, a new pooling strategy—spatial pyramid pooling (SPP) fused with a center-bias prior—is introduced into our approach. Experimental results on two common image datasets (STL-10 and CIFAR-10) demonstrate that our approach is effective in image classification. They also demonstrate that none of the three components (local contrast normalization, SPP fused with the center-bias prior, and l2 vector normalization) can be excluded from our proposed approach; they jointly improve image representation and classification performance.

  16. MMG-based classification of muscle activity for prosthesis control.

    Science.gov (United States)

    Silva, J; Heim, W; Chau, T

    2004-01-01

    We have previously proposed the use of "muscle sounds," or mechanomyography (MMG), as a reliable alternative measure of muscle activity, with the main objective of facilitating the use of more comfortable and functional soft silicone sockets with below-elbow externally powered prostheses. This work describes an integrated strategy in which data and sensor fusion algorithms are combined to provide MMG-based detection, estimation, and classification of muscle activity. The proposed strategy represents the first attempt to generate multiple output signals for practical prosthesis control using an MMG multisensor array embedded distally within a soft silicone socket. This multisensor fusion strategy consists of two stages. The first is the detection stage, which determines the presence or absence of muscle contractions in the acquired signals. Upon detection of a contraction, the second stage, classification, specifies the nature of the contraction and determines the corresponding control output. Tests with amputees indicate that, with the simple detection and classification algorithms proposed, MMG is indeed functionally comparable to EMG and may exceed it.

  17. Resting State fMRI Functional Connectivity-Based Classification Using a Convolutional Neural Network Architecture

    Directory of Open Access Journals (Sweden)

    Regina J. Meszlényi

    2017-10-01

    Full Text Available Machine learning techniques have become increasingly popular in the field of resting state fMRI (functional magnetic resonance imaging) network-based classification. However, the application of convolutional networks has been proposed only very recently and has remained largely unexplored. In this paper we describe a convolutional neural network architecture for functional connectome classification called the connectome-convolutional neural network (CCNN). Our results on simulated datasets and on a publicly available dataset for amnestic mild cognitive impairment classification demonstrate that our CCNN model can efficiently distinguish between subject groups. We also show that the connectome-convolutional network can combine information from diverse functional connectivity metrics, and that models using a combination of different connectivity descriptors are able to outperform classifiers using only one metric. This flexibility means that our proposed CCNN model can be easily adapted to a wide range of connectome-based classification or regression tasks by varying which connectivity descriptor combinations are used to train the network.
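    The core idea, convolutional filters that span whole rows of the connectivity matrix so that each filter sees one region's full connectivity fingerprint, can be caricatured in a few lines. Everything below (the filter count, the toy "patient"/"control" connectomes, and a nearest-centroid read-out standing in for the trained network) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, f = 10, 4                        # brain regions, "line" filters

def line_conv(conn, filters):
    """First-layer analogue: each filter spans a whole row (one region's
    connectivity profile) and is applied to every row, giving (f, n) features."""
    return np.maximum(0.0, filters @ conn.T)      # ReLU feature map

def toy_connectome(coupling):
    """Symmetric toy connectivity matrix with a given mean coupling."""
    a = coupling + rng.normal(scale=0.05, size=(n, n))
    a = (a + a.T) / 2.0
    np.fill_diagonal(a, 1.0)
    return a

filters = rng.random((f, n))        # nonnegative toy filters
# Pool features and build a nearest-centroid classifier for two toy groups.
train = {lab: [line_conv(toy_connectome(c), filters).mean(axis=1)
               for _ in range(20)]
         for lab, c in (("patient", 0.2), ("control", 0.6))}
centroids = {lab: np.mean(v, axis=0) for lab, v in train.items()}

def classify(conn):
    feat = line_conv(conn, filters).mean(axis=1)
    return min(centroids, key=lambda lab: np.linalg.norm(feat - centroids[lab]))

print(classify(toy_connectome(0.6)))   # lands near the "control" centroid
```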

  18. Resting State fMRI Functional Connectivity-Based Classification Using a Convolutional Neural Network Architecture

    Science.gov (United States)

    Meszlényi, Regina J.; Buza, Krisztian; Vidnyánszky, Zoltán

    2017-01-01

    Machine learning techniques have become increasingly popular in the field of resting state fMRI (functional magnetic resonance imaging) network-based classification. However, the application of convolutional networks has been proposed only very recently and has remained largely unexplored. In this paper we describe a convolutional neural network architecture for functional connectome classification called the connectome-convolutional neural network (CCNN). Our results on simulated datasets and on a publicly available dataset for amnestic mild cognitive impairment classification demonstrate that our CCNN model can efficiently distinguish between subject groups. We also show that the connectome-convolutional network can combine information from diverse functional connectivity metrics, and that models using a combination of different connectivity descriptors are able to outperform classifiers using only one metric. This flexibility means that our proposed CCNN model can be easily adapted to a wide range of connectome-based classification or regression tasks by varying which connectivity descriptor combinations are used to train the network. PMID:29089883

  19. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

    Science.gov (United States)

    Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

    2013-01-01

    The abundance of gene expression microarray data has led to the development of machine learning algorithms for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces the fuzzy support vector machine, a learning algorithm that combines fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that the fuzzy support vector machine, applied in combination with filter or wrapper feature selection methods, develops a robust model with higher accuracy than conventional microarray classification models such as the support vector machine, artificial neural network, decision trees, k-nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule base inferred from the fuzzy support vector machine helps extract biological knowledge from microarray data. As a new classification model with high generalization power, robustness, and good interpretability, the fuzzy support vector machine appears to be a promising tool for gene expression microarray classification.
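    A minimal sketch of the rule-based side of such a classifier, assuming triangular membership functions and a min t-norm over a toy two-gene rule base. The gene values, rules, and class labels below are hypothetical, not those of the paper.

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def low(x):  return tri(x, -0.5, 0.0, 0.5)   # "expression is low"
def high(x): return tri(x, 0.5, 1.0, 1.5)    # "expression is high"

# Illustrative rule base over two hypothetical normalized expression values.
RULES = [
    ("tumor",  lambda g1, g2: min(high(g1), low(g2))),
    ("normal", lambda g1, g2: min(low(g1), high(g2))),
]

def classify(g1, g2):
    """Fire every rule (min t-norm) and take the class with max strength."""
    strengths = {}
    for label, rule in RULES:
        strengths[label] = max(strengths.get(label, 0.0), rule(g1, g2))
    return max(strengths, key=strengths.get)

print(classify(0.9, 0.1))   # mostly matches the "tumor" rule
print(classify(0.1, 0.9))   # mostly matches the "normal" rule
```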

  20. [A comparison between the revision of Atlanta classification and determinant-based classification in acute pancreatitis].

    Science.gov (United States)

    Wu, D; Lu, B; Xue, H D; Lai, Y M; Qian, J M; Yang, H

    2017-12-01

    Objective: To compare the performance of the revision of the Atlanta classification (RAC) and the determinant-based classification (DBC) in acute pancreatitis. Methods: Consecutive patients with acute pancreatitis admitted to a single center from January 2001 to January 2015 were retrospectively analyzed. Patients were classified into mild, moderately severe, and severe categories based on the RAC, and simultaneously into mild, moderate, severe, and critical grades according to the DBC. Disease severity and clinical outcomes were compared between subgroups. The receiver operating characteristic (ROC) curve was used to compare the utility of the RAC and DBC by calculating the area under the curve (AUC). Results: Among the 1 120 patients enrolled, organ failure occurred in 343 patients (30.6%) and infected necrosis in 74 patients (6.6%). A total of 63 patients (5.6%) died. Statistically significant differences in disease severity and outcomes were observed between all the subgroups in the RAC and DBC (P < 0.05). Critical acute pancreatitis (with both persistent organ failure and infected necrosis) had the most severe clinical course and the highest mortality (19/31, 61.3%). The DBC had a larger AUC (0.73, 95% CI 0.69-0.78) than the RAC (0.68, 95% CI 0.65-0.73) in classifying ICU admissions (P = 0.031), but the two were similar in predicting mortality (P = 0.372) and prolonged ICU stay (P = 0.266). Conclusions: The DBC and RAC perform comparably well in categorizing patients with acute pancreatitis with regard to disease severity and clinical outcome. The DBC is slightly better than the RAC in classifying ICU admissions. Persistent organ failure and infected necrosis are risk factors for poor prognosis, and the presence of both is associated with the most dismal outcome.
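    The AUC comparison underlying such an analysis can be reproduced with the Mann-Whitney interpretation of the ROC area: the probability that a randomly chosen positive case scores above a randomly chosen negative one, with ties counting one half. The severity scores below are made up for illustration.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5          # ties count one half
    return wins / (len(scores_pos) * len(scores_neg))

# Toy severity scores: a classification that ranks the ICU-admitted
# patients higher earns a larger AUC.
icu    = [0.9, 0.8, 0.7, 0.6]
no_icu = [0.5, 0.4, 0.7, 0.2]
print(auc(icu, no_icu))
```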

  1. An Approach for Leukemia Classification Based on Cooperative Game Theory

    Directory of Open Access Journals (Sweden)

    Atefeh Torkaman

    2011-01-01

    Full Text Available Hematological malignancies are the types of cancer that affect the blood, bone marrow, and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood-forming tissues, especially the bone marrow, where blood is made. Research shows that leukemia is one of the most common cancers in the world, so emphasis on diagnostic techniques and the best treatments can provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game theory is presented. Throughout this research, we analyze flow cytometry data for the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). In total, the data set contains 400 samples taken from human leukemic bone marrow. This study uses a cooperative game for classification according to the different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represent, meaning it can be used to classify a population according to their contributions; in other words, it applies equally to other groups of data. The experimental results show a classification accuracy of 93.12%, compared to 90.16% for a decision tree (C4.5). The results demonstrate that cooperative game theory is very promising for the direct classification of leukemia as part of an active medical decision support system for the interpretation of flow cytometry readouts. This system could assist clinical hematologists to properly recognize different kinds of leukemia by providing suggestions, and this could improve treatment.
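    Assigning classification weights to markers via a cooperative game typically means computing Shapley values over marker coalitions. A brute-force sketch follows; the three-marker characteristic function is hypothetical, standing in for the classification accuracy each coalition of markers achieves.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over all orderings (feasible for small marker sets)."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            phi[p] += value(frozenset(coalition)) - before
    return {p: v / len(orders) for p, v in phi.items()}

# Hypothetical characteristic function: accuracy achieved by each
# coalition of three flow-cytometry markers A, B, C.
ACC = {frozenset(): 0.0,
       frozenset("A"): 0.60, frozenset("B"): 0.55, frozenset("C"): 0.40,
       frozenset("AB"): 0.80, frozenset("AC"): 0.70, frozenset("BC"): 0.65,
       frozenset("ABC"): 0.90}
weights = shapley_values("ABC", ACC.get)
print({p: round(w, 4) for p, w in sorted(weights.items())})
```

The weights sum to the grand-coalition value (0.90), so they split the full model's accuracy among the markers.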

  2. An approach for leukemia classification based on cooperative game theory.

    Science.gov (United States)

    Torkaman, Atefeh; Charkari, Nasrollah Moghaddam; Aghaeipour, Mahnaz

    2011-01-01

    Hematological malignancies are the types of cancer that affect the blood, bone marrow, and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood-forming tissues, especially the bone marrow, where blood is made. Research shows that leukemia is one of the most common cancers in the world, so emphasis on diagnostic techniques and the best treatments can provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game theory is presented. Throughout this research, we analyze flow cytometry data for the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). In total, the data set contains 400 samples taken from human leukemic bone marrow. This study uses a cooperative game for classification according to the different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represent, meaning it can be used to classify a population according to their contributions; in other words, it applies equally to other groups of data. The experimental results show a classification accuracy of 93.12%, compared to 90.16% for a decision tree (C4.5). The results demonstrate that cooperative game theory is very promising for the direct classification of leukemia as part of an active medical decision support system for the interpretation of flow cytometry readouts. This system could assist clinical hematologists to properly recognize different kinds of leukemia by providing suggestions, and this could improve the treatment of leukemic patients.

  3. Microcalcification classification assisted by content-based image retrieval for breast cancer diagnosis.

    Science.gov (United States)

    Wei, Liyang; Yang, Yongyi; Nishikawa, Robert M

    2009-06-01

    In this paper we propose a microcalcification classification scheme, assisted by content-based mammogram retrieval, for breast cancer diagnosis. We recently developed a machine learning approach for mammogram retrieval where the similarity measure between two lesion mammograms was modeled after expert observers. In this work we investigate how to use retrieved similar cases as references to improve the performance of a numerical classifier. Our rationale is that by adaptively incorporating local proximity information into a classifier, it can help to improve its classification accuracy, thereby leading to an improved "second opinion" to radiologists. Our experimental results on a mammogram database demonstrate that the proposed retrieval-driven approach with an adaptive support vector machine (SVM) could improve the classification performance from 0.78 to 0.82 in terms of the area under the ROC curve.

  4. Texture-based classification of different gastric tumors at contrast-enhanced CT

    International Nuclear Information System (INIS)

    Ba-Ssalamah, Ahmed; Muin, Dina; Schernthaner, Ruediger; Kulinna-Cosentini, Christiana; Bastati, Nina; Stift, Judith; Gore, Richard; Mayerhoefer, Marius E.

    2013-01-01

    Purpose: To determine the feasibility of texture analysis for the classification of gastric adenocarcinoma, lymphoma, and gastrointestinal stromal tumors (GIST) on contrast-enhanced hydrodynamic MDCT images. Materials and methods: The arterial phase scans of 47 patients (adenocarcinoma [AC] with histologic tumor grades AC-G1, n = 4; AC-G2, n = 7; AC-G3, n = 16; GIST, n = 15; lymphoma, n = 5) and the venous phase scans of 48 patients (AC-G1, n = 3; AC-G2, n = 6; AC-G3, n = 14; GIST, n = 17; lymphoma, n = 8) were retrospectively reviewed. Based on regions of interest, texture analysis was performed, and features derived from the gray-level histogram, run-length and co-occurrence matrices, absolute gradient, autoregressive model, and wavelet transform were calculated. Fisher coefficients, probability of classification error, average correlation coefficients, and mutual information coefficients were used to create combinations of texture features optimized for tumor differentiation. Linear discriminant analysis in combination with a k-nearest neighbor classifier was used for tumor classification. Results: On arterial-phase scans, texture-based lesion classification was highly successful in differentiating between AC and lymphoma, and between GIST and lymphoma, with misclassification rates of 3.1% and 0%, respectively. On venous-phase scans, texture-based classification was slightly less successful for AC vs. lymphoma (9.7% misclassification) and GIST vs. lymphoma (8% misclassification), but enabled differentiation between AC and GIST (10% misclassification) and between the different grades of AC (4.4% misclassification). No texture feature combination was able to adequately distinguish between all three tumor types. Conclusion: Classification of different gastric tumors based on textural information may aid radiologists in establishing the correct diagnosis, at least in cases where the differential diagnosis can be narrowed down to two

  5. Characterization and classification of lupus patients based on plasma thermograms.

    Directory of Open Access Journals (Sweden)

    Nichola C Garbett

    Full Text Available Plasma thermograms (thermal stability profiles of blood plasma) are being utilized as a new diagnostic approach for clinical assessment. In this study, we investigated the ability of plasma thermograms to classify systemic lupus erythematosus (SLE) patients versus non-SLE controls using a sample of 300 SLE and 300 control subjects from the Lupus Family Registry and Repository. Additionally, we evaluated the heterogeneity of thermograms along age, sex, ethnicity, concurrent health conditions and SLE diagnostic criteria. Thermograms were visualized graphically for important differences between covariates and summarized using various measures. A modified linear discriminant analysis was used to segregate SLE versus control subjects on the basis of the thermograms. Classification accuracy was measured based on multiple training/test splits of the data and compared to classification based on SLE serological markers. Median sensitivity, specificity, and overall accuracy based on classification using plasma thermograms was 86%, 83%, and 84%, compared to 78%, 95%, and 86% based on a combination of five antibody tests. Combining thermogram and serology information together improved sensitivity from 78% to 86% and overall accuracy from 86% to 89% relative to serology alone. Predictive accuracy of thermograms for distinguishing SLE and osteoarthritis/rheumatoid arthritis patients was comparable. Both gender and anemia significantly interacted with disease status for plasma thermograms (p < 0.001), with greater separation between SLE and control thermograms for females relative to males and for patients with anemia relative to patients without anemia. Plasma thermograms constitute an additional biomarker which may help improve diagnosis of SLE patients, particularly when coupled with standard diagnostic testing. Differences in thermograms according to patient sex, ethnicity, clinical and environmental factors are important considerations for application of

  6. Fuzzy classification of phantom parent groups in an animal model

    Directory of Open Access Journals (Sweden)

    Fikse Freddy

    2009-09-01

    Full Text Available Abstract Background Genetic evaluation models often include genetic groups to account for unequal genetic level of animals with unknown parentage. The definition of phantom parent groups usually includes a time component (e.g. years). Combining several time periods to ensure sufficiently large groups may create problems, since all phantom parents in a group are considered contemporaries. Methods To avoid the downside of such distinct classification, a fuzzy logic approach is suggested. A phantom parent can be assigned to several genetic groups, with proportions between zero and one that sum to one. Rules were presented for assigning coefficients to the inverse of the relationship matrix for fuzzy-classified genetic groups. This approach was illustrated with simulated data from ten generations of mass selection. Observations and pedigree records were randomly deleted. Phantom parent groups were defined on the basis of gender and generation number. In one scenario, uncertainty about generation of birth was simulated for some animals with unknown parents. In the distinct classification, one of the two possible generations of birth was randomly chosen to assign phantom parents to genetic groups for animals with simulated uncertainty, whereas the phantom parents were assigned to both possible genetic groups in the fuzzy classification. Results The empirical prediction error variance (PEV) was somewhat lower for fuzzy-classified genetic groups. The ranking of animals with unknown parents was more correct and less variable across replicates in comparison with distinct genetic groups. In another scenario, each phantom parent was assigned to three groups, one pertaining to its gender, and two pertaining to the first and last generation, with proportions depending on the (true) generation of birth. Due to the lower number of groups, the empirical PEV of breeding values was smaller when genetic groups were fuzzy-classified. Conclusion Fuzzy-classification

  7. Evaluation of air quality zone classification methods based on ambient air concentration exposure.

    Science.gov (United States)

    Freeman, Brian; McBean, Ed; Gharabaghi, Bahram; Thé, Jesse

    2017-05-01

    Air quality zones are used by regulatory authorities to implement ambient air standards in order to protect human health. Air quality measurements at discrete air monitoring stations are critical tools for determining whether an air quality zone complies with local air quality standards or is noncompliant. This study presents a novel approach for the evaluation of air quality zone classification methods by breaking the concentration distribution of a pollutant measured at an air monitoring station into compliance and exceedance probability density functions (PDFs) and then using Monte Carlo analysis with the Central Limit Theorem to estimate long-term exposure. The purpose of this paper is to compare the risk associated with selecting one ambient air classification approach over another by testing the possible exposure an individual living within a zone may face. The chronic daily intake (CDI) is used to compare pollutant exposures over the classification duration of 3 years between two classification methods. Historical data collected from air monitoring stations in Kuwait are used to build representative models of 1-hr NO2 and 8-hr O3 within a zone that meets the compliance requirements of each method. The first method, the "3 Strike" method, is a conservative approach based on a winner-take-all principle common to most compliance classification methods, while the second, the 99% Rule method, allows for more robust analyses and incorporates long-term trends. A Monte Carlo analysis is used to model the CDI for each pollutant and each method, with the zone at a single station and with multiple stations. The model assumes that the zone is already in compliance with air quality standards over the 3 years under the different classification methodologies. The model shows that while the CDI of the two methods differs by 2.7% over the exposure period for the single-station case, the large number of samples taken over the duration period impacts the sensitivity
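    The compliance/exceedance mixture-PDF idea combined with Monte Carlo CDI estimation can be sketched as follows. The mixture parameters, intake-formula constants, and units are illustrative assumptions, not values from the study.

```python
import random

random.seed(42)

def sample_concentration(p_exceed, compliant_mu, exceed_mu, sigma=0.2):
    """Draw an hourly concentration from a two-part mixture: a 'compliance'
    lognormal PDF with probability 1 - p_exceed, else an 'exceedance' PDF."""
    mu = exceed_mu if random.random() < p_exceed else compliant_mu
    return random.lognormvariate(mu, sigma)

def mc_cdi(p_exceed, compliant_mu, exceed_mu,
           n=20000, inh_rate=0.83, body_weight=70.0):
    """Monte Carlo chronic daily intake (illustrative units):
    CDI = mean(C) * inhalation_rate * 24 h / body_weight."""
    total = sum(sample_concentration(p_exceed, compliant_mu, exceed_mu)
                for _ in range(n))
    return (total / n) * inh_rate * 24.0 / body_weight

# A zone classified under a stricter method (fewer exceedance hours) yields
# a lower estimated exposure than one classified under a looser method.
strict = mc_cdi(p_exceed=0.01, compliant_mu=-1.0, exceed_mu=0.5)
loose  = mc_cdi(p_exceed=0.05, compliant_mu=-1.0, exceed_mu=0.5)
print(round(strict, 4), round(loose, 4))
```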

  8. Patient-Specific Deep Architectural Model for ECG Classification

    Directory of Open Access Journals (Sweden)

    Kan Luo

    2017-01-01

    Full Text Available Heartbeat classification is a crucial step for arrhythmia diagnosis during electrocardiographic (ECG) analysis. The new scenario of wireless body sensor network (WBSN)-enabled ECG monitoring puts forward a higher-level demand for this traditional ECG analysis task. Previously reported methods mainly addressed this requirement with the application of a shallow-structured classifier and expert-designed features. In this study, the modified frequency slice wavelet transform (MFSWT) was first employed to produce the time-frequency image of the heartbeat signal. Then a deep learning (DL) method was applied for heartbeat classification. Here, we propose a novel model incorporating automatic feature abstraction and a deep neural network (DNN) classifier. Features were automatically abstracted by a stacked denoising auto-encoder (SDA) from the transformed time-frequency image. The DNN classifier was constructed from an encoder layer of the SDA and a softmax layer. In addition, a deterministic patient-specific heartbeat classifier was achieved by fine-tuning on heartbeat samples that included a small subset of individual samples. The performance of the proposed model was evaluated on the MIT-BIH arrhythmia database. Results showed that an overall accuracy of 97.5% was achieved using the proposed model, confirming that the proposed DNN model is a powerful tool for heartbeat pattern recognition.

  9. A Study of Image Classification of Remote Sensing Based on Back-Propagation Neural Network with Extended Delta Bar Delta

    Directory of Open Access Journals (Sweden)

    Shi Liang Zhang

    2015-01-01

    Full Text Available This paper proposes a model to extract feature information quickly and accurately, identifying features that cannot be obtained through traditional methods of remote sensing image classification. First, the selected Landsat-8 remote sensing data are processed, including radiometric calibration, geometric correction, optimal band combination, and image cropping. The processed remote sensing image is then combined with normalized geographic auxiliary information, the digital elevation model (DEM), and the normalized difference vegetation index (NDVI), which together are used to build a three-level neural network based on the structure of the back-propagation neural network with extended delta bar delta (BPN-EDBD) algorithm, whose parameters are determined to constitute a good classification model. Classes and standards are then determined via field surveys and related geographic information, and training samples are selected for BPN-EDBD learning and training, with its parameters revised and improved where necessary. The BPN-EDBD classification algorithm then classifies the preprocessed remote sensing image, with DEM data and the NDVI as input parameters, outputs classification results, and is followed by an accuracy assessment. Finally, the BPN-EDBD classification algorithm is compared with traditional supervised classification algorithms, while adding different auxiliary geographic information, to study its advantages and disadvantages.
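    For reference, the extended delta bar delta idea, a per-weight learning rate that grows additively while the gradient direction stays consistent and shrinks multiplicatively when it oscillates, can be sketched on a toy quadratic. This is a simplified variant for illustration; the exact EDBD update and its constants differ.

```python
import math

def edbd_minimize(grad, w, steps=100, lr0=0.05,
                  kappa=0.01, phi=0.2, theta=0.7):
    """Gradient descent with a simplified extended delta-bar-delta rule:
    each weight keeps its own learning rate, grown when the current
    gradient agrees in sign with the smoothed past gradient, shrunk
    otherwise."""
    n = len(w)
    lr = [lr0] * n
    bar = [0.0] * n                   # exponentially smoothed gradient
    for _ in range(steps):
        g = grad(w)
        for i in range(n):
            if bar[i] * g[i] > 0:     # consistent direction: additive growth
                lr[i] += kappa * (1 - math.exp(-abs(bar[i])))
            elif bar[i] * g[i] < 0:   # oscillation: multiplicative shrink
                lr[i] *= (1 - phi)
            bar[i] = (1 - theta) * g[i] + theta * bar[i]
            w[i] -= lr[i] * g[i]
    return w

# Minimize the toy quadratic f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2.
grad = lambda w: [2 * (w[0] - 3), 20 * (w[1] + 1)]
w = edbd_minimize(grad, [0.0, 0.0])
print([round(v, 3) for v in w])       # converges to about [3.0, -1.0]
```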

  10. Application of Bayesian Classification to Content-Based Data Management

    Science.gov (United States)

    Lynnes, Christopher; Berrick, S.; Gopalan, A.; Hua, X.; Shen, S.; Smith, P.; Yang, K-Y.; Wheeler, K.; Curry, C.

    2004-01-01

    The high volume of Earth Observing System data has proven to be challenging to manage for data centers and users alike. At the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC), about 1 TB of new data are archived each day. Distribution to users is also about 1 TB/day. A substantial portion of this distribution is MODIS calibrated radiance data, which has a wide variety of uses. However, much of the data is not useful for a particular user's needs: for example, ocean color users typically need oceanic pixels that are free of cloud and sun-glint. The GES DAAC is using a simple Bayesian classification scheme to rapidly classify each pixel in the scene in order to support several experimental content-based data services for near-real-time MODIS calibrated radiance products (from Direct Readout stations). Content-based subsetting would allow distribution of, say, only clear pixels to the user if desired. Content-based subscriptions would distribute data to users only when they fit the user's usability criteria in their area of interest within the scene. Content-based cache management would retain more useful data on disk for easy online access. The classification may even be exploited in an automated quality assessment of the geolocation product. Though initially to be demonstrated at the GES DAAC, these techniques have applicability in other resource-limited environments, such as spaceborne data systems.
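    A simple Bayesian (naive Bayes) pixel classifier of the kind described can be sketched as follows. The two-band Gaussian class models and priors are invented for illustration and are not the GES DAAC's actual scheme.

```python
import math

# Hypothetical per-class likelihood models for two MODIS-like band
# reflectances: per-band Gaussian (mean, std) plus a class prior.
CLASSES = {
    "clear_ocean": {"prior": 0.5, "bands": [(0.05, 0.02), (0.03, 0.02)]},
    "cloud":       {"prior": 0.4, "bands": [(0.60, 0.10), (0.55, 0.10)]},
    "sun_glint":   {"prior": 0.1, "bands": [(0.30, 0.08), (0.10, 0.05)]},
}

def log_gauss(x, mu, sigma):
    """Log density of a 1-D Gaussian."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def classify_pixel(bands):
    """Naive Bayes: argmax over classes of log P(c) + sum_b log P(band_b | c)."""
    best, best_score = None, -math.inf
    for label, model in CLASSES.items():
        score = math.log(model["prior"])
        for x, (mu, sigma) in zip(bands, model["bands"]):
            score += log_gauss(x, mu, sigma)
        if score > best_score:
            best, best_score = label, score
    return best

print(classify_pixel([0.04, 0.05]))   # low radiance in both bands
print(classify_pixel([0.62, 0.50]))   # bright in both bands
```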

  11. Contaminant classification using cosine distances based on multiple conventional sensors.

    Science.gov (United States)

    Liu, Shuming; Che, Han; Smith, Kate; Chang, Tian

    2015-02-01

    Emergent contamination events have a significant impact on water systems. After contamination detection, it is important to classify the type of contaminant quickly to provide support for remediation attempts. Conventional methods generally either rely on laboratory-based analysis, which requires a long analysis time, or on multivariable-based geometry analysis and sequence analysis, which are prone to being affected by the contaminant concentration. This paper proposes a new contaminant classification method, which discriminates contaminants in real time, independent of the contaminant concentration. The proposed method quantifies the similarities or dissimilarities between sensors' responses to different types of contaminants. The performance of the proposed method was evaluated using data from contaminant injection experiments in a laboratory and compared with a Euclidean distance-based method. The robustness of the proposed method was evaluated using an uncertainty analysis. The results show that the proposed method performed better in identifying the type of contaminant than the Euclidean distance-based method and that it could classify the type of contaminant in minutes without significantly compromising the correct classification rate (CCR).
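    The concentration independence of a cosine-distance classifier is easy to demonstrate: scaling a sensor-response vector changes its length but not its direction. The sensor fingerprints below are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cos(angle between u and v); zero for parallel vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

# Hypothetical fingerprints: relative response of conventional sensors
# (pH, ORP, conductivity, turbidity) to each contaminant class.
LIBRARY = {
    "pesticide":   [-0.8, 0.5, 0.1, 0.3],
    "heavy_metal": [-0.2, 0.9, 0.4, 0.0],
    "nutrient":    [0.1, -0.3, 0.9, 0.2],
}

def classify(response):
    """Cosine distance compares direction only, so scaling the response
    by concentration leaves the classification unchanged."""
    return min(LIBRARY, key=lambda k: cosine_distance(response, LIBRARY[k]))

weak   = [0.05, -0.15, 0.45, 0.1]     # low-concentration event
strong = [x * 10 for x in weak]       # same contaminant, higher dose
print(classify(weak), classify(strong))
```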

  12. Classification of Nitrate Polluting Activities through Clustering of Isotope Mixing Model Outputs.

    Science.gov (United States)

    Xue, Dongmei; De Baets, Bernard; Van Cleemput, Oswald; Hennessy, Carmel; Berglund, Michael; Boeckx, Pascal

    2013-09-01

    Apportionment of nitrate (NO3-) sources in surface water and classification of monitoring locations according to NO3- polluting activities may help implementation of water quality control measures. In this study, we (i) evaluated a Bayesian isotopic mixing model (stable isotope analysis in R [SIAR]) for NO3- source apportionment using 2 yr of δ15N-NO3- and δ18O-NO3- data from 29 locations within river basins in Flanders (Belgium) and five expert-defined NO3- polluting activities, (ii) used the NO3- source contributions as input to an unsupervised learning algorithm (k-means clustering) to reclassify sampling locations into NO3- polluting activities, and (iii) assessed whether a decision tree model of physicochemical data could retrieve the isotope-based and expert-defined classifications. Based on the SIAR and δ11B results, manure/sewage was identified as a major NO3- source, whereas soil N, fertilizer NO3-, and NH4+ in fertilizer and rain were intermediate sources and NO3- in precipitation was a minor source. The k-means clustering algorithm allowed classification of NO3- polluting activities that corresponded well to the expert-defined classifications. A decision tree model of physicochemical parameters allowed us to correctly classify 50 to 100% of the sampling locations as compared with the k-means clustering approach. We suggest that NO3- polluting activities can be identified via clustering of NO3- source contributions from samples representing an entire river basin. Classification of future monitoring locations into these classes could use decision tree models based on physicochemical data. The latter approach holds a substantial degree of uncertainty but provides more inherent information for dedicated abatement strategies than monitoring of NO3- concentrations alone. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.
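    Step (ii), clustering locations by their source-contribution vectors, can be sketched with a plain k-means. The SIAR-style contribution vectors below are invented for illustration.

```python
import random

random.seed(7)

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(group):
    return [sum(col) / len(group) for col in zip(*group)]

def kmeans(points, k, iters=25):
    """k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = mean(members)
    return labels

# Hypothetical mean source contributions per sampling location:
# (manure/sewage, soil N, fertilizer, precipitation), each row sums to ~1.
locations = ([[0.70 + random.uniform(-0.05, 0.05), 0.15, 0.10, 0.05]
              for _ in range(5)] +
             [[0.20, 0.20, 0.55 + random.uniform(-0.05, 0.05), 0.05]
              for _ in range(5)])
labels = kmeans(locations, k=2)
print(labels)   # manure-dominated and fertilizer-dominated groups separate
```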

  13. Classification of Noisy Data: An Approach Based on Genetic Algorithms and Voronoi Tessellation

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Knudsen, Torben

    Classification is one of the major constituents of the data-mining toolkit. The well-known methods for classification are built on either the principle of logic or statistical/mathematical reasoning for classification. In this article we propose: (1) a different strategy, which is based on the po...

  14. Spatial uncertainty modeling of fuzzy information in images for pattern classification.

    Directory of Open Access Journals (Sweden)

    Tuan D Pham

    Full Text Available The modeling of the spatial distribution of image properties is important for many pattern recognition problems in science and engineering. Mathematical methods are needed to quantify the variability of this spatial distribution based on which a decision of classification can be made in an optimal sense. However, image properties are often subject to uncertainty due to both incomplete and imprecise information. This paper presents an integrated approach for estimating the spatial uncertainty of vagueness in images using the theory of geostatistics and the calculus of probability measures of fuzzy events. Such a model for the quantification of spatial uncertainty is utilized as a new image feature extraction method, based on which classifiers can be trained to perform the task of pattern recognition. Applications of the proposed algorithm to the classification of various types of image data suggest the usefulness of the proposed uncertainty modeling technique for texture feature extraction.

  15. Object-based Dimensionality Reduction in Land Surface Phenology Classification

    Directory of Open Access Journals (Sweden)

    Brian E. Bunker

    2016-11-01

    Full Text Available Unsupervised classification or clustering of multi-decadal land surface phenology provides a spatio-temporal synopsis of natural and agricultural vegetation response to environmental variability and anthropogenic activities. Notwithstanding the detailed temporal information available in calibrated bi-monthly normalized difference vegetation index (NDVI) and comparable time series, typical pre-classification workflows average a pixel’s bi-monthly index within the larger multi-decadal time series. While this process is one practical way to reduce the dimensionality of time series with many hundreds of image epochs, it effectively dampens temporal variation from both intra and inter-annual observations related to land surface phenology. Through a novel application of object-based segmentation aimed at spatial (not temporal) dimensionality reduction, all 294 image epochs from a Moderate Resolution Imaging Spectroradiometer (MODIS) bi-monthly NDVI time series covering the northern Fertile Crescent were retained (in homogenous landscape units) as unsupervised classification inputs. Given the inherent challenges of in situ or manual image interpretation of land surface phenology classes, a cluster validation approach based on transformed divergence enabled comparison between traditional and novel techniques. Improved intra-annual contrast was clearly manifest in rain-fed agriculture and inter-annual trajectories showed increased cluster cohesion, reducing the overall number of classes identified in the Fertile Crescent study area from 24 to 10. Given careful segmentation parameters, this spatial dimensionality reduction technique augments the value of unsupervised learning to generate homogeneous land surface phenology units. By combining recent scalable computational approaches to image segmentation, future work can pursue new global land surface phenology products based on the high temporal resolution signatures of vegetation index time series.
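
The core trick described above, reducing spatial rather than temporal dimensionality, amounts to averaging the full time series within each segment so every epoch survives. A toy numpy sketch (segment labels and NDVI stack are invented):

```python
import numpy as np

def segment_signatures(ndvi_stack, segments):
    """Average a (T, H, W) NDVI time series within each spatial segment,
    keeping all T epochs: spatial, not temporal, dimensionality reduction."""
    T = ndvi_stack.shape[0]
    ids = np.unique(segments)
    sig = np.zeros((len(ids), T))
    for i, s in enumerate(ids):
        mask = segments == s           # pixels belonging to this segment
        sig[i] = ndvi_stack[:, mask].mean(axis=1)
    return ids, sig

# Toy stack: 4 epochs over a 2x4 image split into two homogeneous segments
stack = np.arange(4 * 2 * 4, dtype=float).reshape(4, 2, 4)
segs = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1]])
ids, sig = segment_signatures(stack, segs)
```

Instead of one multi-decadal mean per pixel, the clustering input is now one full-length temporal signature per landscape unit, which is what preserves the intra- and inter-annual contrast the paper reports.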

  16. A Categorical Framework for Model Classification in the Geosciences

    Science.gov (United States)

    Hauhs, Michael; Trancón y Widemann, Baltasar; Lange, Holger

    2016-04-01

    Models have a mixed record of success in the geosciences. In meteorology, model development and implementation have been among the first and most successful examples of triggering computer technology in science. On the other hand, notorious problems such as the 'equifinality issue' in hydrology lead to a rather mixed reputation of models in other areas. The most successful models in geosciences are applications of dynamic systems theory to non-living systems or phenomena. Thus, we start from the hypothesis that the success of model applications relates to the influence of life on the phenomenon under study. We thus focus on the (formal) representation of life in models. The aim is to investigate whether disappointment in model performance is due to system properties such as heterogeneity and historicity of ecosystems, or rather reflects an abstraction and formalisation problem at a fundamental level. As a formal framework for this investigation, we use category theory as applied in computer science to specify behaviour at an interface. Its methods have been developed for translating and comparing formal structures among different application areas and seem highly suited for a classification of the current "model zoo" in the geosciences. The approach is rather abstract, with a high degree of generality but a low level of expressibility. Here, category theory will be employed to check the consistency of assumptions about life in different models. It will be shown that it is sufficient to distinguish just four logical cases to check for consistency of model content. All four cases can be formalised as variants of coalgebra-algebra homomorphisms. It can be demonstrated that transitions between the four variants affect the relevant observations (time series or spatial maps), the formalisms used (equations, decision trees) and the test criteria of success (prediction, classification) of the resulting model types. We will present examples from hydrology and ecology in

  17. New Adaptive Image Quality Assessment Based on Distortion Classification

    Directory of Open Access Journals (Sweden)

    Xin JIN

    2014-01-01

    Full Text Available This paper proposes a new adaptive image quality assessment (AIQA) method based on distortion classification. AIQA contains two parts: distortion classification and image quality assessment. First, we analyze characteristics of the original and distorted images, including the distribution of wavelet coefficients and the ratio of edge energy to inner energy of the differential image block, and divide distorted images into white-noise distortion, JPEG compression distortion and blur distortion. To evaluate the quality of the first two distortion types, we use a pixel-based structural similarity metric and a DCT-based structural similarity metric, respectively. For blurred pictures, we present a new wavelet-based structural similarity algorithm. According to the experimental results, AIQA takes advantage of the different structural similarity metrics and is able to simulate human visual perception effectively.
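
The classify-then-assess idea can be illustrated with one of the cues mentioned above: the ratio of edge (gradient) energy to total energy of a difference-image block, which separates noise-like from smooth, blur-like residuals. This is a simplified sketch; the threshold and routing names are illustrative, not taken from the paper.

```python
import numpy as np

def edge_energy_ratio(diff_block):
    """Ratio of gradient energy to total energy of a difference-image block:
    high for noise-like residuals, low for smooth/blur-like residuals."""
    d = diff_block.astype(float)
    gy, gx = np.gradient(d)
    return np.sum(gx ** 2 + gy ** 2) / (np.sum(d ** 2) + 1e-12)

def pick_metric(diff_block, threshold=0.5):
    """Toy dispatcher: route noise-like blocks to a pixel-based SSIM branch
    and smooth blocks to a wavelet-based branch (threshold is illustrative)."""
    if edge_energy_ratio(diff_block) > threshold:
        return "pixel_ssim"
    return "wavelet_ssim"

rng = np.random.default_rng(0)
noise_diff = rng.standard_normal((8, 8))        # white-noise-like residual
smooth_diff = np.tile(np.arange(8.0), (8, 1))   # smooth, blur-like residual
```

Dispatching per distortion type is what lets an adaptive assessor combine the strengths of several structural-similarity metrics.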

  18. DNA methylation-based classification of central nervous system tumours

    DEFF Research Database (Denmark)

    Capper, David; Jones, David T.W.; Sill, Martin

    2018-01-01

    Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging - with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show...

  19. A minimum spanning forest based classification method for dedicated breast CT images

    Energy Technology Data Exchange (ETDEWEB)

    Pike, Robert [Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia 30329 (United States); Sechopoulos, Ioannis [Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia 30329 and Winship Cancer Institute of Emory University, Atlanta, Georgia 30322 (United States); Fei, Baowei, E-mail: bfei@emory.edu [Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia 30329 (United States); Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, Georgia 30322 (United States); Department of Mathematics and Computer Science, Emory University, Atlanta, Georgia 30322 (United States); Winship Cancer Institute of Emory University, Atlanta, Georgia 30322 (United States)

    2015-11-15

    Purpose: To develop and test an automated algorithm to classify different types of tissue in dedicated breast CT images. Methods: Images of a single breast of five different patients were acquired with a dedicated breast CT clinical prototype. The breast CT images were processed by a multiscale bilateral filter to reduce noise while keeping edge information and were corrected to overcome cupping artifacts. As skin and glandular tissue have similar CT values on breast CT images, morphologic processing is used to identify the skin based on its position information. A support vector machine (SVM) is trained and the resulting model used to create a pixelwise classification map of fat and glandular tissue. By combining the results of the skin mask with the SVM results, the breast tissue is classified as skin, fat, and glandular tissue. This map is then used to identify markers for a minimum spanning forest that is grown to segment the image using spatial and intensity information. To evaluate the authors’ classification method, they use DICE overlap ratios to compare the results of the automated classification to those obtained by manual segmentation on five patient images. Results: Comparison between the automatic and the manual segmentation shows that the minimum spanning forest based classification method was able to successfully classify dedicated breast CT images with average DICE ratios of 96.9%, 89.8%, and 89.5% for fat, glandular, and skin tissue, respectively. Conclusions: A 2D minimum spanning forest based classification method was proposed and evaluated for classifying the fat, skin, and glandular tissue in dedicated breast CT images. The classification method can be used for dense breast tissue quantification, radiation dose assessment, and other applications in breast imaging.
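
The evaluation metric quoted above, the DICE overlap ratio, is simple to state and compute: twice the intersection of two binary masks divided by the sum of their sizes. A minimal numpy version with invented toy masks:

```python
import numpy as np

def dice(mask_a, mask_b):
    """DICE overlap ratio between two binary masks:
    2 * |A and B| / (|A| + |B|); 1.0 means perfect agreement."""
    a = np.asarray(mask_a, bool)
    b = np.asarray(mask_b, bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

# Toy automatic vs. manual segmentations of one tissue class
auto = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0]])
manual = np.array([[1, 1, 1, 0],
                   [1, 0, 0, 0]])
score = dice(auto, manual)
```

The per-tissue scores in the abstract (96.9% fat, 89.8% glandular, 89.5% skin) are averages of exactly this quantity over the five patient images.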

  20. Forest Classification Based on Forest texture in Northwest Yunnan Province

    Science.gov (United States)

    Wang, Jinliang; Gao, Yan; Wang, Xiaohua; Fu, Lei

    2014-03-01

    Forest texture is an intrinsic characteristic and an important visual feature of a forest ecological system. Full utilization of forest texture will be a great help in increasing the accuracy of forest classification based on remote sensed data. Taking Shangri-La as a study area, forest classification has been based on the texture. The results show that: (1) From the texture abundance, texture boundary, entropy as well as visual interpretation, the combination of Grayscale-gradient co-occurrence matrix and wavelet transformation is much better than either one of both ways of forest texture information extraction; (2) During the forest texture information extraction, the size of the texture-suitable window determined by the semi-variogram method depends on the forest type (evergreen broadleaf forest is 3×3, deciduous broadleaf forest is 5×5, etc.). (3) While classifying forest based on forest texture information, the texture factor assembly differs among forests: Variance, Heterogeneity, and Correlation should be selected when the window is between 3×3 and 5×5; Mean, Correlation, and Entropy should be used when the window is in the range of 7×7 to 19×19; and Correlation, Second Moment, and Variance should be used when the range is larger than 21×21.
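
Texture factors such as Entropy are derived from a gray-level co-occurrence matrix computed over a moving window. A minimal numpy sketch of a single-offset co-occurrence matrix and its entropy (the toy image is invented; the paper additionally uses gradient and wavelet information):

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    P = np.zeros((levels, levels))
    h, w = img.shape
    for i in range(h - dy):
        for j in range(w - dx):
            P[img[i, j], img[i + dy, j + dx]] += 1
    return P / P.sum()

def texture_entropy(P):
    """Entropy texture factor: -sum p * log2(p) over nonzero co-occurrences."""
    nz = P[P > 0]
    return -np.sum(nz * np.log2(nz))

# Toy 4-level image window with blocky texture
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
P = glcm(img, levels=4)
```

Recomputing such factors for window sizes from 3×3 up to 21×21 and beyond is what the semi-variogram-based window selection above optimizes per forest type.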

  1. Forest Classification Based on Forest texture in Northwest Yunnan Province

    International Nuclear Information System (INIS)

    Wang, Jinliang; Gao, Yan; Fu, Lei; Wang, Xiaohua

    2014-01-01

    Forest texture is an intrinsic characteristic and an important visual feature of a forest ecological system. Full utilization of forest texture will be a great help in increasing the accuracy of forest classification based on remote sensed data. Taking Shangri-La as a study area, forest classification has been based on the texture. The results show that: (1) From the texture abundance, texture boundary, entropy as well as visual interpretation, the combination of Grayscale-gradient co-occurrence matrix and wavelet transformation is much better than either one of both ways of forest texture information extraction; (2) During the forest texture information extraction, the size of the texture-suitable window determined by the semi-variogram method depends on the forest type (evergreen broadleaf forest is 3×3, deciduous broadleaf forest is 5×5, etc.). (3) While classifying forest based on forest texture information, the texture factor assembly differs among forests: Variance, Heterogeneity, and Correlation should be selected when the window is between 3×3 and 5×5; Mean, Correlation, and Entropy should be used when the window is in the range of 7×7 to 19×19; and Correlation, Second Moment, and Variance should be used when the range is larger than 21×21.

  2. Likelihood ratio model for classification of forensic evidence

    Energy Technology Data Exchange (ETDEWEB)

    Zadora, G., E-mail: gzadora@ies.krakow.pl [Institute of Forensic Research, Westerplatte 9, 31-033 Krakow (Poland); Neocleous, T., E-mail: tereza@stats.gla.ac.uk [University of Glasgow, Department of Statistics, 15 University Gardens, Glasgow G12 8QW (United Kingdom)

    2009-05-29

    One of the problems of analysis of forensic evidence such as glass fragments, is the determination of their use-type category, e.g. does a glass fragment originate from an unknown window or container? Very small glass fragments arise during various accidents and criminal offences, and could be carried on the clothes, shoes and hair of participants. It is therefore necessary to obtain information on their physicochemical composition in order to solve the classification problem. Scanning Electron Microscopy coupled with an Energy Dispersive X-ray Spectrometer and the Glass Refractive Index Measurement method are routinely used in many forensic institutes for the investigation of glass. A natural form of glass evidence evaluation for forensic purposes is the likelihood ratio, LR = p(E|H1)/p(E|H2). The main aim of this paper was to study the performance of LR models for glass object classification which considered one or two sources of data variability, i.e. between-glass-object variability and/or within-glass-object variability. Within the proposed model a multivariate kernel density approach was adopted for modelling the between-object distribution and a multivariate normal distribution was adopted for modelling within-object distributions. Moreover, a graphical method of estimating the dependence structure was employed to reduce the highly multivariate problem to several lower-dimensional problems. The performed analysis showed that the best likelihood model was the one which included information about both between- and within-object variability, with variables derived from elemental compositions measured by SEM-EDX, and refractive index values determined before (RIb) and after (RIa) the annealing process, in the form of dRI = log10|RIa - RIb|. This model gave better results than the model with only between-object variability considered.
    In addition, when dRI and variables derived from elemental compositions were used, this
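
The likelihood ratio LR = p(E|H1)/p(E|H2) can be sketched in a toy one-dimensional form: estimate the evidence density under each hypothesis's population with a Gaussian kernel density estimate and take the ratio. The dRI values and bandwidth below are invented for illustration; the paper's actual model is multivariate and also models within-object variability.

```python
import numpy as np

def kde_pdf(x, data, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    z = (x - data) / h
    return np.mean(np.exp(-0.5 * z ** 2)) / (h * np.sqrt(2 * np.pi))

def likelihood_ratio(e, pop_h1, pop_h2, h=0.5):
    """LR = p(E|H1)/p(E|H2): evidence density under the 'window' population
    versus the 'container' population (toy 1-D version)."""
    return kde_pdf(e, pop_h1, h) / kde_pdf(e, pop_h2, h)

windows = np.array([-4.0, -3.9, -4.1, -4.05])      # hypothetical dRI values
containers = np.array([-2.0, -2.1, -1.9, -2.05])
lr = likelihood_ratio(-4.0, windows, containers)
```

An LR well above 1 supports the window hypothesis for the recovered fragment; an LR well below 1 supports the container hypothesis.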

  3. Likelihood ratio model for classification of forensic evidence

    International Nuclear Information System (INIS)

    Zadora, G.; Neocleous, T.

    2009-01-01

    One of the problems of analysis of forensic evidence such as glass fragments, is the determination of their use-type category, e.g. does a glass fragment originate from an unknown window or container? Very small glass fragments arise during various accidents and criminal offences, and could be carried on the clothes, shoes and hair of participants. It is therefore necessary to obtain information on their physicochemical composition in order to solve the classification problem. Scanning Electron Microscopy coupled with an Energy Dispersive X-ray Spectrometer and the Glass Refractive Index Measurement method are routinely used in many forensic institutes for the investigation of glass. A natural form of glass evidence evaluation for forensic purposes is the likelihood ratio, LR = p(E|H1)/p(E|H2). The main aim of this paper was to study the performance of LR models for glass object classification which considered one or two sources of data variability, i.e. between-glass-object variability and/or within-glass-object variability. Within the proposed model a multivariate kernel density approach was adopted for modelling the between-object distribution and a multivariate normal distribution was adopted for modelling within-object distributions. Moreover, a graphical method of estimating the dependence structure was employed to reduce the highly multivariate problem to several lower-dimensional problems. The performed analysis showed that the best likelihood model was the one which included information about both between- and within-object variability, with variables derived from elemental compositions measured by SEM-EDX, and refractive index values determined before (RIb) and after (RIa) the annealing process, in the form of dRI = log10|RIa - RIb|. This model gave better results than the model with only between-object variability considered.
    In addition, when dRI and variables derived from elemental compositions were used, this model outperformed two other

  4. Dynamic logistic regression and dynamic model averaging for binary classification.

    Science.gov (United States)

    McCormick, Tyler H; Raftery, Adrian E; Madigan, David; Burd, Randall S

    2012-03-01

    We propose an online binary classification procedure for cases when there is uncertainty about the model to use and parameters within a model change over time. We account for model uncertainty through dynamic model averaging, a dynamic extension of Bayesian model averaging in which posterior model probabilities may also change with time. We apply a state-space model to the parameters of each model and we allow the data-generating model to change over time according to a Markov chain. Calibrating a "forgetting" factor accommodates different levels of change in the data-generating mechanism. We propose an algorithm that adjusts the level of forgetting in an online fashion using the posterior predictive distribution, and so accommodates various levels of change at different times. We apply our method to data from children with appendicitis who receive either a traditional (open) appendectomy or a laparoscopic procedure. Factors associated with which children receive a particular type of procedure changed substantially over the 7 years of data collection, a feature that is not captured using standard regression modeling. Because our procedure can be implemented completely online, future data collection for similar studies would require storing sensitive patient information only temporarily, reducing the risk of a breach of confidentiality. © 2011, The International Biometric Society.
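
The forgetting mechanism described above can be sketched as a recursive logistic-regression update in which the state covariance is inflated by the forgetting factor before each observation, so older data are progressively down-weighted and coefficients may drift. This is an illustrative simplification of the paper's state-space recursion, with simulated data (true slope 2):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_logistic_step(theta, P, x, y, forget=0.99):
    """One online update of logistic-regression state (mean theta, cov P).
    Dividing P by the forgetting factor inflates prior uncertainty, which
    down-weights old data and allows time-varying coefficients."""
    P = P / forget                        # forgetting: inflate prior covariance
    p = sigmoid(x @ theta)
    g = (y - p) * x                       # gradient of the log-likelihood
    H = p * (1 - p) * np.outer(x, x)      # observed Fisher information
    P = np.linalg.inv(np.linalg.inv(P) + H)
    theta = theta + P @ g
    return theta, P

rng = np.random.default_rng(0)
true_theta = np.array([0.0, 2.0])
theta, P = np.zeros(2), np.eye(2)
for _ in range(300):
    x = np.array([1.0, rng.normal()])                 # intercept + covariate
    y = float(rng.random() < sigmoid(x @ true_theta)) # Bernoulli outcome
    theta, P = online_logistic_step(theta, P, x, y)
```

Because each step uses only the current observation and the running (theta, P) state, raw records can be discarded after processing, which is the confidentiality advantage the authors note.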

  5. Selection of classification models from repository of model for water ...

    African Journals Online (AJOL)

    This paper proposes a new technique, Model Selection Technique (MST) for selection and ranking of models from the repository of models by combining three performance measures (Acc, TPR and TNR). This technique provides weightage to each performance measure to find the most suitable model from the repository of ...

  6. A model to facilitate implementation of the International Classification of Functioning, Disability and Health into prosthetics and orthotics.

    Science.gov (United States)

    Jarl, Gustav; Ramstrand, Nerrolyn

    2017-09-01

    The International Classification of Functioning, Disability and Health is a classification of human functioning and disability and is based on a biopsychosocial model of health. As such, the International Classification of Functioning, Disability and Health seems suitable as a basis for constructing models defining the clinical P&O process. The aim was to use the International Classification of Functioning, Disability and Health to facilitate development of such a model. Proposed model: The Prosthetic and Orthotic Process (POP) model is proposed. The Prosthetic and Orthotic Process model is based on the concepts of the International Classification of Functioning, Disability and Health and comprises four steps in a cycle: (1) Assessment, including the medical history and physical examination of the patient. (2) Goals, specified on four levels including those related to participation, activity, body functions and structures and technical requirements of the device. (3) Intervention, in which the appropriate course of action is determined based on the specified goal and evidence-based practice. (4) Evaluation of outcomes, where the outcomes are assessed and compared to the corresponding goals. After the evaluation of goal fulfilment, the first cycle in the process is complete, and a broad evaluation is now made including overriding questions about the patient's satisfaction with the outcomes and the process. This evaluation will determine if the process should be ended or if another cycle in the process should be initiated. The Prosthetic and Orthotic Process model can provide a common understanding of the P&O process. Concepts of the International Classification of Functioning, Disability and Health have been incorporated into the model to facilitate communication with other rehabilitation professionals and encourage a holistic and patient-centred approach in clinical practice.
Clinical relevance The Prosthetic and Orthotic Process model can support the implementation

  7. Interannual rainfall variability and SOM-based circulation classification

    Science.gov (United States)

    Wolski, Piotr; Jack, Christopher; Tadross, Mark; van Aardenne, Lisa; Lennard, Christopher

    2018-01-01

    Self-Organizing Maps (SOM) based classifications of synoptic circulation patterns are increasingly being used to interpret large-scale drivers of local climate variability, and as part of statistical downscaling methodologies. These applications rely on a basic premise of synoptic climatology, i.e. that local weather is conditioned by the large-scale circulation. While it is clear that this relationship holds in principle, the implications of its implementation through SOM-based classification, particularly at interannual and longer time scales, are not well recognized. Here we use a SOM to understand the interannual synoptic drivers of climate variability at two locations in the winter and summer rainfall regimes of South Africa. We quantify the portion of variance in seasonal rainfall totals that is explained by year to year differences in the synoptic circulation, as schematized by a SOM. We furthermore test how different spatial domain sizes and synoptic variables affect the ability of the SOM to capture the dominant synoptic drivers of interannual rainfall variability. Additionally, we identify systematic synoptic forcing that is not captured by the SOM classification. The results indicate that the frequency of synoptic states, as schematized by a relatively disaggregated SOM (7 × 9) of prognostic atmospheric variables, including specific humidity, air temperature and geostrophic winds, captures only 20-45% of interannual local rainfall variability, and that the residual variance contains a strong systematic component. Utilising a multivariate linear regression framework demonstrates that this residual variance can largely be explained using synoptic variables over a particular location; even though they are used in the development of the SOM their influence, however, diminishes with the size of the SOM spatial domain. 
The influence of the SOM domain size, the choice of SOM atmospheric variables and grid-point explanatory variables on the levels of explained
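
A minimal numpy Self-Organizing Map makes the schematization step above concrete: each grid node holds a prototype synoptic pattern, daily fields are assigned to their best-matching node, and the node frequencies per season become the candidate predictors of rainfall variability. The two-state toy "circulation" data below are invented:

```python
import numpy as np

def train_som(X, grid=(2, 2), iters=400, lr0=0.5, sigma0=1.0, seed=0):
    """Minimal SOM: competitive learning with a Gaussian grid neighborhood."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.normal(size=(rows * cols, X.shape[1]))  # node prototypes
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)])
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(1))       # best-matching unit
        lr = lr0 * (1 - t / iters)                   # decaying learning rate
        d2 = ((coords - coords[bmu]) ** 2).sum(1)
        h = np.exp(-d2 / (2 * sigma0 ** 2))          # neighborhood weights
        W += lr * h[:, None] * (x - W)
    return W

def node_frequencies(X, W):
    """Fraction of samples mapped to each SOM node."""
    bmu = np.argmin(((X[:, None, :] - W) ** 2).sum(-1), axis=1)
    return np.bincount(bmu, minlength=len(W)) / len(X)

# Toy daily fields: two distinct synoptic states, 50 days each
X = np.vstack([np.tile([5.0, 0.0], (50, 1)), np.tile([-5.0, 0.0], (50, 1))])
W = train_som(X)
freq = node_frequencies(X, W)
```

Regressing seasonal rainfall totals on such node frequencies is the step whose explained variance (20-45% in the study) the authors quantify.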

  8. Optimization models for cancer classification: extracting gene interaction information from microarray expression data.

    Science.gov (United States)

    Antonov, Alexey V; Tetko, Igor V; Mader, Michael T; Budczies, Jan; Mewes, Hans W

    2004-03-22

    Microarray data appear particularly useful to investigate mechanisms in cancer biology and represent one of the most powerful tools to uncover the genetic mechanisms causing loss of cell cycle control. Recently, several different methods to employ microarray data as a diagnostic tool in cancer classification have been proposed. These procedures take changes in the expression of particular genes into account but do not consider disruptions in certain gene interactions caused by the tumor. It is probable that some genes participating in tumor development do not change their expression level dramatically. Thus, they cannot be detected by simple classification approaches used previously. For these reasons, a classification procedure exploiting information related to changes in gene interactions is needed. We propose a MAximal MArgin Linear Programming (MAMA) method for the classification of tumor samples based on microarray data. This procedure detects groups of genes and constructs models (features) that strongly correlate with particular tumor types. The detected features include genes whose functional relations are changed for particular cancer types. The proposed method was tested on two publicly available datasets and demonstrated a prediction ability superior to previously employed classification schemes. The MAMA system was developed using the linear programming system LINDO http://www.lindo.com. A Perl script that specifies the optimization problem for this software is available upon request from the authors.
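
A maximal-margin classifier can indeed be posed as a linear program, which is the optimization core of the MAMA idea; the paper's full method additionally constructs features from gene-pair relations, which is not reproduced here. A generic sketch using scipy's linprog (maximize the margin m subject to y_i (w · x_i + b) >= m with box-bounded weights), on an invented separable toy dataset:

```python
import numpy as np
from scipy.optimize import linprog

def max_margin_lp(X, y):
    """Maximal-margin linear classifier as an LP:
    variables z = [w_1..w_d, b, m]; minimize -m subject to
    y_i (w . x_i + b) >= m and -1 <= w_j <= 1, m >= 0."""
    n, d = X.shape
    c = np.zeros(d + 2)
    c[-1] = -1.0                                     # maximize the margin m
    # y_i (w.x_i + b) >= m  <=>  -y_i x_i . w - y_i b + m <= 0
    A_ub = np.hstack([-y[:, None] * X, -y[:, None], np.ones((n, 1))])
    b_ub = np.zeros(n)
    bounds = [(-1, 1)] * d + [(None, None), (0, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:d], res.x[d], res.x[d + 1]         # w, b, margin

X = np.array([[1.0, 1.0], [1.0, 0.0], [-1.0, -1.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b, margin = max_margin_lp(X, y)
```

Unlike a standard SVM's quadratic program, the LP form stays linear, which is why off-the-shelf LP systems such as LINDO (as in the paper) can solve it at scale.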

  9. Construction of protein profile classification model and screening of proteomic signature of acute leukemia.

    Science.gov (United States)

    Xu, Yun; Zhuo, Jiacai; Duan, Yonggang; Shi, Benhang; Chen, Xuhong; Zhang, Xiaoli; Xiao, Liang; Lou, Jin; Huang, Ruihong; Zhang, Qiongli; Du, Xin; Li, Ming; Wang, Daping; Shi, Dunyun

    2014-01-01

    The French-American-British (FAB) and WHO classifications provide important guidelines for the diagnosis, treatment, and prognostic prediction of acute leukemia, but are incapable of accurately differentiating all subtypes, and not well correlated with the clinical outcomes. In this study, we performed the protein profiling of the bone marrow mononuclear cells from patients with acute leukemia and healthy volunteers (control) by surface enhanced laser desorption/ionization-time of flight mass spectrometry (SELDI-TOF-MS). The profiles of the patients with acute leukemia were grouped into acute promyelocytic leukemia (APL), acute myeloid leukemia-granulocytic (AML-Gran), acute myeloid leukemia-monocytic (AML-Mon), acute lymphocytic leukemia (ALL), and control. Based on 109 proteomic signatures, the classification models of acute leukemia were constructed to screen the predictors by the improvement of the proteomic signatures and to detect their expression characteristics. According to the improvement and the expression characteristics of the predictors, the proteomic signatures (M3829, M1593, M2121, M2536, M1016) that characterized each group in turn (CON, APL, AML-Gran, AML-Mon, ALL) were screened as target molecules for identification. Meanwhile, the proteomic-based class of determinant samples could be made by the classification models. The credibility of the proteomic-based classification passed the evaluation of Biomarker Patterns Software 5.0 (BPS 5.0) scoring and was validated in clinical practice. The results suggested that the proteomic signatures characterized by different blasts have potential for developing new treatment and monitoring approaches for leukemia blasts. Moreover, the classification model has potential to serve as a new diagnostic approach for leukemia.

  10. Learning Supervised Topic Models for Classification and Regression from Crowds

    DEFF Research Database (Denmark)

    Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete

    2017-01-01

    The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deems learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages...

  11. New Classification Method Based on Support-Significant Association Rules Algorithm

    Science.gov (United States)

    Li, Guoxin; Shi, Wen

    One of the most well-studied problems in data mining is mining for association rules. Research has also introduced association rule mining methods to conduct classification tasks, and such classification methods based on association rule mining can be applied to customer segmentation. Currently, most association rule mining methods are based on a support-confidence structure, where rules satisfying both minimum support and minimum confidence are returned as strong association rules to the analyzer. However, this type of association rule mining method lacks a rigorous statistical guarantee and can even be misleading. A new classification model for customer segmentation, based on an association rule mining algorithm, is proposed in this paper. The new model is based on the support-significant association rule mining method, in which the confidence measure for an association rule is replaced by the rule's statistical significance, a better evaluation standard for association rules. An experiment on customer segmentation data from the UCI repository indicated the effectiveness of this new model.
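
The support-significance idea can be sketched with plain Python: for a rule A -> B, compute its support and a chi-square statistic on the 2x2 contingency table of A versus B, instead of a bare confidence value. The transaction data below are invented; a common significance threshold is chi-square > 3.84 (p < 0.05, 1 degree of freedom):

```python
def rule_stats(transactions, antecedent, consequent):
    """Support and chi-square significance of the rule A -> B.
    Testing significance guards against rules that look confident
    only because B is frequent overall."""
    n = len(transactions)
    a = sum(antecedent <= t and consequent <= t for t in transactions)
    b = sum(antecedent <= t and not consequent <= t for t in transactions)
    c = sum(not antecedent <= t and consequent <= t for t in transactions)
    d = n - a - b - c
    support = a / n
    # chi-square with 1 degree of freedom on the 2x2 contingency table
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = num / den if den else 0.0
    return support, chi2

# Toy transactions: "x" and "y" co-occur far more than chance
T = [{"x", "y"}] * 40 + [{"x"}] * 10 + [{"y"}] * 10 + [{"z"}] * 40
sup, chi2 = rule_stats(T, {"x"}, {"y"})
```

A rule passing both a minimum-support and a significance threshold is returned as a strong rule, which is the substitution the paper proposes for the confidence criterion.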

  12. Fuzzy logic based classification and assessment of pathological voice signals.

    Science.gov (United States)

    Aghazadeh, Babak Seyed; Heris, Hossein Khadivi

    2009-01-01

    In this paper an efficient fuzzy wavelet packet (WP) based feature extraction method and a fuzzy logic based disorder assessment technique were used to investigate voice signals of patients suffering from unilateral vocal fold paralysis (UVFP). A mother wavelet function of tenth-order Daubechies (d10) was employed to decompose signals into 5 levels. Next, WP coefficients were used to measure energy and Shannon entropy features at different spectral sub-bands. Subsequently, using the fuzzy c-means method, signals were clustered into 2 classes. The amount of fuzzy membership of pathological and normal signals in their corresponding clusters was considered as a measure to quantify the discrimination ability of features. A classification accuracy of 100 percent was achieved using an artificial neural network classifier. Finally, the fuzzy c-means clustering method was used as a way of assessing voice pathology. Accordingly, a fuzzy membership function based health index is proposed.
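
The energy and Shannon entropy features per wavelet-packet sub-band can be sketched with a Haar filter standing in for the paper's tenth-order Daubechies wavelet (Haar keeps the example dependency-free; it is also orthonormal, so sub-band energies sum to the signal energy). The toy tone below is invented:

```python
import numpy as np

def haar_packet(signal, levels):
    """Wavelet packet decomposition with a Haar filter: unlike the plain
    wavelet transform, every sub-band (approximation and detail) is split
    again at each level, giving 2**levels bands."""
    bands = [np.asarray(signal, dtype=float)]
    for _ in range(levels):
        nxt = []
        for band in bands:
            approx = (band[0::2] + band[1::2]) / np.sqrt(2)
            detail = (band[0::2] - band[1::2]) / np.sqrt(2)
            nxt += [approx, detail]
        bands = nxt
    return bands

def band_features(bands):
    """(energy, Shannon entropy) per sub-band, as in the feature extractor."""
    feats = []
    for band in bands:
        e = np.sum(band ** 2)
        p = band ** 2 / e if e > 0 else np.ones_like(band) / len(band)
        p = p[p > 0]
        feats.append((e, -np.sum(p * np.log2(p))))
    return feats

sig = np.sin(2 * np.pi * np.arange(32) / 8.0)   # toy periodic "voice" signal
bands = haar_packet(sig, levels=2)
feats = band_features(bands)
```

These per-band (energy, entropy) pairs form the feature vector that fuzzy c-means then clusters into normal and pathological classes.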

  13. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning.

    Science.gov (United States)

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-03-15

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.
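
The scatter, tubular, and planar shape features mentioned above are commonly derived from the eigenvalues of the covariance of the points in a voxel's support region. A numpy sketch of one such eigenvalue-based feature vector (the exact feature definitions in the paper may differ; the point sets are invented):

```python
import numpy as np

def shape_saliencies(points):
    """Eigenvalue-based shape descriptors for one voxel's support region:
    with covariance eigenvalues l1 >= l2 >= l3, the linear (tubular),
    planar and scatter saliencies (l1-l2)/l1, (l2-l3)/l1, l3/l1 sum to 1."""
    cov = np.cov(np.asarray(points, float).T)
    l = np.sort(np.linalg.eigvalsh(cov))[::-1]
    l = np.clip(l, 0.0, None)             # guard tiny negative round-off
    l1 = l[0] + 1e-12
    return (l[0] - l[1]) / l1, (l[1] - l[2]) / l1, l[2] / l1

t = np.linspace(0.0, 1.0, 50)
line = np.column_stack([t, np.zeros(50), np.zeros(50)])            # tubular
gx, gy = np.meshgrid(t, t)
plane = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(2500)])  # planar
lin_line, _, _ = shape_saliencies(line)
lin_plane, pla_plane, _ = shape_saliencies(plane)
```

Feeding one such low-dimensional vector per voxel, rather than per point, is what keeps both training and online classification cheap in the voxel-based framework.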

  14. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning

    Directory of Open Access Journals (Sweden)

    Victoria Plaza-Leiva

    2017-03-01

    Full Text Available Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.

  15. A drone detection with aircraft classification based on a camera array

    Science.gov (United States)

    Liu, Hao; Qu, Fangchao; Liu, Yingjian; Zhao, Wei; Chen, Yitong

    2018-03-01

    In recent years, the rapid popularity of drones has led many people to operate them, bringing a range of security issues to sensitive areas such as airports and military sites. Fine-grained classification, providing fast and accurate detection of different drone models, is one of the important ways to address these problems. The main challenges of fine-grained classification are that: (1) there are many types of drones, and the models are complex and diverse; and (2) recognition must be fast and accurate, whereas existing methods are not efficient. In this paper, we propose a fine-grained drone detection system based on a high-resolution camera array. The system can quickly and accurately recognize fine-grained drone models based on HD cameras.

  16. Multiple kernel boosting framework based on information measure for classification

    International Nuclear Information System (INIS)

    Qi, Chengming; Wang, Yuping; Tian, Wenjie; Wang, Qun

    2016-01-01

    The performance of kernel-based methods, such as the support vector machine (SVM), is greatly affected by the choice of kernel function. Multiple kernel learning (MKL) is a promising family of machine learning algorithms and has attracted much attention in recent years. MKL combines multiple sub-kernels to seek better results than single kernel learning. In order to improve the efficiency of SVM and MKL, in this paper the Kullback–Leibler kernel function is derived to develop SVM. The proposed method employs an improved ensemble learning framework, named KLMKB, which applies AdaBoost to learning multiple kernel-based classifiers. In the experiment on hyperspectral remote sensing image classification, features selected through the Optimum Index Factor (OIF) are employed to classify the satellite image. We extensively examine the performance of our approach in comparison to some relevant and state-of-the-art algorithms on a number of benchmark classification data sets and a hyperspectral remote sensing image data set. Experimental results show that our method has stable behavior and noticeable accuracy across different data sets.
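
    The core MKL idea, a convex combination of sub-kernel Gram matrices, can be sketched as follows. This is a generic illustration with RBF sub-kernels and a simple kernel nearest-centroid rule, not the paper's KLMKB algorithm or its Kullback–Leibler kernel; all parameters are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian RBF Gram matrix between row sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def combined_kernel(X, Y, gammas, betas):
    """Weighted sum of sub-kernels: the core idea behind MKL."""
    betas = np.asarray(betas, float)
    betas = betas / betas.sum()                 # convex combination
    return sum(b * rbf_kernel(X, Y, g) for b, g in zip(betas, gammas))

def kernel_nearest_centroid(K_train, y_train, K_test):
    """Classify test points by higher mean kernel similarity to each class."""
    classes = np.unique(y_train)
    scores = np.stack([K_test[:, y_train == c].mean(axis=1) for c in classes],
                      axis=1)
    return classes[scores.argmax(axis=1)]

X_tr = np.array([[0.0], [0.2], [1.0], [1.2]])
y_tr = np.array([0, 0, 1, 1])
X_te = np.array([[0.1], [1.1]])
K_tr = combined_kernel(X_tr, X_tr, gammas=[0.5, 5.0], betas=[0.5, 0.5])
K_te = combined_kernel(X_te, X_tr, gammas=[0.5, 5.0], betas=[0.5, 0.5])
pred = kernel_nearest_centroid(K_tr, y_tr, K_te)
```

    In a boosting-based MKL scheme such as the one described, the weights of the sub-kernels (or of kernel-based weak classifiers) would be adapted round by round rather than fixed as here.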

  17. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCDs in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., the Sloan Digital Sky Survey (SDSS), the Two-degree-Field Galaxy Redshift Survey (2dF), the Spectroscopic Survey Telescope (SST), the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and the Large Synoptic Survey Telescope (LSST) program, etc.), celestial observational data are accumulating at a torrential rate. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far from Earth, and their spectra are usually contaminated by various kinds of noise. Recognizing these two types of spectra is therefore a typical problem in automatic spectra classification. Furthermore, the nearest neighbor method is one of the most typical, classic, and mature algorithms in pattern recognition and data mining, and is often used as a benchmark when developing novel algorithms. Regarding practical applicability, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the superiority of NN is that it does not need to be trained, which is useful for incremental learning and parallel computation in mass spectral data processing. In conclusion, the results of this work are helpful for studying galaxy and quasar spectra classification.
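
    The training-free character of the nearest neighbor method is easy to see in code: classification is a single distance computation against the labeled set, with no fitted parameters. A minimal 1-NN sketch on toy "spectra" (the flux vectors here are hypothetical stand-ins for real galaxy and quasar templates):

```python
import numpy as np

def nearest_neighbor_classify(train_X, train_y, test_X):
    """1-NN: no training phase; each test spectrum takes the label of the
    closest training spectrum (Euclidean distance on flux vectors)."""
    d = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[d.argmin(axis=1)]

# Toy "spectra": class 0 rises with wavelength, class 1 falls
train_X = np.array([[0.0, 0.5, 1.0], [0.1, 0.6, 1.1],
                    [1.0, 0.5, 0.0], [1.1, 0.6, 0.1]])
train_y = np.array([0, 0, 1, 1])
test_X = np.array([[0.05, 0.55, 1.05], [1.05, 0.55, 0.05]])
pred = nearest_neighbor_classify(train_X, train_y, test_X)
```

    Because there is nothing to retrain, new labeled spectra can simply be appended to `train_X`, which is the incremental-learning property the abstract highlights.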

  18. Drunk driving detection based on classification of multivariate time series.

    Science.gov (United States)

    Li, Zhenlong; Jin, Xue; Zhao, Xiaohua

    2015-09-01

    This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent the multivariate time series, and a bottom-up algorithm was then employed to segment them. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify the driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is thus feasible and effective, and the approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
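
    The piecewise linear representation with bottom-up segmentation described above can be sketched as follows. This is a generic univariate version (the paper applies it to multivariate driving signals); the toy trace and the `max_error` threshold are hypothetical.

```python
import numpy as np

def fit_error(t, y):
    """Sum of squared residuals of a least-squares line through (t, y)."""
    slope, intercept = np.polyfit(t, y, 1)
    return float(((y - (slope * t + intercept)) ** 2).sum())

def bottom_up_segments(t, y, max_error=0.05):
    """Bottom-up piecewise linear representation: start from the finest
    two-point segments and greedily merge adjacent ones while the merge
    cost stays below max_error. Segments are (start, end) index pairs."""
    segs = [[i, i + 1] for i in range(0, len(t) - 1, 2)]
    while len(segs) > 1:
        costs = [fit_error(t[a[0]:b[1] + 1], y[a[0]:b[1] + 1])
                 for a, b in zip(segs, segs[1:])]
        i = int(np.argmin(costs))
        if costs[i] > max_error:
            break
        segs[i][1] = segs[i + 1][1]
        del segs[i + 1]
    return [tuple(s) for s in segs]

# Toy lateral-position trace: rising, then flat
t = np.arange(8.0)
y = np.array([0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0])
segs = bottom_up_segments(t, y)
# Slope and duration of each segment are the classification features
features = [(np.polyfit(t[a:b + 1], y[a:b + 1], 1)[0], t[b] - t[a])
            for a, b in segs]
```

    The resulting (slope, interval) pairs per segment would then be fed to the SVM, as in the paper's pipeline.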

  19. Development of a definition, classification system, and model for cultural geology

    Science.gov (United States)

    Mitchell, Lloyd W., III

    The concept for this study is based upon a personal interest by the author, an American Indian, in promoting cultural perspectives in undergraduate college teaching and learning environments. Most academicians recognize that merged fields can enhance undergraduate curricula. However, conflict may occur when instructors attempt to merge social science fields such as history or philosophy with geoscience fields such as mining and geomorphology. For example, ideologies of Earth structures derived from scientific methodologies may conflict with historical and spiritual understandings of Earth structures held by American Indians. Specifically, this study addresses the problem of how to combine cultural studies with the geosciences into a new merged academic discipline called cultural geology. This study further attempts to develop the merged field of cultural geology using an approach consisting of three research foci: a definition, a classification system, and a model. Literature reviews were conducted for all three foci. Additionally, to better understand merged fields, a literature review was conducted specifically for academic fields that merged social and physical sciences. Methodologies concentrated on the three research foci: definition, classification system, and model. The definition was derived via a two-step process. The first step, developing keyword hierarchical ranking structures, was followed by creating and analyzing semantic word meaning lists. The classification system was developed by reviewing 102 classification systems and incorporating selected components into a system framework. The cultural geology model was created also utilizing a two-step process. A literature review of scientific models was conducted. Then, the definition and classification system were incorporated into a model felt to reflect the realm of cultural geology. A course syllabus was then developed that incorporated the resulting definition, classification system, and model. 

  20. Application of boosting classification and regression to modeling the relationships between trace elements and diseases.

    Science.gov (United States)

    Tan, Chao; Chen, Hui; Zhu, Wanping

    2010-05-01

    Studies of the relationship between trace elements and diseases often need to build a classification or regression model, and the accuracy of such a model is of particular importance, directly deciding its applicability. The goal of this study is to explore the feasibility of applying boosting, a relatively new strategy from machine learning, to model the relationship between trace elements and diseases. Two examples are employed to illustrate the technique in classification and regression applications, respectively. The first example involves the diagnosis of anorexia according to the concentrations of six elements (i.e., a classification task). A decision stump and a support vector machine are used as the weak/base algorithm and the reference algorithm, respectively. The second example involves the prediction of breast cancer mortality based on the intake of trace elements (i.e., a regression task). In this case, partial least squares is used both as the weak/base algorithm and as the reference algorithm. The results from both examples confirm the potential of boosting for modeling the relationship between trace elements and diseases.
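
    Boosting with decision stumps, as in the first example, can be sketched in a few dozen lines. This is the standard binary AdaBoost recipe, not the authors' exact setup; the toy 1-D data stand in for element-concentration features.

```python
import numpy as np

def train_stump(X, y, w):
    """Best 1-D threshold classifier (decision stump) under sample weights w."""
    best = (0, 0.0, 1, np.inf)                 # (feature, thresh, polarity, err)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, y, n_rounds=5):
    """AdaBoost: reweight samples toward previous mistakes, combine stumps."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        j, t, pol, err = train_stump(X, y, w)
        err = max(err, 1e-12)                  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # stump weight
        pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)         # up-weight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, j, t, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, p in ensemble)
    return np.sign(score)

# Toy separable data: labels in {-1, +1}
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
ensemble = adaboost(X, y)
```

    For the regression example, the analogous idea (gradient or LS-boosting with a weak partial least squares learner) replaces the stump and the exponential reweighting.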

  1. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models

    KAUST Repository

    Magana-Mora, Arturo

    2017-06-14

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  2. When Low Rank Representation Based Hyperspectral Imagery Classification Meets Segmented Stacked Denoising Auto-Encoder Based Spatial-Spectral Feature

    Directory of Open Access Journals (Sweden)

    Cong Wang

    2018-02-01

    Full Text Available When confronted with limited labelled samples, most studies adopt an unsupervised feature learning scheme and incorporate the extracted features into a traditional classifier (e.g., a support vector machine, SVM) to deal with hyperspectral imagery classification. However, these methods have limitations in generalizing well in challenging cases due to the limited representative capacity of the shallow feature learning model, as well as the insufficient robustness of a classifier that depends only on the supervision of labelled samples. To address these two problems simultaneously, we present an effective low-rank representation-based classification framework for hyperspectral imagery. In particular, a novel unsupervised segmented stacked denoising auto-encoder-based feature learning model is proposed to depict the spatial-spectral characteristics of each pixel in the imagery with a deep hierarchical structure. With the extracted features, a low-rank representation based robust classifier is then developed which takes advantage of both the supervision provided by labelled samples and the unsupervised correlations (e.g., intra-class similarity and inter-class dissimilarity) among the unlabelled samples. Both the deep unsupervised feature learning and the robust classifier contribute to improving the classification accuracy with limited labelled samples. Extensive experiments on hyperspectral imagery classification demonstrate the effectiveness of the proposed framework.

  3. A CNN Based Approach for Garments Texture Design Classification

    Directory of Open Access Journals (Sweden)

    S.M. Sofiqul Islam

    2017-05-01

    Full Text Available Identifying garment texture designs automatically for recommending fashion trends is important nowadays because of the rapid growth of online shopping. By learning the properties of images efficiently, a machine can achieve better classification accuracy. Several hand-engineered feature codings exist for identifying garment design classes. Recently, deep convolutional neural networks (CNNs) have shown better performance on various object recognition tasks. A deep CNN uses multiple levels of representation and abstraction that help a machine understand the data more accurately. In this paper, a CNN model for identifying garment design classes has been proposed. Experimental results on two different datasets show better results than two existing well-known CNN models (AlexNet and VGGNet) and some state-of-the-art hand-engineered feature extraction methods.

  4. Toward the classification of the realistic free fermionic models

    International Nuclear Information System (INIS)

    Faraggi, A.E.

    1997-08-01

    The realistic free fermionic models have had remarkable success in providing plausible explanations for various properties of the Standard Model, including the natural appearance of three generations, the explanation of the heavy top quark mass and the qualitative structure of the fermion mass spectrum in general, the stability of the proton, and more. These intriguing achievements make evident the need to understand the general space of these models. While the number of possibilities is large, general patterns can be extracted. In this paper the author presents a detailed discussion of the construction of the realistic free fermionic models with the aim of providing some insight into the basic structures and building blocks that enter the construction. The role of free phases in the determination of the phenomenology of the models is discussed in detail. The author discusses the connection between the free phases and mirror symmetry in (2,2) models and the corresponding symmetries in the case of (2,0) models. The importance of the free phases in determining the effective low energy phenomenology is illustrated in several examples. The classification of the models in terms of boundary condition selection rules, real world-sheet fermion pairings, exotic matter states and the hidden sector is discussed.

  5. Classification of the Pospiviroidae based on their structural hallmarks.

    Directory of Open Access Journals (Sweden)

    Tamara Giguère

    Full Text Available The simplest known plant pathogens are the viroids. Because of their non-coding single-stranded circular RNA genome, they depend on both their sequence and their structure for a successful infection and for their replication. In recent years, important progress in the elucidation of their structures was achieved using an adaptation of the selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) protocol in order to probe viroid structures in solution. Previously, SHAPE had been adapted to elucidate the structures of all of the members of the family Avsunviroidae, as well as those of a few members of the family Pospiviroidae. In this study, with the goal of providing a complete compendium of the secondary structures of the various viroid species, a total of thirteen new Pospiviroidae members were probed in solution using the SHAPE protocol. More specifically, the secondary structures of eleven species for which the genus was previously known were initially elucidated. At this point, considering all of the SHAPE-elucidated secondary structures, a classification system for placing viroids in their respective genera was proposed. On the basis of the structural classification reported here, the probings of both the Grapevine latent viroid and the Dahlia latent viroid provide sound arguments for the determination of their respective genera, which appear to be Apscaviroid and Hostuviroid, respectively. More importantly, this study provides the complete repertoire of the secondary structures, mapped in solution, of all of the accepted viroid species reported thus far. In addition, a classification scheme based on structural hallmarks, an important tool for many biological studies, is proposed.

  6. Keratoconus: Classification scheme based on videokeratography and clinical signs

    Science.gov (United States)

    Li, Xiaohui; Yang, Huiying; Rabinowitz, Yaron S.

    2013-01-01

    PURPOSE To determine in a longitudinal study whether there is a correlation between videokeratography and clinical signs of keratoconus that might be useful to practicing clinicians. SETTING Cornea-Genetic Eye Institute, Cedars-Sinai Medical Center, Los Angeles, California, USA. METHODS Eyes grouped as keratoconus, early keratoconus, keratoconus suspect, or normal based on clinical signs and videokeratography were examined at baseline and followed for 1 to 8 years. Differences in quantitative videokeratography indices and the progression rate were evaluated. The quantitative indices were central keratometry (K), the inferior–superior (I–S) value, and the keratoconus percentage index (KISA). Discriminant analysis was used to estimate the classification rate using the indices. RESULTS There were significant differences at baseline between the normal, keratoconus-suspect, and early keratoconus groups in all indices; the respective means were central K: 44.17 D, 45.13 D, and 45.97 D; I–S: 0.57, 1.20, and 4.44; log(KISA): 2.49, 2.94, and 5.71 (all P values statistically significant). A portion of the keratoconus-suspect group progressed to early keratoconus or keratoconus, and 75% of the early keratoconus group progressed to keratoconus. Using all 3 indices and age, 86.9% in the normal group, 75.3% in the early keratoconus group, and 44.6% in the keratoconus-suspect group could be classified, yielding a total classification rate of 68.9%. CONCLUSIONS Cross-sectional and longitudinal data showed significant differences between groups in the 3 indices. Use of this classification scheme might form a basis for detecting subclinical keratoconus. PMID:19683159

  7. Cluster Validity Classification Approaches Based on Geometric Probability and Application in the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    LI Jian-Wei

    2014-08-01

    Full Text Available On the basis of the cluster validity function based on geometric probability in the literature [1, 2], we propose a cluster analysis method based on geometric probability for processing large amounts of data in a rectangular area. The basic idea is top-down stepwise refinement: first into categories, then into subcategories. At all clustering levels, the cluster validity function based on geometric probability is used first to determine the clusters and the gathering direction, and then the cluster centers and cluster borders are determined. Through TM remote sensing image classification examples, the method is compared with the supervised and unsupervised classification in ERDAS and with the cluster analysis method based on geometric probability in a two-dimensional square proposed in the literature [2]. The results show that the proposed method can significantly improve the classification accuracy.

  8. Kernel-based machine learning techniques for infrasound signal classification

    Science.gov (United States)

    Tuma, Matthias; Igel, Christian; Mialle, Pierrick

    2014-05-01

    Infrasound monitoring is one of four remote sensing technologies continuously employed by the CTBTO Preparatory Commission. The CTBTO's infrasound network is designed to monitor the Earth for potential evidence of atmospheric or shallow underground nuclear explosions. Upon completion, it will comprise 60 infrasound array stations distributed around the globe, of which 47 were certified in January 2014. Three stages can be identified in CTBTO infrasound data processing: automated processing at the level of single array stations, automated processing at the level of the overall global network, and interactive review by human analysts. At station level, the cross-correlation-based PMCC algorithm is used for initial detection of coherent wavefronts. It produces estimates of the trace velocity and azimuth of incoming wavefronts, as well as other descriptive features characterizing a signal. Detected arrivals are then categorized into potentially treaty-relevant versus noise-type signals by a rule-based expert system. This corresponds to a binary classification task at the level of station processing. In addition, incoming signals may be grouped according to their travel path in the atmosphere. The present work investigates automatic classification of infrasound arrivals by kernel-based pattern recognition methods. It aims to explore the potential of state-of-the-art machine learning methods vis-à-vis the current rule-based and task-tailored expert system. To this end, we first address the compilation of a representative, labeled reference benchmark dataset as a prerequisite for both classifier training and evaluation. Data representation is based on features extracted by the CTBTO's PMCC algorithm. As classifiers, we employ support vector machines (SVMs) in a supervised learning setting. Different SVM kernel functions are used and adapted through different hyperparameter optimization routines. The resulting performance is compared to several baseline classifiers.

  9. Faller Classification in Older Adults Using Wearable Sensors Based on Turn and Straight-Walking Accelerometer-Based Features.

    Science.gov (United States)

    Drover, Dylan; Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D

    2017-06-07

    Faller classification in elderly populations can facilitate preventative care before a fall occurs. A novel wearable-sensor-based faller classification method for the elderly was developed using accelerometer-based features from straight walking and turns. Seventy-six older individuals (74.15 ± 7.0 years), categorized as prospective fallers and non-fallers, completed a six-minute walk test with accelerometers attached to their lower legs and pelvis. After segmenting straight and turn sections, cross-validation tests were conducted on straight and turn walking features to assess classification performance. The best "classifier model-feature selector" combination used turn data, a random forest classifier, and the select-5-best feature selector (73.4% accuracy, 60.5% sensitivity, 82.0% specificity, and 0.44 Matthew's Correlation Coefficient (MCC)). Using only the most frequently occurring features, a feature subset (minimum of anterior-posterior ratio of even/odd harmonics for right shank, standard deviation (SD) of anterior left shank acceleration SD, SD of mean anterior left shank acceleration, maximum of medial-lateral first quartile of Fourier transform (FQFFT) for lower back, maximum of anterior-posterior FQFFT for lower back) achieved better classification results, with 77.3% accuracy, 66.1% sensitivity, 84.7% specificity, and a 0.52 MCC score. All classification performance metrics improved when turn data were used for faller classification, compared to straight walking data. Combining turn and straight walking features decreased performance metrics compared to turn features alone for similar classifier model-feature selector combinations.
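
    A "select-k-best" style feature selector, as used in the combinations above, can be sketched with a simple univariate separation score. This generic NumPy sketch is not the paper's exact scoring function; the gait-feature matrix is a hypothetical stand-in.

```python
import numpy as np

def select_k_best(X, y, k):
    """Rank features by a simple two-class separation score
    (squared mean difference over pooled variance) and keep the top k."""
    a, b = X[y == 0], X[y == 1]
    score = (a.mean(0) - b.mean(0)) ** 2 / (a.var(0) + b.var(0) + 1e-12)
    return np.argsort(score)[::-1][:k]

# Hypothetical gait features: only column 2 separates fallers from non-fallers
rng = np.random.default_rng(3)
X = rng.normal(size=(40, 5))
y = np.r_[np.zeros(20, int), np.ones(20, int)]
X[y == 1, 2] += 3.0                    # inject class separation in feature 2
top = select_k_best(X, y, k=1)
```

    The selected indices would then feed a classifier (e.g., a random forest) inside each cross-validation fold, so that selection never sees the test fold.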

  10. EEG-based classification of bilingual unspoken speech using ANN.

    Science.gov (United States)

    Balaji, Advait; Haldar, Aparajita; Patil, Keshav; Ruthvik, T Sai; Ca, Valliappan; Jartarkar, Mayur; Baths, Veeky

    2017-07-01

    The ability to interpret unspoken or imagined speech through electroencephalography (EEG) is of therapeutic interest for people suffering from speech disorders and `locked-in' syndrome. It is also useful for brain-computer interface (BCI) techniques not involving articulatory actions. Previous work has involved using particular words in one chosen language and training classifiers to distinguish between them. Such studies have reported accuracies of 40-60% and are not ideal for practical implementation. Furthermore, in today's multilingual society, classifiers trained in one language alone might not always have the desired effect. To address this, we present a novel approach to improve the accuracy of the current model by combining bilingual interpretation and decision making. We collect data from 5 subjects with Hindi and English as primary and secondary languages, respectively, and ask them 20 `Yes'/`No' questions (`Haan'/`Na' in Hindi) in each language. We choose sensors present in regions important to both language processing and decision making. The data are preprocessed, and Principal Component Analysis (PCA) is carried out to reduce dimensionality. This is input to Support Vector Machine (SVM), Random Forest (RF), AdaBoost (AB), and Artificial Neural Network (ANN) classifiers for prediction. Experimental results reveal best accuracies of 85.20% and 92.18% for decision and language classification, respectively, using the ANN. The overall accuracy of bilingual speech classification is 75.38%.
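
    The PCA dimensionality-reduction step described above can be sketched via an SVD of the centered data matrix. The "EEG feature" matrix below is a hypothetical stand-in; this is generic PCA, not the authors' pipeline.

```python
import numpy as np

def pca_reduce(X, k):
    """Project X onto its top-k principal components (SVD on centered data)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k], mu           # scores, components, mean

# Hypothetical "EEG feature" matrix: 6 trials x 4 channels, with most
# variance concentrated along a single direction
rng = np.random.default_rng(1)
base = rng.normal(size=(6, 1)) @ np.array([[1.0, 0.8, 0.6, 0.4]])
X = base + 0.01 * rng.normal(size=(6, 4))
Z, components, mu = pca_reduce(X, k=2)
```

    The reduced scores `Z` (rather than raw channel values) would then be the input to SVM/RF/AB/ANN classifiers; new trials are projected with the stored `components` and `mu`.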

  11. Optimal Non-Invasive Fault Classification Model for Packaged Ceramic Tile Quality Monitoring Using MMW Imaging

    Science.gov (United States)

    Agarwal, Smriti; Singh, Dharmendra

    2016-04-01

    Millimeter wave (MMW) frequency has emerged as an efficient tool for different stand-off imaging applications. In this paper, we have dealt with a novel MMW imaging application, i.e., non-invasive packaged goods quality estimation for industrial quality monitoring applications. An active MMW imaging radar operating at 60 GHz has been ingeniously designed for concealed fault estimation. Ceramic tiles covered with commonly used packaging cardboard were used as concealed targets for undercover fault classification. A comparison of computer vision-based state-of-the-art feature extraction techniques, viz., discrete Fourier transform (DFT), wavelet transform (WT), principal component analysis (PCA), gray level co-occurrence texture (GLCM), and histogram of oriented gradients (HOG), has been done with respect to their capability to generate efficient and differentiable feature vectors for undercover target fault classification. An extensive number of experiments were performed with different ceramic tile fault configurations, viz., vertical crack, horizontal crack, random crack, and diagonal crack, along with non-faulty tiles. Further, an independent algorithm validation was done, demonstrating classification accuracies of 80%, 86.67%, 73.33%, and 93.33% for the DFT, WT, PCA, GLCM, and HOG feature-based artificial neural network (ANN) classifier models, respectively. The classification results show good capability of the HOG feature extraction technique for non-destructive quality inspection, with an appreciably low false alarm rate as compared to the other techniques. Thereby, a robust and optimal image feature-based neural network classification model has been proposed for non-invasive, automatic fault monitoring for financially and commercially competent industrial growth.

  12. Crop Classification in Satellite Images through Probabilistic Segmentation Based on Multiple Sources

    Directory of Open Access Journals (Sweden)

    Oscar S. Dalmau

    2017-06-01

    Full Text Available Classification methods based on Gaussian Markov Measure Field Models and other probabilistic approaches have to face the problem of constructing the likelihood. Typically, in these methods, the likelihood is computed from 1D or 3D histograms. However, when the number of information sources grows, as in the case of satellite images, histogram construction becomes more difficult due to the high dimensionality of the feature space. In this work, we propose a generalization of Gaussian Markov Measure Field Models and provide a probabilistic segmentation scheme which fuses multiple information sources for image segmentation. In particular, we apply the general model to classify types of crops in satellite images. The proposed method allows us to combine several feature spaces. For this purpose, the method requires prior information for building a 3D histogram for each considered feature space. Based on these histograms, we can compute the likelihood of each site of the image belonging to a class. The computed likelihoods are the main input of the proposed algorithm and are combined in the proposed model using a contrast criterion. Different feature spaces are analyzed, among them 6 spectral bands from LANDSAT 5 TM, 3 principal components from PCA on the 6 spectral bands, and 3 principal components from PCA applied on 10 vegetation indices. The proposed algorithm was applied to a real image and obtained excellent results in comparison to different classification algorithms used in crop classification.
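
    The per-class 3-D histogram likelihood described above can be sketched with `numpy.histogramdd`. This is a generic illustration of the likelihood-construction step only, not the full Gaussian Markov Measure Field model; feature values and bin counts are hypothetical.

```python
import numpy as np

def build_histograms(feats, labels, bins=4, value_range=((0, 1),) * 3):
    """Per-class normalized 3-D histograms over a 3-component feature space."""
    hists = {}
    for c in np.unique(labels):
        h, edges = np.histogramdd(feats[labels == c], bins=bins,
                                  range=value_range)
        hists[c] = (h / h.sum(), edges)
    return hists

def likelihood(hists, x):
    """Look up p(x | class) for each class from its histogram."""
    out = {}
    for c, (h, edges) in hists.items():
        idx = tuple(min(np.searchsorted(e, xi, side='right') - 1,
                        h.shape[k] - 1)
                    for k, (e, xi) in enumerate(zip(edges, x)))
        out[c] = h[idx]
    return out

# Toy training pixels: class 0 features near (0.1, 0.1, 0.1), class 1 near 0.9
feats = np.vstack([np.full((10, 3), 0.1), np.full((10, 3), 0.9)])
labels = np.array([0] * 10 + [1] * 10)
hists = build_histograms(feats, labels)
lik = likelihood(hists, (0.12, 0.10, 0.08))
```

    One such histogram would be built per feature space (spectral bands, PCA components, vegetation-index PCA), and the resulting per-site likelihoods combined in the segmentation model.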

  13. Locally linear embedding and neighborhood rough set-based gene selection for gene expression data classification.

    Science.gov (United States)

    Sun, L; Xu, J-C; Wang, W; Yin, Y

    2016-08-30

    Cancer subtype recognition and feature selection are important problems in the diagnosis and treatment of tumors. Here, we propose a novel gene selection approach applied to gene expression data classification. First, two classical feature reduction methods, locally linear embedding (LLE) and rough set (RS), are summarized. The advantages and disadvantages of these algorithms were analyzed, and an optimized model for tumor gene selection was developed based on LLE and neighborhood RS (NRS). The Bhattacharyya distance was introduced to delete irrelevant genes, pair-wise redundancy analysis was performed to remove strongly correlated genes, and a wavelet soft threshold was applied to eliminate noise in the gene datasets. Next, prior optimized search processing was carried out. A new approach combining the dimension reduction of LLE and the feature reduction of NRS (LLE-NRS) was developed for selecting gene subsets, and the open-source software Weka was then applied to distinguish different tumor types and verify the cross-validation classification accuracy of our proposed method. The experimental results demonstrated that the classification performance of the proposed LLE-NRS for selecting gene subsets outperforms that of other related models in terms of accuracy, and our proposed approach is feasible and effective in the field of high-dimensional tumor classification.
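
    The Bhattacharyya-distance filter for irrelevant genes can be sketched under a Gaussian assumption per class. This uses the standard Gaussian Bhattacharyya distance, offered as a plausible reading of that step rather than the authors' exact implementation; the threshold and toy data are hypothetical.

```python
import numpy as np

def bhattacharyya(x, y):
    """Bhattacharyya distance between two classes of a single gene,
    assuming (roughly) Gaussian expression within each class."""
    m1, m2 = x.mean(), y.mean()
    v1, v2 = x.var(ddof=1), y.var(ddof=1)
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * np.log((v1 + v2) / (2.0 * np.sqrt(v1 * v2))))

def filter_genes(X, y, threshold=0.5):
    """Keep genes whose class-wise Bhattacharyya distance reaches threshold."""
    return [j for j in range(X.shape[1])
            if bhattacharyya(X[y == 0, j], X[y == 1, j]) >= threshold]

# Toy expression matrix: gene 0 is discriminative, gene 1 is irrelevant
rng = np.random.default_rng(2)
X = np.column_stack([np.r_[rng.normal(0, 1, 20), rng.normal(5, 1, 20)],
                     rng.normal(0, 1, 40)])
y = np.r_[np.zeros(20, int), np.ones(20, int)]
kept = filter_genes(X, y)
```

    Genes surviving this filter would then pass through redundancy analysis and the LLE-NRS reduction before classification.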

  14. Applying object-based image analysis and knowledge-based classification to ADS-40 digital aerial photographs to facilitate complex forest land cover classification

    Science.gov (United States)

    Hsieh, Yi-Ta; Chen, Chaur-Tzuhn; Chen, Jan-Chang

    2017-01-01

    In general, considerable human and material resources are required for performing a forest inventory survey. Using remote sensing technologies to save forest inventory costs has thus become an important topic in forest inventory-related studies. Leica ADS-40 digital aerial photographs feature advantages such as high spatial resolution, high radiometric resolution, and a wealth of spectral information. As a result, they have been widely used to perform forest inventories. We classified ADS-40 digital aerial photographs according to the complex forest land cover types listed in the Fourth Forest Resource Survey in an effort to establish a classification method for categorizing ADS-40 digital aerial photographs. Subsequently, we classified the images using the knowledge-based classification method in combination with object-based analysis techniques, decision tree classification techniques, classification parameters such as object texture, shape, and spectral characteristics, a class-based classification method, and geographic information system mapping information. Finally, the results were compared with manually interpreted aerial photographs. Images were classified using a hierarchical classification method comprising four classification levels (levels 1 to 4). The overall classification accuracy (OA) of levels 1 to 4 ranges from 64.29% to 98.50%. The final result comparisons showed that the proposed classification method achieved an OA of 78.20% and a kappa coefficient of 0.7597. On the basis of the image classification results, classification errors occurred mostly in images of sunlit crowns because the image values for individual trees varied; this variance was caused by the crown structure and the incident angle of the sun. These errors lowered image classification accuracy and warrant further study. This study demonstrates the high feasibility of mapping complex forest land cover types using ADS-40 digital aerial photographs.

  15. A Cost-Sensitive Sparse Representation Based Classification for Class-Imbalance Problem

    Directory of Open Access Journals (Sweden)

    Zhenbing Liu

    2016-01-01

    Full Text Available Sparse representation has been successfully used in pattern recognition and machine learning. However, most existing sparse representation based classification (SRC) methods aim to achieve the highest classification accuracy, assuming the same losses for different misclassifications. This assumption may not hold in many practical applications, as different types of misclassification can lead to different losses. Moreover, many real-world datasets have imbalanced class distributions. To address these problems, we propose a cost-sensitive sparse representation based classification (CSSRC) method for the class-imbalance problem using probabilistic modeling. Unlike traditional SRC methods, we predict the class label of test samples by minimizing the misclassification losses, which are obtained by computing the posterior probabilities. Experimental results on the UCI databases validate the efficacy of the proposed approach in terms of average misclassification cost, positive class misclassification rate, and negative class misclassification rate. In addition, we sampled test and training samples with different imbalance ratios and used the F-measure, G-mean, classification accuracy, and running time to evaluate the performance of the proposed method. The experiments show that our proposed method performs competitively compared to SRC, CSSVM, and CS4VM.
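    The cost-sensitive decision rule at the heart of this approach (predict the label with minimum expected misclassification cost, given class posteriors) can be sketched generically. The cost matrix and posteriors below are illustrative; CSSRC additionally derives the posteriors from sparse representation, which is not shown.

```python
def cost_sensitive_label(posteriors, cost_matrix):
    """Pick the label minimizing expected misclassification cost.
    posteriors: {class: P(class | x)}; cost_matrix[true][pred] = loss."""
    classes = list(posteriors)
    def expected_cost(pred):
        return sum(posteriors[t] * cost_matrix[t][pred] for t in classes)
    return min(classes, key=expected_cost)

# Imbalanced setting: missing the rare positive class is 10x more costly.
cost = {"pos": {"pos": 0, "neg": 10}, "neg": {"pos": 1, "neg": 0}}
post = {"pos": 0.2, "neg": 0.8}   # a plain accuracy-maximizer would say "neg"
label = cost_sensitive_label(post, cost)  # expected costs: pos -> 0.8, neg -> 2.0
```

    Note how the asymmetric costs flip the decision toward the minority class even though its posterior probability is lower.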

  16. Multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement

    Science.gov (United States)

    Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing

    2018-02-01

    To address the missing-detail and performance problems of colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm achieves a natural colorized effect for a gray-scale image that is consistent with human visual perception. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy of the classification, a corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement method based on the Laplacian pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only colorizes visible gray-scale images, but can also be applied to other areas, such as color transfer between color images and the colorization of gray fusion images and infrared images.

  17. Study and Application on Stability Classification of Tunnel Surrounding Rock Based on Uncertainty Measure Theory

    Directory of Open Access Journals (Sweden)

    Hujun He

    2014-01-01

    Full Text Available Based on uncertainty measure theory, a stability classification and order-arranging model of surrounding rock was established. Considering the practical engineering geologic conditions, 5 factors that influence surrounding rock stability were taken into account, and the uncertainty measure function was obtained from in situ data. In this model, uncertainty influence factors were analyzed quantitatively and qualitatively based on the real situation; the weight of each index was given based on information entropy theory; the surrounding rock stability level was judged based on the credible degree recognition criterion; and the surrounding rock was ordered based on the order-arranging criterion. Furthermore, this model was employed to evaluate 5 sections of surrounding rock in the Dongshan tunnel of Huainan. The results show that the uncertainty measure method is reasonable and can be significant for future surrounding rock stability evaluation.
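    The information-entropy weighting mentioned above follows a standard formula: indices whose values vary more across the evaluated samples carry more information and receive larger weights. The sketch below uses invented data for two indices over five rock sections; the paper's actual factor data are not reproduced.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method: for each index (column), compute its information
    entropy across samples (rows) and weight it by 1 - entropy, normalized."""
    n = len(matrix)
    m = len(matrix[0])
    weights = []
    for j in range(m):
        col = [row[j] for row in matrix]
        s = sum(col)
        p = [v / s for v in col]
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        weights.append(1 - e)
    total = sum(weights)
    return [w / total for w in weights]

# Index 1 varies strongly across the five sections, index 0 not at all,
# so nearly all the weight goes to index 1.
data = [[3, 9], [3, 1], [3, 5], [3, 2], [3, 8]]
w = entropy_weights(data)
```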

  18. Comparison Effectiveness of Pixel Based Classification and Object Based Classification Using High Resolution Image In Floristic Composition Mapping (Study Case: Gunung Tidar Magelang City)

    Science.gov (United States)

    Ardha Aryaguna, Prama; Danoedoro, Projo

    2016-11-01

    Advances in remote sensing analysis have kept pace with developments in technology, especially in sensors and platforms. Many images now offer high spatial and radiometric resolution and therefore carry a wealth of information. Vegetation analysis, such as floristic composition mapping, benefits greatly from these developments. Floristic composition can be interpreted using several methods, such as pixel-based classification and object-based classification. The main problem with pixel-based methods on high-spatial-resolution images is the salt-and-pepper noise that appears in the classification results. The purpose of this research is to compare the effectiveness of pixel-based classification and object-based classification for vegetation composition mapping on the high-resolution WorldView-2 image. The results show that pixel-based classification using a 5×5 majority-filter kernel gives the highest overall accuracy among the tested classifications: 73.32% on the WorldView-2 image radiometrically corrected to surface reflectance. However, in terms of per-class accuracy, the object-based method is the best among the tested methods. Reviewed from the effectiveness aspect, the pixel-based approach is more effective than the object-based approach for vegetation composition mapping in the Tidar forest.
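    The 5×5 majority filter used above to suppress salt-and-pepper noise in a pixel-based classification map can be sketched as follows. This is a generic implementation on a toy label grid, not the authors' processing chain.

```python
from collections import Counter

def majority_filter(label_map, k=5):
    """Smooth a classification map by assigning each pixel the majority
    label within a k x k window (edges use the truncated window)."""
    h, w, r = len(label_map), len(label_map[0]), k // 2
    out = [row[:] for row in label_map]
    for i in range(h):
        for j in range(w):
            window = [label_map[a][b]
                      for a in range(max(0, i - r), min(h, i + r + 1))
                      for b in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = Counter(window).most_common(1)[0][0]
    return out

# An isolated misclassified pixel inside a uniform patch is relabeled.
noisy = [[1] * 5 for _ in range(5)]
noisy[2][2] = 7          # salt-and-pepper speckle
clean = majority_filter(noisy, k=5)
```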

  19. Gene Expression Based Leukemia Sub-Classification Using Committee Neural Networks

    OpenAIRE

    Sewak, Mihir S.; Reddy, Narender P.; Duan, Zhong-Hui

    2009-01-01

    Analysis of gene expression data provides an objective and efficient technique for sub-classification of leukemia. The purpose of the present study was to design committee neural network-based classification systems to subcategorize leukemia gene expression data. In the study, a binary classification system was considered to differentiate acute lymphoblastic leukemia from acute myeloid leukemia. A ternary classification system which classifies leukemia expression data into three subclasses...

  20. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI.

    Science.gov (United States)

    Dikaios, Nikolaos; Alkalbani, Jokha; Abd-Alazeez, Mohamed; Sidhu, Harbir Singh; Kirkham, Alex; Ahmed, Hashim U; Emberton, Mark; Freeman, Alex; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit

    2015-09-01

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2-nSI) within the TZ (ROC-AUC = 0.77) and the normalised early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. • The ADC and T2-nSI of benign/cancer PZ are higher than those of benign/cancer TZ. • DCE parameters are significantly different between benign PZ and TZ, but not between cancerous PZ and TZ. • Diagnostic models containing contrast-enhancement parameters have reduced performance when applied across zones.

  1. Streamed Sampling on Dynamic data as Support for Classification Model

    Directory of Open Access Journals (Sweden)

    Heru Sukoco

    2013-11-01

    Full Text Available Data mining on dynamically changing data faces several problems: the data size is unknown, and the skew of the data is always changing. Random sampling is commonly applied to extract a general synopsis from a very large database. In this research, Vitter's reservoir algorithm is used to retrieve k records from the database and put them into the sample. The sample is used as input for the classification task in data mining. The sample is a backing sample, saved as a table containing id and priority values. The priority indicates the probability of how long a record is retained in the sample. The Kullback-Leibler divergence is applied to measure the similarity between the population and sample distributions. The results of this research show that continuously taking random samples is possible while transactions occur. The Kullback-Leibler divergence is a very good measure for maintaining a similar distribution between the population and the sample, with values in the interval from 0 to 0.0001. Sample results are always up to date with new transactions and have similar skewness. For the classification task, the decision tree model improved significantly when changes occurred.
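    The reservoir-sampling idea used above can be sketched with the basic Algorithm R, which maintains a uniform random sample of size k over a stream of unknown length. (Vitter's refinement, Algorithm Z, skips over records to reduce the number of random draws but preserves the same invariant; the backing-sample priorities from the paper are not modeled here.)

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Algorithm R: after processing n >= k records, every record seen so far
    is in the sample with equal probability k / n."""
    sample = []
    for n, record in enumerate(stream):
        if n < k:
            sample.append(record)          # fill the reservoir first
        else:
            j = rng.randrange(n + 1)       # record kept with prob k / (n + 1)
            if j < k:
                sample[j] = record         # evict a uniformly chosen resident
    return sample

random.seed(42)
s = reservoir_sample(range(10_000), 100)   # one pass, O(k) memory
```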

  2. Convolutional Neural Network-based SAR Image Classification with Noisy Labels

    Directory of Open Access Journals (Sweden)

    Zhao Juanping

    2017-10-01

    Full Text Available SAR image classification is an important task in SAR image interpretation. Supervised learning methods, such as the Convolutional Neural Network (CNN), demand samples that are accurately labeled. However, this presents a major challenge in SAR image labeling. Due to their unique imaging mechanism, SAR images are seriously affected by speckle, geometric distortion, and incomplete structural information. Thus, SAR images have a strongly non-intuitive character, which causes difficulties in SAR image labeling and weakens the learning and generalization performance of many classifiers (including CNNs). In this paper, we propose a Probability Transition CNN (PTCNN) for patch-level SAR image classification with noisy labels. Based on the classical CNN, PTCNN builds a bridge between noise-free labels and their noisy versions via a noisy-label transition layer. As such, we derive a new CNN model trained with a noisily labeled training dataset that can potentially revise noisy labels and improve learning capacity with noisily labeled data. We use a 16-class land cover dataset and the MSTAR dataset to demonstrate the effectiveness of our model. Our experimental results show the PTCNN model to be robust with respect to label noise and demonstrate its promising classification performance compared with the classical CNN model. The proposed PTCNN model could therefore lower the required quality standards for image labels and has a variety of practical applications.
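    The noisy-label transition layer can be illustrated by its forward pass: the network's posterior over clean labels is mapped to a posterior over observed (noisy) labels through a row-stochastic transition matrix. This is a toy numeric sketch; in PTCNN the transition parameters are part of the trained network, which is not reproduced here.

```python
def noisy_label_probs(clean_probs, transition):
    """Forward pass of a noisy-label transition layer.
    transition[i][j] = P(observed label j | true label i)."""
    n = len(transition[0])
    return [sum(clean_probs[i] * transition[i][j]
                for i in range(len(clean_probs)))
            for j in range(n)]

# Two classes; annotators flip each label 20% of the time.
T = [[0.8, 0.2],
     [0.2, 0.8]]
noisy = noisy_label_probs([1.0, 0.0], T)   # a confidently "class 0" sample
```

    During training, the loss is computed against the noisy labels through this layer, so the underlying network can still learn clean-label posteriors.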

  3. Unsupervised Classification During Time-Series Model Building.

    Science.gov (United States)

    Gates, Kathleen M; Lane, Stephanie T; Varangis, E; Giovanello, K; Guiskewicz, K

    2017-01-01

    Researchers who collect multivariate time-series data across individuals must decide whether to model the dynamic processes at the individual level or at the group level. A recent innovation, group iterative multiple model estimation (GIMME), offers one solution to this dichotomy by identifying group-level time-series models in a data-driven manner while also reliably recovering individual-level patterns of dynamic effects. GIMME is unique in that it does not assume homogeneity in processes across individuals in terms of the patterns or weights of temporal effects. However, it can be difficult to make inferences from the nuances in varied individual-level patterns. The present article introduces an algorithm that arrives at subgroups of individuals that have similar dynamic models. Importantly, the researcher does not need to decide the number of subgroups. The final models contain reliable group-, subgroup-, and individual-level patterns that enable generalizable inferences, subgroups of individuals with shared model features, and individual-level patterns and estimates. We show that integrating community detection into the GIMME algorithm improves upon current standards in two important ways: (1) providing reliable classification and (2) increasing the reliability in the recovery of individual-level effects. We demonstrate this method on functional MRI from a sample of former American football players.

  4. Log-ratio transformed major element based multidimensional classification for altered High-Mg igneous rocks

    Science.gov (United States)

    Verma, Surendra P.; Rivera-Gómez, M. Abdelaly; Díaz-González, Lorena; Quiroz-Ruiz, Alfredo

    2016-12-01

    A new multidimensional classification scheme consistent with the chemical classification of the International Union of Geological Sciences (IUGS) is proposed for the nomenclature of High-Mg altered rocks. Our procedure is based on an extensive database of major element (SiO2, TiO2, Al2O3, Fe2O3t, MnO, MgO, CaO, Na2O, K2O, and P2O5) compositions of a total of 33,868 (920 High-Mg and 32,948 "Common") relatively fresh igneous rock samples. The database consisting of these multinormally distributed samples in terms of their isometric log-ratios was used to propose a set of 11 discriminant functions and 6 diagrams to facilitate High-Mg rock classification. The multinormality required by linear discriminant and canonical analysis was ascertained by a new computer program DOMuDaF. One multidimensional function can distinguish the High-Mg and Common igneous rocks with high percent success values of about 86.4% and 98.9%, respectively. Similarly, from 10 discriminant functions the High-Mg rocks can also be classified as one of the four rock types (komatiite, meimechite, picrite, and boninite), with high success values of about 88%-100%. Satisfactory functioning of this new classification scheme was confirmed by seven independent tests. Five further case studies involving application to highly altered rocks illustrate the usefulness of our proposal. A computer program HMgClaMSys was written to efficiently apply the proposed classification scheme, which will be available for online processing of igneous rock compositional data. Monte Carlo simulation modeling and mass-balance computations confirmed the robustness of our classification with respect to analytical errors and postemplacement compositional changes.
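    The isometric log-ratio (ilr) coordinates underlying this kind of multidimensional scheme can be sketched with one standard orthonormal basis. This is a generic illustration on an invented 3-part composition; the paper's actual basis, 10-oxide feature space, and discriminant functions are not reproduced.

```python
import math

def ilr(composition):
    """Isometric log-ratio transform of a D-part composition into D-1
    unconstrained coordinates, using the sequential binary partition basis:
    z_i = sqrt(i/(i+1)) * ln( g(x_1..x_i) / x_{i+1} ), g = geometric mean."""
    x = list(composition)
    coords = []
    for i in range(1, len(x)):
        gm = math.exp(sum(math.log(v) for v in x[:i]) / i)
        coords.append(math.sqrt(i / (i + 1)) * math.log(gm / x[i]))
    return coords

# A 3-part major-element composition (parts need not sum to 1: the ilr
# transform is invariant to closure/rescaling, a key property for percentages).
z = ilr([0.5, 0.3, 0.2])
```

    Discriminant functions can then be built on such coordinates without the spurious correlation that raw percentage data carry.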

  5. Image-based fall detection and classification of a user with a walking support system

    Science.gov (United States)

    Taghvaei, Sajjad; Kosuge, Kazuhiro

    2017-10-01

    The classification of visual human action is important in the development of systems that interact with humans. This study investigates an image-based classification of the human state while using a walking support system, to improve the safety and dependability of these systems. We categorize the possible human behavior while utilizing a walker robot into eight states (i.e., sitting, standing, walking, and five falling types), and propose two different methods, namely, normal distribution and hidden Markov models (HMMs), to detect and recognize these states. The visual feature for the state classification is the centroid position of the upper body, which is extracted from the user's depth images. The first method shows that the centroid position follows a normal distribution while walking, which can be adopted to detect any non-walking state. The second method implements HMMs to detect and recognize these states. We then measure and compare the performance of both methods. The classification results are employed to control the motion of a passive-type walker (called "RT Walker") by activating its brakes in non-walking states. Thus, the system can be used for sit/stand support and fall prevention. The experiments are performed with four subjects, including an experienced physiotherapist. Results show that the algorithm can be adapted to a new user's motion pattern within 40 s, with a fall detection rate of 96.25% and a state classification rate of 81.0%. The proposed method can be applied to other abnormality detection/classification applications that employ depth image-sensing devices.
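    The first method above (centroid position modeled as a normal distribution during walking, with deviations flagging non-walking states) can be sketched as a simple z-score test. The 1-D centroid values below are hypothetical; the paper uses upper-body centroids from depth images.

```python
import math

def fit_gaussian(values):
    """Fit mean and standard deviation to calibration samples."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, math.sqrt(v)

def is_walking(centroid, mean, std, z_thresh=3.0):
    """A frame counts as 'walking' while the centroid stays within
    z_thresh standard deviations of the walking-phase distribution."""
    return abs(centroid - mean) / std <= z_thresh

# Calibrate on centroid positions recorded while the user walks normally.
walking_frames = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47]
mu, sigma = fit_gaussian(walking_frames)
ok = is_walking(0.51, mu, sigma)          # typical walking frame
fall = not is_walking(0.85, mu, sigma)    # centroid far outside the distribution
```

    In the real system such a non-walking flag would trigger the walker's brakes; the HMM-based second method additionally distinguishes which of the eight states occurred.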

  6. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    Full Text Available PURPOSE: To demonstrate the radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. MATERIALS AND METHODS: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4), and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients underwent dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together, since our objective was to analyze the presence of detectable fat. Of the 85 patients, 53 were monitored and 32 were treated surgically due to a large perirenal component (n = 13), hemorrhage (n = 11), or the impossibility of an adequate preoperative characterization (n = 8). There was no case of renal cell carcinoma (RCC) with a fat component in this group of patients. RESULTS: Based on the presence and amount of detectable fat within the lesion, AMLs were classified into 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (mostly exophytic and perirenal): 11%; and Pattern-IV, without fat (mostly exophytic and perirenal): 6%. CONCLUSIONS: This proposed classification may be useful for understanding the imaging manifestations of AMLs and their differential diagnosis, and for determining when further radiological evaluation is necessary. Small (< 1.5 cm), Pattern-I AMLs tend to be intrarenal, homogeneous, and predominantly fatty. As they grow, they tend to become partially or completely exophytic and heterogeneous (Patterns II and III). The rare Pattern-IV AMLs, however, can be small or large, intrarenal or exophytic, but are always a homogeneous and hyperdense mass.
Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  7. Value of the revised Atlanta classification (RAC) and determinant-based classification (DBC) systems in the evaluation of acute pancreatitis.

    Science.gov (United States)

    Wang, Xiaolei; Qin, Li; Cao, Jingli

    2017-11-03

    Since increasing acute pancreatitis (AP) severity is significantly associated with mortality, accurate and rapid determination of severity is crucial for effective clinical management. This study investigated the value of the revised Atlanta classification (RAC) and the determinant-based classification (DBC) systems in stratifying severity of acute pancreatitis. This retrospective observational cohort study included 480 AP patients. Patient demographics and clinical characteristics were recorded. The primary outcome was mortality, and secondary outcomes were admission to intensive care unit (ICU), duration of ICU stay, and duration of hospital stay. Based on the RAC classification, there were 295 patients with mild AP (MAP), 146 patients with moderate-to-severe AP (MSAP), and 39 patients with severe AP (SAP). Based on the DBC classification, there were 389 patients with MAP, 41 patients with MSAP, 32 patients with SAP, and 18 patients with critical AP (CAP). ROC curve analysis showed that the DBC system had a significantly higher accuracy at predicting organ failure compared to the RAC system (p < .001). Multivariate regression analysis showed that age and ICU stay were independent risk factors of mortality. The DBC system had a higher accuracy at predicting organ failure. Age and ICU stay were significantly associated with risk of death in AP patients. A classification of CAP by the DBC system should warrant close attention, and rapid implementation of effective measures to reduce mortality.

  8. Selection of robust variables for transfer of classification models employing the successive projections algorithm.

    Science.gov (United States)

    Milanez, Karla Danielle Tavares Melo; Araújo Nóbrega, Thiago César; Silva Nascimento, Danielle; Galvão, Roberto Kawakami Harrop; Pontes, Márcio José Coelho

    2017-09-01

    Multivariate models have been widely used in analytical problems involving quantitative and qualitative analyses. However, there are cases in which a model is not applicable to spectra of samples obtained under new experimental conditions or on an instrument not involved in the modeling step. A solution to this problem is the transfer of multivariate models, usually performed by standardization of the spectral responses or enhancement of the robustness of the model. This paper proposes two new criteria for the selection of robust variables for classification transfer employing the successive projections algorithm (SPA). These variables are then used to build models based on linear discriminant analysis (LDA) with low sensitivity to the differences between the responses of the instruments involved. For this purpose, transfer samples are included in the calculation of the cost for each subset of variables under consideration. The proposed methods are evaluated in two case studies involving identification of adulteration of extra virgin olive oil (EVOO) and hydrated ethyl alcohol fuel (HEAF) using UV-Vis and NIR spectroscopy, respectively. In both cases, similar or better classification transfer results (obtained for a test set measured on the secondary instrument) were achieved with the two criteria in comparison with direct standardization (DS) and piecewise direct standardization (PDS). For the UV-Vis data, both proposed criteria achieved a correct classification rate (CCR) of 85%, while the best CCR obtained for the standardization methods was 81% for DS. For the NIR data, a CCR of 92.5% was obtained by both criteria as well as by DS. The results demonstrate the possibility of using either of the proposed criteria for building robust models as an alternative to the standardization of spectral responses for classification transfer. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Crop classification modelling using remote sensing and environmental data in the Greater Platte River Basin, USA

    Science.gov (United States)

    Howard, Daniel M.; Wylie, Bruce K.; Tieszen, Larry L.

    2012-01-01

    With an ever expanding population, potential climate variability and an increasing demand for agriculture-based alternative fuels, accurate agricultural land-cover classification for specific crops and their spatial distributions are becoming critical to researchers, policymakers, land managers and farmers. It is important to ensure the sustainability of these and other land uses and to quantify the net impacts that certain management practices have on the environment. Although other quality crop classification products are often available, temporal and spatial coverage gaps can create complications for certain regional or time-specific applications. Our goal was to develop a model capable of classifying major crops in the Greater Platte River Basin (GPRB) for the post-2000 era to supplement existing crop classification products. This study identifies annual spatial distributions and area totals of corn, soybeans, wheat and other crops across the GPRB from 2000 to 2009. We developed a regression tree classification model based on 2.5 million training data points derived from the National Agricultural Statistics Service (NASS) Cropland Data Layer (CDL) in relation to a variety of other relevant input environmental variables. The primary input variables included the weekly 250 m US Geological Survey Earth Observing System Moderate Resolution Imaging Spectroradiometer normalized differential vegetation index, average long-term growing season temperature, average long-term growing season precipitation and yearly start of growing season. An overall model accuracy rating of 78% was achieved for a test sample of roughly 215 000 independent points that were withheld from model training. Ten 250 m resolution annual crop classification maps were produced and evaluated for the GPRB region, one for each year from 2000 to 2009. In addition to the model accuracy assessment, our validation focused on spatial distribution and county-level crop area totals in comparison with the

  10. Graph-based sensor fusion for classification of transient acoustic signals.

    Science.gov (United States)

    Srinivas, Umamahesh; Nasrabadi, Nasser M; Monga, Vishal

    2015-03-01

    Advances in acoustic sensing have enabled the simultaneous acquisition of multiple measurements of the same physical event via co-located acoustic sensors. We exploit the inherent correlation among such multiple measurements for acoustic signal classification, to identify the launch/impact of munition (i.e., rockets, mortars). Specifically, we propose a probabilistic graphical model framework that can explicitly learn the class conditional correlations between the cepstral features extracted from these different measurements. Additionally, we employ symbolic dynamic filtering-based features, which offer improvements over the traditional cepstral features in terms of robustness to signal distortions. Experiments on real acoustic data sets show that our proposed algorithm outperforms conventional classifiers as well as the recently proposed joint sparsity models for multisensor acoustic classification. Additionally our proposed algorithm is less sensitive to insufficiency in training samples compared to competing approaches.

  11. Study on a pattern classification method of soil quality based on simplified learning sample dataset

    Science.gov (United States)

    Zhang, Jiahua; Liu, S.; Hu, Y.; Tian, Y.

    2011-01-01

    Based on the massive soil information involved in current soil quality grade evaluation, this paper constructs an intelligent classification approach for soil quality grades, relying on classical sampling techniques and a disordered multiclass Logistic regression model. As a case study, the learning sample capacity was determined under a given confidence level and estimation accuracy, and the c-means algorithm was used to automatically extract a simplified learning sample dataset from the cultivated soil quality grade evaluation database for the study area, Longchuan County in Guangdong Province. A disordered Logistic classifier model was then built, and the calculation and analysis steps of intelligent soil quality grade classification were given. The results indicate that the soil quality grade can be effectively learned and predicted from the extracted simplified dataset using this method, which changes the traditional approach to soil quality grade evaluation. © 2011 IEEE.

  12. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification.

    Science.gov (United States)

    Baali, Hamza; Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J E

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain-computer interfaces (BCIs). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT)- and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT- and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling's T² statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%.

  13. Tensor Block-Sparsity Based Representation for Spectral-Spatial Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Zhi He

    2016-08-01

    Full Text Available Recently, sparse representation has yielded successful results in hyperspectral image (HSI classification. In the sparse representation-based classifiers (SRCs, a more discriminative representation that preserves the spectral-spatial information can be exploited by treating the HSI as a whole entity. Based on this observation, a tensor block-sparsity based representation method is proposed for spectral-spatial classification of HSI in this paper. Unlike traditional vector/matrix-based SRCs, the proposed method consists of tensor block-sparsity based dictionary learning and class-dependent block sparse representation. By naturally regarding the HSI cube as a third-order tensor, small local patches centered at the training samples are extracted from the HSI to maintain the structural information. All the patches are then partitioned into a number of groups, on which a dictionary learning model is constructed with a tensor block-sparsity constraint. A test sample is also expressed as a small local patch and the block sparse representation is then performed in a class-wise manner to take advantage of the class label information. Finally, the category of the test sample is determined by using the minimal residual. Experimental results of two real-world HSIs show that our proposed method greatly improves the classification performance of SRC.

  14. Tumor classification based on orthogonal linear discriminant analysis.

    Science.gov (United States)

    Wang, Huiya; Zhang, Shanwen

    2014-01-01

    Gene expression profiles have great potential for accurate tumor diagnosis, enabling tumors to be diagnosed precisely and systematically, but they also present machine learning researchers with two challenges: the curse of dimensionality and the small-sample-size problem. We propose a manifold-learning-based dimensionality reduction algorithm named orthogonal local discriminant embedding (O-LDE) and apply it to tumor classification. Compared with classical local discriminant embedding (LDE), O-LDE obtains an orthogonal linear projection matrix by solving an optimization problem. After the data are projected into a low-dimensional subspace by O-LDE, points of the same class maintain their intrinsic neighbor relations, whereas neighboring points of different classes are pushed far apart. Experimental results on a public tumor dataset validate the effectiveness and feasibility of the proposed algorithm.
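
    The orthogonality constraint at the heart of O-LDE can be illustrated with a short sketch: starting from an arbitrary linear projection, a QR factorization yields columns that form an orthonormal basis, so projected distances are not distorted. This only illustrates the constraint; the actual O-LDE projection comes from solving the discriminant optimization problem described above, and all data here are synthetic.

```python
import numpy as np

def orthogonalize_projection(P):
    """Replace a projection matrix's columns with an orthonormal basis (QR).

    O-LDE enforces orthogonality inside its optimization problem; QR on an
    arbitrary matrix is just a simple illustration of the constraint itself.
    """
    Q, _ = np.linalg.qr(P)
    return Q

rng = np.random.default_rng(0)
P = rng.standard_normal((50, 3))   # map 50-D gene features to a 3-D subspace
Q = orthogonalize_projection(P)

X = rng.standard_normal((20, 50))  # 20 toy expression profiles
Z = X @ Q                          # low-dimensional, distance-preserving embedding
```

    Because Q has orthonormal columns, Euclidean distances in the subspace are never inflated, which is what makes orthogonal projections attractive for neighbor-based classification.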

  15. Linear regression-based feature selection for microarray data classification.

    Science.gov (United States)

    Abid Hasan, Md; Hasan, Md Kamrul; Abdul Mottalib, M

    2015-01-01

    Predicting the class of gene expression profiles helps improve the diagnosis and treatment of diseases. Analysing huge gene expression data, otherwise known as microarray data, is complicated by its high dimensionality. Hence, traditional classifiers do not perform well when the number of features far exceeds the number of samples. A good set of features helps classifiers to classify the dataset efficiently. Moreover, a manageable set of features is also desirable for biologists for further analysis. In this paper, we propose a linear regression-based feature selection method for selecting discriminative features. Our main focus is to classify the dataset more accurately using fewer features than other traditional feature selection methods. Our method has been compared with several other methods, and in almost every case the classification accuracy is higher while using fewer features than the other popular feature selection methods.
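
    The core idea of ranking features by a linear fit can be sketched as follows. This is a hedged simplification: each feature is scored by the R² of a univariate regression of the class label on that feature, which may differ from the paper's exact formulation, and the toy "microarray" data are synthetic.

```python
import numpy as np

def regression_feature_scores(X, y):
    """Score each feature by the R^2 of a univariate linear fit of y on it.

    For simple regression, R^2 equals the squared Pearson correlation, so we
    can compute it directly from centered dot products.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    y_c = y - y.mean()
    scores = []
    for j in range(X.shape[1]):
        x = X[:, j] - X[:, j].mean()
        denom = (x @ x) * (y_c @ y_c)
        scores.append((x @ y_c) ** 2 / denom if denom > 0 else 0.0)
    return np.array(scores)

def select_top_features(X, y, k):
    """Return indices of the k highest-scoring features."""
    return np.argsort(regression_feature_scores(X, y))[::-1][:k]

# toy data: feature 0 tracks the class label, feature 1 is pure noise
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = np.column_stack([y + 0.1 * rng.standard_normal(40),
                     rng.standard_normal(40)])
top = select_top_features(X, y, 1)
```

    On the toy data, the label-tracking feature is selected first, as expected.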

  16. Machine Learning Based Localization and Classification with Atomic Magnetometers

    Science.gov (United States)

    Deans, Cameron; Griffin, Lewis D.; Marmugi, Luca; Renzoni, Ferruccio

    2018-01-01

    We demonstrate identification of the position, material, orientation, and shape of objects imaged by an 85Rb atomic magnetometer performing electromagnetic induction imaging supported by machine learning. Machine learning maximizes the information extracted from the images created by the magnetometer, demonstrating the use of hidden data. Localization 2.6 times better than the spatial resolution of the imaging system and successful classification up to 97% are obtained. This circumvents the need to solve the inverse problem and demonstrates the extension of machine learning to diffusive systems, such as low-frequency electrodynamics in media. Automated collection of task-relevant information from quantum-based electromagnetic imaging will have a significant impact on fields from biomedicine to security.

  17. Fines classification based on sensitivity to pore-fluid chemistry

    Science.gov (United States)

    Jang, Junbong; Santamarina, J. Carlos

    2016-01-01

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil “electrical sensitivity.” Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems.

  18. Fines Classification Based on Sensitivity to Pore-Fluid Chemistry

    KAUST Repository

    Jang, Junbong

    2015-12-28

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil "electrical sensitivity." Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems. © ASCE.

  19. Classification of cassava genotypes based on qualitative and quantitative data.

    Science.gov (United States)

    Oliveira, E J; Oliveira Filho, O S; Santos, V S

    2015-02-02

    We evaluated the genetic variation of cassava accessions based on qualitative (binomial and multicategorical) and quantitative (continuous) traits. We characterized 95 accessions obtained from the Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura, evaluating these accessions for 13 continuous, 10 binary, and 25 multicategorical traits. First, we analyzed the accessions based only on quantitative traits; next, we conducted a joint analysis (qualitative and quantitative traits) based on the Ward-MLM method, which performs clustering in two stages. According to the pseudo-F, pseudo-t2, and maximum likelihood criteria, we identified five and four groups based on quantitative-trait and joint analysis, respectively. The smaller number of groups identified in the joint analysis may be related to the nature of the data. On the other hand, quantitative data are more subject to environmental effects on phenotype expression; this can result in the absence of genetic differences while contributing to greater apparent differentiation among accessions. For most of the accessions, the maximum probability of classification was >0.90, independent of the trait analyzed, indicating a good fit of the clustering method. Differences in clustering according to the type of data imply that analyses of quantitative and qualitative traits in cassava germplasm might explore different genomic regions. On the other hand, when joint analysis was used, the means and ranges of genetic distances were high, indicating that the Ward-MLM method is very useful for clustering genotypes when several phenotypic traits are available, as in the case of genetic resources and breeding programs.
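
    The first (clustering) stage of a Ward-based joint analysis can be sketched with SciPy's hierarchical clustering. This is plain Ward clustering on numerically coded traits, not the full Ward-MLM procedure (which adds a maximum-likelihood refinement stage), and the accession data below are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# toy accessions: two quantitative traits plus one 0/1-coded qualitative trait
rng = np.random.default_rng(0)
group_a = np.column_stack([rng.normal(0.0, 0.3, (10, 2)), np.zeros((10, 1))])
group_b = np.column_stack([rng.normal(3.0, 0.3, (10, 2)), np.ones((10, 1))])
X = np.vstack([group_a, group_b])

Z = linkage(X, method="ward")                    # Ward's minimum-variance criterion
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the dendrogram into 2 groups
```

    On real germplasm data, criteria such as pseudo-F would be used to choose the number of groups instead of fixing it at two.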

  20. DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS

    Directory of Open Access Journals (Sweden)

    Y. B. Abdullin

    2017-01-01

    Full Text Available Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to effectively solve the sentiment classification problem for a given pair of languages. We apply our approach to two corpora covering two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method that achieves 60% accuracy, and a method to learn bilingual embeddings from a large unlabeled corpus using bilingual word pairs.

  1. A comparative study of machine learning models for ethnicity classification

    Science.gov (United States)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper adopts a machine learning approach to the problem of ethnicity recognition. Ethnicity identification is an important vision problem whose use cases extend to various domains. Despite the considerable complexity involved, ethnicity identification comes naturally to humans. This meta-information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems, a sub-module to efficiently capture ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines and logistic regression is documented. Experimental results indicate that the logistic regression classifier provides more accurate classification than the support vector machine.
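
    A comparative study of this kind can be reproduced in outline with scikit-learn, cross-validating a logistic regression classifier and an SVM on the same data. The dataset below is synthetic; a real study would use face-derived features and ethnicity labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# synthetic stand-in for a multi-class feature dataset
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm_rbf": SVC(kernel="rbf"),
}
# 5-fold cross-validated accuracy, the usual basis for such comparisons
accuracy = {name: cross_val_score(model, X, y, cv=5).mean()
            for name, model in models.items()}
```

    Which model wins depends entirely on the data; the point of the sketch is the evaluation protocol, not the outcome.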

  2. How social and non-social information influence classification decisions: A computational modelling approach.

    Science.gov (United States)

    Puskaric, Marin; von Helversen, Bettina; Rieskamp, Jörg

    2017-08-01

    Social information such as observing others can improve performance in decision making. In particular, social information has been shown to be useful when finding the best solution on one's own is difficult, costly, or dangerous. However, past research suggests that when making decisions people do not always consider other people's behaviour when it is at odds with their own experiences. Furthermore, the cognitive processes guiding the integration of social information with individual experiences are still under debate. Here, we conducted two experiments to test whether information about other persons' behaviour influenced people's decisions in a classification task. Furthermore, we examined how social information is integrated with individual learning experiences by testing different computational models. Our results show that social information had a small but reliable influence on people's classifications. The best computational model suggests that in categorization people first make up their own mind based on the non-social information, which is then updated by the social information.

  3. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    Science.gov (United States)

    Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...

  4. Land Cover Analysis by Using Pixel-Based and Object-Based Image Classification Method in Bogor

    Science.gov (United States)

    Amalisana, Birohmatin; Rokhmatullah; Hernina, Revi

    2017-12-01

    The advantage of image classification is that it provides information about the Earth's surface, such as land cover and its time-series changes. Nowadays, pixel-based image classification is commonly performed with a variety of algorithms, such as minimum distance, parallelepiped, maximum likelihood, and Mahalanobis distance. On the other hand, land cover classification can also be obtained using object-based image classification, which relies on image segmentation driven by parameters such as scale, shape, colour, smoothness, and compactness. This research compares the land cover classification and change detection results of the parallelepiped pixel-based and object-based classification methods. The study area is Bogor, observed over a 20-year period from 1996 to 2016. This region is known for urban areas that change continuously due to rapid development, making time-series land cover information for the region particularly valuable.
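
    Of the pixel-based algorithms listed, minimum distance is the simplest to sketch: each pixel is assigned to the class whose mean spectral vector is nearest. The two-band class means below are illustrative values, not trained statistics from any real scene.

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """Assign each pixel to the spectral class with the nearest mean vector.

    pixels:      (n_pixels, n_bands) array of spectral values
    class_means: (n_classes, n_bands) array of per-class mean spectra
    """
    # Euclidean distance from every pixel to every class mean
    d = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# toy 2-band data: a "water-like" and a "vegetation-like" class mean
means = np.array([[0.1, 0.2],   # class 0
                  [0.7, 0.9]])  # class 1
pixels = np.array([[0.12, 0.18],
                   [0.68, 0.85],
                   [0.50, 0.60]])
labels = minimum_distance_classify(pixels, means)
```

    The ambiguous middle pixel goes to whichever mean is closer, which is exactly the behaviour that makes minimum distance fast but sometimes imprecise compared with maximum likelihood.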

  5. GA Based Optimal Feature Extraction Method for Functional Data Classification

    OpenAIRE

    Jun Wan; Zehua Chen; Yingwu Chen; Zhidong Bai

    2010-01-01

    Classification is an interesting problem in functional data analysis (FDA), because many science and application problems end up with classification problems, such as recognition, prediction, control, decision making, management, etc. As the high dimension and high correlation in functional data (FD), it is a key problem to extract features from FD whereas keeping its global characters, which relates to the classification efficiency and precision to heavens. In this paper...

  6. Deep Learning based Classification of FDG-PET Data for Alzheimer's Disease Categories.

    Science.gov (United States)

    Singh, Shibani; Srivastava, Anant; Mi, Liang; Caselli, Richard J; Chen, Kewei; Goradia, Dhruman; Reiman, Eric M; Wang, Yalin

    2017-10-01

    Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even in presymptomatic Alzheimer's disease (AD) patients. PET scans provide functional information that is unique and unavailable using other types of imaging. However, the computational efficacy of FDG-PET data alone, for the classification of various Alzheimer's diagnostic categories, has not been well studied. This motivates us to correctly discriminate various AD diagnostic categories using FDG-PET data. Deep learning has improved state-of-the-art classification accuracies in the areas of speech, signal, image, video, text mining and recognition. We propose novel methods that involve probabilistic principal component analysis on max-pooled and mean-pooled data for dimensionality reduction, and a multilayer feed-forward neural network that performs binary classification. Our experimental dataset consists of baseline data of subjects including 186 cognitively unimpaired (CU) subjects, 336 mild cognitive impairment (MCI) subjects (158 late MCI and 178 early MCI), and 146 AD patients from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We measured F1-measure, precision, recall, and negative and positive predictive values with a 10-fold cross-validation scheme. Our results indicate that our designed classifiers achieve competitive results, with max pooling achieving better classification performance than mean-pooled features. Our deep-model-based research may advance FDG-PET analysis by demonstrating its potential as an effective imaging biomarker of AD.

  7. Deep-learning-based classification of FDG-PET data for Alzheimer's disease categories

    Science.gov (United States)

    Singh, Shibani; Srivastava, Anant; Mi, Liang; Caselli, Richard J.; Chen, Kewei; Goradia, Dhruman; Reiman, Eric M.; Wang, Yalin

    2017-11-01

    Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even in presymptomatic Alzheimer's disease (AD) patients. PET scans provide functional information that is unique and unavailable using other types of imaging. However, the computational efficacy of FDG-PET data alone, for the classification of various Alzheimer's diagnostic categories, has not been well studied. This motivates us to correctly discriminate various AD diagnostic categories using FDG-PET data. Deep learning has improved state-of-the-art classification accuracies in the areas of speech, signal, image, video, text mining and recognition. We propose novel methods that involve probabilistic principal component analysis on max-pooled and mean-pooled data for dimensionality reduction, and a multilayer feed-forward neural network that performs binary classification. Our experimental dataset consists of baseline data of subjects including 186 cognitively unimpaired (CU) subjects, 336 mild cognitive impairment (MCI) subjects (158 late MCI and 178 early MCI), and 146 AD patients from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We measured F1-measure, precision, recall, and negative and positive predictive values with a 10-fold cross-validation scheme. Our results indicate that our designed classifiers achieve competitive results, with max pooling achieving better classification performance than mean-pooled features. Our deep-model-based research may advance FDG-PET analysis by demonstrating its potential as an effective imaging biomarker of AD.
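
    The preprocessing pipeline described in both records (pooling followed by dimensionality reduction) can be sketched as follows. This uses plain max pooling and standard PCA via SVD as stand-ins; the papers use probabilistic PCA, and the 3-D volumes here are random data rather than FDG-PET scans.

```python
import numpy as np

def max_pool_3d(volume, factor):
    """Downsample a 3-D scan by non-overlapping max pooling."""
    x, y, z = (s // factor * factor for s in volume.shape)  # crop to a multiple
    v = volume[:x, :y, :z]
    v = v.reshape(x // factor, factor, y // factor, factor, z // factor, factor)
    return v.max(axis=(1, 3, 5))

def pca_reduce(X, n_components):
    """Project feature vectors onto their leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
scans = rng.random((10, 8, 8, 8))  # 10 toy "scans" standing in for PET volumes
features = np.stack([max_pool_3d(s, 2).ravel() for s in scans])
reduced = pca_reduce(features, 5)  # low-dimensional inputs for a classifier
```

    The `reduced` matrix would then feed the feed-forward network that performs the binary classification.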

  8. A KERNEL-BASED MULTI-FEATURE IMAGE REPRESENTATION FOR HISTOPATHOLOGY IMAGE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    J Carlos Moreno

    2010-09-01

    Full Text Available This paper presents a novel strategy for building a high-dimensional feature space to represent histopathology image contents. Histogram features related to colors, textures, and edges are combined in a unique image representation space using kernel functions. This feature space is further enhanced by the application of Latent Semantic Analysis, to model hidden relationships among visual patterns. All of that information is included in the new image representation space. Then, Support Vector Machine classifiers are used to assign semantic labels to images. Processing and classification algorithms operate on top of kernel functions, so that the structure of the feature space is completely controlled using similarity measures and a dual representation. The proposed approach has shown successful performance in a classification task using a dataset of 1,502 real histopathology images in 18 different classes. The results show that our approach to histological image classification obtains an improved average performance of 20.6% when compared with a conventional classification approach based on an SVM applied directly to the original kernel.
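
    The kernel-combination step can be sketched by summing one Gram matrix per histogram type. This is a minimal illustration with Gaussian kernels and uniform weights; the paper's kernel choices, the Latent Semantic Analysis step, and the SVM training are omitted, and the histograms below are random placeholders.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def combined_kernel(feature_sets, gammas, weights):
    """Weighted sum of per-feature RBF kernels: one kernel per histogram type."""
    K = sum(w * rbf_kernel(F, F, g)
            for F, g, w in zip(feature_sets, gammas, weights))
    return K / sum(weights)

rng = np.random.default_rng(0)
color_hist = rng.random((6, 16))    # toy color histograms for 6 images
texture_hist = rng.random((6, 8))   # toy texture histograms for the same images
K = combined_kernel([color_hist, texture_hist],
                    gammas=[0.5, 0.5], weights=[1.0, 1.0])
```

    Because a convex combination of valid kernels is itself a valid kernel, `K` can be handed directly to any kernel classifier, which is what lets the whole pipeline stay in the dual representation.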

  9. Evaluation of soft segment modeling on a context independent phoneme classification system

    International Nuclear Information System (INIS)

    Razzazi, F.; Sayadiyan, A.

    2007-01-01

    The geometric distribution of state durations is one of the main performance-limiting assumptions of hidden Markov modelling of speech signals. Stochastic segment models in general, and segmental HMMs in particular, partly overcome this deficiency at the cost of more complexity in both the training and recognition phases. In addition to this assumption, the gradual temporal change of speech statistics is not modelled in HMMs. In this paper, a new duration modeling approach is presented. The main idea of the model is to consider the effect of adjacent segments on the estimation and evaluation of each acoustic segment's probability density function. This idea not only makes the model robust against segmentation errors, but also models the gradual change from one segment to the next with a minimal set of parameters. The proposed idea is analytically formulated and tested on a TIMIT-based context-independent phoneme classification system. During the test procedure, phoneme classification for different phoneme classes was performed by applying the various proposed recognition algorithms. The system was optimized and the results compared with a continuous-density hidden Markov model (CDHMM) of similar computational complexity. The results show an 8-10% improvement in phoneme recognition rate compared with the standard continuous-density hidden Markov model, indicating the improved compatibility of the proposed model with the nature of speech. (author)

  10. Organizational information assets classification model and security architecture methodology

    Directory of Open Access Journals (Sweden)

    Mostafa Tamtaji

    2015-12-01

    Full Text Available Today, organizations are exposed to a huge diversity of information and information assets produced in different systems such as KMS, financial and accounting systems, office and industrial automation systems, and so on, and protection of this information is necessary. Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released. The several benefits of this model mean that organizations show a strong trend toward implementing cloud computing, while maintaining and managing information security remains the main challenge in developing and accepting it. In this paper, following the design science research methodology and compatible with the design process in information systems research, a complete categorization of organizational assets is first presented, comprising 355 different types of information assets in 7 groups and 3 levels, so that managers can plan the corresponding security controls according to the importance of each group. Then, an appropriate methodology is presented for guiding an organization in architecting its information security in a cloud computing environment. The resulting cloud computing security architecture, the proposed methodology, and the presented classification model were discussed and verified using the Delphi method and expert comments.

  11. Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification

    Directory of Open Access Journals (Sweden)

    Jie Hu

    2018-02-01

    Full Text Available Many text mining tasks such as text retrieval, text summarization, and text comparison depend on the extraction of representative keywords from the main text. Most existing keyword extraction algorithms are based on discrete bag-of-words representations of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for keyword extraction evaluation based on information gain and cross-validation with Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard benchmark dataset and a homemade patent dataset to evaluate the performance of PKEA. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank, and Rapid Automatic Keyword Extraction (RAKE). The experimental results show that our proposed algorithm provides a promising way to extract keywords from patent texts for patent classification.
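
    A drastically simplified version of embedding-based keyword ranking scores each candidate word by the cosine similarity between its vector and the document centroid. The tiny embedding table below is hypothetical; PKEA's actual scoring over Skip-gram vectors is more involved.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def rank_keywords(doc_tokens, embeddings, top_k=2):
    """Rank candidate keywords by similarity of their vector to the document centroid."""
    vecs = np.array([embeddings[t] for t in doc_tokens if t in embeddings])
    centroid = vecs.mean(axis=0)
    candidates = {t for t in doc_tokens if t in embeddings}
    ranked = sorted(candidates,
                    key=lambda t: cosine(embeddings[t], centroid),
                    reverse=True)
    return ranked[:top_k]

# hypothetical 2-D embeddings; a real system would train Skip-gram vectors
emb = {"lidar": np.array([1.0, 0.1]),
       "radar": np.array([0.9, 0.2]),
       "the":   np.array([-1.0, 0.5])}
keywords = rank_keywords(["lidar", "radar", "the", "lidar"], emb)
```

    Domain terms that point in the same direction as the document centroid outrank function words, which is the intuition that distributed representations add over raw frequency counts.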

  12. Automated segmentation of atherosclerotic histology based on pattern classification

    Directory of Open Access Journals (Sweden)

    Arna van Engelen

    2013-01-01

    Full Text Available Background: Histology sections provide accurate information on atherosclerotic plaque composition, and are used in various applications. To our knowledge, no automated systems for plaque component segmentation in histology sections currently exist. Materials and Methods: We perform pixel-wise classification of fibrous, lipid, and necrotic tissue in Elastica Von Gieson-stained histology sections, using features based on color channel intensity and local image texture and structure. We compare an approach where we train on independent data to an approach where we train on one or two sections per specimen in order to segment the remaining sections. We evaluate the results on segmentation accuracy in histology, and we use the obtained histology segmentations to train plaque component classification methods in ex vivo magnetic resonance imaging (MRI) and in vivo MRI and computed tomography (CT). Results: In leave-one-specimen-out experiments on 176 histology slices of 13 plaques, a pixel-wise accuracy of 75.7 ± 6.8% was obtained. This increased to 77.6 ± 6.5% when two manually annotated slices of the specimen to be segmented were used for training. Rank correlations of relative component volumes with manually annotated volumes were high in this situation (P = 0.82-0.98). Using the obtained histology segmentations to train plaque component classification methods in ex vivo MRI and in vivo MRI and CT resulted in similar image segmentations for training on the automated histology segmentations as for training on a fully manual ground truth. The size of the lipid-rich necrotic core was significantly smaller when training on fully automated histology segmentations than when manually annotated histology sections were used. This difference was reduced and not statistically significant when one or two slices per section were manually annotated for histology segmentation. Conclusions: Good histology segmentations can be obtained by automated segmentation

  13. An object-based approach to hierarchical classification of the Earth's topography from SRTM data

    Science.gov (United States)

    Eisank, C.; Dragut, L.

    2012-04-01

    Digital classification of the Earth's surface has benefited significantly from the availability of global DEMs and recent advances in image processing techniques. One such innovative approach is object-based analysis, which integrates multi-scale segmentation and rule-based classification. Since the classification is based on spatially configured objects and no longer solely on thematically defined cells, the resulting landforms or landform types are represented in a more realistic way. Up to now, however, the object-based approach has not been adopted for broad-scale topographic modelling. Existing global to almost-global terrain classification systems have been implemented as per-cell schemes, accepting disadvantages such as the speckled character of the outputs and the non-consideration of space. We introduce the first object-based method to automatically classify the Earth's surface, as represented by the SRTM, into a three-level hierarchy of topographic regions. The new method relies on the concept of decomposing land-surface complexity into ever more homogeneous domains. The SRTM elevation layer is automatically segmented and classified at three levels that represent domains of complexity, using self-adaptive, data-driven techniques. For each domain, scales in the data are detected with the help of local variance, and segmentation is performed at these recognised scales. Objects resulting from segmentation are partitioned into sub-domains based on thresholds given by the mean values of elevation and the standard deviation of elevation, respectively. The results resemble patterns of existing global and regional classifications, displaying a level of detail close to manually drawn maps. Statistical evaluation indicates that most of the classes satisfy the regionalisation requirements of maximising internal homogeneity while minimising external homogeneity. Most objects have boundaries matching natural discontinuities at the regional level. The method is simple and fully

  14. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is

  15. Research into Financial Position of Listed Companies following Classification via Extreme Learning Machine Based upon DE Optimization

    Directory of Open Access Journals (Sweden)

    Fu Yu

    2016-01-01

    Full Text Available Using a model of the extreme learning machine based on differential evolution (DE) optimization, this article centers on the optimization thinking behind such a model as well as its application in classifying the financial position of listed companies. Comparison shows that the improved extreme learning machine algorithm based on DE optimization outperforms the traditional extreme learning machine algorithm. Meanwhile, this article also introduces research thinking on extreme learning machines into the economics classification area, so as to fulfill the purpose of computerizing the speedy but effective evaluation of the massive financial statements of listed companies belonging to different classes.
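
    The extreme learning machine itself is compact enough to sketch: a random hidden layer followed by a least-squares solve for the output weights. In the DE-optimized variant, the hidden-layer parameters would be evolved by differential evolution rather than fixed at random; that step is omitted here, and the "financial" data are synthetic.

```python
import numpy as np

def elm_train(X, y, n_hidden, rng):
    """Extreme learning machine: random hidden layer + least-squares output weights.

    A DE-optimized ELM would evolve W and b with differential evolution
    instead of drawing them once at random.
    """
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)           # hidden-layer activations
    beta = np.linalg.pinv(H) @ y     # closed-form least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 4))                 # toy financial-ratio features
y = (X[:, 0] + X[:, 1] > 0).astype(float)        # toy distress/healthy label
W, b, beta = elm_train(X, y, n_hidden=30, rng=rng)
acc = ((elm_predict(X, W, b, beta) > 0.5) == y).mean()
```

    Because training is a single pseudo-inverse, ELMs are attractive exactly for the "speedy but effective" bulk evaluation the abstract describes.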

  16. Sky camera imagery processing based on a sky classification using radiometric data

    International Nuclear Information System (INIS)

    Alonso, J.; Batlles, F.J.; López, G.; Ternero, A.

    2014-01-01

    As part of the development and expansion of CSP (concentrated solar power) technology, one of the most important operational requirements is to have complete control of all factors which may affect the quantity and quality of the solar power produced. New developments and tools in this field are focused on weather forecasting, improving both operational security and electricity production. Such is the case with sky cameras, devices which are currently in use in some CSP plants and whose use is expanding in the new technology sector. Their application is mainly focused on cloud detection, estimating cloud movement as well as cloud influence on solar radiation attenuation; indeed, the presence of clouds is the greatest single factor attenuating solar radiation. The aim of this work is the detection and analysis of clouds from images taken by a TSI-880 model sky camera. In order to obtain accurate image processing, three different models were created, based on a previous sky classification using radiometric data and representative sky-condition parameters. As a consequence, the sky can be classified as cloudless, partially-cloudy or overcast, delivering an average success rate of 92% in sky classification and cloud detection. - Highlights: • We developed a methodology for detection of clouds in total sky imagery (TSI-880). • A classification of sky is presented according to radiometric data and sky parameters. • The sky can be classified as cloudless, partially cloudy and overcast. • Image processing is based on the sky classification for the detection of clouds. • The average success of the developed model is around 92%
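
    The paper's three-way sky classification can be illustrated with a toy rule on two common radiometric parameters. The threshold values below are placeholders invented for illustration; the actual models were fitted to the authors' radiometric data and are not reproduced here.

```python
def classify_sky(clearness_index, diffuse_fraction):
    """Classify sky condition from two radiometric parameters.

    Class names follow the paper; the numeric thresholds are
    illustrative placeholders, not the authors' fitted values.
    """
    if clearness_index > 0.6 and diffuse_fraction < 0.3:
        return "cloudless"
    if clearness_index < 0.3 or diffuse_fraction > 0.8:
        return "overcast"
    return "partially cloudy"

print(classify_sky(0.75, 0.15))  # clear conditions
print(classify_sky(0.20, 0.95))  # heavy cloud cover
```

    Each of the three classes then selects its own image-processing model for cloud detection, which is the structure the abstract describes.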

  17. Sequence-based classification and identification of Fungi.

    Science.gov (United States)

    Hibbett, David; Abarenkov, Kessy; Kõljalg, Urmas; Öpik, Maarja; Chai, Benli; Cole, James; Wang, Qiong; Crous, Pedro; Robert, Vincent; Helgason, Thorunn; Herr, Joshua R; Kirk, Paul; Lueschow, Shiloh; O'Donnell, Kerry; Nilsson, R Henrik; Oono, Ryoko; Schoch, Conrad; Smyth, Christopher; Walker, Donald M; Porras-Alfaro, Andrea; Taylor, John W; Geiser, David M

    Fungal taxonomy and ecology have been revolutionized by the application of molecular methods and both have increasing connections to genomics and functional biology. However, data streams from traditional specimen- and culture-based systematics are not yet fully integrated with those from metagenomic and metatranscriptomic studies, which limits understanding of the taxonomic diversity and metabolic properties of fungal communities. This article reviews current resources, needs, and opportunities for sequence-based classification and identification (SBCI) in fungi as well as related efforts in prokaryotes. To realize the full potential of fungal SBCI it will be necessary to make advances in multiple areas. Improvements in sequencing methods, including long-read and single-cell technologies, will empower fungal molecular ecologists to look beyond ITS and current shotgun metagenomics approaches. Data quality and accessibility will be enhanced by attention to data and metadata standards and rigorous enforcement of policies for deposition of data and workflows. Taxonomic communities will need to develop best practices for molecular characterization in their focal clades, while also contributing to globally useful datasets including ITS. Changes to nomenclatural rules are needed to enable valid publication of sequence-based taxon descriptions. Finally, cultural shifts are necessary to promote adoption of SBCI and to accord professional credit to individuals who contribute to community resources.

  18. Fuzzy clustering-based feature extraction method for mental task classification.

    Science.gov (United States)

    Gupta, Akshansh; Kumar, Dhirendra

    2017-06-01

    A brain computer interface (BCI) is a communication system by which a person can send messages or requests for basic necessities without using peripheral nerves and muscles. Response to mental task-based BCI is one of the privileged areas of investigation. Electroencephalography (EEG) signals are used to represent the brain activities in the BCI domain. For any mental task classification model, the performance of the learning model depends on the extraction of features from the EEG signal. In the literature, the wavelet transform and empirical mode decomposition are two popular feature extraction methods used to analyze signals having non-linear and non-stationary properties. Adopting the virtues of both techniques, an adaptive filter-based method for decomposing non-linear and non-stationary signals, known as the empirical wavelet transform (EWT), has recently been proposed. EWT does not work well for signals whose components overlap in the frequency and time domains, and it fails to provide good features for further classification. In this work, the fuzzy c-means algorithm is utilized along with EWT to handle this problem. It has been observed from the experimental results that EWT along with fuzzy clustering outperforms EWT alone for the EEG-based response to mental task problem. Further, in the case of mental task classification, the ratio of samples to features is very small. To handle this problem, we have also utilized three well-known multivariate feature selection methods, viz. Bhattacharyya distance (BD), ratio of scatter matrices (SR), and linear regression (LR). The results of the experiment demonstrate that the performance of mental task classification is improved considerably by the aforesaid methods. A ranking method and Friedman's statistical test are also performed to rank and compare the different combinations of feature extraction and feature selection methods, which endorses the efficacy of the proposed approach.
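
    The fuzzy c-means step used alongside EWT can be sketched in a few lines of NumPy: alternate between weighted center updates and soft-membership updates. The demonstration data below are synthetic 2-D points, not EEG features, and all sizes are illustrative.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means: alternate center and soft-membership updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)             # memberships sum to 1 per sample
    for _ in range(n_iter):
        Um = U ** m
        C = (Um.T @ X) / Um.sum(axis=0)[:, None]  # fuzzily weighted cluster centers
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        U = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
    return U, C

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])
U, C = fuzzy_c_means(X)
print(np.round(C, 1))  # centers near (-2, -2) and (2, 2)
```

    The fuzzifier m controls how soft the memberships are; m → 1 recovers hard k-means behaviour.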

  19. Hardwood species classification with DWT based hybrid texture ...

    Indian Academy of Sciences (India)

    durability, availability and rational use of available resources. This would also help in avoiding ... have used different classifiers like MLP-BP-ANN, Pearson correlation, energy value, and SVM and achieved classification ... best trade-off between classification accuracy and computational time. The RF classifier needs.

  20. Real-time human action classification using a dynamic neural model.

    Science.gov (United States)

    Yu, Zhibin; Lee, Minho

    2015-09-01

    The multiple timescale recurrent neural network (MTRNN) model is a useful tool for recording and regenerating a continuous signal for dynamic tasks. However, our research shows that the MTRNN model is difficult to use for the classification of multiple types of motion when observing a human action. Therefore, in this paper, we propose a new supervised MTRNN model for handling the issue of action classification. Instead of setting the initial states, we define a group of slow context nodes as "classification nodes." The supervised MTRNN model provides both prediction and classification outputs simultaneously during testing. Our experimental results show that the supervised MTRNN model inherits the basic functions of an MTRNN and can be used to generate action signals. In addition, the results show that the supervised MTRNN model is more robust than the original MTRNN in both action-sequence generation and action classification tasks. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. A Hierarchical Feature Extraction Model for Multi-Label Mechanical Patent Classification

    Directory of Open Access Journals (Sweden)

    Jie Hu

    2018-01-01

    Full Text Available Various studies have focused on feature extraction methods for automatic patent classification in recent years. However, most of these approaches are based on knowledge from experts in related domains. Here we propose a hierarchical feature extraction model (HFEM) for multi-label mechanical patent classification, which is able to capture both local features of phrases as well as global and temporal semantics. First, an n-gram feature extractor based on convolutional neural networks (CNNs) is designed to extract salient local lexical-level features. Next, a long dependency feature extraction model based on the bidirectional long short-term memory (BiLSTM) neural network model is proposed to capture sequential correlations from higher-level sequence representations. Then the HFEM algorithm and its hierarchical feature extraction architecture are detailed. We establish the training, validation and test datasets, containing 72,532, 18,133, and 2679 mechanical patent documents, respectively, and then evaluate the performance of the HFEM. Finally, we compare the results of the proposed HFEM with three other single neural network models, namely CNN, long short-term memory (LSTM), and BiLSTM. The experimental results indicate that our proposed HFEM outperforms the other compared models in both precision and recall.
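
    The first stage of the HFEM, the CNN-based n-gram feature extractor, can be sketched in plain NumPy: embed tokens, slide n-gram windows over the sequence, apply filters with ReLU, and max-pool over time. All weights and sizes below are random placeholders; in the paper the filters are trained and the resulting features feed a BiLSTM, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: vocabulary of 20 token ids, random embeddings and filters.
vocab_size, emb_dim, n_filters, n = 20, 8, 4, 3   # n = n-gram width
E = rng.normal(size=(vocab_size, emb_dim))        # embedding table
F = rng.normal(size=(n_filters, n, emb_dim))      # one filter per n-gram detector

def ngram_features(token_ids):
    """Convolve n-gram filters over the embedded sequence, ReLU, max-pool over time."""
    S = E[token_ids]                                                 # (seq_len, emb_dim)
    windows = np.stack([S[i:i + n] for i in range(len(S) - n + 1)])  # (T, n, emb_dim)
    acts = np.maximum(0.0, np.einsum("tne,fne->tf", windows, F))     # (T, n_filters)
    return acts.max(axis=0)                                          # -> (n_filters,)

feats = ngram_features(np.array([1, 5, 3, 7, 2, 9]))
print(feats.shape)
```

    Max-over-time pooling is what makes the representation length-independent, so patents of different lengths yield fixed-size local feature vectors.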

  2. Classification of Land Cover and Land Use Based on Convolutional Neural Networks

    Science.gov (United States)

    Yang, Chun; Rottensteiner, Franz; Heipke, Christian

    2018-04-01

    Land cover describes the physical material of the earth's surface, whereas land use describes the socio-economic function of a piece of land. Land use information is typically collected in geospatial databases. As such databases become outdated quickly, an automatic update process is required. This paper presents a new approach to determine land cover and to classify land use objects based on convolutional neural networks (CNN). The input data are aerial images and derived data such as digital surface models. Firstly, we apply a CNN to determine the land cover for each pixel of the input image. We compare different CNN structures, all of them based on an encoder-decoder structure for obtaining dense class predictions. Secondly, we propose a new CNN-based methodology for the prediction of the land use label of objects from a geospatial database. In this context, we present a strategy for generating image patches of identical size from the input data, which are classified by a CNN. Again, we compare different CNN architectures. Our experiments show that an overall accuracy of up to 85.7 % and 77.4 % can be achieved for land cover and land use, respectively. The land cover classification also makes a positive contribution to the classification of land use.

  3. Toward a Safety Risk-Based Classification of Unmanned Aircraft

    Science.gov (United States)

    Torres-Pomales, Wilfredo

    2016-01-01

    There is a trend of growing interest and demand for greater access of unmanned aircraft (UA) to the National Airspace System (NAS) as the ongoing development of UA technology has created the potential for significant economic benefits. However, the lack of a comprehensive and efficient UA regulatory framework has constrained the number and kinds of UA operations that can be performed. This report presents initial results of a study aimed at defining a safety-risk-based UA classification as a plausible basis for a regulatory framework for UA operating in the NAS. Much of the study up to this point has been at a conceptual high level. The report includes a survey of contextual topics, analysis of safety risk considerations, and initial recommendations for a risk-based approach to safe UA operations in the NAS. The next phase of the study will develop and leverage deeper clarity and insight into practical engineering and regulatory considerations for ensuring that UA operations have an acceptable level of safety.

  4. A Neural-Network-Based Semi-Automated Geospatial Classification Tool

    Science.gov (United States)

    Hale, R. G.; Herzfeld, U. C.

    2014-12-01

    North America's largest glacier system, the Bering Bagley Glacier System (BBGS) in Alaska, surged in 2011-2013, as shown by rapid mass transfer, elevation change, and heavy crevassing. Little is known about the physics controlling surge glaciers' semi-cyclic patterns; therefore, it is crucial to collect and analyze as much data as possible so that predictive models can be made. In addition, physical signs frozen in ice in the form of crevasses may help serve as a warning for future surges. The BBGS surge provided an opportunity to develop an automated classification tool for crevasse classification based on imagery collected from small aircraft. The classification allows one to link image classification to geophysical processes associated with ice deformation. The tool uses an approach that employs geostatistical functions and a feed-forward perceptron with error back-propagation. The connectionist-geostatistical approach uses directional experimental (discrete) variograms to parameterize images into a form that the Neural Network (NN) can recognize. In an application performing analysis of airborne videographic data from the surge of the BBGS, an NN was able to distinguish 18 different crevasse classes with 95 percent or higher accuracy, for over 3,000 images. Recognizing that each surge wave results in different crevasse types and that environmental conditions affect the appearance in imagery, we designed the tool's semi-automated pre-training algorithm to be adaptable. The tool can be optimized to specific settings and variables of image analysis (airborne and satellite imagery, different camera types, observation altitude, number and types of classes, and resolution). The generalization of the classification tool brings three important advantages: (1) multiple types of problems in geophysics can be studied, (2) the training process is sufficiently formalized to allow non-experts in neural nets to perform the training process, and (3) the time required to

  5. The Zipf Law revisited: An evolutionary model of emerging classification

    Energy Technology Data Exchange (ETDEWEB)

    Levitin, L.B. [Boston Univ., MA (United States); Schapiro, B. [TINA, Brandenburg (Germany); Perlovsky, L. [NRC, Wakefield, MA (United States)

    1996-12-31

    Zipf's Law is a remarkable rank-frequency relationship observed in linguistics (the frequencies of the use of words are approximately inversely proportional to their ranks in the decreasing frequency order) as well as in the behavior of many complex systems of surprisingly different nature. We suggest an evolutionary model of emerging classification of objects into classes corresponding to concepts and denoted by words. The evolution of the system is derived from two basic assumptions: first, the probability to recognize an object as belonging to a known class is proportional to the number of objects in this class already recognized, and, second, there exists a small probability to observe an object that requires creation of a new class (a "mutation" that gives birth to a new "species"). It is shown that the populations of classes in such a system obey the Zipf Law provided that the rate of emergence of new classes is small. The model also leads to the emergence of a second-tier structure of "super-classes": groups of classes with almost equal populations.
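
    The two assumptions translate directly into a simulation: each new object either joins an existing class with probability proportional to class size (equivalently, it copies the class of a uniformly chosen earlier object) or, with a small mutation probability, founds a new class. A minimal sketch, with the object count and mutation rate chosen for illustration:

```python
import random
from collections import Counter

random.seed(42)

def simulate(n_objects=20000, mutation=0.01):
    """Evolutionary classification: join an existing class with probability
    proportional to its size, or found a new class with small probability."""
    labels = [0]                                  # the first object founds class 0
    next_class = 1
    for _ in range(n_objects - 1):
        if random.random() < mutation:
            labels.append(next_class)             # "mutation": a new class is born
            next_class += 1
        else:
            labels.append(random.choice(labels))  # preferential attachment
    return Counter(labels)

counts = sorted(simulate().values(), reverse=True)
print(len(counts), counts[:5])
```

    With a small mutation rate, the sorted class populations exhibit the heavy-tailed, approximately inverse-rank behaviour the paper derives: a few old classes dominate while most classes stay tiny.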

  6. Ovarian Cancer Classification based on Mass Spectrometry Analysis of Sera

    Directory of Open Access Journals (Sweden)

    Baolin Wu

    2006-01-01

    Full Text Available In our previous study [1], we have compared the performance of a number of widely used discrimination methods for classifying ovarian cancer using Matrix Assisted Laser Desorption Ionization (MALDI) mass spectrometry data on serum samples obtained from Reflectron mode. Our results demonstrate good performance with a random forest classifier. In this follow-up study, to improve the molecular classification power of the MALDI platform for ovarian cancer disease, we expanded the mass range of the MS data by adding data acquired in Linear mode and evaluated the resultant decrease in classification error. A general statistical framework is proposed to obtain unbiased classification error estimates and to analyze the effects of sample size and number of selected m/z features on classification errors. We also emphasize the importance of combining biological knowledge and statistical analysis to obtain both biologically and statistically sound results. Our study shows improvement in classification accuracy upon expanding the mass range of the analysis. In order to obtain the best classification accuracies possible, we found that a relatively large training sample size is needed to obviate the sample variations. For the ovarian MS dataset that is the focus of the current study, our results show that approximately 20-40 m/z features are needed to achieve the best classification accuracy from MALDI-MS analysis of sera. Supplementary information can be found at http://bioinformatics.med.yale.edu/proteomics/BioSupp2.html.
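
    One point behind such an unbiased-error framework can be demonstrated on pure-noise data: if the top m/z-like features are selected on the full dataset before cross-validation, a worthless classifier looks predictive; selecting inside each CV fold removes the bias. The sketch below uses a nearest-centroid classifier and illustrative sizes, not the paper's random forest or data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 60, 1000, 10           # samples, noise "m/z features", features kept
X = rng.normal(size=(n, p))      # pure noise: true accuracy is 50%
y = np.array([0, 1] * (n // 2))

def nearest_centroid_acc(Xtr, ytr, Xte, yte):
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    pred = (np.linalg.norm(Xte - c1, axis=1) < np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return np.mean(pred == yte)

def top_k(Xtr, ytr):             # rank features by class-mean difference
    score = np.abs(Xtr[ytr == 0].mean(0) - Xtr[ytr == 1].mean(0))
    return np.argsort(score)[-k:]

folds = np.array_split(rng.permutation(n), 5)
leaky = top_k(X, y)              # WRONG: selection has seen the test folds
acc_biased, acc_unbiased = [], []
for f in folds:
    tr = np.setdiff1d(np.arange(n), f)
    acc_biased.append(nearest_centroid_acc(X[np.ix_(tr, leaky)], y[tr], X[np.ix_(f, leaky)], y[f]))
    sel = top_k(X[tr], y[tr])    # RIGHT: selection inside the CV loop
    acc_unbiased.append(nearest_centroid_acc(X[np.ix_(tr, sel)], y[tr], X[np.ix_(f, sel)], y[f]))

print(f"biased: {np.mean(acc_biased):.2f}  unbiased: {np.mean(acc_unbiased):.2f}")
```

    The "biased" estimate sits well above chance even though the data carry no signal, which is exactly the trap a proper framework must avoid.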

  7. 2D Modeling and Classification of Extended Objects in a Network of HRR Radars

    NARCIS (Netherlands)

    Fasoula, A.

    2011-01-01

    In this thesis, the modeling of extended objects with low-dimensional representations of their 2D geometry is addressed. The ultimate objective is the classification of the objects using libraries of such compact 2D object models that are much smaller than in the state-of-the-art classification

  8. Roof Type Selection Based on Patch-Based Classification Using Deep Learning for High Resolution Satellite Imagery

    Science.gov (United States)

    Partovi, T.; Fraundorfer, F.; Azimi, S.; Marmanis, D.; Reinartz, P.

    2017-05-01

    3D building reconstruction from remote sensing image data from satellites is still an active research topic and very valuable for 3D city modelling. The roof model is the most important component to reconstruct the Level of Details 2 (LoD2) for a building in 3D modelling. While the general solution for roof modelling relies on the detailed cues (such as lines, corners and planes) extracted from a Digital Surface Model (DSM), the correct detection of the roof type and its modelling can fail due to low quality of the DSM generated by dense stereo matching. To reduce dependencies of roof modelling on DSMs, the pansharpened satellite images as a rich resource of information are used in addition. In this paper, two strategies are employed for roof type classification. In the first one, building roof types are classified in a state-of-the-art supervised pre-trained convolutional neural network (CNN) framework. In the second strategy, deep features from deep layers of different pre-trained CNN model are extracted and then an RBF kernel using SVM is employed to classify the building roof type. Based on roof complexity of the scene, a roof library including seven types of roofs is defined. A new semi-automatic method is proposed to generate training and test patches of each roof type in the library. Using the pre-trained CNN model not only decreases the computation time for training significantly but also increases the classification accuracy.

  9. ROOF TYPE SELECTION BASED ON PATCH-BASED CLASSIFICATION USING DEEP LEARNING FOR HIGH RESOLUTION SATELLITE IMAGERY

    Directory of Open Access Journals (Sweden)

    T. Partovi

    2017-05-01

    Full Text Available 3D building reconstruction from remote sensing image data from satellites is still an active research topic and very valuable for 3D city modelling. The roof model is the most important component to reconstruct the Level of Details 2 (LoD2) for a building in 3D modelling. While the general solution for roof modelling relies on the detailed cues (such as lines, corners and planes) extracted from a Digital Surface Model (DSM), the correct detection of the roof type and its modelling can fail due to low quality of the DSM generated by dense stereo matching. To reduce dependencies of roof modelling on DSMs, the pansharpened satellite images as a rich resource of information are used in addition. In this paper, two strategies are employed for roof type classification. In the first one, building roof types are classified in a state-of-the-art supervised pre-trained convolutional neural network (CNN) framework. In the second strategy, deep features from deep layers of different pre-trained CNN model are extracted and then an RBF kernel using SVM is employed to classify the building roof type. Based on roof complexity of the scene, a roof library including seven types of roofs is defined. A new semi-automatic method is proposed to generate training and test patches of each roof type in the library. Using the pre-trained CNN model not only decreases the computation time for training significantly but also increases the classification accuracy.

  10. Aspect Based Classification Model for Social Reviews

    Directory of Open Access Journals (Sweden)

    J. Mir

    2017-12-01

    Full Text Available Aspect based opinion mining investigates deeply the emotions related to one's aspects. Aspect and opinion word identification is the core task of aspect based opinion mining. In previous studies, aspect based opinion mining has been applied to service or product domains. Moreover, product reviews are short and simple, whereas social reviews are long and complex. This study introduces an efficient model for social reviews which classifies aspects and opinion words related to the social domain. The main contributions of this paper are the auto tagging and data training phase, the feature set definition, and dictionary usage. The proposed model is compared with the CR model and a Naïve Bayes classifier on the same dataset, achieving accuracy of 98.17% and precision of 96.01%, while recall and F1 are 96.00% and 96.01%, respectively. The experimental results show that the proposed model performs better than the CR model and the Naïve Bayes classifier.
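
    For reference, the Naïve Bayes baseline that the model is compared against can be sketched in a few lines of stdlib Python; the toy reviews and class labels below are invented for illustration.

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing, as a comparison baseline."""
    def fit(self, docs, labels):
        self.classes = set(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        self.class_counts = Counter(labels)
        for doc, c in zip(docs, labels):
            self.word_counts[c].update(doc.lower().split())
        self.vocab = {w for cnt in self.word_counts.values() for w in cnt}
        return self

    def predict(self, doc):
        def log_prob(c):
            total = sum(self.word_counts[c].values())
            lp = math.log(self.class_counts[c] / sum(self.class_counts.values()))
            for w in doc.lower().split():
                lp += math.log((self.word_counts[c][w] + 1) / (total + len(self.vocab)))
            return lp
        return max(self.classes, key=log_prob)

nb = NaiveBayes().fit(
    ["great friendly service", "lovely helpful staff",
     "terrible rude staff", "awful slow service"],
    ["pos", "pos", "neg", "neg"])
print(nb.predict("friendly helpful people"))  # pos
```

    A word-level model like this ignores which aspect an opinion word attaches to, which is precisely the gap aspect based classification addresses.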

  11. Semi-supervised vibration-based classification and condition monitoring of compressors

    Science.gov (United States)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.

  12. Emotional textile image classification based on cross-domain convolutional sparse autoencoders with feature selection

    Science.gov (United States)

    Li, Zuhe; Fan, Yangyu; Liu, Weihua; Yu, Zeqi; Wang, Fengqin

    2017-01-01

    We aim to apply sparse autoencoder-based unsupervised feature learning to emotional semantic analysis for textile images. To tackle the problem of limited training data, we present a cross-domain feature learning scheme for emotional textile image classification using convolutional autoencoders. We further propose a correlation-analysis-based feature selection method for the weights learned by sparse autoencoders to reduce the number of features extracted from large size images. First, we randomly collect image patches on an unlabeled image dataset in the source domain and learn local features with a sparse autoencoder. We then conduct feature selection according to the correlation between different weight vectors corresponding to the autoencoder's hidden units. We finally adopt a convolutional neural network including a pooling layer to obtain global feature activations of textile images in the target domain and send these global feature vectors into logistic regression models for emotional image classification. The cross-domain unsupervised feature learning method achieves 65% to 78% average accuracy in the cross-validation experiments corresponding to eight emotional categories and performs better than conventional methods. Feature selection can reduce the computational cost of global feature extraction by about 50% while improving classification performance.
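
    The correlation-based selection over autoencoder weight vectors can be sketched as a greedy filter: keep a hidden unit only if its weight vector is not strongly correlated with one already kept. The weight matrix below is random with two planted near-duplicate columns, and the threshold is an illustrative choice, not the paper's.

```python
import numpy as np

def select_decorrelated(W, threshold=0.95):
    """Greedily keep hidden units whose weight vectors are not highly
    correlated with any already-kept unit. W has one column per hidden unit."""
    corr = np.abs(np.corrcoef(W.T))      # |correlation| between weight vectors
    kept = []
    for j in range(W.shape[1]):
        if all(corr[j, i] < threshold for i in kept):
            kept.append(j)
    return kept

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 5))             # 5 learned filters of 64 weights each
W = np.hstack([W, W[:, :2] + 1e-3 * rng.normal(size=(64, 2))])  # 2 near-duplicates
print(select_decorrelated(W))            # the planted duplicates are dropped
```

    Dropping redundant filters before the convolutional stage is what yields the roughly 50% reduction in global feature extraction cost reported in the abstract.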

  13. Palm-vein classification based on principal orientation features.

    Directory of Open Access Journals (Sweden)

    Yujia Zhou

    Full Text Available Personal recognition using palm-vein patterns has emerged as a promising alternative for human recognition because of its uniqueness, stability, live body identification, flexibility, and difficulty to cheat. With the expanding application of palm-vein pattern recognition, the corresponding growth of the database has resulted in a long response time. To shorten the response time of identification, this paper proposes a simple and useful classification for palm-vein identification based on principal direction features. In the registration process, the Gaussian-Radon transform is adopted to extract the orientation matrix and then compute the principal direction of a palm-vein image based on the orientation matrix. The database can be classified into six bins based on the value of the principal direction. In the identification process, the principal direction of the test sample is first extracted to ascertain the corresponding bin. One-by-one matching with the training samples is then performed in the bin. To improve recognition efficiency while maintaining better recognition accuracy, two neighborhood bins of the corresponding bin are continuously searched to identify the input palm-vein image. Evaluation experiments are conducted on three different databases, namely, PolyU, CASIA, and the database of this study. Experimental results show that the searching range of one test sample in PolyU, CASIA and our database by the proposed method for palm-vein identification can be reduced to 14.29%, 14.50%, and 14.28%, with retrieval accuracy of 96.67%, 96.00%, and 97.71%, respectively. With 10,000 training samples in the database, the execution time of the identification process by the traditional method is 18.56 s, while that by the proposed approach is 3.16 s. The experimental results confirm that the proposed approach is more efficient than the traditional method, especially for a large database.
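
    The binning scheme can be sketched directly: quantize the principal direction over [0°, 180°) into six bins, and match a probe only against templates in its own bin and the two neighbouring bins. The gallery angles below are invented for illustration.

```python
N_BINS = 6  # the paper partitions the database into six principal-direction bins

def bin_of(angle_deg):
    """Map a principal direction in [0, 180) degrees to one of six bins."""
    return int(angle_deg // (180 / N_BINS)) % N_BINS

def candidate_bins(angle_deg):
    """Search the matching bin plus its two neighbours (with wrap-around)."""
    b = bin_of(angle_deg)
    return {(b - 1) % N_BINS, b, (b + 1) % N_BINS}

# Hypothetical gallery: template id -> principal direction (degrees).
gallery = {"A": 12.0, "B": 47.0, "C": 95.0, "D": 160.0, "E": 29.0}
probe = 33.0
candidates = [tid for tid, ang in gallery.items()
              if bin_of(ang) in candidate_bins(probe)]
print(sorted(candidates))
```

    Only the shortlisted candidates undergo one-by-one matching, which is how the search range shrinks to roughly 14% of the database while retrieval accuracy stays high.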

  14. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification to determine the origin (i.e. manufacturing factory) of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts, collected from 11 Spanish cement factories, are analysed to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated as a binary decision tree is designed based on the collected data. The performance of the obtained classifier was measured by ten-fold cross validation. The results show that the proposed method yields an easy-to-use expert system that is able to determine the origin of a clinker based on its trace element content.

    This paper describes the qualitative identification procedure for Spanish clinkers aimed at determining their origin (factory). This classification of the clinkers is based on their trace element content. Fifteen different clinkers from 11 Spanish cement factories were analysed, determining their contents of Mg, Sr, Ba, Mn, Ti, Zr, Zn and V. An expert system in the form of a binary decision tree was designed from the collected data. The resulting classification was assessed by ten-fold cross validation. The results show that the proposed model is valid for deriving, in a simple way, an expert system capable of determining the origin of a clinker from its trace element content.

  15. Radar-Derived Quantitative Precipitation Estimation Based on Precipitation Classification

    Directory of Open Access Journals (Sweden)

    Lili Yang

    2016-01-01

    Full Text Available A method for improving radar-derived quantitative precipitation estimation is proposed. Tropical vertical profiles of reflectivity (VPRs) are first determined from multiple VPRs. Upon identifying a tropical VPR, the event can be further classified as either tropical-stratiform or tropical-convective rainfall by a fuzzy logic (FL) algorithm. Based on the precipitation-type fields, the reflectivity values are converted into rainfall rate using a Z-R relationship. In order to evaluate the performance of this rainfall classification scheme, three experiments were conducted using three months of data and two study cases. In Experiment I, the Weather Surveillance Radar-1988 Doppler (WSR-88D) default Z-R relationship was applied. In Experiment II, the precipitation regime was separated into convective and stratiform rainfall using the FL algorithm, and corresponding Z-R relationships were used. In Experiment III, the precipitation regime was separated into convective, stratiform, and tropical rainfall, and the corresponding Z-R relationships were applied. The results show that the rainfall rates obtained from all three experiments match closely with the gauge observations, although Experiment II only partly corrected the underestimation seen in Experiment I. Experiment III significantly reduced this underestimation and generated the most accurate radar estimates of rain rate among the three experiments.
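
    The conversion at the heart of the method inverts a regime-specific Z-R power law, Z = aR^b. The sketch below uses the standard WSR-88D default convective pair (a = 300, b = 1.4) together with commonly quoted stratiform and tropical pairs; note that the tropical relationship yields a higher rain rate at the same reflectivity, which is why applying it counters the underestimation.

```python
# Common Z-R pairs: WSR-88D default convective Z = 300 R^1.4,
# Marshall-Palmer-like stratiform Z = 200 R^1.6, tropical Z = 250 R^1.2.
ZR = {"convective": (300.0, 1.4), "stratiform": (200.0, 1.6), "tropical": (250.0, 1.2)}

def rain_rate(dbz, regime):
    """Invert Z = a R^b to get rain rate R (mm/h) from reflectivity in dBZ."""
    a, b = ZR[regime]
    z = 10.0 ** (dbz / 10.0)   # dBZ -> linear reflectivity factor (mm^6 m^-3)
    return (z / a) ** (1.0 / b)

for regime in ZR:
    print(f"{regime:11s} 40 dBZ -> {rain_rate(40.0, regime):5.1f} mm/h")
```

    At 40 dBZ the tropical relationship gives roughly 22 mm/h versus about 12 mm/h for the default, illustrating how regime classification changes the estimate.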

  16. Impact of Biopharmaceutics Classification System-based biowaivers.

    Science.gov (United States)

    Cook, Jack A; Davit, Barbara M; Polli, James E

    2010-10-04

    The Biopharmaceutics Classification System (BCS) is employed to waive in vivo bioequivalence testing (i.e. provide "biowaivers") for new and generic drugs that are BCS class I. Granting biowaivers under systems such as the BCS eliminates unnecessary drug exposures to healthy subjects and provides economic relief, while maintaining the high public health standard for therapeutic equivalence. International scientific consensus suggests class III drugs are also eligible for biowaivers. The objective of this study was to estimate the economic impact of class I BCS-based biowaivers, along with the economic impact of a potential expansion to BCS class III. Methods consider the distribution of drugs across the four BCS classes, numbers of in vivo bioequivalence studies performed from a five year period, and effects of highly variable drugs (HVDs). Results indicate that 26% of all drugs are class I non-HVDs, 7% are class I HVDs, 27% are class III non-HVDs, and 3% are class III HVDs. An estimated 66 to 76 million dollars can be saved each year in clinical study costs if all class I compounds were granted biowaivers. Between 21 and 24 million dollars of this savings is from HVDs. If BCS class III compounds were also granted waivers, an additional direct savings of 62 to 71 million dollars would be realized, with 9 to 10 million dollars coming from HVDs.

  17. Knowledge-based sea ice classification by polarimetric SAR

    DEFF Research Database (Denmark)

    Skriver, Henning; Dierking, Wolfgang

    2004-01-01

    Polarimetric SAR images acquired at C- and L-band over sea ice in the Greenland Sea, Baltic Sea, and Beaufort Sea have been analysed with respect to their potential for ice type classification. The polarimetric data were gathered by the Danish EMISAR and the US AIRSAR which both are airborne...... systems. A hierarchical classification scheme was chosen for sea ice because our knowledge about magnitudes, variations, and dependences of sea ice signatures can be directly considered. The optimal sequence of classification rules and the rules themselves depend on the ice conditions/regimes. The use...... of the polarimetric phase information improves the classification only in the case of thin ice types but is not necessary for thicker ice (above about 30 cm thickness)...
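
    A hierarchical rule sequence of this kind can be sketched as nested thresholds, with the polarimetric phase consulted only on the low-backscatter branch where thin ice lives. All class names follow the hierarchy idea loosely, and every threshold value below is an illustrative placeholder, not one derived from the EMISAR/AIRSAR data.

```python
def classify_ice(backscatter_db, copol_phase_std):
    """Hierarchical rule sequence for sea-ice typing.

    Threshold values are illustrative placeholders; the paper stresses
    that the rules and their order depend on the ice conditions/regimes.
    """
    if backscatter_db < -20.0:
        # Low backscatter: thin ice or open water; phase information helps here.
        return "thin ice" if copol_phase_std < 10.0 else "open water"
    if backscatter_db < -12.0:
        return "first-year ice"
    return "multi-year ice"

print(classify_ice(-25.0, 5.0))   # low backscatter, stable phase
print(classify_ice(-8.0, 30.0))   # high backscatter
```

    Encoding the classifier as an ordered rule cascade is what lets domain knowledge about signature magnitudes and dependences be applied directly, as the abstract describes.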

  18. Classification of CT brain images based on deep learning networks.

    Science.gov (United States)

    Gao, Xiaohong W; Hui, Rui; Tian, Zengmin

    2017-01-01

    While computerised tomography (CT) may have been the first imaging tool to study human brain, it has not yet been implemented into clinical decision making process for diagnosis of Alzheimer's disease (AD). On the other hand, with the nature of being prevalent, inexpensive and non-i