integrating feature selection: Topics by WorldWideScience.org

Sample records for integrating feature selection

Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention.

Science.gov (United States)

Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei

2016-01-13

An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features.
Aging, selective attention, and feature integration.

Science.gov (United States)

Plude, D J; Doussard-Roosevelt, J A

1989-03-01

This study used feature-integration theory as a means of determining the point in processing at which selective attention deficits originate. The theory posits an initial stage of processing in which features are registered in parallel and then a serial process in which features are conjoined to form complex stimuli. Performance of young and older adults on feature versus conjunction search is compared. Analyses of reaction times and error rates suggest that elderly adults in addition to young adults, can capitalize on the early parallel processing stage of visual information processing, and that age decrements in visual search arise as a result of the later, serial stage of processing. Analyses of a third, unconfounded, conjunction search condition reveal qualitatively similar modes of conjunction search in young and older adults. The contribution of age-related data limitations is found to be secondary to the contribution of age decrements in selective attention.
Integrative approaches to the prediction of protein functions based on the feature selection

Directory of Open Access Journals (Sweden)

Lee Hyunju

2009-12-01

Full Text Available Abstract Background Protein function prediction has been one of the most important issues in functional genomics. With the current availability of various genomic data sets, many researchers have attempted to develop integration models that combine all available genomic data for protein function prediction. These efforts have resulted in the improvement of prediction quality and the extension of prediction coverage. However, it has also been observed that integrating more data sources does not always increase the prediction quality. Therefore, selecting data sources that highly contribute to the protein function prediction has become an important issue. Results We present systematic feature selection methods that assess the contribution of genome-wide data sets to predict protein functions and then investigate the relationship between genomic data sources and protein functions. In this study, we use ten different genomic data sources in Mus musculus, including: protein-domains, protein-protein interactions, gene expressions, phenotype ontology, phylogenetic profiles and disease data sources to predict protein functions that are labelled with Gene Ontology (GO terms. We then apply two approaches to feature selection: exhaustive search feature selection using a kernel based logistic regression (KLR, and a kernel based L1-norm regularized logistic regression (KL1LR. In the first approach, we exhaustively measure the contribution of each data set for each function based on its prediction quality. In the second approach, we use the estimated coefficients of features as measures of contribution of data sources. Our results show that the proposed methods improve the prediction quality compared to the full integration of all data sources and other filter-based feature selection methods. We also show that contributing data sources can differ depending on the protein function. Furthermore, we observe that highly contributing data sets can be similar among
Feature selection for high-dimensional integrated data

KAUST Repository

Zheng, Charles

2012-04-26

Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y, and the remainder of the predictors constitute a “noise set” Xu independent of Y. Using Monte Carlo simulations, we investigated the relative performance of two methods: thresholding and singular-value decomposition, in combination with stochastic optimization to determine “empirical bounds” on the small-sample accuracy of an asymptotic approximation. We demonstrate utility of the thresholding and SVD feature selection methods to with respect to a recent infant intestinal gene expression and metagenomics dataset.
Feature selection for high-dimensional integrated data

KAUST Repository

Zheng, Charles; Schwartz, Scott; Chapkin, Robert S.; Carroll, Raymond J.; Ivanov, Ivan

2012-01-01

Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y, and the remainder of the predictors constitute a “noise set” Xu independent of Y. Using Monte Carlo simulations, we investigated the relative performance of two methods: thresholding and singular-value decomposition, in combination with stochastic optimization to determine “empirical bounds” on the small-sample accuracy of an asymptotic approximation. We demonstrate utility of the thresholding and SVD feature selection methods to with respect to a recent infant intestinal gene expression and metagenomics dataset.
Integration trumps selection in object recognition

Science.gov (United States)

Saarela, Toni P.; Landy, Michael S.

2015-01-01

Summary Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several “cues” (color, luminance, texture etc.), and humans can integrate sensory cues to improve detection and recognition [1–3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue-invariance by responding to a given shape independent of the visual cue defining it [5–8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10,11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11,12], imaging [13–16], and single-cell and neural population recordings [17,18]. Besides single features, attention can select whole objects [19–21]. Objects are among the suggested “units” of attention because attention to a single feature of an object causes the selection of all of its features [19–21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near-optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. PMID:25802154
Integration trumps selection in object recognition.

Science.gov (United States)

Saarela, Toni P; Landy, Michael S

2015-03-30

Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several "cues" (color, luminance, texture, etc.), and humans can integrate sensory cues to improve detection and recognition [1-3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue invariance by responding to a given shape independent of the visual cue defining it [5-8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10, 11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11, 12], imaging [13-16], and single-cell and neural population recordings [17, 18]. Besides single features, attention can select whole objects [19-21]. Objects are among the suggested "units" of attention because attention to a single feature of an object causes the selection of all of its features [19-21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. Copyright © 2015 Elsevier Ltd. All rights reserved.
Online feature selection with streaming features.

Science.gov (United States)

Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan

2013-05-01

We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.
Online Feature Selection for Classifying Emphysema in HRCT Images

Directory of Open Access Journals (Sweden)

M. Prasad

2008-06-01

Full Text Available Feature subset selection, applied as a pre- processing step to machine learning, is valuable in dimensionality reduction, eliminating irrelevant data and improving classifier performance. In the classic formulation of the feature selection problem, it is assumed that all the features are available at the beginning. However, in many real world problems, there are scenarios where not all features are present initially and must be integrated as they become available. In such scenarios, online feature selection provides an efficient way to sort through a large space of features. It is in this context that we introduce online feature selection for the classification of emphysema, a smoking related disease that appears as low attenuation regions in High Resolution Computer Tomography (HRCT images. The technique was successfully evaluated on 61 HRCT scans and compared with different online feature selection approaches, including hill climbing, best first search, grafting, and correlation-based feature selection. The results were also compared against ldensity maskr, a standard approach used for emphysema detection in medical image analysis.
Feature selection using genetic algorithms for fetal heart rate analysis

International Nuclear Information System (INIS)

Xu, Liang; Redman, Christopher W G; Georgieva, Antoniya; Payne, Stephen J

2014-01-01

The fetal heart rate (FHR) is monitored on a paper strip (cardiotocogram) during labour to assess fetal health. If necessary, clinicians can intervene and assist with a prompt delivery of the baby. Data-driven computerized FHR analysis could help clinicians in the decision-making process. However, selecting the best computerized FHR features that relate to labour outcome is a pressing research problem. The objective of this study is to apply genetic algorithms (GA) as a feature selection method to select the best feature subset from 64 FHR features and to integrate these best features to recognize unfavourable FHR patterns. The GA was trained on 404 cases and tested on 106 cases (both balanced datasets) using three classifiers, respectively. Regularization methods and backward selection were used to optimize the GA. Reasonable classification performance is shown on the testing set for the best feature subset (Cohen's kappa values of 0.45 to 0.49 using different classifiers). This is, to our knowledge, the first time that a feature selection method for FHR analysis has been developed on a database of this size. This study indicates that different FHR features, when integrated, can show good performance in predicting labour outcome. It also gives the importance of each feature, which will be a valuable reference point for further studies. (paper)
Simultaneous Channel and Feature Selection of Fused EEG Features Based on Sparse Group Lasso

Directory of Open Access Journals (Sweden)

Jin-Jia Wang

2015-01-01

Full Text Available Feature extraction and classification of EEG signals are core parts of brain computer interfaces (BCIs. Due to the high dimension of the EEG feature vector, an effective feature selection algorithm has become an integral part of research studies. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. The high-dimensional fused features are firstly obtained, which include the power spectrum, time-domain statistics, AR model, and the wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses the logistical regression model with Sparse Group Lasso penalized function. The model is fitted on the training data, and parameter estimation is obtained by modified blockwise coordinate descent and coordinate gradient descent method. The best parameters and feature subset are selected by using a 10-fold cross-validation. Finally, the test data is classified using the trained model. Compared with existing channel and feature selection methods, results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can simultaneously achieve channel and feature selection with a lower error rate. The test accuracy on the data used from international BCI Competition IV reached 84.72%.
An Ensemble Method with Integration of Feature Selection and Classifier Selection to Detect the Landslides

Science.gov (United States)

Zhongqin, G.; Chen, Y.

2017-12-01

Abstract Quickly identify the spatial distribution of landslides automatically is essential for the prevention, mitigation and assessment of the landslide hazard. It's still a challenging job owing to the complicated characteristics and vague boundary of the landslide areas on the image. The high resolution remote sensing image has multi-scales, complex spatial distribution and abundant features, the object-oriented image classification methods can make full use of the above information and thus effectively detect the landslides after the hazard happened. In this research we present a new semi-supervised workflow, taking advantages of recent object-oriented image analysis and machine learning algorithms to quick locate the different origins of landslides of some areas on the southwest part of China. Besides a sequence of image segmentation, feature selection, object classification and error test, this workflow ensemble the feature selection and classifier selection. The feature this study utilized were normalized difference vegetation index (NDVI) change, textural feature derived from the gray level co-occurrence matrices (GLCM), spectral feature and etc. The improvement of this study shows this algorithm significantly removes some redundant feature and the classifiers get fully used. All these improvements lead to a higher accuracy on the determination of the shape of landslides on the high resolution remote sensing image, in particular the flexibility aimed at different kinds of landslides.
Investigation of selected surface integrity features of duplex stainless steel (DSS after turning

Directory of Open Access Journals (Sweden)

G. Krolczyk

2015-01-01

Full Text Available The article presents surface roughness profiles and Abbott - Firestone curves with vertical and amplitude parameters of surface roughness after turning by means of a coated sintered carbide wedge with a coating with ceramic intermediate layer. The investigation comprised the influence of cutting speed on the selected features of surface integrity in dry machining. The material under investigation was duplex stainless steel with two-phase ferritic-austenitic structure. The tests have been performed under production conditions during machining of parts for electric motors and deep-well pumps. The obtained results allow to draw conclusions about the characteristics of surface properties of the machined parts.
Feature confirmation in object perception: Feature integration theory 26 years on from the Treisman Bartlett lecture.

Science.gov (United States)

Humphreys, Glyn W

2016-10-01

The Treisman Bartlett lecture, reported in the Quarterly Journal of Experimental Psychology in 1988, provided a major overview of the feature integration theory of attention. This has continued to be a dominant account of human visual attention to this day. The current paper provides a summary of the work reported in the lecture and an update on critical aspects of the theory as applied to visual object perception. The paper highlights the emergence of findings that pose significant challenges to the theory and which suggest that revisions are required that allow for (a) several rather than a single form of feature integration, (b) some forms of feature integration to operate preattentively, (c) stored knowledge about single objects and interactions between objects to modulate perceptual integration, (d) the application of feature-based inhibition to object files where visual features are specified, which generates feature-based spreading suppression and scene segmentation, and (e) a role for attention in feature confirmation rather than feature integration in visual selection. A feature confirmation account of attention in object perception is outlined.
Feature-selective attention in healthy old age: a selective decline in selective attention?

Science.gov (United States)

Quigley, Cliodhna; Müller, Matthias M

2014-02-12

Deficient selection against irrelevant information has been proposed to underlie age-related cognitive decline. We recently reported evidence for maintained early sensory selection when older and younger adults used spatial selective attention to perform a challenging task. Here we explored age-related differences when spatial selection is not possible and feature-selective attention must be deployed. We additionally compared the integrity of feedforward processing by exploiting the well established phenomenon of suppression of visual cortical responses attributable to interstimulus competition. Electroencephalogram was measured while older and younger human adults responded to brief occurrences of coherent motion in an attended stimulus composed of randomly moving, orientation-defined, flickering bars. Attention was directed to horizontal or vertical bars by a pretrial cue, after which two orthogonally oriented, overlapping stimuli or a single stimulus were presented. Horizontal and vertical bars flickered at different frequencies and thereby elicited separable steady-state visual-evoked potentials, which were used to examine the effect of feature-based selection and the competitive influence of a second stimulus on ongoing visual processing. Age differences were found in feature-selective attentional modulation of visual responses: older adults did not show consistent modulation of magnitude or phase. In contrast, the suppressive effect of a second stimulus was robust and comparable in magnitude across age groups, suggesting that bottom-up processing of the current stimuli is essentially unchanged in healthy old age. Thus, it seems that visual processing per se is unchanged, but top-down attentional control is compromised in older adults when space cannot be used to guide selection.
Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection.

Science.gov (United States)

Li, Baopu; Meng, Max Q-H

2012-05-01

Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.
Attentional Selection of Feature Conjunctions Is Accomplished by Parallel and Independent Selection of Single Features.

Science.gov (United States)

Andersen, Søren K; Müller, Matthias M; Hillyard, Steven A

2015-07-08

Experiments that study feature-based attention have often examined situations in which selection is based on a single feature (e.g., the color red). However, in more complex situations relevant stimuli may not be set apart from other stimuli by a single defining property but by a specific combination of features. Here, we examined sustained attentional selection of stimuli defined by conjunctions of color and orientation. Human observers attended to one out of four concurrently presented superimposed fields of randomly moving horizontal or vertical bars of red or blue color to detect brief intervals of coherent motion. Selective stimulus processing in early visual cortex was assessed by recordings of steady-state visual evoked potentials (SSVEPs) elicited by each of the flickering fields of stimuli. We directly contrasted attentional selection of single features and feature conjunctions and found that SSVEP amplitudes on conditions in which selection was based on a single feature only (color or orientation) exactly predicted the magnitude of attentional enhancement of SSVEPs when attending to a conjunction of both features. Furthermore, enhanced SSVEP amplitudes elicited by attended stimuli were accompanied by equivalent reductions of SSVEP amplitudes elicited by unattended stimuli in all cases. We conclude that attentional selection of a feature-conjunction stimulus is accomplished by the parallel and independent facilitation of its constituent feature dimensions in early visual cortex. The ability to perceive the world is limited by the brain's processing capacity. Attention affords adaptive behavior by selectively prioritizing processing of relevant stimuli based on their features (location, color, orientation, etc.). We found that attentional mechanisms for selection of different features belonging to the same object operate independently and in parallel: concurrent attentional selection of two stimulus features is simply the sum of attending to each of those
Efficient Multi-Label Feature Selection Using Entropy-Based Label Selection

Directory of Open Access Journals (Sweden)

Jaesung Lee

2016-11-01

Full Text Available Multi-label feature selection is designed to select a subset of features according to their importance to multiple labels. This task can be achieved by ranking the dependencies of features and selecting the features with the highest rankings. In a multi-label feature selection problem, the algorithm may be faced with a dataset containing a large number of labels. Because the computational cost of multi-label feature selection increases according to the number of labels, the algorithm may suffer from a degradation in performance when processing very large datasets. In this study, we propose an efficient multi-label feature selection method based on an information-theoretic label selection strategy. By identifying a subset of labels that significantly influence the importance of features, the proposed method efficiently outputs a feature subset. Experimental results demonstrate that the proposed method can identify a feature subset much faster than conventional multi-label feature selection methods for large multi-label datasets.
FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES

Directory of Open Access Journals (Sweden)

Ratri Enggar Pawening

2016-06-01

Full Text Available Datasets with heterogeneous features can affect feature selection results that are not appropriate because it is difficult to evaluate heterogeneous features concurrently. Feature transformation (FT is another way to handle heterogeneous features subset selection. The results of transformation from non-numerical into numerical features may produce redundancy to the original numerical features. In this paper, we propose a method to select feature subset based on mutual information (MI for classifying heterogeneous features. We use unsupervised feature transformation (UFT methods and joint mutual information maximation (JMIM methods. UFT methods is used to transform non-numerical features into numerical features. JMIM methods is used to select feature subset with a consideration of the class label. The transformed and the original features are combined entirely, then determine features subset by using JMIM methods, and classify them using support vector machine (SVM algorithm. The classification accuracy are measured for any number of selected feature subset and compared between UFT-JMIM methods and Dummy-JMIM methods. The average classification accuracy for all experiments in this study that can be achieved by UFT-JMIM methods is about 84.47% and Dummy-JMIM methods is about 84.24%. This result shows that UFT-JMIM methods can minimize information loss between transformed and original features, and select feature subset to avoid redundant and irrelevant features.
Temporal Feature Integration for Music Organisation

DEFF Research Database (Denmark)

Meng, Anders

2006-01-01

This Ph.D. thesis focuses on temporal feature integration for music organisation. Temporal feature integration is the process of combining all the feature vectors of a given time-frame into a single new feature vector in order to capture relevant information in the frame. Several existing methods...... for handling sequences of features are formulated in the temporal feature integration framework. Two datasets for music genre classification have been considered as valid test-beds for music organisation. Human evaluations of these, have been obtained to access the subjectivity on the datasets. Temporal...... ranking' approach is proposed for ranking the short-time features at larger time-scales according to their discriminative power in a music genre classification task. The multivariate AR (MAR) model has been proposed for temporal feature integration. It effectively models local dynamical structure...

Unsupervised Feature Subset Selection

DEFF Research Database (Denmark)

Søndberg-Madsen, Nicolaj; Thomsen, C.; Pena, Jose

2003-01-01

This paper studies filter and hybrid filter-wrapper feature subset selection for unsupervised learning (data clustering). We constrain the search for the best feature subset by scoring the dependence of every feature on the rest of the features, conjecturing that these scores discriminate some ir...... irrelevant features. We report experimental results on artificial and real data for unsupervised learning of naive Bayes models. Both the filter and hybrid approaches perform satisfactorily....
Selecting Optimal Feature Set in High-Dimensional Data by Swarm Search

Directory of Open Access Journals (Sweden)

Simon Fong

2013-01-01

Full Text Available Selecting the right set of features from data of high dimensionality for inducing an accurate classification model is a tough computational challenge. It is almost a NP-hard problem as the combinations of features escalate exponentially as the number of features increases. Unfortunately in data mining, as well as other engineering applications and bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since it takes seemingly forever to use brute force in exhaustively trying every possible combination of features, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier into its fitness function and plugging in any metaheuristic algorithm to facilitate heuristic search. Simulation experiments are carried out by testing the Swarm Search over some high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experiment results show that Swarm Search is able to attain relatively low error rates in classification without shrinking the size of the feature subset to its minimum.
Feature Selection by Reordering

Czech Academy of Sciences Publication Activity Database

Jiřina, Marcel; Jiřina jr., M.

2005-01-01

Roč. 2, č. 1 (2005), s. 155-161 ISSN 1738-6438 Institutional research plan: CEZ:AV0Z10300504 Keywords : feature selection * data reduction * ordering of features Subject RIV: BA - General Mathematics
Principal Feature Analysis: A Multivariate Feature Selection Method for fMRI Data

Directory of Open Access Journals (Sweden)

Lijun Wang

2013-01-01

Full Text Available Brain decoding with functional magnetic resonance imaging (fMRI requires analysis of complex, multivariate data. Multivoxel pattern analysis (MVPA has been widely used in recent years. MVPA treats the activation of multiple voxels from fMRI data as a pattern and decodes brain states using pattern classification methods. Feature selection is a critical procedure of MVPA because it decides which features will be included in the classification analysis of fMRI data, thereby improving the performance of the classifier. Features can be selected by limiting the analysis to specific anatomical regions or by computing univariate (voxel-wise or multivariate statistics. However, these methods either discard some informative features or select features with redundant information. This paper introduces the principal feature analysis as a novel multivariate feature selection method for fMRI data processing. This multivariate approach aims to remove features with redundant information, thereby selecting fewer features, while retaining the most information.
Does Attention Serve to Integrate Features?

Science.gov (United States)

Navon, David; Treisman, Anne

1990-01-01

An article and two commentaries consider the attentional feature-integration theory proposed by A. Treisman and colleagues. Hypotheses about the encoding of conjunctions are reviewed. Whether or not data support perceptual feature-integration is argued. (SLD)
Discriminative semi-supervised feature selection via manifold regularization.

Science.gov (United States)

Xu, Zenglin; King, Irwin; Lyu, Michael Rung-Tsong; Jin, Rong

2010-07-01

Feature selection has attracted a huge amount of interest in both research and application communities of data mining. We consider the problem of semi-supervised feature selection, where we are given a small amount of labeled examples and a large amount of unlabeled examples. Since a small number of labeled samples are usually insufficient for identifying the relevant features, the critical problem arising from semi-supervised feature selection is how to take advantage of the information underneath the unlabeled data. To address this problem, we propose a novel discriminative semi-supervised feature selection method based on the idea of manifold regularization. The proposed approach selects features through maximizing the classification margin between different classes and simultaneously exploiting the geometry of the probability distribution that generates both labeled and unlabeled data. In comparison with previous semi-supervised feature selection algorithms, our proposed semi-supervised feature selection method is an embedded feature selection method and is able to find more discriminative features. We formulate the proposed feature selection method into a convex-concave optimization problem, where the saddle point corresponds to the optimal solution. To find the optimal solution, the level method, a fairly recent optimization method, is employed. We also present a theoretic proof of the convergence rate for the application of the level method to our problem. Empirical evaluation on several benchmark data sets demonstrates the effectiveness of the proposed semi-supervised feature selection method.
Integrated reporting and board features

Directory of Open Access Journals (Sweden)

Rares HURGHIS

2017-02-01

Full Text Available In the last two decades the concept of sustainability reporting gained more importance in the companies’ annual reports, a trend which is embedded also in integrated reporting. Issuing an integrated report became a necessity, because the report explains to the investors how the organization creates value over time. The governance structure, more exactly the board of directors, decides whether or not the company will issue an integrated report. Thus, are there certain features of the board that might influence the issue of an integrated report? Do the companies which issue an integrated report have certain features of the governance structure? Looking for an answer to these questions, we seek for any possible correlations between a disclosure index and the corporate governance structure characteristics, on a sample from the companies participating at the International Integrated Reporting Council Examples Database. The results highlight that only the size of the board influences the extent to which the issued integrated report is in accordance with the International Framework.
EEG feature selection method based on decision tree.

Science.gov (United States)

Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun

2015-01-01

This paper aims to solve automated feature selection problem in brain computer interface (BCI). In order to automate feature selection process, we proposed a novel EEG feature selection method based on decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principle component analysis (PCA) was used, and the selection process based on decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on decision tree to BCI Competition II datasets Ia, and the experiment showed encouraging results.
Feature Import Vector Machine: A General Classifier with Flexible Feature Selection.

Science.gov (United States)

Ghosh, Samiran; Wang, Yazhen

2015-02-01

The support vector machine (SVM) and other reproducing kernel Hilbert space (RKHS) based classifier systems are drawing much attention recently due to its robustness and generalization capability. General theme here is to construct classifiers based on the training data in a high dimensional space by using all available dimensions. The SVM achieves huge data compression by selecting only few observations which lie close to the boundary of the classifier function. However when the number of observations are not very large (small n ) but the number of dimensions/features are large (large p ), then it is not necessary that all available features are of equal importance in the classification context. Possible selection of an useful fraction of the available features may result in huge data compression. In this paper we propose an algorithmic approach by means of which such an optimal set of features could be selected. In short, we reverse the traditional sequential observation selection strategy of SVM to that of sequential feature selection. To achieve this we have modified the solution proposed by Zhu and Hastie (2005) in the context of import vector machine (IVM), to select an optimal sub-dimensional model to build the final classifier with sufficient accuracy.
Temporal feature integration for music genre classification

DEFF Research Database (Denmark)

Meng, Anders; Ahrendt, Peter; Larsen, Jan

2007-01-01

, but they capture neither the temporal dynamics nor dependencies among the individual feature dimensions. Here, a multivariate autoregressive feature model is proposed to solve this problem for music genre classification. This model gives two different feature sets, the diagonal autoregressive (DAR......) and multivariate autoregressive (MAR) features which are compared against the baseline mean-variance as well as two other temporal feature integration techniques. Reproducibility in performance ranking of temporal feature integration methods were demonstrated using two data sets with five and eleven music genres...
Feature Selection for Chemical Sensor Arrays Using Mutual Information

Science.gov (United States)

Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.

2014-01-01

We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058
Enhanced feature integration in musicians

DEFF Research Database (Denmark)

Hansen, Niels Christian; Højlund, Andreas; Møller, Cecilie

the classical oddball control paradigm which used identical sounds. This novel finding supports the dependent processing hypothesis suggesting that musicians recruit overlapping neural resources facilitating more holistic representations of domain-relevant stimuli. These specialised refinements in predictive......Distinguishing and integrating features of sensory input is essential to human survival and no less paramount in music perception and cognition. Yet, little is known about training-induced plasticity of neural mechanisms for auditory feature integration. This study aimed to contrast the two...
Genetic search feature selection for affective modeling

DEFF Research Database (Denmark)

Martínez, Héctor P.; Yannakakis, Georgios N.

2010-01-01

Automatic feature selection is a critical step towards the generation of successful computational models of affect. This paper presents a genetic search-based feature selection method which is developed as a global-search algorithm for improving the accuracy of the affective models built....... The method is tested and compared against sequential forward feature selection and random search in a dataset derived from a game survey experiment which contains bimodal input features (physiological and gameplay) and expressed pairwise preferences of affect. Results suggest that the proposed method...
Feature selection and multi-kernel learning for sparse representation on a manifold

KAUST Repository

Wang, Jim Jing-Yan

2014-03-01

Sparse representation has been widely studied as a part-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of some basic items in a dictionary. Gao etal. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data samples. To overcome this problem, we integrate feature selection and multiple kernel learning into the sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform the traditional sparse coding methods. © 2013 Elsevier Ltd.
Feature selection and multi-kernel learning for sparse representation on a manifold.

Science.gov (United States)

Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

2014-03-01

Sparse representation has been widely studied as a part-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of some basic items in a dictionary. Gao et al. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data samples. To overcome this problem, we integrate feature selection and multiple kernel learning into the sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform the traditional sparse coding methods. Copyright © 2013 Elsevier Ltd. All rights reserved.
Naive Bayes-Guided Bat Algorithm for Feature Selection

Directory of Open Access Journals (Sweden)

Ahmed Majid Taha

2013-01-01

Full Text Available When the amount of data and information is said to double in every 20 months or so, feature selection has become highly important and beneficial. Further improvements in feature selection will positively affect a wide array of applications in fields such as pattern recognition, machine learning, or signal processing. Bio-inspired method called Bat Algorithm hybridized with a Naive Bayes classifier has been presented in this work. The performance of the proposed feature selection algorithm was investigated using twelve benchmark datasets from different domains and was compared to three other well-known feature selection algorithms. Discussion focused on four perspectives: number of features, classification accuracy, stability, and feature generalization. The results showed that BANB significantly outperformed other algorithms in selecting lower number of features, hence removing irrelevant, redundant, or noisy features while maintaining the classification accuracy. BANB is also proven to be more stable than other methods and is capable of producing more general feature subsets.
Naive Bayes-Guided Bat Algorithm for Feature Selection

Science.gov (United States)

Taha, Ahmed Majid; Mustapha, Aida; Chen, Soong-Der

2013-01-01

When the amount of data and information is said to double in every 20 months or so, feature selection has become highly important and beneficial. Further improvements in feature selection will positively affect a wide array of applications in fields such as pattern recognition, machine learning, or signal processing. Bio-inspired method called Bat Algorithm hybridized with a Naive Bayes classifier has been presented in this work. The performance of the proposed feature selection algorithm was investigated using twelve benchmark datasets from different domains and was compared to three other well-known feature selection algorithms. Discussion focused on four perspectives: number of features, classification accuracy, stability, and feature generalization. The results showed that BANB significantly outperformed other algorithms in selecting lower number of features, hence removing irrelevant, redundant, or noisy features while maintaining the classification accuracy. BANB is also proven to be more stable than other methods and is capable of producing more general feature subsets. PMID:24396295
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

KAUST Repository

Soufan, Othman

2015-02-26

Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. The smaller number of features reduces the problem\\'s dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation examines and evaluates simultaneously large number of candidate collections of features. DWFS also integrates various filteringmethods thatmay be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

KAUST Repository

Soufan, Othman; Kleftogiannis, Dimitrios A.; Kalnis, Panos; Bajic, Vladimir B.

2015-01-01

Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. The smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation examines and evaluates simultaneously large number of candidate collections of features. DWFS also integrates various filteringmethods thatmay be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
Feature diagnosticity and task context shape activity in human scene-selective cortex.

Science.gov (United States)

Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S

2016-01-15

Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.

Integrative analysis to select cancer candidate biomarkers to targeted validation

Science.gov (United States)

Heberle, Henry; Domingues, Romênia R.; Granato, Daniela C.; Yokoo, Sami; Canevarolo, Rafael R.; Winck, Flavia V.; Ribeiro, Ana Carolina P.; Brandão, Thaís Bianca; Filgueiras, Paulo R.; Cruz, Karen S. P.; Barbuto, José Alexandre; Poppi, Ronei J.; Minghim, Rosane; Telles, Guilherme P.; Fonseca, Felipe Paiva; Fox, Jay W.; Santos-Silva, Alan R.; Coletta, Ricardo D.; Sherman, Nicholas E.; Paes Leme, Adriana F.

2015-01-01

Targeted proteomics has flourished as the method of choice for prospecting for and validating potential candidate biomarkers in many diseases. However, challenges still remain due to the lack of standardized routines that can prioritize a limited number of proteins to be further validated in human samples. To help researchers identify candidate biomarkers that best characterize their samples under study, a well-designed integrative analysis pipeline, comprising MS-based discovery, feature selection methods, clustering techniques, bioinformatic analyses and targeted approaches was performed using discovery-based proteomic data from the secretomes of three classes of human cell lines (carcinoma, melanoma and non-cancerous). Three feature selection algorithms, namely, Beta-binomial, Nearest Shrunken Centroids (NSC), and Support Vector Machine-Recursive Features Elimination (SVM-RFE), indicated a panel of 137 candidate biomarkers for carcinoma and 271 for melanoma, which were differentially abundant between the tumor classes. We further tested the strength of the pipeline in selecting candidate biomarkers by immunoblotting, human tissue microarrays, label-free targeted MS and functional experiments. In conclusion, the proposed integrative analysis was able to pre-qualify and prioritize candidate biomarkers from discovery-based proteomics to targeted MS. PMID:26540631
Classification Influence of Features on Given Emotions and Its Application in Feature Selection

Science.gov (United States)

Xing, Yin; Chen, Chuang; Liu, Li-Long

2018-04-01

In order to solve the problem that there is a large amount of redundant data in high-dimensional speech emotion features, we analyze deeply the extracted speech emotion features and select better features. Firstly, a given emotion is classified by each feature. Secondly, the recognition rate is ranked in descending order. Then, the optimal threshold of features is determined by rate criterion. Finally, the better features are obtained. When applied in Berlin and Chinese emotional data set, the experimental results show that the feature selection method outperforms the other traditional methods.
Feature Selection via Chaotic Antlion Optimization.

Directory of Open Access Journals (Sweden)

Hossam M Zawbaa

Full Text Available Selecting a subset of relevant properties from a large set of features that describe a dataset is a challenging machine learning task. In biology, for instance, the advances in the available technologies enable the generation of a very large number of biomarkers that describe the data. Choosing the more informative markers along with performing a high-accuracy classification over the data can be a daunting task, particularly if the data are high dimensional. An often adopted approach is to formulate the feature selection problem as a biobjective optimization problem, with the aim of maximizing the performance of the data analysis model (the quality of the data training fitting while minimizing the number of features used.We propose an optimization approach for the feature selection problem that considers a "chaotic" version of the antlion optimizer method, a nature-inspired algorithm that mimics the hunting mechanism of antlions in nature. The balance between exploration of the search space and exploitation of the best solutions is a challenge in multi-objective optimization. The exploration/exploitation rate is controlled by the parameter I that limits the random walk range of the ants/prey. This variable is increased iteratively in a quasi-linear manner to decrease the exploration rate as the optimization progresses. The quasi-linear decrease in the variable I may lead to immature convergence in some cases and trapping in local minima in other cases. The chaotic system proposed here attempts to improve the tradeoff between exploration and exploitation. The methodology is evaluated using different chaotic maps on a number of feature selection datasets. To ensure generality, we used ten biological datasets, but we also used other types of data from various sources. The results are compared with the particle swarm optimizer and with genetic algorithm variants for feature selection using a set of quality metrics.
Feature selection for splice site prediction: A new method using EDA-based feature ranking

Directory of Open Access Journals (Sweden)

Rouzé Pierre

2004-05-01

Full Text Available Abstract Background The identification of relevant biological features in large and complex datasets is an important step towards gaining insight in the processes underlying the data. Other advantages of feature selection include the ability of the classification system to attain good or even better solutions using a restricted subset of features, and a faster classification. Thus, robust methods for fast feature selection are of key importance in extracting knowledge from complex biological data. Results In this paper we present a novel method for feature subset selection applied to splice site prediction, based on estimation of distribution algorithms, a more general framework of genetic algorithms. From the estimated distribution of the algorithm, a feature ranking is derived. Afterwards this ranking is used to iteratively discard features. We apply this technique to the problem of splice site prediction, and show how it can be used to gain insight into the underlying biological process of splicing. Conclusion We show that this technique proves to be more robust than the traditional use of estimation of distribution algorithms for feature selection: instead of returning a single best subset of features (as they normally do this method provides a dynamical view of the feature selection process, like the traditional sequential wrapper methods. However, the method is faster than the traditional techniques, and scales better to datasets described by a large number of features.
SIP-FS: a novel feature selection for data representation

Directory of Open Access Journals (Sweden)

Yiyou Guo

2018-02-01

Full Text Available Abstract Multiple features are widely used to characterize real-world datasets. It is desirable to select leading features with stability and interpretability from a set of distinct features for a comprehensive data description. However, most of existing feature selection methods focus on the predictability (e.g., prediction accuracy of selected results yet neglect stability. To obtain compact data representation, a novel feature selection method is proposed to improve stability, and interpretability without sacrificing predictability (SIP-FS. Instead of mutual information, generalized correlation is adopted in minimal redundancy maximal relevance to measure the relation between different feature types. Several feature types (each contains a certain number of features can then be selected and evaluated quantitatively to determine what types contribute to a specific class, thereby enhancing the so-called interpretability of features. Moreover, stability is introduced in the criterion of SIP-FS to obtain consistent results of ranking. We conduct experiments on three publicly available datasets using one-versus-all strategy to select class-specific features. The experiments illustrate that SIP-FS achieves significant performance improvements in terms of stability and interpretability with desirable prediction accuracy and indicates advantages over several state-of-the-art approaches.
FEATUREOUS: AN INTEGRATED ENVIRONMENT FOR FEATURE-CENTRIC ANALYSIS AND MODIFICATION OF OBJECT-ORIENTED SOFTWARE

DEFF Research Database (Denmark)

Olszak, Andrzej; Jørgensen, Bo Nørregaard

2011-01-01

The decentralized nature of collaborations between objects in object-oriented software makes it difficult to understand the implementations of user-observable program features and their respective interdependencies. As feature-centric program understanding and modification are essential during...... software maintenance and evolution, this situation needs to change. In this paper, we present Featureous, an integrated development environment built on top of the NetBeans IDE that facilitates feature-centric analysis of object-oriented software. Our integrated development environment encompasses...... a lightweight feature location mechanism, a number of reusable analytical views, and necessary APIs for supporting future extensions. The base of the integrated development environment is a conceptual framework comprising of three complementary dimensions of comprehension: perspective, abstraction...
Doubly sparse factor models for unifying feature transformation and feature selection

International Nuclear Information System (INIS)

Katahira, Kentaro; Okanoya, Kazuo; Okada, Masato; Matsumoto, Narihisa; Sugase-Miyamoto, Yasuko

2010-01-01

A number of unsupervised learning methods for high-dimensional data are largely divided into two groups based on their procedures, i.e., (1) feature selection, which discards irrelevant dimensions of the data, and (2) feature transformation, which constructs new variables by transforming and mixing over all dimensions. We propose a method that both selects and transforms features in a common Bayesian inference procedure. Our method imposes a doubly automatic relevance determination (ARD) prior on the factor loading matrix. We propose a variational Bayesian inference for our model and demonstrate the performance of our method on both synthetic and real data.
Doubly sparse factor models for unifying feature transformation and feature selection

Energy Technology Data Exchange (ETDEWEB)

Katahira, Kentaro; Okanoya, Kazuo; Okada, Masato [ERATO, Okanoya Emotional Information Project, Japan Science Technology Agency, Saitama (Japan); Matsumoto, Narihisa; Sugase-Miyamoto, Yasuko, E-mail: okada@k.u-tokyo.ac.j [Human Technology Research Institute, National Institute of Advanced Industrial Science and Technology, Ibaraki (Japan)

2010-06-01

A number of unsupervised learning methods for high-dimensional data are largely divided into two groups based on their procedures, i.e., (1) feature selection, which discards irrelevant dimensions of the data, and (2) feature transformation, which constructs new variables by transforming and mixing over all dimensions. We propose a method that both selects and transforms features in a common Bayesian inference procedure. Our method imposes a doubly automatic relevance determination (ARD) prior on the factor loading matrix. We propose a variational Bayesian inference for our model and demonstrate the performance of our method on both synthetic and real data.
Toward a Comprehensive Framework for Evaluating the Core Integration Features of Enterprise Integration Middleware Technologies

Directory of Open Access Journals (Sweden)

Hossein Moradi

2013-01-01

Full Text Available To achieve greater automation of their business processes, organizations face the challenge of integrating disparate systems. In attempting to overcome this problem, organizations are turning to different kinds of enterprise integration. Implementing enterprise integration is a complex task involving both technological and business challenges and requires appropriate middleware technologies. Different enterprise integration solutions provide various functions and features which lead to the complexity of their evaluation process. To overcome this complexity, appropriate tools for evaluating the core integration features of enterprise integration solutions is required. This paper proposes a new comprehensive framework for evaluating the core integration features of both intra-enterprise and inter-enterprise Integration's enabling technologies, which simplify the process of evaluating the requirements met by enterprise integration middleware technologies.The proposed framework for evaluating the core integration features of enterprise integration middleware technologies was enhanced using the structural and conceptual aspects of previous frameworks. It offers a new schema for which various enterprise integration middleware technologies are categorized in different classifications and are evaluated based on their supporting level for the core integration features' criteria. These criteria include the functional and supporting features. The proposed framework, which is a revised version of our previous framework in this area, has developed the scope, structure and content of the mentioned framework.
Joint Feature Selection and Classification for Multilabel Learning.

Science.gov (United States)

Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong

2018-03-01

Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.
Features Selection for Skin Micro-Image Symptomatic Recognition

Institute of Scientific and Technical Information of China (English)

HUYue-li; CAOJia-lin; ZHAOQian; FENGXu

2004-01-01

Automatic recognition of skin micro-image symptom is important in skin diagnosis and treatment. Feature selection is to improve the classification performance of skin micro-image symptom.This paper proposes a hybrid approach based on the support vector machine (SVM) technique and genetic algorithm (GA) to select an optimum feature subset from the feature group extracted from the skin micro-images. An adaptive GA is introduced for maintaining the convergence rate. With the proposed method, the average cross validation accuracy is increased from 88.25% using all features to 96.92% using only selected features provided by a classifier for classification of 5 classes of skin symptoms. The experimental results are satisfactory.
Features Selection for Skin Micro-Image Symptomatic Recognition

Institute of Scientific and Technical Information of China (English)

HU Yue-li; CAO Jia-lin; ZHAO Qian; FENG Xu

2004-01-01

Automatic recognition of skin micro-image symptom is important in skin diagnosis and treatment. Feature selection is to improve the classification performance of skin micro-image symptom.This paper proposes a hybrid approach based on the support vector machine (SVM) technique and genetic algorithm (GA) to select an optimum feature subset from the feature group extracted from the skin micro-images. An adaptive GA is introduced for maintaining the convergence rate. With the proposed method, the average cross validation accuracy is increased from 88.25% using all features to 96.92 % using only selected features provided by a classifier for classification of 5 classes of skin symptoms. The experimental results are satisfactory.
Classification Using Markov Blanket for Feature Selection

DEFF Research Database (Denmark)

Zeng, Yifeng; Luo, Jian

2009-01-01

Selecting relevant features is in demand when a large data set is of interest in a classification task. It produces a tractable number of features that are sufficient and possibly improve the classification performance. This paper studies a statistical method of Markov blanket induction algorithm...... for filtering features and then applies a classifier using the Markov blanket predictors. The Markov blanket contains a minimal subset of relevant features that yields optimal classification performance. We experimentally demonstrate the improved performance of several classifiers using a Markov blanket...... induction as a feature selection method. In addition, we point out an important assumption behind the Markov blanket induction algorithm and show its effect on the classification performance....
Adversarial Feature Selection Against Evasion Attacks.

Science.gov (United States)

Zhang, Fei; Chan, Patrick P K; Biggio, Battista; Yeung, Daniel S; Roli, Fabio

2016-03-01

Pattern recognition and machine learning techniques have been increasingly adopted in adversarial settings such as spam, intrusion, and malware detection, although their security against well-crafted attacks that aim to evade detection by manipulating data at test time has not yet been thoroughly assessed. While previous work has been mainly focused on devising adversary-aware classification algorithms to counter evasion attempts, only few authors have considered the impact of using reduced feature sets on classifier security against the same attacks. An interesting, preliminary result is that classifier security to evasion may be even worsened by the application of feature selection. In this paper, we provide a more detailed investigation of this aspect, shedding some light on the security properties of feature selection against evasion attacks. Inspired by previous work on adversary-aware classifiers, we propose a novel adversary-aware feature selection model that can improve classifier security against evasion attacks, by incorporating specific assumptions on the adversary's data manipulation strategy. We focus on an efficient, wrapper-based implementation of our approach, and experimentally validate its soundness on different application examples, including spam and malware detection.
Crossmodal integration enhances neural representation of task-relevant features in audiovisual face perception.

Science.gov (United States)

Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Liu, Yongjian; Liang, Changhong; Sun, Pei

2015-02-01

Previous studies have shown that audiovisual integration improves identification performance and enhances neural activity in heteromodal brain areas, for example, the posterior superior temporal sulcus/middle temporal gyrus (pSTS/MTG). Furthermore, it has also been demonstrated that attention plays an important role in crossmodal integration. In this study, we considered crossmodal integration in audiovisual facial perception and explored its effect on the neural representation of features. The audiovisual stimuli in the experiment consisted of facial movie clips that could be classified into 2 gender categories (male vs. female) or 2 emotion categories (crying vs. laughing). The visual/auditory-only stimuli were created from these movie clips by removing the auditory/visual contents. The subjects needed to make a judgment about the gender/emotion category for each movie clip in the audiovisual, visual-only, or auditory-only stimulus condition as functional magnetic resonance imaging (fMRI) signals were recorded. The neural representation of the gender/emotion feature was assessed using the decoding accuracy and the brain pattern-related reproducibility indices, obtained by a multivariate pattern analysis method from the fMRI data. In comparison to the visual-only and auditory-only stimulus conditions, we found that audiovisual integration enhanced the neural representation of task-relevant features and that feature-selective attention might play a role of modulation in the audiovisual integration. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Hybrid feature selection for supporting lightweight intrusion detection systems

Science.gov (United States)

Song, Jianglong; Zhao, Wentao; Liu, Qiang; Wang, Xin

2017-08-01

Redundant and irrelevant features not only cause high resource consumption but also degrade the performance of Intrusion Detection Systems (IDS), especially when coping with big data. These features slow down the process of training and testing in network traffic classification. Therefore, a hybrid feature selection approach in combination with wrapper and filter selection is designed in this paper to build a lightweight intrusion detection system. Two main phases are involved in this method. The first phase conducts a preliminary search for an optimal subset of features, in which the chi-square feature selection is utilized. The selected set of features from the previous phase is further refined in the second phase in a wrapper manner, in which the Random Forest(RF) is used to guide the selection process and retain an optimized set of features. After that, we build an RF-based detection model and make a fair comparison with other approaches. The experimental results on NSL-KDD datasets show that our approach results are in higher detection accuracy as well as faster training and testing processes.
Feature Selection Based on Mutual Correlation

Czech Academy of Sciences Publication Activity Database

Haindl, Michal; Somol, Petr; Ververidis, D.; Kotropoulos, C.

2006-01-01

Roč. 19, č. 4225 (2006), s. 569-577 ISSN 0302-9743. [Iberoamerican Congress on Pattern Recognition. CIARP 2006 /11./. Cancun, 14.11.2006-17.11.2006] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Institutional research plan: CEZ:AV0Z10750506 Keywords : feature selection Subject RIV: BD - Theory of Information Impact factor: 0.402, year: 2005 http://library.utia.cas.cz/separaty/historie/haindl-feature selection based on mutual correlation.pdf
Effective traffic features selection algorithm for cyber-attacks samples

Science.gov (United States)

Li, Yihong; Liu, Fangzheng; Du, Zhenyu

2018-05-01

By studying the defense scheme of Network attacks, this paper propose an effective traffic features selection algorithm based on k-means++ clustering to deal with the problem of high dimensionality of traffic features which extracted from cyber-attacks samples. Firstly, this algorithm divide the original feature set into attack traffic feature set and background traffic feature set by the clustering. Then, we calculates the variation of clustering performance after removing a certain feature. Finally, evaluating the degree of distinctiveness of the feature vector according to the result. Among them, the effective feature vector is whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select out the effective features from the extracted original feature set. In this way, it can reduce the dimensionality of the features so as to reduce the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and it has some advantages over other selection algorithms.
Penalized feature selection and classification in bioinformatics

OpenAIRE

Ma, Shuangge; Huang, Jian

2008-01-01

In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...
A Meta-Heuristic Regression-Based Feature Selection for Predictive Analytics

Directory of Open Access Journals (Sweden)

Bharat Singh

2014-11-01

Full Text Available A high-dimensional feature selection having a very large number of features with an optimal feature subset is an NP-complete problem. Because conventional optimization techniques are unable to tackle large-scale feature selection problems, meta-heuristic algorithms are widely used. In this paper, we propose a particle swarm optimization technique while utilizing regression techniques for feature selection. We then use the selected features to classify the data. Classification accuracy is used as a criterion to evaluate classifier performance, and classification is accomplished through the use of k-nearest neighbour (KNN and Bayesian techniques. Various high dimensional data sets are used to evaluate the usefulness of the proposed approach. Results show that our approach gives better results when compared with other conventional feature selection algorithms.

Feature Selection Criteria for Real Time EKF-SLAM Algorithm

Directory of Open Access Journals (Sweden)

Fernando Auat Cheein

2010-02-01

Full Text Available This paper presents a seletion procedure for environmet features for the correction stage of a SLAM (Simultaneous Localization and Mapping algorithm based on an Extended Kalman Filter (EKF. This approach decreases the computational time of the correction stage which allows for real and constant-time implementations of the SLAM. The selection procedure consists in chosing the features the SLAM system state covariance is more sensible to. The entire system is implemented on a mobile robot equipped with a range sensor laser. The features extracted from the environment correspond to lines and corners. Experimental results of the real time SLAM algorithm and an analysis of the processing-time consumed by the SLAM with the feature selection procedure proposed are shown. A comparison between the feature selection approach proposed and the classical sequential EKF-SLAM along with an entropy feature selection approach is also performed.
Novel Automatic Filter-Class Feature Selection for Machine Learning Regression

DEFF Research Database (Denmark)

Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard

2017-01-01

With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... model in the feature selection process. PCA is often used in machine learning litterature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in both prediction error and computational speed. RDESF is a new step into filter-based automatic feature...
Integration of heterogeneous features for remote sensing scene classification

Science.gov (United States)

Wang, Xin; Xiong, Xingnan; Ning, Chen; Shi, Aiye; Lv, Guofang

2018-01-01

Scene classification is one of the most important issues in remote sensing (RS) image processing. We find that features from different channels (shape, spectral, texture, etc.), levels (low-level and middle-level), or perspectives (local and global) could provide various properties for RS images, and then propose a heterogeneous feature framework to extract and integrate heterogeneous features with different types for RS scene classification. The proposed method is composed of three modules (1) heterogeneous features extraction, where three heterogeneous feature types, called DS-SURF-LLC, mean-Std-LLC, and MS-CLBP, are calculated, (2) heterogeneous features fusion, where the multiple kernel learning (MKL) is utilized to integrate the heterogeneous features, and (3) an MKL support vector machine classifier for RS scene classification. The proposed method is extensively evaluated on three challenging benchmark datasets (a 6-class dataset, a 12-class dataset, and a 21-class dataset), and the experimental results show that the proposed method leads to good classification performance. It produces good informative features to describe the RS image scenes. Moreover, the integration of heterogeneous features outperforms some state-of-the-art features on RS scene classification tasks.
Discrete Biogeography Based Optimization for Feature Selection in Molecular Signatures.

Science.gov (United States)

Liu, Bo; Tian, Meihong; Zhang, Chunhua; Li, Xiangtao

2015-04-01

Biomarker discovery from high-dimensional data is a complex task in the development of efficient cancer diagnoses and classification. However, these data are usually redundant and noisy, and only a subset of them present distinct profiles for different classes of samples. Thus, selecting high discriminative genes from gene expression data has become increasingly interesting in the field of bioinformatics. In this paper, a discrete biogeography based optimization is proposed to select the good subset of informative gene relevant to the classification. In the proposed algorithm, firstly, the fisher-markov selector is used to choose fixed number of gene data. Secondly, to make biogeography based optimization suitable for the feature selection problem; discrete migration model and discrete mutation model are proposed to balance the exploration and exploitation ability. Then, discrete biogeography based optimization, as we called DBBO, is proposed by integrating discrete migration model and discrete mutation model. Finally, the DBBO method is used for feature selection, and three classifiers are used as the classifier with the 10 fold cross-validation method. In order to show the effective and efficiency of the algorithm, the proposed algorithm is tested on four breast cancer dataset benchmarks. Comparison with genetic algorithm, particle swarm optimization, differential evolution algorithm and hybrid biogeography based optimization, experimental results demonstrate that the proposed method is better or at least comparable with previous method from literature when considering the quality of the solutions obtained. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Improving Music Genre Classification by Short-Time Feature Integration

DEFF Research Database (Denmark)

Meng, Anders; Ahrendt, Peter; Larsen, Jan

2005-01-01

Many different short-time features, using time windows in the size of 10-30 ms, have been proposed for music segmentation, retrieval and genre classification. However, often the available time frame of the music to make the actual decision or comparison (the decision time horizon) is in the range...... of seconds instead of milliseconds. The problem of making new features on the larger time scale from the short-time features (feature integration) has only received little attention. This paper investigates different methods for feature integration and late information fusion for music genre classification...
Feature Selection Methods for Zero-Shot Learning of Neural Activity

Directory of Open Access Journals (Sweden)

Carlos A. Caceres

2017-06-01

Full Text Available Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.
NetProt: Complex-based Feature Selection.

Science.gov (United States)

Goh, Wilson Wen Bin; Wong, Limsoon

2017-08-04

Protein complex-based feature selection (PCBFS) provides unparalleled reproducibility with high phenotypic relevance on proteomics data. Currently, there are five PCBFS paradigms, but not all representative methods have been implemented or made readily available. To allow general users to take advantage of these methods, we developed the R-package NetProt, which provides implementations of representative feature-selection methods. NetProt also provides methods for generating simulated differential data and generating pseudocomplexes for complex-based performance benchmarking. The NetProt open source R package is available for download from https://github.com/gohwils/NetProt/releases/ , and online documentation is available at http://rpubs.com/gohwils/204259 .
A redundancy-removing feature selection algorithm for nominal data

Directory of Open Access Journals (Sweden)

Zhihua Li

2015-10-01

Full Text Available No order correlation or similarity metric exists in nominal data, and there will always be more redundancy in a nominal dataset, which means that an efficient mutual information-based nominal-data feature selection method is relatively difficult to find. In this paper, a nominal-data feature selection method based on mutual information without data transformation, called the redundancy-removing more relevance less redundancy algorithm, is proposed. By forming several new information-related definitions and the corresponding computational methods, the proposed method can compute the information-related amount of nominal data directly. Furthermore, by creating a new evaluation function that considers both the relevance and the redundancy globally, the new feature selection method can evaluate the importance of each nominal-data feature. Although the presented feature selection method takes commonly used MIFS-like forms, it is capable of handling high-dimensional datasets without expensive computations. We perform extensive experimental comparisons of the proposed algorithm and other methods using three benchmarking nominal datasets with two different classifiers. The experimental results demonstrate the average advantage of the presented algorithm over the well-known NMIFS algorithm in terms of the feature selection and classification accuracy, which indicates that the proposed method has a promising performance.
Multi-task feature selection in microarray data by binary integer programming.

Science.gov (United States)

Lan, Liang; Vucetic, Slobodan

2013-12-20

A major challenge in microarray classification is that the number of features is typically orders of magnitude larger than the number of examples. In this paper, we propose a novel feature filter algorithm to select the feature subset with maximal discriminative power and minimal redundancy by solving a quadratic objective function with binary integer constraints. To improve the computational efficiency, the binary integer constraints are relaxed and a low-rank approximation to the quadratic term is applied. The proposed feature selection algorithm was extended to solve multi-task microarray classification problems. We compared the single-task version of the proposed feature selection algorithm with 9 existing feature selection methods on 4 benchmark microarray data sets. The empirical results show that the proposed method achieved the most accurate predictions overall. We also evaluated the multi-task version of the proposed algorithm on 8 multi-task microarray datasets. The multi-task feature selection algorithm resulted in significantly higher accuracy than when using the single-task feature selection methods.
Long-lasting modulation of feature integration by transcranial magnetic stimulation

NARCIS (Netherlands)

Scharnowski, Frank; Rueter, Johannes; Jolij, Jacob; Hermens, Frouke; Kammer, Thomas; Herzog, Michael H.

The human brain analyzes a visual object first by basic feature detectors. On the objects way to a conscious percept, these features are integrated in subsequent stages of the visual hierarchy. The time course of this feature integration is largely unknown. To shed light on the temporal dynamics of
Long-lasting modulation of feature integration by transcranial magnetic stimulation

NARCIS (Netherlands)

Scharnowski, Frank; Rueter, Johannes; Jolij, Jacob; Hermens, Frouke; Kammer, Thomas; Herzog, Michael H.

2009-01-01

The human brain analyzes a visual object first by basic feature detectors. On the objects way to a conscious percept, these features are integrated in subsequent stages of the visual hierarchy. The time course of this feature integration is largely unknown. To shed light on the temporal dynamics of
Tracing the breeding farm of domesticated pig using feature selection (

Directory of Open Access Journals (Sweden)

Taehyung Kwon

2017-11-01

Full Text Available Objective Increasing food safety demands in the animal product market have created a need for a system to trace the food distribution process, from the manufacturer to the retailer, and genetic traceability is an effective method to trace the origin of animal products. In this study, we successfully achieved the farm tracing of 6,018 multi-breed pigs, using single nucleotide polymorphism (SNP markers strictly selected through least absolute shrinkage and selection operator (LASSO feature selection. Methods We performed farm tracing of domesticated pig (Sus scrofa from SNP markers and selected the most relevant features for accurate prediction. Considering multi-breed composition of our data, we performed feature selection using LASSO penalization on 4,002 SNPs that are shared between breeds, which also includes 179 SNPs with small between-breed difference. The 100 highest-scored features were extracted from iterative simulations and then evaluated using machine-leaning based classifiers. Results We selected 1,341 SNPs from over 45,000 SNPs through iterative LASSO feature selection, to minimize between-breed differences. We subsequently selected 100 highest-scored SNPs from iterative scoring, and observed high statistical measures in classification of breeding farms by cross-validation only using these SNPs. Conclusion The study represents a successful application of LASSO feature selection on multi-breed pig SNP data to trace the farm information, which provides a valuable method and possibility for further researches on genetic traceability.
Vibration and acoustic frequency spectra for industrial process modeling using selective fusion multi-condition samples and multi-source features

Science.gov (United States)

Tang, Jian; Qiao, Junfei; Wu, ZhiWei; Chai, Tianyou; Zhang, Jian; Yu, Wen

2018-01-01

Frequency spectral data of mechanical vibration and acoustic signals relate to difficult-to-measure production quality and quantity parameters of complex industrial processes. A selective ensemble (SEN) algorithm can be used to build a soft sensor model of these process parameters by fusing valued information selectively from different perspectives. However, a combination of several optimized ensemble sub-models with SEN cannot guarantee the best prediction model. In this study, we use several techniques to construct mechanical vibration and acoustic frequency spectra of a data-driven industrial process parameter model based on selective fusion multi-condition samples and multi-source features. Multi-layer SEN (MLSEN) strategy is used to simulate the domain expert cognitive process. Genetic algorithm and kernel partial least squares are used to construct the inside-layer SEN sub-model based on each mechanical vibration and acoustic frequency spectral feature subset. Branch-and-bound and adaptive weighted fusion algorithms are integrated to select and combine outputs of the inside-layer SEN sub-models. Then, the outside-layer SEN is constructed. Thus, "sub-sampling training examples"-based and "manipulating input features"-based ensemble construction methods are integrated, thereby realizing the selective information fusion process based on multi-condition history samples and multi-source input features. This novel approach is applied to a laboratory-scale ball mill grinding process. A comparison with other methods indicates that the proposed MLSEN approach effectively models mechanical vibration and acoustic signals.
Oculomotor selection underlies feature retention in visual working memory.

Science.gov (United States)

Hanning, Nina M; Jonikaitis, Donatas; Deubel, Heiner; Szinte, Martin

2016-02-01

Oculomotor selection, spatial task relevance, and visual working memory (WM) are described as three processes highly intertwined and sustained by similar cortical structures. However, because task-relevant locations always constitute potential saccade targets, no study so far has been able to distinguish between oculomotor selection and spatial task relevance. We designed an experiment that allowed us to dissociate in humans the contribution of task relevance, oculomotor selection, and oculomotor execution to the retention of feature representations in WM. We report that task relevance and oculomotor selection lead to dissociable effects on feature WM maintenance. In a first task, in which an object's location was encoded as a saccade target, its feature representations were successfully maintained in WM, whereas they declined at nonsaccade target locations. Likewise, we observed a similar WM benefit at the target of saccades that were prepared but never executed. In a second task, when an object's location was marked as task relevant but constituted a nonsaccade target (a location to avoid), feature representations maintained at that location did not benefit. Combined, our results demonstrate that oculomotor selection is consistently associated with WM, whereas task relevance is not. This provides evidence for an overlapping circuitry serving saccade target selection and feature-based WM that can be dissociated from processes encoding task-relevant locations. Copyright © 2016 the American Physiological Society.
Temporal resolution for the perception of features and conjunctions.

Science.gov (United States)

Bodelón, Clara; Fallah, Mazyar; Reynolds, John H

2007-01-24

The visual system decomposes stimuli into their constituent features, represented by neurons with different feature selectivities. How the signals carried by these feature-selective neurons are integrated into coherent object representations is unknown. To constrain the set of possible integrative mechanisms, we quantified the temporal resolution of perception for color, orientation, and conjunctions of these two features. We find that temporal resolution is measurably higher for each feature than for their conjunction, indicating that time is required to integrate features into a perceptual whole. This finding places temporal limits on the mechanisms that could mediate this form of perceptual integration.
Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach

Directory of Open Access Journals (Sweden)

Daniel Peralta

2015-01-01

Full Text Available Nowadays, many disciplines have to deal with big datasets that additionally involve a high number of features. Feature selection methods aim at eliminating noisy, redundant, or irrelevant features that may deteriorate the classification performance. However, traditional methods lack enough scalability to cope with datasets of millions of instances and extract successful results in a delimited time. This paper presents a feature selection algorithm based on evolutionary computation that uses the MapReduce paradigm to obtain subsets of features from big datasets. The algorithm decomposes the original dataset in blocks of instances to learn from them in the map phase; then, the reduce phase merges the obtained partial results into a final vector of feature weights, which allows a flexible application of the feature selection procedure using a threshold to determine the selected subset of features. The feature selection method is evaluated by using three well-known classifiers (SVM, Logistic Regression, and Naive Bayes implemented within the Spark framework to address big data problems. In the experiments, datasets up to 67 millions of instances and up to 2000 attributes have been managed, showing that this is a suitable framework to perform evolutionary feature selection, improving both the classification accuracy and its runtime when dealing with big data problems.
Feature and Region Selection for Visual Learning.

Science.gov (United States)

Zhao, Ji; Wang, Liantao; Cabral, Ricardo; De la Torre, Fernando

2016-03-01

Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoWs) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ(2) and intersection kernel; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach.
Feature extraction for dynamic integration of classifiers

NARCIS (Netherlands)

Pechenizkiy, M.; Tsymbal, A.; Puuronen, S.; Patterson, D.W.

2007-01-01

Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. In this paper, we present an algorithm for the dynamic integration of classifiers in the space of extracted features (FEDIC). It is based on the technique
Feature selection and nearest centroid classification for protein mass spectrometry

Directory of Open Access Journals (Sweden)

Levner Ilya

2005-03-01

Full Text Available Abstract Background The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. Results This study examines the performance of the nearest centroid classifier coupled with the following feature selection algorithms. Student-t test, Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting based feature selection we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms, does not hold across all data sets. Conclusion This study tested a number of popular feature
Feature selection toolbox software package

Czech Academy of Sciences Publication Activity Database

Pudil, Pavel; Novovičová, Jana; Somol, Petr

2002-01-01

Roč. 23, č. 4 (2002), s. 487-492 ISSN 0167-8655 R&D Projects: GA ČR GA402/01/0981 Institutional research plan: CEZ:AV0Z1075907 Keywords : pattern recognition * feature selection * loating search algorithms Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.409, year: 2002

Meta-Analysis of DNA Tumor-Viral Integration Site Selection Indicates a Role for Repeats, Gene Expression and Epigenetics

Directory of Open Access Journals (Sweden)

Janet M. Doolittle-Hall

2015-11-01

Full Text Available Oncoviruses cause tremendous global cancer burden. For several DNA tumor viruses, human genome integration is consistently associated with cancer development. However, genomic features associated with tumor viral integration are poorly understood. We sought to define genomic determinants for 1897 loci prone to hosting human papillomavirus (HPV, hepatitis B virus (HBV or Merkel cell polyomavirus (MCPyV. These were compared to HIV, whose enzyme-mediated integration is well understood. A comprehensive catalog of integration sites was constructed from the literature and experimentally-determined HPV integration sites. Features were scored in eight categories (genes, expression, open chromatin, histone modifications, methylation, protein binding, chromatin segmentation and repeats and compared to random loci. Random forest models determined loci classification and feature selection. HPV and HBV integrants were not fragile site associated. MCPyV preferred integration near sensory perception genes. Unique signatures of integration-associated predictive genomic features were detected. Importantly, repeats, actively-transcribed regions and histone modifications were common tumor viral integration signatures.
Feature Selection with the Boruta Package

OpenAIRE

Kursa, Miron B.; Rudnicki, Witold R.

2010-01-01

This article describes a R package Boruta, implementing a novel feature selection algorithm for finding emph{all relevant variables}. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes the features which are proved by a statistical test to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. The short description of the algorithm and examples of its application are presented.
Pairwise Constraint-Guided Sparse Learning for Feature Selection.

Science.gov (United States)

Liu, Mingxia; Zhang, Daoqiang

2016-01-01

Feature selection aims to identify the most informative features for a compact and accurate data representation. As typical supervised feature selection methods, Lasso and its variants using L1-norm-based regularization terms have received much attention in recent studies, most of which use class labels as supervised information. Besides class labels, there are other types of supervised information, e.g., pairwise constraints that specify whether a pair of data samples belong to the same class (must-link constraint) or different classes (cannot-link constraint). However, most of existing L1-norm-based sparse learning methods do not take advantage of the pairwise constraints that provide us weak and more general supervised information. For addressing that problem, we propose a pairwise constraint-guided sparse (CGS) learning method for feature selection, where the must-link and the cannot-link constraints are used as discriminative regularization terms that directly concentrate on the local discriminative structure of data. Furthermore, we develop two variants of CGS, including: 1) semi-supervised CGS that utilizes labeled data, pairwise constraints, and unlabeled data and 2) ensemble CGS that uses the ensemble of pairwise constraint sets. We conduct a series of experiments on a number of data sets from University of California-Irvine machine learning repository, a gene expression data set, two real-world neuroimaging-based classification tasks, and two large-scale attribute classification tasks. Experimental results demonstrate the efficacy of our proposed methods, compared with several established feature selection methods.
Integral fast reactor concept inherent safety features

International Nuclear Information System (INIS)

Marchaterre, J.F.; Sevy, R.H.; Cahalan, J.E.

1987-01-01

The Integral Fast Reactor (IFR) is an innovative liquid-metal-cooled reactor concept being developed at Argonne National Laboratory. The two major goals of the IFT development effort are improved economics and enhanced safety. The design features that together fulfill these goals are: 1) a liquid metal (sodium) coolant, 2) a pool-type reactor primary system configuration, 3) an advanced ternary alloy metallic fuel, and 4) an integral fuel cycle. This paper reviews the design features that contribute to the safety margins inherent to the IFR concept. Special emphasis is placed on the ability of the IFR design to accommodate anticipated transients without scram (ATWS)
Integral Fast Reactor concept inherent safety features

International Nuclear Information System (INIS)

Marchaterre, J.F.; Sevy, R.H.; Cahalan, J.E.

1986-01-01

The Integral Fast Reactor (IFR) is an innovative liquid-metal-cooled reactor concept being developed at Argonne National Laboratory. The two major goals of the IFR development effort are improved economics and enhanced safety. The design features that together fulfill these goals are: (1) a liquid metal (sodium) coolant, (2) a pool-type reactor primary system configuration, (3) an advanced ternary alloy metallic fuel, and (4) an integral fuel cycle. This paper reviews the design features that contribute to the safety margins inherent to the IFR concept. Special emphasis is placed on the ability of the IFR design to accommodate anticipated transients without scram (ATWS)
Improving Naive Bayes with Online Feature Selection for Quick Adaptation to Evolving Feature Usefulness

Energy Technology Data Exchange (ETDEWEB)

Pon, R K; Cardenas, A F; Buttler, D J

2007-09-19

The definition of what makes an article interesting varies from user to user and continually evolves even for a single user. As a result, for news recommendation systems, useless document features can not be determined a priori and all features are usually considered for interestingness classification. Consequently, the presence of currently useless features degrades classification performance [1], particularly over the initial set of news articles being classified. The initial set of document is critical for a user when considering which particular news recommendation system to adopt. To address these problems, we introduce an improved version of the naive Bayes classifier with online feature selection. We use correlation to determine the utility of each feature and take advantage of the conditional independence assumption used by naive Bayes for online feature selection and classification. The augmented naive Bayes classifier performs 28% better than the traditional naive Bayes classifier in recommending news articles from the Yahoo! RSS feeds.
[Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].

Science.gov (United States)

Zhou, Jinzhi; Tang, Xiaofang

2015-08-01

In order to improve the accuracy of classification with small amount of motor imagery training data on the development of brain-computer interface (BCD systems, we proposed an analyzing method to automatically select the characteristic parameters based on correlation coefficient analysis. Throughout the five sample data of dataset IV a from 2005 BCI Competition, we utilized short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the number of primitive electroencephalogram dimension, then introduced feature extraction based on common spatial pattern (CSP) and classified by linear discriminant analysis (LDA). Simulation results showed that the average rate of classification accuracy could be improved by using correlation coefficient feature selection method than those without using this algorithm. Comparing with support vector machine (SVM) optimization features algorithm, the correlation coefficient analysis can lead better selection parameters to improve the accuracy of classification.
A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

KAUST Repository

Abusamra, Heba

2013-01-01

Different experiments have been applied to compare the performance of the classification methods with and without performing feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.
Feature Selection with the Boruta Package

Directory of Open Access Journals (Sweden)

Miron B. Kursa

2010-10-01

Full Text Available This article describes a R package Boruta, implementing a novel feature selection algorithm for finding emph{all relevant variables}. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes the features which are proved by a statistical test to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. The short description of the algorithm and examples of its application are presented.
Embedded Incremental Feature Selection for Reinforcement Learning

Science.gov (United States)

2012-05-01

Prior to this work, feature selection for reinforce- ment learning has focused on linear value function ap- proximation ( Kolter and Ng, 2009; Parr et al...InProceed- ings of the the 23rd International Conference on Ma- chine Learning, pages 449–456. Kolter , J. Z. and Ng, A. Y. (2009). Regularization and feature
Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading.

Science.gov (United States)

Sahran, Shahnorbanun; Albashish, Dheeb; Abdullah, Azizi; Shukor, Nordashima Abd; Hayati Md Pauzi, Suria

2018-04-18

Feature selection (FS) methods are widely used in grading and diagnosing prostate histopathological images. In this context, FS is based on the texture features obtained from the lumen, nuclei, cytoplasm and stroma, all of which are important tissue components. However, it is difficult to represent the high-dimensional textures of these tissue components. To solve this problem, we propose a new FS method that enables the selection of features with minimal redundancy in the tissue components. We categorise tissue images based on the texture of individual tissue components via the construction of a single classifier and also construct an ensemble learning model by merging the values obtained by each classifier. Another issue that arises is overfitting due to the high-dimensional texture of individual tissue components. We propose a new FS method, SVM-RFE(AC), that integrates a Support Vector Machine-Recursive Feature Elimination (SVM-RFE) embedded procedure with an absolute cosine (AC) filter method to prevent redundancy in the selected features of the SV-RFE and an unoptimised classifier in the AC. We conducted experiments on H&E histopathological prostate and colon cancer images with respect to three prostate classifications, namely benign vs. grade 3, benign vs. grade 4 and grade 3 vs. grade 4. The colon benchmark dataset requires a distinction between grades 1 and 2, which are the most difficult cases to distinguish in the colon domain. The results obtained by both the single and ensemble classification models (which uses the product rule as its merging method) confirm that the proposed SVM-RFE(AC) is superior to the other SVM and SVM-RFE-based methods. We developed an FS method based on SVM-RFE and AC and successfully showed that its use enabled the identification of the most crucial texture feature of each tissue component. Thus, it makes possible the distinction between multiple Gleason grades (e.g. grade 3 vs. grade 4) and its performance is far superior to
Effective Feature Selection for Classification of Promoter Sequences.

Directory of Open Access Journals (Sweden)

Kouser K

Full Text Available Exploring novel computational methods in making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for analysis of promoters, which are parts of DNA sequences that are involved in regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short-regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way of functional analysis of promoters and to discover the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing features. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine, KNN (K Nearest Neighbor and ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the methods proposed are as accurate as MRMR (feature selection method but much faster than MRMR. Such methods could be useful to categorize new promoters and explore regulatory mechanisms of gene expressions in complex eukaryotic species.
A Variance Minimization Criterion to Feature Selection Using Laplacian Regularization.

Science.gov (United States)

He, Xiaofei; Ji, Ming; Zhang, Chiyuan; Bao, Hujun

2011-10-01

In many information processing tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find the meaningful feature subset of the original features which can facilitate clustering, classification, and retrieval. In this paper, we consider the feature selection problem in unsupervised learning scenarios, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. Based on Laplacian regularized least squares, which finds a smooth function on the data manifold and minimizes the empirical loss, we propose two novel feature selection algorithms which aim to minimize the expected prediction error of the regularized regression model. Specifically, we select those features such that the size of the parameter covariance matrix of the regularized regression model is minimized. Motivated from experimental design, we use trace and determinant operators to measure the size of the covariance matrix. Efficient computational schemes are also introduced to solve the corresponding optimization problems. Extensive experimental results over various real-life data sets have demonstrated the superiority of the proposed algorithms.
DYNAMIC FEATURE SELECTION FOR WEB USER IDENTIFICATION ON LINGUISTIC AND STYLISTIC FEATURES OF ONLINE TEXTS

Directory of Open Access Journals (Sweden)

A. A. Vorobeva

2017-01-01

Full Text Available The paper deals with identification and authentication of web users participating in the Internet information processes (based on features of online texts.In digital forensics web user identification based on various linguistic features can be used to discover identity of individuals, criminals or terrorists using the Internet to commit cybercrimes. Internet could be used as a tool in different types of cybercrimes (fraud and identity theft, harassment and anonymous threats, terrorist or extremist statements, distribution of illegal content and information warfare. Linguistic identification of web users is a kind of biometric identification, it can be used to narrow down the suspects, identify a criminal and prosecute him. Feature set includes various linguistic and stylistic features extracted from online texts. We propose dynamic feature selection for each web user identification task. Selection is based on calculating Manhattan distance to k-nearest neighbors (Relief-f algorithm. This approach improves the identification accuracy and minimizes the number of features. Experiments were carried out on several datasets with different level of class imbalance. Experiment results showed that features relevance varies in different set of web users (probable authors of some text; features selection for each set of web users improves identification accuracy by 4% at the average that is approximately 1% higher than with the use of static set of features. The proposed approach is most effective for a small number of training samples (messages per user.
Efficient Generation and Selection of Combined Features for Improved Classification

KAUST Repository

Shono, Ahmad N.

2014-05-01

This study contributes a methodology and associated toolkit developed to allow users to experiment with the use of combined features in classification problems. Methods are provided for efficiently generating combined features from an original feature set, for efficiently selecting the most discriminating of these generated combined features, and for efficiently performing a preliminary comparison of the classification results when using the original features exclusively against the results when using the selected combined features. The potential benefit of considering combined features in classification problems is demonstrated by applying the developed methodology and toolkit to three sample data sets where the discovery of combined features containing new discriminating information led to improved classification results.
AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

Science.gov (United States)

Sun, Lei; Wang, Jun; Wei, Jinmao

2017-03-14

The Receiver Operator Characteristic (ROC) curve is well-known in evaluating classification performance in biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and find out disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find real target feature subset due to their lack of effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by a trick of measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can get the minimal balanced error rate with a small amount of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.
[Feature extraction for breast cancer data based on geometric algebra theory and feature selection using differential evolution].

Science.gov (United States)

Li, Jing; Hong, Wenxue

2014-12-01

The feature extraction and feature selection are the important issues in pattern recognition. Based on the geometric algebra representation of vector, a new feature extraction method using blade coefficient of geometric algebra was proposed in this study. At the same time, an improved differential evolution (DE) feature selection method was proposed to solve the elevated high dimension issue. The simple linear discriminant analysis was used as the classifier. The result of the 10-fold cross-validation (10 CV) classification of public breast cancer biomedical dataset was more than 96% and proved superior to that of the original features and traditional feature extraction method.
Cost-Sensitive Feature Selection of Numeric Data with Measurement Errors

Directory of Open Access Journals (Sweden)

Hong Zhao

2013-01-01

Full Text Available Feature selection is an essential process in data mining applications since it reduces a model’s complexity. However, feature selection with various types of costs is still a new research topic. In this paper, we study the cost-sensitive feature selection problem of numeric data with measurement errors. The major contributions of this paper are fourfold. First, a new data model is built to address test costs and misclassification costs as well as error boundaries. It is distinguished from the existing models mainly on the error boundaries. Second, a covering-based rough set model with normal distribution measurement errors is constructed. With this model, coverings are constructed from data rather than assigned by users. Third, a new cost-sensitive feature selection problem is defined on this model. It is more realistic than the existing feature selection problems. Fourth, both backtracking and heuristic algorithms are proposed to deal with the new problem. Experimental results show the efficiency of the pruning techniques for the backtracking algorithm and the effectiveness of the heuristic algorithm. This study is a step toward realistic applications of the cost-sensitive learning.
A Feature Subset Selection Method Based On High-Dimensional Mutual Information

Directory of Open Access Journals (Sweden)

Chee Keong Kwoh

2011-04-01

Full Text Available Feature selection is an important step in building accurate classifiers and provides better understanding of the data sets. In this paper, we propose a feature subset selection method based on high-dimensional mutual information. We also propose to use the entropy of the class attribute as a criterion to determine the appropriate subset of features when building classifiers. We prove that if the mutual information between a feature set X and the class attribute Y equals to the entropy of Y , then X is a Markov Blanket of Y . We show that in some cases, it is infeasible to approximate the high-dimensional mutual information with algebraic combinations of pairwise mutual information in any forms. In addition, the exhaustive searches of all combinations of features are prerequisite for finding the optimal feature subsets for classifying these kinds of data sets. We show that our approach outperforms existing filter feature subset selection methods for most of the 24 selected benchmark data sets.
Retroviral DNA integration: viral and cellular determinants of target-site selection.

Directory of Open Access Journals (Sweden)

Mary K Lewinski

2006-06-01

Full Text Available Retroviruses differ in their preferences for sites for viral DNA integration in the chromosomes of infected cells. Human immunodeficiency virus (HIV integrates preferentially within active transcription units, whereas murine leukemia virus (MLV integrates preferentially near transcription start sites and CpG islands. We investigated the viral determinants of integration-site selection using HIV chimeras with MLV genes substituted for their HIV counterparts. We found that transferring the MLV integrase (IN coding region into HIV (to make HIVmIN caused the hybrid to integrate with a specificity close to that of MLV. Addition of MLV gag (to make HIVmGagmIN further increased the similarity of target-site selection to that of MLV. A chimeric virus with MLV Gag only (HIVmGag displayed targeting preferences different from that of both HIV and MLV, further implicating Gag proteins in targeting as well as IN. We also report a genome-wide analysis indicating that MLV, but not HIV, favors integration near DNase I-hypersensitive sites (i.e., +/- 1 kb, and that HIVmIN and HIVmGagmIN also favored integration near these features. These findings reveal that IN is the principal viral determinant of integration specificity; they also reveal a new role for Gag-derived proteins, and strengthen models for integration targeting based on tethering of viral IN proteins to host proteins.

Feature Extraction and Selection Strategies for Automated Target Recognition

Science.gov (United States)

Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

2010-01-01

Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPLs Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aide in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine(SVM) and a neural net (NN) classifier.
Adaptive feature selection using v-shaped binary particle swarm optimization.

Science.gov (United States)

Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong

2017-01-01

Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.
HEART RATE VARIABILITY CLASSIFICATION USING SADE-ELM CLASSIFIER WITH BAT FEATURE SELECTION

Directory of Open Access Journals (Sweden)

R Kavitha

2017-07-01

Full Text Available The electrical activity of the human heart is measured by the vital bio medical signal called ECG. This electrocardiogram is employed as a crucial source to gather the diagnostic information of a patient’s cardiopathy. The monitoring function of cardiac disease is diagnosed by documenting and handling the electrocardiogram (ECG impulses. In the recent years many research has been done and developing an enhanced method to identify the risk in the patient’s body condition by processing and analysing the ECG signal. This analysis of the signal helps to find the cardiac abnormalities, arrhythmias, and many other heart problems. ECG signal is processed to detect the variability in heart rhythm; heart rate variability is calculated based on the time interval between heart beats. Heart Rate Variability HRV is measured by the variation in the beat to beat interval. The Heart rate Variability (HRV is an essential aspect to diagnose the properties of the heart. Recent development enhances the potential with the aid of non-linear metrics in reference point with feature selection. In this paper, the fundamental elements are taken from the ECG signal for feature selection process where Bat algorithm is employed for feature selection to predict the best feature and presented to the classifier for accurate classification. The popular machine learning algorithm ELM is taken for classification, integrated with evolutionary algorithm named Self- Adaptive Differential Evolution Extreme Learning Machine SADEELM to improve the reliability of classification. It combines Effective Fuzzy Kohonen clustering network (EFKCN to be able to increase the accuracy of the effect for HRV transmission classification. Hence, it is observed that the experiment carried out unveils that the precision is improved by the SADE-ELM method and concurrently optimizes the computation time.
Speech Emotion Feature Selection Method Based on Contribution Analysis Algorithm of Neural Network

International Nuclear Information System (INIS)

Wang Xiaojia; Mao Qirong; Zhan Yongzhao

2008-01-01

There are many emotion features. If all these features are employed to recognize emotions, redundant features may be existed. Furthermore, recognition result is unsatisfying and the cost of feature extraction is high. In this paper, a method to select speech emotion features based on contribution analysis algorithm of NN is presented. The emotion features are selected by using contribution analysis algorithm of NN from the 95 extracted features. Cluster analysis is applied to analyze the effectiveness for the features selected, and the time of feature extraction is evaluated. Finally, 24 emotion features selected are used to recognize six speech emotions. The experiments show that this method can improve the recognition rate and the time of feature extraction
An Appraisal Model Based on a Synthetic Feature Selection Approach for Students’ Academic Achievement

Directory of Open Access Journals (Sweden)

Ching-Hsue Cheng

2017-11-01

Full Text Available Obtaining necessary information (and even extracting hidden messages from existing big data, and then transforming them into knowledge, is an important skill. Data mining technology has received increased attention in various fields in recent years because it can be used to find historical patterns and employ machine learning to aid in decision-making. When we find unexpected rules or patterns from the data, they are likely to be of high value. This paper proposes a synthetic feature selection approach (SFSA, which is combined with a support vector machine (SVM to extract patterns and find the key features that influence students’ academic achievement. For verifying the proposed model, two databases, namely, “Student Profile” and “Tutorship Record”, were collected from an elementary school in Taiwan, and were concatenated into an integrated dataset based on students’ names as a research dataset. The results indicate the following: (1 the accuracy of the proposed feature selection approach is better than that of the Minimum-Redundancy-Maximum-Relevance (mRMR approach; (2 the proposed model is better than the listing methods when the six least influential features have been deleted; and (3 the proposed model can enhance the accuracy and facilitate the interpretation of the pattern from a hybrid-type dataset of students’ academic achievement.
Feature Selection using Multi-objective Genetic Algorith m: A Hybrid Approach

OpenAIRE

Ahuja, Jyoti; GJUST - Guru Jambheshwar University of Sciecne and Technology; Ratnoo, Saroj Dahiya; GJUST - Guru Jambheshwar University of Sciecne and Technology

2015-01-01

Feature selection is an important pre-processing task for building accurate and comprehensible classification models. Several researchers have applied filter, wrapper or hybrid approaches using genetic algorithms which are good candidates for optimization problems that involve large search spaces like in the case of feature selection. Moreover, feature selection is an inherently multi-objective problem with many competing objectives involving size, predictive power and redundancy of the featu...
NMDA receptor antagonist ketamine impairs feature integration in visual perception.

Science.gov (United States)

Meuwese, Julia D I; van Loon, Anouk M; Scholte, H Steven; Lirk, Philipp B; Vulink, Nienke C C; Hollmann, Markus W; Lamme, Victor A F

2013-01-01

Recurrent interactions between neurons in the visual cortex are crucial for the integration of image elements into coherent objects, such as in figure-ground segregation of textured images. Blocking N-methyl-D-aspartate (NMDA) receptors in monkeys can abolish neural signals related to figure-ground segregation and feature integration. However, it is unknown whether this also affects perceptual integration itself. Therefore, we tested whether ketamine, a non-competitive NMDA receptor antagonist, reduces feature integration in humans. We administered a subanesthetic dose of ketamine to healthy subjects who performed a texture discrimination task in a placebo-controlled double blind within-subject design. We found that ketamine significantly impaired performance on the texture discrimination task compared to the placebo condition, while performance on a control fixation task was much less impaired. This effect is not merely due to task difficulty or a difference in sedation levels. We are the first to show a behavioral effect on feature integration by manipulating the NMDA receptor in humans.
NMDA receptor antagonist ketamine impairs feature integration in visual perception.

Directory of Open Access Journals (Sweden)

Julia D I Meuwese

Full Text Available Recurrent interactions between neurons in the visual cortex are crucial for the integration of image elements into coherent objects, such as in figure-ground segregation of textured images. Blocking N-methyl-D-aspartate (NMDA receptors in monkeys can abolish neural signals related to figure-ground segregation and feature integration. However, it is unknown whether this also affects perceptual integration itself. Therefore, we tested whether ketamine, a non-competitive NMDA receptor antagonist, reduces feature integration in humans. We administered a subanesthetic dose of ketamine to healthy subjects who performed a texture discrimination task in a placebo-controlled double blind within-subject design. We found that ketamine significantly impaired performance on the texture discrimination task compared to the placebo condition, while performance on a control fixation task was much less impaired. This effect is not merely due to task difficulty or a difference in sedation levels. We are the first to show a behavioral effect on feature integration by manipulating the NMDA receptor in humans.
The neurocognitive basis of feature integration

NARCIS (Netherlands)

Keizer, André Willem

2010-01-01

One of the most striking features of the brain is that it is modular; it consists of often highly specialized areas. This modular organization requires efficient communication in order to integrate the information that is represented in distinct brain areas. In my thesis, I studied the neural basis
Feature selection is the ReliefF for multiple instance learning

NARCIS (Netherlands)

Zafra, A.; Pechenizkiy, M.; Ventura, S.

2010-01-01

Dimensionality reduction and feature selection in particular are known to be of a great help for making supervised learning more effective and efficient. Many different feature selection techniques have been proposed for the traditional settings, where each instance is expected to have a label. In
Multi-Stage Recognition of Speech Emotion Using Sequential Forward Feature Selection

Directory of Open Access Journals (Sweden)

Liogienė Tatjana

2016-07-01

Full Text Available The intensive research of speech emotion recognition introduced a huge collection of speech emotion features. Large feature sets complicate the speech emotion recognition task. Among various feature selection and transformation techniques for one-stage classification, multiple classifier systems were proposed. The main idea of multiple classifiers is to arrange the emotion classification process in stages. Besides parallel and serial cases, the hierarchical arrangement of multi-stage classification is most widely used for speech emotion recognition. In this paper, we present a sequential-forward-feature-selection-based multi-stage classification scheme. The Sequential Forward Selection (SFS and Sequential Floating Forward Selection (SFFS techniques were employed for every stage of the multi-stage classification scheme. Experimental testing of the proposed scheme was performed using the German and Lithuanian emotional speech datasets. Sequential-feature-selection-based multi-stage classification outperformed the single-stage scheme by 12–42 % for different emotion sets. The multi-stage scheme has shown higher robustness to the growth of emotion set. The decrease in recognition rate with the increase in emotion set for multi-stage scheme was lower by 10–20 % in comparison with the single-stage case. Differences in SFS and SFFS employment for feature selection were negligible.
Multi-Objective Particle Swarm Optimization Approach for Cost-Based Feature Selection in Classification.

Science.gov (United States)

Zhang, Yong; Gong, Dun-Wei; Cheng, Jian

2017-01-01

Feature selection is an important data-preprocessing technique in classification problems such as bioinformatics and signal processing. Generally, there are some situations where a user is interested in not only maximizing the classification performance but also minimizing the cost that may be associated with features. This kind of problem is called cost-based feature selection. However, most existing feature selection approaches treat this task as a single-objective optimization problem. This paper presents the first study of multi-objective particle swarm optimization (PSO) for cost-based feature selection problems. The task of this paper is to generate a Pareto front of nondominated solutions, that is, feature subsets, to meet different requirements of decision-makers in real-world applications. In order to enhance the search capability of the proposed algorithm, a probability-based encoding technology and an effective hybrid operator, together with the ideas of the crowding distance, the external archive, and the Pareto domination relationship, are applied to PSO. The proposed PSO-based multi-objective feature selection algorithm is compared with several multi-objective feature selection algorithms on five benchmark datasets. Experimental results show that the proposed algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.
Effect of feature-selective attention on neuronal responses in macaque area MT

Science.gov (United States)

Chen, X.; Hoffmann, K.-P.; Albright, T. D.

2012-01-01

Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color). PMID:22170961
Predictive Feature Selection for Genetic Policy Search

Science.gov (United States)

2014-05-22

limited manual intervention are becoming increasingly desirable as more complex tasks in dynamic and high- tempo environments are explored. Reinforcement...states in many domains causes features relevant to the reward variations to be overlooked, which hinders the policy search. 3.4 Parameter Selection PFS...the current feature subset. This local minimum may be “deceptive,” meaning that it does not clearly lead to the global optimal policy ( Goldberg and
Comparison of feature selection and classification for MALDI-MS data

Directory of Open Access Journals (Sweden)

Yang Mary

2009-07-01

Full Text Available Abstract Introduction In the classification of Mass Spectrometry (MS proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS data were recently compared; however, the issue of different feature selection methods and different classification models as they relate to classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare the methods of feature selection and different learning classifiers when applied to MALDI-MS data and to provide a subsequent reference for the analysis of MS proteomics data. Results We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE, and a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS that effectively performs microarray data analysis. We also compared several learning classifiers including K-Nearest Neighbor Classifier (KNNC, Naïve Bayes Classifier (NBC, Nearest Mean Scaled Classifier (NMSC, uncorrelated normal based quadratic Bayes Classifier recorded as UDC, Support Vector Machines, and a distance metric learning for Large Margin Nearest Neighbor classifier (LMNN based on Mahanalobis distance. To compare, we conducted a comprehensive experimental study using three types of MALDI-MS data. Conclusion Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing
Rough-fuzzy clustering and unsupervised feature selection for wavelet based MR image segmentation.

Directory of Open Access Journals (Sweden)

Pradipta Maji

Full Text Available Image segmentation is an indispensable process in the visualization of human tissues, particularly during clinical analysis of brain magnetic resonance (MR images. For many human experts, manual segmentation is a difficult and time consuming task, which makes an automated brain MR image segmentation method desirable. In this regard, this paper presents a new segmentation method for brain MR images, integrating judiciously the merits of rough-fuzzy computing and multiresolution image analysis technique. The proposed method assumes that the major brain tissues, namely, gray matter, white matter, and cerebrospinal fluid from the MR images are considered to have different textural properties. The dyadic wavelet analysis is used to extract the scale-space feature vector for each pixel, while the rough-fuzzy clustering is used to address the uncertainty problem of brain MR image segmentation. An unsupervised feature selection method is introduced, based on maximum relevance-maximum significance criterion, to select relevant and significant textural features for segmentation problem, while the mathematical morphology based skull stripping preprocessing step is proposed to remove the non-cerebral tissues like skull. The performance of the proposed method, along with a comparison with related approaches, is demonstrated on a set of synthetic and real brain MR images using standard validity indices.
An Accurate Integral Method for Vibration Signal Based on Feature Information Extraction

Directory of Open Access Journals (Sweden)

Yong Zhu

2015-01-01

Full Text Available After summarizing the advantages and disadvantages of current integral methods, a novel vibration signal integral method based on feature information extraction was proposed. This method took full advantage of the self-adaptive filter characteristic and waveform correction feature of ensemble empirical mode decomposition in dealing with nonlinear and nonstationary signals. This research merged the superiorities of kurtosis, mean square error, energy, and singular value decomposition on signal feature extraction. The values of the four indexes aforementioned were combined into a feature vector. Then, the connotative characteristic components in vibration signal were accurately extracted by Euclidean distance search, and the desired integral signals were precisely reconstructed. With this method, the interference problem of invalid signal such as trend item and noise which plague traditional methods is commendably solved. The great cumulative error from the traditional time-domain integral is effectively overcome. Moreover, the large low-frequency error from the traditional frequency-domain integral is successfully avoided. Comparing with the traditional integral methods, this method is outstanding at removing noise and retaining useful feature information and shows higher accuracy and superiority.
A selective overview of feature screening for ultrahigh-dimensional data.

Science.gov (United States)

JingYuan, Liu; Wei, Zhong; RunZe, L I

2015-10-01

High-dimensional data have frequently been collected in many scientific areas including genomewide association study, biomedical imaging, tomography, tumor classifications, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and motivation for the need of model-free feature screening procedures.
Feature-Selective Attention Adaptively Shifts Noise Correlations in Primary Auditory Cortex.

Science.gov (United States)

Downer, Joshua D; Rapone, Brittany; Verhein, Jessica; O'Connor, Kevin N; Sutter, Mitchell L

2017-05-24

Sensory environments often contain an overwhelming amount of information, with both relevant and irrelevant information competing for neural resources. Feature attention mediates this competition by selecting the sensory features needed to form a coherent percept. How attention affects the activity of populations of neurons to support this process is poorly understood because population coding is typically studied through simulations in which one sensory feature is encoded without competition. Therefore, to study the effects of feature attention on population-based neural coding, investigations must be extended to include stimuli with both relevant and irrelevant features. We measured noise correlations ( r noise ) within small neural populations in primary auditory cortex while rhesus macaques performed a novel feature-selective attention task. We found that the effect of feature-selective attention on r noise depended not only on the population tuning to the attended feature, but also on the tuning to the distractor feature. To attempt to explain how these observed effects might support enhanced perceptual performance, we propose an extension of a simple and influential model in which shifts in r noise can simultaneously enhance the representation of the attended feature while suppressing the distractor. These findings present a novel mechanism by which attention modulates neural populations to support sensory processing in cluttered environments. SIGNIFICANCE STATEMENT Although feature-selective attention constitutes one of the building blocks of listening in natural environments, its neural bases remain obscure. To address this, we developed a novel auditory feature-selective attention task and measured noise correlations ( r noise ) in rhesus macaque A1 during task performance. Unlike previous studies showing that the effect of attention on r noise depends on population tuning to the attended feature, we show that the effect of attention depends on the tuning
Integrated Phoneme Subspace Method for Speech Feature Extraction

Directory of Open Access Journals (Sweden)

Park Hyunsin

2009-01-01

Full Text Available Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequency filter bank domain. Transformations are based on principal component analysis (PCA, independent component analysis (ICA, and linear discriminant analysis (LDA. Furthermore, this paper introduces a new feature extraction technique that collects the correlation information among phoneme subspaces and reconstructs feature space for representing phonemic information efficiently. The proposed speech feature vector is generated by projecting an observed vector onto an integrated phoneme subspace (IPS based on PCA or ICA. The performance of the new feature was evaluated for isolated word speech recognition. The proposed method provided higher recognition accuracy than conventional methods in clean and reverberant environments.

Variable selection in near-infrared spectroscopy: Benchmarking of feature selection methods on biodiesel data

International Nuclear Information System (INIS)

Balabin, Roman M.; Smirnov, Sergey V.

2011-01-01

During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm -1 ) of a sample is typically measured by modern instruments at a few hundred of wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results of other spectroscopic
Max-AUC feature selection in computer-aided detection of polyps in CT colonography.

Science.gov (United States)

Xu, Jian-Wu; Suzuki, Kenji

2014-03-01

We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level.
Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

Science.gov (United States)

Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

2015-01-01

Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.
Feature selection in classification of eye movements using electrooculography for activity recognition.

Science.gov (United States)

Mala, S; Latha, K

2014-01-01

Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.
An ant colony optimization based feature selection for web page classification.

Science.gov (United States)

Saraç, Esra; Özel, Selma Ayşe

2014-01-01

The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.
When do letter features migrate? A boundary condition for feature-integration theory.

Science.gov (United States)

Butler, B E; Mewhort, D J; Browse, R A

1991-01-01

Feature-integration theory postulates that a lapse of attention will allow letter features to change position and to recombine as illusory conjunctions (Treisman & Paterson, 1984). To study such errors, we used a set of uppercase letters known to yield illusory conjunctions in each of three tasks. The first, a bar-probe task, showed whole-character mislocations but not errors based on feature migration and recombination. The second, a two-alternative forced-choice detection task, allowed subjects to focus on the presence or absence of subletter features and showed illusory conjunctions based on feature migration and recombination. The third was also a two-alternative forced-choice detection task, but we manipulated the subjects' knowledge of the shape of the stimuli: In the case-certain condition, the stimuli were always in uppercase, but in the case-uncertain condition, the stimuli could appear in either upper- or lowercase. Subjects in the case-certain condition produced illusory conjunctions based on feature recombination, whereas subjects in the case-uncertain condition did not. The results suggest that when subjects can view the stimuli as feature groups, letter features regroup as illusory conjunctions; when subjects encode the stimuli as letters, whole items may be mislocated, but subletter features are not. Thus, illusory conjunctions reflect the subject's processing strategy, rather than the architecture of the visual system.
Integrated Active and Passive Polymer Optical Components with nm to mm Features

DEFF Research Database (Denmark)

Christiansen, Mads Brøkner; Schøler, Mikkel; Kristensen, Anders

2007-01-01

We present wafer-scale fabrication of integrated active and passive polymer optics with nm to mm features. First order DFB lasers, defined in dye doped SU-8 resist are integrated with SU-8 waveguides.......We present wafer-scale fabrication of integrated active and passive polymer optics with nm to mm features. First order DFB lasers, defined in dye doped SU-8 resist are integrated with SU-8 waveguides....
Enhancement web proxy cache performance using Wrapper Feature Selection methods with NB and J48

Science.gov (United States)

Mahmoud Al-Qudah, Dua'a.; Funke Olanrewaju, Rashidah; Wong Azman, Amelia

2017-11-01

Web proxy cache technique reduces response time by storing a copy of pages between client and server sides. If requested pages are cached in the proxy, there is no need to access the server. Due to the limited size and excessive cost of cache compared to the other storages, cache replacement algorithm is used to determine evict page when the cache is full. On the other hand, the conventional algorithms for replacement such as Least Recently Use (LRU), First in First Out (FIFO), Least Frequently Use (LFU), Randomized Policy etc. may discard important pages just before use. Furthermore, using conventional algorithm cannot be well optimized since it requires some decision to intelligently evict a page before replacement. Hence, most researchers propose an integration among intelligent classifiers and replacement algorithm to improves replacement algorithms performance. This research proposes using automated wrapper feature selection methods to choose the best subset of features that are relevant and influence classifiers prediction accuracy. The result present that using wrapper feature selection methods namely: Best First (BFS), Incremental Wrapper subset selection(IWSS)embedded NB and particle swarm optimization(PSO)reduce number of features and have a good impact on reducing computation time. Using PSO enhance NB classifier accuracy by 1.1%, 0.43% and 0.22% over using NB with all features, using BFS and using IWSS embedded NB respectively. PSO rises J48 accuracy by 0.03%, 1.91 and 0.04% over using J48 classifier with all features, using IWSS-embedded NB and using BFS respectively. While using IWSS embedded NB fastest NB and J48 classifiers much more than BFS and PSO. However, it reduces computation time of NB by 0.1383 and reduce computation time of J48 by 2.998.
Integral fast reactor safety features

International Nuclear Information System (INIS)

Cahalan, J.E.; Kramer, J.M.; Marchaterre, J.F.; Mueller, C.J.; Pedersen, D.R.; Sevy, R.H.; Wade, D.C.; Wei, T.Y.C.

1988-01-01

The integral fast reactor (IFR) is an advanced liquid-metal-cooled reactor concept being developed at Argonne National Laboratory. The two major goals of the IFR development effort are improved economics and enhanced safety. In addition to liquid metal cooling, the principal design features that distinguish the IFR are: a pool-type primary system, and advanced ternary alloy metallic fuel, and an integral fuel cycle with on-site fuel reprocessing and fabrication. This paper focuses on the technical aspects of the improved safety margins available in the IFR concept. This increased level of safety is made possible by the liquid metal (sodium) coolant and pool-type primary system layout, which together facilitate passive decay heat removal, and a sodium-bonded metallic fuel pin design with thermal and neutronic properties that provide passive core responses which control and mitigate the consequences of reactor accidents
Hierarchical feature selection for erythema severity estimation

Science.gov (United States)

Wang, Li; Shi, Chenbo; Shu, Chang

2014-10-01

At present PASI system of scoring is used for evaluating erythema severity, which can help doctors to diagnose psoriasis [1-3]. The system relies on the subjective judge of doctors, where the accuracy and stability cannot be guaranteed [4]. This paper proposes a stable and precise algorithm for erythema severity estimation. Our contributions are twofold. On one hand, in order to extract the multi-scale redness of erythema, we design the hierarchical feature. Different from traditional methods, we not only utilize the color statistical features, but also divide the detect window into small window and extract hierarchical features. Further, a feature re-ranking step is introduced, which can guarantee that extracted features are irrelevant to each other. On the other hand, an adaptive boosting classifier is applied for further feature selection. During the step of training, the classifier will seek out the most valuable feature for evaluating erythema severity, due to its strong learning ability. Experimental results demonstrate the high precision and robustness of our algorithm. The accuracy is 80.1% on the dataset which comprise 116 patients' images with various kinds of erythema. Now our system has been applied for erythema medical efficacy evaluation in Union Hosp, China.
Degree of contribution (DoC) feature selection algorithm for structural brain MRI volumetric features in depression detection.

Science.gov (United States)

Kipli, Kuryati; Kouzani, Abbas Z

2015-07-01

Accurate detection of depression at an individual level using structural magnetic resonance imaging (sMRI) remains a challenge. Brain volumetric changes at a structural level appear to have importance in depression biomarkers studies. An automated algorithm is developed to select brain sMRI volumetric features for the detection of depression. A feature selection (FS) algorithm called degree of contribution (DoC) is developed for selection of sMRI volumetric features. This algorithm uses an ensemble approach to determine the degree of contribution in detection of major depressive disorder. The DoC is the score of feature importance used for feature ranking. The algorithm involves four stages: feature ranking, subset generation, subset evaluation, and DoC analysis. The performance of DoC is evaluated on the Duke University Multi-site Imaging Research in the Analysis of Depression sMRI dataset. The dataset consists of 115 brain sMRI scans of 88 healthy controls and 27 depressed subjects. Forty-four sMRI volumetric features are used in the evaluation. The DoC score of forty-four features was determined as the accuracy threshold (Acc_Thresh) was varied. The DoC performance was compared with that of four existing FS algorithms. At all defined Acc_Threshs, DoC outperformed the four examined FS algorithms for the average classification score and the maximum classification score. DoC has a good ability to generate reduced-size subsets of important features that could yield high classification accuracy. Based on the DoC score, the most discriminant volumetric features are those from the left-brain region.
A Hybrid Feature Selection Approach for Arabic Documents Classification

NARCIS (Netherlands)

Habib, Mena Badieh; Sarhan, Ahmed A. E.; Salem, Abdel-Badeeh M.; Fayed, Zaki T.; Gharib, Tarek F.

Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge number of features. Feature selection tries to
Simultaneous feature selection and classification via Minimax Probability Machine

Directory of Open Access Journals (Sweden)

Liming Yang

2010-12-01

Full Text Available This paper presents a novel method for simultaneous feature selection and classification by incorporating a robust L1-norm into the objective function of Minimax Probability Machine (MPM. A fractional programming framework is derived by using a bound on the misclassification error involving the mean and covariance of the data. Furthermore, the problems are solved by the Quadratic Interpolation method. Experiments show that our methods can select fewer features to improve the generalization compared to MPM, which illustrates the effectiveness of the proposed algorithms.
Selection of individual features of a speech signal using genetic algorithms

Directory of Open Access Journals (Sweden)

Kamil Kamiński

2016-03-01

Full Text Available The paper presents an automatic speaker’s recognition system, implemented in the Matlab environment, and demonstrates how to achieve and optimize various elements of the system. The main emphasis was put on features selection of a speech signal using a genetic algorithm which takes into account synergy of features. The results of optimization of selected elements of a classifier have been also shown, including the number of Gaussian distributions used to model each of the voices. In addition, for creating voice models, a universal voice model has been used.[b]Keywords[/b]: biometrics, automatic speaker recognition, genetic algorithms, feature selection
Feature Selection for Audio Surveillance in Urban Environment

Directory of Open Access Journals (Sweden)

KIKTOVA Eva

2014-05-01

Full Text Available This paper presents the work leading to the acoustic event detection system, which is designed to recognize two types of acoustic events (shot and breaking glass in urban environment. For this purpose, a huge front-end processing was performed for the effective parametric representation of an input sound. MFCC features and features computed during their extraction (MELSPEC and FBANK, then MPEG-7 audio descriptors and other temporal and spectral characteristics were extracted. High dimensional feature sets were created and in the next phase reduced by the mutual information based selection algorithms. Hidden Markov Model based classifier was applied and evaluated by the Viterbi decoding algorithm. Thus very effective feature sets were identified and also the less important features were found.
Feature Selection Using Adaboost for Face Expression Recognition

National Research Council Canada - National Science Library

Silapachote, Piyanuch; Karuppiah, Deepak R; Hanson, Allen R

2005-01-01

We propose a classification technique for face expression recognition using AdaBoost that learns by selecting the relevant global and local appearance features with the most discriminating information...
Integral fast reactor safety features

International Nuclear Information System (INIS)

Cahalan, J.E.; Kramer, J.M.; Marchaterre, J.F.; Mueller, C.J.; Pedersen, D.R.; Sevy, R.H.; Wade, D.C.; Wei, T.Y.C.

1988-01-01

The Integral Fast Reactor (IFR) is an advanced liquid-metal-cooled reactor concept being developed at Argonne National Laboratory. The two major goals of the IFR development effort are improved economics and enhanced safety. In addition to liquid metal cooling, the principal design features that distinguish the IFR are: (1) a pool-type primary system, (2) an advanced ternary alloy metallic fuel, and (3) an integral fuel cycle with on-site fuel reprocessing and fabrication. This paper focuses on the technical aspects of the improved safety margins available in the IFR concept. This increased level of safety is made possible by (1) the liquid metal (sodium) coolant and pool-type primary system layout, which together facilitate passive decay heat removal, and (2) a sodium-bonded metallic fuel pin design with thermal and neutronic properties that provide passive core responses which control and mitigate the consequences of reactor accidents
A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

Directory of Open Access Journals (Sweden)

Yong Wang

2016-02-01

Full Text Available Currently, with the rapid increasing of data scales in network traffic classifications, how to select traffic features efficiently is becoming a big challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, the execution time was still unsatisfactory with numeral iterative computations during the processing. To address this issue, an efficient feature selection method for network traffic based on a new parallel computing framework called Spark is proposed in this paper. In our approach, the complete feature set is firstly preprocessed based on Fisher score, and a sequential forward search strategy is employed for subsets. The optimal feature subset is then selected using the continuous iterations of the Spark computing framework. The implementation demonstrates that, on the precondition of keeping the classification accuracy, our method reduces the time cost of modeling and classification, and improves the execution efficiency of feature selection significantly.
Feature selection model based on clustering and ranking in pipeline for microarray data

Directory of Open Access Journals (Sweden)

Barnali Sahu

2017-01-01

Full Text Available Most of the available feature selection techniques in the literature are classifier bound. It means a group of features tied to the performance of a specific classifier as applied in wrapper and hybrid approach. Our objective in this study is to select a set of generic features not tied to any classifier based on the proposed framework. This framework uses attribute clustering and feature ranking techniques in pipeline in order to remove redundant features. On each uncovered cluster, signal-to-noise ratio, t-statistics and significance analysis of microarray are independently applied to select the top ranked features. Both filter and evolutionary wrapper approaches have been considered for feature selection and the data set with selected features are given to ensemble of predefined statistically different classifiers. The class labels of the test data are determined using majority voting technique. Moreover, with the aforesaid objectives, this paper focuses on obtaining a stable result out of various classification models. Further, a comparative analysis has been performed to study the classification accuracy and computational time of the current approach and evolutionary wrapper techniques. It gives a better insight into the features and further enhancing the classification accuracy with less computational time.
Game Theoretic Approach for Systematic Feature Selection; Application in False Alarm Detection in Intensive Care Units

Directory of Open Access Journals (Sweden)

Fatemeh Afghah

2018-03-01

Full Text Available Intensive Care Units (ICUs are equipped with many sophisticated sensors and monitoring devices to provide the highest quality of care for critically ill patients. However, these devices might generate false alarms that reduce standard of care and result in desensitization of caregivers to alarms. Therefore, reducing the number of false alarms is of great importance. Many approaches such as signal processing and machine learning, and designing more accurate sensors have been developed for this purpose. However, the significant intrinsic correlation among the extracted features from different sensors has been mostly overlooked. A majority of current data mining techniques fail to capture such correlation among the collected signals from different sensors that limits their alarm recognition capabilities. Here, we propose a novel information-theoretic predictive modeling technique based on the idea of coalition game theory to enhance the accuracy of false alarm detection in ICUs by accounting for the synergistic power of signal attributes in the feature selection stage. This approach brings together techniques from information theory and game theory to account for inter-features mutual information in determining the most correlated predictors with respect to false alarm by calculating Banzhaf power of each feature. The numerical results show that the proposed method can enhance classification accuracy and improve the area under the ROC (receiver operating characteristic curve compared to other feature selection techniques, when integrated in classifiers such as Bayes-Net that consider inter-features dependencies.

Enhancing the Performance of LibSVM Classifier by Kernel F-Score Feature Selection

Science.gov (United States)

Sarojini, Balakrishnan; Ramaraj, Narayanasamy; Nickolas, Savarimuthu

Medical Data mining is the search for relationships and patterns within the medical datasets that could provide useful knowledge for effective clinical decisions. The inclusion of irrelevant, redundant and noisy features in the process model results in poor predictive accuracy. Much research work in data mining has gone into improving the predictive accuracy of the classifiers by applying the techniques of feature selection. Feature selection in medical data mining is appreciable as the diagnosis of the disease could be done in this patient-care activity with minimum number of significant features. The objective of this work is to show that selecting the more significant features would improve the performance of the classifier. We empirically evaluate the classification effectiveness of LibSVM classifier on the reduced feature subset of diabetes dataset. The evaluations suggest that the feature subset selected improves the predictive accuracy of the classifier and reduce false negatives and false positives.
New Hybrid Features Selection Method: A Case Study on Websites Phishing

Directory of Open Access Journals (Sweden)

Khairan D. Rajab

2017-01-01

Full Text Available Phishing is one of the serious web threats that involves mimicking authenticated websites to deceive users in order to obtain their financial information. Phishing has caused financial damage to the different online stakeholders. It is massive in the magnitude of hundreds of millions; hence it is essential to minimize this risk. Classifying websites into “phishy” and legitimate types is a primary task in data mining that security experts and decision makers are hoping to improve particularly with respect to the detection rate and reliability of the results. One way to ensure the reliability of the results and to enhance performance is to identify a set of related features early on so the data dimensionality reduces and irrelevant features are discarded. To increase reliability of preprocessing, this article proposes a new feature selection method that combines the scores of multiple known methods to minimize discrepancies in feature selection results. The proposed method has been applied to the problem of website phishing classification to show its pros and cons in identifying relevant features. Results against a security dataset reveal that the proposed preprocessing method was able to derive new features datasets which when mined generate high competitive classifiers with reference to detection rate when compared to results obtained from other features selection methods.
A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

KAUST Repository

Abusamra, Heba

2013-05-01

Microarray technology has enriched the study of gene expression in such a way that scientists are now able to measure the expression levels of thousands of genes in a single experiment. Microarray gene expression data gained great importance in recent years due to its role in disease diagnoses and prognoses which help to choose the appropriate treatment plan for patients. This technology has shifted a new era in molecular classification, interpreting gene expression data remains a difficult problem and an active research area due to their native nature of “high dimensional low sample size”. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed in this case to aid to correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improve treatment strategies. This thesis aims on a comparative study of state-of-the-art feature selection methods, classification methods, and the combination of them, based on gene expression data. We compared the efficiency of three different classification methods including: support vector machines, k- nearest neighbor and random forest, and eight different feature selection methods, including: information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t- statistics, and one-dimension support vector machine. Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used for this study. Different experiments have been applied to compare the performance of the classification methods with and without performing feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in
Demonstration of IGCC features - plant integration and syngas combustion

Energy Technology Data Exchange (ETDEWEB)

Hannemann, F.; Huth, M.; Karg, J.; Schiffers, U. [Siemens AG Power Generation (KWU), Erlanger/Muelheim (Germany)

2000-07-01

Siemens is involved in three IGCC plants in Europe that are currently in operation. Against the background of the Puertollano and Buggenum plants, some of the specific new features of fully integrated IGCC power plants are discussed, including: requirements and design features of the gas turbine syngas supply system; gas turbine operating experience with air extraction for the air separation unit from the gas turbine air compressor; and design requirements and operational features of the combustion system. 7 refs., 17 figs., 1 tab.
Object-based selection from spatially-invariant representations: evidence from a feature-report task.

Science.gov (United States)

Matsukura, Michi; Vecera, Shaun P

2011-02-01

Attention selects objects as well as locations. When attention selects an object's features, observers identify two features from a single object more accurately than two features from two different objects (object-based effect of attention; e.g., Duncan, Journal of Experimental Psychology: General, 113, 501-517, 1984). Several studies have demonstrated that object-based attention can operate at a late visual processing stage that is independent of objects' spatial information (Awh, Dhaliwal, Christensen, & Matsukura, Psychological Science, 12, 329-334, 2001; Matsukura & Vecera, Psychonomic Bulletin & Review, 16, 529-536, 2009; Vecera, Journal of Experimental Psychology: General, 126, 14-18, 1997; Vecera & Farah, Journal of Experimental Psychology: General, 123, 146-160, 1994). In the present study, we asked two questions regarding this late object-based selection mechanism. In Part I, we investigated how observers' foreknowledge of to-be-reported features allows attention to select objects, as opposed to individual features. Using a feature-report task, a significant object-based effect was observed when to-be-reported features were known in advance but not when this advance knowledge was absent. In Part II, we examined what drives attention to select objects rather than individual features in the absence of observers' foreknowledge of to-be-reported features. Results suggested that, when there was no opportunity for observers to direct their attention to objects that possess to-be-reported features at the time of stimulus presentation, these stimuli must retain strong perceptual cues to establish themselves as separate objects.
Heuristic algorithms for feature selection under Bayesian models with block-diagonal covariance structure.

Science.gov (United States)

Foroughi Pour, Ali; Dalton, Lori A

2018-03-21

Many bioinformatics studies aim to identify markers, or features, that can be used to discriminate between distinct groups. In problems where strong individual markers are not available, or where interactions between gene products are of primary interest, it may be necessary to consider combinations of features as a marker family. To this end, recent work proposes a hierarchical Bayesian framework for feature selection that places a prior on the set of features we wish to select and on the label-conditioned feature distribution. While an analytical posterior under Gaussian models with block covariance structures is available, the optimal feature selection algorithm for this model remains intractable since it requires evaluating the posterior over the space of all possible covariance block structures and feature-block assignments. To address this computational barrier, in prior work we proposed a simple suboptimal algorithm, 2MNC-Robust, with robust performance across the space of block structures. Here, we present three new heuristic feature selection algorithms. The proposed algorithms outperform 2MNC-Robust and many other popular feature selection algorithms on synthetic data. In addition, enrichment analysis on real breast cancer, colon cancer, and Leukemia data indicates they also output many of the genes and pathways linked to the cancers under study. Bayesian feature selection is a promising framework for small-sample high-dimensional data, in particular biomarker discovery applications. When applied to cancer data these algorithms outputted many genes already shown to be involved in cancer as well as potentially new biomarkers. Furthermore, one of the proposed algorithms, SPM, outputs blocks of heavily correlated genes, particularly useful for studying gene interactions and gene networks.
Character feature integration of Chinese calligraphy and font

Science.gov (United States)

Shi, Cao; Xiao, Jianguo; Jia, Wenhua; Xu, Canhui

2013-01-01

A framework is proposed in this paper to effectively generate a new hybrid character type by means of integrating local contour feature of Chinese calligraphy with structural feature of font in computer system. To explore traditional art manifestation of calligraphy, multi-directional spatial filter is applied for local contour feature extraction. Then the contour of character image is divided into sub-images. The sub-images in the identical position from various characters are estimated by Gaussian distribution. According to its probability distribution, the dilation operator and erosion operator are designed to adjust the boundary of font image. And then new Chinese character images are generated which possess both contour feature of artistical calligraphy and elaborate structural feature of font. Experimental results demonstrate the new characters are visually acceptable, and the proposed framework is an effective and efficient strategy to automatically generate the new hybrid character of calligraphy and font.
A DYNAMIC FEATURE SELECTION METHOD FOR DOCUMENT RANKING WITH RELEVANCE FEEDBACK APPROACH

Directory of Open Access Journals (Sweden)

K. Latha

2010-07-01

Full Text Available Ranking search results is essential for information retrieval and Web search. Search engines need to not only return highly relevant results, but also be fast to satisfy users. As a result, not all available features can be used for ranking, and in fact only a small percentage of these features can be used. Thus, it is crucial to have a feature selection mechanism that can find a subset of features that both meets latency requirements and achieves high relevance. In this paper we describe a 0/1 knapsack procedure for automatically selecting features to use within Generalization model for Document Ranking. We propose an approach for Relevance Feedback using Expectation Maximization method and evaluate the algorithm on the TREC Collection for describing classes of feedback textual information retrieval features. Experimental results, evaluated on standard TREC-9 part of the OHSUMED collections, show that our feature selection algorithm produces models that are either significantly more effective than, or equally effective as, models such as Markov Random Field model, Correlation Co-efficient and Count Difference method
An Efficient Cost-Sensitive Feature Selection Using Chaos Genetic Algorithm for Class Imbalance Problem

Directory of Open Access Journals (Sweden)

Jing Bian

2016-01-01

Full Text Available In the era of big data, feature selection is an essential process in machine learning. Although the class imbalance problem has recently attracted a great deal of attention, little effort has been undertaken to develop feature selection techniques. In addition, most applications involving feature selection focus on classification accuracy but not cost, although costs are important. To cope with imbalance problems, we developed a cost-sensitive feature selection algorithm that adds the cost-based evaluation function of a filter feature selection using a chaos genetic algorithm, referred to as CSFSG. The evaluation function considers both feature-acquiring costs (test costs and misclassification costs in the field of network security, thereby weakening the influence of many instances from the majority of classes in large-scale datasets. The CSFSG algorithm reduces the total cost of feature selection and trades off both factors. The behavior of the CSFSG algorithm is tested on a large-scale dataset of network security, using two kinds of classifiers: C4.5 and k-nearest neighbor (KNN. The results of the experimental research show that the approach is efficient and able to effectively improve classification accuracy and to decrease classification time. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.
Tensor-based Multi-view Feature Selection with Applications to Brain Diseases

Science.gov (United States)

Cao, Bokai; He, Lifang; Kong, Xiangnan; Yu, Philip S.; Hao, Zhifeng; Ragin, Ann B.

2015-01-01

In the era of big data, we can easily access information from multiple views which may be obtained from different sources or feature subsets. Generally, different views provide complementary information for learning tasks. Thus, multi-view learning can facilitate the learning process and is prevalent in a wide range of application domains. For example, in medical science, measurements from a series of medical examinations are documented for each subject, including clinical, imaging, immunologic, serologic and cognitive measures which are obtained from multiple sources. Specifically, for brain diagnosis, we can have different quantitative analysis which can be seen as different feature subsets of a subject. It is desirable to combine all these features in an effective way for disease diagnosis. However, some measurements from less relevant medical examinations can introduce irrelevant information which can even be exaggerated after view combinations. Feature selection should therefore be incorporated in the process of multi-view learning. In this paper, we explore tensor product to bring different views together in a joint space, and present a dual method of tensor-based multi-view feature selection (dual-Tmfs) based on the idea of support vector machine recursive feature elimination. Experiments conducted on datasets derived from neurological disorder demonstrate the features selected by our proposed method yield better classification performance and are relevant to disease diagnosis. PMID:25937823
Integrated circuit microsensor for selectively detection nitrogen dioxide and diisopropyl methylphosphonate

Energy Technology Data Exchange (ETDEWEB)

Kolesar, E.S. Jr. (Air Force Inst. of Technology, Dept. of Electrical and Computer Engineering, Wright-Patterson Air Force Base, Dayton, OH (United States)); Brothers, C.P. Jr. (Air Force Inst. of Technology, Dept. of Electrical and Computer Engineering, Wright-Patterson Air Force Base, Dayton, OH (United States)); Howe, C.P. (Air Force Inst. of Technology, Dept. of Electrical and Computer Engineering, Wright-Patterson Air Force Base, Dayton, OH (United States)); Jenkins, T.J. (Air Force Inst. of Technology, Dept. of Electrical and Computer Engineering, Wright-Patterson Air Force Base, Dayton, OH (United States)); Moosey, A.T. (Air Force Inst. of Technology, Dept. of Electrical and Computer Engineering, Wright-Patterson Air Force Base, Dayton, OH (United States)); Shin, J.E. (Air Force Inst. of Technology, Dept. of Electrical and Computer Engineering, Wright-Patterson Air Force Base, Dayton, OH (United States)); Wiseman, J.M. (Air Force Inst. of Technology, Dept. of Electrical an

1992-11-20

A novel gas-sensitive microsensor, known as the interdigitated gate electrode field effect transistor (IGEFET), was realized by selectively depositing a chemically-active, electron-beam evaporated copper phthalocyanine (CuPc) thin film onto an integrated circuit (IC). When isothermally operated at 150 C, the microsensor can selectively and reversibly detect parts-per-billion concentration levels of nitrogen dioxide (NO[sub 2]) and diisopropyl methylphosphonate (DIMP). The selectivity feature of the microsensor was established by operating it with a 5 V peak amplitude, 2 [mu]s duration, 1000 Hz repetition frequency pulse, and then analyzing its time- and frequency-domain responses. The envelopes associated with the normalized-difference Fourier transform magnitude spectra, and their derivatives, reveal features which unambiguously distinguish the NO[sub 2] and DIMP challenge gas responses. Furthermore, the area beneath each response envelope may correspondingly be interpreted as the microsensor's sensitivity for a specific challenge gas concentration. Scanning electron microscopy (SEM) was used to characterize the morphology of the CuPc thin film. Additionally, infrared (IR) spectroscopy was employed to verify the [alpha]- and [beta]-phases of the sublimed CuPc thin films and to study the NO[sub 2]- and DIMP-CuPc interactions. (orig.)
CLASS-PAIR-GUIDED MULTIPLE KERNEL LEARNING OF INTEGRATING HETEROGENEOUS FEATURES FOR CLASSIFICATION

Directory of Open Access Journals (Sweden)

Q. Wang

2017-10-01

Full Text Available In recent years, many studies on remote sensing image classification have shown that using multiple features from different data sources can effectively improve the classification accuracy. As a very powerful means of learning, multiple kernel learning (MKL can conveniently be embedded in a variety of characteristics. The conventional combined kernel learned by MKL can be regarded as the compromise of all basic kernels for all classes in classification. It is the best of the whole, but not optimal for each specific class. For this problem, this paper proposes a class-pair-guided MKL method to integrate the heterogeneous features (HFs from multispectral image (MSI and light detection and ranging (LiDAR data. In particular, the one-against-one strategy is adopted, which converts multiclass classification problem to a plurality of two-class classification problem. Then, we select the best kernel from pre-constructed basic kernels set for each class-pair by kernel alignment (KA in the process of classification. The advantage of the proposed method is that only the best kernel for the classification of any two classes can be retained, which leads to greatly enhanced discriminability. Experiments are conducted on two real data sets, and the experimental results show that the proposed method achieves the best performance in terms of classification accuracies in integrating the HFs for classification when compared with several state-of-the-art algorithms.
A New Feature Selection Algorithm Based on the Mean Impact Variance

Directory of Open Access Journals (Sweden)

Weidong Cheng

2014-01-01

Full Text Available The selection of fewer or more representative features from multidimensional features is important when the artificial neural network (ANN algorithm is used as a classifier. In this paper, a new feature selection method called the mean impact variance (MIVAR method is proposed to determine the feature that is more suitable for classification. Moreover, this method is constructed on the basis of the training process of the ANN algorithm. To verify the effectiveness of the proposed method, the MIVAR value is used to rank the multidimensional features of the bearing fault diagnosis. In detail, (1 70-dimensional all waveform features are extracted from a rolling bearing vibration signal with four different operating states, (2 the corresponding MIVAR values of all 70-dimensional features are calculated to rank all features, (3 14 groups of 10-dimensional features are separately generated according to the ranking results and the principal component analysis (PCA algorithm and a back propagation (BP network is constructed, and (4 the validity of the ranking result is proven by training this BP network with these seven groups of 10-dimensional features and by comparing the corresponding recognition rates. The results prove that the features with larger MIVAR value can lead to higher recognition rates.
Feature selection from a facial image for distinction of sasang constitution.

Science.gov (United States)

Koo, Imhoi; Kim, Jong Yeol; Kim, Myoung Geun; Kim, Keun Ho

2009-09-01

Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is the standardization by adopting western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of the distance, angle and the distance ratios, for which there are 1225, 61 250 and 749 700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine truly meaningful features. We suggest a process for the efficient analysis of facial features including the removal of outliers, control for missing data to guarantee data confidence and calculation of statistical significance by applying ANOVA. We show the statistical properties of selected features according to different constitutions using the nine distances, 10 angles and 10 rates of distance features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown here.
Feature extraction and sensor selection for NPP initiating event identification

International Nuclear Information System (INIS)

Lin, Ting-Han; Wu, Shun-Chi; Chen, Kuang-You; Chou, Hwai-Pwu

2017-01-01

Highlights: • A two-stage feature extraction scheme for NPP initiating event identification. • With stBP, interrelations among the sensors can be retained for identification. • With dSFS, sensors that are crucial for identification can be efficiently selected. • Efficacy of the scheme is illustrated with data from the Maanshan NPP simulator. - Abstract: Initiating event identification is essential in managing nuclear power plant (NPP) severe accidents. In this paper, a novel two-stage feature extraction scheme that incorporates the proposed sensor type-wise block projection (stBP) and deflatable sequential forward selection (dSFS) is used to elicit the discriminant information in the data obtained from various NPP sensors to facilitate event identification. With the stBP, the primal features can be extracted without eliminating the interrelations among the sensors of the same type. The extracted features are then subjected to a further dimensionality reduction by selecting the sensors that are most relevant to the events under consideration. This selection is not easy, and a combinatorial optimization technique is normally required. With the dSFS, an optimal sensor set can be found with less computational load. Moreover, its sensor deflation stage allows sensors in the preselected set to be iteratively refined to avoid being trapped into a local optimum. Results from detailed experiments containing data of 12 event categories and a total of 112 events generated with a Taiwan’s Maanshan NPP simulator are presented to illustrate the efficacy of the proposed scheme.
UNLABELED SELECTED SAMPLES IN FEATURE EXTRACTION FOR CLASSIFICATION OF HYPERSPECTRAL IMAGES WITH LIMITED TRAINING SAMPLES

Directory of Open Access Journals (Sweden)

A. Kianisarkaleh

2015-12-01

Full Text Available Feature extraction plays a key role in hyperspectral images classification. Using unlabeled samples, often unlimitedly available, unsupervised and semisupervised feature extraction methods show better performance when limited number of training samples exists. This paper illustrates the importance of selecting appropriate unlabeled samples that used in feature extraction methods. Also proposes a new method for unlabeled samples selection using spectral and spatial information. The proposed method has four parts including: PCA, prior classification, posterior classification and sample selection. As hyperspectral image passes these parts, selected unlabeled samples can be used in arbitrary feature extraction methods. The effectiveness of the proposed unlabeled selected samples in unsupervised and semisupervised feature extraction is demonstrated using two real hyperspectral datasets. Results show that through selecting appropriate unlabeled samples, the proposed method can improve the performance of feature extraction methods and increase classification accuracy.
Mining potential biomarkers associated with space flight in Caenorhabditis elegans experienced Shenzhou-8 mission with multiple feature selection techniques

International Nuclear Information System (INIS)

Zhao, Lei; Gao, Ying; Mi, Dong; Sun, Yeqing

2016-01-01

Highlights: • A combined algorithm is proposed to mine biomarkers of spaceflight in C. elegans. • This algorithm makes the feature selection more reliable and robust. • Apply this algorithm to predict 17 positive biomarkers to space environment stress. • The strategy can be used as a general method to select important features. - Abstract: To identify the potential biomarkers associated with space flight, a combined algorithm, which integrates the feature selection techniques, was used to deal with the microarray datasets of Caenorhabditis elegans obtained in the Shenzhou-8 mission. Compared with the ground control treatment, a total of 86 differentially expressed (DE) genes in responses to space synthetic environment or space radiation environment were identified by two filter methods. And then the top 30 ranking genes were selected by the random forest algorithm. Gene Ontology annotation and functional enrichment analyses showed that these genes were mainly associated with metabolism process. Furthermore, clustering analysis showed that 17 genes among these are positive, including 9 for space synthetic environment and 8 for space radiation environment only. These genes could be used as the biomarkers to reflect the space environment stresses. In addition, we also found that microgravity is the main stress factor to change the expression patterns of biomarkers for the short-duration spaceflight.
Mining potential biomarkers associated with space flight in Caenorhabditis elegans experienced Shenzhou-8 mission with multiple feature selection techniques

Energy Technology Data Exchange (ETDEWEB)

Zhao, Lei [Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026 (China); Gao, Ying [Center of Medical Physics and Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Shushanhu Road 350, Hefei 230031 (China); Mi, Dong, E-mail: mid@dlmu.edu.cn [Department of Physics, Dalian Maritime University, Dalian 116026 (China); Sun, Yeqing, E-mail: yqsun@dlmu.edu.cn [Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026 (China)

2016-09-15

Highlights: • A combined algorithm is proposed to mine biomarkers of spaceflight in C. elegans. • This algorithm makes the feature selection more reliable and robust. • Apply this algorithm to predict 17 positive biomarkers to space environment stress. • The strategy can be used as a general method to select important features. - Abstract: To identify the potential biomarkers associated with space flight, a combined algorithm, which integrates the feature selection techniques, was used to deal with the microarray datasets of Caenorhabditis elegans obtained in the Shenzhou-8 mission. Compared with the ground control treatment, a total of 86 differentially expressed (DE) genes in responses to space synthetic environment or space radiation environment were identified by two filter methods. And then the top 30 ranking genes were selected by the random forest algorithm. Gene Ontology annotation and functional enrichment analyses showed that these genes were mainly associated with metabolism process. Furthermore, clustering analysis showed that 17 genes among these are positive, including 9 for space synthetic environment and 8 for space radiation environment only. These genes could be used as the biomarkers to reflect the space environment stresses. In addition, we also found that microgravity is the main stress factor to change the expression patterns of biomarkers for the short-duration spaceflight.
On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

Directory of Open Access Journals (Sweden)

Asriyanti Indah Pratiwi

2018-01-01

Full Text Available Sentiment analysis in a movie review is the needs of today lifestyle. Unfortunately, enormous features make the sentiment of analysis slow and less sensitive. Finding the optimum feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification are proposed. The proposed method reduces more than 90% unnecessary features while the proposed classification scheme achieves 96% accuracy of sentiment classification. From the experimental results, it can be concluded that the combination of proposed feature selection and classification achieves the best performance so far.
Reduced multimodal integration of memory features following continuous theta burst stimulation of angular gyrus.

Science.gov (United States)

Yazar, Yasemin; Bergström, Zara M; Simons, Jon S

Lesions of the angular gyrus (AnG) region of human parietal cortex do not cause amnesia, but appear to be associated with reduction in the ability to consciously experience the reliving of previous events. We used continuous theta burst stimulation to test the hypothesis that the cognitive mechanism implicated in this memory deficit might be the integration of retrieved sensory event features into a coherent multimodal memory representation. Healthy volunteers received stimulation to AnG or a vertex control site after studying stimuli that each comprised a visual object embedded in a scene, with the name of the object presented auditorily. Participants were then asked to make memory judgments about the studied stimuli that involved recollection of single event features (visual or auditory), or required integration of event features within the same modality, or across modalities. Participants' ability to retrieve context features from across multiple modalities was significantly reduced after AnG stimulation compared to stimulation of the vertex. This effect was observed only for the integration of cross-modal context features but not for integration of features within the same modality, and could not be accounted for by task difficulty as performance was matched across integration conditions following vertex stimulation. These results support the hypothesis that AnG is necessary for the multimodal integration of distributed cortical episodic features into a unified conscious representation that enables the experience of remembering. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.

Science.gov (United States)

Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru

2014-01-01

Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
The safety features of an integrated maritime reactor

International Nuclear Information System (INIS)

Miyakoshi, Junichi; Yamada, Nobuyuki; Kuwahara, Shin-ichi

1975-01-01

The EFDR-80, a typical integrated maritime reactor, which is being developed in West Germany is outlined. The safety features of the integrated maritime reactor are presented with the analysis of reactor accidents and hazards, and are compared with those of the separated maritime reactor. Furthermore, the safety criteria of maritime reactors in Japan and West Germany are compared, and some of the differences are presented from the viewpoint of reactor design and safety analysis. In this report the authors express an earnest desire that the definite and reasonable safety criteria of the integrated maritime reactor should be established and that the safety criteria of the nuclear ship should be standardized internationally. (auth.)
Organisational Semiosis: integration and serparation between system features and workpractices

Directory of Open Access Journals (Sweden)

Rodney Clarke

2001-05-01

Full Text Available Traditional information systems theory and practice assumes a tight coupling or integration between workpractices in organisations and the information systems which are notionally built to support them. The relationship between the integration and the separation of workpractices and system features has been theorised as dialectical. It has also been argued that the goal of system design would be to achieve a dynamic equilibrium within this dialectic. However, this paper argues that the above mentioned dialectic forged between integrationist and separations! views can be usefully critiqued by applying systemic semiotics. Systemic semiotics refers to a combination of systemic functional linguistics (a semiotic model of language and its extensions into a general semiotic framework called social semiotics. The latter draws heavily on the notion of dialogism which this paper proposes is useful in rethinking the relationship between workpractices and information systems. In addition, concepts of text and context are drawn from systemic functional linguistics in analysing the workpractices associated with the use of actual information systems features. Two examples are used to explicate this dialogic relationship, including: (i the dynamic renegotiation of a workpractice which is assumed to be closely integrated to a system feature (negotiated separation, and (ii the extension of the system into other locations by means of communicatively organising materials and users in the workplace (indirect integration.
An input feature selection method applied to fuzzy neural networks for signal esitmation

International Nuclear Information System (INIS)

Na, Man Gyun; Sim, Young Rok

2001-01-01

It is well known that the performance of a fuzzy neural networks strongly depends on the input features selected for its training. In its applications to sensor signal estimation, there are a large number of input variables related with an output. As the number of input variables increases, the training time of fuzzy neural networks required increases exponentially. Thus, it is essential to reduce the number of inputs to a fuzzy neural networks and to select the optimum number of mutually independent inputs that are able to clearly define the input-output mapping. In this work, principal component analysis (PAC), genetic algorithms (GA) and probability theory are combined to select new important input features. A proposed feature selection method is applied to the signal estimation of the steam generator water level, the hot-leg flowrate, the pressurizer water level and the pressurizer pressure sensors in pressurized water reactors and compared with other input feature selection methods
A study of metaheuristic algorithms for high dimensional feature selection on microarray data

Science.gov (United States)

Dankolo, Muhammad Nasiru; Radzi, Nor Haizan Mohamed; Sallehuddin, Roselina; Mustaffa, Noorfa Haszlinna

2017-11-01

Microarray systems enable experts to examine gene profile at molecular level using machine learning algorithms. It increases the potentials of classification and diagnosis of many diseases at gene expression level. Though, numerous difficulties may affect the efficiency of machine learning algorithms which includes vast number of genes features comprised in the original data. Many of these features may be unrelated to the intended analysis. Therefore, feature selection is necessary to be performed in the data pre-processing. Many feature selection algorithms are developed and applied on microarray which including the metaheuristic optimization algorithms. This paper discusses the application of the metaheuristics algorithms for feature selection in microarray dataset. This study reveals that, the algorithms have yield an interesting result with limited resources thereby saving computational expenses of machine learning algorithms.
Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

Science.gov (United States)

Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

2017-01-01

Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods.
Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

Science.gov (United States)

Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

2017-01-01

Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods. PMID:28120883
Finger vein recognition with personalized feature selection.

Science.gov (United States)

Xi, Xiaoming; Yang, Gongping; Yin, Yilong; Meng, Xianjing

2013-08-22

Finger veins are a promising biometric pattern for personalized identification in terms of their advantages over existing biometrics. Based on the spatial pyramid representation and the combination of more effective information such as gray, texture and shape, this paper proposes a simple but powerful feature, called Pyramid Histograms of Gray, Texture and Orientation Gradients (PHGTOG). For a finger vein image, PHGTOG can reflect the global spatial layout and local details of gray, texture and shape. To further improve the recognition performance and reduce the computational complexity, we select a personalized subset of features from PHGTOG for each subject by using the sparse weight vector, which is trained by using LASSO and called PFS-PHGTOG. We conduct extensive experiments to demonstrate the promise of the PHGTOG and PFS-PHGTOG, experimental results on our databases show that PHGTOG outperforms the other existing features. Moreover, PFS-PHGTOG can further boost the performance in comparison with PHGTOG.
Finger Vein Recognition with Personalized Feature Selection

Directory of Open Access Journals (Sweden)

Xianjing Meng

2013-08-01

Full Text Available Finger veins are a promising biometric pattern for personalized identification in terms of their advantages over existing biometrics. Based on the spatial pyramid representation and the combination of more effective information such as gray, texture and shape, this paper proposes a simple but powerful feature, called Pyramid Histograms of Gray, Texture and Orientation Gradients (PHGTOG. For a finger vein image, PHGTOG can reflect the global spatial layout and local details of gray, texture and shape. To further improve the recognition performance and reduce the computational complexity, we select a personalized subset of features from PHGTOG for each subject by using the sparse weight vector, which is trained by using LASSO and called PFS-PHGTOG. We conduct extensive experiments to demonstrate the promise of the PHGTOG and PFS-PHGTOG, experimental results on our databases show that PHGTOG outperforms the other existing features. Moreover, PFS-PHGTOG can further boost the performance in comparison with PHGTOG.
Feature Selection from a Facial Image for Distinction of Sasang Constitution

Directory of Open Access Journals (Sweden)

Imhoi Koo

2009-01-01

Full Text Available Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is the standardization by adopting western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of the distance, angle and the distance ratios, for which there are 1225, 61 250 and 749 700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine truly meaningful features. We suggest a process for the efficient analysis of facial features including the removal of outliers, control for missing data to guarantee data confidence and calculation of statistical significance by applying ANOVA. We show the statistical properties of selected features according to different constitutions using the nine distances, 10 angles and 10 rates of distance features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown here.
Feature Selection from a Facial Image for Distinction of Sasang Constitution

Science.gov (United States)

Koo, Imhoi; Kim, Jong Yeol; Kim, Myoung Geun

2009-01-01

Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is the standardization by adopting western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of the distance, angle and the distance ratios, for which there are 1225, 61 250 and 749 700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine truly meaningful features. We suggest a process for the efficient analysis of facial features including the removal of outliers, control for missing data to guarantee data confidence and calculation of statistical significance by applying ANOVA. We show the statistical properties of selected features according to different constitutions using the nine distances, 10 angles and 10 rates of distance features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown here. PMID:19745013
Infrared face recognition based on LBP histogram and KW feature selection

Science.gov (United States)

Xie, Zhihua

2014-07-01

The conventional LBP-based feature as represented by the local binary pattern (LBP) histogram still has room for performance improvements. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract the local robust features in infrared face images, LBP is chosen to get the composition of micro-patterns of sub-blocks. Based on statistical test theory, Kruskal-Wallis (KW) feature selection method is proposed to get the LBP patterns which are suitable for infrared face recognition. The experimental results show combination of LBP and KW features selection improves the performance of infrared face recognition, the proposed method outperforms the traditional methods based on LBP histogram, discrete cosine transform(DCT) or principal component analysis(PCA).
Relevant test set using feature selection algorithm for early detection ...

African Journals Online (AJOL)

The objective of feature selection is to find the most relevant features for classification. Thus, the dimensionality of the information will be reduced and may improve classification's accuracy. This paper proposed a minimum set of relevant questions that can be used for early detection of dyslexia. In this research, we ...
Receptive fields selection for binary feature description.

Science.gov (United States)

Fan, Bin; Kong, Qingqun; Trzcinski, Tomasz; Wang, Zhiheng; Pan, Chunhong; Fua, Pascal

2014-06-01

Feature description for local image patch is widely used in computer vision. While the conventional way to design local descriptor is based on expert experience and knowledge, learning-based methods for designing local descriptor become more and more popular because of their good performance and data-driven property. This paper proposes a novel data-driven method for designing binary feature descriptor, which we call receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding responses of a set of receptive fields, which are selected from a large number of candidates according to their distinctiveness and correlations in a greedy way. Using two different kinds of receptive fields (namely rectangular pooling area and Gaussian pooling area) for selection, we obtain two binary descriptors RFDR and RFDG .accordingly. Image matching experiments on the well-known patch data set and Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors, and is comparable with the best float-valued descriptors at a fraction of processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.
A ROC-based feature selection method for computer-aided detection and diagnosis

Science.gov (United States)

Wang, Songyuan; Zhang, Guopeng; Liao, Qimei; Zhang, Junying; Jiao, Chun; Lu, Hongbing

2014-03-01

Image-based computer-aided detection and diagnosis (CAD) has been a very active research topic aiming to assist physicians to detect lesions and distinguish them from benign to malignant. However, the datasets fed into a classifier usually suffer from small number of samples, as well as significantly less samples available in one class (have a disease) than the other, resulting in the classifier's suboptimal performance. How to identifying the most characterizing features of the observed data for lesion detection is critical to improve the sensitivity and minimize false positives of a CAD system. In this study, we propose a novel feature selection method mR-FAST that combines the minimal-redundancymaximal relevance (mRMR) framework with a selection metric FAST (feature assessment by sliding thresholds) based on the area under a ROC curve (AUC) generated on optimal simple linear discriminants. With three feature datasets extracted from CAD systems for colon polyps and bladder cancer, we show that the space of candidate features selected by mR-FAST is more characterizing for lesion detection with higher AUC, enabling to find a compact subset of superior features at low cost.
Raman spectral feature selection using ant colony optimization for breast cancer diagnosis.

Science.gov (United States)

Fallahzadeh, Omid; Dehghani-Bidgoli, Zohreh; Assarian, Mohammad

2018-06-04

Pathology as a common diagnostic test of cancer is an invasive, time-consuming, and partially subjective method. Therefore, optical techniques, especially Raman spectroscopy, have attracted the attention of cancer diagnosis researchers. However, as Raman spectra contain numerous peaks involved in molecular bounds of the sample, finding the best features related to cancerous changes can improve the accuracy of diagnosis in this method. The present research attempted to improve the power of Raman-based cancer diagnosis by finding the best Raman features using the ACO algorithm. In the present research, 49 spectra were measured from normal, benign, and cancerous breast tissue samples using a 785-nm micro-Raman system. After preprocessing for removal of noise and background fluorescence, the intensity of 12 important Raman bands of the biological samples was extracted as features of each spectrum. Then, the ACO algorithm was applied to find the optimum features for diagnosis. As the results demonstrated, by selecting five features, the classification accuracy of the normal, benign, and cancerous groups increased by 14% and reached 87.7%. ACO feature selection can improve the diagnostic accuracy of Raman-based diagnostic models. In the present study, features corresponding to ν(C-C) αhelix proline, valine (910-940), νs(C-C) skeletal lipids (1110-1130), and δ(CH2)/δ(CH3) proteins (1445-1460) were selected as the best features in cancer diagnosis.
The effect of destination linked feature selection in real-time network intrusion detection

CSIR Research Space (South Africa)

Mzila, P

2013-07-01

Full Text Available techniques in the network intrusion detection system (NIDS) is the feature selection technique. The ability of NIDS to accurately identify intrusion from the network traffic relies heavily on feature selection, which describes the pattern of the network...
Compensatory selection for roads over natural linear features by wolves in northern Ontario: Implications for caribou conservation.

Directory of Open Access Journals (Sweden)

Erica J Newton

Full Text Available Woodland caribou (Rangifer tarandus caribou in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. This behavioral mechanism-a compensatory functional response to anthropogenic
Integrative analysis of gene expression and DNA methylation using unsupervised feature extraction for detecting candidate cancer biomarkers.

Science.gov (United States)

Moon, Myungjin; Nakai, Kenta

2018-04-01

Currently, cancer biomarker discovery is one of the important research topics worldwide. In particular, detecting significant genes related to cancer is an important task for early diagnosis and treatment of cancer. Conventional studies mostly focus on genes that are differentially expressed in different states of cancer; however, noise in gene expression datasets and insufficient information in limited datasets impede precise analysis of novel candidate biomarkers. In this study, we propose an integrative analysis of gene expression and DNA methylation using normalization and unsupervised feature extractions to identify candidate biomarkers of cancer using renal cell carcinoma RNA-seq datasets. Gene expression and DNA methylation datasets are normalized by Box-Cox transformation and integrated into a one-dimensional dataset that retains the major characteristics of the original datasets by unsupervised feature extraction methods, and differentially expressed genes are selected from the integrated dataset. Use of the integrated dataset demonstrated improved performance as compared with conventional approaches that utilize gene expression or DNA methylation datasets alone. Validation based on the literature showed that a considerable number of top-ranked genes from the integrated dataset have known relationships with cancer, implying that novel candidate biomarkers can also be acquired from the proposed analysis method. Furthermore, we expect that the proposed method can be expanded for applications involving various types of multi-omics datasets.
An improved strategy for skin lesion detection and classification using uniform segmentation and feature selection based approach.

Science.gov (United States)

Nasir, Muhammad; Attique Khan, Muhammad; Sharif, Muhammad; Lali, Ikram Ullah; Saba, Tanzila; Iqbal, Tassawar

2018-02-21

Melanoma is the deadliest type of skin cancer with highest mortality rate. However, the annihilation in early stage implies a high survival rate therefore, it demands early diagnosis. The accustomed diagnosis methods are costly and cumbersome due to the involvement of experienced experts as well as the requirements for highly equipped environment. The recent advancements in computerized solutions for these diagnoses are highly promising with improved accuracy and efficiency. In this article, we proposed a method for the classification of melanoma and benign skin lesions. Our approach integrates preprocessing, lesion segmentation, features extraction, features selection, and classification. Preprocessing is executed in the context of hair removal by DullRazor, whereas lesion texture and color information are utilized to enhance the lesion contrast. In lesion segmentation, a hybrid technique has been implemented and results are fused using additive law of probability. Serial based method is applied subsequently that extracts and fuses the traits such as color, texture, and HOG (shape). The fused features are selected afterwards by implementing a novel Boltzman Entropy method. Finally, the selected features are classified by Support Vector Machine. The proposed method is evaluated on publically available data set PH2. Our approach has provided promising results of sensitivity 97.7%, specificity 96.7%, accuracy 97.5%, and F-score 97.5%, which are significantly better than the results of existing methods available on the same data set. The proposed method detects and classifies melanoma significantly good as compared to existing methods. © 2018 Wiley Periodicals, Inc.

Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System.

Science.gov (United States)

Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir

2015-01-01

The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.
Multi-level gene/MiRNA feature selection using deep belief nets and active learning.

Science.gov (United States)

Ibrahim, Rania; Yousri, Noha A; Ismail, Mohamed A; El-Makky, Nagwa M

2014-01-01

Selecting the most discriminative genes/miRNAs has been raised as an important task in bioinformatics to enhance disease classifiers and to mitigate the dimensionality curse problem. Original feature selection methods choose genes/miRNAs based on their individual features regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability in representing the data in multiple levels of abstraction, allowing for better discrimination between different classes. However, the idea of using deep learning for feature selection is not widely used in the bioinformatics field yet. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension to use the technique for miRNAs is presented by considering the biological relation between miRNAs and genes. Experimental results show that the approach was able to outperform classical feature selection methods in hepatocellular carcinoma (HCC) by 9%, lung cancer by 6% and breast cancer by around 10% in F1-measure. Results also show the enhancement in F1-measure of our approach over recently related work in [1] and [2].
INTEGRATED INFORMATION SYSTEM ARCHITECTURE PROVIDING BEHAVIORAL FEATURE

Directory of Open Access Journals (Sweden)

Vladimir N. Shvedenko

2016-11-01

Full Text Available The paper deals with creation of integrated information system architecture capable of supporting management decisions using behavioral features. The paper considers the architecture of information decision support system for production system management. The behavioral feature is given to an information system, and it ensures extraction, processing of information, management decision-making with both automated and automatic modes of decision-making subsystem being permitted. Practical implementation of information system with behavior is based on service-oriented architecture: there is a set of independent services in the information system that provides data of its subsystems or data processing by separate application under the chosen variant of the problematic situation settlement. For creation of integrated information system with behavior we propose architecture including the following subsystems: data bus, subsystem for interaction with the integrated applications based on metadata, business process management subsystem, subsystem for the current state analysis of the enterprise and management decision-making, behavior training subsystem. For each problematic situation a separate logical layer service is created in Unified Service Bus handling problematic situations. This architecture reduces system information complexity due to the fact that with a constant amount of system elements the number of links decreases, since each layer provides communication center of responsibility for the resource with the services of corresponding applications. If a similar problematic situation occurs, its resolution is automatically removed from problem situation metamodel repository and business process metamodel of its settlement. In the business process performance commands are generated to the corresponding centers of responsibility to settle a problematic situation.
Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information

Directory of Open Access Journals (Sweden)

Atiyeh Mortazavi

2016-01-01

Full Text Available High dimensionality of microarray data sets may lead to low efficiency and overfitting. In this paper, a multiphase cooperative game theoretic feature selection approach is proposed for microarray data classification. In the first phase, due to high dimension of microarray data sets, the features are reduced using one of the two filter-based feature selection methods, namely, mutual information and Fisher ratio. In the second phase, Shapley index is used to evaluate the power of each feature. The main innovation of the proposed approach is to employ Qualitative Mutual Information (QMI for this purpose. The idea of Qualitative Mutual Information causes the selected features to have more stability and this stability helps to deal with the problem of data imbalance and scarcity. In the third phase, a forward selection scheme is applied which uses a scoring function to weight each feature. The performance of the proposed method is compared with other popular feature selection algorithms such as Fisher ratio, minimum redundancy maximum relevance, and previous works on cooperative game based feature selection. The average classification accuracy on eleven microarray data sets shows that the proposed method improves both average accuracy and average stability compared to other approaches.
A new and fast image feature selection method for developing an optimal mammographic mass detection scheme.

Science.gov (United States)

Tan, Maxine; Pu, Jiantao; Zheng, Bin

2014-08-01

Selecting optimal features from a large image feature pool remains a major challenge in developing computer-aided detection (CAD) schemes of medical images. The objective of this study is to investigate a new approach to significantly improve efficacy of image feature selection and classifier optimization in developing a CAD scheme of mammographic masses. An image dataset including 1600 regions of interest (ROIs) in which 800 are positive (depicting malignant masses) and 800 are negative (depicting CAD-generated false positive regions) was used in this study. After segmentation of each suspicious lesion by a multilayer topographic region growth algorithm, 271 features were computed in different feature categories including shape, texture, contrast, isodensity, spiculation, local topological features, as well as the features related to the presence and location of fat and calcifications. Besides computing features from the original images, the authors also computed new texture features from the dilated lesion segments. In order to select optimal features from this initial feature pool and build a highly performing classifier, the authors examined and compared four feature selection methods to optimize an artificial neural network (ANN) based classifier, namely: (1) Phased Searching with NEAT in a Time-Scaled Framework, (2) A sequential floating forward selection (SFFS) method, (3) A genetic algorithm (GA), and (4) A sequential forward selection (SFS) method. Performances of the four approaches were assessed using a tenfold cross validation method. Among these four methods, SFFS has highest efficacy, which takes 3%-5% of computational time as compared to GA approach, and yields the highest performance level with the area under a receiver operating characteristic curve (AUC) = 0.864 ± 0.034. The results also demonstrated that except using GA, including the new texture features computed from the dilated mass segments improved the AUC results of the ANNs optimized
Sparse Bayesian classification and feature selection for biological expression data with high correlations.

Directory of Open Access Journals (Sweden)

Xian Yang

Full Text Available Classification models built on biological expression data are increasingly used to predict distinct disease subtypes. Selected features that separate sample groups can be the candidates of biomarkers, helping us to discover biological functions/pathways. However, three challenges are associated with building a robust classification and feature selection model: 1 the number of significant biomarkers is much smaller than that of measured features for which the search will be exhaustive; 2 current biological expression data are big in both sample size and feature size which will worsen the scalability of any search algorithms; and 3 expression profiles of certain features are typically highly correlated which may prevent to distinguish the predominant features. Unfortunately, most of the existing algorithms are partially addressing part of these challenges but not as a whole. In this paper, we propose a unified framework to address the above challenges. The classification and feature selection problem is first formulated as a nonconvex optimisation problem. Then the problem is relaxed and solved iteratively by a sequence of convex optimisation procedures which can be distributed computed and therefore allows the efficient implementation on advanced infrastructures. To illustrate the competence of our method over others, we first analyse a randomly generated simulation dataset under various conditions. We then analyse a real gene expression dataset on embryonal tumour. Further downstream analysis, such as functional annotation and pathway analysis, are performed on the selected features which elucidate several biological findings.
Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

Directory of Open Access Journals (Sweden)

Pavol Partila

2015-01-01

Full Text Available The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.
Jointly Feature Learning and Selection for Robust Tracking via a Gating Mechanism.

Directory of Open Access Journals (Sweden)

Bineng Zhong

Full Text Available To achieve effective visual tracking, a robust feature representation composed of two separate components (i.e., feature learning and selection for an object is one of the key issues. Typically, a common assumption used in visual tracking is that the raw video sequences are clear, while real-world data is with significant noise and irrelevant patterns. Consequently, the learned features may be not all relevant and noisy. To address this problem, we propose a novel visual tracking method via a point-wise gated convolutional deep network (CPGDN that jointly performs the feature learning and feature selection in a unified framework. The proposed method performs dynamic feature selection on raw features through a gating mechanism. Therefore, the proposed method can adaptively focus on the task-relevant patterns (i.e., a target object, while ignoring the task-irrelevant patterns (i.e., the surrounding background of a target object. Specifically, inspired by transfer learning, we firstly pre-train an object appearance model offline to learn generic image features and then transfer rich feature hierarchies from an offline pre-trained CPGDN into online tracking. In online tracking, the pre-trained CPGDN model is fine-tuned to adapt to the tracking specific objects. Finally, to alleviate the tracker drifting problem, inspired by an observation that a visual target should be an object rather than not, we combine an edge box-based object proposal method to further improve the tracking accuracy. Extensive evaluation on the widely used CVPR2013 tracking benchmark validates the robustness and effectiveness of the proposed method.
Emotional textile image classification based on cross-domain convolutional sparse autoencoders with feature selection

Science.gov (United States)

Li, Zuhe; Fan, Yangyu; Liu, Weihua; Yu, Zeqi; Wang, Fengqin

2017-01-01

We aim to apply sparse autoencoder-based unsupervised feature learning to emotional semantic analysis for textile images. To tackle the problem of limited training data, we present a cross-domain feature learning scheme for emotional textile image classification using convolutional autoencoders. We further propose a correlation-analysis-based feature selection method for the weights learned by sparse autoencoders to reduce the number of features extracted from large size images. First, we randomly collect image patches on an unlabeled image dataset in the source domain and learn local features with a sparse autoencoder. We then conduct feature selection according to the correlation between different weight vectors corresponding to the autoencoder's hidden units. We finally adopt a convolutional neural network including a pooling layer to obtain global feature activations of textile images in the target domain and send these global feature vectors into logistic regression models for emotional image classification. The cross-domain unsupervised feature learning method achieves 65% to 78% average accuracy in the cross-validation experiments corresponding to eight emotional categories and performs better than conventional methods. Feature selection can reduce the computational cost of global feature extraction by about 50% while improving classification performance.
Different cortical mechanisms for spatial vs. feature-based attentional selection in visual working memory

Directory of Open Access Journals (Sweden)

Anna Heuer

2016-08-01

Full Text Available The limited capacity of visual working memory necessitates attentional mechanisms that selectively update and maintain only the most task-relevant content. Psychophysical experiments have shown that the retroactive selection of memory content can be based on visual properties such as location or shape, but the neural basis for such differential selection is unknown. For example, it is not known if there are different cortical modules specialized for spatial versus feature-based mnemonic attention, in the same way that has been demonstrated for attention to perceptual input. Here, we used transcranial magnetic stimulation (TMS to identify areas in human parietal and occipital cortex involved in the selection of objects from memory based on cues to their location (spatial information or their shape (featural information. We found that TMS over the supramarginal gyrus (SMG selectively facilitated spatial selection, whereas TMS over the lateral occipital cortex selectively enhanced feature-based selection for remembered objects in the contralateral visual field. Thus, different cortical regions are responsible for spatial vs. feature-based selection of working memory representations. Since the same regions are involved in attention to external events, these new findings indicate overlapping mechanisms for attentional control over perceptual input and mnemonic representations.
Selection of morphological features of pollen grains for chosen tree taxa

Directory of Open Access Journals (Sweden)

Agnieszka Kubik-Komar

2018-05-01

Full Text Available The basis of aerobiological studies is to monitor airborne pollen concentrations and pollen season timing. This task is performed by appropriately trained staff and is difficult and time consuming. The goal of this research is to select morphological characteristics of grains that are the most discriminative for distinguishing between birch, hazel and alder taxa and are easy to determine automatically from microscope images. This selection is based on the split attributes of the J4.8 classification trees built for different subsets of features. Determining the discriminative features by this method, we provide specific rules for distinguishing between individual taxa, at the same time obtaining a high percentage of correct classification. The most discriminative among the 13 morphological characteristics studied are the following: number of pores, maximum axis, minimum axis, axes difference, maximum oncus width, and number of lateral pores. The classification result of the tree based on this subset is better than the one built on the whole feature set and it is almost 94%. Therefore, selection of attributes before tree building is recommended. The classification results for the features easiest to obtain from the image, i.e. maximum axis, minimum axis, axes difference, and number of lateral pores, are only 2.09 pp lower than those obtained for the complete set, but 3.23 pp lower than the results obtained for the selected most discriminating attributes only.
A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

KAUST Repository

Abusamra, Heba

2013-11-01

Microarray gene expression data gained great importance in recent years due to its role in disease diagnoses and prognoses which help to choose the appropriate treatment plan for patients. This technology has shifted a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area due to their native nature of “high dimensional low sample size”. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed in this case to aid to correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improve treatment strategies. This paper aims on a comparative study of state-of-the- art feature selection methods, classification methods, and the combination of them, based on gene expression data. We compared the efficiency of three different classification methods including: support vector machines, k-nearest neighbor and random forest, and eight different feature selection methods, including: information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine. Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.
A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

KAUST Repository

Abusamra, Heba

2013-01-01

Microarray gene expression data gained great importance in recent years due to its role in disease diagnoses and prognoses which help to choose the appropriate treatment plan for patients. This technology has shifted a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area due to their native nature of “high dimensional low sample size”. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed in this case to aid to correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improve treatment strategies. This paper aims on a comparative study of state-of-the- art feature selection methods, classification methods, and the combination of them, based on gene expression data. We compared the efficiency of three different classification methods including: support vector machines, k-nearest neighbor and random forest, and eight different feature selection methods, including: information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine. Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.
An integrated multi-sensor fusion-based deep feature learning approach for rotating machinery diagnosis

Science.gov (United States)

Liu, Jie; Hu, Youmin; Wang, Yan; Wu, Bo; Fan, Jikai; Hu, Zhongxu

2018-05-01

The diagnosis of complicated fault severity problems in rotating machinery systems is an important issue that affects the productivity and quality of manufacturing processes and industrial applications. However, it usually suffers from several deficiencies. (1) A considerable degree of prior knowledge and expertise is required to not only extract and select specific features from raw sensor signals, and but also choose a suitable fusion for sensor information. (2) Traditional artificial neural networks with shallow architectures are usually adopted and they have a limited ability to learn the complex and variable operating conditions. In multi-sensor-based diagnosis applications in particular, massive high-dimensional and high-volume raw sensor signals need to be processed. In this paper, an integrated multi-sensor fusion-based deep feature learning (IMSFDFL) approach is developed to identify the fault severity in rotating machinery processes. First, traditional statistics and energy spectrum features are extracted from multiple sensors with multiple channels and combined. Then, a fused feature vector is constructed from all of the acquisition channels. Further, deep feature learning with stacked auto-encoders is used to obtain the deep features. Finally, the traditional softmax model is applied to identify the fault severity. The effectiveness of the proposed IMSFDFL approach is primarily verified by a one-stage gearbox experimental platform that uses several accelerometers under different operating conditions. This approach can identify fault severity more effectively than the traditional approaches.
Feature integration and spatial attention: common processes for endogenous and exogenous orienting.

Science.gov (United States)

Henderickx, David; Maetens, Kathleen; Soetens, Eric

2010-05-01

Briand (J Exp Psychol Hum Percept Perform 24:1243-1256, 1998) and Briand and Klein (J Exp Psychol Hum Percept Perform 13:228-241, 1987) demonstrated that spatial cueing effects are larger for detecting conjunction of features than for detecting simple features when spatial attention is oriented exogenously, and not when attention is oriented endogenously. Their results were interpreted as if only exogenous attention affects the posterior spatial attention system that performs the feature binding function attributed to spatial attention by Treisman's feature integration theory (FIT; 1980). In a series of 6 experiments, we attempted to replicate Briand's findings. Manipulations of distractor string size and symmetry of stimulus presentation left and right from fixation were implemented in Posner's cueing paradigm. The data indicate that both exogenous and endogenous cueing address the same attentional mechanism needed for feature binding. The results also limit the generalisability of Briand's proposal concerning the role of exogenous attention in feature integration. Furthermore, the importance to control the effect of unintended attentional capture in a cueing task is demonstrated.
Toward optimal feature selection using ranking methods and classification algorithms

Directory of Open Access Journals (Sweden)

Novaković Jasmina

2011-01-01

Full Text Available We presented a comparison between several feature ranking methods used on two real datasets. We considered six ranking methods that can be divided into two broad categories: statistical and entropy-based. Four supervised learning algorithms are adopted to build models, namely, IB1, Naive Bayes, C4.5 decision tree and the RBF network. We showed that the selection of ranking methods could be important for classification accuracy. In our experiments, ranking methods with different supervised learning algorithms give quite different results for balanced accuracy. Our cases confirm that, in order to be sure that a subset of features giving the highest accuracy has been selected, the use of many different indices is recommended.
Feature selection for Bayesian network classifiers using the MDL-FS score

NARCIS (Netherlands)

Drugan, Madalina M.; Wiering, Marco A.

When constructing a Bayesian network classifier from data, the more or less redundant features included in a dataset may bias the classifier and as a consequence may result in a relatively poor classification accuracy. In this paper, we study the problem of selecting appropriate subsets of features
Framework for the Integration of Mobile Device Features in PLM

OpenAIRE

Hopf, Jens Michael

2016-01-01

Currently, companies have covered their business processes with stationary workstations while mobile business applications have limited relevance. Companies can cover their overall business processes more time-efficiently and cost-effectively when they integrate mobile users in workflows using mobile device features. The objective is a framework that can be used to model and control business applications for PLM processes using mobile device features to allow a totally new user experience.
Survival Prediction and Feature Selection in Patients with Breast Cancer Using Support Vector Regression

Directory of Open Access Journals (Sweden)

Shahrbanoo Goli

2016-01-01

Full Text Available The Support Vector Regression (SVR model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed and SVR with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: combination of SVR and statistical tests, univariate feature selection based on concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the best features associated to survival. Also, according to the obtained results, performance of linear and nonlinear kernels is comparable. The proposed SVR model performs similar to or slightly better than other models. Also, SVR performs similar to or better than Cox when all features are included in model.
Feature selection gait-based gender classification under different circumstances

Science.gov (United States)

Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

2014-05-01

This paper proposes a gender classification based on human gait features and investigates the problem of two variations: clothing (wearing coats) and carrying bag condition as addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying wavelet transform. Three different sets of feature are proposed in this method. First, Spatio-temporal distance that is dealing with the distance of different parts of the human body (like feet, knees, hand, Human Height and shoulder) during one gait cycle. The second and third feature sets are constructed from approximation and non-approximation coefficient of human body respectively. To extract these two sets of feature we divided the human body into two parts, upper and lower body part, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize their discriminating significance. Finally k-Nearest Neighbor is applied as a classification method. Experimental results demonstrate that our approach is providing more realistic scenario and relatively better performance compared with the existing approaches.

Comparisons and Selections of Features and Classifiers for Short Text Classification

Science.gov (United States)

Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

2017-10-01

Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.
D3.5 Report on ECO social network integration features

NARCIS (Netherlands)

Viñuales, Javier; Driesner, Jorge; Tejera, Sara; Tomasini, Alessandra; Loozen, Kjeld; Rocio, Vítor; Bohuschke, Felix; Ternier, Stefaan

2016-01-01

This document describes how integrations with social networks are being developed in ECO platforms. With these new features, participants will be able to share results and other contents through Facebook, Twitter, Google plus.
Feature Selection Methods for Robust Decoding of Finger Movements in a Non-human Primate

Science.gov (United States)

Padmanaban, Subash; Baker, Justin; Greger, Bradley

2018-01-01

Objective: The performance of machine learning algorithms used for neural decoding of dexterous tasks may be impeded due to problems arising when dealing with high-dimensional data. The objective of feature selection algorithms is to choose a near-optimal subset of features from the original feature space to improve the performance of the decoding algorithm. The aim of our study was to compare the effects of four feature selection techniques, Wilcoxon signed-rank test, Relative Importance, Principal Component Analysis (PCA), and Mutual Information Maximization on SVM classification performance for a dexterous decoding task. Approach: A nonhuman primate (NHP) was trained to perform small coordinated movements—similar to typing. An array of microelectrodes was implanted in the hand area of the motor cortex of the NHP and used to record action potentials (AP) during finger movements. A Support Vector Machine (SVM) was used to classify which finger movement the NHP was making based upon AP firing rates. We used the SVM classification to examine the functional parameters of (i) robustness to simulated failure and (ii) longevity of classification. We also compared the effect of using isolated-neuron and multi-unit firing rates as the feature vector supplied to the SVM. Main results: The average decoding accuracy for multi-unit features and single-unit features using Mutual Information Maximization (MIM) across 47 sessions was 96.74 ± 3.5% and 97.65 ± 3.36% respectively. The reduction in decoding accuracy between using 100% of the features and 10% of features based on MIM was 45.56% (from 93.7 to 51.09%) and 4.75% (from 95.32 to 90.79%) for multi-unit and single-unit features respectively. MIM had best performance compared to other feature selection methods. Significance: These results suggest improved decoding performance can be achieved by using optimally selected features. The results based on clinically relevant performance metrics also suggest that the decoding
Feature Selection based on Machine Learning in MRIs for Hippocampal Segmentation

Science.gov (United States)

Tangaro, Sabina; Amoroso, Nicola; Brescia, Massimo; Cavuoti, Stefano; Chincarini, Andrea; Errico, Rosangela; Paolo, Inglese; Longo, Giuseppe; Maglietta, Rosalia; Tateo, Andrea; Riccio, Giuseppe; Bellotti, Roberto

2015-01-01

Neurodegenerative diseases are frequently associated with structural changes in the brain. Magnetic resonance imaging (MRI) scans can show these variations and therefore can be used as a supportive feature for a number of neurodegenerative diseases. The hippocampus has been known to be a biomarker for Alzheimer disease and other neurological and psychiatric diseases. However, it requires accurate, robust, and reproducible delineation of hippocampal structures. Fully automatic methods are usually the voxel based approach; for each voxel a number of local features were calculated. In this paper, we compared four different techniques for feature selection from a set of 315 features extracted for each voxel: (i) filter method based on the Kolmogorov-Smirnov test; two wrapper methods, respectively, (ii) sequential forward selection and (iii) sequential backward elimination; and (iv) embedded method based on the Random Forest Classifier on a set of 10 T1-weighted brain MRIs and tested on an independent set of 25 subjects. The resulting segmentations were compared with manual reference labelling. By using only 23 feature for each voxel (sequential backward elimination) we obtained comparable state-of-the-art performances with respect to the standard tool FreeSurfer.
EEG-based recognition of video-induced emotions: selecting subject-independent feature set.

Science.gov (United States)

Kortelainen, Jukka; Seppänen, Tapio

2013-01-01

Emotions are fundamental for everyday life affecting our communication, learning, perception, and decision making. Including emotions into the human-computer interaction (HCI) could be seen as a significant step forward offering a great potential for developing advanced future technologies. While the electrical activity of the brain is affected by emotions, offers electroencephalogram (EEG) an interesting channel to improve the HCI. In this paper, the selection of subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In future, further analysis of the video-induced EEG changes including the topographical differences in the spectral features is needed.
Fast Branch & Bound algorithms for optimal feature selection

Czech Academy of Sciences Publication Activity Database

Somol, Petr; Pudil, Pavel; Kittler, J.

2004-01-01

Roč. 26, č. 7 (2004), s. 900-912 ISSN 0162-8828 R&D Projects: GA ČR GA402/02/1271; GA ČR GA402/03/1310; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : subset search * feature selection * search tree Subject RIV: BD - Theory of Information Impact factor: 4.352, year: 2004
Genetic Fuzzy System (GFS based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

Directory of Open Access Journals (Sweden)

Meenakshi M. Pawar

2016-09-01

Full Text Available Breast cancer is significant health problem diagnosed mostly in women worldwide. Therefore, early detection of breast cancer is performed with the help of digital mammography, which can reduce mortality rate. This paper presents wrapper based feature selection approach for wavelet co-occurrence feature (WCF using Genetic Fuzzy System (GFS in mammogram classification problem. The performance of GFS algorithm is explained using mini-MIAS database. WCF features are obtained from detail wavelet coefficients at each level of decomposition of mammogram image. At first level of decomposition, 18 features are applied to GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at second level it selects 9 features from 36 features and the classification success rate is improved to 56.75%. For third level, 16 features are selected from 54 features and average success rate is improved to 64.98%. Lastly, at fourth level 72 features are applied to GFS, which selects 16 features and thereby increasing average success rate to 89.47%. Hence, GFS algorithm is the effective way of obtaining optimal set of feature in breast cancer diagnosis.
Genetic Particle Swarm Optimization–Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection

Science.gov (United States)

Chen, Qiang; Chen, Yunhao; Jiang, Weiguo

2016-01-01

In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm. PMID:27483285
Genetic Particle Swarm Optimization-Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection.

Science.gov (United States)

Chen, Qiang; Chen, Yunhao; Jiang, Weiguo

2016-07-30

In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm.
Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets

NARCIS (Netherlands)

Aalaei, Shokoufeh; Shahraki, Hadi; Rowhanimanesh, Alireza; Eslami, Saeid

2016-01-01

This study addresses feature selection for breast cancer diagnosis. The present process uses a wrapper approach using GA-based on feature selection and PS-classifier. The results of experiment show that the proposed model is comparable to the other models on Wisconsin breast cancer datasets. To
A Local Asynchronous Distributed Privacy Preserving Feature Selection Algorithm for Large Peer-to-Peer Networks

Data.gov (United States)

National Aeronautics and Space Administration — In this paper we develop a local distributed privacy preserving algorithm for feature selection in a large peer-to-peer environment. Feature selection is often used...
Obstructive sleep apnea screening by integrating snore feature classes

International Nuclear Information System (INIS)

Abeyratne, U R; De Silva, S; Hukins, C; Duce, B

2013-01-01

Obstructive sleep apnea (OSA) is a serious sleep disorder with high community prevalence. More than 80% of OSA suffers remain undiagnosed. Polysomnography (PSG) is the current reference standard used for OSA diagnosis. It is expensive, inconvenient and demands the extensive involvement of a sleep technologist. At present, a low cost, unattended, convenient OSA screening technique is an urgent requirement. Snoring is always almost associated with OSA and is one of the earliest nocturnal symptoms. With the onset of sleep, the upper airway undergoes both functional and structural changes, leading to spatially and temporally distributed sites conducive to snore sound (SS) generation. The goal of this paper is to investigate the possibility of developing a snore based multi-feature class OSA screening tool by integrating snore features that capture functional, structural, and spatio-temporal dependences of SS. In this paper, we focused our attention to the features in voiced parts of a snore, where quasi-repetitive packets of energy are visible. Individual snore feature classes were then optimized using logistic regression for optimum OSA diagnostic performance. Consequently, all feature classes were integrated and optimized to obtain optimum OSA classification sensitivity and specificity. We also augmented snore features with neck circumference, which is a one-time measurement readily available at no extra cost. The performance of the proposed method was evaluated using snore recordings from 86 subjects (51 males and 35 females). Data from each subject consisted of 6–8 h long sound recordings, made concurrently with routine PSG in a clinical sleep laboratory. Clinical diagnosis supported by standard PSG was used as the reference diagnosis to compare our results against. Our proposed techniques resulted in a sensitivity of 93±9% with specificity 93±9% for females and sensitivity of 92±6% with specificity 93±7% for males at an AHI decision threshold of 15 events
Self-Adaptive MOEA Feature Selection for Classification of Bankruptcy Prediction Data

Science.gov (United States)

Gaspar-Cunha, A.; Recio, G.; Costa, L.; Estébanez, C.

2014-01-01

Bankruptcy prediction is a vast area of finance and accounting whose importance lies in the relevance for creditors and investors in evaluating the likelihood of getting into bankrupt. As companies become complex, they develop sophisticated schemes to hide their real situation. In turn, making an estimation of the credit risks associated with counterparts or predicting bankruptcy becomes harder. Evolutionary algorithms have shown to be an excellent tool to deal with complex problems in finances and economics where a large number of irrelevant features are involved. This paper provides a methodology for feature selection in classification of bankruptcy data sets using an evolutionary multiobjective approach that simultaneously minimise the number of features and maximise the classifier quality measure (e.g., accuracy). The proposed methodology makes use of self-adaptation by applying the feature selection algorithm while simultaneously optimising the parameters of the classifier used. The methodology was applied to four different sets of data. The obtained results showed the utility of using the self-adaptation of the classifier. PMID:24707201
Reliability criteria selection for integrated resource planning

International Nuclear Information System (INIS)

Ruiu, D.; Ye, C.; Billinton, R.; Lakhanpal, D.

1993-01-01

A study was conducted on the selection of a generating system reliability criterion that ensures a reasonable continuity of supply while minimizing the total costs to utility customers. The study was conducted using the Institute for Electronic and Electrical Engineers (IEEE) reliability test system as the study system. The study inputs and results for conditions and load forecast data, new supply resources data, demand-side management resource data, resource planning criterion, criterion value selection, supply side development, integrated resource development, and best criterion values, are tabulated and discussed. Preliminary conclusions are drawn as follows. In the case of integrated resource planning, the selection of the best value for a given type of reliability criterion can be done using methods similar to those used for supply side planning. The reliability criteria values previously used for supply side planning may not be economically justified when integrated resource planning is used. Utilities may have to revise and adopt new, and perhaps lower supply reliability criteria for integrated resource planning. More complex reliability criteria, such as energy related indices, which take into account the magnitude, frequency and duration of the expected interruptions are better adapted than the simpler capacity-based reliability criteria such as loss of load expectation. 7 refs., 5 figs., 10 tabs
The impact of feature selection on one and two-class classification performance for plant microRNAs.

Science.gov (United States)

Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens

2016-01-01

MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. It ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC where they act as recognition keys to aid in regulation of target mRNAs. It is involved to determine miRNAs experimentally and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although, in general, two-class classification (TCC) is used in the field; because negative examples are hard to come by, one-class classification (OCC) has been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC where the best feature selection method achieved an average accuracy of 95.6%, thereby being ∼29% better than the worst method which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ∼13% which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.
The impact of feature selection on one and two-class classification performance for plant microRNAs

Directory of Open Access Journals (Sweden)

Waleed Khalifa

2016-06-01

Full Text Available MicroRNAs (miRNAs are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. It ultimately leads to the incorporation of 18–24 nt long mature miRNAs into RISC where they act as recognition keys to aid in regulation of target mRNAs. It is involved to determine miRNAs experimentally and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although, in general, two-class classification (TCC is used in the field; because negative examples are hard to come by, one-class classification (OCC has been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC where the best feature selection method achieved an average accuracy of 95.6%, thereby being ∼29% better than the worst method which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ∼13% which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.
Enhancing Web Service Selection by User Preferences of Non-Functional Features

OpenAIRE

Badr , Youakim; Abraham , Ajith; Biennier , Frédérique; Grosan , Crina

2008-01-01

International audience; Selection of an appropriate Web service for a particular task has become a difficult challenge due to the increasing number of Web services offering similar functionalities. The functional properties describe what the service can do and the nonfunctional properties depict how the service can do it. Non-functional properties involving qualitative or quantitative features have become essential criteria to enhance the selection process of services making the selection pro...
Cross-modal integration of lexical-semantic features during word processing: evidence from oscillatory dynamics during EEG.

Directory of Open Access Journals (Sweden)

Markus J van Ackeren

Full Text Available In recent years, numerous studies have provided converging evidence that word meaning is partially stored in modality-specific cortical networks. However, little is known about the mechanisms supporting the integration of this distributed semantic content into coherent conceptual representations. In the current study we aimed to address this issue by using EEG to look at the spatial and temporal dynamics of feature integration during word comprehension. Specifically, participants were presented with two modality-specific features (i.e., visual or auditory features such as silver and loud and asked to verify whether these two features were compatible with a subsequently presented target word (e.g., WHISTLE. Each pair of features described properties from either the same modality (e.g., silver, tiny = visual features or different modalities (e.g., silver, loud = visual, auditory. Behavioral and EEG data were collected. The results show that verifying features that are putatively represented in the same modality-specific network is faster than verifying features across modalities. At the neural level, integrating features across modalities induces sustained oscillatory activity around the theta range (4-6 Hz in left anterior temporal lobe (ATL, a putative hub for integrating distributed semantic content. In addition, enhanced long-range network interactions in the theta range were seen between left ATL and a widespread cortical network. These results suggest that oscillatory dynamics in the theta range could be involved in integrating multimodal semantic content by creating transient functional networks linking distributed modality-specific networks and multimodal semantic hubs such as left ATL.
Particle swarm optimization based feature enhancement and feature selection for improved emotion recognition in speech and glottal signals.

Science.gov (United States)

Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali

2015-01-01

In the recent years, many research works have been published using speech related features for speech emotion recognition, however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstralcoefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and its glottal waveforms(GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features respectively. Three different emotional speech databases were utilized to gauge the proposed method. Extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature.
Structural features of subtype-selective EP receptor modulators.

Science.gov (United States)

Markovič, Tijana; Jakopin, Žiga; Dolenc, Marija Sollner; Mlinarič-Raščan, Irena

2017-01-01

Prostaglandin E2 is a potent endogenous molecule that binds to four different G-protein-coupled receptors: EP1-4. Each of these receptors is a valuable drug target, with distinct tissue localisation and signalling pathways. We review the structural features of EP modulators required for subtype-selective activity, as well as the structural requirements for improved pharmacokinetic parameters. Novel EP receptor subtype selective agonists and antagonists appear to be valuable drug candidates in the therapy of many pathophysiological states, including ulcerative colitis, glaucoma, bone healing, B cell lymphoma, neurological diseases, among others, which have been studied in vitro, in vivo and in early phase clinical trials. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

Improving mass candidate detection in mammograms via feature maxima propagation and local feature selection.

Science.gov (United States)

Melendez, Jaime; Sánchez, Clara I; van Ginneken, Bram; Karssemeijer, Nico

2014-08-01

Mass candidate detection is a crucial component of multistep computer-aided detection (CAD) systems. It is usually performed by combining several local features by means of a classifier. When these features are processed on a per-image-location basis (e.g., for each pixel), mismatching problems may arise while constructing feature vectors for classification, which is especially true when the behavior expected from the evaluated features is a peaked response due to the presence of a mass. In this study, two of these problems, consisting of maxima misalignment and differences of maxima spread, are identified and two solutions are proposed. The first proposed method, feature maxima propagation, reproduces feature maxima through their neighboring locations. The second method, local feature selection, combines different subsets of features for different feature vectors associated with image locations. Both methods are applied independently and together. The proposed methods are included in a mammogram-based CAD system intended for mass detection in screening. Experiments are carried out with a database of 382 digital cases. Sensitivity is assessed at two sets of operating points. The first one is the interval of 3.5-15 false positives per image (FPs/image), which is typical for mass candidate detection. The second one is 1 FP/image, which allows to estimate the quality of the mass candidate detector's output for use in subsequent steps of the CAD system. The best results are obtained when the proposed methods are applied together. In that case, the mean sensitivity in the interval of 3.5-15 FPs/image significantly increases from 0.926 to 0.958 (p < 0.0002). At the lower rate of 1 FP/image, the mean sensitivity improves from 0.628 to 0.734 (p < 0.0002). Given the improved detection performance, the authors believe that the strategies proposed in this paper can render mass candidate detection approaches based on image location classification more robust to feature
Improving permafrost distribution modelling using feature selection algorithms

Science.gov (United States)

Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail

2016-04-01

The availability of an increasing number of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional dataset is the number of input features (variables) involved. Application of ML classification algorithms to this large number of variables leads to the risk of overfitting, with the consequence of a poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps simplifying the amount of factors required and improves the knowledge on adopted features and their relation with the studied phenomenon. Moreover, taking away irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidences (geophysical and thermal data and rock glacier inventories) that serve as training permafrost data. Used FS algorithms informed about variables that appeared less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its
Feature-selective attention: evidence for a decline in old age.

Science.gov (United States)

Quigley, Cliodhna; Andersen, Søren K; Schulze, Lars; Grunwald, Martin; Müller, Matthias M

2010-04-19

Although attention in older adults is an active research area, feature-selective aspects have not yet been explicitly studied. Here we report the results of an exploratory study involving directed changes in feature-selective attention. The stimuli used were two random dot kinematograms (RDKs) of different colours, superimposed and centrally presented. A colour cue with random onset after the beginning of each trial instructed young and older subjects to attend to one of the RDKs and detect short intervals of coherent motion while ignoring analogous motion events in the non-cued RDK. Behavioural data show that older adults could detect motion, but discriminated target from distracter motion less reliably than young adults. The method of frequency tagging allowed us to separate the EEG responses to the attended and ignored stimuli and directly compare steady-state visual evoked potential (SSVEP) amplitudes elicited by each stimulus before and after cue onset. We found that younger adults show a clear attentional enhancement of SSVEP amplitude in the post-cue interval, while older adults' SSVEP responses to attended and ignored stimuli do not differ. Thus, in situations where attentional selection cannot be spatially resolved, older adults show a deficit in selection that is not shared by young adults. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
A kernel-based multivariate feature selection method for microarray data classification.

Directory of Open Access Journals (Sweden)

Shiquan Sun

Full Text Available High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies of features. Although few publications have devoted their attention to reveal the relationship of features by multivariate-based methods, these methods describe relationships among features only by linear methods. While simple linear combination relationship restrict the improvement in performance. In this paper, we used kernel method to discover inherent nonlinear correlations among features as well as between feature and target. Moreover, the number of orthogonal components was determined by kernel Fishers linear discriminant analysis (FLDA in a self-adaptive manner rather than by manual parameter settings. In order to reveal the effectiveness of our method we performed several experiments and compared the results between our method and other competitive multivariate-based features selectors. In our comparison, we used two classifiers (support vector machine, [Formula: see text]-nearest neighbor on two group datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than others, especially on three hard-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.
Integrative and distinctive coding of visual and conceptual object features in the ventral visual stream.

Science.gov (United States)

Martin, Chris B; Douglas, Danielle; Newsome, Rachel N; Man, Louisa Ly; Barense, Morgan D

2018-02-02

A significant body of research in cognitive neuroscience is aimed at understanding how object concepts are represented in the human brain. However, it remains unknown whether and where the visual and abstract conceptual features that define an object concept are integrated. We addressed this issue by comparing the neural pattern similarities among object-evoked fMRI responses with behavior-based models that independently captured the visual and conceptual similarities among these stimuli. Our results revealed evidence for distinctive coding of visual features in lateral occipital cortex, and conceptual features in the temporal pole and parahippocampal cortex. By contrast, we found evidence for integrative coding of visual and conceptual object features in perirhinal cortex. The neuroanatomical specificity of this effect was highlighted by results from a searchlight analysis. Taken together, our findings suggest that perirhinal cortex uniquely supports the representation of fully specified object concepts through the integration of their visual and conceptual features. © 2018, Martin et al.
Integrative and distinctive coding of visual and conceptual object features in the ventral visual stream

Science.gov (United States)

Douglas, Danielle; Newsome, Rachel N; Man, Louisa LY

2018-01-01

A significant body of research in cognitive neuroscience is aimed at understanding how object concepts are represented in the human brain. However, it remains unknown whether and where the visual and abstract conceptual features that define an object concept are integrated. We addressed this issue by comparing the neural pattern similarities among object-evoked fMRI responses with behavior-based models that independently captured the visual and conceptual similarities among these stimuli. Our results revealed evidence for distinctive coding of visual features in lateral occipital cortex, and conceptual features in the temporal pole and parahippocampal cortex. By contrast, we found evidence for integrative coding of visual and conceptual object features in perirhinal cortex. The neuroanatomical specificity of this effect was highlighted by results from a searchlight analysis. Taken together, our findings suggest that perirhinal cortex uniquely supports the representation of fully specified object concepts through the integration of their visual and conceptual features. PMID:29393853
Linear feature selection in texture analysis - A PLS based method

DEFF Research Database (Denmark)

Marques, Joselene; Igel, Christian; Lillholm, Martin

2013-01-01

We present a texture analysis methodology that combined uncommitted machine-learning techniques and partial least square (PLS) in a fully automatic framework. Our approach introduces a robust PLS-based dimensionality reduction (DR) step to specifically address outliers and high-dimensional feature...... and considering all CV groups, the methods selected 36 % of the original features available. The diagnosis evaluation reached a generalization area-under-the-ROC curve of 0.92, which was higher than established cartilage-based markers known to relate to OA diagnosis....
PACS administrators' and radiologists' perspective on the importance of features for PACS selection.

Science.gov (United States)

Joshi, Vivek; Narra, Vamsi R; Joshi, Kailash; Lee, Kyootai; Melson, David

2014-08-01

Picture archiving and communication systems (PACS) play a critical role in radiology. This paper presents the criteria important to PACS administrators for selecting a PACS. A set of criteria are identified and organized into an integrative hierarchical framework. Survey responses from 48 administrators are used to identify the relative weights of these criteria through an analytical hierarchy process. The five main dimensions for PACS selection in order of importance are system continuity and functionality, system performance and architecture, user interface for workflow management, user interface for image manipulation, and display quality. Among the subdimensions, the highest weights were assessed for security, backup, and continuity; tools for continuous performance monitoring; support for multispecialty images; and voice recognition/transcription. PACS administrators' preferences were generally in line with that of previously reported results for radiologists. Both groups assigned the highest priority to ensuring business continuity and preventing loss of data through features such as security, backup, downtime prevention, and tools for continuous PACS performance monitoring. PACS administrators' next high priorities were support for multispecialty images, image retrieval speeds from short-term and long-term storage, real-time monitoring, and architectural issues of compatibility and integration with other products. Thus, next to ensuring business continuity, administrators' focus was on issues that impact their ability to deliver services and support. On the other hand, radiologists gave high priorities to voice recognition, transcription, and reporting; structured reporting; and convenience and responsiveness in manipulation of images. Thus, radiologists' focus appears to be on issues that may impact their productivity, effort, and accuracy.
GMDH-Based Semi-Supervised Feature Selection for Electricity Load Classification Forecasting

Directory of Open Access Journals (Sweden)

Lintao Yang

2018-01-01

Full Text Available With the development of smart power grids, communication network technology and sensor technology, there has been an exponential growth in complex electricity load data. Irregular electricity load fluctuations caused by the weather and holiday factors disrupt the daily operation of the power companies. To deal with these challenges, this paper investigates a day-ahead electricity peak load interval forecasting problem. It transforms the conventional continuous forecasting problem into a novel interval forecasting problem, and then further converts the interval forecasting problem into the classification forecasting problem. In addition, an indicator system influencing the electricity load is established from three dimensions, namely the load series, calendar data, and weather data. A semi-supervised feature selection algorithm is proposed to address an electricity load classification forecasting issue based on the group method of data handling (GMDH technology. The proposed algorithm consists of three main stages: (1 training the basic classifier; (2 selectively marking the most suitable samples from the unclassified label data, and adding them to an initial training set; and (3 training the classification models on the final training set and classifying the test samples. An empirical analysis of electricity load dataset from four Chinese cities is conducted. Results show that the proposed model can address the electricity load classification forecasting problem more efficiently and effectively than the FW-Semi FS (forward semi-supervised feature selection and GMDH-U (GMDH-based semi-supervised feature selection for customer classification models.
Feature Selection of Network Intrusion Data using Genetic Algorithm and Particle Swarm Optimization

Directory of Open Access Journals (Sweden)

Iwan Syarif

2016-12-01

Full Text Available This paper describes the advantages of using Evolutionary Algorithms (EA for feature selection on network intrusion dataset. Most current Network Intrusion Detection Systems (NIDS are unable to detect intrusions in real time because of high dimensional data produced during daily operation. Extracting knowledge from huge data such as intrusion data requires new approach. The more complex the datasets, the higher computation time and the harder they are to be interpreted and analyzed. This paper investigates the performance of feature selection algoritms in network intrusiona data. We used Genetic Algorithms (GA and Particle Swarm Optimizations (PSO as feature selection algorithms. When applied to network intrusion datasets, both GA and PSO have significantly reduces the number of features. Our experiments show that GA successfully reduces the number of attributes from 41 to 15 while PSO reduces the number of attributes from 41 to 9. Using k Nearest Neighbour (k-NN as a classifier,the GA-reduced dataset which consists of 37% of original attributes, has accuracy improvement from 99.28% to 99.70% and its execution time is also 4.8 faster than the execution time of original dataset. Using the same classifier, PSO-reduced dataset which consists of 22% of original attributes, has the fastest execution time (7.2 times faster than the execution time of original datasets. However, its accuracy is slightly reduced 0.02% from 99.28% to 99.26%. Overall, both GA and PSO are good solution as feature selection techniques because theyhave shown very good performance in reducing the number of features significantly while still maintaining and sometimes improving the classification accuracy as well as reducing the computation time.
Feature Selection for Nonstationary Data: Application to Human Recognition Using Medical Biometrics.

Science.gov (United States)

Komeili, Majid; Louis, Wael; Armanfard, Narges; Hatzinakos, Dimitrios

2018-05-01

Electrocardiogram (ECG) and transient evoked otoacoustic emission (TEOAE) are among the physiological signals that have attracted significant interest in biometric community due to their inherent robustness to replay and falsification attacks. However, they are time-dependent signals and this makes them hard to deal with in across-session human recognition scenario where only one session is available for enrollment. This paper presents a novel feature selection method to address this issue. It is based on an auxiliary dataset with multiple sessions where it selects a subset of features that are more persistent across different sessions. It uses local information in terms of sample margins while enforcing an across-session measure. This makes it a perfect fit for aforementioned biometric recognition problem. Comprehensive experiments on ECG and TEOAE variability due to time lapse and body posture are done. Performance of the proposed method is compared against seven state-of-the-art feature selection algorithms as well as another six approaches in the area of ECG and TEOAE biometric recognition. Experimental results demonstrate that the proposed method performs noticeably better than other algorithms.
An Empirical Study of Wrappers for Feature Subset Selection based on a Parallel Genetic Algorithm: The Multi-Wrapper Model

KAUST Repository

Soufan, Othman

2012-09-01

Feature selection is the first task of any learning approach that is applied in major fields of biomedical, bioinformatics, robotics, natural language processing and social networking. In feature subset selection problem, a search methodology with a proper criterion seeks to find the best subset of features describing data (relevance) and achieving better performance (optimality). Wrapper approaches are feature selection methods which are wrapped around a classification algorithm and use a performance measure to select the best subset of features. We analyze the proper design of the objective function for the wrapper approach and highlight an objective based on several classification algorithms. We compare the wrapper approaches to different feature selection methods based on distance and information based criteria. Significant improvement in performance, computational time, and selection of minimally sized feature subsets is achieved by combining different objectives for the wrapper model. In addition, considering various classification methods in the feature selection process could lead to a global solution of desirable characteristics.
A Permutation Importance-Based Feature Selection Method for Short-Term Electricity Load Forecasting Using Random Forest

Directory of Open Access Journals (Sweden)

Nantian Huang

2016-09-01

Full Text Available The prediction accuracy of short-term load forecast (STLF depends on prediction model choice and feature selection result. In this paper, a novel random forest (RF-based feature selection method for STLF is proposed. First, 243 related features were extracted from historical load data and the time information of prediction points to form the original feature set. Subsequently, the original feature set was used to train an RF as the original model. After the training process, the prediction error of the original model on the test set was recorded and the permutation importance (PI value of each feature was obtained. Then, an improved sequential backward search method was used to select the optimal forecasting feature subset based on the PI value of each feature. Finally, the optimal forecasting feature subset was used to train a new RF model as the final prediction model. Experiments showed that the prediction accuracy of RF trained by the optimal forecasting feature subset was higher than that of the original model and comparative models based on support vector regression and artificial neural network.
A stereo remote sensing feature selection method based on artificial bee colony algorithm

Science.gov (United States)

Yan, Yiming; Liu, Pigang; Zhang, Ye; Su, Nan; Tian, Shu; Gao, Fengjiao; Shen, Yi

2014-05-01

To improve the efficiency of stereo information for remote sensing classification, a stereo remote sensing feature selection method is proposed in this paper presents, which is based on artificial bee colony algorithm. Remote sensing stereo information could be described by digital surface model (DSM) and optical image, which contain information of the three-dimensional structure and optical characteristics, respectively. Firstly, three-dimensional structure characteristic could be analyzed by 3D-Zernike descriptors (3DZD). However, different parameters of 3DZD could descript different complexity of three-dimensional structure, and it needs to be better optimized selected for various objects on the ground. Secondly, features for representing optical characteristic also need to be optimized. If not properly handled, when a stereo feature vector composed of 3DZD and image features, that would be a lot of redundant information, and the redundant information may not improve the classification accuracy, even cause adverse effects. To reduce information redundancy while maintaining or improving the classification accuracy, an optimized frame for this stereo feature selection problem is created, and artificial bee colony algorithm is introduced for solving this optimization problem. Experimental results show that the proposed method can effectively improve the computational efficiency, improve the classification accuracy.
Selective Integration in the Material-Point Method

DEFF Research Database (Denmark)

Andersen, Lars; Andersen, Søren; Damkilde, Lars

2009-01-01

The paper deals with stress integration in the material-point method. In order to avoid parasitic shear in bending, a formulation is proposed, based on selective integration in the background grid that is used to solve the governing equations. The suggested integration scheme is compared...... to a traditional material-point-method computation in which the stresses are evaluated at the material points. The deformation of a cantilever beam is analysed, assuming elastic or elastoplastic material behaviour....
Information Theory for Gabor Feature Selection for Face Recognition

Directory of Open Access Journals (Sweden)

Shen Linlin

2006-01-01

Full Text Available A discriminative and robust feature—kernel enhanced informative Gabor feature—is proposed in this paper for face recognition. Mutual information is applied to select a set of informative and nonredundant Gabor features, which are then further enhanced by kernel methods for recognition. Compared with one of the top performing methods in the 2004 Face Verification Competition (FVC2004, our methods demonstrate a clear advantage over existing methods in accuracy, computation efficiency, and memory cost. The proposed method has been fully tested on the FERET database using the FERET evaluation protocol. Significant improvements on three of the test data sets are observed. Compared with the classical Gabor wavelet-based approaches using a huge number of features, our method requires less than 4 milliseconds to retrieve a few hundreds of features. Due to the substantially reduced feature dimension, only 4 seconds are required to recognize 200 face images. The paper also unified different Gabor filter definitions and proposed a training sample generation algorithm to reduce the effects caused by unbalanced number of samples available in different classes.
Information Theory for Gabor Feature Selection for Face Recognition

Science.gov (United States)

Shen, Linlin; Bai, Li

2006-12-01

A discriminative and robust feature—kernel enhanced informative Gabor feature—is proposed in this paper for face recognition. Mutual information is applied to select a set of informative and nonredundant Gabor features, which are then further enhanced by kernel methods for recognition. Compared with one of the top performing methods in the 2004 Face Verification Competition (FVC2004), our methods demonstrate a clear advantage over existing methods in accuracy, computation efficiency, and memory cost. The proposed method has been fully tested on the FERET database using the FERET evaluation protocol. Significant improvements on three of the test data sets are observed. Compared with the classical Gabor wavelet-based approaches using a huge number of features, our method requires less than 4 milliseconds to retrieve a few hundreds of features. Due to the substantially reduced feature dimension, only 4 seconds are required to recognize 200 face images. The paper also unified different Gabor filter definitions and proposed a training sample generation algorithm to reduce the effects caused by unbalanced number of samples available in different classes.
Wavelet-Based Feature Extraction in Fault Diagnosis for Biquad High-Pass Filter Circuit

OpenAIRE

Yuehai Wang; Yongzheng Yan; Qinyong Wang

2016-01-01

Fault diagnosis for analog circuit has become a prominent factor in improving the reliability of integrated circuit due to its irreplaceability in modern integrated circuits. In fact fault diagnosis based on intelligent algorithms has become a popular research topic as efficient feature extraction and selection are a critical and intricate task in analog fault diagnosis. Further, it is extremely important to propose some general guidelines for the optimal feature extraction and selection. In ...
Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics.

Science.gov (United States)

Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai

2017-12-26

Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.
Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

Directory of Open Access Journals (Sweden)

Xiaohui Lin

2017-12-01

Full Text Available Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

Directory of Open Access Journals (Sweden)

Maolong Xi

2016-01-01

Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO for cancer feature gene selection, coupling support vector machine (SVM for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV. Finally, the BQPSO coupling SVM (BQPSO/SVM, binary PSO coupling SVM (BPSO/SVM, and genetic algorithm coupling SVM (GA/SVM are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

Science.gov (United States)

Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

2016-01-01

This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
An enhanced PSO-DEFS based feature selection with biometric authentication for identification of diabetic retinopathy

Directory of Open Access Journals (Sweden)

Umarani Balakrishnan

2016-11-01

Full Text Available Recently, automatic diagnosis of diabetic retinopathy (DR from the retinal image is the most significant research topic in the medical applications. Diabetic macular edema (DME is the major reason for the loss of vision in patients suffering from DR. Early identification of the DR enables to prevent the vision loss and encourage diabetic control activities. Many techniques are developed to diagnose the DR. The major drawbacks of the existing techniques are low accuracy and high time complexity. To overcome these issues, this paper proposes an enhanced particle swarm optimization-differential evolution feature selection (PSO-DEFS based feature selection approach with biometric authentication for the identification of DR. Initially, a hybrid median filter (HMF is used for pre-processing the input images. Then, the pre-processed images are embedded with each other by using least significant bit (LSB for authentication purpose. Simultaneously, the image features are extracted using convoluted local tetra pattern (CLTrP and Tamura features. Feature selection is performed using PSO-DEFS and PSO-gravitational search algorithm (PSO-GSA to reduce time complexity. Based on some performance metrics, the PSO-DEFS is chosen as a better choice for feature selection. The feature selection is performed based on the fitness value. A multi-relevance vector machine (M-RVM is introduced to classify the 13 normal and 62 abnormal images among 75 images from 60 patients. Finally, the DR patients are further classified by M-RVM. The experimental results exhibit that the proposed approach achieves better accuracy, sensitivity, and specificity than the existing techniques.
Comparative Study on Feature Selection and Fusion Schemes for Emotion Recognition from Speech

Directory of Open Access Journals (Sweden)

Santiago Planet

2012-09-01

Full Text Available The automatic analysis of speech to detect affective states may improve the way users interact with electronic devices. However, the analysis only at the acoustic level could be not enough to determine the emotion of a user in a realistic scenario. In this paper we analyzed the spontaneous speech recordings of the FAU Aibo Corpus at the acoustic and linguistic levels to extract two sets of features. The acoustic set was reduced by a greedy procedure selecting the most relevant features to optimize the learning stage. We compared two versions of this greedy selection algorithm by performing the search of the relevant features forwards and backwards. We experimented with three classification approaches: Naïve-Bayes, a support vector machine and a logistic model tree, and two fusion schemes: decision-level fusion, merging the hard-decisions of the acoustic and linguistic classifiers by means of a decision tree; and feature-level fusion, concatenating both sets of features before the learning stage. Despite the low performance achieved by the linguistic data, a dramatic improvement was achieved after its combination with the acoustic information, improving the results achieved by this second modality on its own. The results achieved by the classifiers using the parameters merged at feature level outperformed the classification results of the decision-level fusion scheme, despite the simplicity of the scheme. Moreover, the extremely reduced set of acoustic features obtained by the greedy forward search selection algorithm improved the results provided by the full set.
TEHRAN AIR POLLUTANTS PREDICTION BASED ON RANDOM FOREST FEATURE SELECTION METHOD

Directory of Open Access Journals (Sweden)

A. Shamsoddini

2017-09-01

Full Text Available Air pollution as one of the most serious forms of environmental pollutions poses huge threat to human life. Air pollution leads to environmental instability, and has harmful and undesirable effects on the environment. Modern prediction methods of the pollutant concentration are able to improve decision making and provide appropriate solutions. This study examines the performance of the Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Networks methods, in order to achieve an efficient model to estimate carbon monoxide and nitrogen dioxide, sulfur dioxide and PM2.5 contents in the air. The results indicated that Artificial Neural Networks fed by the attributes selected by Random Forest feature selection method performed more accurate than other models for the modeling of all pollutants. The estimation accuracy of sulfur dioxide emissions was lower than the other air contaminants whereas the nitrogen dioxide was predicted more accurate than the other pollutants.
Tehran Air Pollutants Prediction Based on Random Forest Feature Selection Method

Science.gov (United States)

Shamsoddini, A.; Aboodi, M. R.; Karami, J.

2017-09-01

Air pollution as one of the most serious forms of environmental pollutions poses huge threat to human life. Air pollution leads to environmental instability, and has harmful and undesirable effects on the environment. Modern prediction methods of the pollutant concentration are able to improve decision making and provide appropriate solutions. This study examines the performance of the Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Networks methods, in order to achieve an efficient model to estimate carbon monoxide and nitrogen dioxide, sulfur dioxide and PM2.5 contents in the air. The results indicated that Artificial Neural Networks fed by the attributes selected by Random Forest feature selection method performed more accurate than other models for the modeling of all pollutants. The estimation accuracy of sulfur dioxide emissions was lower than the other air contaminants whereas the nitrogen dioxide was predicted more accurate than the other pollutants.
A proposed framework on hybrid feature selection techniques for handling high dimensional educational data

Science.gov (United States)

Shahiri, Amirah Mohamed; Husain, Wahidah; Rashid, Nur'Aini Abd

2017-10-01

Huge amounts of data in educational datasets may cause the problem in producing quality data. Recently, data mining approach are increasingly used by educational data mining researchers for analyzing the data patterns. However, many research studies have concentrated on selecting suitable learning algorithms instead of performing feature selection process. As a result, these data has problem with computational complexity and spend longer computational time for classification. The main objective of this research is to provide an overview of feature selection techniques that have been used to analyze the most significant features. Then, this research will propose a framework to improve the quality of students' dataset. The proposed framework uses filter and wrapper based technique to support prediction process in future study.
Effective Feature Preprocessing for Time Series Forecasting

DEFF Research Database (Denmark)

Zhao, Junhua; Dong, Zhaoyang; Xu, Zhao

2006-01-01

Time series forecasting is an important area in data mining research. Feature preprocessing techniques have significant influence on forecasting accuracy, therefore are essential in a forecasting model. Although several feature preprocessing techniques have been applied in time series forecasting...... performance in time series forecasting. It is demonstrated in our experiment that, effective feature preprocessing can significantly enhance forecasting accuracy. This research can be a useful guidance for researchers on effectively selecting feature preprocessing techniques and integrating them with time...... series forecasting models....
An opinion formation based binary optimization approach for feature selection

Science.gov (United States)

Hamedmoghadam, Homayoun; Jalili, Mahdi; Yu, Xinghuo

2018-02-01

This paper proposed a novel optimization method based on opinion formation in complex network systems. The proposed optimization technique mimics human-human interaction mechanism based on a mathematical model derived from social sciences. Our method encodes a subset of selected features to the opinion of an artificial agent and simulates the opinion formation process among a population of agents to solve the feature selection problem. The agents interact using an underlying interaction network structure and get into consensus in their opinions, while finding better solutions to the problem. A number of mechanisms are employed to avoid getting trapped in local minima. We compare the performance of the proposed method with a number of classical population-based optimization methods and a state-of-the-art opinion formation based method. Our experiments on a number of high dimensional datasets reveal outperformance of the proposed algorithm over others.
Integrated shape and material selection for single and multi-performance criteria

International Nuclear Information System (INIS)

Singh, Jasveer; Mirjalili, Vahid; Pasini, Damiano

2011-01-01

Research highlights: → The method of shape transformers is extended to torsional stiffness and combined load design. → The method is generalized for multi-criteria selection of shape and material. → Performance charts are presented for single and multi-objective selection of cross-section shape and material. → A four quadrant performance chart is presented to visualize the relation between objective function space and design variable space. -- Abstract: A shape and material selection method, based on the concept of shape transformers, has been recently introduced to characterize the mass efficiency of lightweight beams under bending and shear. This paper extends this method to deal with the case of torsional stiffness design, and generalize it to single and multi-crieria selection of lightweight shafts subjected to a combination of bending, shear, and torsional load. The novel feature of the paper is the useful integration of shape and material to model and visualize multi-objective selection problems. The scheme is centered on concept selection in structural design, and hinges on measures that govern the shape properties of a cross-section regardless of its size. These measures, referred as shape transformers, can classify shapes in a way similar to material classification. The procedure is exemplified by considering torsional stiffness as a constraint. The performance charts are developed for single and multi-criteria to visualize in a glance the whole range of cross-sectional shapes for each material. Each design chart is explained with a brief example.
Trends and Features of Student Research Integration in Educational Program

Science.gov (United States)

Grinenko, Svetlana; Makarova, Elena; Andreassen, John-Erik

2016-01-01

This study examines trends and features of student research integration in educational program during international cooperation between Østfold University College in Norway and Southern Federal University in Russia. According to research and education approach the international project is aimed to use four education models, which linked student…
Data Visualization and Feature Selection Methods in Gel-based Proteomics

DEFF Research Database (Denmark)

Silva, Tomé Santos; Richard, Nadege; Dias, Jorge P.

2014-01-01

-based proteomics, summarizing the current state of research within this field. Particular focus is given on discussing the usefulness of available multivariate analysis tools both for data visualization and feature selection purposes. Visual examples are given using a real gel-based proteomic dataset as basis....
Feature selection and multi-kernel learning for adaptive graph regularized nonnegative matrix factorization

KAUST Repository

Wang, Jim Jing-Yan

2014-09-20

Nonnegative matrix factorization (NMF), a popular part-based representation technique, does not capture the intrinsic local geometric structure of the data space. Graph regularized NMF (GNMF) was recently proposed to avoid this limitation by regularizing NMF with a nearest neighbor graph constructed from the input data set. However, GNMF has two main bottlenecks. First, using the original feature space directly to construct the graph is not necessarily optimal because of the noisy and irrelevant features and nonlinear distributions of data samples. Second, one possible way to handle the nonlinear distribution of data samples is by kernel embedding. However, it is often difficult to choose the most suitable kernel. To solve these bottlenecks, we propose two novel graph-regularized NMF methods, AGNMFFS and AGNMFMK, by introducing feature selection and multiple-kernel learning to the graph regularized NMF, respectively. Instead of using a fixed graph as in GNMF, the two proposed methods learn the nearest neighbor graph that is adaptive to the selected features and learned multiple kernels, respectively. For each method, we propose a unified objective function to conduct feature selection/multi-kernel learning, NMF and adaptive graph regularization simultaneously. We further develop two iterative algorithms to solve the two optimization problems. Experimental results on two challenging pattern classification tasks demonstrate that the proposed methods significantly outperform state-of-the-art data representation methods.
A HYBRID FILTER AND WRAPPER FEATURE SELECTION APPROACH FOR DETECTING CONTAMINATION IN DRINKING WATER MANAGEMENT SYSTEM

Directory of Open Access Journals (Sweden)

S. VISALAKSHI

2017-07-01

Full Text Available Feature selection is an important task in predictive models which helps to identify the irrelevant features in the high - dimensional dataset. In this case of water contamination detection dataset, the standard wrapper algorithm alone cannot be applied because of the complexity. To overcome this computational complexity problem and making it lighter, filter-wrapper based algorithm has been proposed. In this work, reducing the feature space is a significant component of water contamination. The main findings are as follows: (1 The main goal is speeding up the feature selection process, so the proposed filter - based feature pre-selection is applied and guarantees that useful data are improbable to be detached in the initial stage which discussed briefly in this paper. (2 The resulting features are again filtered by using the Genetic Algorithm coded with Support Vector Machine method, where it facilitates to nutshell the subset of features with high accuracy and decreases the expense. Experimental results show that the proposed methods trim down redundant features effectively and achieved better classification accuracy.
Analysis of Different Feature Selection Criteria Based on a Covariance Convergence Perspective for a SLAM Algorithm

Science.gov (United States)

Auat Cheein, Fernando A.; Carelli, Ricardo

2011-01-01

This paper introduces several non-arbitrary feature selection techniques for a Simultaneous Localization and Mapping (SLAM) algorithm. The feature selection criteria are based on the determination of the most significant features from a SLAM convergence perspective. The SLAM algorithm implemented in this work is a sequential EKF (Extended Kalman filter) SLAM. The feature selection criteria are applied on the correction stage of the SLAM algorithm, restricting it to correct the SLAM algorithm with the most significant features. This restriction also causes a decrement in the processing time of the SLAM. Several experiments with a mobile robot are shown in this work. The experiments concern the map reconstruction and a comparison between the different proposed techniques performance. The experiments were carried out at an outdoor environment composed by trees, although the results shown herein are not restricted to a special type of features. PMID:22346568
Technical Evaluation Report 27: Educational Wikis: Features and selection criteria

Directory of Open Access Journals (Sweden)

Jim Rudolph

2004-04-01

Full Text Available This report discusses the educational uses of the ‘wiki,’ an increasingly popular approach to online community development. Wikis are defined and compared with ‘blogging’ methods; characteristics of major wiki engines are described; and wiki features and selection criteria are examined.
Notes on the evolution of feature selection methodology

Czech Academy of Sciences Publication Activity Database

Somol, Petr; Novovičová, Jana; Pudil, Pavel

2007-01-01

Roč. 43, č. 5 (2007), s. 713-730 ISSN 0023-5954 R&D Projects: GA ČR GA102/07/1594; GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : feature selection * branch and bound * sequential search * mixture model Subject RIV: IN - Informatics, Computer Science Impact factor: 0.552, year: 2007
Cancer microarray data feature selection using multi-objective binary particle swarm optimization algorithm

Science.gov (United States)

Annavarapu, Chandra Sekhara Rao; Dara, Suresh; Banka, Haider

2016-01-01

Cancer investigations in microarray data play a major role in cancer analysis and the treatment. Cancer microarray data consists of complex gene expressed patterns of cancer. In this article, a Multi-Objective Binary Particle Swarm Optimization (MOBPSO) algorithm is proposed for analyzing cancer gene expression data. Due to its high dimensionality, a fast heuristic based pre-processing technique is employed to reduce some of the crude domain features from the initial feature set. Since these pre-processed and reduced features are still high dimensional, the proposed MOBPSO algorithm is used for finding further feature subsets. The objective functions are suitably modeled by optimizing two conflicting objectives i.e., cardinality of feature subsets and distinctive capability of those selected subsets. As these two objective functions are conflicting in nature, they are more suitable for multi-objective modeling. The experiments are carried out on benchmark gene expression datasets, i.e., Colon, Lymphoma and Leukaemia available in literature. The performance of the selected feature subsets with their classification accuracy and validated using 10 fold cross validation techniques. A detailed comparative study is also made to show the betterment or competitiveness of the proposed algorithm. PMID:27822174
An integrated model for supplier selection process

Institute of Scientific and Technical Information of China (English)

无

2003-01-01

In today's highly competitive manufacturing environment, the supplier selection process becomes one of crucial activities in supply chain management. In order to select the best supplier(s) it is not only necessary to continuously tracking and benchmarking performance of suppliers but also to make a tradeoff between tangible and intangible factors some of which may conflict. In this paper an integration of case-based reasoning (CBR), analytical network process (ANP) and linear programming (LP) is proposed to solve the supplier selection problem.
Emotion of Physiological Signals Classification Based on TS Feature Selection

Institute of Scientific and Technical Information of China (English)

Wang Yujing; Mo Jianlin

2015-01-01

This paper propose a method of TS-MLP about emotion recognition of physiological signal.It can recognize emotion successfully by Tabu search which selects features of emotion’s physiological signals and multilayer perceptron that is used to classify emotion.Simulation shows that it has achieved good emotion classification performance.

Less is more: Avoiding the LIBS dimensionality curse through judicious feature selection for explosive detection

Science.gov (United States)

Kumar Myakalwar, Ashwin; Spegazzini, Nicolas; Zhang, Chi; Kumar Anubham, Siva; Dasari, Ramachandra R.; Barman, Ishan; Kumar Gundawar, Manoj

2015-01-01

Despite its intrinsic advantages, translation of laser induced breakdown spectroscopy for material identification has been often impeded by the lack of robustness of developed classification models, often due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts in establishing the subset of relevant spectral features that enable a fundamental interpretation of the segmentation capability and avoid the ‘curse of dimensionality’ have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometrics classifiers –based on feature selection through prerequisite knowledge of the sample composition and genetic algorithm, respectively. While the full spectral input results in classification rate of ca.92%, selection of only carbon to hydrogen spectral window results in near identical performance. Importantly, the genetic algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete filter based detector thereby enabling a cheaper and compact system more amenable to field operations. PMID:26286630
Multi-Stage Feature Selection by Using Genetic Algorithms for Fault Diagnosis in Gearboxes Based on Vibration Signal

Directory of Open Access Journals (Sweden)

Mariela Cerrada

2015-09-01

Full Text Available There are growing demands for condition-based monitoring of gearboxes, and techniques to improve the reliability, effectiveness and accuracy for fault diagnosis are considered valuable contributions. Feature selection is still an important aspect in machine learning-based diagnosis in order to reach good performance in the diagnosis system. The main aim of this research is to propose a multi-stage feature selection mechanism for selecting the best set of condition parameters on the time, frequency and time-frequency domains, which are extracted from vibration signals for fault diagnosis purposes in gearboxes. The selection is based on genetic algorithms, proposing in each stage a new subset of the best features regarding the classifier performance in a supervised environment. The selected features are augmented at each stage and used as input for a neural network classifier in the next step, while a new subset of feature candidates is treated by the selection process. As a result, the inherent exploration and exploitation of the genetic algorithms for finding the best solutions of the selection problem are locally focused. The Sensors 2015, 15 23904 approach is tested on a dataset from a real test bed with several fault classes under different running conditions of load and velocity. The model performance for diagnosis is over 98%.
Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

Directory of Open Access Journals (Sweden)

Nicole A Capela

Full Text Available Human activity recognition (HAR, using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter. The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree. Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.
Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

Science.gov (United States)

Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie

2015-01-01

Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.
Feature selection for neural network based defect classification of ceramic components using high frequency ultrasound.

Science.gov (United States)

Kesharaju, Manasa; Nagarajah, Romesh

2015-09-01

The motivation for this research stems from a need for providing a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make possible the checking of each ceramic component and immediately alert the operator about the presence of defects. Generally, in many classification problems a choice of features or dimensionality reduction is significant and simultaneously very difficult, as a substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms are used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially wavelet based feature extraction is implemented from the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to improved performance in terms of classification percentage with 96% than Genetic algorithm with 94%. Copyright © 2015 Elsevier B.V. All rights reserved.
SITE-94. Discrete-feature modelling of the Aespoe site: 2. Development of the integrated site-scale model

International Nuclear Information System (INIS)

Geier, J.E.

1996-12-01

A 3-dimensional, discrete-feature hydrological model is developed. The model integrates structural and hydrologic data for the Aespoe site, on scales ranging from semi regional fracture zones to individual fractures in the vicinity of the nuclear waste canisters. Hydrologic properties of the large-scale structures are initially estimated from cross-hole hydrologic test data, and automatically calibrated by numerical simulation of network flow, and comparison with undisturbed heads and observed drawdown in selected cross-hole tests. The calibrated model is combined with a separately derived fracture network model, to yield the integrated model. This model is partly validated by simulation of transient responses to a long-term pumping test and a convergent tracer test, based on the LPT2 experiment at Aespoe. The integrated model predicts that discharge from the SITE-94 repository is predominantly via fracture zones along the eastern shore of Aespoe. Similar discharge loci are produced by numerous model variants that explore uncertainty with regard to effective semi regional boundary conditions, hydrologic properties of the site-scale structures, and alternative structural/hydrological interpretations. 32 refs
SITE-94. Discrete-feature modelling of the Aespoe site: 2. Development of the integrated site-scale model

Energy Technology Data Exchange (ETDEWEB)

Geier, J.E. [Golder Associates AB, Uppsala (Sweden)

1996-12-01

A 3-dimensional, discrete-feature hydrological model is developed. The model integrates structural and hydrologic data for the Aespoe site, on scales ranging from semi regional fracture zones to individual fractures in the vicinity of the nuclear waste canisters. Hydrologic properties of the large-scale structures are initially estimated from cross-hole hydrologic test data, and automatically calibrated by numerical simulation of network flow, and comparison with undisturbed heads and observed drawdown in selected cross-hole tests. The calibrated model is combined with a separately derived fracture network model, to yield the integrated model. This model is partly validated by simulation of transient responses to a long-term pumping test and a convergent tracer test, based on the LPT2 experiment at Aespoe. The integrated model predicts that discharge from the SITE-94 repository is predominantly via fracture zones along the eastern shore of Aespoe. Similar discharge loci are produced by numerous model variants that explore uncertainty with regard to effective semi regional boundary conditions, hydrologic properties of the site-scale structures, and alternative structural/hydrological interpretations. 32 refs.
Oscillatory neuronal activity reflects lexical-semantic feature integration within and across sensory modalities in distributed cortical networks.

Science.gov (United States)

van Ackeren, Markus J; Schneider, Till R; Müsch, Kathrin; Rueschemeyer, Shirley-Ann

2014-10-22

Research from the previous decade suggests that word meaning is partially stored in distributed modality-specific cortical networks. However, little is known about the mechanisms by which semantic content from multiple modalities is integrated into a coherent multisensory representation. Therefore we aimed to characterize differences between integration of lexical-semantic information from a single modality compared with two sensory modalities. We used magnetoencephalography in humans to investigate changes in oscillatory neuronal activity while participants verified two features for a given target word (e.g., "bus"). Feature pairs consisted of either two features from the same modality (visual: "red," "big") or different modalities (auditory and visual: "red," "loud"). The results suggest that integrating modality-specific features of the target word is associated with enhanced high-frequency power (80-120 Hz), while integrating features from different modalities is associated with a sustained increase in low-frequency power (2-8 Hz). Source reconstruction revealed a peak in the anterior temporal lobe for low-frequency and high-frequency effects. These results suggest that integrating lexical-semantic knowledge at different cortical scales is reflected in frequency-specific oscillatory neuronal activity in unisensory and multisensory association networks. Copyright © 2014 the authors 0270-6474/14/3314318-06$15.00/0.
Personnel Selection Influences on Remotely Piloted Aircraft Human-System Integration.

Science.gov (United States)

Carretta, Thomas R; King, Raymond E

2015-08-01

Human-system integration (HSI) is a complex process used to design and develop systems that integrate human capabilities and limitations in an effective and affordable manner. Effective HSI incorporates several domains, including manpower, personnel and training, human factors, environment, safety, occupational health, habitability, survivability, logistics, intelligence, mobility, and command and control. To achieve effective HSI, the relationships among these domains must be considered. Although this integrated approach is well documented, there are many instances where it is not followed. Human factors engineers typically focus on system design with little attention to the skills, abilities, and other characteristics needed by human operators. When problems with fielded systems occur, additional training of personnel is developed and conducted. Personnel selection is seldom considered during the HSI process. Complex systems such as aviation require careful selection of the individuals who will interact with the system. Personnel selection is a two-stage process involving select-in and select-out procedures. Select-in procedures determine which candidates have the aptitude to profit from training and represent the best investment. Select-out procedures focus on medical qualification and determine who should not enter training for medical reasons. The current paper discusses the role of personnel selection in the HSI process in the context of remotely piloted aircraft systems.
The co-feature ratio, a novel method for the measurement of chromatographic and signal selectivity in LC-MS-based metabolomics

International Nuclear Information System (INIS)

Elmsjö, Albert; Haglöf, Jakob; Engskog, Mikael K.R.; Nestor, Marika; Arvidsson, Torbjörn; Pettersson, Curt

2017-01-01

Evaluation of analytical procedures, especially in regards to measuring chromatographic and signal selectivity, is highly challenging in untargeted metabolomics. The aim of this study was to suggest a new straightforward approach for a systematic examination of chromatographic and signal selectivity in LC-MS-based metabolomics. By calculating the ratio between each feature and its co-eluting features (the co-features), a measurement of the chromatographic selectivity (i.e. extent of co-elution) as well as the signal selectivity (e.g. amount of adduct formation) of each feature could be acquired, the co-feature ratio. This approach was used to examine possible differences in chromatographic and signal selectivity present in samples exposed to three different sample preparation procedures. The capability of the co-feature ratio was evaluated both in a classical targeted setting using isotope labelled standards as well as without standards in an untargeted setting. For the targeted analysis, several metabolites showed a skewed quantitative signal due to poor chromatographic selectivity and/or poor signal selectivity. Moreover, evaluation of the untargeted approach through multivariate analysis of the co-feature ratios demonstrated the possibility to screen for metabolites displaying poor chromatographic and/or signal selectivity characteristics. We conclude that the co-feature ratio can be a useful tool in the development and evaluation of analytical procedures in LC-MS-based metabolomics investigations. Increased selectivity through proper choice of analytical procedures may decrease the false positive and false negative discovery rate and thereby increase the validity of any metabolomic investigation. - Highlights: • The co-feature ratio (CFR) is introduced. • CFR measures chromatographic and signal selectivity of a feature. • CFR can be used for evaluating experimental procedures in metabolomics. • CFR can aid in locating features with poor selectivity. �
The co-feature ratio, a novel method for the measurement of chromatographic and signal selectivity in LC-MS-based metabolomics

Energy Technology Data Exchange (ETDEWEB)

Elmsjö, Albert, E-mail: Albert.Elmsjo@farmkemi.uu.se [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden); Haglöf, Jakob; Engskog, Mikael K.R. [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden); Nestor, Marika [Department of Immunology, Genetics and Pathology, Uppsala University (Sweden); Arvidsson, Torbjörn [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden); Medical Product Agency, Uppsala (Sweden); Pettersson, Curt [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden)

2017-03-01

Evaluation of analytical procedures, especially in regards to measuring chromatographic and signal selectivity, is highly challenging in untargeted metabolomics. The aim of this study was to suggest a new straightforward approach for a systematic examination of chromatographic and signal selectivity in LC-MS-based metabolomics. By calculating the ratio between each feature and its co-eluting features (the co-features), a measurement of the chromatographic selectivity (i.e. extent of co-elution) as well as the signal selectivity (e.g. amount of adduct formation) of each feature could be acquired, the co-feature ratio. This approach was used to examine possible differences in chromatographic and signal selectivity present in samples exposed to three different sample preparation procedures. The capability of the co-feature ratio was evaluated both in a classical targeted setting using isotope labelled standards as well as without standards in an untargeted setting. For the targeted analysis, several metabolites showed a skewed quantitative signal due to poor chromatographic selectivity and/or poor signal selectivity. Moreover, evaluation of the untargeted approach through multivariate analysis of the co-feature ratios demonstrated the possibility to screen for metabolites displaying poor chromatographic and/or signal selectivity characteristics. We conclude that the co-feature ratio can be a useful tool in the development and evaluation of analytical procedures in LC-MS-based metabolomics investigations. Increased selectivity through proper choice of analytical procedures may decrease the false positive and false negative discovery rate and thereby increase the validity of any metabolomic investigation. - Highlights: • The co-feature ratio (CFR) is introduced. • CFR measures chromatographic and signal selectivity of a feature. • CFR can be used for evaluating experimental procedures in metabolomics. • CFR can aid in locating features with poor selectivity. �
A consistency-based feature selection method allied with linear SVMs for HIV-1 protease cleavage site prediction.

Directory of Open Access Journals (Sweden)

Orkun Oztürk

Full Text Available BACKGROUND: Predicting type-1 Human Immunodeficiency Virus (HIV-1 protease cleavage site in protein molecules and determining its specificity is an important task which has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug design (especially for HIV-1 protease inhibitors against this life-threatening virus. However, some drawbacks (like the shortage of the available training data and the high dimensionality of the feature space turn this task into a difficult classification problem. Thus, various machine learning techniques, and specifically several classification methods have been proposed in order to increase the accuracy of the classification model. In addition, for several classification problems, which are characterized by having few samples and many features, selecting the most relevant features is a major factor for increasing classification accuracy. RESULTS: We propose for HIV-1 data a consistency-based feature selection approach in conjunction with recursive feature elimination of support vector machines (SVMs. We used various classifiers for evaluating the results obtained from the feature selection process. We further demonstrated the effectiveness of our proposed method by comparing it with a state-of-the-art feature selection method applied on HIV-1 data, and we evaluated the reported results based on attributes which have been selected from different combinations. CONCLUSION: Applying feature selection on training data before realizing the classification task seems to be a reasonable data-mining process when working with types of data similar to HIV-1. On HIV-1 data, some feature selection or extraction operations in conjunction with different classifiers have been tested and noteworthy outcomes have been reported. These facts motivate for the work presented in this paper. SOFTWARE AVAILABILITY: The software is available at http
Feature Selection and Kernel Learning for Local Learning-Based Clustering.

Science.gov (United States)

Zeng, Hong; Cheung, Yiu-ming

2011-08-01

The performance of the most clustering algorithms highly relies on the representation of data in the input space or the Hilbert space of kernel methods. This paper is to obtain an appropriate data representation through feature selection or kernel learning within the framework of the Local Learning-Based Clustering (LLC) (Wu and Schölkopf 2006) method, which can outperform the global learning-based ones when dealing with the high-dimensional data lying on manifold. Specifically, we associate a weight to each feature or kernel and incorporate it into the built-in regularization of the LLC algorithm to take into account the relevance of each feature or kernel for the clustering. Accordingly, the weights are estimated iteratively in the clustering process. We show that the resulting weighted regularization with an additional constraint on the weights is equivalent to a known sparse-promoting penalty. Hence, the weights of those irrelevant features or kernels can be shrunk toward zero. Extensive experiments show the efficacy of the proposed methods on the benchmark data sets.
Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests.

Science.gov (United States)

Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A

2017-09-15

Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations p≫n , these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. Code
Brain activity patterns uniquely supporting visual feature integration after traumatic brain injury

Directory of Open Access Journals (Sweden)

Anjali eRaja Beharelle

2011-12-01

Full Text Available Traumatic brain injury (TBI patients typically respond more slowly and with more variability than controls during tasks of attention requiring speeded reaction time. These behavioral changes are attributable, at least in part, to diffuse axonal injury (DAI, which affects integrated processing in distributed systems. Here we use a multivariate method sensitive to distributed neural activity to compare brain activity patterns of patients with chronic phase moderate-to-severe TBI to those of controls during performance on a visual feature-integration task assessing complex attentional processes that has previously shown sensitivity to TBI. The TBI patients were carefully screened to be free of large focal lesions that can affect performance and brain activation independently of DAI. The task required subjects to hold either one or three features of a target in mind while suppressing responses to distracting information. In controls, the multi-feature condition activated a distributed network including limbic, prefrontal, and medial temporal structures. TBI patients engaged this same network in the single-feature and baseline conditions. In multi-feature presentations, TBI patients alone activated additional frontal, parietal, and occipital regions. These results are consistent with neuroimaging studies using tasks assessing different cognitive domains, where increased spread of brain activity changes was associated with TBI. Our results also extend previous findings that brain activity for relatively moderate task demands in TBI patients is similar to that associated with of high task demands in controls.
Benefit salience and consumers' selective attention to product features

OpenAIRE

Ratneshwar, S; Warlop, Luk; Mick, DG; Seeger, G

1997-01-01

Although attention is a key construct in models of marketing communication and consumer choice, its selective nature has rarely been examined in common time-pressured conditions. We focus on the role of benefit salience, that is, the readiness with which particular benefits are brought to mind by consumers in relation to a given product category. Study I demonstrated that when product feature information was presented rapidly, individuals for whom the benefit of personalised customer service ...
Conditional Mutual Information Based Feature Selection for Classification Task

Czech Academy of Sciences Publication Activity Database

Novovičová, Jana; Somol, Petr; Haindl, Michal; Pudil, Pavel

2007-01-01

Roč. 45, č. 4756 (2007), s. 417-426 ISSN 0302-9743 R&D Projects: GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Pattern classification * feature selection * conditional mutual information * text categorization Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.402, year: 2005
A full feature FASTBUS slave interface using semicustom integrated circuits

International Nuclear Information System (INIS)

Skegg, R.; Daviel, A.; Downing, R.

1986-01-01

Two semi-custom integrated circuits have been designed and manufactured which enable the construction of a full featured FASTBUS slave interface without the need for a detailed knowledge of the FASTBUS protocol. A relatively small amount of board space is required compared to implementations using conventional circuits. The semi-custom devices are described in detail, and an application example is given. (orig.)
A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification

Science.gov (United States)

Wang, Suge; Li, Deyu; Wei, Yingjie; Li, Hongxia

With the rapid growth of e-commerce, product reviews on the Web have become an important information source for customers' decision making when they intend to buy some product. As the reviews are often too many for customers to go through, how to automatically classify them into different sentiment orientation categories (i.e. positive/negative) has become a research problem. In this paper, based on Fisher's discriminant ratio, an effective feature selection method is proposed for product review text sentiment classification. In order to validate the validity of the proposed method, we compared it with other methods respectively based on information gain and mutual information while support vector machine is adopted as the classifier. In this paper, 6 subexperiments are conducted by combining different feature selection methods with 2 kinds of candidate feature sets. Under 1006 review documents of cars, the experimental results indicate that the Fisher's discriminant ratio based on word frequency estimation has the best performance with F value 83.3% while the candidate features are the words which appear in both positive and negative texts.
Mutual Information Based Dynamic Integration of Multiple Feature Streams for Robust Real-Time LVCSR

Science.gov (United States)

Sato, Shoei; Kobayashi, Akio; Onoe, Kazuo; Homma, Shinichi; Imai, Toru; Takagi, Tohru; Kobayashi, Tetsunori

We present a novel method of integrating the likelihoods of multiple feature streams, representing different acoustic aspects, for robust speech recognition. The integration algorithm dynamically calculates a frame-wise stream weight so that a higher weight is given to a stream that is robust to a variety of noisy environments or speaking styles. Such a robust stream is expected to show discriminative ability. A conventional method proposed for the recognition of spoken digits calculates the weights front the entropy of the whole set of HMM states. This paper extends the dynamic weighting to a real-time large-vocabulary continuous speech recognition (LVCSR) system. The proposed weight is calculated in real-time from mutual information between an input stream and active HMM states in a searchs pace without an additional likelihood calculation. Furthermore, the mutual information takes the width of the search space into account by calculating the marginal entropy from the number of active states. In this paper, we integrate three features that are extracted through auditory filters by taking into account the human auditory system's ability to extract amplitude and frequency modulations. Due to this, features representing energy, amplitude drift, and resonant frequency drifts, are integrated. These features are expected to provide complementary clues for speech recognition. Speech recognition experiments on field reports and spontaneous commentary from Japanese broadcast news showed that the proposed method reduced error words by 9.2% in field reports and 4.7% in spontaneous commentaries relative to the best result obtained from a single stream.

Feature selection based on SVM significance maps for classification of dementia

NARCIS (Netherlands)

E.E. Bron (Esther); M. Smits (Marion); J.C. van Swieten (John); W.J. Niessen (Wiro); S. Klein (Stefan)

2014-01-01

textabstractSupport vector machine significance maps (SVM p-maps) previously showed clusters of significantly different voxels in dementiarelated brain regions. We propose a novel feature selection method for classification of dementia based on these p-maps. In our approach, the SVM p-maps are
Image Retrieval based on Integration between Color and Geometric Moment Features

International Nuclear Information System (INIS)

Saad, M.H.; Saleh, H.I.; Konbor, H.; Ashour, M.

2012-01-01

Content based image retrieval is the retrieval of images based on visual features such as colour, texture and shape. .the Current approaches to CBIR differ in terms of which image features are extracted; recent work deals with combination of distances or scores from different and usually independent representations in an attempt to induce high level semantics from the low level descriptors of the images. content-based image retrieval has many application areas such as, education, commerce, military, searching, commerce, and biomedicine and Web image classification. This paper proposes a new image retrieval system, which uses color and geometric moment feature to form the feature vectors. Bhattacharyya distance and histogram intersection are used to perform feature matching. This framework integrates the color histogram which represents the global feature and geometric moment as local descriptor to enhance the retrieval results. The proposed technique is proper for precisely retrieving images even in deformation cases such as geometric deformations and noise. It is tested on a standard the results shows that a combination of our approach as a local image descriptor with other global descriptors outperforms other approaches.
Feature-selective Attention in Frontoparietal Cortex: Multivoxel Codes Adjust to Prioritize Task-relevant Information.

Science.gov (United States)

Jackson, Jade; Rich, Anina N; Williams, Mark A; Woolgar, Alexandra

2017-02-01

Human cognition is characterized by astounding flexibility, enabling us to select appropriate information according to the objectives of our current task. A circuit of frontal and parietal brain regions, often referred to as the frontoparietal attention network or multiple-demand (MD) regions, are believed to play a fundamental role in this flexibility. There is evidence that these regions dynamically adjust their responses to selectively process information that is currently relevant for behavior, as proposed by the "adaptive coding hypothesis" [Duncan, J. An adaptive coding model of neural function in prefrontal cortex. Nature Reviews Neuroscience, 2, 820-829, 2001]. Could this provide a neural mechanism for feature-selective attention, the process by which we preferentially process one feature of a stimulus over another? We used multivariate pattern analysis of fMRI data during a perceptually challenging categorization task to investigate whether the representation of visual object features in the MD regions flexibly adjusts according to task relevance. Participants were trained to categorize visually similar novel objects along two orthogonal stimulus dimensions (length/orientation) and performed short alternating blocks in which only one of these dimensions was relevant. We found that multivoxel patterns of activation in the MD regions encoded the task-relevant distinctions more strongly than the task-irrelevant distinctions: The MD regions discriminated between stimuli of different lengths when length was relevant and between the same objects according to orientation when orientation was relevant. The data suggest a flexible neural system that adjusts its representation of visual objects to preferentially encode stimulus features that are currently relevant for behavior, providing a neural mechanism for feature-selective attention.
Fuzzy Mutual Information Based min-Redundancy and Max-Relevance Heterogeneous Feature Selection

Directory of Open Access Journals (Sweden)

Daren Yu

2011-08-01

Full Text Available Feature selection is an important preprocessing step in pattern classification and machine learning, and mutual information is widely used to measure relevance between features and decision. However, it is difficult to directly calculate relevance between continuous or fuzzy features using mutual information. In this paper we introduce the fuzzy information entropy and fuzzy mutual information for computing relevance between numerical or fuzzy features and decision. The relationship between fuzzy information entropy and differential entropy is also discussed. Moreover, we combine fuzzy mutual information with qmin-Redundancy-Max-Relevanceq, qMax-Dependencyq and min-Redundancy-Max-Dependencyq algorithms. The performance and stability of the proposed algorithms are tested on benchmark data sets. Experimental results show the proposed algorithms are effective and stable.
Microdevelopment of Complex Featural and Spatial Integration with Contextual Support

Directory of Open Access Journals (Sweden)

Pamela L. Hirsch

2015-01-01

Full Text Available Complex spatial decisions involve the ability to combine featural and spatial information in a scene. In the present work, 4- through 9-year-old children completed a complex map-scene correspondence task under baseline and supported conditions. Children compared a photographed scene with a correct map and with map-foils that made salient an object feature or spatial property. Map-scene matches were analyzed for the effects of age and featural-spatial information on children’s selections. In both conditions children significantly favored maps that highlighted object detail and object perspective rather than color, landmark, and metric elements. Children’s correct performance did not differ by age and was suboptimal, but their ability to choose correct maps improved significantly when contextual support was provided. Strategy variability was prominent for all age groups, but at age 9 with support children were more likely to give up their focus on features and transition to the use of spatial strategies. These findings suggest the possibility of a U-shaped curve for children’s development of geometric knowledge: geometric coding is predominant early on, diminishes for a time in middle childhood in favor of a preference for features, and then reemerges along with the more advanced abilities to combine featural and spatial information.
Visual feature integration indicated by pHase-locked frontal-parietal EEG signals.

Science.gov (United States)

Phillips, Steven; Takeda, Yuji; Singh, Archana

2012-01-01

The capacity to integrate multiple sources of information is a prerequisite for complex cognitive ability, such as finding a target uniquely identifiable by the conjunction of two or more features. Recent studies identified greater frontal-parietal synchrony during conjunctive than non-conjunctive (feature) search. Whether this difference also reflects greater information integration, rather than just differences in cognitive strategy (e.g., top-down versus bottom-up control of attention), or task difficulty is uncertain. Here, we examine the first possibility by parametrically varying the number of integrated sources from one to three and measuring phase-locking values (PLV) of frontal-parietal EEG electrode signals, as indicators of synchrony. Linear regressions, under hierarchical false-discovery rate control, indicated significant positive slopes for number of sources on PLV in the 30-38 Hz, 175-250 ms post-stimulus frequency-time band for pairs in the sagittal plane (i.e., F3-P3, Fz-Pz, F4-P4), after equating conditions for behavioural performance (to exclude effects due to task difficulty). No such effects were observed for pairs in the transverse plane (i.e., F3-F4, C3-C4, P3-P4). These results provide support for the idea that anterior-posterior phase-locking in the lower gamma-band mediates integration of visual information. They also provide a potential window into cognitive development, seen as developing the capacity to integrate more sources of information.
Features of selection of children for occupations by artistic gymnastics in modern Kurdistan

OpenAIRE

Abdulvahid Dlshad Nihad

2015-01-01

Purpose: to study the organizational and pedagogical conditions of selection of children for occupations existing in the republic Kurdistan artistic gymnastics Material and Methods: questioning of 24 trainers on artistic gymnastics and experts in physical culture of the republic Kurdistan was carried out. The general questions of selection and methodical features of selection of children for occupations by artistic gymnastics in Kurdistan were studied. Results: questioning revealed absence of...
Orthogonal feature selection method. [For preprocessing of man spectral data

Energy Technology Data Exchange (ETDEWEB)

Kowalski, B R [Univ. of Washington, Seattle; Bender, C F

1976-01-01

A new method of preprocessing spectral data for extraction of molecular structural information is desired. This SELECT method generates orthogonal features that are important for classification purposes and that also retain their identity to the original measurements. A brief introduction to chemical pattern recognition is presented. A brief description of the method and an application to mass spectral data analysis follow. (BLM)
Optimal Feature Space Selection in Detecting Epileptic Seizure based on Recurrent Quantification Analysis and Genetic Algorithm

Directory of Open Access Journals (Sweden)

Saleh LAshkari

2016-06-01

Full Text Available Selecting optimal features based on nature of the phenomenon and high discriminant ability is very important in the data classification problems. Since it doesn't require any assumption about stationary condition and size of the signal and the noise in Recurrent Quantification Analysis (RQA, it may be useful for epileptic seizure Detection. In this study, RQA was used to discriminate ictal EEG from the normal EEG where optimal features selected by combination of algorithm genetic and Bayesian Classifier. Recurrence plots of hundred samples in each two categories were obtained with five distance norms in this study: Euclidean, Maximum, Minimum, Normalized and Fixed Norm. In order to choose optimal threshold for each norm, ten threshold of ε was generated and then the best feature space was selected by genetic algorithm in combination with a bayesian classifier. The results shown that proposed method is capable of discriminating the ictal EEG from the normal EEG where for Minimum norm and 0.1˂ε˂1, accuracy was 100%. In addition, the sensitivity of proposed framework to the ε and the distance norm parameters was low. The optimal feature presented in this study is Trans which it was selected in most feature spaces with high accuracy.
YamiPred: A novel evolutionary method for predicting pre-miRNAs and selecting relevant features

KAUST Repository

Kleftogiannis, Dimitrios A.; Theofilatos, Konstantinos; Likothanassis, Spiros; Mavroudi, Seferina

2015-01-01

MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of Support Vector Machines (SVM) with Genetic Algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.
YamiPred: A novel evolutionary method for predicting pre-miRNAs and selecting relevant features

KAUST Repository

Kleftogiannis, Dimitrios A.

2015-01-23

MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of Support Vector Machines (SVM) with Genetic Algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.
Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

Science.gov (United States)

Al-Rajab, Murad; Lu, Joan; Xu, Qiang

2017-07-01

This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.
Low cost, small form factor, and integration as the key features for the optical component industry takeoff

Science.gov (United States)

Schiattone, Francesco; Bonino, Stefano; Gobbi, Luigi; Groppi, Angelamaria; Marazzi, Marco; Musio, Maurizio

2003-04-01

In the past the optical component market has been mainly driven by performances. Today, as the number of competitors has drastically increased, the system integrators have a wide range of possible suppliers and solutions giving them the possibility to be more focused on cost and also on footprint reduction. So, if performances are still essential, low cost and Small Form Factor issues are becoming more and more crucial in selecting components. Another evolution in the market is the current request of the optical system companies to simplify the supply chain in order to reduce the assembling and testing steps at system level. This corresponds to a growing demand in providing subassemblies, modules or hybrid integrated components: that means also Integration will be an issue in which all the optical component companies will compete to gain market shares. As we can see looking several examples offered by electronic market, to combine low cost and SFF is a very challenging task but Integration can help in achieving both features. In this work we present how these issues could be approached giving examples of some advanced solutions applied to LiNbO3 modulators. In particular we describe the progress made on automation, new materials and low cost fabrication methods for the parts. We also introduce an approach in integrating optical and electrical functionality on LiNbO3 modulators including RF driver, bias control loop, attenuator and photodiode integrated in a single device.
Human activity recognition based on feature selection in smart home using back-propagation algorithm.

Science.gov (United States)

Fang, Hongqing; He, Lei; Si, Hao; Liu, Peng; Xie, Xiaolei

2014-09-01

In this paper, Back-propagation(BP) algorithm has been used to train the feed forward neural network for human activity recognition in smart home environments, and inter-class distance method for feature selection of observed motion sensor events is discussed and tested. And then, the human activity recognition performances of neural network using BP algorithm have been evaluated and compared with other probabilistic algorithms: Naïve Bayes(NB) classifier and Hidden Markov Model(HMM). The results show that different feature datasets yield different activity recognition accuracy. The selection of unsuitable feature datasets increases the computational complexity and degrades the activity recognition accuracy. Furthermore, neural network using BP algorithm has relatively better human activity recognition performances than NB classifier and HMM. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Automating an integrated spatial data-mining model for landfill site selection

Science.gov (United States)

Abujayyab, Sohaib K. M.; Ahamad, Mohd Sanusi S.; Yahya, Ahmad Shukri; Ahmad, Siti Zubaidah; Aziz, Hamidi Abdul

2017-10-01

An integrated programming environment represents a robust approach to building a valid model for landfill site selection. One of the main challenges in the integrated model is the complicated processing and modelling due to the programming stages and several limitations. An automation process helps avoid the limitations and improve the interoperability between integrated programming environments. This work targets the automation of a spatial data-mining model for landfill site selection by integrating between spatial programming environment (Python-ArcGIS) and non-spatial environment (MATLAB). The model was constructed using neural networks and is divided into nine stages distributed between Matlab and Python-ArcGIS. A case study was taken from the north part of Peninsular Malaysia. 22 criteria were selected to utilise as input data and to build the training and testing datasets. The outcomes show a high-performance accuracy percentage of 98.2% in the testing dataset using 10-fold cross validation. The automated spatial data mining model provides a solid platform for decision makers to performing landfill site selection and planning operations on a regional scale.
Electricity market price spike analysis by a hybrid data model and feature selection technique

International Nuclear Information System (INIS)

Amjady, Nima; Keynia, Farshid

2010-01-01

In a competitive electricity market, energy price forecasting is an important activity for both suppliers and consumers. For this reason, many techniques have been proposed to predict electricity market prices in the recent years. However, electricity price is a complex volatile signal owning many spikes. Most of electricity price forecast techniques focus on the normal price prediction, while price spike forecast is a different and more complex prediction process. Price spike forecasting has two main aspects: prediction of price spike occurrence and value. In this paper, a novel technique for price spike occurrence prediction is presented composed of a new hybrid data model, a novel feature selection technique and an efficient forecast engine. The hybrid data model includes both wavelet and time domain variables as well as calendar indicators, comprising a large candidate input set. The set is refined by the proposed feature selection technique evaluating both relevancy and redundancy of the candidate inputs. The forecast engine is a probabilistic neural network, which are fed by the selected candidate inputs of the feature selection technique and predict price spike occurrence. The efficiency of the whole proposed method for price spike occurrence forecasting is evaluated by means of real data from the Queensland and PJM electricity markets. (author)
Dataset of statements on policy integration of selected intergovernmental organizations

Directory of Open Access Journals (Sweden)

Jale Tosun

2018-04-01

Full Text Available This article describes data for 78 intergovernmental organizations (IGOs working on topics related to energy governance, environmental protection, and the economy. The number of IGOs covered also includes organizations active in other sectors. The point of departure for data construction was the Correlates of War dataset, from which we selected this sample of IGOs. We updated and expanded the empirical information on the IGOs selected by manual coding. Most importantly, we collected the primary law texts of the individual IGOs in order to code whether they commit themselves to environmental policy integration (EPI, climate policy integration (CPI and/or energy policy integration (EnPI.
Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

Science.gov (United States)

2012-01-01

Background Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. On this dataset, we also show that greedy RLS has a better classification performance on independent
Mining method selection by integrated AHP and PROMETHEE method.

Science.gov (United States)

Bogdanovic, Dejan; Nikolic, Djordje; Ilic, Ivana

2012-03-01

Selecting the best mining method among many alternatives is a multicriteria decision making problem. The aim of this paper is to demonstrate the implementation of an integrated approach that employs AHP and PROMETHEE together for selecting the most suitable mining method for the "Coka Marin" underground mine in Serbia. The related problem includes five possible mining methods and eleven criteria to evaluate them. Criteria are accurately chosen in order to cover the most important parameters that impact on the mining method selection, such as geological and geotechnical properties, economic parameters and geographical factors. The AHP is used to analyze the structure of the mining method selection problem and to determine weights of the criteria, and PROMETHEE method is used to obtain the final ranking and to make a sensitivity analysis by changing the weights. The results have shown that the proposed integrated method can be successfully used in solving mining engineering problems.
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.

Science.gov (United States)

Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens

2016-01-01

MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.

Investigation of Selected Surface Integrity Features of Duplex Stainless Steel (DSS) after Turning

Czech Academy of Sciences Publication Activity Database

Krolczyk, G.; Nieslony, P.; Legutko, S.; Hloch, Sergej; Samardžić, I.

2015-01-01

Roč. 54, č. 1 (2015), s. 91-94 ISSN 0543-5846 Institutional support: RVO:68145535 Keywords : duplex stainless steel * machining * turning * surface integrity * surface roughness Subject RIV: JQ - Machines ; Tools Impact factor: 0.959, year: 2014 http://hrcak.srce.hr/126702
Integration of School Features into Taiwanese Elementary School New English Curriculum

Science.gov (United States)

Chien, Chin-Wen

2014-01-01

Elementary school English activation curriculum, an additional two culture classes, has been implemented only in New Taipei City in Taiwan starting from 2010, so only a few studies focus on it. This is a case study of an English teacher's integration of a school's features into the activation curriculum in a rural elementary school. This study…
Selecting Informative Features of the Helicopter and Aircraft Acoustic Signals in the Adaptive Autonomous Information Systems for Recognition

Directory of Open Access Journals (Sweden)

V. K. Hohlov

2017-01-01

Full Text Available The article forms the rationale for selecting the informative features of the helicopter and aircraft acoustic signals to solve a problem of their recognition and shows that the most informative ones are the counts of extrema in the energy spectra of the input signals, which represent non-centered random variables. An apparatus of the multiple initial regression coefficients was selected as a mathematical tool of research. The application of digital re-circulators with positive and negative feedbacks, which have the comb-like frequency characteristics, solves the problem of selecting informative features. A distinguishing feature of such an approach is easy agility of the comb frequency characteristics just through the agility of a delay value of digital signal in the feedback circuit. Adding an adaptation block to the selection block of the informative features enables us to ensure the invariance of used informative feature and counts of local extrema of the spectral power density to the airspeed of a helicopter. The paper gives reasons for the principle of adaptation and the structure of the adaptation block. To form the discriminator characteristics are used the cross-correlation statistical characteristics of the quadrature components of acoustic signal realizations, obtained by Hilbert transform. The paper proposes to provide signal recognition using a regression algorithm that allows handling the non-centered informative features and using a priori information about coefficients of initial regression of signal and noise.The simulation in Matlab Simulink has shown that selected informative features of signals in regressive processing of signal realizations of 0.5 s duration have good separability, and based on a set of 100 acoustic signal realizations in each class in full-scale conditions, has proved ensuring a correct recognition probability of 0.975, at least. The considered principles of informative features selection and adaptation can
Inference for feature selection using the Lasso with high-dimensional data

DEFF Research Database (Denmark)

Brink-Jensen, Kasper; Ekstrøm, Claus Thorn

2014-01-01

Penalized regression models such as the Lasso have proved useful for variable selection in many fields - especially for situations with high-dimensional data where the numbers of predictors far exceeds the number of observations. These methods identify and rank variables of importance but do...... not generally provide any inference of the selected variables. Thus, the variables selected might be the "most important" but need not be significant. We propose a significance test for the selection found by the Lasso. We introduce a procedure that computes inference and p-values for features chosen...... by the Lasso. This method rephrases the null hypothesis and uses a randomization approach which ensures that the error rate is controlled even for small samples. We demonstrate the ability of the algorithm to compute $p$-values of the expected magnitude with simulated data using a multitude of scenarios...
Advances in feature selection methods for hyperspectral image processing in food industry applications: a review.

Science.gov (United States)

Dai, Qiong; Cheng, Jun-Hu; Sun, Da-Wen; Zeng, Xin-An

2015-01-01

There is an increased interest in the applications of hyperspectral imaging (HSI) for assessing food quality, safety, and authenticity. HSI provides abundance of spatial and spectral information from foods by combining both spectroscopy and imaging, resulting in hundreds of contiguous wavebands for each spatial position of food samples, also known as the curse of dimensionality. It is desirable to employ feature selection algorithms for decreasing computation burden and increasing predicting accuracy, which are especially relevant in the development of online applications. Recently, a variety of feature selection algorithms have been proposed that can be categorized into three groups based on the searching strategy namely complete search, heuristic search and random search. This review mainly introduced the fundamental of each algorithm, illustrated its applications in hyperspectral data analysis in the food field, and discussed the advantages and disadvantages of these algorithms. It is hoped that this review should provide a guideline for feature selections and data processing in the future development of hyperspectral imaging technique in foods.
Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

Directory of Open Access Journals (Sweden)

Qingzhong Liu

Full Text Available Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA, which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods: Support Vector Machine Recursive Feature Elimination (SVMRFE, Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS, Gradient based Leave-one-out Gene Selection (GLGS. To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that gene selection is strictly paired with learning classifier. Overall, our approach outperforms other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction. Additionally, learning classifiers also play important roles in the classification of microarray data and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC is a good choice due to its prediction reliability and its stability across the three performance measurements: Testing accuracy, MCC values, and
Automatic Target Recognition: Statistical Feature Selection of Non-Gaussian Distributed Target Classes

Science.gov (United States)

2011-06-01

implementing, and evaluating many feature selection algorithms. Mucciardi and Gose compared seven different techniques for choosing subsets of pattern...122 THIS PAGE INTENTIONALLY LEFT BLANK 123 LIST OF REFERENCES [1] A. Mucciardi and E. Gose , “A comparison of seven techniques for
Prediction of protein modification sites of pyrrolidone carboxylic acid using mRMR feature selection and analysis.

Directory of Open Access Journals (Sweden)

Lu-Lu Zheng

Full Text Available Pyrrolidone carboxylic acid (PCA is formed during a common post-translational modification (PTM of extracellular and multi-pass membrane proteins. In this study, we developed a new predictor to predict the modification sites of PCA based on maximum relevance minimum redundancy (mRMR and incremental feature selection (IFS. We incorporated 727 features that belonged to 7 kinds of protein properties to predict the modification sites, including sequence conservation, residual disorder, amino acid factor, secondary structure and solvent accessibility, gain/loss of amino acid during evolution, propensity of amino acid to be conserved at protein-protein interface and protein surface, and deviation of side chain carbon atom number. Among these 727 features, 244 features were selected by mRMR and IFS as the optimized features for the prediction, with which the prediction model achieved a maximum of MCC of 0.7812. Feature analysis showed that all feature types contributed to the modification process. Further site-specific feature analysis showed that the features derived from PCA's surrounding sites contributed more to the determination of PCA sites than other sites. The detailed feature analysis in this paper might provide important clues for understanding the mechanism of the PCA formation and guide relevant experimental validations.
Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

Science.gov (United States)

Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

2014-01-01

Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification.

Science.gov (United States)

Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin

We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.
Feature Selection for Wheat Yield Prediction

Science.gov (United States)

Ruß, Georg; Kruse, Rudolf

Carrying out effective and sustainable agriculture has become an important issue in recent years. Agricultural production has to keep up with an everincreasing population by taking advantage of a field’s heterogeneity. Nowadays, modern technology such as the global positioning system (GPS) and a multitude of developed sensors enable farmers to better measure their fields’ heterogeneities. For this small-scale, precise treatment the term precision agriculture has been coined. However, the large amounts of data that are (literally) harvested during the growing season have to be analysed. In particular, the farmer is interested in knowing whether a newly developed heterogeneity sensor is potentially advantageous or not. Since the sensor data are readily available, this issue should be seen from an artificial intelligence perspective. There it can be treated as a feature selection problem. The additional task of yield prediction can be treated as a multi-dimensional regression problem. This article aims to present an approach towards solving these two practically important problems using artificial intelligence and data mining ideas and methodologies.
Theory-practice integration in selected clinical situations

Directory of Open Access Journals (Sweden)

M Davhana-Maselesele

2001-09-01

Full Text Available The current changes in health care systems challenge knowledgeable, mature and independent practitioners to integrate theoretical content with practice. The aim of this study was to investigate the problems of integrating theory with practice in selected clinical nursing situations. The study focused on rendering of family planning services to clients as a component of Community Nursing Science. Structured observation schedules were used to observe the theoretical content of the curriculum as well as the practical application of what has been taught in the clinical area. The findings of the study revealed that there was a need for an integrated holistic curriculum, which would address the needs of the community. It was concluded that a problem-based and community-based curriculum, intersectoral collaboration between college and hospital managements and student involvement in all processes of teaching and learning would improve the integration of theory and practice. There also appeared to be a need for tutors to be more involved in clinical teaching and accompaniment.
Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach

Directory of Open Access Journals (Sweden)

Taigang Liu

2015-12-01

Full Text Available The prior knowledge of protein structural class may offer useful clues on understanding its functionality as well as its tertiary structure. Though various significant efforts have been made to find a fast and effective computational approach to address this problem, it is still a challenging topic in the field of bioinformatics. The position-specific score matrix (PSSM profile has been shown to provide a useful source of information for improving the prediction performance of protein structural class. However, this information has not been adequately explored. To this end, in this study, we present a feature extraction technique which is based on gapped-dipeptides composition computed directly from PSSM. Then, a careful feature selection technique is performed based on support vector machine-recursive feature elimination (SVM-RFE. These optimal features are selected to construct a final predictor. The results of jackknife tests on four working datasets show that our method obtains satisfactory prediction accuracies by extracting features solely based on PSSM and could serve as a very promising tool to predict protein structural class.
Before you see it, you see its parts: evidence for feature encoding and integration in preschool children and adults.

Science.gov (United States)

Thompson, L A; Massaro, D W

1989-07-01

Preschool children and adults were compared in two experiments examining the basic issue of whether perceptual representations of objects are built-up from independent features along the dimensions of size and brightness. Experiment 1 was a visual search experiment. Subjects searched for targets which differed from distractors either by a single feature or by a conjunction of features. Results from preschoolers were comparable to those from adults, and were consistent with Treisman and Gelade's (1980, Cognitive Psychology, 12, 97-136) feature-integration theory of attention. Their theory states that independent features are encoded in parallel and are later combined with a spatial attention mechanism. However, children's significantly steeper conjunctive search slope indicated a slower speed of feature integration. In Experiment 2, four mathematical models of pattern recognition were tested against classification task data. The findings from both age groups were again consistent with a model assuming that size and brightness features are initially registered, and then integrated. Moreover, the data from Experiment 2 imply that perceptual growth entails small changes in the discriminability of featural representations; however, both experiments show that the operations performed on these representations are the same developmentally.
EUROPEAN INTEGRATION FROM POLAND’S VIEWPOINT. SELECTED ISSUES

OpenAIRE

Iwona M. Pawlas

2014-01-01

It has been ten years since Poland joined the European Union in May 2004. Integration with the EU structures resulted in considerable economic, social and political advantages. On the other hand membership in the EU created new challenges for Poland, the Polish companies and the Polish citizens. The paper reviews selected issues of Poland’s integration with the European Union with special focus on net financial effect of membership, competitiveness of Polish goods on single European market, p...
Feature Selection and Parameter Optimization of Support Vector Machines Based on Modified Artificial Fish Swarm Algorithms

Directory of Open Access Journals (Sweden)

Kuan-Cheng Lin

2015-01-01

Full Text Available Rapid advances in information and communication technology have made ubiquitous computing and the Internet of Things popular and practicable. These applications create enormous volumes of data, which are available for analysis and classification as an aid to decision-making. Among the classification methods used to deal with big data, feature selection has proven particularly effective. One common approach involves searching through a subset of the features that are the most relevant to the topic or represent the most accurate description of the dataset. Unfortunately, searching through this kind of subset is a combinatorial problem that can be very time consuming. Meaheuristic algorithms are commonly used to facilitate the selection of features. The artificial fish swarm algorithm (AFSA employs the intelligence underlying fish swarming behavior as a means to overcome optimization of combinatorial problems. AFSA has proven highly successful in a diversity of applications; however, there remain shortcomings, such as the likelihood of falling into a local optimum and a lack of multiplicity. This study proposes a modified AFSA (MAFSA to improve feature selection and parameter optimization for support vector machine classifiers. Experiment results demonstrate the superiority of MAFSA in classification accuracy using subsets with fewer features for given UCI datasets, compared to the original FASA.
Relevance feature selection of modal frequency-ambient condition pattern recognition in structural health assessment for reinforced concrete buildings

Directory of Open Access Journals (Sweden)

He-Qing Mu

2016-08-01

Full Text Available Modal frequency is an important indicator for structural health assessment. Previous studies have shown that this indicator is substantially affected by the fluctuation of ambient conditions, such as temperature and humidity. Therefore, recognizing the pattern between modal frequency and ambient conditions is necessary for reliable long-term structural health assessment. In this article, a novel machine-learning algorithm is proposed to automatically select relevance features in modal frequency-ambient condition pattern recognition based on structural dynamic response and ambient condition measurement. In contrast to the traditional feature selection approaches by examining a large number of combinations of extracted features, the proposed algorithm conducts continuous relevance feature selection by introducing a sophisticated hyperparameterization on the weight parameter vector controlling the relevancy of different features in the prediction model. The proposed algorithm is then utilized for structural health assessment for a reinforced concrete building based on 1-year daily measurements. It turns out that the optimal model class including the relevance features for each vibrational mode is capable to capture the pattern between the corresponding modal frequency and the ambient conditions.
Classification of Alzheimer's disease patients with hippocampal shape wrapper-based feature selection and support vector machine

Science.gov (United States)

Young, Jonathan; Ridgway, Gerard; Leung, Kelvin; Ourselin, Sebastien

2012-02-01

It is well known that hippocampal atrophy is a marker of the onset of Alzheimer's disease (AD) and as a result hippocampal volumetry has been used in a number of studies to provide early diagnosis of AD and predict conversion of mild cognitive impairment patients to AD. However, rates of atrophy are not uniform across the hippocampus making shape analysis a potentially more accurate biomarker. This study studies the hippocampi from 226 healthy controls, 148 AD patients and 330 MCI patients obtained from T1 weighted structural MRI images from the ADNI database. The hippocampi are anatomically segmented using the MAPS multi-atlas segmentation method, and the resulting binary images are then processed with SPHARM software to decompose their shapes as a weighted sum of spherical harmonic basis functions. The resulting parameterizations are then used as feature vectors in Support Vector Machine (SVM) classification. A wrapper based feature selection method was used as this considers the utility of features in discriminating classes in combination, fully exploiting the multivariate nature of the data and optimizing the selected set of features for the type of classifier that is used. The leave-one-out cross validated accuracy obtained on training data is 88.6% for classifying AD vs controls and 74% for classifying MCI-converters vs MCI-stable with very compact feature sets, showing that this is a highly promising method. There is currently a considerable fall in accuracy on unseen data indicating that the feature selection is sensitive to the data used, however feature ensemble methods may overcome this.
Fisher Information Based Meteorological Factors Introduction and Features Selection for Short-Term Load Forecasting

Directory of Open Access Journals (Sweden)

Shuping Cai

2018-03-01

Full Text Available Weather information is an important factor in short-term load forecasting (STLF. However, for a long time, more importance has always been attached to forecasting models instead of other processes such as the introduction of weather factors or feature selection for STLF. The main aim of this paper is to develop a novel methodology based on Fisher information for meteorological variables introduction and variable selection in STLF. Fisher information computation for one-dimensional and multidimensional weather variables is first described, and then the introduction of meteorological factors and variables selection for STLF models are discussed in detail. On this basis, different forecasting models with the proposed methodology are established. The proposed methodology is implemented on real data obtained from Electric Power Utility of Zhenjiang, Jiangsu Province, in southeast China. The results show the advantages of the proposed methodology in comparison with other traditional ones regarding prediction accuracy, and it has very good practical significance. Therefore, it can be used as a unified method for introducing weather variables into STLF models, and selecting their features.
Features of Functioning the Integrated Building Thermal Model

Directory of Open Access Journals (Sweden)

Morozov Maxim N.

2017-01-01

Full Text Available A model of the building heating system, consisting of energy source, a distributed automatic control system, elements of individual heating unit and heating system is designed. Application Simulink of mathematical package Matlab is selected as a platform for the model. There are the specialized application Simscape libraries in aggregate with a wide range of Matlab mathematical tools allow to apply the “acausal” modeling concept. Implementation the “physical” representation of the object model gave improving the accuracy of the models. Principle of operation and features of the functioning of the thermal model is described. The investigations of building cooling dynamics were carried out.

Size-selective detection in integrated optical interferometric biosensors

NARCIS (Netherlands)

Mulder, Harmen K P; Ymeti, Aurel; Subramaniam, Vinod; Kanger, Johannes S

2012-01-01

We present a new size-selective detection method for integrated optical interferometric biosensors that can strongly enhance their performance. We demonstrate that by launching multiple wavelengths into a Young interferometer waveguide sensor it is feasible to derive refractive index changes from
Features of mechanical snubbers and the method of selection

International Nuclear Information System (INIS)

Sunakoda, Katsuaki

1978-01-01

In the oil snubbers used in the high radiation environment of nuclear power stations, gas generation from oil and the deterioration of rubber material for sealing occur due to radiation damage, therefore periodical inspection and replacement are required during operation. The mechanical snubbers developed as aseismatic supporters in place of oil snubbers have entered the stage of practical use, and are made by two companies in USA and a company in Japan. Their features as compared with oil snubbers are as follows. The cost and time required for the maintenance were made as small as possible because the increase of the service life of mechanical components can be expected. The temperature dependence of mechanical snubbers is small. The matters demanding attention in the maintenance are the secular change of lubricating oil and the effect of radiation, and the rust prevention of ball screw bearings. These problems are being studied by Power Reactor and Nuclear Fuel Development Corp. for the fast prototype reactor Monju. The structural feature is to convert the thrust movement of equipments and pipings due to thermal expansion and contraction or earthquakes into rotating motion, using ball screws. The features and the construction of SMS type mechanical snubbers, the test and inspection prior to their shipping, the method of selection, and the method of handling them in actual places are explained. (Kako, I.)
Feature Selection for Motor Imagery EEG Classification Based on Firefly Algorithm and Learning Automata.

Science.gov (United States)

Liu, Aiming; Chen, Kun; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi

2017-11-08

Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain-computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain-computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain-computer interface systems.
Feature Selection for Motor Imagery EEG Classification Based on Firefly Algorithm and Learning Automata

Directory of Open Access Journals (Sweden)

Aiming Liu

2017-11-01

Full Text Available Motor Imagery (MI electroencephalography (EEG is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP and local characteristic-scale decomposition (LCD algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA classifier. Both the fourth brain–computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain–computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain–computer interface systems.
Visual feature integration theory: past, present, and future.

Science.gov (United States)

Quinlan, Philip T

2003-09-01

Visual feature integration theory was one of the most influential theories of visual information processing in the last quarter of the 20th century. This article provides an exposition of the theory and a review of the associated data. In the past much emphasis has been placed on how the theory explains performance in various visual search tasks. The relevant literature is discussed and alternative accounts are described. Amendments to the theory are also set out. Many other issues concerning internal processes and representations implicated by the theory are reviewed. The article closes with a synopsis of what has been learned from consideration of the theory, and it is concluded that some of the issues may remain intractable unless appropriate neuroscientific investigations are carried out.
Alertness modulates conflict adaptation and feature integration in an opposite way.

Directory of Open Access Journals (Sweden)

Peiduo Liu

Full Text Available Previous studies show that the congruency sequence effect can result from both the conflict adaptation effect (CAE and feature integration effect which can be observed as the repetition priming effect (RPE and feature overlap effect (FOE depending on different experimental conditions. Evidence from neuroimaging studies suggests that a close correlation exists between the neural mechanisms of alertness-related modulations and the congruency sequence effect. However, little is known about whether and how alertness mediates the congruency sequence effect. In Experiment 1, the Attentional Networks Test (ANT and a modified flanker task were used to evaluate whether the alertness of the attentional functions had a correlation with the CAE and RPE. In Experimental 2, the ANT and another modified flanker task were used to investigate whether alertness of the attentional functions correlate with the CAE and FOE. In Experiment 1, through the correlative analysis, we found a significant positive correlation between alertness and the CAE, and a negative correlation between the alertness and the RPE. Moreover, a significant negative correlation existed between CAE and RPE. In Experiment 2, we found a marginally significant negative correlation between the CAE and the RPE, but the correlation between alertness and FOE, CAE and FOE was not significant. These results suggest that alertness can modulate conflict adaptation and feature integration in an opposite way. Participants at the high alerting level group may tend to use the top-down cognitive processing strategy, whereas participants at the low alerting level group tend to use the bottom-up processing strategy.
Alertness Modulates Conflict Adaptation and Feature Integration in an Opposite Way

Science.gov (United States)

Chen, Jia; Huang, Xiting; Chen, Antao

2013-01-01

Previous studies show that the congruency sequence effect can result from both the conflict adaptation effect (CAE) and feature integration effect which can be observed as the repetition priming effect (RPE) and feature overlap effect (FOE) depending on different experimental conditions. Evidence from neuroimaging studies suggests that a close correlation exists between the neural mechanisms of alertness-related modulations and the congruency sequence effect. However, little is known about whether and how alertness mediates the congruency sequence effect. In Experiment 1, the Attentional Networks Test (ANT) and a modified flanker task were used to evaluate whether the alertness of the attentional functions had a correlation with the CAE and RPE. In Experimental 2, the ANT and another modified flanker task were used to investigate whether alertness of the attentional functions correlate with the CAE and FOE. In Experiment 1, through the correlative analysis, we found a significant positive correlation between alertness and the CAE, and a negative correlation between the alertness and the RPE. Moreover, a significant negative correlation existed between CAE and RPE. In Experiment 2, we found a marginally significant negative correlation between the CAE and the RPE, but the correlation between alertness and FOE, CAE and FOE was not significant. These results suggest that alertness can modulate conflict adaptation and feature integration in an opposite way. Participants at the high alerting level group may tend to use the top-down cognitive processing strategy, whereas participants at the low alerting level group tend to use the bottom-up processing strategy. PMID:24250824
Feature and score fusion based multiple classifier selection for iris recognition.

Science.gov (United States)

Islam, Md Rabiul

2014-01-01

The aim of this work is to propose a new feature and score fusion based iris recognition approach where voting method on Multiple Classifier Selection technique has been applied. Four Discrete Hidden Markov Model classifiers output, that is, left iris based unimodal system, right iris based unimodal system, left-right iris feature fusion based multimodal system, and left-right iris likelihood ratio score fusion based multimodal system, is combined using voting method to achieve the final recognition result. CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, recognition accuracy of the proposed system has been compared with existing N hamming distance score fusion approach proposed by Ma et al., log-likelihood ratio score fusion approach proposed by Schmid et al., and single level feature fusion approach proposed by Hollingsworth et al.
Feature and Score Fusion Based Multiple Classifier Selection for Iris Recognition

Directory of Open Access Journals (Sweden)

Md. Rabiul Islam

2014-01-01

Full Text Available The aim of this work is to propose a new feature and score fusion based iris recognition approach where voting method on Multiple Classifier Selection technique has been applied. Four Discrete Hidden Markov Model classifiers output, that is, left iris based unimodal system, right iris based unimodal system, left-right iris feature fusion based multimodal system, and left-right iris likelihood ratio score fusion based multimodal system, is combined using voting method to achieve the final recognition result. CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, recognition accuracy of the proposed system has been compared with existing N hamming distance score fusion approach proposed by Ma et al., log-likelihood ratio score fusion approach proposed by Schmid et al., and single level feature fusion approach proposed by Hollingsworth et al.
Integrating Features of Islamic Traditional Home and Smart Home

Directory of Open Access Journals (Sweden)

Mona El Basyouni

2017-06-01

Full Text Available Architecture is a mirror that reflects the various elements of its environment and surroundings, such as climate, geographical characteristics, standard architectural principles, and social, cultural and scientific developments. Muslims of different regions were able, through architecture, to portray their temperaments and environments, free of external influence and guarantee life goals for users. Every day, building owners and occupants experience the constant challenges of comfort, convenience, cost, productivity, performance and sustainability. Owners, designers, builders, and operators are continuously faced with new processes, technologies and offerings to help them achieve better building performance. Since an intelligent building is run by a “system of systems” that is integrated to deliver a higher level of operational efficiency and an improved set of user-interface tools than are usually found in traditional building automation; at the other hand Arab homes with Islamic Identity guarantee all life goals for use.. Hence, this research focus on the smart environmental treatments of Islamic features for traditional architecture in Arabs homes, features of smart home and life goals for resident users.Trying to achieve a methodology combining them for enriching Arab experience of traditional architecture and its architectural results, with the modern trends of smart architecture. This combination aims at creating a residential model combining the benefits and features of Arab Islamic identity and intelligent design.
Feature Genes Selection Using Supervised Locally Linear Embedding and Correlation Coefficient for Microarray Classification.

Science.gov (United States)

Xu, Jiucheng; Mu, Huiyu; Wang, Yun; Huang, Fangzhou

2018-01-01

The selection of feature genes with high recognition ability from the gene expression profiles has gained great significance in biology. However, most of the existing methods have a high time complexity and poor classification performance. Motivated by this, an effective feature selection method, called supervised locally linear embedding and Spearman's rank correlation coefficient (SLLE-SC 2 ), is proposed which is based on the concept of locally linear embedding and correlation coefficient algorithms. Supervised locally linear embedding takes into account class label information and improves the classification performance. Furthermore, Spearman's rank correlation coefficient is used to remove the coexpression genes. The experiment results obtained on four public tumor microarray datasets illustrate that our method is valid and feasible.
The effects of selective and divided attention on sensory precision and integration.

Science.gov (United States)

Odegaard, Brian; Wozny, David R; Shams, Ladan

2016-02-12

In our daily lives, our capacity to selectively attend to stimuli within or across sensory modalities enables enhanced perception of the surrounding world. While previous research on selective attention has studied this phenomenon extensively, two important questions still remain unanswered: (1) how selective attention to a single modality impacts sensory integration processes, and (2) the mechanism by which selective attention improves perception. We explored how selective attention impacts performance in both a spatial task and a temporal numerosity judgment task, and employed a Bayesian Causal Inference model to investigate the computational mechanism(s) impacted by selective attention. We report three findings: (1) in the spatial domain, selective attention improves precision of the visual sensory representations (which were relatively precise), but not the auditory sensory representations (which were fairly noisy); (2) in the temporal domain, selective attention improves the sensory precision in both modalities (both of which were fairly reliable to begin with); (3) in both tasks, selective attention did not exert a significant influence over the tendency to integrate sensory stimuli. Therefore, it may be postulated that a sensory modality must possess a certain inherent degree of encoding precision in order to benefit from selective attention. It also appears that in certain basic perceptual tasks, the tendency to integrate crossmodal signals does not depend significantly on selective attention. We conclude with a discussion of how these results relate to recent theoretical considerations of selective attention. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Advanced GPR imaging of sedimentary features: integrated attribute analysis applied to sand dunes

Science.gov (United States)

Zhao, Wenke; Forte, Emanuele; Fontolan, Giorgio; Pipan, Michele

2018-04-01

We evaluate the applicability and the effectiveness of integrated GPR attribute analysis to image the internal sedimentary features of the Piscinas Dunes, SW Sardinia, Italy. The main objective is to explore the limits of GPR techniques to study sediment-bodies geometry and to provide a non-invasive high-resolution characterization of the different subsurface domains of dune architecture. On such purpose, we exploit the high-quality Piscinas data-set to extract and test different attributes of the GPR trace. Composite displays of multi-attributes related to amplitude, frequency, similarity and textural features are displayed with overlays and RGB mixed models. A multi-attribute comparative analysis is used to characterize different radar facies to better understand the characteristics of internal reflection patterns. The results demonstrate that the proposed integrated GPR attribute analysis can provide enhanced information about the spatial distribution of sediment bodies, allowing an enhanced and more constrained data interpretation.
The Influence of Selective and Divided Attention on Audiovisual Integration in Children.

Science.gov (United States)

Yang, Weiping; Ren, Yanna; Yang, Dan Ou; Yuan, Xue; Wu, Jinglong

2016-01-24

This article aims to investigate whether there is a difference in audiovisual integration in school-aged children (aged 6 to 13 years; mean age = 9.9 years) between the selective attention condition and divided attention condition. We designed a visual and/or auditory detection task that included three blocks (divided attention, visual-selective attention, and auditory-selective attention). The results showed that the response to bimodal audiovisual stimuli was faster than to unimodal auditory or visual stimuli under both divided attention and auditory-selective attention conditions. However, in the visual-selective attention condition, no significant difference was found between the unimodal visual and bimodal audiovisual stimuli in response speed. Moreover, audiovisual behavioral facilitation effects were compared between divided attention and selective attention (auditory or visual attention). In doing so, we found that audiovisual behavioral facilitation was significantly difference between divided attention and selective attention. The results indicated that audiovisual integration was stronger in the divided attention condition than that in the selective attention condition in children. Our findings objectively support the notion that attention can modulate audiovisual integration in school-aged children. Our study might offer a new perspective for identifying children with conditions that are associated with sustained attention deficit, such as attention-deficit hyperactivity disorder. © The Author(s) 2016.
Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data

Directory of Open Access Journals (Sweden)

Harris Lyndsay N

2006-04-01

Full Text Available Abstract Background Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results We developed a recursive support vector machine (R-SVM algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination or SVM-RFE, paying special attention to the ability of recovering the true informative genes/biomarkers and the robustness to outliers in the data. Simulation experiments show that a 5 %-~20 % improvement over SVM-RFE can be achieved regard to these properties. The SVM-based methods are also compared with a conventional univariate method and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data and it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in the classification performance, but univariate methods can reveal more of the differentially expressed features especially when there are correlations between the features.
A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities.

Science.gov (United States)

Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah

2018-02-01

Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is Fisher score which aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of Fisher criterion were extended for QSAR models to define the new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets such as serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the selected descriptors by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.
Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection.

Science.gov (United States)

Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong

2012-01-01

Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between varies acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in human. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also biomakers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provide evidence for the specificity of acupoints. Our result demonstrates that metabolic profiling might be a promising method to
A Semidefinite Programming Based Search Strategy for Feature Selection with Mutual Information Measure.

Science.gov (United States)

Naghibi, Tofigh; Hoffmann, Sarah; Pfister, Beat

2015-08-01

Feature subset selection, as a special case of the general subset selection problem, has been the topic of a considerable number of studies due to the growing importance of data-mining applications. In the feature subset selection problem there are two main issues that need to be addressed: (i) Finding an appropriate measure function than can be fairly fast and robustly computed for high-dimensional data. (ii) A search strategy to optimize the measure over the subset space in a reasonable amount of time. In this article mutual information between features and class labels is considered to be the measure function. Two series expansions for mutual information are proposed, and it is shown that most heuristic criteria suggested in the literature are truncated approximations of these expansions. It is well-known that searching the whole subset space is an NP-hard problem. Here, instead of the conventional sequential search algorithms, we suggest a parallel search strategy based on semidefinite programming (SDP) that can search through the subset space in polynomial time. By exploiting the similarities between the proposed algorithm and an instance of the maximum-cut problem in graph theory, the approximation ratio of this algorithm is derived and is compared with the approximation ratio of the backward elimination method. The experiments show that it can be misleading to judge the quality of a measure solely based on the classification accuracy, without taking the effect of the non-optimum search strategy into account.
Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

KAUST Repository

Abusamra, Heba

2016-07-20

The native nature of high dimension low sample size of gene expression data make the classification task more challenging. Therefore, feature (gene) selection become an apparent need. Selecting a meaningful and relevant genes for classifier not only decrease the computational time and cost, but also improve the classification performance. Among different approaches of feature selection methods, however most of them suffer from several problems such as lack of robustness, validation issues etc. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used leukemia gene expression dataset [1]. The effectiveness of the selected features were evaluated by four different classification methods; support vector machines, k-nearest neighbor, random forest, and linear discriminate analysis. The method evaluate the importance and relevance of each gene cluster by summing the expression level for each gene belongs to this cluster. The gene cluster consider important, if it satisfies conditions depend on thresholds and percentage otherwise eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a), after applying our feature selection methodology we end up with specific 1117 genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent higher positive and lower negative threshold condition, number reduced to 58 genes have be tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows distinct pattern of refines genes discriminating between two classes of leukemia.
A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods.

Science.gov (United States)

Torija, Antonio J; Ruiz, Diego P

2015-02-01

The prediction of environmental noise in urban environments requires the solution of a complex and non-linear problem, since there are complex relationships among the multitude of variables involved in the characterization and modelling of environmental noise and environmental-noise magnitudes. Moreover, the inclusion of the great spatial heterogeneity characteristic of urban environments seems to be essential in order to achieve an accurate environmental-noise prediction in cities. This problem is addressed in this paper, where a procedure based on feature-selection techniques and machine-learning regression methods is proposed and applied to this environmental problem. Three machine-learning regression methods, which are considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq). These three methods are: (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, because of the high number of input variables involved in environmental-noise modelling and estimation in urban environments, which make LAeq prediction models quite complex and costly in terms of time and resources for application to real situations, three different techniques are used to approach feature selection or data reduction. The feature-selection techniques used are: (i) correlation-based feature-subset selection (CFS), (ii) wrapper for feature-subset selection (WFS), and the data reduction technique is principal-component analysis (PCA). The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. The use of WFS as the feature-selection technique with the implementation of SMO or GPR as regression algorithm provides the best LAeq estimation (R(2)=0.94 and mean absolute error (MAE)=1.14-1.16 dB(A)). Copyright © 2014 Elsevier B.V. All rights reserved.

Integrating genomic selection into dairy cattle breeding programmes: a review.

Science.gov (United States)

Bouquet, A; Juga, J

2013-05-01

Extensive genetic progress has been achieved in dairy cattle populations on many traits of economic importance because of efficient breeding programmes. Success of these programmes has relied on progeny testing of the best young males to accurately assess their genetic merit and hence their potential for breeding. Over the last few years, the integration of dense genomic information into statistical tools used to make selection decisions, commonly referred to as genomic selection, has enabled gains in predicting accuracy of breeding values for young animals without own performance. The possibility to select animals at an early stage allows defining new breeding strategies aimed at boosting genetic progress while reducing costs. The first objective of this article was to review methods used to model and optimize breeding schemes integrating genomic selection and to discuss their relative advantages and limitations. The second objective was to summarize the main results and perspectives on the use of genomic selection in practical breeding schemes, on the basis of the example of dairy cattle populations. Two main designs of breeding programmes integrating genomic selection were studied in dairy cattle. Genomic selection can be used either for pre-selecting males to be progeny tested or for selecting males to be used as active sires in the population. The first option produces moderate genetic gains without changing the structure of breeding programmes. The second option leads to large genetic gains, up to double those of conventional schemes because of a major reduction in the mean generation interval, but it requires greater changes in breeding programme structure. The literature suggests that genomic selection becomes more attractive when it is coupled with embryo transfer technologies to further increase selection intensity on the dam-to-sire pathway. The use of genomic information also offers new opportunities to improve preservation of genetic variation. However
Constructing and validating readability models: the method of integrating multilevel linguistic features with machine learning.

Science.gov (United States)

Sung, Yao-Ting; Chen, Ju-Ling; Cha, Ji-Her; Tseng, Hou-Chiang; Chang, Tao-Hsing; Chang, Kuo-En

2015-06-01

Multilevel linguistic features have been proposed for discourse analysis, but there have been few applications of multilevel linguistic features to readability models and also few validations of such models. Most traditional readability formulae are based on generalized linear models (GLMs; e.g., discriminant analysis and multiple regression), but these models have to comply with certain statistical assumptions about data properties and include all of the data in formulae construction without pruning the outliers in advance. The use of such readability formulae tends to produce a low text classification accuracy, while using a support vector machine (SVM) in machine learning can enhance the classification outcome. The present study constructed readability models by integrating multilevel linguistic features with SVM, which is more appropriate for text classification. Taking the Chinese language as an example, this study developed 31 linguistic features as the predicting variables at the word, semantic, syntax, and cohesion levels, with grade levels of texts as the criterion variable. The study compared four types of readability models by integrating unilevel and multilevel linguistic features with GLMs and an SVM. The results indicate that adopting a multilevel approach in readability analysis provides a better representation of the complexities of both texts and the reading comprehension process.
An integrated approach to site selection for nuclear power plants

International Nuclear Information System (INIS)

Hassan, E.M.A.

1975-01-01

A method of analysing and evaluating the large number of factors influencing site selection is proposed, which can interrelate these factors and associated problems in an integrated way and at the same time establish a technique for site evaluation. The objective is to develop an integrated programme that illustrates the complexity and dynamic interrelationships of the various factors to develop an improved understanding of the functions and objectives of siting nuclear power plants and would aim finally at the development of an effective procedure and technique for site evaluation and/or comparative evaluation for making rational site-selection decisions. (author)
Feature Subset Selection and Instance Filtering for Cross-project Defect Prediction - Classification and Ranking

Directory of Open Access Journals (Sweden)

Faimison Porto

2016-12-01

Full Text Available The defect prediction models can be a good tool on organizing the project's test resources. The models can be constructed with two main goals: 1 to classify the software parts - defective or not; or 2 to rank the most defective parts in a decreasing order. However, not all companies maintain an appropriate set of historical defect data. In this case, a company can build an appropriate dataset from known external projects - called Cross-project Defect Prediction (CPDP. The CPDP models, however, present low prediction performances due to the heterogeneity of data. Recently, Instance Filtering methods were proposed in order to reduce this heterogeneity by selecting the most similar instances from the training dataset. Originally, the similarity is calculated based on all the available dataset features (or independent variables. We propose that using only the most relevant features on the similarity calculation can result in more accurate filtered datasets and better prediction performances. In this study we extend our previous work. We analyse both prediction goals - Classification and Ranking. We present an empirical evaluation of 41 different methods by associating Instance Filtering methods with Feature Selection methods. We used 36 versions of 11 open source projects on experiments. The results show similar evidences for both prediction goals. First, the defect prediction performance of CPDP models can be improved by associating Feature Selection and Instance Filtering. Second, no evaluated method presented general better performances. Indeed, the most appropriate method can vary according to the characteristics of the project being predicted.
Integrated water flow model and modflow-farm process: A comparison of theory, approaches, and features of two integrated hydrologic models

Science.gov (United States)

Dogrul, Emin C.; Schmid, Wolfgang; Hanson, Randall T.; Kadir, Tariq; Chung, Francis

2016-01-01

Effective modeling of conjunctive use of surface and subsurface water resources requires simulation of land use-based root zone and surface flow processes as well as groundwater flows, streamflows, and their interactions. Recently, two computer models developed for this purpose, the Integrated Water Flow Model (IWFM) from the California Department of Water Resources and the MODFLOW with Farm Process (MF-FMP) from the US Geological Survey, have been applied to complex basins such as the Central Valley of California. As both IWFM and MFFMP are publicly available for download and can be applied to other basins, there is a need to objectively compare the main approaches and features used in both models. This paper compares the concepts, as well as the method and simulation features of each hydrologic model pertaining to groundwater, surface water, and landscape processes. The comparison is focused on the integrated simulation of water demand and supply, water use, and the flow between coupled hydrologic processes. The differences in the capabilities and features of these two models could affect the outcome and types of water resource problems that can be simulated.
Intelligent feature selection techniques for pattern classification of Lamb wave signals

International Nuclear Information System (INIS)

Hinders, Mark K.; Miller, Corey A.

2014-01-01

Lamb wave interaction with flaws is a complex, three-dimensional phenomenon, which often frustrates signal interpretation schemes based on mode arrival time shifts predicted by dispersion curves. As the flaw severity increases, scattering and mode conversion effects will often dominate the time-domain signals, obscuring available information about flaws because multiple modes may arrive on top of each other. Even for idealized flaw geometries the scattering and mode conversion behavior of Lamb waves is very complex. Here, multi-mode Lamb waves in a metal plate are propagated across a rectangular flat-bottom hole in a sequence of pitch-catch measurements corresponding to the double crosshole tomography geometry. The flaw is sequentially deepened, with the Lamb wave measurements repeated at each flaw depth. Lamb wave tomography reconstructions are used to identify which waveforms have interacted with the flaw and thereby carry information about its depth. Multiple features are extracted from each of the Lamb wave signals using wavelets, which are then fed to statistical pattern classification algorithms that identify flaw severity. In order to achieve the highest classification accuracy, an optimal feature space is required but it’s never known a priori which features are going to be best. For structural health monitoring we make use of the fact that physical flaws, such as corrosion, will only increase over time. This allows us to identify feature vectors which are topologically well-behaved by requiring that sequential classes “line up” in feature vector space. An intelligent feature selection routine is illustrated that identifies favorable class distributions in multi-dimensional feature spaces using computational homology theory. Betti numbers and formal classification accuracies are calculated for each feature space subset to establish a correlation between the topology of the class distribution and the corresponding classification accuracy
Integrating communication protocol selection with partitioning in hardware/software codesign

DEFF Research Database (Denmark)

Knudsen, Peter Voigt; Madsen, Jan

1998-01-01

frequencies of system components such as buses, CPU's, ASIC's, software code size, hardware area, and component prices. A distinct feature of the model is the modeling of driver processing of data (packing, splitting, compression, etc.) and its impact on communication throughput. The integration...
An Integrated Skills Approach Using Feature Movies in EFL at Tertiary Level

Science.gov (United States)

Tuncay, Hidayet

2014-01-01

This paper presents the results of a case study based on an integrated skills approach using feature movies (DVDs) in EFL syllabi at the tertiary level. 100 students took part in the study and the data was collected through a three - section survey questionnaire: demographic items, 18 likert scale questions and an open-ended question. The data…
Feature Selection Strategy for Classification of Single-Trial EEG Elicited by Motor Imagery

DEFF Research Database (Denmark)

Prasad, Swati; Tan, Zheng-Hua; Prasad, Ramjee

2011-01-01

Brain-Computer Interface (BCI) provides new means of communication for people with motor disabilities by utilizing electroencephalographic activity. Selection of features from Electroencephalogram (EEG) signals for classification plays a key part in the development of BCI systems. In this paper, we...
Optimum location of external markers using feature selection algorithms for real-time tumor tracking in external-beam radiotherapy: a virtual phantom study.

Science.gov (United States)

Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin

2016-01-08

In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation-based feature selection algorithm, in
Day-ahead price forecasting of electricity markets by a new feature selection algorithm and cascaded neural network technique

International Nuclear Information System (INIS)

Amjady, Nima; Keynia, Farshid

2009-01-01

With the introduction of restructuring into the electric power industry, the price of electricity has become the focus of all activities in the power market. Electricity price forecast is key information for electricity market managers and participants. However, electricity price is a complex signal due to its non-linear, non-stationary, and time variant behavior. In spite of performed research in this area, more accurate and robust price forecast methods are still required. In this paper, a new forecast strategy is proposed for day-ahead price forecasting of electricity markets. Our forecast strategy is composed of a new two stage feature selection technique and cascaded neural networks. The proposed feature selection technique comprises modified Relief algorithm for the first stage and correlation analysis for the second stage. The modified Relief algorithm selects candidate inputs with maximum relevancy with the target variable. Then among the selected candidates, the correlation analysis eliminates redundant inputs. Selected features by the two stage feature selection technique are used for the forecast engine, which is composed of 24 consecutive forecasters. Each of these 24 forecasters is a neural network allocated to predict the price of 1 h of the next day. The whole proposed forecast strategy is examined on the Spanish and Australia's National Electricity Markets Management Company (NEMMCO) and compared with some of the most recent price forecast methods.
Control structure selection for energy integrated distillation column

DEFF Research Database (Denmark)

Hansen, J.E.; Jørgensen, Sten Bay

1998-01-01

This paper treats a case study on control structure selection for an almost binary distillation column. The column is energy integrated with a heat pump in order to transfer heat from the condenser to the reboiler. This integrated plant configuration renders the possible control structures somewhat...... different from what is usual for binary distillation columns. Further the heat pump enables disturbances to propagate faster through the system. The plant has six possible actuators of which three must be used to stabilize the system. Hereby three actuators are left for product purity control. An MILP...
Integrated model for supplier selection and performance evaluation

Directory of Open Access Journals (Sweden)

Borges de Araújo, Maria Creuza

2015-08-01

Full Text Available This paper puts forward a model for selecting suppliers and evaluating the performance of those already working with a company. A simulation was conducted in a food industry. This sector has high significance in the economy of Brazil. The model enables the phases of selecting and evaluating suppliers to be integrated. This is important so that a company can have partnerships with suppliers who are able to meet their needs. Additionally, a group method is used to enable managers who will be affected by this decision to take part in the selection stage. Finally, the classes resulting from the performance evaluation are shown to support the contractor in choosing the most appropriate relationship with its suppliers.
Evaluating EMG Feature and Classifier Selection for Application to Partial-Hand Prosthesis Control

Directory of Open Access Journals (Sweden)

Adenike A. Adewuyi

2016-10-01

Full Text Available Pattern recognition-based myoelectric control of upper limb prostheses has the potential to restore control of multiple degrees of freedom. Though this control method has been extensively studied in individuals with higher-level amputations, few studies have investigated its effectiveness for individuals with partial-hand amputations. Most partial-hand amputees retain a functional wrist and the ability of pattern recognition-based methods to correctly classify hand motions from different wrist positions is not well studied. In this study, focusing on partial-hand amputees, we evaluate (1 the performance of non-linear and linear pattern recognition algorithms and (2 the performance of optimal EMG feature subsets for classification of four hand motion classes in different wrist positions for 16 non-amputees and 4 amputees. Our results show that linear discriminant analysis and linear and non-linear artificial neural networks perform significantly better than the quadratic discriminant analysis for both non-amputees and partial-hand amputees. For amputees, including information from multiple wrist positions significantly decreased error (p<0.001 but no further significant decrease in error occurred when more than 4, 2, or 3 positions were included for the extrinsic (p=0.07, intrinsic (p=0.06, or combined extrinsic and intrinsic muscle EMG (p=0.08, respectively. Finally, we found that a feature set determined by selecting optimal features from each channel outperformed the commonly used time domain (p<0.001 and time domain/autoregressive feature sets (p<0.01. This method can be used as a screening filter to select the features from each channel that provide the best classification of hand postures across different wrist positions.
Practical use of the integrated reporting framework – an analysis of the content of integrated reports of selected companies

Directory of Open Access Journals (Sweden)

Monika Raulinajtys-Grzybek

2017-09-01

Full Text Available Practical use of the integrated reporting framework – an analysis of the content of integrated reports of selected companies The purpose of the article is to provide a research tool for an initial assessment of whether a company’s integrated reports meet the objectives set out in the IIRC Integrated Reporting Framework and its empirical verification. In particular, the research addresses whether the reports meet the goal of improving the quality of information available and covering all factors that influence the organization’s ability to create value. The article uses the theoretical output on the principles of preparing integrated reports and analyzes the content of selected integrated reports. Based on the source analysis, a research tool has been developed for an initial assessment of whether an integrated report fulfills its objectives. It consists of 42 questions that verify the coverage of the defined elements and the implementation of the guiding principles set by the IIRC. For empirical verification of the tool, a comparative analysis was carried out for reports prepared by selected companies operating in the utilities sector. Answering questions from the research tool allows a researcher to formulate conclusions about the implementation of the guiding principles and the completeness of the presentation of the content elements. As a result of the analysis of selected integrated reports, it was stated that various elements of the report are presented with different levels of accuracy in different reports. Reports provide the most complete information on performance and strategy. The information about business model and prospective data is in some cases presented without making a link to other parts of the report – e.g. risks and opportunities, financial data or capitals. The absence of such links limits the ability to claim that an integrated report meets its objectives, since a set of individual reports, each presenting
A Quantum Hybrid PSO Combined with Fuzzy k-NN Approach to Feature Selection and Cell Classification in Cervical Cancer Detection

Directory of Open Access Journals (Sweden)

Abdullah M. Iliyasu

2017-12-01

Full Text Available A quantum hybrid (QH intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO method with the intuitionistic rationality of traditional fuzzy k-nearest neighbours (Fuzzy k-NN algorithm (known simply as the Q-Fuzzy approach is proposed for efficient feature selection and classification of cells in cervical smeared (CS images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach. In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regards to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in number cell features, which is crucial for effective cervical cancer detection and diagnosis.
Feature selection and classifier parameters estimation for EEG signals peak detection using particle swarm optimization.

Science.gov (United States)

Adam, Asrul; Shapiai, Mohd Ibrahim; Tumari, Mohd Zaidi Mohd; Mohamad, Mohd Saberi; Mubin, Marizan

2014-01-01

Electroencephalogram (EEG) signal peak detection is widely used in clinical applications. The peak point can be detected using several approaches, including time, frequency, time-frequency, and nonlinear domains depending on various peak features from several models. However, there is no study that provides the importance of every peak feature in contributing to a good and generalized model. In this study, feature selection and classifier parameters estimation based on particle swarm optimization (PSO) are proposed as a framework for peak detection on EEG signals in time domain analysis. Two versions of PSO are used in the study: (1) standard PSO and (2) random asynchronous particle swarm optimization (RA-PSO). The proposed framework tries to find the best combination of all the available features that offers good peak detection and a high classification rate from the results in the conducted experiments. The evaluation results indicate that the accuracy of the peak detection can be improved up to 99.90% and 98.59% for training and testing, respectively, as compared to the framework without feature selection adaptation. Additionally, the proposed framework based on RA-PSO offers a better and reliable classification rate as compared to standard PSO as it produces low variance model.
A novel relational regularization feature selection method for joint regression and classification in AD diagnosis.

Science.gov (United States)

Zhu, Xiaofeng; Suk, Heung-Il; Wang, Li; Lee, Seong-Whan; Shen, Dinggang

2017-05-01

In this paper, we focus on joint regression and classification for Alzheimer's disease diagnosis and propose a new feature selection method by embedding the relational information inherent in the observations into a sparse multi-task learning framework. Specifically, the relational information includes three kinds of relationships (such as feature-feature relation, response-response relation, and sample-sample relation), for preserving three kinds of the similarity, such as for the features, the response variables, and the samples, respectively. To conduct feature selection, we first formulate the objective function by imposing these three relational characteristics along with an ℓ 2,1 -norm regularization term, and further propose a computationally efficient algorithm to optimize the proposed objective function. With the dimension-reduced data, we train two support vector regression models to predict the clinical scores of ADAS-Cog and MMSE, respectively, and also a support vector classification model to determine the clinical label. We conducted extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset to validate the effectiveness of the proposed method. Our experimental results showed the efficacy of the proposed method in enhancing the performances of both clinical scores prediction and disease status identification, compared to the state-of-the-art methods. Copyright © 2015 Elsevier B.V. All rights reserved.
Time course of spatial and feature selective attention for partly-occluded objects.

Science.gov (United States)

Kasai, Tetsuko; Takeya, Ryuji

2012-07-01

Attention selects objects/groups as the most fundamental units, and this may be achieved by an attention-spreading mechanism. Previous event-related potential (ERP) studies have found that attention-spreading is reflected by a decrease in the N1 spatial attention effect. The present study tested whether the electrophysiological attention effect is associated with the perception of object unity or amodal completion through the use of partly-occluded objects. ERPs were recorded in 14 participants who were required to pay attention to their left or right visual field and to press a button for a target shape in the attended field. Bilateral stimuli were presented rapidly, and were separated, connected, or connected behind an occluder. Behavioral performance in the connected and occluded conditions was worse than that in the separated condition, indicating that attention spread over perceptual object representations after amodal completion. Consistently, the late N1 spatial attention effect (180-220 ms post-stimulus) and the early phase (230-280 ms) of feature selection effects (target N2) at contralateral sites decreased, equally for the occluded and connected conditions, while the attention effect in the early N1 latency (140-180 ms) shifted most positively for the occluded condition. These results suggest that perceptual organization processes for object recognition transiently modulate spatial and feature selection processes in the visual cortex. Copyright © 2012 Elsevier Ltd. All rights reserved.
Audio-visual synchrony and feature-selective attention co-amplify early visual processing.

Science.gov (United States)

Keitel, Christian; Müller, Matthias M

2016-05-01

Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both, attending to the colour of a stimulus and its synchrony with the tone, enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space.

An Integrated DEMATEL-QFD Model for Medical Supplier Selection

OpenAIRE

Mehtap Dursun; Zeynep Şener

2014-01-01

Supplier selection is considered as one of the most critical issues encountered by operations and purchasing managers to sharpen the company’s competitive advantage. In this paper, a novel fuzzy multi-criteria group decision making approach integrating quality function deployment (QFD) and decision making trial and evaluation laboratory (DEMATEL) method is proposed for supplier selection. The proposed methodology enables to consider the impacts of inner dependence among supplier assessment cr...
BLProt: Prediction of bioluminescent proteins based on support vector machine and relieff feature selection

KAUST Repository

Kandaswamy, Krishna Kumar

2011-08-17

Background: Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi etc, also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is more challenging due to their poor similarity in sequence. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence.Results: In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated by an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches, ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated.Conclusion: BLProt achieves 80% accuracy from training (5 fold cross-validations) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggests that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. 2011 Kandaswamy et al; licensee BioMed Central Ltd.
BLProt: Prediction of bioluminescent proteins based on support vector machine and relieff feature selection

KAUST Repository

Kandaswamy, Krishna Kumar; Pugalenthi, Ganesan; Hazrati, Mehrnaz Khodam; Kalies, Kai-Uwe; Martinetz, Thomas

2011-01-01

Background: Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi etc, also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is more challenging due to their poor similarity in sequence. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence.Results: In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated by an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches, ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated.Conclusion: BLProt achieves 80% accuracy from training (5 fold cross-validations) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggests that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. 2011 Kandaswamy et al; licensee BioMed Central Ltd.
An integration of minimum local feature representation methods to recognize large variation of foods

Science.gov (United States)

Razali, Mohd Norhisham bin; Manshor, Noridayu; Halin, Alfian Abdul; Mustapha, Norwati; Yaakob, Razali

2017-10-01

Local invariant features have shown to be successful in describing object appearances for image classification tasks. Such features are robust towards occlusion and clutter and are also invariant against scale and orientation changes. This makes them suitable for classification tasks with little inter-class similarity and large intra-class difference. In this paper, we propose an integrated representation of the Speeded-Up Robust Feature (SURF) and Scale Invariant Feature Transform (SIFT) descriptors, using late fusion strategy. The proposed representation is used for food recognition from a dataset of food images with complex appearance variations. The Bag of Features (BOF) approach is employed to enhance the discriminative ability of the local features. Firstly, the individual local features are extracted to construct two kinds of visual vocabularies, representing SURF and SIFT. The visual vocabularies are then concatenated and fed into a Linear Support Vector Machine (SVM) to classify the respective food categories. Experimental results demonstrate impressive overall recognition at 82.38% classification accuracy based on the challenging UEC-Food100 dataset.
Integration of internal and external facial features in 8- to 10-year-old children and adults.

Science.gov (United States)

Meinhardt-Injac, Bozana; Persike, Malte; Meinhardt, Günter

2014-06-01

Investigation of whole-part and composite effects in 4- to 6-year-old children gave rise to claims that face perception is fully mature within the first decade of life (Crookes & McKone, 2009). However, only internal features were tested, and the role of external features was not addressed, although external features are highly relevant for holistic face perception (Sinha & Poggio, 1996; Axelrod & Yovel, 2010, 2011). In this study, 8- to 10-year-old children and adults performed a same-different matching task with faces and watches. In this task participants attended to either internal or external features. Holistic face perception was tested using a congruency paradigm, in which face and non-face stimuli either agreed or disagreed in both features (congruent contexts) or just in the attended ones (incongruent contexts). In both age groups, pronounced context congruency and inversion effects were found for faces, but not for watches. These findings indicate holistic feature integration for faces. While inversion effects were highly similar in both age groups, context congruency effects were stronger for children. Moreover, children's face matching performance was generally better when attending to external compared to internal features. Adults tended to perform better when attending to internal features. Our results indicate that both adults and 8- to 10-year-old children integrate external and internal facial features into holistic face representations. However, in children's face representations external features are much more relevant. These findings suggest that face perception is holistic but still not adult-like at the end of the first decade of life. Copyright © 2014 Elsevier B.V. All rights reserved.
The Research and Application of SURF Algorithm Based on Feature Point Selection Algorithm

Directory of Open Access Journals (Sweden)

Zhang Fang Hu

2014-04-01

Full Text Available As the pixel information of depth image is derived from the distance information, when implementing SURF algorithm with KINECT sensor for static sign language recognition, there can be some mismatched pairs in palm area. This paper proposes a feature point selection algorithm, by filtering the SURF feature points step by step based on the number of feature points within adaptive radius r and the distance between the two points, it not only greatly improves the recognition rate, but also ensures the robustness under the environmental factors, such as skin color, illumination intensity, complex background, angle and scale changes. The experiment results show that the improved SURF algorithm can effectively improve the recognition rate, has a good robustness.
A multi-feature integration method for fatigue crack detection and crack length estimation in riveted lap joints using Lamb waves

Science.gov (United States)

He, Jingjing; Guan, Xuefei; Peng, Tishun; Liu, Yongming; Saxena, Abhinav; Celaya, Jose; Goebel, Kai

2013-10-01

This paper presents an experimental study of damage detection and quantification in riveted lap joints. Embedded lead zirconate titanate piezoelectric (PZT) ceramic wafer-type sensors are employed to perform in situ non-destructive evaluation (NDE) during fatigue cyclical loading. PZT wafers are used to monitor the wave reflection from the boundaries of the fatigue crack at the edge of bolt joints. The group velocity of the guided wave is calculated to select a proper time window in which the received signal contains the damage information. It is found that the fatigue crack lengths are correlated with three main features of the signal, i.e., correlation coefficient, amplitude change, and phase change. It was also observed that a single feature cannot be used to quantify the damage among different specimens since a considerable variability was observed in the response from different specimens. A multi-feature integration method based on a second-order multivariate regression analysis is proposed for the prediction of fatigue crack lengths using sensor measurements. The model parameters are obtained using training datasets from five specimens. The effectiveness of the proposed methodology is demonstrated using several lap joint specimens from different manufactures and under different loading conditions.
A multi-feature integration method for fatigue crack detection and crack length estimation in riveted lap joints using Lamb waves

International Nuclear Information System (INIS)

He, Jingjing; Guan, Xuefei; Peng, Tishun; Liu, Yongming; Saxena, Abhinav; Celaya, Jose; Goebel, Kai

2013-01-01

This paper presents an experimental study of damage detection and quantification in riveted lap joints. Embedded lead zirconate titanate piezoelectric (PZT) ceramic wafer-type sensors are employed to perform in situ non-destructive evaluation (NDE) during fatigue cyclical loading. PZT wafers are used to monitor the wave reflection from the boundaries of the fatigue crack at the edge of bolt joints. The group velocity of the guided wave is calculated to select a proper time window in which the received signal contains the damage information. It is found that the fatigue crack lengths are correlated with three main features of the signal, i.e., correlation coefficient, amplitude change, and phase change. It was also observed that a single feature cannot be used to quantify the damage among different specimens since a considerable variability was observed in the response from different specimens. A multi-feature integration method based on a second-order multivariate regression analysis is proposed for the prediction of fatigue crack lengths using sensor measurements. The model parameters are obtained using training datasets from five specimens. The effectiveness of the proposed methodology is demonstrated using several lap joint specimens from different manufactures and under different loading conditions. (paper)
Integrating Multi-omic features exploiting Chromosome Conformation Capture data

Directory of Open Access Journals (Sweden)

Ivan eMerelli

2015-02-01

Full Text Available The representation, integration and interpretation of omic data is a complex task, in particular considering the huge amount of information that is daily produced in molecular biology laboratories all around the world. The reason is that sequencing data regarding expression profiles, methylation patterns, and chromatin domains is difficult to harmonize in a systems biology view, since genome browsers only allow coordinate-based representations, discarding functional clusters created by the spatial conformation of the DNA in the nucleus. In this context, recent progresses in high throughput molecular biology techniques and bioinformatics have provided insights into chromatin interactions on a larger scale and offer a formidable support for the interpretation of multi-omic data. In particular, a novel sequencing technique called Chromosome Conformation Capture (3C allows the analysis of the chromosome organization in the cell’s natural state. While performed genome wide, this technique is usually called Hi-C. Inspired by service applications such as Google Maps, we developed NuChart, an R package that integrates Hi-C data to describe the chromosomal neighbourhood starting from the information about gene positions, with the possibility of mapping on the achieved graphs genomic features such as methylation patterns and histone modifications, along with expression profiles. In this paper we show the importance of the NuChart application for the integration of multi-omic data in a systems biology fashion, with particular interest in cytogenetic applications of these techniques. Moreover, we demonstrate how the integration of multi-omic data can provide useful information in understanding why genes are in certain specific positions inside the nucleus and how epigenetic patterns correlate with their expression.
Organizational contextual features that influence the implementation of evidence-based practices across healthcare settings: a systematic integrative review.

Science.gov (United States)

Li, Shelly-Anne; Jeffs, Lianne; Barwick, Melanie; Stevens, Bonnie

2018-05-05

Organizational contextual features have been recognized as important determinants for implementing evidence-based practices across healthcare settings for over a decade. However, implementation scientists have not reached consensus on which features are most important for implementing evidence-based practices. The aims of this review were to identify the most commonly reported organizational contextual features that influence the implementation of evidence-based practices across healthcare settings, and to describe how these features affect implementation. An integrative review was undertaken following literature searches in CINAHL, MEDLINE, PsycINFO, EMBASE, Web of Science, and Cochrane databases from January 2005 to June 2017. English language, peer-reviewed empirical studies exploring organizational context in at least one implementation initiative within a healthcare setting were included. Quality appraisal of the included studies was performed using the Mixed Methods Appraisal Tool. Inductive content analysis informed data extraction and reduction. The search generated 5152 citations. After removing duplicates and applying eligibility criteria, 36 journal articles were included. The majority (n = 20) of the study designs were qualitative, 11 were quantitative, and 5 used a mixed methods approach. Six main organizational contextual features (organizational culture; leadership; networks and communication; resources; evaluation, monitoring and feedback; and champions) were most commonly reported to influence implementation outcomes in the selected studies across a wide range of healthcare settings. We identified six organizational contextual features that appear to be interrelated and work synergistically to influence the implementation of evidence-based practices within an organization. Organizational contextual features did not influence implementation efforts independently from other features. Rather, features were interrelated and often influenced each
INTEGRATION OF IMAGE-DERIVED AND POS-DERIVED FEATURES FOR IMAGE BLUR DETECTION

Directory of Open Access Journals (Sweden)

T.-A. Teo

2016-06-01

Full Text Available The image quality plays an important role for Unmanned Aerial Vehicle (UAV’s applications. The small fixed wings UAV is suffering from the image blur due to the crosswind and the turbulence. Position and Orientation System (POS, which provides the position and orientation information, is installed onto an UAV to enable acquisition of UAV trajectory. It can be used to calculate the positional and angular velocities when the camera shutter is open. This study proposes a POS-assisted method to detect the blur image. The major steps include feature extraction, blur image detection and verification. In feature extraction, this study extracts different features from images and POS. The image-derived features include mean and standard deviation of image gradient. For POS-derived features, we modify the traditional degree-of-linear-blur (blinear method to degree-of-motion-blur (bmotion based on the collinear condition equations and POS parameters. Besides, POS parameters such as positional and angular velocities are also adopted as POS-derived features. In blur detection, this study uses Support Vector Machines (SVM classifier and extracted features (i.e. image information, POS data, blinear and bmotion to separate blur and sharp UAV images. The experiment utilizes SenseFly eBee UAV system. The number of image is 129. In blur image detection, we use the proposed degree-of-motion-blur and other image features to classify the blur image and sharp images. The classification result shows that the overall accuracy using image features is only 56%. The integration of image-derived and POS-derived features have improved the overall accuracy from 56% to 76% in blur detection. Besides, this study indicates that the performance of the proposed degree-of-motion-blur is better than the traditional degree-of-linear-blur.
Optimum location of external markers using feature selection algorithms for real‐time tumor tracking in external‐beam radiotherapy: a virtual phantom study

Science.gov (United States)

Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin

2016-01-01

In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation‐based feature
An integrated condition-monitoring method for a milling process using reduced decomposition features

International Nuclear Information System (INIS)

Liu, Jie; Wu, Bo; Hu, Youmin; Wang, Yan

2017-01-01

Complex and non-stationary cutting chatter affects productivity and quality in the milling process. Developing an effective condition-monitoring approach is critical to accurately identify cutting chatter. In this paper, an integrated condition-monitoring method is proposed, where reduced features are used to efficiently recognize and classify machine states in the milling process. In the proposed method, vibration signals are decomposed into multiple modes with variational mode decomposition, and Shannon power spectral entropy is calculated to extract features from the decomposed signals. Principal component analysis is adopted to reduce feature size and computational cost. With the extracted feature information, the probabilistic neural network model is used to recognize and classify the machine states, including stable, transition, and chatter states. Experimental studies are conducted, and results show that the proposed method can effectively detect cutting chatter during different milling operation conditions. This monitoring method is also efficient enough to satisfy fast machine state recognition and classification. (paper)
A hybrid approach to select features and classify diseases based on medical data

Science.gov (United States)

AbdelLatif, Hisham; Luo, Jiawei

2018-03-01

Feature selection is popular problem in the classification of diseases in clinical medicine. Here, we developing a hybrid methodology to classify diseases, based on three medical datasets, Arrhythmia, Breast cancer, and Hepatitis datasets. This methodology called k-means ANOVA Support Vector Machine (K-ANOVA-SVM) uses K-means cluster with ANOVA statistical to preprocessing data and selection the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we choice three classification algorithms, decision tree Naïve Bayes, Support Vector Machines and applied the medical datasets direct to these algorithms. Our methodology was a much better classification accuracy is given of 98% in Arrhythmia datasets, 92% in Breast cancer datasets and 88% in Hepatitis datasets, Compare to use the medical data directly with decision tree Naïve Bayes, and Support Vector Machines. Also, the ROC curve and precision with (K-ANOVA-SVM) Achieved best results than other algorithms
Fatigue level estimation of monetary bills based on frequency band acoustic signals with feature selection by supervised SOM

Science.gov (United States)

Teranishi, Masaru; Omatu, Sigeru; Kosaka, Toshihisa

Fatigued monetary bills adversely affect the daily operation of automated teller machines (ATMs). In order to make the classification of fatigued bills more efficient, the development of an automatic fatigued monetary bill classification method is desirable. We propose a new method by which to estimate the fatigue level of monetary bills from the feature-selected frequency band acoustic energy pattern of banking machines. By using a supervised self-organizing map (SOM), we effectively estimate the fatigue level using only the feature-selected frequency band acoustic energy pattern. Furthermore, the feature-selected frequency band acoustic energy pattern improves the estimation accuracy of the fatigue level of monetary bills by adding frequency domain information to the acoustic energy pattern. The experimental results with real monetary bill samples reveal the effectiveness of the proposed method.
CLASSIFICATION OF INFORMAL SETTLEMENTS THROUGH THE INTEGRATION OF 2D AND 3D FEATURES EXTRACTED FROM UAV DATA

Directory of Open Access Journals (Sweden)

C. M. Gevaert

2016-06-01

Full Text Available Unmanned Aerial Vehicles (UAVs are capable of providing very high resolution and up-to-date information to support informal settlement upgrading projects. In order to provide accurate basemaps, urban scene understanding through the identification and classification of buildings and terrain is imperative. However, common characteristics of informal settlements such as small, irregular buildings with heterogeneous roof material and large presence of clutter challenge state-of-the-art algorithms. Especially the dense buildings and steeply sloped terrain cause difficulties in identifying elevated objects. This work investigates how 2D radiometric and textural features, 2.5D topographic features, and 3D geometric features obtained from UAV imagery can be integrated to obtain a high classification accuracy in challenging classification problems for the analysis of informal settlements. It compares the utility of pixel-based and segment-based features obtained from an orthomosaic and DSM with point-based and segment-based features extracted from the point cloud to classify an unplanned settlement in Kigali, Rwanda. Findings show that the integration of 2D and 3D features leads to higher classification accuracies.
License Application Design Selection Feature Report: Aging and Blending

International Nuclear Information System (INIS)

Coltoni, B.; Anderson, M.J.

1999-01-01

The purpose of this document is to evaluate the concepts of Aging and Blending for waste sent to the Monitored Geologic Repository (MGR). These design features are based on pre-emplacement treatment of the waste stream. The envelope of the analysis has been performed under the direction of the License Application Design Selection Team (LADST), which advocated utilizing the Viability Assessment (VA) repository design (DOE 1998c) as the basis. Therefore, this evaluation attempts to modify the VA design only to the extent that Aging and Blending can be accomplished. This modified VA design will be contrasted to the VA Design and the difference in design, costs, and performance will be presented
Effects of changing canopy directional reflectance on feature selection

Science.gov (United States)

Smith, J. A.; Oliver, R. E.; Kilpela, O. E.

1973-01-01

The use of a Monte Carlo model for generating sample directional reflectance data for two simplified target canopies at two different solar positions is reported. Successive iterations through the model permit the calculation of a mean vector and covariance matrix for canopy reflectance for varied sensor view angles. These data may then be used to calculate the divergence between the target distributions for various wavelength combinations and for these view angles. Results of a feature selection analysis indicate that different sets of wavelengths are optimum for target discrimination depending on sensor view angle and that the targets may be more easily discriminated for some scan angles than others. The time-varying behavior of these results is also pointed out.
Topological features of the Sokolov integrable case on the Lie algebra so(3,1)

International Nuclear Information System (INIS)

Novikov, D V

2014-01-01

The integrable Sokolov case on so(3,1) ⋆ is investigated. This is a Hamiltonian system with two degrees of freedom, in which the Hamiltonian and the additional integral are homogeneous polynomials of degrees 2 and 4, respectively. It is an interesting feature of this system that connected components of common level surfaces of the Hamiltonian and the additional integral turn out to be noncompact. The critical points of the moment map and their indices are found, the bifurcation diagram is constructed, and the topology of noncompact level surfaces is determined, that is, the closures of solutions of the Sokolov system on so(3,1) are described. Bibliography: 24 titles
Sequential Modulations in a Combined Horizontal and Vertical Simon Task: Is There ERP Evidence for Feature Integration Effects?

Science.gov (United States)

Hoppe, Katharina; Küper, Kristina; Wascher, Edmund

2017-01-01

In the Simon task, participants respond faster when the task-irrelevant stimulus position and the response position are corresponding, for example on the same side, compared to when they have a non-corresponding relation. Interestingly, this Simon effect is reduced after non-corresponding trials. Such sequential effects can be explained in terms of a more focused processing of the relevant stimulus dimension due to increased cognitive control, which transfers from the previous non-corresponding trial (conflict adaptation effects). Alternatively, sequential modulations of the Simon effect can also be due to the degree of trial-to-trial repetitions and alternations of task features, which is confounded with the correspondence sequence (feature integration effects). In the present study, we used a spatially two-dimensional Simon task with vertical response keys to examine the contribution of adaptive cognitive control and feature integration processes to the sequential modulation of the Simon effect. The two-dimensional Simon task creates correspondences in the vertical as well as in the horizontal dimension. A trial-by-trial alternation of the spatial dimension, for example from a vertical to a horizontal stimulus presentation, generates a subset containing no complete repetitions of task features, but only complete alternations and partial repetitions, which are equally distributed over all correspondence sequences. In line with the assumed feature integration effects, we found sequential modulations of the Simon effect only when the spatial dimension repeated. At least for the horizontal dimension, this pattern was confirmed by the parietal P3b, an event-related potential that is assumed to reflect stimulus-response link processes. Contrary to conflict adaptation effects, cognitive control, measured by the fronto-central N2 component of the EEG, was not sequentially modulated. Overall, our data provide behavioral as well as electrophysiological evidence for feature

Features of selection of children for occupations by artistic gymnastics in modern Kurdistan

Directory of Open Access Journals (Sweden)

Abdulvahid Dlshad Nihad

2015-10-01

Full Text Available Purpose: to study the organizational and pedagogical conditions of selection of children for occupations existing in the republic Kurdistan artistic gymnastics Material and Methods: questioning of 24 trainers on artistic gymnastics and experts in physical culture of the republic Kurdistan was carried out. The general questions of selection and methodical features of selection of children for occupations by artistic gymnastics in Kurdistan were studied. Results: questioning revealed absence of the general approved tests and scientific recommendations concerning their use, dependence of quality of selection on experience of the trainer. Conclusions: experts in the field of physical culture and sport consider inefficient the existing system of selection of children for occupations artistic gymnastics in Kurdistan; gymnastics coaches consider necessary testing’s at children of a level of development of flexibility, dexterity, abilities to manifestation of dynamic force and preservation of dynamic balance
Manifold regularized multi-task feature selection for multi-modality classification in Alzheimer's disease.

Science.gov (United States)

Jie, Biao; Zhang, Daoqiang; Cheng, Bo; Shen, Dinggang

2013-01-01

Accurate diagnosis of Alzheimer's disease (AD), as well as its prodromal stage (i.e., mild cognitive impairment, MCI), is very important for possible delay and early treatment of the disease. Recently, multi-modality methods have been used for fusing information from multiple different and complementary imaging and non-imaging modalities. Although there are a number of existing multi-modality methods, few of them have addressed the problem of joint identification of disease-related brain regions from multi-modality data for classification. In this paper, we proposed a manifold regularized multi-task learning framework to jointly select features from multi-modality data. Specifically, we formulate the multi-modality classification as a multi-task learning framework, where each task focuses on the classification based on each modality. In order to capture the intrinsic relatedness among multiple tasks (i.e., modalities), we adopted a group sparsity regularizer, which ensures only a small number of features to be selected jointly. In addition, we introduced a new manifold based Laplacian regularization term to preserve the geometric distribution of original data from each task, which can lead to the selection of more discriminative features. Furthermore, we extend our method to the semi-supervised setting, which is very important since the acquisition of a large set of labeled data (i.e., diagnosis of disease) is usually expensive and time-consuming, while the collection of unlabeled data is relatively much easier. To validate our method, we have performed extensive evaluations on the baseline Magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) data of Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Our experimental results demonstrate the effectiveness of the proposed method.
An integrated approach towards future ballistic neck protection materials selection.

Science.gov (United States)

Breeze, John; Helliker, Mark; Carr, Debra J

2013-05-01

Ballistic protection for the neck has historically taken the form of collars attached to the ballistic vest (removable or fixed), but other approaches, including the development of prototypes incorporating ballistic material into the collar of an under body armour shirt, are now being investigated. Current neck collars incorporate the same ballistic protective fabrics as the soft armour of the remaining vest, reflecting how ballistic protective performance alone has historically been perceived as the most important property for neck protection. However, the neck has fundamental differences from the thorax in terms of anatomical vulnerability, flexibility and equipment integration, necessitating a separate solution from the thorax in terms of optimal materials selection. An integrated approach towards the selection of the most appropriate combination of materials to be used for each of the two potential designs of future neck protection has been developed. This approach requires evaluation of the properties of each potential material in addition to ballistic performance alone, including flexibility, mass, wear resistance and thermal burden. The aim of this article is to provide readers with an overview of this integrated approach towards ballistic materials selection and an update of its current progress in the development of future ballistic neck protection.
A Hybrid Feature Subset Selection Algorithm for Analysis of High Correlation Proteomic Data

Science.gov (United States)

Kordy, Hussain Montazery; Baygi, Mohammad Hossein Miran; Moradi, Mohammad Hassan

2012-01-01

Pathological changes within an organ can be reflected as proteomic patterns in biological fluids such as plasma, serum, and urine. The surface-enhanced laser desorption and ionization time-of-flight mass spectrometry (SELDI-TOF MS) has been used to generate proteomic profiles from biological fluids. Mass spectrometry yields redundant noisy data that the most data points are irrelevant features for differentiating between cancer and normal cases. In this paper, we have proposed a hybrid feature subset selection algorithm based on maximum-discrimination and minimum-correlation coupled with peak scoring criteria. Our algorithm has been applied to two independent SELDI-TOF MS datasets of ovarian cancer obtained from the NCI-FDA clinical proteomics databank. The proposed algorithm has used to extract a set of proteins as potential biomarkers in each dataset. We applied the linear discriminate analysis to identify the important biomarkers. The selected biomarkers have been able to successfully diagnose the ovarian cancer patients from the noncancer control group with an accuracy of 100%, a sensitivity of 100%, and a specificity of 100% in the two datasets. The hybrid algorithm has the advantage that increases reproducibility of selected biomarkers and able to find a small set of proteins with high discrimination power. PMID:23717808
MULTI-SCALE SEGMENTATION OF HIGH RESOLUTION REMOTE SENSING IMAGES BY INTEGRATING MULTIPLE FEATURES

Directory of Open Access Journals (Sweden)

Y. Di

2017-05-01

Full Text Available Most of multi-scale segmentation algorithms are not aiming at high resolution remote sensing images and have difficulty to communicate and use layers’ information. In view of them, we proposes a method of multi-scale segmentation of high resolution remote sensing images by integrating multiple features. First, Canny operator is used to extract edge information, and then band weighted distance function is built to obtain the edge weight. According to the criterion, the initial segmentation objects of color images can be gained by Kruskal minimum spanning tree algorithm. Finally segmentation images are got by the adaptive rule of Mumford–Shah region merging combination with spectral and texture information. The proposed method is evaluated precisely using analog images and ZY-3 satellite images through quantitative and qualitative analysis. The experimental results show that the multi-scale segmentation of high resolution remote sensing images by integrating multiple features outperformed the software eCognition fractal network evolution algorithm (highest-resolution network evolution that FNEA on the accuracy and slightly inferior to FNEA on the efficiency.
Integral criterion for selecting nonlinear crystals for frequency conversion

International Nuclear Information System (INIS)

Grechin, Sergei G

2009-01-01

An integral criterion, which takes into account all parameters determining the conversion efficiency, is offered for selecting nonlinear crystals for frequency conversion. The angular phase-matching width is shown to be related to the beam walk-off angle. (nonlinear optical phenomena)
Technique for selection of transient radiation-hard junction-isolated integrated circuits

International Nuclear Information System (INIS)

Crowley, J.L.; Junga, F.A.; Stultz, T.J.

1976-01-01

A technique is presented which demonstrates the feasibility of selecting junction-isolated integrated circuits (JI/ICS) for use in transient radiation environments. The procedure guarantees that all PNPN paths within the integrated circuit are identified and describes the methods used to determine whether the paths represent latchup susceptible structures. Two examples of the latchup analysis are given involving an SSI and an LSI bipolar junction-isolated integrated circuit
Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

Science.gov (United States)

2013-01-01

Background Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good for handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for the most skilful clinician to come out with an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results In the first stage of this research, five feature selection methods have been proposed and experimented on the oral cancer prognosis dataset. In the second stage, the model with the features selected from each feature selection methods are tested on the proposed classifiers. Four types of classifiers are chosen; these are namely, ANFIS, artificial neural network, support vector machine and logistic regression. A k-fold cross-validation is implemented on all types of classifiers due to the small sample size. The hybrid model of ReliefF-GA-ANFIS with 3-input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for the oral cancer prognosis. Conclusions The results revealed that the prognosis is superior with the presence of both clinicopathologic and genomic markers. The selected features can be investigated further to validate the potential of becoming as significant prognostic signature in the oral cancer studies. PMID:23725313
Computer aided diagnosis system for Alzheimer disease using brain diffusion tensor imaging features selected by Pearson's correlation.

Science.gov (United States)

Graña, M; Termenon, M; Savio, A; Gonzalez-Pinto, A; Echeveste, J; Pérez, J M; Besga, A

2011-09-20

The aim of this paper is to obtain discriminant features from two scalar measures of Diffusion Tensor Imaging (DTI) data, Fractional Anisotropy (FA) and Mean Diffusivity (MD), and to train and test classifiers able to discriminate Alzheimer's Disease (AD) patients from controls on the basis of features extracted from the FA or MD volumes. In this study, support vector machine (SVM) classifier was trained and tested on FA and MD data. Feature selection is done computing the Pearson's correlation between FA or MD values at voxel site across subjects and the indicative variable specifying the subject class. Voxel sites with high absolute correlation are selected for feature extraction. Results are obtained over an on-going study in Hospital de Santiago Apostol collecting anatomical T1-weighted MRI volumes and DTI data from healthy control subjects and AD patients. FA features and a linear SVM classifier achieve perfect accuracy, sensitivity and specificity in several cross-validation studies, supporting the usefulness of DTI-derived features as an image-marker for AD and to the feasibility of building Computer Aided Diagnosis systems for AD based on them. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Performance characteristics of selected integrating ion chambers

International Nuclear Information System (INIS)

Lubenau, J.O.; Liberace, R.

1977-01-01

Certain types of integrating ion chambers have been identified as acceptable equipment for a nationwide medical X-ray exposure survey program. In this study, Victoreen 2.5, 5 and 10 R condenser R-chambers, the Victoreen 666 Diagnostic Probe (used in the integrating mode) and the Bendix 200 mR and 5 R low energy dosimeters were evaluated for recombination losses and for energy dependence. Recombination losses were determined for exposure rates ranging from 0.3 to 80 R/sec. Energy dependence was determined for X-ray beam qualities ranging from 45 kVp and 0.83 mm Al first half value layer to 125 kVp and 4.8 mm Al first half value layer. The data enable selection of instruments so that errors from recombination losses and energy dependence can be minimized. (author)
Is the emotion recognition deficit associated with frontotemporal dementia caused by selective inattention to diagnostic facial features?

Science.gov (United States)

Oliver, Lindsay D; Virani, Karim; Finger, Elizabeth C; Mitchell, Derek G V

2014-07-01

Frontotemporal dementia (FTD) is a debilitating neurodegenerative disorder characterized by severely impaired social and emotional behaviour, including emotion recognition deficits. Though fear recognition impairments seen in particular neurological and developmental disorders can be ameliorated by reallocating attention to critical facial features, the possibility that similar benefits can be conferred to patients with FTD has yet to be explored. In the current study, we examined the impact of presenting distinct regions of the face (whole face, eyes-only, and eyes-removed) on the ability to recognize expressions of anger, fear, disgust, and happiness in 24 patients with FTD and 24 healthy controls. A recognition deficit was demonstrated across emotions by patients with FTD relative to controls. Crucially, removal of diagnostic facial features resulted in an appropriate decline in performance for both groups; furthermore, patients with FTD demonstrated a lack of disproportionate improvement in emotion recognition accuracy as a result of isolating critical facial features relative to controls. Thus, unlike some neurological and developmental disorders featuring amygdala dysfunction, the emotion recognition deficit observed in FTD is not likely driven by selective inattention to critical facial features. Patients with FTD also mislabelled negative facial expressions as happy more often than controls, providing further evidence for abnormalities in the representation of positive affect in FTD. This work suggests that the emotional expression recognition deficit associated with FTD is unlikely to be rectified by adjusting selective attention to diagnostic features, as has proven useful in other select disorders. Copyright © 2014 Elsevier Ltd. All rights reserved.
A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection.

Science.gov (United States)

Ceccarelli, Michele; d'Acierno, Antonio; Facchiano, Angelo

2009-10-15

Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at the realization of efficient predictive and screening protocols. With this kind of data dimensions and samples size the risk of over-fitting and selection bias is pervasive. Therefore the development of bio-informatics methods based on unsupervised feature extraction can lead to general tools which can be applied to several fields of predictive proteomics. We propose a method for feature selection and extraction grounded on the theory of multi-scale spaces for high resolution spectra derived from analysis of serum. Then we use support vector machines for classification. In particular we use a database containing 216 samples spectra divided in 115 cancer and 91 control samples. The overall accuracy averaged over a large cross validation study is 98.18. The area under the ROC curve of the best selected model is 0.9962. We improved previous known results on the problem on the same data, with the advantage that the proposed method has an unsupervised feature selection phase. All the developed code, as MATLAB scripts, can be downloaded from http://medeaserver.isa.cnr.it/dacierno/spectracode.htm.
Temperature Characteristics of Monolithically Integrated Wavelength-Selectable Light Sources

International Nuclear Information System (INIS)

Han Liang-Shun; Zhu Hong-Liang; Zhang Can; Ma Li; Liang Song; Wang Wei

2013-01-01

The temperature characteristics of monolithically integrated wavelength-selectable light sources are experimentally investigated. The wavelength-selectable light sources consist of four distributed feedback (DFB) lasers, a multimode interferometer coupler, and a semiconductor optical amplifier. The oscillating wavelength of the DFB laser could be modulated by adjusting the device operating temperature. A wavelength range covering over 8.0nm is obtained with stable single-mode operation by selecting the appropriate laser and chip temperature. The thermal crosstalk caused by the lateral heat spreading between lasers operating simultaneously is evaluated by oscillating-wavelength shift. The thermal crosstalk approximately decreases exponentially as the increasing distance between lasers
A data-driven multi-model methodology with deep feature selection for short-term wind forecasting

International Nuclear Information System (INIS)

Feng, Cong; Cui, Mingjian; Hodge, Bri-Mathias; Zhang, Jie

2017-01-01

Highlights: • An ensemble model is developed to produce both deterministic and probabilistic wind forecasts. • A deep feature selection framework is developed to optimally determine the inputs to the forecasting methodology. • The developed ensemble methodology has improved the forecasting accuracy by up to 30%. - Abstract: With the growing wind penetration into the power system worldwide, improving wind power forecasting accuracy is becoming increasingly important to ensure continued economic and reliable power system operations. In this paper, a data-driven multi-model wind forecasting methodology is developed with a two-layer ensemble machine learning technique. The first layer is composed of multiple machine learning models that generate individual forecasts. A deep feature selection framework is developed to determine the most suitable inputs to the first layer machine learning models. Then, a blending algorithm is applied in the second layer to create an ensemble of the forecasts produced by first layer models and generate both deterministic and probabilistic forecasts. This two-layer model seeks to utilize the statistically different characteristics of each machine learning algorithm. A number of machine learning algorithms are selected and compared in both layers. This developed multi-model wind forecasting methodology is compared to several benchmarks. The effectiveness of the proposed methodology is evaluated to provide 1-hour-ahead wind speed forecasting at seven locations of the Surface Radiation network. Numerical results show that comparing to the single-algorithm models, the developed multi-model framework with deep feature selection procedure has improved the forecasting accuracy by up to 30%.
Effective automated feature construction and selection for classification of biological sequences.

Directory of Open Access Journals (Sweden)

Uday Kamath

Full Text Available Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features.We present an algorithmic framework (EFFECT for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not.To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or modification
Rolling Bearing Fault Diagnosis Using Modified Neighborhood Preserving Embedding and Maximal Overlap Discrete Wavelet Packet Transform with Sensitive Features Selection

Directory of Open Access Journals (Sweden)

Fei Dong

2018-01-01

Full Text Available In order to enhance the performance of bearing fault diagnosis and classification, features extraction and features dimensionality reduction have become more important. The original statistical feature set was calculated from single branch reconstruction vibration signals obtained by using maximal overlap discrete wavelet packet transform (MODWPT. In order to reduce redundancy information of original statistical feature set, features selection by adjusted rand index and sum of within-class mean deviations (FSASD was proposed to select fault sensitive features. Furthermore, a modified features dimensionality reduction method, supervised neighborhood preserving embedding with label information (SNPEL, was proposed to realize low-dimensional representations for high-dimensional feature space. Finally, vibration signals collected from two experimental test rigs were employed to evaluate the performance of the proposed procedure. The results show that the effectiveness, adaptability, and superiority of the proposed procedure can serve as an intelligent bearing fault diagnosis system.
Teaching the Sociology of Popular Music with the Help of Feature Films: A Selected and Annotated Videography.

Science.gov (United States)

Groce, Stephen B.

1992-01-01

Discusses the use of feature films for courses on popular culture and the sociology of popular music. Suggests that films can illustrate topics such as culture, social groups, deviant behavior, racism, and sexism. Lists a selection of Hollywood feature films with accompanying readings and students' evaluations. (DK)
Pattern Classification Using an Olfactory Model with PCA Feature Selection in Electronic Noses: Study and Application

Directory of Open Access Journals (Sweden)

Junbao Zheng

2012-03-01

Full Text Available Biologically-inspired models and algorithms are considered as promising sensor array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper describes an investigation into the classification performance of a bionic olfactory model with the increase of the dimensions of input feature vector (outer factor as well as its parallel channels (inner factor. The principal component analysis technique was applied for feature selection and dimension reduction. Two data sets of three classes of wine derived from different cultivars and five classes of green tea derived from five different provinces of China were used for experiments. In the former case the results showed that the average correct classification rate increased as more principal components were put in to feature vector. In the latter case the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We concluded that 6~8 channels of the model with principal component feature vector values of at least 90% cumulative variance is adequate for a classification task of 3~5 pattern classes considering the trade-off between time consumption and classification rate.
Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods.

Science.gov (United States)

Liang, Ja-Der; Ping, Xiao-Ou; Tseng, Yi-Ju; Huang, Guan-Tarn; Lai, Feipei; Yang, Pei-Ming

2014-12-01

Recurrence of hepatocellular carcinoma (HCC) is an important issue despite effective treatments with tumor eradication. Identification of patients who are at high risk for recurrence may provide more efficacious screening and detection of tumor recurrence. The aim of this study was to develop recurrence predictive models for HCC patients who received radiofrequency ablation (RFA) treatment. From January 2007 to December 2009, 83 newly diagnosed HCC patients receiving RFA as their first treatment were enrolled. Five feature selection methods including genetic algorithm (GA), simulated annealing (SA) algorithm, random forests (RF) and hybrid methods (GA+RF and SA+RF) were utilized for selecting an important subset of features from a total of 16 clinical features. These feature selection methods were combined with support vector machine (SVM) for developing predictive models with better performance. Five-fold cross-validation was used to train and test SVM models. The developed SVM-based predictive models with hybrid feature selection methods and 5-fold cross-validation had averages of the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and area under the ROC curve as 67%, 86%, 82%, 69%, 90%, and 0.69, respectively. The SVM derived predictive model can provide suggestive high-risk recurrent patients, who should be closely followed up after complete RFA treatment. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Gene features selection for three-class disease classification via multiple orthogonal partial least square discriminant analysis and S-plot using microarray data.

Science.gov (United States)

Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu

2013-01-01

DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.

Study of Machine-Learning Classifier and Feature Set Selection for Intent Classification of Korean Tweets about Food Safety

Directory of Open Access Journals (Sweden)

Yeom, Ha-Neul

2014-09-01

Full Text Available In recent years, several studies have proposed making use of the Twitter micro-blogging service to track various trends in online media and discussion. In this study, we specifically examine the use of Twitter to track discussions of food safety in the Korean language. Given the irregularity of keyword use in most tweets, we focus on optimistic machine-learning and feature set selection to classify collected tweets. We build the classifier model using Naive Bayes & Naive Bayes Multinomial, Support Vector Machine, and Decision Tree Algorithms, all of which show good performance. To select an optimum feature set, we construct a basic feature set as a standard for performance comparison, so that further test feature sets can be evaluated. Experiments show that precision and F-measure performance are best when using a Naive Bayes Multinomial classifier model with a test feature set defined by extracting Substantive, Predicate, Modifier, and Interjection parts of speech.
The application of feature selection to the development of Gaussian process models for percutaneous absorption.

Science.gov (United States)

Lam, Lun Tak; Sun, Yi; Davey, Neil; Adams, Rod; Prapopoulou, Maria; Brown, Marc B; Moss, Gary P

2010-06-01

The aim was to employ Gaussian processes to assess mathematically the nature of a skin permeability dataset and to employ these methods, particularly feature selection, to determine the key physicochemical descriptors which exert the most significant influence on percutaneous absorption, and to compare such models with established existing models. Gaussian processes, including automatic relevance detection (GPRARD) methods, were employed to develop models of percutaneous absorption that identified key physicochemical descriptors of percutaneous absorption. Using MatLab software, the statistical performance of these models was compared with single linear networks (SLN) and quantitative structure-permeability relationships (QSPRs). Feature selection methods were used to examine in more detail the physicochemical parameters used in this study. A range of statistical measures to determine model quality were used. The inherently nonlinear nature of the skin data set was confirmed. The Gaussian process regression (GPR) methods yielded predictive models that offered statistically significant improvements over SLN and QSPR models with regard to predictivity (where the rank order was: GPR > SLN > QSPR). Feature selection analysis determined that the best GPR models were those that contained log P, melting point and the number of hydrogen bond donor groups as significant descriptors. Further statistical analysis also found that great synergy existed between certain parameters. It suggested that a number of the descriptors employed were effectively interchangeable, thus questioning the use of models where discrete variables are output, usually in the form of an equation. The use of a nonlinear GPR method produced models with significantly improved predictivity, compared with SLN or QSPR models. Feature selection methods were able to provide important mechanistic information. However, it was also shown that significant synergy existed between certain parameters, and as such it
Spatial attention enhances the selective integration of activity from area MT.

Science.gov (United States)

Masse, Nicolas Y; Herrington, Todd M; Cook, Erik P

2012-09-01

Distinguishing which of the many proposed neural mechanisms of spatial attention actually underlies behavioral improvements in visually guided tasks has been difficult. One attractive hypothesis is that attention allows downstream neural circuits to selectively integrate responses from the most informative sensory neurons. This would allow behavioral performance to be based on the highest-quality signals available in visual cortex. We examined this hypothesis by asking how spatial attention affects both the stimulus sensitivity of middle temporal (MT) neurons and their corresponding correlation with behavior. Analyzing a data set pooled from two experiments involving four monkeys, we found that spatial attention did not appreciably affect either the stimulus sensitivity of the neurons or the correlation between their activity and behavior. However, for those sessions in which there was a robust behavioral effect of attention, focusing attention inside the neuron's receptive field significantly increased the correlation between these two metrics, an indication of selective integration. These results suggest that, similar to mechanisms proposed for the neural basis of perceptual learning, the behavioral benefits of focusing spatial attention are attributable to selective integration of neural activity from visual cortical areas by their downstream targets.
Multimodal Feature Integration in the Angular Gyrus during Episodic and Semantic Retrieval

Science.gov (United States)

Bonnici, Heidi M.; Richter, Franziska R.; Yazar, Yasemin

2016-01-01

Much evidence from distinct lines of investigation indicates the involvement of angular gyrus (AnG) in the retrieval of both episodic and semantic information, but the region's precise function and whether that function differs across episodic and semantic retrieval have yet to be determined. We used univariate and multivariate fMRI analysis methods to examine the role of AnG in multimodal feature integration during episodic and semantic retrieval. Human participants completed episodic and semantic memory tasks involving unimodal (auditory or visual) and multimodal (audio-visual) stimuli. Univariate analyses revealed the recruitment of functionally distinct AnG subregions during the retrieval of episodic and semantic information. Consistent with a role in multimodal feature integration during episodic retrieval, significantly greater AnG activity was observed during retrieval of integrated multimodal episodic memories compared with unimodal episodic memories. Multivariate classification analyses revealed that individual multimodal episodic memories could be differentiated in AnG, with classification accuracy tracking the vividness of participants' reported recollections, whereas distinct unimodal memories were represented in sensory association areas only. In contrast to episodic retrieval, AnG was engaged to a statistically equivalent degree during retrieval of unimodal and multimodal semantic memories, suggesting a distinct role for AnG during semantic retrieval. Modality-specific sensory association areas exhibited corresponding activity during both episodic and semantic retrieval, which mirrored the functional specialization of these regions during perception. The results offer new insights into the integrative processes subserved by AnG and its contribution to our subjective experience of remembering. SIGNIFICANCE STATEMENT Using univariate and multivariate fMRI analyses, we provide evidence that functionally distinct subregions of angular gyrus (An
Multimodal Feature Integration in the Angular Gyrus during Episodic and Semantic Retrieval.

Science.gov (United States)

Bonnici, Heidi M; Richter, Franziska R; Yazar, Yasemin; Simons, Jon S

2016-05-18

Much evidence from distinct lines of investigation indicates the involvement of angular gyrus (AnG) in the retrieval of both episodic and semantic information, but the region's precise function and whether that function differs across episodic and semantic retrieval have yet to be determined. We used univariate and multivariate fMRI analysis methods to examine the role of AnG in multimodal feature integration during episodic and semantic retrieval. Human participants completed episodic and semantic memory tasks involving unimodal (auditory or visual) and multimodal (audio-visual) stimuli. Univariate analyses revealed the recruitment of functionally distinct AnG subregions during the retrieval of episodic and semantic information. Consistent with a role in multimodal feature integration during episodic retrieval, significantly greater AnG activity was observed during retrieval of integrated multimodal episodic memories compared with unimodal episodic memories. Multivariate classification analyses revealed that individual multimodal episodic memories could be differentiated in AnG, with classification accuracy tracking the vividness of participants' reported recollections, whereas distinct unimodal memories were represented in sensory association areas only. In contrast to episodic retrieval, AnG was engaged to a statistically equivalent degree during retrieval of unimodal and multimodal semantic memories, suggesting a distinct role for AnG during semantic retrieval. Modality-specific sensory association areas exhibited corresponding activity during both episodic and semantic retrieval, which mirrored the functional specialization of these regions during perception. The results offer new insights into the integrative processes subserved by AnG and its contribution to our subjective experience of remembering. Using univariate and multivariate fMRI analyses, we provide evidence that functionally distinct subregions of angular gyrus (AnG) contribute to the retrieval of
The Influence of Selected Personality and Workplace Features on Burnout among Nurse Academics

Science.gov (United States)

Kizilci, Sevgi; Erdogan, Vesile; Sozen, Emine

2012-01-01

This study aimed to determine the influence of selected individual and situational features on burnout among nurse academics. The Maslach Burnout Inventory was used to assess the burnout levels of academics. The sample population comprised 94 female participant. The emotion exhaustion (EE) score of the nurse academics was 16.43[plus or minus]5.97,…
Influence of Feature Selection Methods on Classification Sensitivity Based on the Example of A Study of Polish Voivodship Tourist Attractiveness

Directory of Open Access Journals (Sweden)

Bąk Iwona

2014-07-01

Full Text Available The purpose of this article is to determine the influence of various methods of selection of diagnostic features on the sensitivity of classification. Three options of feature selection are presented: a parametric feature selection method with a sum (option I, a median of the correlation coefficients matrix column elements (option II and the method of a reversed matrix (option III. Efficiency of the groupings was verified by the indicators of homogeneity, heterogeneity and the correctness of grouping. In the assessment of group efficiency the approach with the Weber median was used. The undertaken problem was illustrated with a research into the tourist attractiveness of voivodships in Poland in 2011.
Feature Selection and Blind Source Separation in an EEG-Based Brain-Computer Interface

Directory of Open Access Journals (Sweden)

Michael H. Thaut

2005-11-01

Full Text Available Most EEG-based BCI systems make use of well-studied patterns of brain activity. However, those systems involve tasks that indirectly map to simple binary commands such as Ã¢Â€ÂœyesÃ¢Â€Â or Ã¢Â€ÂœnoÃ¢Â€Â or require many weeks of biofeedback training. We hypothesized that signal processing and machine learning methods can be used to discriminate EEG in a direct Ã¢Â€ÂœyesÃ¢Â€Â/Ã¢Â€ÂœnoÃ¢Â€Â BCI from a single session. Blind source separation (BSS and spectral transformations of the EEG produced a 180-dimensional feature space. We used a modified genetic algorithm (GA wrapped around a support vector machine (SVM classifier to search the space of feature subsets. The GA-based search found feature subsets that outperform full feature sets and random feature subsets. Also, BSS transformations of the EEG outperformed the original time series, particularly in conjunction with a subset search of both spaces. The results suggest that BSS and feature selection can be used to improve the performance of even a Ã¢Â€Âœdirect,Ã¢Â€Â single-session BCI.
Functional connectivity supporting the selective maintenance of feature-location binding in visual working memory

Directory of Open Access Journals (Sweden)

Sachiko eTakahama

2014-06-01

Full Text Available Information on an object’s features bound to its location is very important for maintaining object representations in visual working memory. Interactions with dynamic multi-dimensional objects in an external environment require complex cognitive control, including the selective maintenance of feature-location binding. Here, we used event-related functional magnetic resonance imaging to investigate brain activity and functional connectivity related to the maintenance of complex feature-location binding. Participants were required to detect task-relevant changes in feature-location binding between objects defined by color, orientation, and location. We compared a complex binding task requiring complex feature-location binding (color-orientation-location with a simple binding task in which simple feature-location binding, such as color-location, was task-relevant and the other feature was task-irrelevant. Univariate analyses showed that the dorsolateral prefrontal cortex (DLPFC, hippocampus, and frontoparietal network were activated during the maintenance of complex feature-location binding. Functional connectivity analyses indicated cooperation between the inferior precentral sulcus (infPreCS, DLPFC, and hippocampus during the maintenance of complex feature-location binding. In contrast, the connectivity for the spatial updating of simple feature-location binding determined by reanalyzing the data from Takahama et al. (2010 demonstrated that the superior parietal lobule (SPL cooperated with the DLPFC and hippocampus. These results suggest that the connectivity for complex feature-location binding does not simply reflect general memory load and that the DLPFC and hippocampus flexibly modulate the dorsal frontoparietal network, depending on the task requirements, with the infPreCS involved in the maintenance of complex feature-location binding and the SPL involved in the spatial updating of simple feature-location binding.
Feature selection using angle modulated simulated Kalman filter for peak classification of EEG signals.

Science.gov (United States)

Adam, Asrul; Ibrahim, Zuwairie; Mokhtar, Norrima; Shapiai, Mohd Ibrahim; Mubin, Marizan; Saad, Ismail

2016-01-01

In the existing electroencephalogram (EEG) signals peak classification research, the existing models, such as Dumpala, Acir, Liu, and Dingle peak models, employ different set of features. However, all these models may not be able to offer good performance for various applications and it is found to be problem dependent. Therefore, the objective of this study is to combine all the associated features from the existing models before selecting the best combination of features. A new optimization algorithm, namely as angle modulated simulated Kalman filter (AMSKF) will be employed as feature selector. Also, the neural network random weight method is utilized in the proposed AMSKF technique as a classifier. In the conducted experiment, 11,781 samples of peak candidate are employed in this study for the validation purpose. The samples are collected from three different peak event-related EEG signals of 30 healthy subjects; (1) single eye blink, (2) double eye blink, and (3) eye movement signals. The experimental results have shown that the proposed AMSKF feature selector is able to find the best combination of features and performs at par with the existing related studies of epileptic EEG events classification.
Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

Energy Technology Data Exchange (ETDEWEB)

Ahmed, M; Gu, F; Ball, A, E-mail: M.Ahmed@hud.ac.uk [Diagnostic Engineering Research Group, University of Huddersfield, HD1 3DH (United Kingdom)

2011-07-19

Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.
Feature Selection and Fault Classification of Reciprocating Compressors using a Genetic Algorithm and a Probabilistic Neural Network

International Nuclear Information System (INIS)

Ahmed, M; Gu, F; Ball, A

2011-01-01

Reciprocating compressors are widely used in industry for various purposes and faults occurring in them can degrade their performance, consume additional energy and even cause severe damage to the machine. Vibration monitoring techniques are often used for early fault detection and diagnosis, but it is difficult to prescribe a given set of effective diagnostic features because of the wide variety of operating conditions and the complexity of the vibration signals which originate from the many different vibrating and impact sources. This paper studies the use of genetic algorithms (GAs) and neural networks (NNs) to select effective diagnostic features for the fault diagnosis of a reciprocating compressor. A large number of common features are calculated from the time and frequency domains and envelope analysis. Applying GAs and NNs to these features found that envelope analysis has the most potential for differentiating three common faults: valve leakage, inter-cooler leakage and a loose drive belt. Simultaneously, the spread parameter of the probabilistic NN was also optimised. The selected subsets of features were examined based on vibration source characteristics. The approach developed and the trained NN are confirmed as possessing general characteristics for fault detection and diagnosis.
Virtual Ribosome - a comprehensive DNA translation tool with support for integration of sequence feature annotation

DEFF Research Database (Denmark)

Wernersson, Rasmus

2006-01-01

of alternative start codons. ( ii) Integration of sequences feature annotation - in particular, native support for working with files containing intron/ exon structure annotation. The software is available for both download and online use at http://www.cbs.dtu.dk/services/VirtualRibosome/....
A Modified Feature Selection and Artificial Neural Network-Based Day-Ahead Load Forecasting Model for a Smart Grid

Directory of Open Access Journals (Sweden)

Ashfaq Ahmad

2015-12-01

Full Text Available In the operation of a smart grid (SG, day-ahead load forecasting (DLF is an important task. The SG can enhance the management of its conventional and renewable resources with a more accurate DLF model. However, DLF model development is highly challenging due to the non-linear characteristics of load time series in SGs. In the literature, DLF models do exist; however, these models trade off between execution time and forecast accuracy. The newly-proposed DLF model will be able to accurately predict the load of the next day with a fair enough execution time. Our proposed model consists of three modules; the data preparation module, feature selection and the forecast module. The first module makes the historical load curve compatible with the feature selection module. The second module removes redundant and irrelevant features from the input data. The third module, which consists of an artificial neural network (ANN, predicts future load on the basis of selected features. Moreover, the forecast module uses a sigmoid function for activation and a multi-variate auto-regressive model for weight updating during the training process. Simulations are conducted in MATLAB to validate the performance of our newly-proposed DLF model in terms of accuracy and execution time. Results show that our proposed modified feature selection and modified ANN (m(FS + ANN-based model for SGs is able to capture the non-linearity(ies in the history load curve with 97 . 11 % accuracy. Moreover, this accuracy is achieved at the cost of a fair enough execution time, i.e., we have decreased the average execution time of the existing FS + ANN-based model by 38 . 50 % .
Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

Science.gov (United States)

Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

2014-01-01

Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a high range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled independently, the dimensions in these dimensional spaces. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages in modeling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include an stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are more informative into the arousal and valence space discrimination are the EEG, Electrooculogram/Electromiogram (EOG/EMG) and the Galvanic Skin Response (GSR).
Manifold Regularized Multi-Task Feature Selection for Multi-Modality Classification in Alzheimer’s Disease

Science.gov (United States)

Jie, Biao; Cheng, Bo

2014-01-01

Accurate diagnosis of Alzheimer’s disease (AD), as well as its pro-dromal stage (i.e., mild cognitive impairment, MCI), is very important for possible delay and early treatment of the disease. Recently, multi-modality methods have been used for fusing information from multiple different and complementary imaging and non-imaging modalities. Although there are a number of existing multi-modality methods, few of them have addressed the problem of joint identification of disease-related brain regions from multi-modality data for classification. In this paper, we proposed a manifold regularized multi-task learning framework to jointly select features from multi-modality data. Specifically, we formulate the multi-modality classification as a multi-task learning framework, where each task focuses on the classification based on each modality. In order to capture the intrinsic relatedness among multiple tasks (i.e., modalities), we adopted a group sparsity regularizer, which ensures only a small number of features to be selected jointly. In addition, we introduced a new manifold based Laplacian regularization term to preserve the geometric distribution of original data from each task, which can lead to the selection of more discriminative features. Furthermore, we extend our method to the semi-supervised setting, which is very important since the acquisition of a large set of labeled data (i.e., diagnosis of disease) is usually expensive and time-consuming, while the collection of unlabeled data is relatively much easier. To validate our method, we have performed extensive evaluations on the baseline Magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) data of Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Our experimental results demonstrate the effectiveness of the proposed method. PMID:24505676
An Empirical Study of Wrappers for Feature Subset Selection based on a Parallel Genetic Algorithm: The Multi-Wrapper Model

KAUST Repository

Soufan, Othman

2012-01-01

proper criterion seeks to find the best subset of features describing data (relevance) and achieving better performance (optimality). Wrapper approaches are feature selection methods which are wrapped around a classification algorithm and use a
Research on Methods for Discovering and Selecting Cloud Infrastructure Services Based on Feature Modeling

Directory of Open Access Journals (Sweden)

Huamin Zhu

2016-01-01

Full Text Available Nowadays more and more cloud infrastructure service providers are providing large numbers of service instances which are a combination of diversified resources, such as computing, storage, and network. However, for cloud infrastructure services, the lack of a description standard and the inadequate research of systematic discovery and selection methods have exposed difficulties in discovering and choosing services for users. First, considering the highly configurable properties of a cloud infrastructure service, the feature model method is used to describe such a service. Second, based on the description of the cloud infrastructure service, a systematic discovery and selection method for cloud infrastructure services are proposed. The automatic analysis techniques of the feature model are introduced to verify the model’s validity and to perform the matching of the service and demand models. Finally, we determine the critical decision metrics and their corresponding measurement methods for cloud infrastructure services, where the subjective and objective weighting results are combined to determine the weights of the decision metrics. The best matching instances from various providers are then ranked by their comprehensive evaluations. Experimental results show that the proposed methods can effectively improve the accuracy and efficiency of cloud infrastructure service discovery and selection.
A comprehensive analysis of earthquake damage patterns using high dimensional model representation feature selection

Science.gov (United States)

Taşkin Kaya, Gülşen

2013-10-01

Recently, earthquake damage assessment using satellite images has been a very popular ongoing research direction. Especially with the availability of very high resolution (VHR) satellite images, a quite detailed damage map based on building scale has been produced, and various studies have also been conducted in the literature. As the spatial resolution of satellite images increases, distinguishability of damage patterns becomes more cruel especially in case of using only the spectral information during classification. In order to overcome this difficulty, textural information needs to be involved to the classification to improve the visual quality and reliability of damage map. There are many kinds of textural information which can be derived from VHR satellite images depending on the algorithm used. However, extraction of textural information and evaluation of them have been generally a time consuming process especially for the large areas affected from the earthquake due to the size of VHR image. Therefore, in order to provide a quick damage map, the most useful features describing damage patterns needs to be known in advance as well as the redundant features. In this study, a very high resolution satellite image after Iran, Bam earthquake was used to identify the earthquake damage. Not only the spectral information, textural information was also used during the classification. For textural information, second order Haralick features were extracted from the panchromatic image for the area of interest using gray level co-occurrence matrix with different size of windows and directions. In addition to using spatial features in classification, the most useful features representing the damage characteristic were selected with a novel feature selection method based on high dimensional model representation (HDMR) giving sensitivity of each feature during classification. The method called HDMR was recently proposed as an efficient tool to capture the input
Analysis on working pressure selection of ACME integral test facility

International Nuclear Information System (INIS)

Chen Lian; Chang Huajian; Li Yuquan; Ye Zishen; Qin Benke

2011-01-01

An integral effects test facility, advanced core cooling mechanism experiment facility (ACME) was designed to verify the performance of the passive safety system and validate its safety analysis codes of a pressurized water reactor power plant. Three test facilities for AP1000 design were introduced and review was given. The problems resulted from the different working pressures of its test facilities were analyzed. Then a detailed description was presented on the working pressure selection of ACME facility as well as its characteristics. And the approach of establishing desired testing initial condition was discussed. The selected 9.3 MPa working pressure covered almost all important passive safety system enables the ACME to simulate the LOCAs with the same pressure and property similitude as the prototype. It's expected that the ACME design would be an advanced core cooling integral test facility design. (authors)

Integral Histogram with Random Projection for Pedestrian Detection.

Directory of Open Access Journals (Sweden)

Chang-Hua Liu

Full Text Available In this paper, we give a systematic study to report several deep insights into the HOG, one of the most widely used features in the modern computer vision and image processing applications. We first show that, its magnitudes of gradient can be randomly projected with random matrix. To handle over-fitting, an integral histogram based on the differences of randomly selected blocks is proposed. The experiments show that both the random projection and integral histogram outperform the HOG feature obviously. Finally, the two ideas are combined into a new descriptor termed IHRP, which outperforms the HOG feature with less dimensions and higher speed.
featsel: A framework for benchmarking of feature selection algorithms and cost functions

OpenAIRE

Marcelo S. Reis; Gustavo Estrela; Carlos Eduardo Ferreira; Junior Barrera

2017-01-01

In this paper, we introduce featsel, a framework for benchmarking of feature selection algorithms and cost functions. This framework allows the user to deal with the search space as a Boolean lattice and has its core coded in C++ for computational efficiency purposes. Moreover, featsel includes Perl scripts to add new algorithms and/or cost functions, generate random instances, plot graphs and organize results into tables. Besides, this framework already comes with dozens of algorithms and co...
Featureous: an Integrated Approach to Location, Analysis and Modularization of Features in Java Applications

DEFF Research Database (Denmark)

Olszak, Andrzej

, it is essential that features are properly modularized within the structural organization of software systems. Nevertheless, in many object-oriented applications, features are not represented explicitly. Consequently, features typically end up scattered and tangled over multiple source code units......, such as architectural layers, packages and classes. This lack of modularization is known to make application features difficult to locate, to comprehend and to modify in isolation from one another. To overcome these problems, this thesis proposes Featureous, a novel approach to location, analysis and modularization...... quantitative and qualitative results suggest that Featureous succeeds at efficiently locating features in unfamiliar codebases, at aiding feature-oriented comprehension and modification, and at improving modularization of features using Java packages....
Novel Mahalanobis-based feature selection improves one-class classification of early hepatocellular carcinoma.

Science.gov (United States)

Thomaz, Ricardo de Lima; Carneiro, Pedro Cunha; Bonin, João Eliton; Macedo, Túlio Augusto Alves; Patrocinio, Ana Claudia; Soares, Alcimar Barbosa

2018-05-01

Detection of early hepatocellular carcinoma (HCC) is responsible for increasing survival rates in up to 40%. One-class classifiers can be used for modeling early HCC in multidetector computed tomography (MDCT), but demand the specific knowledge pertaining to the set of features that best describes the target class. Although the literature outlines several features for characterizing liver lesions, it is unclear which is most relevant for describing early HCC. In this paper, we introduce an unconstrained GA feature selection algorithm based on a multi-objective Mahalanobis fitness function to improve the classification performance for early HCC. We compared our approach to a constrained Mahalanobis function and two other unconstrained functions using Welch's t-test and Gaussian Data Descriptors. The performance of each fitness function was evaluated by cross-validating a one-class SVM. The results show that the proposed multi-objective Mahalanobis fitness function is capable of significantly reducing data dimensionality (96.4%) and improving one-class classification of early HCC (0.84 AUC). Furthermore, the results provide strong evidence that intensity features extracted at the arterial to portal and arterial to equilibrium phases are important for classifying early HCC.
iFER: facial expression recognition using automatically selected geometric eye and eyebrow features

Science.gov (United States)

Oztel, Ismail; Yolcu, Gozde; Oz, Cemil; Kazan, Serap; Bunyak, Filiz

2018-03-01

Facial expressions have an important role in interpersonal communications and estimation of emotional states or intentions. Automatic recognition of facial expressions has led to many practical applications and became one of the important topics in computer vision. We present a facial expression recognition system that relies on geometry-based features extracted from eye and eyebrow regions of the face. The proposed system detects keypoints on frontal face images and forms a feature set using geometric relationships among groups of detected keypoints. Obtained feature set is refined and reduced using the sequential forward selection (SFS) algorithm and fed to a support vector machine classifier to recognize five facial expression classes. The proposed system, iFER (eye-eyebrow only facial expression recognition), is robust to lower face occlusions that may be caused by beards, mustaches, scarves, etc. and lower face motion during speech production. Preliminary experiments on benchmark datasets produced promising results outperforming previous facial expression recognition studies using partial face features, and comparable results to studies using whole face information, only slightly lower by ˜ 2.5 % compared to the best whole face facial recognition system while using only ˜ 1 / 3 of the facial region.
Feature learning and change feature classification based on deep learning for ternary change detection in SAR images

Science.gov (United States)

Gong, Maoguo; Yang, Hailun; Zhang, Puzhao

2017-07-01

Ternary change detection aims to detect changes and group the changes into positive change and negative change. It is of great significance in the joint interpretation of spatial-temporal synthetic aperture radar images. In this study, sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve ternary change detection problem without any supervison. Firstly, sparse autoencoder is used to transform log-ratio difference image into a suitable feature space for extracting key changes and suppressing outliers and noise. And then the learned features are clustered into three classes, which are taken as the pseudo labels for training a CNN model as change feature classifier. The reliable training samples for CNN are selected from the feature maps learned by sparse autoencoder with certain selection rules. Having training samples and the corresponding pseudo labels, the CNN model can be trained by using back propagation with stochastic gradient descent. During its training procedure, CNN is driven to learn the concept of change, and more powerful model is established to distinguish different types of changes. Unlike the traditional methods, the proposed framework integrates the merits of sparse autoencoder and CNN to learn more robust difference representations and the concept of change for ternary change detection. Experimental results on real datasets validate the effectiveness and superiority of the proposed framework.
Support Vector Feature Selection for Early Detection of Anastomosis Leakage From Bag-of-Words in Electronic Health Records.

Science.gov (United States)

Soguero-Ruiz, Cristina; Hindberg, Kristian; Rojo-Alvarez, Jose Luis; Skrovseth, Stein Olav; Godtliebsen, Fred; Mortensen, Kim; Revhaug, Arthur; Lindsetmo, Rolv-Ole; Augestad, Knut Magne; Jenssen, Robert

2016-09-01

The free text in electronic health records (EHRs) conveys a huge amount of clinical information about health state and patient history. Despite a rapidly growing literature on the use of machine learning techniques for extracting this information, little effort has been invested toward feature selection and the features' corresponding medical interpretation. In this study, we focus on the task of early detection of anastomosis leakage (AL), a severe complication after elective surgery for colorectal cancer (CRC) surgery, using free text extracted from EHRs. We use a bag-of-words model to investigate the potential for feature selection strategies. The purpose is earlier detection of AL and prediction of AL with data generated in the EHR before the actual complication occur. Due to the high dimensionality of the data, we derive feature selection strategies using the robust support vector machine linear maximum margin classifier, by investigating: 1) a simple statistical criterion (leave-one-out-based test); 2) an intensive-computation statistical criterion (Bootstrap resampling); and 3) an advanced statistical criterion (kernel entropy). Results reveal a discriminatory power for early detection of complications after CRC (sensitivity 100%; specificity 72%). These results can be used to develop prediction models, based on EHR data, that can support surgeons and patients in the preoperative decision making phase.
Automatic processing of unattended object features by functional connectivity

Directory of Open Access Journals (Sweden)

Katja Martina Mayer

2013-05-01

Full Text Available Observers can selectively attend to object features that are relevant for a task. However, unattended task-irrelevant features may still be processed and possibly integrated with the attended features. This study investigated the neural mechanisms for processing both task-relevant (attended and task-irrelevant (unattended object features. The Garner paradigm was adapted for functional magnetic resonance imaging (fMRI to test whether specific brain areas process the conjunction of features or whether multiple interacting areas are involved in this form of feature integration. Observers attended to shape, colour, or non-rigid motion of novel objects while unattended features changed from trial to trial (change blocks or remained constant (no-change blocks during a given block. This block manipulation allowed us to measure the extent to which unattended features affected neural responses which would reflect the extent to which multiple object features are automatically processed. We did not find Garner interference at the behavioural level. However, we designed the experiment to equate performance across block types so that any fMRI results could not be due solely to differences in task difficulty between change and no-change blocks. Attention to specific features localised several areas known to be involved in object processing. No area showed larger responses on change blocks compared to no-change blocks. However, psychophysiological interaction analyses revealed that several functionally-localised areas showed significant positive interactions with areas in occipito-temporal and frontal areas that depended on block type. Overall, these findings suggest that both regional responses and functional connectivity are crucial for processing multi-featured objects.
An integrated MCDM approach to green supplier selection

Directory of Open Access Journals (Sweden)

Morteza Yazdani

2014-06-01

Full Text Available Supplier selection management has been considered as an important subject for industrial organizations. In order to remain on the market, to gain profitability and to retain competitive advantage, business units need to establish an integrated and structured supplier selection system. In addition, environmental protection problems have been big solicitudes for organizations to consider green approach in supplier selection problem. However, finding proper suppliers involves several variables and it is critically a complex process. In this paper, the main attention is focused on finding the right supplier based on fuzzy multi criteria decision making (MCDM process. The weights of criteria are calculated by analytical hierarchical process (AHP and the final ranking is achieved by fuzzy technique for order preference by similarity to an ideal solution (TOPSIS. TOPSIS advantage among the other similar methods is to obtain the best solution close to ideal solution. The paper attempts to express better understanding by an example of an automobile manufacturing supply chain.
Hybrid Feature Selection Approach Based on GRASP for Cancer Microarray Data

Directory of Open Access Journals (Sweden)

Arpita Nagpal

2017-01-01

Full Text Available Microarray data usually contain a large number of genes, but a small number of samples. Feature subset selection for microarray data aims at reducing the number of genes so that useful information can be extracted from the samples. Reducing the dimension of data sets further helps in improving the computational efficiency of the learning model. In this paper, we propose a modified algorithm based on the tabu search as local search procedures to a Greedy Randomized Adaptive Search Procedure (GRASP for high dimensional microarray data sets. The proposed Tabu based Greedy Randomized Adaptive Search Procedure algorithm is named as TGRASP. In TGRASP, a new parameter has been introduced named as Tabu Tenure and the existing parameters, NumIter and size have been modified. We observed that different parameter settings affect the quality of the optimum. The second proposed algorithm known as FFGRASP (Firefly Greedy Randomized Adaptive Search Procedure uses a firefly optimization algorithm in the local search optimzation phase of the greedy randomized adaptive search procedure (GRASP. Firefly algorithm is one of the powerful algorithms for optimization of multimodal applications. Experimental results show that the proposed TGRASP and FFGRASP algorithms are much better than existing algorithm with respect to three performance parameters viz. accuracy, run time, number of a selected subset of features. We have also compared both the approaches with a unified metric (Extended Adjusted Ratio of Ratios which has shown that TGRASP approach outperforms existing approach for six out of nine cancer microarray datasets and FFGRASP performs better on seven out of nine datasets.
Passive sampling of selected endocrine disrupting compounds using polar organic chemical integrative samplers

International Nuclear Information System (INIS)

Arditsoglou, Anastasia; Voutsa, Dimitra

2008-01-01

Two types of polar organic chemical integrative samplers (pharmaceutical POCIS and pesticide POCIS) were examined for their sampling efficiency of selected endocrine disrupting compounds (EDCs). Laboratory-based calibration of POCISs was conducted by exposing them at high and low concentrations of 14 EDCs (4-alkyl-phenols, their ethoxylate oligomers, bisphenol A, selected estrogens and synthetic steroids) for different time periods. The kinetic studies showed an integrative uptake up to 28 days. The sampling rates for the individual compounds were obtained. The use of POCISs could result in an integrative approach to the quality status of the aquatic systems especially in the case of high variation of water concentrations of EDCs. The sampling efficiency of POCISs under various field conditions was assessed after their deployment in different aquatic environments. - Calibration and field performance of polar organic integrative samplers for monitoring EDCs in aquatic environments
A Pareto-based Ensemble with Feature and Instance Selection for Learning from Multi-Class Imbalanced Datasets.

Science.gov (United States)

Fernández, Alberto; Carmona, Cristobal José; José Del Jesus, María; Herrera, Francisco

2017-09-01

Imbalanced classification is related to those problems that have an uneven distribution among classes. In addition to the former, when instances are located into the overlapped areas, the correct modeling of the problem becomes harder. Current solutions for both issues are often focused on the binary case study, as multi-class datasets require an additional effort to be addressed. In this research, we overcome these problems by carrying out a combination between feature and instance selections. Feature selection will allow simplifying the overlapping areas easing the generation of rules to distinguish among the classes. Selection of instances from all classes will address the imbalance itself by finding the most appropriate class distribution for the learning task, as well as possibly removing noise and difficult borderline examples. For the sake of obtaining an optimal joint set of features and instances, we embedded the searching for both parameters in a Multi-Objective Evolutionary Algorithm, using the C4.5 decision tree as baseline classifier in this wrapper approach. The multi-objective scheme allows taking a double advantage: the search space becomes broader, and we may provide a set of different solutions in order to build an ensemble of classifiers. This proposal has been contrasted versus several state-of-the-art solutions on imbalanced classification showing excellent results in both binary and multi-class problems.
DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

Science.gov (United States)

Ma, Xin; Guo, Jing; Sun, Xiao

2016-01-01

DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
Dysphonic Voice Pattern Analysis of Patients in Parkinson’s Disease Using Minimum Interclass Probability Risk Feature Selection and Bagging Ensemble Learning Methods

Directory of Open Access Journals (Sweden)

Yunfeng Wu

2017-01-01

Full Text Available Analysis of quantified voice patterns is useful in the detection and assessment of dysphonia and related phonation disorders. In this paper, we first study the linear correlations between 22 voice parameters of fundamental frequency variability, amplitude variations, and nonlinear measures. The highly correlated vocal parameters are combined by using the linear discriminant analysis method. Based on the probability density functions estimated by the Parzen-window technique, we propose an interclass probability risk (ICPR method to select the vocal parameters with small ICPR values as dominant features and compare with the modified Kullback-Leibler divergence (MKLD feature selection approach. The experimental results show that the generalized logistic regression analysis (GLRA, support vector machine (SVM, and Bagging ensemble algorithm input with the ICPR features can provide better classification results than the same classifiers with the MKLD selected features. The SVM is much better at distinguishing normal vocal patterns with a specificity of 0.8542. Among the three classification methods, the Bagging ensemble algorithm with ICPR features can identify 90.77% vocal patterns, with the highest sensitivity of 0.9796 and largest area value of 0.9558 under the receiver operating characteristic curve. The classification results demonstrate the effectiveness of our feature selection and pattern analysis methods for dysphonic voice detection and measurement.
WAVELENGTH SELECTION OF HYPERSPECTRAL LIDAR BASED ON FEATURE WEIGHTING FOR ESTIMATION OF LEAF NITROGEN CONTENT IN RICE

Directory of Open Access Journals (Sweden)

L. Du

2016-06-01

Full Text Available Hyperspectral LiDAR (HSL is a novel tool in the field of active remote sensing, which has been widely used in many domains because of its advantageous ability of spectrum-gained. Especially in the precise monitoring of nitrogen in green plants, the HSL plays a dispensable role. The exiting HSL system used for nitrogen status monitoring has a multi-channel detector, which can improve the spectral resolution and receiving range, but maybe result in data redundancy, difficulty in system integration and high cost as well. Thus, it is necessary and urgent to pick out the nitrogen-sensitive feature wavelengths among the spectral range. The present study, aiming at solving this problem, assigns a feature weighting to each centre wavelength of HSL system by using matrix coefficient analysis and divergence threshold. The feature weighting is a criterion to amend the centre wavelength of the detector to accommodate different purpose, especially the estimation of leaf nitrogen content (LNC in rice. By this way, the wavelengths high-correlated to the LNC can be ranked in a descending order, which are used to estimate rice LNC sequentially. In this paper, a HSL system which works based on a wide spectrum emission and a 32-channel detector is conducted to collect the reflectance spectra of rice leaf. These spectra collected by HSL cover a range of 538 nm – 910 nm with a resolution of 12 nm. These 32 wavelengths are strong absorbed by chlorophyll in green plant among this range. The relationship between the rice LNC and reflectance-based spectra is modeled using partial least squares (PLS and support vector machines (SVMs based on calibration and validation datasets respectively. The results indicate that I wavelength selection method of HSL based on feature weighting is effective to choose the nitrogen-sensitive wavelengths, which can also be co-adapted with the hardware of HSL system friendly. II The chosen wavelength has a high correlation with rice LNC
Simultaneous feature selection and parameter optimisation using an artificial ant colony: case study of melting point prediction

Directory of Open Access Journals (Sweden)

Nigsch Florian

2008-10-01

Full Text Available Abstract Background We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC, that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024–1029. We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581–590 of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Results Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6°C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, ε of 0.21 and an RMSE of 45.1°C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3°C, R2 of 0.47 for the same data and has similar performance to a Random Forest model (RMSE of 44.5°C, R2 of 0.55. However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. Conclusion With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
Simultaneous feature selection and parameter optimisation using an artificial ant colony: case study of melting point prediction.

Science.gov (United States)

O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John Bo

2008-10-29

We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024-1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581-590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6 degrees C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, epsilon of 0.21) and an RMSE of 45.1 degrees C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3 degrees C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5 degrees C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
Selection-for-action in visual search.

Science.gov (United States)

Hannus, Aave; Cornelissen, Frans W; Lindemann, Oliver; Bekkering, Harold

2005-01-01

Grasping an object rather than pointing to it enhances processing of its orientation but not its color. Apparently, visual discrimination is selectively enhanced for a behaviorally relevant feature. In two experiments we investigated the limitations and targets of this bias. Specifically, in Experiment 1 we were interested to find out whether the effect is capacity demanding, therefore we manipulated the set-size of the display. The results indicated a clear cognitive processing capacity requirement, i.e. the magnitude of the effect decreased for a larger set size. Consequently, in Experiment 2, we investigated if the enhancement effect occurs only at the level of behaviorally relevant feature or at a level common to different features. Therefore we manipulated the discriminability of the behaviorally neutral feature (color). Again, results showed that this manipulation influenced the action enhancement of the behaviorally relevant feature. Particularly, the effect of the color manipulation on the action enhancement suggests that the action effect is more likely to bias the competition between different visual features rather than to enhance the processing of the relevant feature. We offer a theoretical account that integrates the action-intention effect within the biased competition model of visual selective attention.
A machine vision system for automated non-invasive assessment of cell viability via dark field microscopy, wavelet feature selection and classification

Directory of Open Access Journals (Sweden)

Friehs Karl

2008-10-01

Full Text Available Abstract Background Cell viability is one of the basic properties indicating the physiological state of the cell, thus, it has long been one of the major considerations in biotechnological applications. Conventional methods for extracting information about cell viability usually need reagents to be applied on the targeted cells. These reagent-based techniques are reliable and versatile, however, some of them might be invasive and even toxic to the target cells. In support of automated noninvasive assessment of cell viability, a machine vision system has been developed. Results This system is based on supervised learning technique. It learns from images of certain kinds of cell populations and trains some classifiers. These trained classifiers are then employed to evaluate the images of given cell populations obtained via dark field microscopy. Wavelet decomposition is performed on the cell images. Energy and entropy are computed for each wavelet subimage as features. A feature selection algorithm is implemented to achieve better performance. Correlation between the results from the machine vision system and commonly accepted gold standards becomes stronger if wavelet features are utilized. The best performance is achieved with a selected subset of wavelet features. Conclusion The machine vision system based on dark field microscopy in conjugation with supervised machine learning and wavelet feature selection automates the cell viability assessment, and yields comparable results to commonly accepted methods. Wavelet features are found to be suitable to describe the discriminative properties of the live and dead cells in viability classification. According to the analysis, live cells exhibit morphologically more details and are intracellularly more organized than dead ones, which display more homogeneous and diffuse gray values throughout the cells. Feature selection increases the system's performance. The reason lies in the fact that feature
Application of integrated QFD and fuzzy AHP approach in selection of suppliers

Directory of Open Access Journals (Sweden)

Bojana Jovanović

2014-10-01

Full Text Available Supplier selection is a widely considered issue in the field of management, especially in quality management. In this paper, in the selection of suppliers of electronic components we used the integrated QFD and fuzzy AHP approaches. The QFD method is used as a tool for translating stakeholder needs into evaluating criteria for suppliers. The fuzzy AHP approach is used as a tool for prioritizing stakeholders, stakeholders’ requirements, evaluating criteria and, finally, for prioritizing suppliers. The paper showcases a case study of implementation of the integrated QFD and fuzzy AHP approaches in the selection of the electronic components supplier in one Serbian company that produces electronic devices. Also presented is the algorithm of implementation of the proposed approach. To the best of our knowledge, this is the first implementation of the proposed approach in a Serbian company.

Gene expression network reconstruction by convex feature selection when incorporating genetic perturbations.

Directory of Open Access Journals (Sweden)

Benjamin A Logsdon

Full Text Available Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL, which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1, and genes involved in endocytosis (RCY1, the spindle checkpoint (BUB2, sulfonate catabolism (JLP1, and cell-cell communication (PRM7. Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data.
Improving Remote Sensing Scene Classification by Integrating Global-Context and Local-Object Features

Directory of Open Access Journals (Sweden)

Dan Zeng

2018-05-01

Full Text Available Recently, many researchers have been dedicated to using convolutional neural networks (CNNs to extract global-context features (GCFs for remote-sensing scene classification. Commonly, accurate classification of scenes requires knowledge about both the global context and local objects. However, unlike the natural images in which the objects cover most of the image, objects in remote-sensing images are generally small and decentralized. Thus, it is hard for vanilla CNNs to focus on both global context and small local objects. To address this issue, this paper proposes a novel end-to-end CNN by integrating the GCFs and local-object-level features (LOFs. The proposed network includes two branches, the local object branch (LOB and global semantic branch (GSB, which are used to generate the LOFs and GCFs, respectively. Then, the concatenation of features extracted from the two branches allows our method to be more discriminative in scene classification. Three challenging benchmark remote-sensing datasets were extensively experimented on; the proposed approach outperformed the existing scene classification methods and achieved state-of-the-art results for all three datasets.
PREFACE: Special section featuring selected papers from the 3rd International Workshop on Numerical Modelling of High Temperature Superconductors Special section featuring selected papers from the 3rd International Workshop on Numerical Modelling of High Temperature Superconductors

Science.gov (United States)

Granados, Xavier; Sánchez, Àlvar; López-López, Josep

2012-10-01

The development of superconducting applications and superconducting engineering requires the support of consistent tools which can provide models for obtaining a good understanding of the behaviour of the systems and predict novel features. These models aim to compute the behaviour of the superconducting systems, design superconducting devices and systems, and understand and test the behavior of the superconducting parts. 50 years ago, in 1962, Charles Bean provided the superconducting community with a model efficient enough to allow the computation of the response of a superconductor to external magnetic fields and currents flowing through in an understandable way: the so called critical-state model. Since then, in addition to the pioneering critical-state approach, other tools have been devised for designing operative superconducting systems, allowing integration of the superconducting design in nearly standard electromagnetic computer-aided design systems by modelling the superconducting parts with consideration of time-dependent processes. In April 2012, Barcelona hosted the 3rd International Workshop on Numerical Modelling of High Temperature Superconductors (HTS), the third in a series of workshops started in Lausanne in 2010 and followed by Cambridge in 2011. The workshop reflected the state-of-the-art and the new initiatives of HTS modelling, considering mathematical, physical and technological aspects within a wide and interdisciplinary scope. Superconductor Science and Technology is now publishing a selection of papers from the workshop which have been selected for their high quality. The selection comprises seven papers covering mathematical, physical and technological topics which contribute to an improvement in the development of procedures, understanding of phenomena and development of applications. We hope that they provide a perspective on the relevance and growth that the modelling of HTS superconductors has achieved in the past 25 years.
Modeling and Detecting Feature Interactions among Integrated Services of Home Network Systems

Science.gov (United States)

Igaki, Hiroshi; Nakamura, Masahide

This paper presents a framework for formalizing and detecting feature interactions (FIs) in the emerging smart home domain. We first establish a model of home network system (HNS), where every networked appliance (or the HNS environment) is characterized as an object consisting of properties and methods. Then, every HNS service is defined as a sequence of method invocations of the appliances. Within the model, we next formalize two kinds of FIs: (a) appliance interactions and (b) environment interactions. An appliance interaction occurs when two method invocations conflict on the same appliance, whereas an environment interaction arises when two method invocations conflict indirectly via the environment. Finally, we propose offline and online methods that detect FIs before service deployment and during execution, respectively. Through a case study with seven practical services, it is shown that the proposed framework is generic enough to capture feature interactions in HNS integrated services. We also discuss several FI resolution schemes within the proposed framework.
Feature Selection as a Time and Cost-Saving Approach for Land Suitability Classification (Case Study of Shavur Plain, Iran

Directory of Open Access Journals (Sweden)

Saeid Hamzeh

2016-10-01

Full Text Available Land suitability classification is important in planning and managing sustainable land use. Most approaches to land suitability analysis combine a large number of land and soil parameters, and are time-consuming and costly. In this study, a potentially useful technique (combined feature selection and fuzzy-AHP method to increase the efficiency of land suitability analysis was presented. To this end, three different feature selection algorithms—random search, best search and genetic methods—were used to determine the most effective parameters for land suitability classification for the cultivation of barely in the Shavur Plain, southwest Iran. Next, land suitability classes were calculated for all methods by using the fuzzy-AHP approach. Salinity (electrical conductivity (EC, alkalinity (exchangeable sodium percentage (ESP, wetness and soil texture were selected using the random search method. Gypsum, EC, ESP, and soil texture were selected using both the best search and genetic methods. The result shows a strong agreement between the standard fuzzy-AHP methods and methods presented in this study. The values of Kappa coefficients were 0.82, 0.79 and 0.79 for the random search, best search and genetic methods, respectively, compared with the standard fuzzy-AHP method. Our results indicate that EC, ESP, soil texture and wetness are the most effective features for evaluating land suitability classification for the cultivation of barely in the study area, and uses of these parameters, together with their appropriate weights as obtained from fuzzy-AHP, can perform good results for land suitability classification. So, the combined feature selection presented and the fuzzy-AHP approach has the potential to save time and money for land suitability classification.
A model of selective visual attention for a stereo pair of images

Science.gov (United States)

Park, Min Chul; Kim, Sung Kyu; Son, Jung-Young

2005-11-01

Human visual attention system has a remarkable ability to interpret complex scenes with the ease and simplicity by selecting or focusing on a small region of visual field without scanning the whole images. In this paper, a novel selective visual attention model by using 3D image display system for a stereo pair of images is proposed. It is based on the feature integration theory and locates ROI(region of interest) or FOA(focus of attention). The disparity map obtained from a stereo pair of images is exploited as one of spatial visual features to form a set of topographic feature maps in our approach. Though the true human cognitive mechanism on the analysis and integration process might be different from our assumption the proposed attention system matches well with the results found by human observers.
Clonal selection versus clonal cooperation: the integrated perception of immune objects [version 1; referees: 2 approved

Directory of Open Access Journals (Sweden)

Serge Nataf

2016-09-01

Full Text Available Analogies between the immune and nervous systems were first envisioned by the immunologist Niels Jerne who introduced the concepts of antigen "recognition" and immune "memory". However, since then, it appears that only the cognitive immunology paradigm proposed by Irun Cohen, attempted to further theorize the immune system functions through the prism of neurosciences. The present paper is aimed at revisiting this analogy-based reasoning. In particular, a parallel is drawn between the brain pathways of visual perception and the processes allowing the global perception of an "immune object". Thus, in the visual system, distinct features of a visual object (shape, color, motion are perceived separately by distinct neuronal populations during a primary perception task. The output signals generated during this first step instruct then an integrated perception task performed by other neuronal networks. Such a higher order perception step is by essence a cooperative task that is mandatory for the global perception of visual objects. Based on a re-interpretation of recent experimental data, it is suggested that similar general principles drive the integrated perception of immune objects in secondary lymphoid organs (SLOs. In this scheme, the four main categories of signals characterizing an immune object (antigenic, contextual, temporal and localization signals are first perceived separately by distinct networks of immunocompetent cells. Then, in a multitude of SLO niches, the output signals generated during this primary perception step are integrated by TH-cells at the single cell level. This process eventually generates a multitude of T-cell and B-cell clones that perform, at the scale of SLOs, an integrated perception of immune objects. Overall, this new framework proposes that integrated immune perception and, consequently, integrated immune responses, rely essentially on clonal cooperation rather than clonal selection.
A selective review of selective attention research from the past century.

Science.gov (United States)

Driver, J

2001-02-01

Research on attention is concerned with selective processing of incoming sensory information. To some extent, our awareness of the world depends on what we choose to attend, not merely on the stimulation entering our senses. British psychologists have made substantial contributions to this topic in the past century. Celebrated examples include Donald Broadbent's filter theory of attention, which set the agenda for most subsequent work; and Anne Treisman's revisions of this account, and her later feature-integration theory. More recent contributions include Alan Allport's prescient emphasis on the relevance of neuroscience data, and John Duncan's integration of such data with psychological theory. An idiosyncratic but roughly chronological review of developments is presented, some practical and clinical implications are briefly sketched, and future directions suggested. One of the biggest changes in the field has been the increasing interplay between psychology and neuroscience, which promises much for the future. A related change has been the realization that selection attention is best thought of as a broad topic, encompassing a range of selective issues, rather than as a single explanatory process.
Development of a seaweed species-selection index for successful culture in a seaweed-based integrated aquaculture system

Science.gov (United States)

Kang, Yun Hee; Hwang, Jae Ran; Chung, Ik Kyo; Park, Sang Rul

2013-03-01

Integrated multi-trophic aquaculture (IMTA) has been proposed as a concept that combines the cultivation of fed aquaculture species ( e.g., finfish/shrimp) with extractive aquaculture species ( e.g., shellfish/seaweed). In seaweed-based integrated aquaculture, seaweeds have the capacity to reduce the environmental impact of nitrogen-rich effluents on coastal ecosystems. Thus, selection of optimal species for such aquaculture is of great importance. The present study aimed to develop a seaweed species-selection index for selecting suitable species in seaweed-based integrated aquaculture system. The index was synthesized using available literature-based information, reference data, and physiological seaweed experiments to identify and prioritize the desired species. Undaria pinnatifida, Porphyra yezoensis and Ulva compressa scored the highest according to a seaweed-based integrated aquaculture suitability index (SASI). Seaweed species with the highest scores were adjudged to fit the integrated aquaculture systems. Despite the application of this model limited by local aquaculture environment, it is considered to be a useful tool for selecting seaweed species in IMTA.
The Temporal Dynamics of Feature Integration for Color, form, and Motion

Directory of Open Access Journals (Sweden)

KS Pilz

2012-07-01

Full Text Available When two similar visual stimuli are presented in rapid succession, only their fused image is perceived, without having conscious access to the single stimuli. Such feature fusion occurs both for color (eg, Efron1973 and form (eg, Scharnowski et al 2007. For verniers, the fusion process lasts for more than 400 ms, as has been shown using TMS (Scharnowski et al 2009. In three experiments, we used light masks to investigate the time course of feature fusion for color, form, and motion. In experiment one, two verniers were presented in rapid succession with opposite offset directions. Subjects had to indicate the offset direction of the vernier. In a second experiment, a red and a green disk were presented in rapid succession, and subjects had to indicate whether the fused, yellow disk appeared rather than red or green. In a third experiment, three frames of random dots were presented successively. The first two frames created a percept of apparent motion to the upper right; and the last two frames, one to the upper left or vice versa. Subjects had to indicate the direction of motion. All stimuli were presented foveally. In all three experiments, we first balanced performance so that neither the first nor the second stimulus dominated the fused percept. In a second step, a light mask was presented either before, during, or after stimulus presentation. Depending on presentation time, the light masks modulated the fusion process so that either the first or the second stimulus dominated the percept. Our results show that unconscious feature fusion lasts more than five times longer than the actual stimulus duration, which indicates that individual features are stored for a substantial amount of time before they are integrated.
Integration of Error Compensation of Coordinate Measuring Machines into Feature Measurement: Part I—Model Development

Science.gov (United States)

Calvo, Roque; D’Amato, Roberto; Gómez, Emilio; Domingo, Rosario

2016-01-01

The development of an error compensation model for coordinate measuring machines (CMMs) and its integration into feature measurement is presented. CMMs are widespread and dependable instruments in industry and laboratories for dimensional measurement. From the tip probe sensor to the machine display, there is a complex transformation of probed point coordinates through the geometrical feature model that makes the assessment of accuracy and uncertainty measurement results difficult. Therefore, error compensation is not standardized, conversely to other simpler instruments. Detailed coordinate error compensation models are generally based on CMM as a rigid-body and it requires a detailed mapping of the CMM’s behavior. In this paper a new model type of error compensation is proposed. It evaluates the error from the vectorial composition of length error by axis and its integration into the geometrical measurement model. The non-explained variability by the model is incorporated into the uncertainty budget. Model parameters are analyzed and linked to the geometrical errors and uncertainty of CMM response. Next, the outstanding measurement models of flatness, angle, and roundness are developed. The proposed models are useful for measurement improvement with easy integration into CMM signal processing, in particular in industrial environments where built-in solutions are sought. A battery of implementation tests are presented in Part II, where the experimental endorsement of the model is included. PMID:27690052
Integration of Error Compensation of Coordinate Measuring Machines into Feature Measurement: Part I—Model Development

Directory of Open Access Journals (Sweden)

Roque Calvo

2016-09-01

Full Text Available The development of an error compensation model for coordinate measuring machines (CMMs and its integration into feature measurement is presented. CMMs are widespread and dependable instruments in industry and laboratories for dimensional measurement. From the tip probe sensor to the machine display, there is a complex transformation of probed point coordinates through the geometrical feature model that makes the assessment of accuracy and uncertainty measurement results difficult. Therefore, error compensation is not standardized, conversely to other simpler instruments. Detailed coordinate error compensation models are generally based on CMM as a rigid-body and it requires a detailed mapping of the CMM’s behavior. In this paper a new model type of error compensation is proposed. It evaluates the error from the vectorial composition of length error by axis and its integration into the geometrical measurement model. The non-explained variability by the model is incorporated into the uncertainty budget. Model parameters are analyzed and linked to the geometrical errors and uncertainty of CMM response. Next, the outstanding measurement models of flatness, angle, and roundness are developed. The proposed models are useful for measurement improvement with easy integration into CMM signal processing, in particular in industrial environments where built-in solutions are sought. A battery of implementation tests are presented in Part II, where the experimental endorsement of the model is included.
Feature selection for portfolio optimization

DEFF Research Database (Denmark)

Bjerring, Thomas Trier; Ross, Omri; Weissensteiner, Alex

2016-01-01

Most portfolio selection rules based on the sample mean and covariance matrix perform poorly out-of-sample. Moreover, there is a growing body of evidence that such optimization rules are not able to beat simple rules of thumb, such as 1/N. Parameter uncertainty has been identified as one major....... While most of the diversification benefits are preserved, the parameter estimation problem is alleviated. We conduct out-of-sample back-tests to show that in most cases different well-established portfolio selection rules applied on the reduced asset universe are able to improve alpha relative...
Cortical mechanisms for trans-saccadic memory and integration of multiple object features

Science.gov (United States)

Prime, Steven L.; Vesia, Michael; Crawford, J. Douglas

2011-01-01

Constructing an internal representation of the world from successive visual fixations, i.e. separated by saccadic eye movements, is known as trans-saccadic perception. Research on trans-saccadic perception (TSP) has been traditionally aimed at resolving the problems of memory capacity and visual integration across saccades. In this paper, we review this literature on TSP with a focus on research showing that egocentric measures of the saccadic eye movement can be used to integrate simple object features across saccades, and that the memory capacity for items retained across saccades, like visual working memory, is restricted to about three to four items. We also review recent transcranial magnetic stimulation experiments which suggest that the right parietal eye field and frontal eye fields play a key functional role in spatial updating of objects in TSP. We conclude by speculating on possible cortical mechanisms for governing egocentric spatial updating of multiple objects in TSP. PMID:21242142
A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data

Directory of Open Access Journals (Sweden)

Himmelreich Uwe

2009-07-01

Full Text Available Abstract Background Regularized regression methods such as principal component or partial least squares regression perform well in learning tasks on high dimensional spectral data, but cannot explicitly eliminate irrelevant features. The random forest classifier with its associated Gini feature importance, on the other hand, allows for an explicit feature elimination, but may not be optimally adapted to spectral data due to the topology of its constituent classification trees which are based on orthogonal splits in feature space. Results We propose to combine the best of both approaches, and evaluated the joint use of a feature selection based on a recursive feature elimination using the Gini importance of random forests' together with regularized classification methods on spectral data sets from medical diagnostics, chemotaxonomy, biomedical analytics, food science, and synthetically modified spectral data. Here, a feature selection using the Gini feature importance with a regularized classification by discriminant partial least squares regression performed as well as or better than a filtering according to different univariate statistical tests, or using regression coefficients in a backward feature elimination. It outperformed the direct application of the random forest classifier, or the direct application of the regularized classifiers on the full set of features. Conclusion The Gini importance of the random forest provided superior means for measuring feature relevance on spectral data, but – on an optimal subset of features – the regularized classifiers might be preferable over the random forest classifier, in spite of their limitation to model linear dependencies only. A feature selection based on Gini importance, however, may precede a regularized linear classification to identify this optimal subset of features, and to earn a double benefit of both dimensionality reduction and the elimination of noise from the classification task.
A prototype feature system for feature retrieval using relationships

Science.gov (United States)

Choi, J.; Usery, E.L.

2009-01-01

Using a feature data model, geographic phenomena can be represented effectively by integrating space, theme, and time. This paper extends and implements a feature data model that supports query and visualization of geographic features using their non-spatial and temporal relationships. A prototype feature-oriented geographic information system (FOGIS) is then developed and storage of features named Feature Database is designed. Buildings from the U.S. Marine Corps Base, Camp Lejeune, North Carolina and subways in Chicago, Illinois are used to test the developed system. The results of the applications show the strength of the feature data model and the developed system 'FOGIS' when they utilize non-spatial and temporal relationships in order to retrieve and visualize individual features.
A Combinatory Approach for Selecting Prognostic Genes in Microarray Studies of Tumour Survivals

Directory of Open Access Journals (Sweden)

Qihua Tan

2009-01-01

Full Text Available Different from significant gene expression analysis which looks for genes that are differentially regulated, feature selection in the microarray-based prognostic gene expression analysis aims at finding a subset of marker genes that are not only differentially expressed but also informative for prediction. Unfortunately feature selection in literature of microarray study is predominated by the simple heuristic univariate gene filter paradigm that selects differentially expressed genes according to their statistical significances. We introduce a combinatory feature selection strategy that integrates differential gene expression analysis with the Gram-Schmidt process to identify prognostic genes that are both statistically significant and highly informative for predicting tumour survival outcomes. Empirical application to leukemia and ovarian cancer survival data through-within- and cross-study validations shows that the feature space can be largely reduced while achieving improved testing performances.
A case on vendor selection methodology: An integrated approach

Directory of Open Access Journals (Sweden)

Nikhil C. Shil

2009-11-01

Full Text Available Vendor selection methodology is a highly researched area in supply chain management l terature and a very significant decision taken by supply chain managers due to technological advances in the manufacturing process. Such research has two basic dimensions: one is related to the identification of variables affecting the performance of the vendors and the other deals with the methodology to be applied. Most of the research conducted in this area deal with the upfront selection of vendors. However, it is very common to have a list of dedicated vendors due to the development of sophisticated production technologies like just in time (JIT, a lean or agile manufacturing process where continuous flow of materials is a requirement. This paper addresses the issue of selecting the optimal vendor from the internal database of a company. Factor analysis, analytical hierarchy process and regression analysis is used in an integrated way to supplement the vendor selection process. The methodology presented here is simply a proposal where every possible room for adjustment is available.
Feature Selection and Predictors of Falls with Foot Force Sensors Using KNN-Based Algorithms

Directory of Open Access Journals (Sweden)

Shengyun Liang

2015-11-01

Full Text Available The aging process may lead to the degradation of lower extremity function in the elderly population, which can restrict their daily quality of life and gradually increase the fall risk. We aimed to determine whether objective measures of physical function could predict subsequent falls. Ground reaction force (GRF data, which was quantified by sample entropy, was collected by foot force sensors. Thirty eight subjects (23 fallers and 15 non-fallers participated in functional movement tests, including walking and sit-to-stand (STS. A feature selection algorithm was used to select relevant features to classify the elderly into two groups: at risk and not at risk of falling down, for three KNN-based classifiers: local mean-based k-nearest neighbor (LMKNN, pseudo nearest neighbor (PNN, local mean pseudo nearest neighbor (LMPNN classification. We compared classification performances, and achieved the best results with LMPNN, with sensitivity, specificity and accuracy all 100%. Moreover, a subset of GRFs was significantly different between the two groups via Wilcoxon rank sum test, which is compatible with the classification results. This method could potentially be used by non-experts to monitor balance and the risk of falling down in the elderly population.
An Application of Discriminant Analysis to Pattern Recognition of Selected Contaminated Soil Features in Thin Sections

DEFF Research Database (Denmark)

Ribeiro, Alexandra B.; Nielsen, Allan Aasbjerg

1997-01-01

qualitative microprobe results: present elements Al, Si, Cr, Fe, As (associated with others). Selected groups of calibrated images (same light conditions and magnification) submitted to discriminant analysis, in order to find a pattern of recognition in the soil features corresponding to contamination already...

Sequence-based classification using discriminatory motif feature selection.

Directory of Open Access Journals (Sweden)

Hao Xiong

Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated. We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is
A Technology Selection Framework for Integrating Manufacturing within a Supply Chain

DEFF Research Database (Denmark)

Farooq, Sami; O' Brien, Chris

2012-01-01

This paper describes a structured analytical approach for selecting a manufacturing technology. A framework consisting of six integrated steps is proposed by considering the growing importance of supply chains in manufacturing organisations. The framework makes use of Analytical Hierarchy (AHP......) approach combined with Strategic Assessment Model (SAM) to evaluate and select the technologies appropriate for providing overall competitive advantage. The framework is intended to assist industrial managers in promoting manufacturing and supply chain collaboration and coordination by including intra...
Neurons in the thalamic reticular nucleus are selective for diverse and complex visual features

Science.gov (United States)

Vaingankar, Vishal; Soto-Sanchez, Cristina; Wang, Xin; Sommer, Friedrich T.; Hirsch, Judith A.

2012-01-01

All visual signals the cortex receives are influenced by the perigeniculate sector (PGN) of the thalamic reticular nucleus, which receives input from relay cells in the lateral geniculate and provides feedback inhibition in return. Relay cells have been studied in quantitative depth; they behave in a roughly linear fashion and have receptive fields with a stereotyped center-surround structure. We know far less about reticular neurons. Qualitative studies indicate they simply pool ascending input to generate non-selective gain control. Yet the perigeniculate is complicated; local cells are densely interconnected and fire lengthy bursts. Thus, we employed quantitative methods to explore the perigeniculate using relay cells as controls. By adapting methods of spike-triggered averaging and covariance analysis for bursts, we identified both first and second order features that build reticular receptive fields. The shapes of these spatiotemporal subunits varied widely; no stereotyped pattern emerged. Companion experiments showed that the shape of the first but not second order features could be explained by the overlap of On and Off inputs to a given cell. Moreover, we assessed the predictive power of the receptive field and how much information each component subunit conveyed. Linear-non-linear (LN) models including multiple subunits performed better than those made with just one; further each subunit encoded different visual information. Model performance for reticular cells was always lesser than for relay cells, however, indicating that reticular cells process inputs non-linearly. All told, our results suggest that the perigeniculate encodes diverse visual features to selectively modulate activity transmitted downstream. PMID:23269915
Robust Ground Target Detection by SAR and IR Sensor Fusion Using Adaboost-Based Feature Selection

Science.gov (United States)

Kim, Sungho; Song, Woo-Jin; Kim, So-Hyun

2016-01-01

Long-range ground targets are difficult to detect in a noisy cluttered environment using either synthetic aperture radar (SAR) images or infrared (IR) images. SAR-based detectors can provide a high detection rate with a high false alarm rate to background scatter noise. IR-based approaches can detect hot targets but are affected strongly by the weather conditions. This paper proposes a novel target detection method by decision-level SAR and IR fusion using an Adaboost-based machine learning scheme to achieve a high detection rate and low false alarm rate. The proposed method consists of individual detection, registration, and fusion architecture. This paper presents a single framework of a SAR and IR target detection method using modified Boolean map visual theory (modBMVT) and feature-selection based fusion. Previous methods applied different algorithms to detect SAR and IR targets because of the different physical image characteristics. One method that is optimized for IR target detection produces unsuccessful results in SAR target detection. This study examined the image characteristics and proposed a unified SAR and IR target detection method by inserting a median local average filter (MLAF, pre-filter) and an asymmetric morphological closing filter (AMCF, post-filter) into the BMVT. The original BMVT was optimized to detect small infrared targets. The proposed modBMVT can remove the thermal and scatter noise by the MLAF and detect extended targets by attaching the AMCF after the BMVT. Heterogeneous SAR and IR images were registered automatically using the proposed RANdom SAmple Region Consensus (RANSARC)-based homography optimization after a brute-force correspondence search using the detected target centers and regions. The final targets were detected by feature-selection based sensor fusion using Adaboost. The proposed method showed good SAR and IR target detection performance through feature selection-based decision fusion on a synthetic database generated
Robust Ground Target Detection by SAR and IR Sensor Fusion Using Adaboost-Based Feature Selection

Directory of Open Access Journals (Sweden)

Sungho Kim

2016-07-01

Full Text Available Long-range ground targets are difficult to detect in a noisy cluttered environment using either synthetic aperture radar (SAR images or infrared (IR images. SAR-based detectors can provide a high detection rate with a high false alarm rate to background scatter noise. IR-based approaches can detect hot targets but are affected strongly by the weather conditions. This paper proposes a novel target detection method by decision-level SAR and IR fusion using an Adaboost-based machine learning scheme to achieve a high detection rate and low false alarm rate. The proposed method consists of individual detection, registration, and fusion architecture. This paper presents a single framework of a SAR and IR target detection method using modified Boolean map visual theory (modBMVT and feature-selection based fusion. Previous methods applied different algorithms to detect SAR and IR targets because of the different physical image characteristics. One method that is optimized for IR target detection produces unsuccessful results in SAR target detection. This study examined the image characteristics and proposed a unified SAR and IR target detection method by inserting a median local average filter (MLAF, pre-filter and an asymmetric morphological closing filter (AMCF, post-filter into the BMVT. The original BMVT was optimized to detect small infrared targets. The proposed modBMVT can remove the thermal and scatter noise by the MLAF and detect extended targets by attaching the AMCF after the BMVT. Heterogeneous SAR and IR images were registered automatically using the proposed RANdom SAmple Region Consensus (RANSARC-based homography optimization after a brute-force correspondence search using the detected target centers and regions. The final targets were detected by feature-selection based sensor fusion using Adaboost. The proposed method showed good SAR and IR target detection performance through feature selection-based decision fusion on a synthetic
Improving feature selection process resistance to failures caused by curse-of-dimensionality effects

Czech Academy of Sciences Publication Activity Database

Somol, Petr; Grim, Jiří; Novovičová, Jana; Pudil, P.

2011-01-01

Roč. 47, č. 3 (2011), s. 401-425 ISSN 0023-5954 R&D Projects: GA MŠk 1M0572; GA ČR GA102/08/0593 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : feature selection * curse of dimensionality * over-fitting * stability * machine learning * dimensionality reduction Subject RIV: IN - Informatics, Computer Science Impact factor: 0.454, year: 2011 http://library.utia.cas.cz/separaty/2011/RO/somol-0368741.pdf
Unique features in the ARIES glovebox line

International Nuclear Information System (INIS)

Martinez, H.E.; Brown, W.G.; Flamm, B.; James, C.A.; Laskie, R.; Nelson, T.O.; Wedman, D.E.

1998-01-01

A series of unique features have been incorporated into the Advanced Recovery and Integrated Extraction System (ARIES) at the Los Alamos National Laboratory, TA-55 Plutonium Facility. The features enhance the material handling in the process of the dismantlement of nuclear weapon primaries in the glovebox line. Incorporated into these features are the various plutonium process module's different ventilation zone requirements that the material handling systems must meet. These features include a conveyor system that consists of a remotely controlled cart that transverses the length of the conveyor glovebox, can be operated from a remote location and can deliver process components to the entrance of any selected module glovebox. Within the modules there exists linear motion material handling systems with lifting hoist, which are controlled via an Allen Bradley control panel or local control panels. To remove the packaged products from the hot process line, the package is processed through an air lock/electrolytic decontamination process that removes the radioactive contamination from the outside of the package container and allows the package to be removed from the process line
Integral Indicator of Ecological Footprint for Croatian Power Plants

International Nuclear Information System (INIS)

Strijov, V.; Granic, G.; Juric, Z.; Jelavic, B.; Antesevic Maricic, S.

2009-01-01

The main goal of this paper is to present the methodology of construction of the Integral Indicator for Croatian Thermal Power Plants and Combined Heat and Power Plants. The Integral Indicator is necessary to compare Power Plants selected according to a certain criterion. The criterion of the Ecological Footprint is chosen. The following features of the Power Plants are used: generated electricity and heat; consumed coal and liquid fuel; sulphur content in fuel; emitted CO 2 , SO 2 , NO x and particles. To construct the Integral Indicator the linear model is used. The model parameters are tuned by the Principal Component Analysis algorithm. The constructed Integral Indicator is compared with several others, such as Pareto-Optimal Slicing Indicator and Metric Indicator. The Integral Indicator keeps as much information about features of the Power Plants as possible; it is simple and robust.(author).
Feature selection for disruption prediction from scratch in JET by using genetic algorithms and probabilistic predictors

Energy Technology Data Exchange (ETDEWEB)

Pereira, Augusto, E-mail: augusto.pereira@ciemat.es [Laboratorio Nacional de Fusión, CIEMAT, Madrid (Spain); Vega, Jesús; Moreno, Raúl [Laboratorio Nacional de Fusión, CIEMAT, Madrid (Spain); Dormido-Canto, Sebastián [Dpto. Informática y Automática – UNED, Madrid (Spain); Rattá, Giuseppe A. [Laboratorio Nacional de Fusión, CIEMAT, Madrid (Spain); Pavón, Fernando [Dpto. Informática y Automática – UNED, Madrid (Spain)

2015-10-15

Recently, a probabilistic classifier has been developed at JET to be used as predictor from scratch. It has been applied to a database of 1237 JET ITER-like wall (ILW) discharges (of which 201 disrupted) with good results: success rate of 94% and false alarm rate of 4.21%. A combinatorial analysis between 14 features to ensure the selection of the best ones to achieve good enough results in terms of success rate and false alarm rate was performed. All possible combinations with a number of features between 2 and 7 were tested and 9893 different predictors were analyzed. An important drawback in this analysis was the time required to compute the results that can be estimated in 1731 h (∼2.4 months). Genetic algorithms (GA) are searching algorithms that simulate the process of natural selection. In this article, the GA and the Venn predictors are combined with the objective not only of finding good enough features within the 14 available ones but also of reducing the computational time requirements. Five different performance metrics as measures of the GA fitness function have been evaluated. The best metric was the measurement called Informedness, with just 6 generations (168 predictors at 29.4 h).
Feature selection for disruption prediction from scratch in JET by using genetic algorithms and probabilistic predictors

International Nuclear Information System (INIS)

Pereira, Augusto; Vega, Jesús; Moreno, Raúl; Dormido-Canto, Sebastián; Rattá, Giuseppe A.; Pavón, Fernando

2015-01-01

Recently, a probabilistic classifier has been developed at JET to be used as predictor from scratch. It has been applied to a database of 1237 JET ITER-like wall (ILW) discharges (of which 201 disrupted) with good results: success rate of 94% and false alarm rate of 4.21%. A combinatorial analysis between 14 features to ensure the selection of the best ones to achieve good enough results in terms of success rate and false alarm rate was performed. All possible combinations with a number of features between 2 and 7 were tested and 9893 different predictors were analyzed. An important drawback in this analysis was the time required to compute the results that can be estimated in 1731 h (∼2.4 months). Genetic algorithms (GA) are searching algorithms that simulate the process of natural selection. In this article, the GA and the Venn predictors are combined with the objective not only of finding good enough features within the 14 available ones but also of reducing the computational time requirements. Five different performance metrics as measures of the GA fitness function have been evaluated. The best metric was the measurement called Informedness, with just 6 generations (168 predictors at 29.4 h).
Behavior Selection of Mobile Robot Based on Integration of Multimodal Information

Science.gov (United States)

Chen, Bin; Kaneko, Masahide

Recently, biologically inspired robots have been developed to acquire the capacity for directing visual attention to salient stimulus generated from the audiovisual environment. On purpose to realize this behavior, a general method is to calculate saliency maps to represent how much the external information attracts the robot's visual attention, where the audiovisual information and robot's motion status should be involved. In this paper, we represent a visual attention model where three modalities, that is, audio information, visual information and robot's motor status are considered, while the previous researches have not considered all of them. Firstly, we introduce a 2-D density map, on which the value denotes how much the robot pays attention to each spatial location. Then we model the attention density using a Bayesian network where the robot's motion statuses are involved. Secondly, the information from both of audio and visual modalities is integrated with the attention density map in integrate-fire neurons. The robot can direct its attention to the locations where the integrate-fire neurons are fired. Finally, the visual attention model is applied to make the robot select the visual information from the environment, and react to the content selected. Experimental results show that it is possible for robots to acquire the visual information related to their behaviors by using the attention model considering motion statuses. The robot can select its behaviors to adapt to the dynamic environment as well as to switch to another task according to the recognition results of visual attention.
Coding of visual object features and feature conjunctions in the human brain.

Science.gov (United States)

Martinovic, Jasna; Gruber, Thomas; Müller, Matthias M

2008-01-01

Object recognition is achieved through neural mechanisms reliant on the activity of distributed coordinated neural assemblies. In the initial steps of this process, an object's features are thought to be coded very rapidly in distinct neural assemblies. These features play different functional roles in the recognition process--while colour facilitates recognition, additional contours and edges delay it. Here, we selectively varied the amount and role of object features in an entry-level categorization paradigm and related them to the electrical activity of the human brain. We found that early synchronizations (approx. 100 ms) increased quantitatively when more image features had to be coded, without reflecting their qualitative contribution to the recognition process. Later activity (approx. 200-400 ms) was modulated by the representational role of object features. These findings demonstrate that although early synchronizations may be sufficient for relatively crude discrimination of objects in visual scenes, they cannot support entry-level categorization. This was subserved by later processes of object model selection, which utilized the representational value of object features such as colour or edges to select the appropriate model and achieve identification.
PolSAR Land Cover Classification Based on Roll-Invariant and Selected Hidden Polarimetric Features in the Rotation Domain

Directory of Open Access Journals (Sweden)

Chensong Tao

2017-07-01

Full Text Available Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR. Target polarimetric response is strongly dependent on its orientation. Backscattering responses of the same target with different orientations to the SAR flight path may be quite different. This target orientation diversity effect hinders PolSAR image understanding and interpretation. Roll-invariant polarimetric features such as entropy, anisotropy, mean alpha angle, and total scattering power are independent of the target orientation and are commonly adopted for PolSAR image classification. On the other aspect, target orientation diversity also contains rich information which may not be sensed by roll-invariant polarimetric features. In this vein, only using the roll-invariant polarimetric features may limit the final classification accuracy. To address this problem, this work uses the recently reported uniform polarimetric matrix rotation theory and a visualization and characterization tool of polarimetric coherence pattern to investigate hidden polarimetric features in the rotation domain along the radar line of sight. Then, a feature selection scheme is established and a set of hidden polarimetric features are selected in the rotation domain. Finally, a classification method is developed using the complementary information between roll-invariant and selected hidden polarimetric features with a support vector machine (SVM/decision tree (DT classifier. Comparison experiments are carried out with NASA/JPL AIRSAR and multi-temporal UAVSAR data. For AIRSAR data, the overall classification accuracy of the proposed classification method is 95.37% (with SVM/96.38% (with DT, while that of the conventional classification method is 93.87% (with SVM/94.12% (with DT, respectively. Meanwhile, for multi-temporal UAVSAR data, the mean overall classification accuracy of the proposed method is up to 97.47% (with SVM/99.39% (with DT, which is also higher
A Study of Integrity Evaluation System for Spent Fuel and Selection of the Representative Spent Fuel

International Nuclear Information System (INIS)

Kim, J. G.; Lee, S. K.; Lim, C. J.; Kim, J. K.; Lee, S. J.

2014-01-01

Spent fuel (SF) integrity evaluation is a regulatory requirement that is described in 10 CFR 71(transportation) and 10 CFR 72(storage) of the U. S. NRC licensing requirement. NRC regulation states that retrievability of SF after storage should be ensured and SF integrity under the normal condition must be guaranteed during transportation and handling process that is entailed before/during/after the interim storage. And SF integrity evaluation under the hypothetical accident condition is a core technology element for an assessment of critical, shielding, and containment. In this paper, SF integrity evaluation system which is suitable for domestic situation is suggested, and necessity of representative SF selection and its method is described. The ultimate goal of the SF integrity evaluation is to evaluate a safety margin in case of transportation/ handling/storage of SFs. It means that retrievability of SF after storage should be assured and SF integrity must be guaranteed at normal condition in the process of transportation/handling accompanied before/during/after interim storage. In Korea, SF integrity evaluation system is not established up to date. Especially, representative SF selection technology that is essential to SF integrity evaluation has not been fulfilled. To overcome this situation effectively, the methodology and technology of an overseas agency need to be benchmarked. In this paper, an overseas SF integrity evaluation system is analyzed, and an evaluation system suitable for domestic situation is suggested. Also, necessity of representative SF selection and its method is described
Unveiling network-based functional features through integration of gene expression into protein networks.

Science.gov (United States)

Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali

2018-06-01

Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.
Crowding with conjunctions of simple features.

Science.gov (United States)

Põder, Endel; Wagemans, Johan

2007-11-20

Several recent studies have related crowding with the feature integration stage in visual processing. In order to understand the mechanisms involved in this stage, it is important to use stimuli that have several features to integrate, and these features should be clearly defined and measurable. In this study, Gabor patches were used as target and distractor stimuli. The stimuli differed in three dimensions: spatial frequency, orientation, and color. A group of 3, 5, or 7 objects was presented briefly at 4 deg eccentricity of the visual field. The observers' task was to identify the object located in the center of the group. A strong effect of the number of distractors was observed, consistent with various spatial pooling models. The analysis of incorrect responses revealed that these were a mix of feature errors and mislocalizations of the target object. Feature errors were not purely random, but biased by the features of distractors. We propose a simple feature integration model that predicts most of the observed regularities.
How Perception Guides Action: Figure-Ground Segmentation Modulates Integration of Context Features into S-R Episodes

Science.gov (United States)

Frings, Christian; Rothermund, Klaus

2017-01-01

Perception and action are closely related. Responses are assumed to be represented in terms of their perceptual effects, allowing direct links between action and perception. In this regard, the integration of features of stimuli (S) and responses (R) into S-R bindings is a key mechanism for action control. Previous research focused on the…
GAIN RATIO BASED FEATURE SELECTION METHOD FOR PRIVACY PRESERVATION

Directory of Open Access Journals (Sweden)

R. Praveena Priyadarsini

2011-04-01

Full Text Available Privacy-preservation is a step in data mining that tries to safeguard sensitive information from unsanctioned disclosure and hence protecting individual data records and their privacy. There are various privacy preservation techniques like k-anonymity, l-diversity and t-closeness and data perturbation. In this paper k-anonymity privacy protection technique is applied to high dimensional datasets like adult and census. since, both the data sets are high dimensional, feature subset selection method like Gain Ratio is applied and the attributes of the datasets are ranked and low ranking attributes are filtered to form new reduced data subsets. K-anonymization privacy preservation technique is then applied on reduced datasets. The accuracy of the privacy preserved reduced datasets and the original datasets are compared for their accuracy on the two functionalities of data mining namely classification and clustering using naïve Bayesian and k-means algorithm respectively. Experimental results show that classification and clustering accuracy are comparatively the same for reduced k-anonym zed datasets and the original data sets.
Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection

Directory of Open Access Journals (Sweden)

Xin Ma

2015-01-01

Full Text Available The prediction of RNA-binding proteins is one of the most challenging problems in computation biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR method, followed by incremental feature selection (IFS. We incorporated features of conjoint triad features and three novel features: binding propensity (BP, nonbinding propensity (NBP, and evolutionary information combined with physicochemical properties (EIPP. The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient. High prediction accuracy and successful prediction performance suggested that our method can be a useful approach to identify RNA-binding proteins from sequence information.
Cancer survival classification using integrated data sets and intermediate information.

Science.gov (United States)

Kim, Shinuk; Park, Taesung; Kon, Mark

2014-09-01

Although numerous studies related to cancer survival have been published, increasing the prediction accuracy of survival classes still remains a challenge. Integration of different data sets, such as microRNA (miRNA) and mRNA, might increase the accuracy of survival class prediction. Therefore, we suggested a machine learning (ML) approach to integrate different data sets, and developed a novel method based on feature selection with Cox proportional hazard regression model (FSCOX) to improve the prediction of cancer survival time. FSCOX provides us with intermediate survival information, which is usually discarded when separating survival into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers, first, existing ML methods, support vector machine (SVM) and random forest (RF), second, a new median-based classifier using FSCOX (FSCOX_median), and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The latter data set included features selected either from the combined miRNA/mRNA profile or independently from miRNAs and mRNAs profiles (IFS). In the ovarian data set, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM with a balanced 22 short-term and 22 long-term survivor data set. These accuracies are higher than those using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS

Genetic algorithm based feature selection combined with dual classification for the automated detection of proliferative diabetic retinopathy.

Science.gov (United States)

Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A

2015-07-01

Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel map which each hold vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Interactive prostate segmentation using atlas-guided semi-supervised learning and adaptive feature selection.

Science.gov (United States)

Park, Sang Hyun; Gao, Yaozong; Shi, Yinghuan; Shen, Dinggang

2014-11-01

Accurate prostate segmentation is necessary for maximizing the effectiveness of radiation therapy of prostate cancer. However, manual segmentation from 3D CT images is very time-consuming and often causes large intra- and interobserver variations across clinicians. Many segmentation methods have been proposed to automate this labor-intensive process, but tedious manual editing is still required due to the limited performance. In this paper, the authors propose a new interactive segmentation method that can (1) flexibly generate the editing result with a few scribbles or dots provided by a clinician, (2) fast deliver intermediate results to the clinician, and (3) sequentially correct the segmentations from any type of automatic or interactive segmentation methods. The authors formulate the editing problem as a semisupervised learning problem which can utilize a priori knowledge of training data and also the valuable information from user interactions. Specifically, from a region of interest near the given user interactions, the appropriate training labels, which are well matched with the user interactions, can be locally searched from a training set. With voting from the selected training labels, both confident prostate and background voxels, as well as unconfident voxels can be estimated. To reflect informative relationship between voxels, location-adaptive features are selected from the confident voxels by using regression forest and Fisher separation criterion. Then, the manifold configuration computed in the derived feature space is enforced into the semisupervised learning algorithm. The labels of unconfident voxels are then predicted by regularizing semisupervised learning algorithm. The proposed interactive segmentation method was applied to correct automatic segmentation results of 30 challenging CT images. The correction was conducted three times with different user interactions performed at different time periods, in order to evaluate both the efficiency
Spectral characteristics and feature selection of satellite remote sensing data for climate and anthropogenic changes assessment in Bucharest area

Science.gov (United States)

Zoran, Maria; Savastru, Roxana; Savastru, Dan; Tautan, Marina; Miclos, Sorin; Cristescu, Luminita; Carstea, Elfrida; Baschir, Laurentiu

2010-05-01

Urban systems play a vital role in social and economic development in all countries. Their environmental changes can be investigated on different spatial and temporal scales. Urban and peri-urban environment dynamics is of great interest for future planning and decision making as well as in frame of local and regional changes. Changes in urban land cover include changes in biotic diversity, actual and potential primary productivity, soil quality, runoff, and sedimentation rates, and cannot be well understood without the knowledge of land use change that drives them. The study focuses on the assessment of environmental features changes for Bucharest metropolitan area, Romania by satellite remote sensing and in-situ monitoring data. Rational feature selection from the varieties of spectral channels in the optical wavelengths of electromagnetic spectrum (VIS and NIR) is very important for effective analysis and information extraction of remote sensing data. Based on comprehensively analyses of the spectral characteristics of remote sensing data is possibly to derive environmental changes in urban areas. The information quantity contained in a band is an important parameter in evaluating the band. The deviation and entropy are often used to show information amount. Feature selection is one of the most important steps in recognition and classification of remote sensing images. Therefore, it is necessary to select features before classification. The optimal features are those that can be used to distinguish objects easily and correctly. Three factors—the information quantity of bands, the correlation between bands and the spectral characteristic (e.g. absorption specialty) of classified objects in test area Bucharest have been considered in our study. As, the spectral characteristic of an object is influenced by many factors, being difficult to define optimal feature parameters to distinguish all the objects in a whole area, a method of multi-level feature selection
A Novel Image Retrieval Based on Visual Words Integration of SIFT and SURF.

Directory of Open Access Journals (Sweden)

Nouman Ali

Full Text Available With the recent evolution of technology, the number of image archives has increased exponentially. In Content-Based Image Retrieval (CBIR, high-level visual information is represented in the form of low-level features. The semantic gap between the low-level features and the high-level image concepts is an open research problem. In this paper, we present a novel visual words integration of Scale Invariant Feature Transform (SIFT and Speeded-Up Robust Features (SURF. The two local features representations are selected for image retrieval because SIFT is more robust to the change in scale and rotation, while SURF is robust to changes in illumination. The visual words integration of SIFT and SURF adds the robustness of both features to image retrieval. The qualitative and quantitative comparisons conducted on Corel-1000, Corel-1500, Corel-2000, Oliva and Torralba and Ground Truth image benchmarks demonstrate the effectiveness of the proposed visual words integration.
Selected PET radiomic features remain the same.

Science.gov (United States)

Tsujikawa, Tetsuya; Tsuyoshi, Hideaki; Kanno, Masafumi; Yamada, Shizuka; Kobayashi, Masato; Narita, Norihiko; Kimura, Hirohiko; Fujieda, Shigeharu; Yoshida, Yoshio; Okazawa, Hidehiko

2018-04-17

We investigated whether PET radiomic features are affected by differences in the scanner, scan protocol, and lesion location using 18 F-FDG PET/CT and PET/MR scans. SUV, TMR, skewness, kurtosis, entropy, and homogeneity strongly correlated between PET/CT and PET/MR images. SUVs were significantly higher on PET/MR 0-2 min and PET/MR 0-10 min than on PET/CT in gynecological cancer ( p = 0.008 and 0.008, respectively), whereas no significant difference was observed between PET/CT, PET/MR 0-2 min , and PET/MR 0-10 min images in oral cavity/oropharyngeal cancer. TMRs on PET/CT, PET/MR 0-2 min , and PET/MR 0-10 min increased in this order in gynecological cancer and oral cavity/oropharyngeal cancer. In contrast to conventional and histogram indices, 4 textural features (entropy, homogeneity, SRE, and LRE) were not significantly different between PET/CT, PET/MR 0-2 min , and PET/MR 0-10 min images. 18 F-FDG PET radiomic features strongly correlated between PET/CT and PET/MR images. Dixon-based attenuation correction on PET/MR images underestimated tumor tracer uptake more significantly in oral cavity/oropharyngeal cancer than in gynecological cancer. 18 F-FDG PET textural features were affected less by differences in the scanner and scan protocol than conventional and histogram features, possibly due to the resampling process using a medium bin width. Eight patients with gynecological cancer and 7 with oral cavity/oropharyngeal cancer underwent a whole-body 18 F-FDG PET/CT scan and regional PET/MR scan in one day. PET/MR scans were performed for 10 minutes in the list mode, and PET/CT and 0-2 min and 0-10 min PET/MR images were reconstructed. The standardized uptake value (SUV), tumor-to-muscle SUV ratio (TMR), skewness, kurtosis, entropy, homogeneity, short-run emphasis (SRE), and long-run emphasis (LRE) were compared between PET/CT, PET/MR 0-2 min , and PET/MR 0-10 min images.
Feature Extraction

CERN Document Server

CERN. Geneva

2015-01-01

Feature selection and reduction are key to robust multivariate analyses. In this talk I will focus on pros and cons of various variable selection methods and focus on those that are most relevant in the context of HEP.
Advanced Hybrid Spacesuit Concept Featuring Integrated Open Loop and Closed Loop Ventilation Systems

Science.gov (United States)

Daniel, Brian A.; Fitzpatrick, Garret R.; Gohmert, Dustin M.; Ybarra, Rick M.; Dub, Mark O.

2013-01-01

A document discusses the design and prototype of an advanced spacesuit concept that integrates the capability to function seamlessly with multiple ventilation system approaches. Traditionally, spacesuits are designed to operate both dependently and independently of a host vehicle environment control and life support system (ECLSS). Spacesuits that operate independent of vehicle-provided ECLSS services must do so with equipment selfcontained within or on the spacesuit. Suits that are dependent on vehicle-provided consumables must remain physically connected to and integrated with the vehicle to operate properly. This innovation is the design and prototype of a hybrid spacesuit approach that configures the spacesuit to seamlessly interface and integrate with either type of vehicular systems, while still maintaining the ability to function completely independent of the vehicle. An existing Advanced Crew Escape Suit (ACES) was utilized as the platform from which to develop the innovation. The ACES was retrofitted with selected components and one-off items to achieve the objective. The ventilation system concept was developed and prototyped/retrofitted to an existing ACES. Components were selected to provide suit connectors, hoses/umbilicals, internal breathing system ducting/ conduits, etc. The concept utilizes a lowpressure- drop, high-flow ventilation system that serves as a conduit from the vehicle supply into the suit, up through a neck seal, into the breathing helmet cavity, back down through the neck seal, out of the suit, and returned to the vehicle. The concept also utilizes a modified demand-based breathing system configured to function seamlessly with the low-pressure-drop closed-loop ventilation system.
Feature Mining and Health Assessment for Gearboxes Using Run-Up/Coast-Down Signals

Science.gov (United States)

Zhao, Ming; Lin, Jing; Miao, Yonghao; Xu, Xiaoqiang

2016-01-01

Vibration signals measured in the run-up/coast-down (R/C) processes usually carry rich information about the health status of machinery. However, a major challenge in R/C signals analysis lies in how to exploit more diagnostic information, and how this information could be properly integrated to achieve a more reliable maintenance decision. Aiming at this problem, a framework of R/C signals analysis is presented for the health assessment of gearbox. In the proposed methodology, we first investigate the data preprocessing and feature selection issues for R/C signals. Based on that, a sparsity-guided feature enhancement scheme is then proposed to extract the weak phase jitter associated with gear defect. In order for an effective feature mining and integration under R/C, a generalized phase demodulation technique is further established to reveal the evolution of modulation feature with operating speed and rotation angle. The experimental results indicate that the proposed methodology could not only detect the presence of gear damage, but also offer a novel insight into the dynamic behavior of gearbox. PMID:27827831
KLASIFIKASI INTI SEL PAP SMEAR BERDASARKAN ANALISIS TEKSTUR MENGGUNAKAN CORRELATION-BASED FEATURE SELECTION BERBASIS ALGORITMA C4.5

Directory of Open Access Journals (Sweden)

Toni Arifin

2014-09-01

Full Text Available Abstract - Pap Smear is an early examination to diagnose whether there’s indication cervical cancer or not, the process of observations were done by observing pap smear cell under the microscope. There’s so many research has been done to differentiate between normal and abnormal cell. In this research presents a classification of pap smear cell based on texture analysis. This research is using the Harlev image which amounts to 280 images, 140 images are used as training data and 140 images other are used as testing. On the texture analysis used Gray level Co-occurance Matrix (GLCM method with 5 parameters that is correlation, energy, homogeneity and entropy added by counting the value of brightness. For choose which the best attribute used correlation-based feature selection method and than used C45 algorithm for produce classification rule. The result accuracy of the classification normal and abnormal used decision tree C45 is 96,43% and errors in predicting is 3,57%. Keywords : Classification, Pap Smear cell image, texture analysis, Correlation-based feature selection, C45 algorithm. Abstrak - Pap Smear merupakan pemeriksaan dini untuk mendiagnosa apakah ada indikasi kanker serviks atau tidak, proses pengamatan dilakukan dengan mengamati sel pap smear dibawah mikroskop. Banyak penelitian yang telah dilakukan untuk membedakan antara sel normal dan abnormal. Dalam penelitian ini menyajikan klasifikasi inti sel pap smear berdasarkan analisis tektur. Citra yang digunakan dalam penelitian ini adalah citra Harlev yang berjumlah 280 citra, 140 citra digunakan sebagai data training dan 140 citra lain digunakan sebagai testing. Pada analisis tekstur mengunakan metode Gray level Co-occurrence Matrix (GLCM menggunakan 5 parameter yaitu korelasi, energi, homogenitas dan entropi ditambah dengan menghitung nilai brightness. Untuk memilih mana atribut terbaik digunakan metode correlation-based feature selection lalu digunakan algoritma C45 untuk
Supporting the material control and accountancy system with physical protection system features

International Nuclear Information System (INIS)

Miyoshi, D.S.; Olson, C.E.; Caskey, D.L.

1984-01-01

Most physical security functions can be accomplished by a range of alternative features. Careful design can provide comparable levels of security regardless of which option is chosen, albeit with possible differences in cost and efficiency. However, the effectiveness and especially the cost and efficiency of the material control and accounting system may be strongly influenced by the selection of a particular design approach to physical security. In this paper, a series of examples are cited to illustrate the effects that particular physical protection design choices may have. The examples have been chosen from several systems engineering projects at facilities within the DOE nuclear community. These examples are generalized, and a series of design principles are proposed for integrating physical security with material control and accounting by appropriate selection of alternative features. 2 references, 6 figures
Supporting the material control and accountancy system with physical protection system features

International Nuclear Information System (INIS)

Miyoshi, D.S.; Caskey, D.L.; Olson, C.E.

1984-01-01

Most physical security functions can be accomplished by a range of alternative features. Careful design can provide comparable levels of security regardless of which option is chosen, albeit with possible differences in cost and efficiency. However, the effectiveness and especially the cost and efficiency of the material control and accounting system may be strongly influenced by the selection of a particular design approach to physical security. In this paper, a series of examples are cited to illustrate the effects that particular physical protection design choices may have. The examples have been chosen from several systems engineering projects at facilities within the DOE nuclear community. These examples are generalized, and a series of design principles are proposed for integrating physical security with MC and A by appropriate selection of alternative features
SELECTED REQUIREMENTS OF INTEGRATED MANAGEMENT SYSTEMS BASED ON PAS 99 SPECIFICATION

Directory of Open Access Journals (Sweden)

Paweł Nowicki

2013-03-01

Full Text Available The aim this research was to analyze the ways of integration of management systems in food sector. The study involved the documentation, audits, corrective and preventive actions and management's review phases described in the specification PAS 99, which is one of common elements of integrated management systems. Four organizations were selected for the study. The organizations had introduced and certified at least two standardized management systems. It was assumed that the investigated organizations should have implemented the HACCP system. Studies were conducted as a case study. The employees responsible for the functioning of management systems were interviewed in all four organizations. The study was conducted in the form of in-depth interviews based on pre-prepared script. The scenario was developed based on the PAS 99 guideline. The process of integration of management systems implemented in the studied companies reveals the full compliance of an integrated management system with PASS 99 in the policy area.
Overview of the New England wind integration. Study and selected results

Energy Technology Data Exchange (ETDEWEB)

Norden, John R.; Henson, William L.W. [ISO New England, Holyoke, MA (United States)

2010-07-01

ISO New England commissioned a comprehensive wind integration study to be completed in the early fall of 2010: the New England Wind Integration Study (NEWIS). The NEWIS assesses the efects of scenarios that encompass a range of wind-power penetrations in New England using statistical and simulation analysis including the development of a mesoscale wind-to-power model for the New England and Maritime wind resources areas. It also determines the impacts of integrating increasing amounts of wind generation resources for New England, as well as, the measures that may be available to the ISO for responding to any challenges while enabling the integration of wind-power. This paper provides an overview of the study then focuses on selected near final results, particularly with regard to the varying capacity factor, capacity value and siting that were determined as part of the study. The full results of the NEWIS will be released in the fall of 2010. (orig.)
Advancing representation of hydrologic processes in the Soil and Water Assessment Tool (SWAT) through integration of the TOPographic MODEL (TOPMODEL) features

Science.gov (United States)

Chen, J.; Wu, Y.

2012-01-01

This paper presents a study of the integration of the Soil and Water Assessment Tool (SWAT) model and the TOPographic MODEL (TOPMODEL) features for enhancing the physical representation of hydrologic processes. In SWAT, four hydrologic processes, which are surface runoff, baseflow, groundwater re-evaporation and deep aquifer percolation, are modeled by using a group of empirical equations. The empirical equations usually constrain the simulation capability of relevant processes. To replace these equations and to model the influences of topography and water table variation on streamflow generation, the TOPMODEL features are integrated into SWAT, and a new model, the so-called SWAT-TOP, is developed. In the new model, the process of deep aquifer percolation is removed, the concept of groundwater re-evaporation is refined, and the processes of surface runoff and baseflow are remodeled. Consequently, three parameters in SWAT are discarded, and two new parameters to reflect the TOPMODEL features are introduced. SWAT-TOP and SWAT are applied to the East River basin in South China, and the results reveal that, compared with SWAT, the new model can provide a more reasonable simulation of the hydrologic processes of surface runoff, groundwater re-evaporation, and baseflow. This study evidences that an established hydrologic model can be further improved by integrating the features of another model, which is a possible way to enhance our understanding of the workings of catchments.
License Application Design Selection Feature Report: Waste Package Self Shielding Design Feature 13

International Nuclear Information System (INIS)

Tang, J.S.

2000-01-01

In the Viability Assessment (VA) reference design, handling of waste packages (WPs) in the emplacement drifts is performed remotely, and human access to the drifts is precluded when WPs are present. This report will investigate the feasibility of using a self-shielded WP design to reduce the radiation levels in the emplacement drifts to a point that, when coupled with ventilation, will create an acceptable environment for human access. This provides the benefit of allowing human entry to emplacement drifts to perform maintenance on ground support and instrumentation, and carry out performance confirmation activities. More direct human control of WP handling and emplacement operations would also be possible. However, these potential benefits must be weighed against the cost of implementation, and potential impacts on pre- and post-closure performance of the repository and WPs. The first section of this report will provide background information on previous investigations of the self-shielded WP design feature, summarize the objective and scope of this document, and provide quality assurance and software information. A shielding performance and cost study that includes several candidate shield materials will then be performed in the subsequent section to allow selection of two self-shielded WP design options for further evaluation. Finally, the remaining sections will evaluate the impacts of the two WP self-shielding options on the repository design, operations, safety, cost, and long-term performance of the WPs with respect to the VA reference design
Discriminative region extraction and feature selection based on the combination of SURF and saliency

Science.gov (United States)

Deng, Li; Wang, Chunhong; Rao, Changhui

2011-08-01

The objective of this paper is to provide a possible optimization on salient region algorithm, which is extensively used in recognizing and learning object categories. Salient region algorithm owns the superiority of intra-class tolerance, global score of features and automatically prominent scale selection under certain range. However, the major limitation behaves on performance, and that is what we attempt to improve. By reducing the number of pixels involved in saliency calculation, it can be accelerated. We use interest points detected by fast-Hessian, the detector of SURF, as the candidate feature for saliency operation, rather than the whole set in image. This implementation is thereby called Saliency based Optimization over SURF (SOSU for short). Experiment shows that bringing in of such a fast detector significantly speeds up the algorithm. Meanwhile, Robustness of intra-class diversity ensures object recognition accuracy.
A computational environment for long-term multi-feature and multi-algorithm seizure prediction.

Science.gov (United States)

Teixeira, C A; Direito, B; Costa, R P; Valderrama, M; Feldwisch-Drentrup, H; Nikolopoulos, S; Le Van Quyen, M; Schelter, B; Dourado, A

2010-01-01

The daily life of epilepsy patients is constrained by the possibility of occurrence of seizures. Until now, seizures cannot be predicted with sufficient sensitivity and specificity. Most of the seizure prediction studies have been focused on a small number of patients, and frequently assuming unrealistic hypothesis. This paper adopts the view that for an appropriate development of reliable predictors one should consider long-term recordings and several features and algorithms integrated in one software tool. A computational environment, based on Matlab (®), is presented, aiming to be an innovative tool for seizure prediction. It results from the need of a powerful and flexible tool for long-term EEG/ECG analysis by multiple features and algorithms. After being extracted, features can be subjected to several reduction and selection methods, and then used for prediction. The predictions can be conducted based on optimized thresholds or by applying computational intelligence methods. One important aspect is the integrated evaluation of the seizure prediction characteristic of the developed predictors.
MATHEMATICAL МODELLING OF SELECTING INFORMATIVE FEATURES FOR ANALYZING THE LIFE CYCLE PROCESSES OF RADIO-ELECTRONIC MEANS

Directory of Open Access Journals (Sweden)

Николай Григорьевич Стародубцев

2017-09-01

Full Text Available The subject of the study are methods and models for extracting information about the processes of the life cycle of radio electronic means at the design, production and operation stages. The goal is to develop the fundamentals of the theory of holistic monitoring of the life cycle of radio electronic means at the stages of their design, production and operation, in particular the development of information models for monitoring life cycle indicators in the production of radio electronic means. The attainment of this goal is achieved by solving such problems: research and development of a methodology for solving the problems of selecting informative features characterizing the state of the life cycle of radio electronic means; choice of informative features characterizing the state of the life cycle processes of radio electronic means; identification of the state of the life cycle processes of radio electronic means. To solve these problems, general scientific methods were used: the main provisions of functional analysis, nonequilibrium thermodynamics, estimation and prediction of random processes, optimization methods, pattern recognition. The following results are obtained. Methods for solving the problems of selecting informative features for monitoring the life cycle of radioelectronic facilities are developed by classifying the states of radioelectronic means and the processes of LC in the space of characteristics, each of which has a certain significance, which allowed finding a complex criterion and formalizing the selection procedures. When the number of a priori data is insufficient for a correct classification, heuristic methods of selection according to the criteria for using basic prototypes and information priorities are proposed. Conclusions. The solution of the problem of mathematical modeling of the efficiency functions of the processes of the life cycle of radioelectronic facilities and the choice of informative features for
Feature selection in wind speed prediction systems based on a hybrid coral reefs optimization – Extreme learning machine approach

International Nuclear Information System (INIS)

Salcedo-Sanz, S.; Pastor-Sánchez, A.; Prieto, L.; Blanco-Aguilera, A.; García-Herrera, R.

2014-01-01

Highlights: • A novel approach for short-term wind speed prediction is presented. • The system is formed by a coral reefs optimization algorithm and an extreme learning machine. • Feature selection is carried out with the CRO to improve the ELM performance. • The method is tested in real wind farm data in USA, for the period 2007–2008. - Abstract: This paper presents a novel approach for short-term wind speed prediction based on a Coral Reefs Optimization algorithm (CRO) and an Extreme Learning Machine (ELM), using meteorological predictive variables from a physical model (the Weather Research and Forecast model, WRF). The approach is based on a Feature Selection Problem (FSP) carried out with the CRO, that must obtain a reduced number of predictive variables out of the total available from the WRF. This set of features will be the input of an ELM, that finally provides the wind speed prediction. The CRO is a novel bio-inspired approach, based on the simulation of reef formation and coral reproduction, able to obtain excellent results in optimization problems. On the other hand, the ELM is a new paradigm in neural networks’ training, that provides a robust and extremely fast training of the network. Together, these algorithms are able to successfully solve this problem of feature selection in short-term wind speed prediction. Experiments in a real wind farm in the USA show the excellent performance of the CRO–ELM approach in this FSP wind speed prediction problem
Integrated Features by Administering the Support Vector Machine (SVM of Translational Initiations Sites in Alternative Polymorphic Contex

Directory of Open Access Journals (Sweden)

Nurul Arneida Husin

2012-04-01

Full Text Available Many algorithms and methods have been proposed for classification problems in bioinformatics. In this study, the discriminative approach in particular support vector machines (SVM is employed to recognize the studied TIS patterns. The applied discriminative approach is used to learn about some discriminant functions of samples that have been labelled as positive or negative. After learning, the discriminant functions are employed to decide whether a new sample is true or false. In this study, support vector machines (SVM is employed to recognize the patterns for studied translational initiation sites in alternative weak context. The method has been optimized with the best parameters selected; c=100, E=10-6 and ex=2 for non linear kernel function. Results show that with top 5 features and non linear kernel, the best prediction accuracy achieved is 95.8%. J48 algorithm is applied to compare with SVM with top 15 features and the results show a good prediction accuracy of 95.8%. This indicates that the top 5 features selected by the IGR method and that are performed by SVM are sufficient to use in the prediction of TIS in weak contexts.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.