WorldWideScience

Sample records for two-stage feature selection

  1. Multi-Stage Recognition of Speech Emotion Using Sequential Forward Feature Selection

    Directory of Open Access Journals (Sweden)

    Liogienė Tatjana

    2016-07-01

    Full Text Available The intensive research of speech emotion recognition introduced a huge collection of speech emotion features. Large feature sets complicate the speech emotion recognition task. Among various feature selection and transformation techniques for one-stage classification, multiple classifier systems were proposed. The main idea of multiple classifiers is to arrange the emotion classification process in stages. Besides parallel and serial cases, the hierarchical arrangement of multi-stage classification is most widely used for speech emotion recognition. In this paper, we present a sequential-forward-feature-selection-based multi-stage classification scheme. The Sequential Forward Selection (SFS and Sequential Floating Forward Selection (SFFS techniques were employed for every stage of the multi-stage classification scheme. Experimental testing of the proposed scheme was performed using the German and Lithuanian emotional speech datasets. Sequential-feature-selection-based multi-stage classification outperformed the single-stage scheme by 12–42 % for different emotion sets. The multi-stage scheme has shown higher robustness to the growth of emotion set. The decrease in recognition rate with the increase in emotion set for multi-stage scheme was lower by 10–20 % in comparison with the single-stage case. Differences in SFS and SFFS employment for feature selection were negligible.

  2. A Dual-Stage Two-Phase Model of Selective Attention

    Science.gov (United States)

    Hubner, Ronald; Steinhauser, Marco; Lehle, Carola

    2010-01-01

    The dual-stage two-phase (DSTP) model is introduced as a formal and general model of selective attention that includes both an early and a late stage of stimulus selection. Whereas at the early stage information is selected by perceptual filters whose selectivity is relatively limited, at the late stage stimuli are selected more efficiently on a…

  3. Comparisons of single-stage and two-stage approaches to genomic selection.

    Science.gov (United States)

    Schulz-Streeck, Torben; Ogutu, Joseph O; Piepho, Hans-Peter

    2013-01-01

    Genomic selection (GS) is a method for predicting breeding values of plants or animals using many molecular markers that is commonly implemented in two stages. In plant breeding the first stage usually involves computation of adjusted means for genotypes which are then used to predict genomic breeding values in the second stage. We compared two classical stage-wise approaches, which either ignore or approximate correlations among the means by a diagonal matrix, and a new method, to a single-stage analysis for GS using ridge regression best linear unbiased prediction (RR-BLUP). The new stage-wise method rotates (orthogonalizes) the adjusted means from the first stage before submitting them to the second stage. This makes the errors approximately independently and identically normally distributed, which is a prerequisite for many procedures that are potentially useful for GS such as machine learning methods (e.g. boosting) and regularized regression methods (e.g. lasso). This is illustrated in this paper using componentwise boosting. The componentwise boosting method minimizes squared error loss using least squares and iteratively and automatically selects markers that are most predictive of genomic breeding values. Results are compared with those of RR-BLUP using fivefold cross-validation. The new stage-wise approach with rotated means was slightly more similar to the single-stage analysis than the classical two-stage approaches based on non-rotated means for two unbalanced datasets. This suggests that rotation is a worthwhile pre-processing step in GS for the two-stage approaches for unbalanced datasets. Moreover, the predictive accuracy of stage-wise RR-BLUP was higher (5.0-6.1%) than that of componentwise boosting.

  4. Multi-Stage Feature Selection by Using Genetic Algorithms for Fault Diagnosis in Gearboxes Based on Vibration Signal

    Directory of Open Access Journals (Sweden)

    Mariela Cerrada

    2015-09-01

    Full Text Available There are growing demands for condition-based monitoring of gearboxes, and techniques to improve the reliability, effectiveness and accuracy for fault diagnosis are considered valuable contributions. Feature selection is still an important aspect in machine learning-based diagnosis in order to reach good performance in the diagnosis system. The main aim of this research is to propose a multi-stage feature selection mechanism for selecting the best set of condition parameters on the time, frequency and time-frequency domains, which are extracted from vibration signals for fault diagnosis purposes in gearboxes. The selection is based on genetic algorithms, proposing in each stage a new subset of the best features regarding the classifier performance in a supervised environment. The selected features are augmented at each stage and used as input for a neural network classifier in the next step, while a new subset of feature candidates is treated by the selection process. As a result, the inherent exploration and exploitation of the genetic algorithms for finding the best solutions of the selection problem are locally focused. The Sensors 2015, 15 23904 approach is tested on a dataset from a real test bed with several fault classes under different running conditions of load and velocity. The model performance for diagnosis is over 98%.

  5. Two-Stage Fuzzy Portfolio Selection Problem with Transaction Costs

    Directory of Open Access Journals (Sweden)

    Yanju Chen

    2015-01-01

    Full Text Available This paper studies a two-period portfolio selection problem. The problem is formulated as a two-stage fuzzy portfolio selection model with transaction costs, in which the future returns of risky security are characterized by possibility distributions. The objective of the proposed model is to achieve the maximum utility in terms of the expected value and variance of the final wealth. Given the first-stage decision vector and a realization of fuzzy return, the optimal value expression of the second-stage programming problem is derived. As a result, the proposed two-stage model is equivalent to a single-stage model, and the analytical optimal solution of the two-stage model is obtained, which helps us to discuss the properties of the optimal solution. Finally, some numerical experiments are performed to demonstrate the new modeling idea and the effectiveness. The computational results provided by the proposed model show that the more risk-averse investor will invest more wealth in the risk-free security. They also show that the optimal invested amount in risky security increases as the risk-free return decreases and the optimal utility increases as the risk-free return increases, whereas the optimal utility increases as the transaction costs decrease. In most instances the utilities provided by the proposed two-stage model are larger than those provided by the single-stage model.

  6. Two-Stage Fuzzy Portfolio Selection Problem with Transaction Costs

    OpenAIRE

    Chen, Yanju; Wang, Ye

    2015-01-01

    This paper studies a two-period portfolio selection problem. The problem is formulated as a two-stage fuzzy portfolio selection model with transaction costs, in which the future returns of risky security are characterized by possibility distributions. The objective of the proposed model is to achieve the maximum utility in terms of the expected value and variance of the final wealth. Given the first-stage decision vector and a realization of fuzzy return, the optimal value expression of the s...

  7. Feature Extraction and Selection Strategies for Automated Target Recognition

    Science.gov (United States)

    Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

    2010-01-01

    Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPLs Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aide in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine(SVM) and a neural net (NN) classifier.

  8. Two-stage atlas subset selection in multi-atlas based image segmentation

    Energy Technology Data Exchange (ETDEWEB)

    Zhao, Tingting, E-mail: tingtingzhao@mednet.ucla.edu; Ruan, Dan, E-mail: druan@mednet.ucla.edu [The Department of Radiation Oncology, University of California, Los Angeles, California 90095 (United States)

    2015-06-15

    Purpose: Fast growing access to large databases and cloud stored data presents a unique opportunity for multi-atlas based image segmentation and also presents challenges in heterogeneous atlas quality and computation burden. This work aims to develop a novel two-stage method tailored to the special needs in the face of large atlas collection with varied quality, so that high-accuracy segmentation can be achieved with low computational cost. Methods: An atlas subset selection scheme is proposed to substitute a significant portion of the computationally expensive full-fledged registration in the conventional scheme with a low-cost alternative. More specifically, the authors introduce a two-stage atlas subset selection method. In the first stage, an augmented subset is obtained based on a low-cost registration configuration and a preliminary relevance metric; in the second stage, the subset is further narrowed down to a fusion set of desired size, based on full-fledged registration and a refined relevance metric. An inference model is developed to characterize the relationship between the preliminary and refined relevance metrics, and a proper augmented subset size is derived to ensure that the desired atlases survive the preliminary selection with high probability. Results: The performance of the proposed scheme has been assessed with cross validation based on two clinical datasets consisting of manually segmented prostate and brain magnetic resonance images, respectively. The proposed scheme demonstrates comparable end-to-end segmentation performance as the conventional single-stage selection method, but with significant computation reduction. Compared with the alternative computation reduction method, their scheme improves the mean and medium Dice similarity coefficient value from (0.74, 0.78) to (0.83, 0.85) and from (0.82, 0.84) to (0.95, 0.95) for prostate and corpus callosum segmentation, respectively, with statistical significance. Conclusions: The authors

  9. Two-stage atlas subset selection in multi-atlas based image segmentation.

    Science.gov (United States)

    Zhao, Tingting; Ruan, Dan

    2015-06-01

    Fast growing access to large databases and cloud stored data presents a unique opportunity for multi-atlas based image segmentation and also presents challenges in heterogeneous atlas quality and computation burden. This work aims to develop a novel two-stage method tailored to the special needs in the face of large atlas collection with varied quality, so that high-accuracy segmentation can be achieved with low computational cost. An atlas subset selection scheme is proposed to substitute a significant portion of the computationally expensive full-fledged registration in the conventional scheme with a low-cost alternative. More specifically, the authors introduce a two-stage atlas subset selection method. In the first stage, an augmented subset is obtained based on a low-cost registration configuration and a preliminary relevance metric; in the second stage, the subset is further narrowed down to a fusion set of desired size, based on full-fledged registration and a refined relevance metric. An inference model is developed to characterize the relationship between the preliminary and refined relevance metrics, and a proper augmented subset size is derived to ensure that the desired atlases survive the preliminary selection with high probability. The performance of the proposed scheme has been assessed with cross validation based on two clinical datasets consisting of manually segmented prostate and brain magnetic resonance images, respectively. The proposed scheme demonstrates comparable end-to-end segmentation performance as the conventional single-stage selection method, but with significant computation reduction. Compared with the alternative computation reduction method, their scheme improves the mean and medium Dice similarity coefficient value from (0.74, 0.78) to (0.83, 0.85) and from (0.82, 0.84) to (0.95, 0.95) for prostate and corpus callosum segmentation, respectively, with statistical significance. The authors have developed a novel two-stage atlas

  10. Two-stage atlas subset selection in multi-atlas based image segmentation

    International Nuclear Information System (INIS)

    Zhao, Tingting; Ruan, Dan

    2015-01-01

    Purpose: Fast growing access to large databases and cloud stored data presents a unique opportunity for multi-atlas based image segmentation and also presents challenges in heterogeneous atlas quality and computation burden. This work aims to develop a novel two-stage method tailored to the special needs in the face of large atlas collection with varied quality, so that high-accuracy segmentation can be achieved with low computational cost. Methods: An atlas subset selection scheme is proposed to substitute a significant portion of the computationally expensive full-fledged registration in the conventional scheme with a low-cost alternative. More specifically, the authors introduce a two-stage atlas subset selection method. In the first stage, an augmented subset is obtained based on a low-cost registration configuration and a preliminary relevance metric; in the second stage, the subset is further narrowed down to a fusion set of desired size, based on full-fledged registration and a refined relevance metric. An inference model is developed to characterize the relationship between the preliminary and refined relevance metrics, and a proper augmented subset size is derived to ensure that the desired atlases survive the preliminary selection with high probability. Results: The performance of the proposed scheme has been assessed with cross validation based on two clinical datasets consisting of manually segmented prostate and brain magnetic resonance images, respectively. The proposed scheme demonstrates comparable end-to-end segmentation performance as the conventional single-stage selection method, but with significant computation reduction. Compared with the alternative computation reduction method, their scheme improves the mean and medium Dice similarity coefficient value from (0.74, 0.78) to (0.83, 0.85) and from (0.82, 0.84) to (0.95, 0.95) for prostate and corpus callosum segmentation, respectively, with statistical significance. Conclusions: The authors

  11. Aging, selective attention, and feature integration.

    Science.gov (United States)

    Plude, D J; Doussard-Roosevelt, J A

    1989-03-01

    This study used feature-integration theory as a means of determining the point in processing at which selective attention deficits originate. The theory posits an initial stage of processing in which features are registered in parallel and then a serial process in which features are conjoined to form complex stimuli. Performance of young and older adults on feature versus conjunction search is compared. Analyses of reaction times and error rates suggest that elderly adults in addition to young adults, can capitalize on the early parallel processing stage of visual information processing, and that age decrements in visual search arise as a result of the later, serial stage of processing. Analyses of a third, unconfounded, conjunction search condition reveal qualitatively similar modes of conjunction search in young and older adults. The contribution of age-related data limitations is found to be secondary to the contribution of age decrements in selective attention.

  12. Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.

    Science.gov (United States)

    Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel

    2017-08-18

    Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, were compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was in the average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not outperform significantly the conventional feature ranking methods. Among

  13. Feature extraction and sensor selection for NPP initiating event identification

    International Nuclear Information System (INIS)

    Lin, Ting-Han; Wu, Shun-Chi; Chen, Kuang-You; Chou, Hwai-Pwu

    2017-01-01

    Highlights: • A two-stage feature extraction scheme for NPP initiating event identification. • With stBP, interrelations among the sensors can be retained for identification. • With dSFS, sensors that are crucial for identification can be efficiently selected. • Efficacy of the scheme is illustrated with data from the Maanshan NPP simulator. - Abstract: Initiating event identification is essential in managing nuclear power plant (NPP) severe accidents. In this paper, a novel two-stage feature extraction scheme that incorporates the proposed sensor type-wise block projection (stBP) and deflatable sequential forward selection (dSFS) is used to elicit the discriminant information in the data obtained from various NPP sensors to facilitate event identification. With the stBP, the primal features can be extracted without eliminating the interrelations among the sensors of the same type. The extracted features are then subjected to a further dimensionality reduction by selecting the sensors that are most relevant to the events under consideration. This selection is not easy, and a combinatorial optimization technique is normally required. With the dSFS, an optimal sensor set can be found with less computational load. Moreover, its sensor deflation stage allows sensors in the preselected set to be iteratively refined to avoid being trapped into a local optimum. Results from detailed experiments containing data of 12 event categories and a total of 112 events generated with a Taiwan’s Maanshan NPP simulator are presented to illustrate the efficacy of the proposed scheme.

  14. SU-E-J-128: Two-Stage Atlas Selection in Multi-Atlas-Based Image Segmentation

    International Nuclear Information System (INIS)

    Zhao, T; Ruan, D

    2015-01-01

    Purpose: In the new era of big data, multi-atlas-based image segmentation is challenged by heterogeneous atlas quality and high computation burden from extensive atlas collection, demanding efficient identification of the most relevant atlases. This study aims to develop a two-stage atlas selection scheme to achieve computational economy with performance guarantee. Methods: We develop a low-cost fusion set selection scheme by introducing a preliminary selection to trim full atlas collection into an augmented subset, alleviating the need for extensive full-fledged registrations. More specifically, fusion set selection is performed in two successive steps: preliminary selection and refinement. An augmented subset is first roughly selected from the whole atlas collection with a simple registration scheme and the corresponding preliminary relevance metric; the augmented subset is further refined into the desired fusion set size, using full-fledged registration and the associated relevance metric. The main novelty of this work is the introduction of an inference model to relate the preliminary and refined relevance metrics, based on which the augmented subset size is rigorously derived to ensure the desired atlases survive the preliminary selection with high probability. Results: The performance and complexity of the proposed two-stage atlas selection method were assessed using a collection of 30 prostate MR images. It achieved comparable segmentation accuracy as the conventional one-stage method with full-fledged registration, but significantly reduced computation time to 1/3 (from 30.82 to 11.04 min per segmentation). Compared with alternative one-stage cost-saving approach, the proposed scheme yielded superior performance with mean and medium DSC of (0.83, 0.85) compared to (0.74, 0.78). Conclusion: This work has developed a model-guided two-stage atlas selection scheme to achieve significant cost reduction while guaranteeing high segmentation accuracy. The benefit

  15. Feature Selection Criteria for Real Time EKF-SLAM Algorithm

    Directory of Open Access Journals (Sweden)

    Fernando Auat Cheein

    2010-02-01

    Full Text Available This paper presents a seletion procedure for environmet features for the correction stage of a SLAM (Simultaneous Localization and Mapping algorithm based on an Extended Kalman Filter (EKF. This approach decreases the computational time of the correction stage which allows for real and constant-time implementations of the SLAM. The selection procedure consists in chosing the features the SLAM system state covariance is more sensible to. The entire system is implemented on a mobile robot equipped with a range sensor laser. The features extracted from the environment correspond to lines and corners. Experimental results of the real time SLAM algorithm and an analysis of the processing-time consumed by the SLAM with the feature selection procedure proposed are shown. A comparison between the feature selection approach proposed and the classical sequential EKF-SLAM along with an entropy feature selection approach is also performed.

  16. Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention.

    Science.gov (United States)

    Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei

    2016-01-13

    An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features.

  17. Object-based selection from spatially-invariant representations: evidence from a feature-report task.

    Science.gov (United States)

    Matsukura, Michi; Vecera, Shaun P

    2011-02-01

    Attention selects objects as well as locations. When attention selects an object's features, observers identify two features from a single object more accurately than two features from two different objects (object-based effect of attention; e.g., Duncan, Journal of Experimental Psychology: General, 113, 501-517, 1984). Several studies have demonstrated that object-based attention can operate at a late visual processing stage that is independent of objects' spatial information (Awh, Dhaliwal, Christensen, & Matsukura, Psychological Science, 12, 329-334, 2001; Matsukura & Vecera, Psychonomic Bulletin & Review, 16, 529-536, 2009; Vecera, Journal of Experimental Psychology: General, 126, 14-18, 1997; Vecera & Farah, Journal of Experimental Psychology: General, 123, 146-160, 1994). In the present study, we asked two questions regarding this late object-based selection mechanism. In Part I, we investigated how observers' foreknowledge of to-be-reported features allows attention to select objects, as opposed to individual features. Using a feature-report task, a significant object-based effect was observed when to-be-reported features were known in advance but not when this advance knowledge was absent. In Part II, we examined what drives attention to select objects rather than individual features in the absence of observers' foreknowledge of to-be-reported features. Results suggested that, when there was no opportunity for observers to direct their attention to objects that possess to-be-reported features at the time of stimulus presentation, these stimuli must retain strong perceptual cues to establish themselves as separate objects.

  18. Day-ahead price forecasting of electricity markets by a new feature selection algorithm and cascaded neural network technique

    International Nuclear Information System (INIS)

    Amjady, Nima; Keynia, Farshid

    2009-01-01

    With the introduction of restructuring into the electric power industry, the price of electricity has become the focus of all activities in the power market. Electricity price forecast is key information for electricity market managers and participants. However, electricity price is a complex signal due to its non-linear, non-stationary, and time variant behavior. In spite of performed research in this area, more accurate and robust price forecast methods are still required. In this paper, a new forecast strategy is proposed for day-ahead price forecasting of electricity markets. Our forecast strategy is composed of a new two stage feature selection technique and cascaded neural networks. The proposed feature selection technique comprises modified Relief algorithm for the first stage and correlation analysis for the second stage. The modified Relief algorithm selects candidate inputs with maximum relevancy with the target variable. Then among the selected candidates, the correlation analysis eliminates redundant inputs. Selected features by the two stage feature selection technique are used for the forecast engine, which is composed of 24 consecutive forecasters. Each of these 24 forecasters is a neural network allocated to predict the price of 1 h of the next day. The whole proposed forecast strategy is examined on the Spanish and Australia's National Electricity Markets Management Company (NEMMCO) and compared with some of the most recent price forecast methods.

  19. Comparative Study on Feature Selection and Fusion Schemes for Emotion Recognition from Speech

    Directory of Open Access Journals (Sweden)

    Santiago Planet

    2012-09-01

    Full Text Available The automatic analysis of speech to detect affective states may improve the way users interact with electronic devices. However, the analysis only at the acoustic level could be not enough to determine the emotion of a user in a realistic scenario. In this paper we analyzed the spontaneous speech recordings of the FAU Aibo Corpus at the acoustic and linguistic levels to extract two sets of features. The acoustic set was reduced by a greedy procedure selecting the most relevant features to optimize the learning stage. We compared two versions of this greedy selection algorithm by performing the search of the relevant features forwards and backwards. We experimented with three classification approaches: Naïve-Bayes, a support vector machine and a logistic model tree, and two fusion schemes: decision-level fusion, merging the hard-decisions of the acoustic and linguistic classifiers by means of a decision tree; and feature-level fusion, concatenating both sets of features before the learning stage. Despite the low performance achieved by the linguistic data, a dramatic improvement was achieved after its combination with the acoustic information, improving the results achieved by this second modality on its own. The results achieved by the classifiers using the parameters merged at feature level outperformed the classification results of the decision-level fusion scheme, despite the simplicity of the scheme. Moreover, the extremely reduced set of acoustic features obtained by the greedy forward search selection algorithm improved the results provided by the full set.

  20. The impact of feature selection on one and two-class classification performance for plant microRNAs.

    Science.gov (United States)

    Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens

    2016-01-01

    MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. It ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC where they act as recognition keys to aid in regulation of target mRNAs. It is involved to determine miRNAs experimentally and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although, in general, two-class classification (TCC) is used in the field; because negative examples are hard to come by, one-class classification (OCC) has been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC where the best feature selection method achieved an average accuracy of 95.6%, thereby being ∼29% better than the worst method which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ∼13% which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.

  1. The impact of feature selection on one and two-class classification performance for plant microRNAs

    Directory of Open Access Journals (Sweden)

    Waleed Khalifa

    2016-06-01

    Full Text Available MicroRNAs (miRNAs are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. It ultimately leads to the incorporation of 18–24 nt long mature miRNAs into RISC where they act as recognition keys to aid in regulation of target mRNAs. It is involved to determine miRNAs experimentally and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although, in general, two-class classification (TCC is used in the field; because negative examples are hard to come by, one-class classification (OCC has been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC where the best feature selection method achieved an average accuracy of 95.6%, thereby being ∼29% better than the worst method which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ∼13% which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.

  2. On the prior probabilities for two-stage Bayesian estimates

    International Nuclear Information System (INIS)

    Kohut, P.

    1992-01-01

    The method of Bayesian inference is reexamined for its applicability and for the required underlying assumptions in obtaining and using prior probability estimates. Two different approaches are suggested to determine the first-stage priors in the two-stage Bayesian analysis which avoid certain assumptions required for other techniques. In the first scheme, the prior is obtained through a true frequency based distribution generated at selected intervals utilizing actual sampling of the failure rate distributions. The population variability distribution is generated as the weighed average of the frequency distributions. The second method is based on a non-parametric Bayesian approach using the Maximum Entropy Principle. Specific features such as integral properties or selected parameters of prior distributions may be obtained with minimal assumptions. It is indicated how various quantiles may also be generated with a least square technique

  3. Online feature selection with streaming features.

    Science.gov (United States)

    Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan

    2013-05-01

    We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.

  4. A Two-Stage Penalized Logistic Regression Approach to Case-Control Genome-Wide Association Studies

    Directory of Open Access Journals (Sweden)

    Jingyuan Zhao

    2012-01-01

    Full Text Available We propose a two-stage penalized logistic regression approach to case-control genome-wide association studies. This approach consists of a screening stage and a selection stage. In the screening stage, main-effect and interaction-effect features are screened by using L1-penalized logistic like-lihoods. In the selection stage, the retained features are ranked by the logistic likelihood with the smoothly clipped absolute deviation (SCAD penalty (Fan and Li, 2001 and Jeffrey’s Prior penalty (Firth, 1993, a sequence of nested candidate models are formed, and the models are assessed by a family of extended Bayesian information criteria (J. Chen and Z. Chen, 2008. The proposed approach is applied to the analysis of the prostate cancer data of the Cancer Genetic Markers of Susceptibility (CGEMS project in the National Cancer Institute, USA. Simulation studies are carried out to compare the approach with the pair-wise multiple testing approach (Marchini et al. 2005 and the LASSO-patternsearch algorithm (Shi et al. 2007.

  5. Attentional Selection of Feature Conjunctions Is Accomplished by Parallel and Independent Selection of Single Features.

    Science.gov (United States)

    Andersen, Søren K; Müller, Matthias M; Hillyard, Steven A

    2015-07-08

    Experiments that study feature-based attention have often examined situations in which selection is based on a single feature (e.g., the color red). However, in more complex situations relevant stimuli may not be set apart from other stimuli by a single defining property but by a specific combination of features. Here, we examined sustained attentional selection of stimuli defined by conjunctions of color and orientation. Human observers attended to one out of four concurrently presented superimposed fields of randomly moving horizontal or vertical bars of red or blue color to detect brief intervals of coherent motion. Selective stimulus processing in early visual cortex was assessed by recordings of steady-state visual evoked potentials (SSVEPs) elicited by each of the flickering fields of stimuli. We directly contrasted attentional selection of single features and feature conjunctions and found that SSVEP amplitudes on conditions in which selection was based on a single feature only (color or orientation) exactly predicted the magnitude of attentional enhancement of SSVEPs when attending to a conjunction of both features. Furthermore, enhanced SSVEP amplitudes elicited by attended stimuli were accompanied by equivalent reductions of SSVEP amplitudes elicited by unattended stimuli in all cases. We conclude that attentional selection of a feature-conjunction stimulus is accomplished by the parallel and independent facilitation of its constituent feature dimensions in early visual cortex. The ability to perceive the world is limited by the brain's processing capacity. Attention affords adaptive behavior by selectively prioritizing processing of relevant stimuli based on their features (location, color, orientation, etc.). We found that attentional mechanisms for selection of different features belonging to the same object operate independently and in parallel: concurrent attentional selection of two stimulus features is simply the sum of attending to each of those

  6. Color selection and location selection in ERPs : differences, similarities and 'neural specificity'

    NARCIS (Netherlands)

    Lange, J.J.; Wijers, A.A.; Mulder, L.J.M.; Mulder, G.

    It was hypothesized that color selection consists of two stages. The first stage represents a feature specific selection in neural populations specialized in processing color. The second stage constitutes feature non-specific selections, related to executive attentional processes and/or motor

  7. Doubly sparse factor models for unifying feature transformation and feature selection

    International Nuclear Information System (INIS)

    Katahira, Kentaro; Okanoya, Kazuo; Okada, Masato; Matsumoto, Narihisa; Sugase-Miyamoto, Yasuko

    2010-01-01

    A number of unsupervised learning methods for high-dimensional data are largely divided into two groups based on their procedures, i.e., (1) feature selection, which discards irrelevant dimensions of the data, and (2) feature transformation, which constructs new variables by transforming and mixing over all dimensions. We propose a method that both selects and transforms features in a common Bayesian inference procedure. Our method imposes a doubly automatic relevance determination (ARD) prior on the factor loading matrix. We propose a variational Bayesian inference for our model and demonstrate the performance of our method on both synthetic and real data.

  8. Doubly sparse factor models for unifying feature transformation and feature selection

    Energy Technology Data Exchange (ETDEWEB)

    Katahira, Kentaro; Okanoya, Kazuo; Okada, Masato [ERATO, Okanoya Emotional Information Project, Japan Science Technology Agency, Saitama (Japan); Matsumoto, Narihisa; Sugase-Miyamoto, Yasuko, E-mail: okada@k.u-tokyo.ac.j [Human Technology Research Institute, National Institute of Advanced Industrial Science and Technology, Ibaraki (Japan)

    2010-06-01

    A number of unsupervised learning methods for high-dimensional data are largely divided into two groups based on their procedures, i.e., (1) feature selection, which discards irrelevant dimensions of the data, and (2) feature transformation, which constructs new variables by transforming and mixing over all dimensions. We propose a method that both selects and transforms features in a common Bayesian inference procedure. Our method imposes a doubly automatic relevance determination (ARD) prior on the factor loading matrix. We propose a variational Bayesian inference for our model and demonstrate the performance of our method on both synthetic and real data.

  9. Feature selection for high-dimensional integrated data

    KAUST Repository

    Zheng, Charles

    2012-04-26

    Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y, and the remainder of the predictors constitute a “noise set” Xu independent of Y. Using Monte Carlo simulations, we investigated the relative performance of two methods: thresholding and singular-value decomposition, in combination with stochastic optimization to determine “empirical bounds” on the small-sample accuracy of an asymptotic approximation. We demonstrate utility of the thresholding and SVD feature selection methods to with respect to a recent infant intestinal gene expression and metagenomics dataset.

  10. Feature selection for high-dimensional integrated data

    KAUST Repository

    Zheng, Charles; Schwartz, Scott; Chapkin, Robert S.; Carroll, Raymond J.; Ivanov, Ivan

    2012-01-01

    Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y, and the remainder of the predictors constitute a “noise set” Xu independent of Y. Using Monte Carlo simulations, we investigated the relative performance of two methods: thresholding and singular-value decomposition, in combination with stochastic optimization to determine “empirical bounds” on the small-sample accuracy of an asymptotic approximation. We demonstrate utility of the thresholding and SVD feature selection methods to with respect to a recent infant intestinal gene expression and metagenomics dataset.

  11. Single-stage Acetabular Revision During Two-stage THA Revision for Infection is Effective in Selected Patients.

    Science.gov (United States)

    Fink, Bernd; Schlumberger, Michael; Oremek, Damian

    2017-08-01

    The treatment of periprosthetic infections of hip arthroplasties typically involves use of either a single- or two-stage (with implantation of a temporary spacer) revision surgery. In patients with severe acetabular bone deficiencies, either already present or after component removal, spacers cannot be safely implanted. In such hips where it is impossible to use spacers and yet a two-stage revision of the prosthetic stem is recommended, we have combined a two-stage revision of the stem with a single revision of the cup. To our knowledge, this approach has not been reported before. (1) What proportion of patients treated with single-stage acetabular reconstruction as part of a two-stage revision for an infected THA remain free from infection at 2 or more years? (2) What are the Harris hip scores after the first stage and at 2 years or more after the definitive reimplantation? Between June 2009 and June 2014, we treated all patients undergoing surgical treatment for an infected THA using a single-stage acetabular revision as part of a two-stage THA exchange if the acetabular defect classification was Paprosky Types 2B, 2C, 3A, 3B, or pelvic discontinuity and a two-stage procedure was preferred for the femur. The procedure included removal of all components, joint débridement, definitive acetabular reconstruction (with a cage to bridge the defect, and a cemented socket), and a temporary cemented femoral component at the first stage; the second stage consisted of repeat joint and femoral débridement and exchange of the femoral component to a cementless device. During the period noted, 35 patients met those definitions and were treated with this approach. No patients were lost to followup before 2 years; mean followup was 42 months (range, 24-84 months). The clinical evaluation was performed with the Harris hip scores and resolution of infection was assessed by the absence of clinical signs of infection and a C-reactive protein level less than 10 mg/L. All

  12. Genome-wide association data classification and SNPs selection using two-stage quality-based Random Forests.

    Science.gov (United States)

    Nguyen, Thanh-Tung; Huang, Joshua; Wu, Qingyao; Nguyen, Thuy; Li, Mark

    2015-01-01

    Single-nucleotide polymorphisms (SNPs) selection and identification are the most important tasks in Genome-wide association data analysis. The problem is difficult because genome-wide association data is very high dimensional and a large portion of SNPs in the data is irrelevant to the disease. Advanced machine learning methods have been successfully used in Genome-wide association studies (GWAS) for identification of genetic variants that have relatively big effects in some common, complex diseases. Among them, the most successful one is Random Forests (RF). Despite of performing well in terms of prediction accuracy in some data sets with moderate size, RF still suffers from working in GWAS for selecting informative SNPs and building accurate prediction models. In this paper, we propose to use a new two-stage quality-based sampling method in random forests, named ts-RF, for SNP subspace selection for GWAS. The method first applies p-value assessment to find a cut-off point that separates informative and irrelevant SNPs in two groups. The informative SNPs group is further divided into two sub-groups: highly informative and weak informative SNPs. When sampling the SNP subspace for building trees for the forest, only those SNPs from the two sub-groups are taken into account. The feature subspaces always contain highly informative SNPs when used to split a node at a tree. This approach enables one to generate more accurate trees with a lower prediction error, meanwhile possibly avoiding overfitting. It allows one to detect interactions of multiple SNPs with the diseases, and to reduce the dimensionality and the amount of Genome-wide association data needed for learning the RF model. Extensive experiments on two genome-wide SNP data sets (Parkinson case-control data comprised of 408,803 SNPs and Alzheimer case-control data comprised of 380,157 SNPs) and 10 gene data sets have demonstrated that the proposed model significantly reduced prediction errors and outperformed

  13. Efficient Multi-Label Feature Selection Using Entropy-Based Label Selection

    Directory of Open Access Journals (Sweden)

    Jaesung Lee

    2016-11-01

    Full Text Available Multi-label feature selection is designed to select a subset of features according to their importance to multiple labels. This task can be achieved by ranking the dependencies of features and selecting the features with the highest rankings. In a multi-label feature selection problem, the algorithm may be faced with a dataset containing a large number of labels. Because the computational cost of multi-label feature selection increases according to the number of labels, the algorithm may suffer from a degradation in performance when processing very large datasets. In this study, we propose an efficient multi-label feature selection method based on an information-theoretic label selection strategy. By identifying a subset of labels that significantly influence the importance of features, the proposed method efficiently outputs a feature subset. Experimental results demonstrate that the proposed method can identify a feature subset much faster than conventional multi-label feature selection methods for large multi-label datasets.

  14. Pairwise Constraint-Guided Sparse Learning for Feature Selection.

    Science.gov (United States)

    Liu, Mingxia; Zhang, Daoqiang

    2016-01-01

    Feature selection aims to identify the most informative features for a compact and accurate data representation. As typical supervised feature selection methods, Lasso and its variants using L1-norm-based regularization terms have received much attention in recent studies, most of which use class labels as supervised information. Besides class labels, there are other types of supervised information, e.g., pairwise constraints that specify whether a pair of data samples belong to the same class (must-link constraint) or different classes (cannot-link constraint). However, most of existing L1-norm-based sparse learning methods do not take advantage of the pairwise constraints that provide us weak and more general supervised information. For addressing that problem, we propose a pairwise constraint-guided sparse (CGS) learning method for feature selection, where the must-link and the cannot-link constraints are used as discriminative regularization terms that directly concentrate on the local discriminative structure of data. Furthermore, we develop two variants of CGS, including: 1) semi-supervised CGS that utilizes labeled data, pairwise constraints, and unlabeled data and 2) ensemble CGS that uses the ensemble of pairwise constraint sets. We conduct a series of experiments on a number of data sets from University of California-Irvine machine learning repository, a gene expression data set, two real-world neuroimaging-based classification tasks, and two large-scale attribute classification tasks. Experimental results demonstrate the efficacy of our proposed methods, compared with several established feature selection methods.

  15. FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES

    Directory of Open Access Journals (Sweden)

    Ratri Enggar Pawening

    2016-06-01

    Full Text Available Datasets with heterogeneous features can affect feature selection results that are not appropriate because it is difficult to evaluate heterogeneous features concurrently. Feature transformation (FT is another way to handle heterogeneous features subset selection. The results of transformation from non-numerical into numerical features may produce redundancy to the original numerical features. In this paper, we propose a method to select feature subset based on mutual information (MI for classifying heterogeneous features. We use unsupervised feature transformation (UFT methods and joint mutual information maximation (JMIM methods. UFT methods is used to transform non-numerical features into numerical features. JMIM methods is used to select feature subset with a consideration of the class label. The transformed and the original features are combined entirely, then determine features subset by using JMIM methods, and classify them using support vector machine (SVM algorithm. The classification accuracy are measured for any number of selected feature subset and compared between UFT-JMIM methods and Dummy-JMIM methods. The average classification accuracy for all experiments in this study that can be achieved by UFT-JMIM methods is about 84.47% and Dummy-JMIM methods is about 84.24%. This result shows that UFT-JMIM methods can minimize information loss between transformed and original features, and select feature subset to avoid redundant and irrelevant features.

  16. Hybrid feature selection for supporting lightweight intrusion detection systems

    Science.gov (United States)

    Song, Jianglong; Zhao, Wentao; Liu, Qiang; Wang, Xin

    2017-08-01

    Redundant and irrelevant features not only cause high resource consumption but also degrade the performance of Intrusion Detection Systems (IDS), especially when coping with big data. These features slow down the process of training and testing in network traffic classification. Therefore, a hybrid feature selection approach in combination with wrapper and filter selection is designed in this paper to build a lightweight intrusion detection system. Two main phases are involved in this method. The first phase conducts a preliminary search for an optimal subset of features, in which the chi-square feature selection is utilized. The selected set of features from the previous phase is further refined in the second phase in a wrapper manner, in which the Random Forest(RF) is used to guide the selection process and retain an optimized set of features. After that, we build an RF-based detection model and make a fair comparison with other approaches. The experimental results on NSL-KDD datasets show that our approach results are in higher detection accuracy as well as faster training and testing processes.

  17. Accuracy of the One-Stage and Two-Stage Impression Techniques: A Comparative Analysis.

    Science.gov (United States)

    Jamshidy, Ladan; Mozaffari, Hamid Reza; Faraji, Payam; Sharifi, Roohollah

    2016-01-01

    Introduction . One of the main steps of impression is the selection and preparation of an appropriate tray. Hence, the present study aimed to analyze and compare the accuracy of one- and two-stage impression techniques. Materials and Methods . A resin laboratory-made model, as the first molar, was prepared by standard method for full crowns with processed preparation finish line of 1 mm depth and convergence angle of 3-4°. Impression was made 20 times with one-stage technique and 20 times with two-stage technique using an appropriate tray. To measure the marginal gap, the distance between the restoration margin and preparation finish line of plaster dies was vertically determined in mid mesial, distal, buccal, and lingual (MDBL) regions by a stereomicroscope using a standard method. Results . The results of independent test showed that the mean value of the marginal gap obtained by one-stage impression technique was higher than that of two-stage impression technique. Further, there was no significant difference between one- and two-stage impression techniques in mid buccal region, but a significant difference was reported between the two impression techniques in MDL regions and in general. Conclusion . The findings of the present study indicated higher accuracy for two-stage impression technique than for the one-stage impression technique.

  18. One-stage versus two-stage exchange arthroplasty for infected total knee arthroplasty: a systematic review.

    Science.gov (United States)

    Nagra, Navraj S; Hamilton, Thomas W; Ganatra, Sameer; Murray, David W; Pandit, Hemant

    2016-10-01

    Infection complicating total knee arthroplasty (TKA) has serious implications. Traditionally the debate on whether one- or two-stage exchange arthroplasty is the optimum management of infected TKA has favoured two-stage procedures; however, a paradigm shift in opinion is emerging. This study aimed to establish whether current evidence supports one-stage revision for managing infected TKA based on reinfection rates and functional outcomes post-surgery. MEDLINE/PubMed and CENTRAL databases were reviewed for studies that compared one- and two-stage exchange arthroplasty TKA in more than ten patients with a minimum 2-year follow-up. From an initial sample of 796, five cohort studies with a total of 231 patients (46 single-stage/185 two-stage; median patient age 66 years, range 61-71 years) met inclusion criteria. Overall, there were no significant differences in risk of reinfection following one- or two-stage exchange arthroplasty (OR -0.06, 95 % confidence interval -0.13, 0.01). Subgroup analysis revealed that in studies published since 2000, one-stage procedures have a significantly lower reinfection rate. One study investigated functional outcomes and reported that one-stage surgery was associated with superior functional outcomes. Scarcity of data, inconsistent study designs, surgical technique and antibiotic regime disparities limit recommendations that can be made. Recent studies suggest one-stage exchange arthroplasty may provide superior outcomes, including lower reinfection rates and superior function, in select patients. Clinically, for some patients, one-stage exchange arthroplasty may represent optimum treatment; however, patient selection criteria and key components of surgical and post-operative anti-microbial management remain to be defined. III.

  19. Degree of contribution (DoC) feature selection algorithm for structural brain MRI volumetric features in depression detection.

    Science.gov (United States)

    Kipli, Kuryati; Kouzani, Abbas Z

    2015-07-01

    Accurate detection of depression at an individual level using structural magnetic resonance imaging (sMRI) remains a challenge. Brain volumetric changes at a structural level appear to have importance in depression biomarkers studies. An automated algorithm is developed to select brain sMRI volumetric features for the detection of depression. A feature selection (FS) algorithm called degree of contribution (DoC) is developed for selection of sMRI volumetric features. This algorithm uses an ensemble approach to determine the degree of contribution in detection of major depressive disorder. The DoC is the score of feature importance used for feature ranking. The algorithm involves four stages: feature ranking, subset generation, subset evaluation, and DoC analysis. The performance of DoC is evaluated on the Duke University Multi-site Imaging Research in the Analysis of Depression sMRI dataset. The dataset consists of 115 brain sMRI scans of 88 healthy controls and 27 depressed subjects. Forty-four sMRI volumetric features are used in the evaluation. The DoC score of forty-four features was determined as the accuracy threshold (Acc_Thresh) was varied. The DoC performance was compared with that of four existing FS algorithms. At all defined Acc_Threshs, DoC outperformed the four examined FS algorithms for the average classification score and the maximum classification score. DoC has a good ability to generate reduced-size subsets of important features that could yield high classification accuracy. Based on the DoC score, the most discriminant volumetric features are those from the left-brain region.

  20. Accuracy of the One-Stage and Two-Stage Impression Techniques: A Comparative Analysis

    Directory of Open Access Journals (Sweden)

    Ladan Jamshidy

    2016-01-01

    Full Text Available Introduction. One of the main steps of impression is the selection and preparation of an appropriate tray. Hence, the present study aimed to analyze and compare the accuracy of one- and two-stage impression techniques. Materials and Methods. A resin laboratory-made model, as the first molar, was prepared by standard method for full crowns with processed preparation finish line of 1 mm depth and convergence angle of 3-4°. Impression was made 20 times with one-stage technique and 20 times with two-stage technique using an appropriate tray. To measure the marginal gap, the distance between the restoration margin and preparation finish line of plaster dies was vertically determined in mid mesial, distal, buccal, and lingual (MDBL regions by a stereomicroscope using a standard method. Results. The results of independent test showed that the mean value of the marginal gap obtained by one-stage impression technique was higher than that of two-stage impression technique. Further, there was no significant difference between one- and two-stage impression techniques in mid buccal region, but a significant difference was reported between the two impression techniques in MDL regions and in general. Conclusion. The findings of the present study indicated higher accuracy for two-stage impression technique than for the one-stage impression technique.

  1. Unsupervised Feature Subset Selection

    DEFF Research Database (Denmark)

    Søndberg-Madsen, Nicolaj; Thomsen, C.; Pena, Jose

    2003-01-01

    This paper studies filter and hybrid filter-wrapper feature subset selection for unsupervised learning (data clustering). We constrain the search for the best feature subset by scoring the dependence of every feature on the rest of the features, conjecturing that these scores discriminate some ir...... irrelevant features. We report experimental results on artificial and real data for unsupervised learning of naive Bayes models. Both the filter and hybrid approaches perform satisfactorily....

  2. Feature-selective attention in healthy old age: a selective decline in selective attention?

    Science.gov (United States)

    Quigley, Cliodhna; Müller, Matthias M

    2014-02-12

    Deficient selection against irrelevant information has been proposed to underlie age-related cognitive decline. We recently reported evidence for maintained early sensory selection when older and younger adults used spatial selective attention to perform a challenging task. Here we explored age-related differences when spatial selection is not possible and feature-selective attention must be deployed. We additionally compared the integrity of feedforward processing by exploiting the well established phenomenon of suppression of visual cortical responses attributable to interstimulus competition. Electroencephalogram was measured while older and younger human adults responded to brief occurrences of coherent motion in an attended stimulus composed of randomly moving, orientation-defined, flickering bars. Attention was directed to horizontal or vertical bars by a pretrial cue, after which two orthogonally oriented, overlapping stimuli or a single stimulus were presented. Horizontal and vertical bars flickered at different frequencies and thereby elicited separable steady-state visual-evoked potentials, which were used to examine the effect of feature-based selection and the competitive influence of a second stimulus on ongoing visual processing. Age differences were found in feature-selective attentional modulation of visual responses: older adults did not show consistent modulation of magnitude or phase. In contrast, the suppressive effect of a second stimulus was robust and comparable in magnitude across age groups, suggesting that bottom-up processing of the current stimuli is essentially unchanged in healthy old age. Thus, it seems that visual processing per se is unchanged, but top-down attentional control is compromised in older adults when space cannot be used to guide selection.

  3. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    Science.gov (United States)

    Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

    2017-01-01

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods.

  4. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    Science.gov (United States)

    Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

    2017-01-01

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods. PMID:28120883

  5. Feature Selection by Reordering

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2005-01-01

    Roč. 2, č. 1 (2005), s. 155-161 ISSN 1738-6438 Institutional research plan: CEZ:AV0Z10300504 Keywords : feature selection * data reduction * ordering of features Subject RIV: BA - General Mathematics

  6. EOG and EMG: two important switches in automatic sleep stage classification.

    Science.gov (United States)

    Estrada, E; Nazeran, H; Barragan, J; Burk, J R; Lucas, E A; Behbehani, K

    2006-01-01

    Sleep is a natural periodic state of rest for the body, in which the eyes are usually closed and consciousness is completely or partially lost. In this investigation we used the EOG and EMG signals acquired from 10 patients undergoing overnight polysomnography with their sleep stages determined by expert sleep specialists based on RK rules. Differentiation between Stage 1, Awake and REM stages challenged a well trained neural network classifier to distinguish between classes when only EEG-derived signal features were used. To meet this challenge and improve the classification rate, extra features extracted from EOG and EMG signals were fed to the classifier. In this study, two simple feature extraction algorithms were applied to EOG and EMG signals. The statistics of the results were calculated and displayed in an easy to visualize fashion to observe tendencies for each sleep stage. Inclusion of these features show a great promise to improve the classification rate towards the target rate of 100%

  7. Accuracy of preoperative CT T staging of renal cell carcinoma: which features predict advanced stage?

    International Nuclear Information System (INIS)

    Bradley, A.J.; MacDonald, L.; Whiteside, S.; Johnson, R.J.; Ramani, V.A.C.

    2015-01-01

    Aims: To characterise CT findings in renal cell carcinoma (RCC), and establish which features are associated with higher clinical T stage disease, and to evaluate patterns of discrepancy between radiological and pathological staging of RCC. Materials and methods: Preoperative CT studies of 92 patients with 94 pathologically proven RCCs were retrospectively reviewed. CT stage was compared with pathological stage using the American Joint Committee on Cancer (AJCC), 7 th edition (2010). The presence or absence of tumour necrosis, perinephric fat standing, thickening of Gerota's fascia, collateral vessels were noted, and correlated with pT stage. The sensitivity, specificity, and positive (PPV) and negative predictive values (NPV) for predicting pT stage ≥pT3a were derived separately for different predictors using cross-tabulations. Results: Twenty-four lesions were pathological stage T1a, 21 were T1b, seven were T2a, 25 were T3a, 11 were T3b, four were T3c, and two were T4. There were no stage T2b. Sixty-three (67%) patients had necrosis, 27 (29%) thickening of Gerota's fascia (1 T1a), 25 had collateral vessels (0 T1a), 28 (30%) had fat stranding of <2 mm, 20 (21%) of 2–5mm and one (1%) of >5 mm. For pT stage ≥pT3a, the presence of perinephric fat stranding had a sensitivity, specificity, PPV and NPV of 74%, 65%, 63%, and 76%, respectively. Presence of tumour necrosis had a sensitivity, specificity, PPV, and NPV of 81%, 44%, 54%, and 72%, respectively. Thickening of Gerota's fascia had a sensitivity, specificity, PPV, and NPV of 52%, 90%, 81% and 70%, respectively; and enlarged collateral vessels had a sensitivity, specificity, PPV, and NPV value of 52%, 94%, 88%, and 71% respectively. Conclusion: The presence of perinephric stranding and tumour necrosis were not reliable signs for pT stage >T3a. Thickening of Gerota's fascia and the presence of collateral vessels in the peri- or paranephric fat had 90% and 94% specificity, with 82% and 88

  8. Feature Selection Methods for Zero-Shot Learning of Neural Activity

    Directory of Open Access Journals (Sweden)

    Carlos A. Caceres

    2017-06-01

    Full Text Available Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.

  9. Principal Feature Analysis: A Multivariate Feature Selection Method for fMRI Data

    Directory of Open Access Journals (Sweden)

    Lijun Wang

    2013-01-01

    Full Text Available Brain decoding with functional magnetic resonance imaging (fMRI requires analysis of complex, multivariate data. Multivoxel pattern analysis (MVPA has been widely used in recent years. MVPA treats the activation of multiple voxels from fMRI data as a pattern and decodes brain states using pattern classification methods. Feature selection is a critical procedure of MVPA because it decides which features will be included in the classification analysis of fMRI data, thereby improving the performance of the classifier. Features can be selected by limiting the analysis to specific anatomical regions or by computing univariate (voxel-wise or multivariate statistics. However, these methods either discard some informative features or select features with redundant information. This paper introduces the principal feature analysis as a novel multivariate feature selection method for fMRI data processing. This multivariate approach aims to remove features with redundant information, thereby selecting fewer features, while retaining the most information.

  10. A redundancy-removing feature selection algorithm for nominal data

    Directory of Open Access Journals (Sweden)

    Zhihua Li

    2015-10-01

    Full Text Available No order correlation or similarity metric exists in nominal data, and there will always be more redundancy in a nominal dataset, which means that an efficient mutual information-based nominal-data feature selection method is relatively difficult to find. In this paper, a nominal-data feature selection method based on mutual information without data transformation, called the redundancy-removing more relevance less redundancy algorithm, is proposed. By forming several new information-related definitions and the corresponding computational methods, the proposed method can compute the information-related amount of nominal data directly. Furthermore, by creating a new evaluation function that considers both the relevance and the redundancy globally, the new feature selection method can evaluate the importance of each nominal-data feature. Although the presented feature selection method takes commonly used MIFS-like forms, it is capable of handling high-dimensional datasets without expensive computations. We perform extensive experimental comparisons of the proposed algorithm and other methods using three benchmarking nominal datasets with two different classifiers. The experimental results demonstrate the average advantage of the presented algorithm over the well-known NMIFS algorithm in terms of the feature selection and classification accuracy, which indicates that the proposed method has a promising performance.

  11. Max-AUC feature selection in computer-aided detection of polyps in CT colonography.

    Science.gov (United States)

    Xu, Jian-Wu; Suzuki, Kenji

    2014-03-01

    We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level.

  12. Effective automated feature construction and selection for classification of biological sequences.

    Directory of Open Access Journals (Sweden)

    Uday Kamath

    Full Text Available Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features.We present an algorithmic framework (EFFECT for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not.To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or modification

  13. Features of mechanical snubbers and the method of selection

    International Nuclear Information System (INIS)

    Sunakoda, Katsuaki

    1978-01-01

    In the oil snubbers used in the high radiation environment of nuclear power stations, gas generation from oil and the deterioration of rubber material for sealing occur due to radiation damage, therefore periodical inspection and replacement are required during operation. The mechanical snubbers developed as aseismatic supporters in place of oil snubbers have entered the stage of practical use, and are made by two companies in USA and a company in Japan. Their features as compared with oil snubbers are as follows. The cost and time required for the maintenance were made as small as possible because the increase of the service life of mechanical components can be expected. The temperature dependence of mechanical snubbers is small. The matters demanding attention in the maintenance are the secular change of lubricating oil and the effect of radiation, and the rust prevention of ball screw bearings. These problems are being studied by Power Reactor and Nuclear Fuel Development Corp. for the fast prototype reactor Monju. The structural feature is to convert the thrust movement of equipments and pipings due to thermal expansion and contraction or earthquakes into rotating motion, using ball screws. The features and the construction of SMS type mechanical snubbers, the test and inspection prior to their shipping, the method of selection, and the method of handling them in actual places are explained. (Kako, I.)

  14. Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection.

    Science.gov (United States)

    Li, Baopu; Meng, Max Q-H

    2012-05-01

    Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.

  15. Two-stage Security Controls Selection

    NARCIS (Netherlands)

    Yevseyeva, I.; Basto, Fernandes V.; Moorsel, van A.; Janicke, H.; Michael, Emmerich T. M.

    2016-01-01

    To protect a system from potential cyber security breaches and attacks, one needs to select efficient security controls, taking into account technical and institutional goals and constraints, such as available budget, enterprise activity, internal and external environment. Here we model the security

  16. Two-stage nonrecursive filter/decimator

    International Nuclear Information System (INIS)

    Yoder, J.R.; Richard, B.D.

    1980-08-01

    A two-stage digital filter/decimator has been designed and implemented to reduce the sampling rate associated with the long-term computer storage of certain digital waveforms. This report describes the design selection and implementation process and serves as documentation for the system actually installed. A filter design with finite-impulse response (nonrecursive) was chosen for implementation via direct convolution. A newly-developed system-test statistic validates the system under different computer-operating environments

  17. A Variance Minimization Criterion to Feature Selection Using Laplacian Regularization.

    Science.gov (United States)

    He, Xiaofei; Ji, Ming; Zhang, Chiyuan; Bao, Hujun

    2011-10-01

    In many information processing tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find the meaningful feature subset of the original features which can facilitate clustering, classification, and retrieval. In this paper, we consider the feature selection problem in unsupervised learning scenarios, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. Based on Laplacian regularized least squares, which finds a smooth function on the data manifold and minimizes the empirical loss, we propose two novel feature selection algorithms which aim to minimize the expected prediction error of the regularized regression model. Specifically, we select those features such that the size of the parameter covariance matrix of the regularized regression model is minimized. Motivated from experimental design, we use trace and determinant operators to measure the size of the covariance matrix. Efficient computational schemes are also introduced to solve the corresponding optimization problems. Extensive experimental results over various real-life data sets have demonstrated the superiority of the proposed algorithms.

  18. Discriminative semi-supervised feature selection via manifold regularization.

    Science.gov (United States)

    Xu, Zenglin; King, Irwin; Lyu, Michael Rung-Tsong; Jin, Rong

    2010-07-01

    Feature selection has attracted a huge amount of interest in both research and application communities of data mining. We consider the problem of semi-supervised feature selection, where we are given a small amount of labeled examples and a large amount of unlabeled examples. Since a small number of labeled samples are usually insufficient for identifying the relevant features, the critical problem arising from semi-supervised feature selection is how to take advantage of the information underneath the unlabeled data. To address this problem, we propose a novel discriminative semi-supervised feature selection method based on the idea of manifold regularization. The proposed approach selects features through maximizing the classification margin between different classes and simultaneously exploiting the geometry of the probability distribution that generates both labeled and unlabeled data. In comparison with previous semi-supervised feature selection algorithms, our proposed semi-supervised feature selection method is an embedded feature selection method and is able to find more discriminative features. We formulate the proposed feature selection method into a convex-concave optimization problem, where the saddle point corresponds to the optimal solution. To find the optimal solution, the level method, a fairly recent optimization method, is employed. We also present a theoretic proof of the convergence rate for the application of the level method to our problem. Empirical evaluation on several benchmark data sets demonstrates the effectiveness of the proposed semi-supervised feature selection method.

  19. Achieving Accurate Automatic Sleep Staging on Manually Pre-processed EEG Data Through Synchronization Feature Extraction and Graph Metrics.

    Science.gov (United States)

    Chriskos, Panteleimon; Frantzidis, Christos A; Gkivogkli, Polyxeni T; Bamidis, Panagiotis D; Kourtidou-Papadeli, Chrysoula

    2018-01-01

    Sleep staging, the process of assigning labels to epochs of sleep, depending on the stage of sleep they belong, is an arduous, time consuming and error prone process as the initial recordings are quite often polluted by noise from different sources. To properly analyze such data and extract clinical knowledge, noise components must be removed or alleviated. In this paper a pre-processing and subsequent sleep staging pipeline for the sleep analysis of electroencephalographic signals is described. Two novel methods of functional connectivity estimation (Synchronization Likelihood/SL and Relative Wavelet Entropy/RWE) are comparatively investigated for automatic sleep staging through manually pre-processed electroencephalographic recordings. A multi-step process that renders signals suitable for further analysis is initially described. Then, two methods that rely on extracting synchronization features from electroencephalographic recordings to achieve computerized sleep staging are proposed, based on bivariate features which provide a functional overview of the brain network, contrary to most proposed methods that rely on extracting univariate time and frequency features. Annotation of sleep epochs is achieved through the presented feature extraction methods by training classifiers, which are in turn able to accurately classify new epochs. Analysis of data from sleep experiments on a randomized, controlled bed-rest study, which was organized by the European Space Agency and was conducted in the "ENVIHAB" facility of the Institute of Aerospace Medicine at the German Aerospace Center (DLR) in Cologne, Germany attains high accuracy rates, over 90% based on ground truth that resulted from manual sleep staging by two experienced sleep experts. Therefore, it can be concluded that the above feature extraction methods are suitable for semi-automatic sleep staging.

  20. EEG feature selection method based on decision tree.

    Science.gov (United States)

    Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun

    2015-01-01

    This paper aims to solve automated feature selection problem in brain computer interface (BCI). In order to automate feature selection process, we proposed a novel EEG feature selection method based on decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principle component analysis (PCA) was used, and the selection process based on decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on decision tree to BCI Competition II datasets Ia, and the experiment showed encouraging results.

  1. Feature Import Vector Machine: A General Classifier with Flexible Feature Selection.

    Science.gov (United States)

    Ghosh, Samiran; Wang, Yazhen

    2015-02-01

    The support vector machine (SVM) and other reproducing kernel Hilbert space (RKHS) based classifier systems are drawing much attention recently due to its robustness and generalization capability. General theme here is to construct classifiers based on the training data in a high dimensional space by using all available dimensions. The SVM achieves huge data compression by selecting only few observations which lie close to the boundary of the classifier function. However when the number of observations are not very large (small n ) but the number of dimensions/features are large (large p ), then it is not necessary that all available features are of equal importance in the classification context. Possible selection of an useful fraction of the available features may result in huge data compression. In this paper we propose an algorithmic approach by means of which such an optimal set of features could be selected. In short, we reverse the traditional sequential observation selection strategy of SVM to that of sequential feature selection. To achieve this we have modified the solution proposed by Zhu and Hastie (2005) in the context of import vector machine (IVM), to select an optimal sub-dimensional model to build the final classifier with sufficient accuracy.

  2. Analysis of Different Feature Selection Criteria Based on a Covariance Convergence Perspective for a SLAM Algorithm

    Science.gov (United States)

    Auat Cheein, Fernando A.; Carelli, Ricardo

    2011-01-01

    This paper introduces several non-arbitrary feature selection techniques for a Simultaneous Localization and Mapping (SLAM) algorithm. The feature selection criteria are based on the determination of the most significant features from a SLAM convergence perspective. The SLAM algorithm implemented in this work is a sequential EKF (Extended Kalman filter) SLAM. The feature selection criteria are applied on the correction stage of the SLAM algorithm, restricting it to correct the SLAM algorithm with the most significant features. This restriction also causes a decrement in the processing time of the SLAM. Several experiments with a mobile robot are shown in this work. The experiments concern the map reconstruction and a comparison between the different proposed techniques performance. The experiments were carried out at an outdoor environment composed by trees, although the results shown herein are not restricted to a special type of features. PMID:22346568

  3. CASSAVA BREEDING II: PHENOTYPIC CORRELATIONS THROUGH THE DIFFERENT STAGES OF SELECTION

    Directory of Open Access Journals (Sweden)

    Orlando Joaqui Barandica

    2016-12-01

    Full Text Available Breeding cassava relies on a phenotypic recurrent selection that takes advantage of the vegetative propagation of this crop. Successive stages of selection (single row trial- SRT; preliminary yield trial – PYT; advanced yield trial – AYT; and uniform yield trials UYT, gradually reduce the number of genotypes as the plot size, number of replications and locations increase. An important feature of this scheme is that, because of the clonal, reproduction of cassava, the same identical genotypes are evaluated throughout these four successive stages of selection. For this study data, from 14 years (more than 30,000 data points of evaluation in a sub-humid tropical environment was consolidated for a meta-analysis. Correlation coefficients for fresh root yield (FRY, dry matter content (DMC, harvest index (HIN and plant type score (PTS along the different stages of selection were estimated. DMC and PTS measured in different trials showed the highest correlation coefficients, indicating a relatively good repeatability. HIN had an intermediate repeatability, whereas FRY had the lowest value. The association between HIN and FRY was lower than expected, suggesting that HIN in early stages was not reliable as indirect selection for FRY in later stages. There was a consistent decrease in the average performance of clones grown in PYTs compared with the earlier evaluation of the same genotypes at SRTs. A feasible explanation for this trend is the impact of the environment on the physiological and nutritional status of the planting material and/or epigenetic effects. The usefulness of HIN is questioned. Measuring this variable takes considerable efforts at harvest time. DMC and FRY showed a weak positive association in SRT (r= 0.21 but a clearly negative one at UYT (r= -0.42. The change if the relationship between these variables is the result of selection. In later stages of selection, the plant is forced to maximize productivity on a dry weight basis

  4. Feature Selection for Chemical Sensor Arrays Using Mutual Information

    Science.gov (United States)

    Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.

    2014-01-01

    We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058

  5. A Comparison Study on Multidomain EEG Features for Sleep Stage Classification

    Directory of Open Access Journals (Sweden)

    Yu Zhang

    2017-01-01

    Full Text Available Feature extraction from physiological signals of EEG (electroencephalogram is an essential part for sleep staging. In this study, multidomain feature extraction was investigated based on time domain analysis, nonlinear analysis, and frequency domain analysis. Unlike the traditional feature calculation in time domain, a sequence merging method was developed as a preprocessing procedure. The objective is to eliminate the clutter waveform and highlight the characteristic waveform for further analysis. The numbers of the characteristic activities were extracted as the features from time domain. The contributions of features from different domains to the sleep stages were compared. The effectiveness was further analyzed by automatic sleep stage classification and compared with the visual inspection. The overnight clinical sleep EEG recordings of 3 patients after the treatment of Continuous Positive Airway Pressure (CPAP were tested. The obtained results showed that the developed method can highlight the characteristic activity which is useful for both automatic sleep staging and visual inspection. Furthermore, it can be a training tool for better understanding the appearance of characteristic waveforms from raw sleep EEG which is mixed and complex in time domain.

  6. Genetic search feature selection for affective modeling

    DEFF Research Database (Denmark)

    Martínez, Héctor P.; Yannakakis, Georgios N.

    2010-01-01

    Automatic feature selection is a critical step towards the generation of successful computational models of affect. This paper presents a genetic search-based feature selection method which is developed as a global-search algorithm for improving the accuracy of the affective models built....... The method is tested and compared against sequential forward feature selection and random search in a dataset derived from a game survey experiment which contains bimodal input features (physiological and gameplay) and expressed pairwise preferences of affect. Results suggest that the proposed method...

  7. AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

    Science.gov (United States)

    Sun, Lei; Wang, Jun; Wei, Jinmao

    2017-03-14

    The Receiver Operator Characteristic (ROC) curve is well-known in evaluating classification performance in biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and find out disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find real target feature subset due to their lack of effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by a trick of measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can get the minimal balanced error rate with a small amount of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.

  8. Naive Bayes-Guided Bat Algorithm for Feature Selection

    Directory of Open Access Journals (Sweden)

    Ahmed Majid Taha

    2013-01-01

    Full Text Available When the amount of data and information is said to double in every 20 months or so, feature selection has become highly important and beneficial. Further improvements in feature selection will positively affect a wide array of applications in fields such as pattern recognition, machine learning, or signal processing. Bio-inspired method called Bat Algorithm hybridized with a Naive Bayes classifier has been presented in this work. The performance of the proposed feature selection algorithm was investigated using twelve benchmark datasets from different domains and was compared to three other well-known feature selection algorithms. Discussion focused on four perspectives: number of features, classification accuracy, stability, and feature generalization. The results showed that BANB significantly outperformed other algorithms in selecting lower number of features, hence removing irrelevant, redundant, or noisy features while maintaining the classification accuracy. BANB is also proven to be more stable than other methods and is capable of producing more general feature subsets.

  9. Naive Bayes-Guided Bat Algorithm for Feature Selection

    Science.gov (United States)

    Taha, Ahmed Majid; Mustapha, Aida; Chen, Soong-Der

    2013-01-01

    When the amount of data and information is said to double in every 20 months or so, feature selection has become highly important and beneficial. Further improvements in feature selection will positively affect a wide array of applications in fields such as pattern recognition, machine learning, or signal processing. Bio-inspired method called Bat Algorithm hybridized with a Naive Bayes classifier has been presented in this work. The performance of the proposed feature selection algorithm was investigated using twelve benchmark datasets from different domains and was compared to three other well-known feature selection algorithms. Discussion focused on four perspectives: number of features, classification accuracy, stability, and feature generalization. The results showed that BANB significantly outperformed other algorithms in selecting lower number of features, hence removing irrelevant, redundant, or noisy features while maintaining the classification accuracy. BANB is also proven to be more stable than other methods and is capable of producing more general feature subsets. PMID:24396295

  10. Signatures of natural selection between life cycle stages separated by metamorphosis in European eel

    DEFF Research Database (Denmark)

    Pujolar, J.M.; Jacobsen, M.W.; Bekkevold, Dorte

    2015-01-01

    Species showing complex life cycles provide excellent opportunities to study the genetic associations between life cycle stages, as selective pressures may differ before and after metamorphosis. The European eel presents a complex life cycle with two metamorphoses, a first metamorphosis from larvae...... into glass eels (juvenile stage) and a second metamorphosis into silver eels (adult stage). We tested the hypothesis that different genes and gene pathways will be under selection at different life stages when comparing the genetic associations between glass eels and silver eels. Results: We used two sets...... of markers to test for selection: first, we genotyped individuals using a panel of 80 coding-gene single nucleotide polymorphisms (SNPs) developed in American eel; second, we investigated selection at the genome level using a total of 153,423 RAD-sequencing generated SNPs widely distributed across the genome...

  11. A HYBRID FILTER AND WRAPPER FEATURE SELECTION APPROACH FOR DETECTING CONTAMINATION IN DRINKING WATER MANAGEMENT SYSTEM

    Directory of Open Access Journals (Sweden)

    S. VISALAKSHI

    2017-07-01

    Full Text Available Feature selection is an important task in predictive models which helps to identify the irrelevant features in the high - dimensional dataset. In this case of water contamination detection dataset, the standard wrapper algorithm alone cannot be applied because of the complexity. To overcome this computational complexity problem and making it lighter, filter-wrapper based algorithm has been proposed. In this work, reducing the feature space is a significant component of water contamination. The main findings are as follows: (1 The main goal is speeding up the feature selection process, so the proposed filter - based feature pre-selection is applied and guarantees that useful data are improbable to be detached in the initial stage which discussed briefly in this paper. (2 The resulting features are again filtered by using the Genetic Algorithm coded with Support Vector Machine method, where it facilitates to nutshell the subset of features with high accuracy and decreases the expense. Experimental results show that the proposed methods trim down redundant features effectively and achieved better classification accuracy.

  12. Effective Feature Selection for Classification of Promoter Sequences.

    Directory of Open Access Journals (Sweden)

    Kouser K

    Full Text Available Exploring novel computational methods in making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for analysis of promoters, which are parts of DNA sequences that are involved in regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short-regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way of functional analysis of promoters and to discover the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing features. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine, KNN (K Nearest Neighbor and ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the methods proposed are as accurate as MRMR (feature selection method but much faster than MRMR. Such methods could be useful to categorize new promoters and explore regulatory mechanisms of gene expressions in complex eukaryotic species.

  13. SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.

    Science.gov (United States)

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.

  14. Feature Selection for Audio Surveillance in Urban Environment

    Directory of Open Access Journals (Sweden)

    KIKTOVA Eva

    2014-05-01

    Full Text Available This paper presents the work leading to the acoustic event detection system, which is designed to recognize two types of acoustic events (shot and breaking glass in urban environment. For this purpose, a huge front-end processing was performed for the effective parametric representation of an input sound. MFCC features and features computed during their extraction (MELSPEC and FBANK, then MPEG-7 audio descriptors and other temporal and spectral characteristics were extracted. High dimensional feature sets were created and in the next phase reduced by the mutual information based selection algorithms. Hidden Markov Model based classifier was applied and evaluated by the Viterbi decoding algorithm. Thus very effective feature sets were identified and also the less important features were found.

  15. A New Concept of Two-Stage Multi-Element Resonant-/Cyclo-Converter for Two-Phase IM/SM Motor

    Directory of Open Access Journals (Sweden)

    Mahmud Ali Rzig Abdalmula

    2013-01-01

    Full Text Available The paper deals with a new concept of power electronic two-phase system with two-stage DC/AC/AC converter and two-phase IM/PMSM motor. The proposed system consisting of two-stage converter comprises: input resonant boost converter with AC output, two-phase half-bridge cyclo-converter commutated by HF AC input voltage, and induction or synchronous motor. Such a system with AC interlink, as a whole unit, has better properties as a 3-phase reference VSI inverter: higher efficiency due to soft switching of both converter stages, higher switching frequency, smaller dimensions and weight with lesser number of power semiconductor switches and better price. In comparison with currently used conventional system configurations the proposed system features a good efficiency of electronic converters and also has a good torque overloading of two-phase AC induction or synchronous motors. Design of two-stage multi-element resonant converter and results of simulation experiments are presented in the paper.

  16. Simultaneous Channel and Feature Selection of Fused EEG Features Based on Sparse Group Lasso

    Directory of Open Access Journals (Sweden)

    Jin-Jia Wang

    2015-01-01

    Full Text Available Feature extraction and classification of EEG signals are core parts of brain computer interfaces (BCIs. Due to the high dimension of the EEG feature vector, an effective feature selection algorithm has become an integral part of research studies. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. The high-dimensional fused features are firstly obtained, which include the power spectrum, time-domain statistics, AR model, and the wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses the logistical regression model with Sparse Group Lasso penalized function. The model is fitted on the training data, and parameter estimation is obtained by modified blockwise coordinate descent and coordinate gradient descent method. The best parameters and feature subset are selected by using a 10-fold cross-validation. Finally, the test data is classified using the trained model. Compared with existing channel and feature selection methods, results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can simultaneously achieve channel and feature selection with a lower error rate. The test accuracy on the data used from international BCI Competition IV reached 84.72%.

  17. Meta-analysis of Gaussian individual patient data: Two-stage or not two-stage?

    Science.gov (United States)

    Morris, Tim P; Fisher, David J; Kenward, Michael G; Carpenter, James R

    2018-04-30

    Quantitative evidence synthesis through meta-analysis is central to evidence-based medicine. For well-documented reasons, the meta-analysis of individual patient data is held in higher regard than aggregate data. With access to individual patient data, the analysis is not restricted to a "two-stage" approach (combining estimates and standard errors) but can estimate parameters of interest by fitting a single model to all of the data, a so-called "one-stage" analysis. There has been debate about the merits of one- and two-stage analysis. Arguments for one-stage analysis have typically noted that a wider range of models can be fitted and overall estimates may be more precise. The two-stage side has emphasised that the models that can be fitted in two stages are sufficient to answer the relevant questions, with less scope for mistakes because there are fewer modelling choices to be made in the two-stage approach. For Gaussian data, we consider the statistical arguments for flexibility and precision in small-sample settings. Regarding flexibility, several of the models that can be fitted only in one stage may not be of serious interest to most meta-analysis practitioners. Regarding precision, we consider fixed- and random-effects meta-analysis and see that, for a model making certain assumptions, the number of stages used to fit this model is irrelevant; the precision will be approximately equal. Meta-analysts should choose modelling assumptions carefully. Sometimes relevant models can only be fitted in one stage. Otherwise, meta-analysts are free to use whichever procedure is most convenient to fit the identified model. © 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  18. Comparison of Oone-Stage Free Gracilis Muscle Flap With Two-Stage Method in Chronic Facial Palsy

    Directory of Open Access Journals (Sweden)

    J Ghaffari

    2007-08-01

    Full Text Available Background:Rehabilitation of facial paralysis is one of the greatest challenges faced by reconstructive surgeons today. The traditional method for treatment of patients with facial palsy is the two-stage free gracilis flap which has a long latency period of between the two stages of surgery.Methods: In this paper, we prospectively compared the results of the one-stage gracilis flap method with the two -stage technique.Results:Out of 41 patients with facial palsy refered to Hazrat-e-Fatemeh Hospital 31 were selected from whom 22 underwent two- stage and 9 one-stage method treatment. The two groups were identical according to age,sex,intensity of illness, duration, and chronicity of illness. Mean duration of follow up was 37 months. There was no significant relation between the two groups regarding the symmetry of face in repose, smiling, whistling and nasolabial folds. Frequency of complications was equal in both groups. The postoperative surgeons and patients' satisfaction were equal in both groups. There was no significant difference between the mean excursion of muscle flap in one-stage (9.8 mm and two-stage groups (8.9 mm. The ratio of contraction of the affected side compared to the normal side was similar in both groups. The mean time of the initial contraction of the muscle flap in the one-stage group (5.5 months had a significant difference (P=0.001 with the two-stage one (6.5 months.The study revealed a highly significant difference (P=0.0001 between the mean waiting period from the first operation to the beginning of muscle contraction in one-stage(5.5 monthsand two-stage groups(17.1 months.Conclusion:It seems that the results and complication of the two methods are the same,but the one-stage method requires less time for facial reanimation,and is costeffective because it saves time and decreases hospitalization costs.

  19. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  20. Classification Influence of Features on Given Emotions and Its Application in Feature Selection

    Science.gov (United States)

    Xing, Yin; Chen, Chuang; Liu, Li-Long

    2018-04-01

    In order to solve the problem that there is a large amount of redundant data in high-dimensional speech emotion features, we analyze deeply the extracted speech emotion features and select better features. Firstly, a given emotion is classified by each feature. Secondly, the recognition rate is ranked in descending order. Then, the optimal threshold of features is determined by rate criterion. Finally, the better features are obtained. When applied in Berlin and Chinese emotional data set, the experimental results show that the feature selection method outperforms the other traditional methods.

  1. A New Two-Stage Approach to Short Term Electrical Load Forecasting

    Directory of Open Access Journals (Sweden)

    Dragan Tasić

    2013-04-01

    Full Text Available In the deregulated energy market, the accuracy of load forecasting has a significant effect on the planning and operational decision making of utility companies. Electric load is a random non-stationary process influenced by a number of factors which make it difficult to model. To achieve better forecasting accuracy, a wide variety of models have been proposed. These models are based on different mathematical methods and offer different features. This paper presents a new two-stage approach for short-term electrical load forecasting based on least-squares support vector machines. With the aim of improving forecasting accuracy, one more feature was added to the model feature set, the next day average load demand. As this feature is unknown for one day ahead, in the first stage, forecasting of the next day average load demand is done and then used in the model in the second stage for next day hourly load forecasting. The effectiveness of the presented model is shown on the real data of the ISO New England electricity market. The obtained results confirm the validity advantage of the proposed approach.

  2. Toward optimal feature selection using ranking methods and classification algorithms

    Directory of Open Access Journals (Sweden)

    Novaković Jasmina

    2011-01-01

    Full Text Available We presented a comparison between several feature ranking methods used on two real datasets. We considered six ranking methods that can be divided into two broad categories: statistical and entropy-based. Four supervised learning algorithms are adopted to build models, namely, IB1, Naive Bayes, C4.5 decision tree and the RBF network. We showed that the selection of ranking methods could be important for classification accuracy. In our experiments, ranking methods with different supervised learning algorithms give quite different results for balanced accuracy. Our cases confirm that, in order to be sure that a subset of features giving the highest accuracy has been selected, the use of many different indices is recommended.

  3. A two-stage method for microcalcification cluster segmentation in mammography by deformable models

    International Nuclear Information System (INIS)

    Arikidis, N.; Kazantzi, A.; Skiadopoulos, S.; Karahaliou, A.; Costaridou, L.; Vassiou, K.

    2015-01-01

    Purpose: Segmentation of microcalcification (MC) clusters in x-ray mammography is a difficult task for radiologists. Accurate segmentation is prerequisite for quantitative image analysis of MC clusters and subsequent feature extraction and classification in computer-aided diagnosis schemes. Methods: In this study, a two-stage semiautomated segmentation method of MC clusters is investigated. The first stage is targeted to accurate and time efficient segmentation of the majority of the particles of a MC cluster, by means of a level set method. The second stage is targeted to shape refinement of selected individual MCs, by means of an active contour model. Both methods are applied in the framework of a rich scale-space representation, provided by the wavelet transform at integer scales. Segmentation reliability of the proposed method in terms of inter and intraobserver agreements was evaluated in a case sample of 80 MC clusters originating from the digital database for screening mammography, corresponding to 4 morphology types (punctate: 22, fine linear branching: 16, pleomorphic: 18, and amorphous: 24) of MC clusters, assessing radiologists’ segmentations quantitatively by two distance metrics (Hausdorff distance—HDIST cluster , average of minimum distance—AMINDIST cluster ) and the area overlap measure (AOM cluster ). The effect of the proposed segmentation method on MC cluster characterization accuracy was evaluated in a case sample of 162 pleomorphic MC clusters (72 malignant and 90 benign). Ten MC cluster features, targeted to capture morphologic properties of individual MCs in a cluster (area, major length, perimeter, compactness, and spread), were extracted and a correlation-based feature selection method yielded a feature subset to feed in a support vector machine classifier. Classification performance of the MC cluster features was estimated by means of the area under receiver operating characteristic curve (Az ± Standard Error) utilizing tenfold cross

  4. Receptive fields selection for binary feature description.

    Science.gov (United States)

    Fan, Bin; Kong, Qingqun; Trzcinski, Tomasz; Wang, Zhiheng; Pan, Chunhong; Fua, Pascal

    2014-06-01

    Feature description for local image patch is widely used in computer vision. While the conventional way to design local descriptor is based on expert experience and knowledge, learning-based methods for designing local descriptor become more and more popular because of their good performance and data-driven property. This paper proposes a novel data-driven method for designing binary feature descriptor, which we call receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding responses of a set of receptive fields, which are selected from a large number of candidates according to their distinctiveness and correlations in a greedy way. Using two different kinds of receptive fields (namely rectangular pooling area and Gaussian pooling area) for selection, we obtain two binary descriptors RFDR and RFDG .accordingly. Image matching experiments on the well-known patch data set and Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors, and is comparable with the best float-valued descriptors at a fraction of processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.

  5. GMDH-Based Semi-Supervised Feature Selection for Electricity Load Classification Forecasting

    Directory of Open Access Journals (Sweden)

    Lintao Yang

    2018-01-01

    Full Text Available With the development of smart power grids, communication network technology and sensor technology, there has been an exponential growth in complex electricity load data. Irregular electricity load fluctuations caused by the weather and holiday factors disrupt the daily operation of the power companies. To deal with these challenges, this paper investigates a day-ahead electricity peak load interval forecasting problem. It transforms the conventional continuous forecasting problem into a novel interval forecasting problem, and then further converts the interval forecasting problem into the classification forecasting problem. In addition, an indicator system influencing the electricity load is established from three dimensions, namely the load series, calendar data, and weather data. A semi-supervised feature selection algorithm is proposed to address an electricity load classification forecasting issue based on the group method of data handling (GMDH technology. The proposed algorithm consists of three main stages: (1 training the basic classifier; (2 selectively marking the most suitable samples from the unclassified label data, and adding them to an initial training set; and (3 training the classification models on the final training set and classifying the test samples. An empirical analysis of electricity load dataset from four Chinese cities is conducted. Results show that the proposed model can address the electricity load classification forecasting problem more efficiently and effectively than the FW-Semi FS (forward semi-supervised feature selection and GMDH-U (GMDH-based semi-supervised feature selection for customer classification models.

  6. Feature Selection via Chaotic Antlion Optimization.

    Directory of Open Access Journals (Sweden)

    Hossam M Zawbaa

    Full Text Available Selecting a subset of relevant properties from a large set of features that describe a dataset is a challenging machine learning task. In biology, for instance, the advances in the available technologies enable the generation of a very large number of biomarkers that describe the data. Choosing the more informative markers along with performing a high-accuracy classification over the data can be a daunting task, particularly if the data are high dimensional. An often adopted approach is to formulate the feature selection problem as a biobjective optimization problem, with the aim of maximizing the performance of the data analysis model (the quality of the data training fitting while minimizing the number of features used.We propose an optimization approach for the feature selection problem that considers a "chaotic" version of the antlion optimizer method, a nature-inspired algorithm that mimics the hunting mechanism of antlions in nature. The balance between exploration of the search space and exploitation of the best solutions is a challenge in multi-objective optimization. The exploration/exploitation rate is controlled by the parameter I that limits the random walk range of the ants/prey. This variable is increased iteratively in a quasi-linear manner to decrease the exploration rate as the optimization progresses. The quasi-linear decrease in the variable I may lead to immature convergence in some cases and trapping in local minima in other cases. The chaotic system proposed here attempts to improve the tradeoff between exploration and exploitation. The methodology is evaluated using different chaotic maps on a number of feature selection datasets. To ensure generality, we used ten biological datasets, but we also used other types of data from various sources. The results are compared with the particle swarm optimizer and with genetic algorithm variants for feature selection using a set of quality metrics.

  7. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification based on human gait features and investigates the problem of two variations: clothing (wearing coats) and carrying bag condition as addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying wavelet transform. Three different sets of feature are proposed in this method. First, Spatio-temporal distance that is dealing with the distance of different parts of the human body (like feet, knees, hand, Human Height and shoulder) during one gait cycle. The second and third feature sets are constructed from approximation and non-approximation coefficient of human body respectively. To extract these two sets of feature we divided the human body into two parts, upper and lower body part, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize their discriminating significance. Finally k-Nearest Neighbor is applied as a classification method. Experimental results demonstrate that our approach is providing more realistic scenario and relatively better performance compared with the existing approaches.

  8. Effect of feature-selective attention on neuronal responses in macaque area MT

    Science.gov (United States)

    Chen, X.; Hoffmann, K.-P.; Albright, T. D.

    2012-01-01

    Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color). PMID:22170961

  9. A Quantum Hybrid PSO Combined with Fuzzy k-NN Approach to Feature Selection and Cell Classification in Cervical Cancer Detection

    Directory of Open Access Journals (Sweden)

    Abdullah M. Iliyasu

    2017-12-01

    Full Text Available A quantum hybrid (QH intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO method with the intuitionistic rationality of traditional fuzzy k-nearest neighbours (Fuzzy k-NN algorithm (known simply as the Q-Fuzzy approach is proposed for efficient feature selection and classification of cells in cervical smeared (CS images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach. In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regards to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in number cell features, which is crucial for effective cervical cancer detection and diagnosis.

  10. Feature selection for splice site prediction: A new method using EDA-based feature ranking

    Directory of Open Access Journals (Sweden)

    Rouzé Pierre

    2004-05-01

    Full Text Available Abstract Background The identification of relevant biological features in large and complex datasets is an important step towards gaining insight in the processes underlying the data. Other advantages of feature selection include the ability of the classification system to attain good or even better solutions using a restricted subset of features, and a faster classification. Thus, robust methods for fast feature selection are of key importance in extracting knowledge from complex biological data. Results In this paper we present a novel method for feature subset selection applied to splice site prediction, based on estimation of distribution algorithms, a more general framework of genetic algorithms. From the estimated distribution of the algorithm, a feature ranking is derived. Afterwards this ranking is used to iteratively discard features. We apply this technique to the problem of splice site prediction, and show how it can be used to gain insight into the underlying biological process of splicing. Conclusion We show that this technique proves to be more robust than the traditional use of estimation of distribution algorithms for feature selection: instead of returning a single best subset of features (as they normally do this method provides a dynamical view of the feature selection process, like the traditional sequential wrapper methods. However, the method is faster than the traditional techniques, and scales better to datasets described by a large number of features.

  11. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

    KAUST Repository

    Abusamra, Heba

    2013-05-01

    Microarray technology has enriched the study of gene expression in such a way that scientists are now able to measure the expression levels of thousands of genes in a single experiment. Microarray gene expression data gained great importance in recent years due to its role in disease diagnoses and prognoses which help to choose the appropriate treatment plan for patients. This technology has shifted a new era in molecular classification, interpreting gene expression data remains a difficult problem and an active research area due to their native nature of “high dimensional low sample size”. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed in this case to aid to correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improve treatment strategies. This thesis aims on a comparative study of state-of-the-art feature selection methods, classification methods, and the combination of them, based on gene expression data. We compared the efficiency of three different classification methods including: support vector machines, k- nearest neighbor and random forest, and eight different feature selection methods, including: information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t- statistics, and one-dimension support vector machine. Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used for this study. Different experiments have been applied to compare the performance of the classification methods with and without performing feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in

  12. Online Feature Selection for Classifying Emphysema in HRCT Images

    Directory of Open Access Journals (Sweden)

    M. Prasad

    2008-06-01

    Full Text Available Feature subset selection, applied as a pre- processing step to machine learning, is valuable in dimensionality reduction, eliminating irrelevant data and improving classifier performance. In the classic formulation of the feature selection problem, it is assumed that all the features are available at the beginning. However, in many real world problems, there are scenarios where not all features are present initially and must be integrated as they become available. In such scenarios, online feature selection provides an efficient way to sort through a large space of features. It is in this context that we introduce online feature selection for the classification of emphysema, a smoking related disease that appears as low attenuation regions in High Resolution Computer Tomography (HRCT images. The technique was successfully evaluated on 61 HRCT scans and compared with different online feature selection approaches, including hill climbing, best first search, grafting, and correlation-based feature selection. The results were also compared against ldensity maskr, a standard approach used for emphysema detection in medical image analysis.

  13. Identification of Subtype-Specific Prognostic Genes for Early-Stage Lung Adenocarcinoma and Squamous Cell Carcinoma Patients Using an Embedded Feature Selection Algorithm.

    Directory of Open Access Journals (Sweden)

    Suyan Tian

    Full Text Available The existence of fundamental differences between lung adenocarcinoma (AC and squamous cell carcinoma (SCC in their underlying mechanisms motivated us to postulate that specific genes might exist relevant to prognosis of each histology subtype. To test on this research hypothesis, we previously proposed a simple Cox-regression model based feature selection algorithm and identified successfully some subtype-specific prognostic genes when applying this method to real-world data. In this article, we continue our effort on identification of subtype-specific prognostic genes for AC and SCC, and propose a novel embedded feature selection method by extending Threshold Gradient Descent Regularization (TGDR algorithm and minimizing on a corresponding negative partial likelihood function. Using real-world datasets and simulated ones, we show these two proposed methods have comparable performance whereas the new proposal is superior in terms of model parsimony. Our analysis provides some evidence on the existence of such subtype-specific prognostic genes, more investigation is warranted.

  14. Feature selection and multi-kernel learning for adaptive graph regularized nonnegative matrix factorization

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-09-20

    Nonnegative matrix factorization (NMF), a popular part-based representation technique, does not capture the intrinsic local geometric structure of the data space. Graph regularized NMF (GNMF) was recently proposed to avoid this limitation by regularizing NMF with a nearest neighbor graph constructed from the input data set. However, GNMF has two main bottlenecks. First, using the original feature space directly to construct the graph is not necessarily optimal because of the noisy and irrelevant features and nonlinear distributions of data samples. Second, one possible way to handle the nonlinear distribution of data samples is by kernel embedding. However, it is often difficult to choose the most suitable kernel. To solve these bottlenecks, we propose two novel graph-regularized NMF methods, AGNMFFS and AGNMFMK, by introducing feature selection and multiple-kernel learning to the graph regularized NMF, respectively. Instead of using a fixed graph as in GNMF, the two proposed methods learn the nearest neighbor graph that is adaptive to the selected features and learned multiple kernels, respectively. For each method, we propose a unified objective function to conduct feature selection/multi-kernel learning, NMF and adaptive graph regularization simultaneously. We further develop two iterative algorithms to solve the two optimization problems. Experimental results on two challenging pattern classification tasks demonstrate that the proposed methods significantly outperform state-of-the-art data representation methods.

  15. SIP-FS: a novel feature selection for data representation

    Directory of Open Access Journals (Sweden)

    Yiyou Guo

    2018-02-01

    Full Text Available Abstract Multiple features are widely used to characterize real-world datasets. It is desirable to select leading features with stability and interpretability from a set of distinct features for a comprehensive data description. However, most of existing feature selection methods focus on the predictability (e.g., prediction accuracy of selected results yet neglect stability. To obtain compact data representation, a novel feature selection method is proposed to improve stability, and interpretability without sacrificing predictability (SIP-FS. Instead of mutual information, generalized correlation is adopted in minimal redundancy maximal relevance to measure the relation between different feature types. Several feature types (each contains a certain number of features can then be selected and evaluated quantitatively to determine what types contribute to a specific class, thereby enhancing the so-called interpretability of features. Moreover, stability is introduced in the criterion of SIP-FS to obtain consistent results of ranking. We conduct experiments on three publicly available datasets using one-versus-all strategy to select class-specific features. The experiments illustrate that SIP-FS achieves significant performance improvements in terms of stability and interpretability with desirable prediction accuracy and indicates advantages over several state-of-the-art approaches.

  16. One-stage and two-stage penile buccal mucosa urethroplasty

    Directory of Open Access Journals (Sweden)

    G. Barbagli

    2016-03-01

    Full Text Available The paper provides the reader with the detailed description of current techniques of one-stage and two-stage penile buccal mucosa urethroplasty. The paper provides the reader with the preoperative patient evaluation paying attention to the use of diagnostic tools. The one-stage penile urethroplasty using buccal mucosa graft with the application of glue is preliminary showed and discussed. Two-stage penile urethroplasty is then reported. A detailed description of first-stage urethroplasty according Johanson technique is reported. A second-stage urethroplasty using buccal mucosa graft and glue is presented. Finally postoperative course and follow-up are addressed.

  17. Two-Stage Classification Approach for Human Detection in Camera Video in Bulk Ports

    Directory of Open Access Journals (Sweden)

    Mi Chao

    2015-09-01

    Full Text Available With the development of automation in ports, the video surveillance systems with automated human detection begun to be applied in open-air handling operation areas for safety and security. The accuracy of traditional human detection based on the video camera is not high enough to meet the requirements of operation surveillance. One of the key reasons is that Histograms of Oriented Gradients (HOG features of the human body will show great different between front & back standing (F&B and side standing (Side human body. Therefore, the final training for classifier will only gain a few useful specific features which have contribution to classification and are insufficient to support effective classification, while using the HOG features directly extracted by the samples from different human postures. This paper proposes a two-stage classification method to improve the accuracy of human detection. In the first stage, during preprocessing classification, images is mainly divided into possible F&B human body and not F&B human body, and then they were put into the second-stage classification among side human and non-human recognition. The experimental results in Tianjin port show that the two-stage classifier can improve the classification accuracy of human detection obviously.

  18. Two Stage Fuzzy Methodology to Evaluate the Credit Risks of Investment Projects

    OpenAIRE

    O. Badagadze; G. Sirbiladze; I. Khutsishvili

    2014-01-01

    The work proposes a decision support methodology for the credit risk minimization in selection of investment projects. The methodology provides two stages of projects’ evaluation. Preliminary selection of projects with minor credit risks is made using the Expertons Method. The second stage makes ranking of chosen projects using the Possibilistic Discrimination Analysis Method. The latter is a new modification of a well-known Method of Fuzzy Discrimination Analysis.

  19. Variable selection in near-infrared spectroscopy: Benchmarking of feature selection methods on biodiesel data

    International Nuclear Information System (INIS)

    Balabin, Roman M.; Smirnov, Sergey V.

    2011-01-01

    During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm -1 ) of a sample is typically measured by modern instruments at a few hundred of wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results of other spectroscopic

  20. Joint Feature Selection and Classification for Multilabel Learning.

    Science.gov (United States)

    Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong

    2018-03-01

    Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.

  1. FunSAV: predicting the functional effect of single amino acid variants using a two-stage random forest model.

    Directory of Open Access Journals (Sweden)

    Mingjun Wang

    Full Text Available Single amino acid variants (SAVs are the most abundant form of known genetic variations associated with human disease. Successful prediction of the functional impact of SAVs from sequences can thus lead to an improved understanding of the underlying mechanisms of why a SAV may be associated with certain disease. In this work, we constructed a high-quality structural dataset that contained 679 high-quality protein structures with 2,048 SAVs by collecting the human genetic variant data from multiple resources and dividing them into two categories, i.e., disease-associated and neutral variants. We built a two-stage random forest (RF model, termed as FunSAV, to predict the functional effect of SAVs by combining sequence, structure and residue-contact network features with other additional features that were not explored in previous studies. Importantly, a two-step feature selection procedure was proposed to select the most important and informative features that contribute to the prediction of disease association of SAVs. In cross-validation experiments on the benchmark dataset, FunSAV achieved a good prediction performance with the area under the curve (AUC of 0.882, which is competitive with and in some cases better than other existing tools including SIFT, SNAP, Polyphen2, PANTHER, nsSNPAnalyzer and PhD-SNP. The sourcecodes of FunSAV and the datasets can be downloaded at http://sunflower.kuicr.kyoto-u.ac.jp/sjn/FunSAV.

  2. Early Contact Stage of Apoptosis: Its Morphological Features and Function

    Directory of Open Access Journals (Sweden)

    Etheri Mikadze

    2006-01-01

    Full Text Available Apoptosis has been a biological phenomenon of intense interest for 20 years, but the earlier morphological features of apoptosis have not been determined hitherto. Using the methods of semi- and ultrathin sections, the livers of intact embryos and young rats have been studied under the effect of cycloheximide to determine morphological features of an early stage of apoptosis. It is discovered that both in hepatoblasts and hepatocytes, apoptosis, besides the well-known stages, also includes an early contact stage, distinguishing features of which are agglutination of bound ribosomes (breaking of translation, elimination of the nucleolus, reduction of free polysomes (and in hepatocytes, reduction of cisterns of rough endoplasmic reticulum, formation of cytoplasmic excrescences, and cell shape changes. The early stage of apoptosis is characterized by close contact with neighboring cells. At a certain phase of the contact stage of apoptosis, the nucleolus reappears in the nucleus and the number of free polysomes in the cytoplasm increases, which suggests the renewal of synthesis of new RNA and proteins. Close contact of differentiating and mitotic hepatoblasts with apoptotic cells indicates a certain functional relationship between these cells that is realized not only by micropinocytosis, but through gap junctions as well. We assume that the apoptotic cell, besides proteolytic products, can contain newly synthesized, low-molecular substances, the relocation of which from apoptotic to neighboring cells may contribute to both functional activity and proliferation of adjacent hepatoblasts and, therefore, the function of apoptosis may not be limited only to the elimination of harmful, damaged, and unwanted cells.

  3. Features Selection for Skin Micro-Image Symptomatic Recognition

    Institute of Scientific and Technical Information of China (English)

    HUYue-li; CAOJia-lin; ZHAOQian; FENGXu

    2004-01-01

    Automatic recognition of skin micro-image symptom is important in skin diagnosis and treatment. Feature selection is to improve the classification performance of skin micro-image symptom.This paper proposes a hybrid approach based on the support vector machine (SVM) technique and genetic algorithm (GA) to select an optimum feature subset from the feature group extracted from the skin micro-images. An adaptive GA is introduced for maintaining the convergence rate. With the proposed method, the average cross validation accuracy is increased from 88.25% using all features to 96.92% using only selected features provided by a classifier for classification of 5 classes of skin symptoms. The experimental results are satisfactory.

  4. Features Selection for Skin Micro-Image Symptomatic Recognition

    Institute of Scientific and Technical Information of China (English)

    HU Yue-li; CAO Jia-lin; ZHAO Qian; FENG Xu

    2004-01-01

    Automatic recognition of skin micro-image symptom is important in skin diagnosis and treatment. Feature selection is to improve the classification performance of skin micro-image symptom.This paper proposes a hybrid approach based on the support vector machine (SVM) technique and genetic algorithm (GA) to select an optimum feature subset from the feature group extracted from the skin micro-images. An adaptive GA is introduced for maintaining the convergence rate. With the proposed method, the average cross validation accuracy is increased from 88.25% using all features to 96.92 % using only selected features provided by a classifier for classification of 5 classes of skin symptoms. The experimental results are satisfactory.

  5. Classification Using Markov Blanket for Feature Selection

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Luo, Jian

    2009-01-01

    Selecting relevant features is in demand when a large data set is of interest in a classification task. It produces a tractable number of features that are sufficient and possibly improve the classification performance. This paper studies a statistical method of Markov blanket induction algorithm...... for filtering features and then applies a classifier using the Markov blanket predictors. The Markov blanket contains a minimal subset of relevant features that yields optimal classification performance. We experimentally demonstrate the improved performance of several classifiers using a Markov blanket...... induction as a feature selection method. In addition, we point out an important assumption behind the Markov blanket induction algorithm and show its effect on the classification performance....

  6. Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

    Science.gov (United States)

    2013-01-01

    Background Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good for handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for the most skilful clinician to come out with an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results In the first stage of this research, five feature selection methods have been proposed and experimented on the oral cancer prognosis dataset. In the second stage, the model with the features selected from each feature selection methods are tested on the proposed classifiers. Four types of classifiers are chosen; these are namely, ANFIS, artificial neural network, support vector machine and logistic regression. A k-fold cross-validation is implemented on all types of classifiers due to the small sample size. The hybrid model of ReliefF-GA-ANFIS with 3-input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for the oral cancer prognosis. Conclusions The results revealed that the prognosis is superior with the presence of both clinicopathologic and genomic markers. The selected features can be investigated further to validate the potential of becoming as significant prognostic signature in the oral cancer studies. PMID:23725313

  7. UNLABELED SELECTED SAMPLES IN FEATURE EXTRACTION FOR CLASSIFICATION OF HYPERSPECTRAL IMAGES WITH LIMITED TRAINING SAMPLES

    Directory of Open Access Journals (Sweden)

    A. Kianisarkaleh

    2015-12-01

    Full Text Available Feature extraction plays a key role in hyperspectral images classification. Using unlabeled samples, often unlimitedly available, unsupervised and semisupervised feature extraction methods show better performance when limited number of training samples exists. This paper illustrates the importance of selecting appropriate unlabeled samples that used in feature extraction methods. Also proposes a new method for unlabeled samples selection using spectral and spatial information. The proposed method has four parts including: PCA, prior classification, posterior classification and sample selection. As hyperspectral image passes these parts, selected unlabeled samples can be used in arbitrary feature extraction methods. The effectiveness of the proposed unlabeled selected samples in unsupervised and semisupervised feature extraction is demonstrated using two real hyperspectral datasets. Results show that through selecting appropriate unlabeled samples, the proposed method can improve the performance of feature extraction methods and increase classification accuracy.

  8. Clinical evaluation of two-stage mandibular wisdom tooth extraction method to avoid mental nerve paresthesia

    International Nuclear Information System (INIS)

    Nozoe, Etsuro; Nakamura, Yasunori; Okawachi, Takako; Ishihata, Kiyohide; Shinnakasu, Mana; Nakamura, Norifumi

    2011-01-01

    Clinical courses following two-stage mandibular wisdom tooth extraction (TMWTE) carried out for preventing postoperative mental nerve paresthesia (MNP) were analyzed. When panoramic X-ray showed overlapping of wisdom tooth root on the superior 1/2 or more of the mandibular canal, interruption of the white line of the superior wall of the canal, or diversion of the canal, CT examination was facilitated. In cases where contact between the tooth root and canal was demonstrated in CT examination, TMWTE was then selected after gaining the patient's consent. TMWTE consisted of removing more than a half of the tooth crown and tooth root extraction at the second step after 2-3 months. The clinical features of wisdom teeth extracted and postoperative courses including tooth movement and occurrence of MNP during two-stage MWTE were evaluated. TMWTE was carried out for 40 teeth among 811 wisdom teeth (4.9%) that were extracted from 2007 to 2009. Among them, complete procedures were accomplished in 39 teeth, and crown removal was performed insufficiently at the first-stage operation in one tooth. Tooth movement was detected in 37 of 40 cases (92.5%). No postoperative MNP was observed in cases in which complete two-stage MWTE was carried out, but one case with insufficient crown removal was complicated by postoperative MNP. Seven mild complications (dehiscence, cold sensitivity, etc.) were noted after the first-stage operation. Therefore, we conclude that TMWTE for high-risk cases assessed by X-ray findings is useful to avoid MNP after MWTE. (author)

  9. MRI features of patients with heroin spongiform leukoencephalopathy of different clinical stages

    International Nuclear Information System (INIS)

    Shi Zhu; Pan Suyue; Zhou Liang; Dong Zhao; Lu Bingxun

    2007-01-01

    Objective: To investigate radiological features of patients with heroin spongiform leukoencephalopathy (HSLE) of different clinical stages and discuss the evolutional characteristics of the disease. Methods: Thirty two patients with HSLE underwent precontrast MRI and postcontrast MRI. The history of addiction, clinical presentations, and brain MRI were analyzed and summarized according to the patient's clinical staging. There are 6 cases in I stage, 21 cases in II stage, 5 cases in III stage. Results: All patients had history of heroin vapor inhalation. Most of the cases developed subacute cerebellar impairment in earlier period. Brain MRI revealed symmetrical lesion within bilateral cerebellum in all patients. Splenium of the corpus callosum, posterior limb of the internal capsule, deep white matter of the occipital and parietal lobes, were gradually involved with progressive deterioration of HSLE. The brain stem and deep white matter of the frontal and temporal lobes were involved in some cases. Conclusions: The history of heated heroin vapor inhalation was the prerequisite for the diagnosis of HSLE. Brain MRI presented the characteristic lesion and its evolution of HSLE. Brain MRI was very important for accurate diagnosis and helpful to judge the clinical stages according to the involved brain region. (authors)

  10. Adversarial Feature Selection Against Evasion Attacks.

    Science.gov (United States)

    Zhang, Fei; Chan, Patrick P K; Biggio, Battista; Yeung, Daniel S; Roli, Fabio

    2016-03-01

    Pattern recognition and machine learning techniques have been increasingly adopted in adversarial settings such as spam, intrusion, and malware detection, although their security against well-crafted attacks that aim to evade detection by manipulating data at test time has not yet been thoroughly assessed. While previous work has been mainly focused on devising adversary-aware classification algorithms to counter evasion attempts, only few authors have considered the impact of using reduced feature sets on classifier security against the same attacks. An interesting, preliminary result is that classifier security to evasion may be even worsened by the application of feature selection. In this paper, we provide a more detailed investigation of this aspect, shedding some light on the security properties of feature selection against evasion attacks. Inspired by previous work on adversary-aware classifiers, we propose a novel adversary-aware feature selection model that can improve classifier security against evasion attacks, by incorporating specific assumptions on the adversary's data manipulation strategy. We focus on an efficient, wrapper-based implementation of our approach, and experimentally validate its soundness on different application examples, including spam and malware detection.

  11. Optics of two-stage photovoltaic concentrators with dielectric second stages

    Science.gov (United States)

    Ning, Xiaohui; O'Gallagher, Joseph; Winston, Roland

    1987-04-01

    Two-stage photovoltaic concentrators with Fresnel lenses as primaries and dielectric totally internally reflecting nonimaging concentrators as secondaries are discussed. The general design principles of such two-stage systems are given. Their optical properties are studied and analyzed in detail using computer ray trace procedures. It is found that the two-stage concentrator offers not only a higher concentration or increased acceptance angle, but also a more uniform flux distribution on the photovoltaic cell than the point focusing Fresnel lens alone. Experimental measurements with a two-stage prototype module are presented and compared to the analytical predictions.

  12. Optics of two-stage photovoltaic concentrators with dielectric second stages.

    Science.gov (United States)

    Ning, X; O'Gallagher, J; Winston, R

    1987-04-01

    Two-stage photovoltaic concentrators with Fresnel lenses as primaries and dielectric totally internally reflecting nonimaging concentrators as secondaries are discussed. The general design principles of such two-stage systems are given. Their optical properties are studied and analyzed in detail using computer ray trace procedures. It is found that the two-stage concentrator offers not only a higher concentration or increased acceptance angle, but also a more uniform flux distribution on the photovoltaic cell than the point focusing Fresnel lens alone. Experimental measurements with a two-stage prototype module are presented and compared to the analytical predictions.

  13. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    Full Text Available High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies of features. Although few publications have devoted their attention to reveal the relationship of features by multivariate-based methods, these methods describe relationships among features only by linear methods. While simple linear combination relationship restrict the improvement in performance. In this paper, we used kernel method to discover inherent nonlinear correlations among features as well as between feature and target. Moreover, the number of orthogonal components was determined by kernel Fishers linear discriminant analysis (FLDA in a self-adaptive manner rather than by manual parameter settings. In order to reveal the effectiveness of our method we performed several experiments and compared the results between our method and other competitive multivariate-based features selectors. In our comparison, we used two classifiers (support vector machine, [Formula: see text]-nearest neighbor on two group datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than others, especially on three hard-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  14. Feature Selection Based on Mutual Correlation

    Czech Academy of Sciences Publication Activity Database

    Haindl, Michal; Somol, Petr; Ververidis, D.; Kotropoulos, C.

    2006-01-01

    Roč. 19, č. 4225 (2006), s. 569-577 ISSN 0302-9743. [Iberoamerican Congress on Pattern Recognition. CIARP 2006 /11./. Cancun, 14.11.2006-17.11.2006] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Institutional research plan: CEZ:AV0Z10750506 Keywords : feature selection Subject RIV: BD - Theory of Information Impact factor: 0.402, year: 2005 http://library.utia.cas.cz/separaty/historie/haindl-feature selection based on mutual correlation.pdf

  15. Single-stage-to-orbit versus two-stage-two-orbit: A cost perspective

    Science.gov (United States)

    Hamaker, Joseph W.

    1996-03-01

    This paper considers the possible life-cycle costs of single-stage-to-orbit (SSTO) and two-stage-to-orbit (TSTO) reusable launch vehicles (RLV's). The analysis parametrically addresses the issue such that the preferred economic choice comes down to the relative complexity of the TSTO compared to the SSTO. The analysis defines the boundary complexity conditions at which the two configurations have equal life-cycle costs, and finally, makes a case for the economic preference of SSTO over TSTO.

  16. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

    KAUST Repository

    Abusamra, Heba

    2013-11-01

    Microarray gene expression data gained great importance in recent years due to its role in disease diagnoses and prognoses which help to choose the appropriate treatment plan for patients. This technology has shifted a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area due to their native nature of “high dimensional low sample size”. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed in this case to aid to correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improve treatment strategies. This paper aims on a comparative study of state-of-the- art feature selection methods, classification methods, and the combination of them, based on gene expression data. We compared the efficiency of three different classification methods including: support vector machines, k-nearest neighbor and random forest, and eight different feature selection methods, including: information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine. Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.

  17. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

    KAUST Repository

    Abusamra, Heba

    2013-01-01

    Microarray gene expression data gained great importance in recent years due to its role in disease diagnoses and prognoses which help to choose the appropriate treatment plan for patients. This technology has shifted a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area due to their native nature of “high dimensional low sample size”. Such problems pose great challenges to existing classification methods. Thus, effective feature selection techniques are often needed in this case to aid to correctly classify different tumor types and consequently lead to a better understanding of genetic signatures as well as improve treatment strategies. This paper aims on a comparative study of state-of-the- art feature selection methods, classification methods, and the combination of them, based on gene expression data. We compared the efficiency of three different classification methods including: support vector machines, k-nearest neighbor and random forest, and eight different feature selection methods, including: information gain, twoing rule, sum minority, max minority, gini index, sum of variances, t-statistics, and one-dimension support vector machine. Five-fold cross validation was used to evaluate the classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.

  18. Effective traffic features selection algorithm for cyber-attacks samples

    Science.gov (United States)

    Li, Yihong; Liu, Fangzheng; Du, Zhenyu

    2018-05-01

    By studying the defense scheme of Network attacks, this paper propose an effective traffic features selection algorithm based on k-means++ clustering to deal with the problem of high dimensionality of traffic features which extracted from cyber-attacks samples. Firstly, this algorithm divide the original feature set into attack traffic feature set and background traffic feature set by the clustering. Then, we calculates the variation of clustering performance after removing a certain feature. Finally, evaluating the degree of distinctiveness of the feature vector according to the result. Among them, the effective feature vector is whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select out the effective features from the extracted original feature set. In this way, it can reduce the dimensionality of the features so as to reduce the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and it has some advantages over other selection algorithms.

  19. Penalized feature selection and classification in bioinformatics

    OpenAIRE

    Ma, Shuangge; Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...

  20. A Meta-Heuristic Regression-Based Feature Selection for Predictive Analytics

    Directory of Open Access Journals (Sweden)

    Bharat Singh

    2014-11-01

    Full Text Available A high-dimensional feature selection having a very large number of features with an optimal feature subset is an NP-complete problem. Because conventional optimization techniques are unable to tackle large-scale feature selection problems, meta-heuristic algorithms are widely used. In this paper, we propose a particle swarm optimization technique while utilizing regression techniques for feature selection. We then use the selected features to classify the data. Classification accuracy is used as a criterion to evaluate classifier performance, and classification is accomplished through the use of k-nearest neighbour (KNN and Bayesian techniques. Various high dimensional data sets are used to evaluate the usefulness of the proposed approach. Results show that our approach gives better results when compared with other conventional feature selection algorithms.

  1. An Efficient Cost-Sensitive Feature Selection Using Chaos Genetic Algorithm for Class Imbalance Problem

    Directory of Open Access Journals (Sweden)

    Jing Bian

    2016-01-01

    Full Text Available In the era of big data, feature selection is an essential process in machine learning. Although the class imbalance problem has recently attracted a great deal of attention, little effort has been undertaken to develop feature selection techniques. In addition, most applications involving feature selection focus on classification accuracy but not cost, although costs are important. To cope with imbalance problems, we developed a cost-sensitive feature selection algorithm that adds the cost-based evaluation function of a filter feature selection using a chaos genetic algorithm, referred to as CSFSG. The evaluation function considers both feature-acquiring costs (test costs and misclassification costs in the field of network security, thereby weakening the influence of many instances from the majority of classes in large-scale datasets. The CSFSG algorithm reduces the total cost of feature selection and trades off both factors. The behavior of the CSFSG algorithm is tested on a large-scale dataset of network security, using two kinds of classifiers: C4.5 and k-nearest neighbor (KNN. The results of the experimental research show that the approach is efficient and able to effectively improve classification accuracy and to decrease classification time. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.

  2. Novel Automatic Filter-Class Feature Selection for Machine Learning Regression

    DEFF Research Database (Denmark)

    Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard

    2017-01-01

    With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... model in the feature selection process. PCA is often used in machine learning litterature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in both prediction error and computational speed. RDESF is a new step into filter-based automatic feature...

  3. Improving mass candidate detection in mammograms via feature maxima propagation and local feature selection.

    Science.gov (United States)

    Melendez, Jaime; Sánchez, Clara I; van Ginneken, Bram; Karssemeijer, Nico

    2014-08-01

    Mass candidate detection is a crucial component of multistep computer-aided detection (CAD) systems. It is usually performed by combining several local features by means of a classifier. When these features are processed on a per-image-location basis (e.g., for each pixel), mismatching problems may arise while constructing feature vectors for classification, which is especially true when the behavior expected from the evaluated features is a peaked response due to the presence of a mass. In this study, two of these problems, consisting of maxima misalignment and differences of maxima spread, are identified and two solutions are proposed. The first proposed method, feature maxima propagation, reproduces feature maxima through their neighboring locations. The second method, local feature selection, combines different subsets of features for different feature vectors associated with image locations. Both methods are applied independently and together. The proposed methods are included in a mammogram-based CAD system intended for mass detection in screening. Experiments are carried out with a database of 382 digital cases. Sensitivity is assessed at two sets of operating points. The first one is the interval of 3.5-15 false positives per image (FPs/image), which is typical for mass candidate detection. The second one is 1 FP/image, which allows to estimate the quality of the mass candidate detector's output for use in subsequent steps of the CAD system. The best results are obtained when the proposed methods are applied together. In that case, the mean sensitivity in the interval of 3.5-15 FPs/image significantly increases from 0.926 to 0.958 (p < 0.0002). At the lower rate of 1 FP/image, the mean sensitivity improves from 0.628 to 0.734 (p < 0.0002). Given the improved detection performance, the authors believe that the strategies proposed in this paper can render mass candidate detection approaches based on image location classification more robust to feature

  4. Features of mechanical snubbers and the method of selection

    Energy Technology Data Exchange (ETDEWEB)

    Sunakoda, K [Sanwa Tekki Corp., Utsunomiya (Japan). Utsunomiya Works

    1978-11-01

    In the oil snubbers used in the high radiation environment of nuclear power stations, gas generation from oil and the deterioration of rubber material for sealing occur due to radiation damage, therefore periodical inspection and replacement are required during operation. The mechanical snubbers developed as aseismatic supporters in place of oil snubbers have entered the stage of practical use, and are made by two companies in USA and a company in Japan. Their features as compared with oil snubbers are presented.ces are explained.

  5. Signatures of natural selection between life cycle stages separated by metamorphosis in European eel.

    Science.gov (United States)

    Pujolar, J M; Jacobsen, M W; Bekkevold, D; Lobón-Cervià, J; Jónsson, B; Bernatchez, L; Hansen, M M

    2015-08-13

    Species showing complex life cycles provide excellent opportunities to study the genetic associations between life cycle stages, as selective pressures may differ before and after metamorphosis. The European eel presents a complex life cycle with two metamorphoses, a first metamorphosis from larvae into glass eels (juvenile stage) and a second metamorphosis into silver eels (adult stage). We tested the hypothesis that different genes and gene pathways will be under selection at different life stages when comparing the genetic associations between glass eels and silver eels. We used two sets of markers to test for selection: first, we genotyped individuals using a panel of 80 coding-gene single nucleotide polymorphisms (SNPs) developed in American eel; second, we investigated selection at the genome level using a total of 153,423 RAD-sequencing generated SNPs widely distributed across the genome. Using the RAD approach, outlier tests identified a total of 2413 (1.57%) potentially selected SNPs. Functional annotation analysis identified signal transduction pathways as the most over-represented group of genes, including MAPK/Erk signalling, calcium signalling and GnRH (gonadotropin-releasing hormone) signalling. Many of the over-represented pathways were related to growth, while others could result from the different conditions that eels inhabit during their life cycle. The observation of different genes and gene pathways under selection when comparing glass eels vs. silver eels supports the adaptive decoupling hypothesis for the benefits of metamorphosis. Partitioning the life cycle into discrete morphological phases may be overall beneficial since it allows the different life stages to respond independently to their unique selection pressures. This might translate into a more effective use of food and niche resources and/or performance of phase-specific tasks (e.g. feeding in the case of glass eels, migrating and reproducing in the case of silver eels).

  6. Efficient Separation and Extraction of Vanadium and Chromium in High Chromium Vanadium Slag by Selective Two-Stage Roasting-Leaching

    Science.gov (United States)

    Wen, Jing; Jiang, Tao; Xu, Yingzhe; Liu, Jiayi; Xue, Xiangxin

    2018-04-01

    Vanadium and chromium are important rare metals, leading to a focus on high chromium vanadium slag (HCVS) as a potential raw material to extract vanadium and chromium in China. In this work, a novel method based on selective two-stage roasting-leaching was proposed to separate and extract vanadium and chromium efficiently in HCVS. XRD, FT-IR, and SEM were utilized to analyze the phase evolutions and microstructure during the whole process. Calcification roasting, which can calcify vanadium selectively using thermodynamics, was carried out in the first roasting stage to transfer vanadium into acid-soluble vanadate and leave chromium in the leaching residue as (Fe0.6Cr0.4)2O3 after H2SO4 leaching. When HCVS and CaO were mixed in the molar ratio CaO/V2O3 (n(CaO)/n(V2O3)) of 0.5 to 1.25, around 90 pct vanadium and less than 1 pct chromium were extracted in the first leaching liquid, thus achieving the separation of vanadium and chromium. In the second roasting stage, sodium salt, which combines with chromium easily, was added to the first leaching residue to extract chromium and 95.16 pct chromium was extracted under the optimal conditions. The total vanadium and chromium leaching rates were above 95 pct, achieving the efficient separation and extraction of vanadium and chromium. The established method provides a new technique to separate vanadium and chromium during roasting rather than in the liquid form, which is useful for the comprehensive application of HCVS.

  7. Efficient Separation and Extraction of Vanadium and Chromium in High Chromium Vanadium Slag by Selective Two-Stage Roasting-Leaching

    Science.gov (United States)

    Wen, Jing; Jiang, Tao; Xu, Yingzhe; Liu, Jiayi; Xue, Xiangxin

    2018-06-01

    Vanadium and chromium are important rare metals, leading to a focus on high chromium vanadium slag (HCVS) as a potential raw material to extract vanadium and chromium in China. In this work, a novel method based on selective two-stage roasting-leaching was proposed to separate and extract vanadium and chromium efficiently in HCVS. XRD, FT-IR, and SEM were utilized to analyze the phase evolutions and microstructure during the whole process. Calcification roasting, which can calcify vanadium selectively using thermodynamics, was carried out in the first roasting stage to transfer vanadium into acid-soluble vanadate and leave chromium in the leaching residue as (Fe0.6Cr0.4)2O3 after H2SO4 leaching. When HCVS and CaO were mixed in the molar ratio CaO/V2O3 (n(CaO)/n(V2O3)) of 0.5 to 1.25, around 90 pct vanadium and less than 1 pct chromium were extracted in the first leaching liquid, thus achieving the separation of vanadium and chromium. In the second roasting stage, sodium salt, which combines with chromium easily, was added to the first leaching residue to extract chromium and 95.16 pct chromium was extracted under the optimal conditions. The total vanadium and chromium leaching rates were above 95 pct, achieving the efficient separation and extraction of vanadium and chromium. The established method provides a new technique to separate vanadium and chromium during roasting rather than in the liquid form, which is useful for the comprehensive application of HCVS.

  8. East Chestnut Ridge hydrogeologic characterization: A geophysical study of two karst features

    International Nuclear Information System (INIS)

    1991-01-01

    Permitting and site selection activities for the proposed East Chestnut Ridge landfill, located on the Oak Ridge Reservation, have required additional hydrogeologic studies of two karst features. Geophysical testing methods were utilized for investigating these karst features. The objectives of the geophysical testing was to determine the feasibility of geophysical techniques for locating subsurface karst features and to determine if subsurface anomalies exist at the proposed landfill site. Two karst features, one lacking surface expression (sinkhole) but with a known solution cavity at depth (from previous hydrologic studies), and the other with surface expression were tested with surface geophysical methods. Four geophysical profiles, two crossing and centered over each karst feature were collected using both gravimetric and electrical resistivity techniques

  9. NetProt: Complex-based Feature Selection.

    Science.gov (United States)

    Goh, Wilson Wen Bin; Wong, Limsoon

    2017-08-04

    Protein complex-based feature selection (PCBFS) provides unparalleled reproducibility with high phenotypic relevance on proteomics data. Currently, there are five PCBFS paradigms, but not all representative methods have been implemented or made readily available. To allow general users to take advantage of these methods, we developed the R-package NetProt, which provides implementations of representative feature-selection methods. NetProt also provides methods for generating simulated differential data and generating pseudocomplexes for complex-based performance benchmarking. The NetProt open source R package is available for download from https://github.com/gohwils/NetProt/releases/ , and online documentation is available at http://rpubs.com/gohwils/204259 .

  10. A Two-Stage Approach to Civil Conflict: Contested Incompatibilities and Armed Violence

    DEFF Research Database (Denmark)

    Bartusevicius, Henrikas; Gleditsch, Kristian Skrede

    2017-01-01

    conflict origination but have no clear effect on militarization, whereas other features emphasized as shaping the risk of civil war, such as refugee flows and soft state power, strongly influence militarization but not incompatibilities. We posit that a two-stage approach to conflict analysis can help...

  11. Comparative effectiveness of one-stage versus two-stage basilic vein transposition arteriovenous fistulas.

    Science.gov (United States)

    Ghaffarian, Amir A; Griffin, Claire L; Kraiss, Larry W; Sarfati, Mark R; Brooke, Benjamin S

    2018-02-01

    Basilic vein transposition (BVT) fistulas may be performed as either a one-stage or two-stage operation, although there is debate as to which technique is superior. This study was designed to evaluate the comparative clinical efficacy and cost-effectiveness of one-stage vs two-stage BVT. We identified all patients at a single large academic hospital who had undergone creation of either a one-stage or two-stage BVT between January 2007 and January 2015. Data evaluated included patient demographics, comorbidities, medication use, reasons for abandonment, and interventions performed to maintain patency. Costs were derived from the literature, and effectiveness was expressed in quality-adjusted life-years (QALYs). We analyzed primary and secondary functional patency outcomes as well as survival during follow-up between one-stage and two-stage BVT procedures using multivariate Cox proportional hazards models and Kaplan-Meier analysis with log-rank tests. The incremental cost-effectiveness ratio was used to determine cost savings. We identified 131 patients in whom 57 (44%) one-stage BVT and 74 (56%) two-stage BVT fistulas were created among 8 different vascular surgeons during the study period that each performed both procedures. There was no significant difference in the mean age, male gender, white race, diabetes, coronary disease, or medication profile among patients undergoing one- vs two-stage BVT. After fistula transposition, the median follow-up time was 8.3 months (interquartile range, 3-21 months). Primary patency rates of one-stage BVT were 56% at 12-month follow-up, whereas primary patency rates of two-stage BVT were 72% at 12-month follow-up. Patients undergoing two-stage BVT also had significantly higher rates of secondary functional patency at 12 months (57% for one-stage BVT vs 80% for two-stage BVT) and 24 months (44% for one-stage BVT vs 73% for two-stage BVT) of follow-up (P < .001 using log-rank test). However, there was no significant difference

  12. Automatic staging of bladder cancer on CT urography

    Science.gov (United States)

    Garapati, Sankeerth S.; Hadjiiski, Lubomir M.; Cha, Kenny H.; Chan, Heang-Ping; Caoili, Elaine M.; Cohan, Richard H.; Weizer, Alon; Alva, Ajjai; Paramagul, Chintana; Wei, Jun; Zhou, Chuan

    2016-03-01

    Correct staging of bladder cancer is crucial for the decision of neoadjuvant chemotherapy treatment and minimizing the risk of under- or over-treatment. Subjectivity and variability of clinicians in utilizing available diagnostic information may lead to inaccuracy in staging bladder cancer. An objective decision support system that merges the information in a predictive model based on statistical outcomes of previous cases and machine learning may assist clinicians in making more accurate and consistent staging assessments. In this study, we developed a preliminary method to stage bladder cancer. With IRB approval, 42 bladder cancer cases with CTU scans were collected from patient files. The cases were classified into two classes based on pathological stage T2, which is the decision threshold for neoadjuvant chemotherapy treatment (i.e. for stage >=T2) clinically. There were 21 cancers below stage T2 and 21 cancers at stage T2 or above. All 42 lesions were automatically segmented using our auto-initialized cascaded level sets (AI-CALS) method. Morphological features were extracted, which were selected and merged by linear discriminant analysis (LDA) classifier. A leave-one-case-out resampling scheme was used to train and test the classifier using the 42 lesions. The classification accuracy was quantified using the area under the ROC curve (Az). The average training Az was 0.97 and the test Az was 0.85. The classifier consistently selected the lesion volume, a gray level feature and a contrast feature. This predictive model shows promise for assisting in assessing the bladder cancer stage.

  13. Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information

    Directory of Open Access Journals (Sweden)

    Atiyeh Mortazavi

    2016-01-01

    Full Text Available High dimensionality of microarray data sets may lead to low efficiency and overfitting. In this paper, a multiphase cooperative game theoretic feature selection approach is proposed for microarray data classification. In the first phase, due to high dimension of microarray data sets, the features are reduced using one of the two filter-based feature selection methods, namely, mutual information and Fisher ratio. In the second phase, Shapley index is used to evaluate the power of each feature. The main innovation of the proposed approach is to employ Qualitative Mutual Information (QMI for this purpose. The idea of Qualitative Mutual Information causes the selected features to have more stability and this stability helps to deal with the problem of data imbalance and scarcity. In the third phase, a forward selection scheme is applied which uses a scoring function to weight each feature. The performance of the proposed method is compared with other popular feature selection algorithms such as Fisher ratio, minimum redundancy maximum relevance, and previous works on cooperative game based feature selection. The average classification accuracy on eleven microarray data sets shows that the proposed method improves both average accuracy and average stability compared to other approaches.

  14. Design considerations for single-stage and two-stage pneumatic pellet injectors

    International Nuclear Information System (INIS)

    Gouge, M.J.; Combs, S.K.; Fisher, P.W.; Milora, S.L.

    1988-09-01

    Performance of single-stage pneumatic pellet injectors is compared with several models for one-dimensional, compressible fluid flow. Agreement is quite good for models that reflect actual breech chamber geometry and incorporate nonideal effects such as gas friction. Several methods of improving the performance of single-stage pneumatic pellet injectors in the near term are outlined. The design and performance of two-stage pneumatic pellet injectors are discussed, and initial data from the two-stage pneumatic pellet injector test facility at Oak Ridge National Laboratory are presented. Finally, a concept for a repeating two-stage pneumatic pellet injector is described. 27 refs., 8 figs., 3 tabs

  15. Two-Stage Part-Based Pedestrian Detection

    DEFF Research Database (Denmark)

    Møgelmose, Andreas; Prioletti, Antonio; Trivedi, Mohan M.

    2012-01-01

    Detecting pedestrians is still a challenging task for automotive vision system due the extreme variability of targets, lighting conditions, occlusions, and high speed vehicle motion. A lot of research has been focused on this problem in the last 10 years and detectors based on classifiers has...... gained a special place among the different approaches presented. This work presents a state-of-the-art pedestrian detection system based on a two stages classifier. Candidates are extracted with a Haar cascade classifier trained with the DaimlerDB dataset and then validated through part-based HOG...... of several metrics, such as detection rate, false positives per hour, and frame rate. The novelty of this system rely in the combination of HOG part-based approach, tracking based on specific optimized feature and porting on a real prototype....

  16. Multi-task feature selection in microarray data by binary integer programming.

    Science.gov (United States)

    Lan, Liang; Vucetic, Slobodan

    2013-12-20

    A major challenge in microarray classification is that the number of features is typically orders of magnitude larger than the number of examples. In this paper, we propose a novel feature filter algorithm to select the feature subset with maximal discriminative power and minimal redundancy by solving a quadratic objective function with binary integer constraints. To improve the computational efficiency, the binary integer constraints are relaxed and a low-rank approximation to the quadratic term is applied. The proposed feature selection algorithm was extended to solve multi-task microarray classification problems. We compared the single-task version of the proposed feature selection algorithm with 9 existing feature selection methods on 4 benchmark microarray data sets. The empirical results show that the proposed method achieved the most accurate predictions overall. We also evaluated the multi-task version of the proposed algorithm on 8 multi-task microarray datasets. The multi-task feature selection algorithm resulted in significantly higher accuracy than when using the single-task feature selection methods.

  17. Modeling and Implementing Two-Stage AdaBoost for Real-Time Vehicle License Plate Detection

    Directory of Open Access Journals (Sweden)

    Moon Kyou Song

    2014-01-01

    Full Text Available License plate (LP detection is the most imperative part of the automatic LP recognition system. In previous years, different methods, techniques, and algorithms have been developed for LP detection (LPD systems. This paper proposes to automatical detection of car LPs via image processing techniques based on classifier or machine learning algorithms. In this paper, we propose a real-time and robust method for LPD systems using the two-stage adaptive boosting (AdaBoost algorithm combined with different image preprocessing techniques. Haar-like features are used to compute and select features from LP images. The AdaBoost algorithm is used to classify parts of an image within a search window by a trained strong classifier as either LP or non-LP. Adaptive thresholding is used for the image preprocessing method applied to those images that are of insufficient quality for LPD. This method is of a faster speed and higher accuracy than most of the existing methods used in LPD. Experimental results demonstrate that the average LPD rate is 98.38% and the computational time is approximately 49 ms.

  18. A Two-Stage Layered Mixture Experiment Design for a Nuclear Waste Glass Application-Part 1

    International Nuclear Information System (INIS)

    Cooley, Scott K.; Piepel, Gregory F.; Gan, Hao; Kot, Wing; Pegg, Ian L.

    2003-01-01

    A layered experimental design involving mixture variables was generated to support developing property-composition models for high-level waste (HLW) glasses. The design was generated in two stages, each having unique characteristics. Each stage used a layered design having an outer layer, an inner layer, a center point, and some replicates. The layers were defined by single- and multi-variable constraints. The first stage involved 15 glass components treated as mixture variables. For each layer, vertices were generated and optimal design software was used to select alternative subsets of vertices and calculate design optimality measures. Two partial quadratic mixture models, containing 25 terms for the outer layer and 30 terms for the inner layer, were the basis for the optimal design calculations. Distributions of predicted glass property values were plotted and evaluated for the alternative subsets of vertices. Based on the optimality measures and the predicted property distributions, a ''best'' subset of vertices was selected for each layer to form a layered design for the first stage. The design for the second stage was selected to augment the first-stage design. The discussion of the second-stage design begins in this Part 1 and is continued in Part 2 (Cooley and Piepel, 2003b)

  19. Feature selection and multi-kernel learning for sparse representation on a manifold

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-03-01

    Sparse representation has been widely studied as a part-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of some basic items in a dictionary. Gao etal. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data samples. To overcome this problem, we integrate feature selection and multiple kernel learning into the sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform the traditional sparse coding methods. © 2013 Elsevier Ltd.

  20. Feature selection and multi-kernel learning for sparse representation on a manifold.

    Science.gov (United States)

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2014-03-01

    Sparse representation has been widely studied as a part-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of some basic items in a dictionary. Gao et al. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data samples. To overcome this problem, we integrate feature selection and multiple kernel learning into the sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform the traditional sparse coding methods. Copyright © 2013 Elsevier Ltd. All rights reserved.

  1. Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.

    Science.gov (United States)

    Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens

    2016-01-01

    MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.

  2. Cancer microarray data feature selection using multi-objective binary particle swarm optimization algorithm

    Science.gov (United States)

    Annavarapu, Chandra Sekhara Rao; Dara, Suresh; Banka, Haider

    2016-01-01

    Cancer investigations in microarray data play a major role in cancer analysis and the treatment. Cancer microarray data consists of complex gene expressed patterns of cancer. In this article, a Multi-Objective Binary Particle Swarm Optimization (MOBPSO) algorithm is proposed for analyzing cancer gene expression data. Due to its high dimensionality, a fast heuristic based pre-processing technique is employed to reduce some of the crude domain features from the initial feature set. Since these pre-processed and reduced features are still high dimensional, the proposed MOBPSO algorithm is used for finding further feature subsets. The objective functions are suitably modeled by optimizing two conflicting objectives i.e., cardinality of feature subsets and distinctive capability of those selected subsets. As these two objective functions are conflicting in nature, they are more suitable for multi-objective modeling. The experiments are carried out on benchmark gene expression datasets, i.e., Colon, Lymphoma and Leukaemia available in literature. The performance of the selected feature subsets with their classification accuracy and validated using 10 fold cross validation techniques. A detailed comparative study is also made to show the betterment or competitiveness of the proposed algorithm. PMID:27822174

  3. Two-Stage Fan I: Aerodynamic and Mechanical Design

    Science.gov (United States)

    Messenger, H. E.; Kennedy, E. E.

    1972-01-01

    A two-stage, highly-loaded fan was designed to deliver an overall pressure ratio of 2.8 with an adiabatic efficiency of 83.9 percent. At the first rotor inlet, design flow per unit annulus area is 42 lbm/sec/sq ft (205 kg/sec/sq m), hub/tip ratio is 0.4 with a tip diameter of 31 inches (0.787 m), and design tip speed is 1450 ft/sec (441.96 m/sec). Other features include use of multiple-circular-arc airfoils, resettable stators, and split casings over the rotor tip sections for casing treatment tests.

  4. Two-Stage Series-Resonant Inverter

    Science.gov (United States)

    Stuart, Thomas A.

    1994-01-01

    Two-stage inverter includes variable-frequency, voltage-regulating first stage and fixed-frequency second stage. Lightweight circuit provides regulated power and is invulnerable to output short circuits. Does not require large capacitor across ac bus, like parallel resonant designs. Particularly suitable for use in ac-power-distribution system of aircraft.

  5. Comparisons and Selections of Features and Classifiers for Short Text Classification

    Science.gov (United States)

    Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

    2017-10-01

    Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.

  6. Strong selection on mandible and nest features in a carpenter bee that nests in two sympatric host plants.

    Science.gov (United States)

    Flores-Prado, Luis; Pinto, Carlos F; Rojas, Alejandra; Fontúrbel, Francisco E

    2014-05-01

    Host plants are used by herbivorous insects as feeding or nesting resources. In wood-boring insects, host plants features may impose selective forces leading to phenotypic differentiation on traits related to nest construction. Carpenter bees build their nests in dead stems or dry twigs of shrubs and trees; thus, mandibles are essential for the nesting process, and the nest is required for egg laying and offspring survival. We explored the shape and intensity of natural selection on phenotypic variation on three size measures of the bees (intertegular width, wing length, and mandible area) and two nest architecture measures (tunnel length and diameter) on bees using the native species Chusquea quila (Poaceae), and the alloctonous species Rubus ulmifolius (Rosaceae), in central Chile. Our results showed significant and positive linear selection gradients for tunnel length on both hosts, indicating that bees building long nests have more offspring. Bees with broader mandibles show greater fitness on C. quila but not on R. ulmifolius. Considering that C. quila represents a selective force on mandible area, we hypothesized a high adaptive value of this trait, resulting in higher fitness values when nesting on this host, despite its wood is denser and hence more difficult to be bored.

  7. Tracing the breeding farm of domesticated pig using feature selection (

    Directory of Open Access Journals (Sweden)

    Taehyung Kwon

    2017-11-01

    Full Text Available Objective Increasing food safety demands in the animal product market have created a need for a system to trace the food distribution process, from the manufacturer to the retailer, and genetic traceability is an effective method to trace the origin of animal products. In this study, we successfully achieved the farm tracing of 6,018 multi-breed pigs, using single nucleotide polymorphism (SNP markers strictly selected through least absolute shrinkage and selection operator (LASSO feature selection. Methods We performed farm tracing of domesticated pig (Sus scrofa from SNP markers and selected the most relevant features for accurate prediction. Considering multi-breed composition of our data, we performed feature selection using LASSO penalization on 4,002 SNPs that are shared between breeds, which also includes 179 SNPs with small between-breed difference. The 100 highest-scored features were extracted from iterative simulations and then evaluated using machine-leaning based classifiers. Results We selected 1,341 SNPs from over 45,000 SNPs through iterative LASSO feature selection, to minimize between-breed differences. We subsequently selected 100 highest-scored SNPs from iterative scoring, and observed high statistical measures in classification of breeding farms by cross-validation only using these SNPs. Conclusion The study represents a successful application of LASSO feature selection on multi-breed pig SNP data to trace the farm information, which provides a valuable method and possibility for further researches on genetic traceability.

  8. Two-stage anaerobic digestion of cheese whey

    Energy Technology Data Exchange (ETDEWEB)

    Lo, K V; Liao, P H

    1986-01-01

    A two-stage digestion of cheese whey was studied using two anaerobic rotating biological contact reactors. The second-stage reactor receiving partially treated effluent from the first-stage reactor could be operated at a hydraulic retention time of one day. The results indicated that two-stage digestion is a feasible alternative for treating whey. 6 references.

  9. Feature selection using genetic algorithms for fetal heart rate analysis

    International Nuclear Information System (INIS)

    Xu, Liang; Redman, Christopher W G; Georgieva, Antoniya; Payne, Stephen J

    2014-01-01

    The fetal heart rate (FHR) is monitored on a paper strip (cardiotocogram) during labour to assess fetal health. If necessary, clinicians can intervene and assist with a prompt delivery of the baby. Data-driven computerized FHR analysis could help clinicians in the decision-making process. However, selecting the best computerized FHR features that relate to labour outcome is a pressing research problem. The objective of this study is to apply genetic algorithms (GA) as a feature selection method to select the best feature subset from 64 FHR features and to integrate these best features to recognize unfavourable FHR patterns. The GA was trained on 404 cases and tested on 106 cases (both balanced datasets) using three classifiers, respectively. Regularization methods and backward selection were used to optimize the GA. Reasonable classification performance is shown on the testing set for the best feature subset (Cohen's kappa values of 0.45 to 0.49 using different classifiers). This is, to our knowledge, the first time that a feature selection method for FHR analysis has been developed on a database of this size. This study indicates that different FHR features, when integrated, can show good performance in predicting labour outcome. It also gives the importance of each feature, which will be a valuable reference point for further studies. (paper)

  10. Oculomotor selection underlies feature retention in visual working memory.

    Science.gov (United States)

    Hanning, Nina M; Jonikaitis, Donatas; Deubel, Heiner; Szinte, Martin

    2016-02-01

    Oculomotor selection, spatial task relevance, and visual working memory (WM) are described as three processes highly intertwined and sustained by similar cortical structures. However, because task-relevant locations always constitute potential saccade targets, no study so far has been able to distinguish between oculomotor selection and spatial task relevance. We designed an experiment that allowed us to dissociate in humans the contribution of task relevance, oculomotor selection, and oculomotor execution to the retention of feature representations in WM. We report that task relevance and oculomotor selection lead to dissociable effects on feature WM maintenance. In a first task, in which an object's location was encoded as a saccade target, its feature representations were successfully maintained in WM, whereas they declined at nonsaccade target locations. Likewise, we observed a similar WM benefit at the target of saccades that were prepared but never executed. In a second task, when an object's location was marked as task relevant but constituted a nonsaccade target (a location to avoid), feature representations maintained at that location did not benefit. Combined, our results demonstrate that oculomotor selection is consistently associated with WM, whereas task relevance is not. This provides evidence for an overlapping circuitry serving saccade target selection and feature-based WM that can be dissociated from processes encoding task-relevant locations. Copyright © 2016 the American Physiological Society.

  11. Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach

    Directory of Open Access Journals (Sweden)

    Daniel Peralta

    2015-01-01

    Full Text Available Nowadays, many disciplines have to deal with big datasets that additionally involve a high number of features. Feature selection methods aim at eliminating noisy, redundant, or irrelevant features that may deteriorate the classification performance. However, traditional methods lack enough scalability to cope with datasets of millions of instances and extract successful results in a delimited time. This paper presents a feature selection algorithm based on evolutionary computation that uses the MapReduce paradigm to obtain subsets of features from big datasets. The algorithm decomposes the original dataset in blocks of instances to learn from them in the map phase; then, the reduce phase merges the obtained partial results into a final vector of feature weights, which allows a flexible application of the feature selection procedure using a threshold to determine the selected subset of features. The feature selection method is evaluated by using three well-known classifiers (SVM, Logistic Regression, and Naive Bayes implemented within the Spark framework to address big data problems. In the experiments, datasets up to 67 millions of instances and up to 2000 attributes have been managed, showing that this is a suitable framework to perform evolutionary feature selection, improving both the classification accuracy and its runtime when dealing with big data problems.

  12. Feature and Region Selection for Visual Learning.

    Science.gov (United States)

    Zhao, Ji; Wang, Liantao; Cabral, Ricardo; De la Torre, Fernando

    2016-03-01

    Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoWs) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ(2) and intersection kernel; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach.

  13. On the robustness of two-stage estimators

    KAUST Repository

    Zhelonkin, Mikhail

    2012-04-01

    The aim of this note is to provide a general framework for the analysis of the robustness properties of a broad class of two-stage models. We derive the influence function, the change-of-variance function, and the asymptotic variance of a general two-stage M-estimator, and provide their interpretations. We illustrate our results in the case of the two-stage maximum likelihood estimator and the two-stage least squares estimator. © 2011.

  14. Feature selection and nearest centroid classification for protein mass spectrometry

    Directory of Open Access Journals (Sweden)

    Levner Ilya

    2005-03-01

    Full Text Available Abstract Background The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the input space and thus enable the subsequent use of classification algorithms. This paper examines feature selection techniques for proteomic mass spectrometry. Results This study examines the performance of the nearest centroid classifier coupled with the following feature selection algorithms. Student-t test, Kolmogorov-Smirnov test, and the P-test are univariate statistics used for filter-based feature ranking. From the wrapper approaches we tested sequential forward selection and a modified version of sequential backward selection. Embedded approaches included shrunken nearest centroid and a novel version of boosting based feature selection we developed. In addition, we tested several dimensionality reduction approaches, namely principal component analysis and principal component analysis coupled with linear discriminant analysis. To fairly assess each algorithm, evaluation was done using stratified cross validation with an internal leave-one-out cross-validation loop for automated feature selection. Comprehensive experiments, conducted on five popular cancer data sets, revealed that the less advocated sequential forward selection and boosted feature selection algorithms produce the most consistent results across all data sets. In contrast, the state-of-the-art performance reported on isolated data sets for several of the studied algorithms, does not hold across all data sets. Conclusion This study tested a number of popular feature

  15. Feature selection toolbox software package

    Czech Academy of Sciences Publication Activity Database

    Pudil, Pavel; Novovičová, Jana; Somol, Petr

    2002-01-01

    Roč. 23, č. 4 (2002), s. 487-492 ISSN 0167-8655 R&D Projects: GA ČR GA402/01/0981 Institutional research plan: CEZ:AV0Z1075907 Keywords : pattern recognition * feature selection * loating search algorithms Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.409, year: 2002

  16. Sensitivity Analysis in Two-Stage DEA

    Directory of Open Access Journals (Sweden)

    Athena Forghani

    2015-07-01

    Full Text Available Data envelopment analysis (DEA is a method for measuring the efficiency of peer decision making units (DMUs which uses a set of inputs to produce a set of outputs. In some cases, DMUs have a two-stage structure, in which the first stage utilizes inputs to produce outputs used as the inputs of the second stage to produce final outputs. One important issue in two-stage DEA is the sensitivity of the results of an analysis to perturbations in the data. The current paper looks into combined model for two-stage DEA and applies the sensitivity analysis to DMUs on the entire frontier. In fact, necessary and sufficient conditions for preserving a DMU's efficiency classiffication are developed when various data changes are applied to all DMUs.

  17. Sensitivity Analysis in Two-Stage DEA

    Directory of Open Access Journals (Sweden)

    Athena Forghani

    2015-12-01

    Full Text Available Data envelopment analysis (DEA is a method for measuring the efficiency of peer decision making units (DMUs which uses a set of inputs to produce a set of outputs. In some cases, DMUs have a two-stage structure, in which the first stage utilizes inputs to produce outputs used as the inputs of the second stage to produce final outputs. One important issue in two-stage DEA is the sensitivity of the results of an analysis to perturbations in the data. The current paper looks into combined model for two-stage DEA and applies the sensitivity analysis to DMUs on the entire frontier. In fact, necessary and sufficient conditions for preserving a DMU's efficiency classiffication are developed when various data changes are applied to all DMUs.

  18. Jointly Feature Learning and Selection for Robust Tracking via a Gating Mechanism.

    Directory of Open Access Journals (Sweden)

    Bineng Zhong

    Full Text Available To achieve effective visual tracking, a robust feature representation composed of two separate components (i.e., feature learning and selection for an object is one of the key issues. Typically, a common assumption used in visual tracking is that the raw video sequences are clear, while real-world data is with significant noise and irrelevant patterns. Consequently, the learned features may be not all relevant and noisy. To address this problem, we propose a novel visual tracking method via a point-wise gated convolutional deep network (CPGDN that jointly performs the feature learning and feature selection in a unified framework. The proposed method performs dynamic feature selection on raw features through a gating mechanism. Therefore, the proposed method can adaptively focus on the task-relevant patterns (i.e., a target object, while ignoring the task-irrelevant patterns (i.e., the surrounding background of a target object. Specifically, inspired by transfer learning, we firstly pre-train an object appearance model offline to learn generic image features and then transfer rich feature hierarchies from an offline pre-trained CPGDN into online tracking. In online tracking, the pre-trained CPGDN model is fine-tuned to adapt to the tracking specific objects. Finally, to alleviate the tracker drifting problem, inspired by an observation that a visual target should be an object rather than not, we combine an edge box-based object proposal method to further improve the tracking accuracy. Extensive evaluation on the widely used CVPR2013 tracking benchmark validates the robustness and effectiveness of the proposed method.

  19. A Two-Stage Fuzzy Logic Control Method of Traffic Signal Based on Traffic Urgency Degree

    OpenAIRE

    Yan Ge

    2014-01-01

    City intersection traffic signal control is an important method to improve the efficiency of road network and alleviate traffic congestion. This paper researches traffic signal fuzzy control method on a single intersection. A two-stage traffic signal control method based on traffic urgency degree is proposed according to two-stage fuzzy inference on single intersection. At the first stage, calculate traffic urgency degree for all red phases using traffic urgency evaluation module and select t...

  20. Two stage-type railgun accelerator

    International Nuclear Information System (INIS)

    Ogino, Mutsuo; Azuma, Kingo.

    1995-01-01

    The present invention provides a two stage-type railgun accelerator capable of spiking a flying body (ice pellet) formed by solidifying a gaseous hydrogen isotope as a fuel to a thermonuclear reactor at a higher speed into a central portion of plasmas. Namely, the two stage-type railgun accelerator accelerates the flying body spiked from a initial stage accelerator to a portion between rails by Lorentz force generated when electric current is supplied to the two rails by way of a plasma armature. In this case, two sets of solenoids are disposed for compressing the plasma armature in the longitudinal direction of the rails. The first and the second sets of solenoid coils are previously supplied with electric current. After passing of the flying body, the armature formed into plasmas by a gas laser disposed at the back of the flying body is compressed in the longitudinal direction of the rails by a magnetic force of the first and the second sets of solenoid coils to increase the plasma density. A current density is also increased simultaneously. Then, the first solenoid coil current is turned OFF to accelerate the flying body in two stages by the compressed plasma armature. (I.S.)

  1. Feature Selection based on Machine Learning in MRIs for Hippocampal Segmentation

    Science.gov (United States)

    Tangaro, Sabina; Amoroso, Nicola; Brescia, Massimo; Cavuoti, Stefano; Chincarini, Andrea; Errico, Rosangela; Paolo, Inglese; Longo, Giuseppe; Maglietta, Rosalia; Tateo, Andrea; Riccio, Giuseppe; Bellotti, Roberto

    2015-01-01

    Neurodegenerative diseases are frequently associated with structural changes in the brain. Magnetic resonance imaging (MRI) scans can show these variations and therefore can be used as a supportive feature for a number of neurodegenerative diseases. The hippocampus has been known to be a biomarker for Alzheimer disease and other neurological and psychiatric diseases. However, it requires accurate, robust, and reproducible delineation of hippocampal structures. Fully automatic methods are usually the voxel based approach; for each voxel a number of local features were calculated. In this paper, we compared four different techniques for feature selection from a set of 315 features extracted for each voxel: (i) filter method based on the Kolmogorov-Smirnov test; two wrapper methods, respectively, (ii) sequential forward selection and (iii) sequential backward elimination; and (iv) embedded method based on the Random Forest Classifier on a set of 10 T1-weighted brain MRIs and tested on an independent set of 25 subjects. The resulting segmentations were compared with manual reference labelling. By using only 23 feature for each voxel (sequential backward elimination) we obtained comparable state-of-the-art performances with respect to the standard tool FreeSurfer.

  2. Corporate governance and audit features: SMEs evidence

    OpenAIRE

    Al-Najjar, Basil

    2018-01-01

    Purpose\\ud The purpose of this paper is to investigate the effect of corporate governance factors on audit features, namely, audit fees and the selection of Big 4 audit firms within the UK SMEs context.\\ud \\ud Design/methodology/approach\\ud The author uses different regression models to investigate the impact of corporate governance characteristics on audit features, and employs cross-sectional time series models as well as two-stage least squares technique. In addition, the author has used l...

  3. Feature Selection with the Boruta Package

    OpenAIRE

    Kursa, Miron B.; Rudnicki, Witold R.

    2010-01-01

    This article describes a R package Boruta, implementing a novel feature selection algorithm for finding emph{all relevant variables}. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes the features which are proved by a statistical test to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. The short description of the algorithm and examples of its application are presented.

  4. An electronic image processing device featuring continuously selectable two-dimensional bipolar filter functions and real-time operation

    International Nuclear Information System (INIS)

    Charleston, B.D.; Beckman, F.H.; Franco, M.J.; Charleston, D.B.

    1981-01-01

    A versatile electronic-analogue image processing system has been developed for use in improving the quality of various types of images with emphasis on those encountered in experimental and diagnostic medicine. The operational principle utilizes spatial filtering which selectively controls the contrast of an image according to the spatial frequency content of relevant and non-relevant features of the image. Noise can be reduced or eliminated by selectively lowering the contrast of information in the high spatial frequency range. Edge sharpness can be enhanced by accentuating the upper midrange spatial frequencies. Both methods of spatial frequency control may be adjusted continuously in the same image to obtain maximum visibility of the features of interest. A precision video camera is used to view medical diagnostic images, either prints, transparencies or CRT displays. The output of the camera provides the analogue input signal for both the electronic processing system and the video display of the unprocessed image. The video signal input to the electronic processing system is processed by a two-dimensional spatial convolution operation. The system employs charged-coupled devices (CCDs), both tapped analogue delay lines (TADs) and serial analogue delay lines (SADs), to store information in the form of analogue potentials which are constantly being updated as new sampled analogue data arrive at the input. This information is convolved with a programmed bipolar radially symmetrical hexagonal function which may be controlled and varied at each radius by the operator in real-time by adjusting a set of front panel controls or by a programmed microprocessor control. Two TV monitors are used, one for processed image display and the other for constant reference to the original image. The working prototype has a full-screen display matrix size of 200 picture elements per horizontal line by 240 lines. The matrix can be expanded vertically and horizontally for the

  5. Two-stage implant systems.

    Science.gov (United States)

    Fritz, M E

    1999-06-01

    Since the advent of osseointegration approximately 20 years ago, there has been a great deal of scientific data developed on two-stage integrated implant systems. Although these implants were originally designed primarily for fixed prostheses in the mandibular arch, they have been used in partially dentate patients, in patients needing overdentures, and in single-tooth restorations. In addition, this implant system has been placed in extraction sites, in bone-grafted areas, and in maxillary sinus elevations. Often, the documentation of these procedures has lagged. In addition, most of the reports use survival criteria to describe results, often providing overly optimistic data. It can be said that the literature describes a true adhesion of the epithelium to the implant similar to adhesion to teeth, that two-stage implants appear to have direct contact somewhere between 50% and 70% of the implant surface, that the microbial flora of the two-stage implant system closely resembles that of the natural tooth, and that the microbiology of periodontitis appears to be closely related to peri-implantitis. In evaluations of the data from implant placement in all of the above-noted situations by means of meta-analysis, it appears that there is a strong case that two-stage dental implants are successful, usually showing a confidence interval of over 90%. It also appears that the mandibular implants are more successful than maxillary implants. Studies also show that overdenture therapy is valid, and that single-tooth implants and implants placed in partially dentate mouths have a success rate that is quite good, although not quite as high as in the fully edentulous dentition. It would also appear that the potential causes of failure in the two-stage dental implant systems are peri-implantitis, placement of implants in poor-quality bone, and improper loading of implants. There are now data addressing modifications of the implant surface to alter the percentage of

  6. Improving Naive Bayes with Online Feature Selection for Quick Adaptation to Evolving Feature Usefulness

    Energy Technology Data Exchange (ETDEWEB)

    Pon, R K; Cardenas, A F; Buttler, D J

    2007-09-19

    The definition of what makes an article interesting varies from user to user and continually evolves even for a single user. As a result, for news recommendation systems, useless document features can not be determined a priori and all features are usually considered for interestingness classification. Consequently, the presence of currently useless features degrades classification performance [1], particularly over the initial set of news articles being classified. The initial set of document is critical for a user when considering which particular news recommendation system to adopt. To address these problems, we introduce an improved version of the naive Bayes classifier with online feature selection. We use correlation to determine the utility of each feature and take advantage of the conditional independence assumption used by naive Bayes for online feature selection and classification. The augmented naive Bayes classifier performs 28% better than the traditional naive Bayes classifier in recommending news articles from the Yahoo! RSS feeds.

  7. [Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].

    Science.gov (United States)

    Zhou, Jinzhi; Tang, Xiaofang

    2015-08-01

    In order to improve the accuracy of classification with small amount of motor imagery training data on the development of brain-computer interface (BCD systems, we proposed an analyzing method to automatically select the characteristic parameters based on correlation coefficient analysis. Throughout the five sample data of dataset IV a from 2005 BCI Competition, we utilized short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the number of primitive electroencephalogram dimension, then introduced feature extraction based on common spatial pattern (CSP) and classified by linear discriminant analysis (LDA). Simulation results showed that the average rate of classification accuracy could be improved by using correlation coefficient feature selection method than those without using this algorithm. Comparing with support vector machine (SVM) optimization features algorithm, the correlation coefficient analysis can lead better selection parameters to improve the accuracy of classification.

  8. Development of a nomogram combining clinical staging with 18F-FDG PET/CT image features in non-small-cell lung cancer stage I-III

    International Nuclear Information System (INIS)

    Desseroit, Marie-Charlotte; Visvikis, Dimitris; Majdoub, Mohamed; Hatt, Mathieu; Tixier, Florent; Perdrisot, Remy; Cheze Le Rest, Catherine; Guillevin, Remy

    2016-01-01

    Our goal was to develop a nomogram by exploiting intratumour heterogeneity on CT and PET images from routine 18 F-FDG PET/CT acquisitions to identify patients with the poorest prognosis. This retrospective study included 116 patients with NSCLC stage I, II or III and with staging 18 F-FDG PET/CT imaging. Primary tumour volumes were delineated using the FLAB algorithm and 3D Slicer trademark on PET and CT images, respectively. PET and CT heterogeneities were quantified using texture analysis. The reproducibility of the CT features was assessed on a separate test-retest dataset. The stratification power of the PET/CT features was evaluated using the Kaplan-Meier method and the log-rank test. The best standard metric (functional volume) was combined with the least redundant and most prognostic PET/CT heterogeneity features to build the nomogram. PET entropy and CT zone percentage had the highest complementary values with clinical stage and functional volume. The nomogram improved stratification amongst patients with stage II and III disease, allowing identification of patients with the poorest prognosis (clinical stage III, large tumour volume, high PET heterogeneity and low CT heterogeneity). Intratumour heterogeneity quantified using textural features on both CT and PET images from routine staging 18 F-FDG PET/CT acquisitions can be used to create a nomogram with higher stratification power than staging alone. (orig.)

  9. Two-step two-stage fission gas release model

    International Nuclear Information System (INIS)

    Kim, Yong-soo; Lee, Chan-bock

    2006-01-01

    Based on the recent theoretical model, two-step two-stage model is developed which incorporates two stage diffusion processes, grain lattice and grain boundary diffusion, coupled with the two step burn-up factor in the low and high burn-up regime. FRAPCON-3 code and its in-pile data sets have been used for the benchmarking and validation of this model. Results reveals that its prediction is in better agreement with the experimental measurements than that by any model contained in the FRAPCON-3 code such as ANS 5.4, modified ANS5.4, and Forsberg-Massih model over whole burn-up range up to 70,000 MWd/MTU. (author)

  10. Falling Leaves Inspired ZnO Nanorods-Nanoslices Hierarchical Structure for Implant Surface Modification with Two Stage Releasing Features.

    Science.gov (United States)

    Liao, Hang; Miao, Xinxin; Ye, Jing; Wu, Tianlong; Deng, Zhongbo; Li, Chen; Jia, Jingyu; Cheng, Xigao; Wang, Xiaolei

    2017-04-19

    Inspired from falling leaves, ZnO nanorods-nanoslices hierarchical structure (NHS) was constructed to modify the surfaces of two widely used implant materials: titanium (Ti) and tantalum (Ta), respectively. By which means, two-stage release of antibacterial active substances were realized to address the clinical importance of long-term broad-spectrum antibacterial activity. At early stages (within 48 h), the NHS exhibited a rapid releasing to kill the bacteria around the implant immediately. At a second stage (over 2 weeks), the NHS exhibited a slow releasing to realize long-term inhibition. The excellent antibacterial activity of ZnO NHS was confirmed once again by animal test in vivo. According to the subsequent experiments, the ZnO NHS coating exhibited the great advantage of high efficiency, low toxicity, and long-term durability, which could be a feasible manner to prevent the abuse of antibiotics on implant-related surgery.

  11. Two-stage revision of septic knee prosthesis with articulating knee spacers yields better infection eradication rate than one-stage or two-stage revision with static spacers.

    Science.gov (United States)

    Romanò, C L; Gala, L; Logoluso, N; Romanò, D; Drago, L

    2012-12-01

    The best method for treating chronic periprosthetic knee infection remains controversial. Randomized, comparative studies on treatment modalities are lacking. This systematic review of the literature compares the infection eradication rate after two-stage versus one-stage revision and static versus articulating spacers in two-stage procedures. We reviewed full-text papers and those with an abstract in English published from 1966 through 2011 that reported the success rate of infection eradication after one-stage or two-stage revision with two different types of spacers. In all, 6 original articles reporting the results after one-stage knee exchange arthoplasty (n = 204) and 38 papers reporting on two-stage revision (n = 1,421) were reviewed. The average success rate in the eradication of infection was 89.8% after a two-stage revision and 81.9% after a one-stage procedure at a mean follow-up of 44.7 and 40.7 months, respectively. The average infection eradication rate after a two-stage procedure was slightly, although significantly, higher when an articulating spacer rather than a static spacer was used (91.2 versus 87%). The methodological limitations of this study and the heterogeneous material in the studies reviewed notwithstanding, this systematic review shows that, on average, a two-stage procedure is associated with a higher rate of eradication of infection than one-stage revision for septic knee prosthesis and that articulating spacers are associated with a lower recurrence of infection than static spacers at a comparable mean duration of follow-up. IV.

  12. Genetic Particle Swarm Optimization–Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection

    Science.gov (United States)

    Chen, Qiang; Chen, Yunhao; Jiang, Weiguo

    2016-01-01

    In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm. PMID:27483285

  13. Genetic Particle Swarm Optimization-Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection.

    Science.gov (United States)

    Chen, Qiang; Chen, Yunhao; Jiang, Weiguo

    2016-07-30

    In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm.

  14. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data

    KAUST Repository

    Abusamra, Heba

    2013-01-01

    Different experiments have been applied to compare the performance of the classification methods with and without performing feature selection. Results revealed the important role of feature selection in classifying gene expression data. By performing feature selection, the classification accuracy can be significantly boosted by using a small number of genes. The relationship of features selected in different feature selection methods is investigated and the most frequent features selected in each fold among all methods for both datasets are evaluated.

  15. Multi-scale textural feature extraction and particle swarm optimization based model selection for false positive reduction in mammography.

    Science.gov (United States)

    Zyout, Imad; Czajkowska, Joanna; Grzegorzek, Marcin

    2015-12-01

    The high number of false positives and the resulting number of avoidable breast biopsies are the major problems faced by current mammography Computer Aided Detection (CAD) systems. False positive reduction is not only a requirement for mass but also for calcification CAD systems which are currently deployed for clinical use. This paper tackles two problems related to reducing the number of false positives in the detection of all lesions and masses, respectively. Firstly, textural patterns of breast tissue have been analyzed using several multi-scale textural descriptors based on wavelet and gray level co-occurrence matrix. The second problem addressed in this paper is the parameter selection and performance optimization. For this, we adopt a model selection procedure based on Particle Swarm Optimization (PSO) for selecting the most discriminative textural features and for strengthening the generalization capacity of the supervised learning stage based on a Support Vector Machine (SVM) classifier. For evaluating the proposed methods, two sets of suspicious mammogram regions have been used. The first one, obtained from Digital Database for Screening Mammography (DDSM), contains 1494 regions (1000 normal and 494 abnormal samples). The second set of suspicious regions was obtained from database of Mammographic Image Analysis Society (mini-MIAS) and contains 315 (207 normal and 108 abnormal) samples. Results from both datasets demonstrate the efficiency of using PSO based model selection for optimizing both classifier hyper-parameters and parameters, respectively. Furthermore, the obtained results indicate the promising performance of the proposed textural features and more specifically, those based on co-occurrence matrix of wavelet image representation technique. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Influence of the selection from incremental stages on lactate minimum intensity: a pilot study

    Directory of Open Access Journals (Sweden)

    Willian Eiji Miyagi

    2013-09-01

    Full Text Available The purposes of this study were to assess the influence of stage selection from the incremental phase and the use of peak lactate after hyperlactatemia induction on the determination of the lactate minimum intensity (iLACmin. Twelve moderately active university students (23±5 years, 78.3±14.1 kg, 175.3±5.1 cm performed a maximal incremental test to determine the respiratory compensation point (RCP (initial intensity at 70 W and increments of 17.5 W every 2 minutes and a lactate minimum test (induction with the Wingate test, the incremental test started at 30 W below RCP with increments of 10 W every 3 minutes on a cycle ergometer. The iLACmin was determined using second order polynomial adjustment applying five exercise stage selection: 1 using all stages (iLACmin P; 2 using all stages below and two stages above iLACminP (iLACminA; 3 using two stages below and all stages above iLACminP (iLACminB; 4 using the largest and same possible number of stages below and above the iLACminP (iLACminI; 5 using all stages and peak lactate after hyperlactatemia induction (iLACminD. No differences were found between the iLACminP (138.2±30.2 W, iLACminA (139.1±29.1 W, iLACminB (135.3±14.2 W, iLACminI (138.6±20.5 W and iLACmiD (136.7±28.5 W protocols, and a high level of agreement between these intensities and iLACminP was observed. Oxygen uptake, heart rate, rating of perceived exertion and lactate corresponding to these intensities was not different and was strongly correlated. However, the iLACminB presented the lowest success rate (66.7%. In conclusion, stage selection did not influence the determination of iLACmin but modified the success rate

  17. Two stages of economic development

    OpenAIRE

    Gong, Gang

    2016-01-01

    This study suggests that the development process of a less-developed country can be divided into two stages, which demonstrate significantly different properties in areas such as structural endowments, production modes, income distribution, and the forces that drive economic growth. The two stages of economic development have been indicated in the growth theory of macroeconomics and in the various "turning point" theories in development economics, including Lewis's dual economy theory, Kuznet...

  18. Feature Selection with the Boruta Package

    Directory of Open Access Journals (Sweden)

    Miron B. Kursa

    2010-10-01

    Full Text Available This article describes a R package Boruta, implementing a novel feature selection algorithm for finding emph{all relevant variables}. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes the features which are proved by a statistical test to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. The short description of the algorithm and examples of its application are presented.

  19. Embedded Incremental Feature Selection for Reinforcement Learning

    Science.gov (United States)

    2012-05-01

    Prior to this work, feature selection for reinforce- ment learning has focused on linear value function ap- proximation ( Kolter and Ng, 2009; Parr et al...InProceed- ings of the the 23rd International Conference on Ma- chine Learning, pages 449–456. Kolter , J. Z. and Ng, A. Y. (2009). Regularization and feature

  20. Computer ranking of the sequence of appearance of 73 features of the brain and related structures in staged human embryos during the sixth week of development.

    Science.gov (United States)

    O'Rahilly, R; Müller, F; Hutchins, G M; Moore, G W

    1987-09-01

    The sequence of events in the development of the brain in human embryos, already published for stages 8-15, is here continued for stages 16 and 17. With the aid of a computerized bubble-sort algorithm, 71 individual embryos were ranked in ascending order of the features present. Whereas these numbered 100 in the previous study, the increasing structural complexity gave 27 new features in the two stages now under investigation. The chief characteristics of stage 16 (approximately 37 postovulatory days) are protruding basal nuclei, the caudal olfactory elevation (olfactory tubercle), the tectobulbar tracts, and ascending fibers to the cerebellum. The main features of stage 17 (approximately 41 postovulatory days) are the cortical nucleus of the amygdaloid body, an intermediate layer in the tectum mesencephali, the posterior commissure, and the habenulo-interpeduncular tract. In addition, a typical feature at stage 17 is the crescentic shape of the lens cavity.

  1. DYNAMIC FEATURE SELECTION FOR WEB USER IDENTIFICATION ON LINGUISTIC AND STYLISTIC FEATURES OF ONLINE TEXTS

    Directory of Open Access Journals (Sweden)

    A. A. Vorobeva

    2017-01-01

    Full Text Available The paper deals with identification and authentication of web users participating in the Internet information processes (based on features of online texts.In digital forensics web user identification based on various linguistic features can be used to discover identity of individuals, criminals or terrorists using the Internet to commit cybercrimes. Internet could be used as a tool in different types of cybercrimes (fraud and identity theft, harassment and anonymous threats, terrorist or extremist statements, distribution of illegal content and information warfare. Linguistic identification of web users is a kind of biometric identification, it can be used to narrow down the suspects, identify a criminal and prosecute him. Feature set includes various linguistic and stylistic features extracted from online texts. We propose dynamic feature selection for each web user identification task. Selection is based on calculating Manhattan distance to k-nearest neighbors (Relief-f algorithm. This approach improves the identification accuracy and minimizes the number of features. Experiments were carried out on several datasets with different level of class imbalance. Experiment results showed that features relevance varies in different set of web users (probable authors of some text; features selection for each set of web users improves identification accuracy by 4% at the average that is approximately 1% higher than with the use of static set of features. The proposed approach is most effective for a small number of training samples (messages per user.

  2. Efficient Generation and Selection of Combined Features for Improved Classification

    KAUST Repository

    Shono, Ahmad N.

    2014-05-01

    This study contributes a methodology and associated toolkit developed to allow users to experiment with the use of combined features in classification problems. Methods are provided for efficiently generating combined features from an original feature set, for efficiently selecting the most discriminating of these generated combined features, and for efficiently performing a preliminary comparison of the classification results when using the original features exclusively against the results when using the selected combined features. The potential benefit of considering combined features in classification problems is demonstrated by applying the developed methodology and toolkit to three sample data sets where the discovery of combined features containing new discriminating information led to improved classification results.

  3. Two-Stage Performance Engineering of Container-based Virtualization

    Directory of Open Access Journals (Sweden)

    Zheng Li

    2018-02-01

    Full Text Available Cloud computing has become a compelling paradigm built on compute and storage virtualization technologies. The current virtualization solution in the Cloud widely relies on hypervisor-based technologies. Given the recent booming of the container ecosystem, the container-based virtualization starts receiving more attention for being a promising alternative. Although the container technologies are generally considered to be lightweight, no virtualization solution is ideally resource-free, and the corresponding performance overheads will lead to negative impacts on the quality of Cloud services. To facilitate understanding container technologies from the performance engineering’s perspective, we conducted two-stage performance investigations into Docker containers as a concrete example. At the first stage, we used a physical machine with “just-enough” resource as a baseline to investigate the performance overhead of a standalone Docker container against a standalone virtual machine (VM. With findings contrary to the related work, our evaluation results show that the virtualization’s performance overhead could vary not only on a feature-by-feature basis but also on a job-to-job basis. Moreover, the hypervisor-based technology does not come with higher performance overhead in every case. For example, Docker containers particularly exhibit lower QoS in terms of storage transaction speed. At the ongoing second stage, we employed a physical machine with “fair-enough” resource to implement a container-based MapReduce application and try to optimize its performance. In fact, this machine failed in affording VM-based MapReduce clusters in the same scale. The performance tuning results show that the effects of different optimization strategies could largely be related to the data characteristics. For example, LZO compression can bring the most significant performance improvement when dealing with text data in our case.

  4. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO for cancer feature gene selection, coupling support vector machine (SVM for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV. Finally, the BQPSO coupling SVM (BQPSO/SVM, binary PSO coupling SVM (BPSO/SVM, and genetic algorithm coupling SVM (GA/SVM are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  5. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Science.gov (United States)

    Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363

  6. Development of in Silico Models for Predicting P-Glycoprotein Inhibitors Based on a Two-Step Approach for Feature Selection and Its Application to Chinese Herbal Medicine Screening.

    Science.gov (United States)

    Yang, Ming; Chen, Jialei; Shi, Xiufeng; Xu, Liwen; Xi, Zhijun; You, Lisha; An, Rui; Wang, Xinhong

    2015-10-05

    P-glycoprotein (P-gp) is regarded as an important factor in determining the ADMET (absorption, distribution, metabolism, elimination, and toxicity) characteristics of drugs and drug candidates. Successful prediction of P-gp inhibitors can thus lead to an improved understanding of the underlying mechanisms of both changes in the pharmacokinetics of drugs and drug-drug interactions. Therefore, there has been considerable interest in the development of in silico modeling of P-gp inhibitors in recent years. Considering that a large number of molecular descriptors are used to characterize diverse structural moleculars, efficient feature selection methods are required to extract the most informative predictors. In this work, we constructed an extensive available data set of 2428 molecules that includes 1518 P-gp inhibitors and 910 P-gp noninhibitors from multiple resources. Importantly, a two-step feature selection approach based on a genetic algorithm and a greedy forward-searching algorithm was employed to select the minimum set of the most informative descriptors that contribute to the prediction of P-gp inhibitors. To determine the best machine learning algorithm, 18 classifiers coupled with the feature selection method were compared. The top three best-performing models (flexible discriminant analysis, support vector machine, and random forest) and their ensemble model using respectively only 3, 9, 7, and 14 descriptors achieve an overall accuracy of 83.2%-86.7% for the training set containing 1040 compounds, an overall accuracy of 82.3%-85.5% for the test set containing 1039 compounds, and a prediction accuracy of 77.4%-79.9% for the external validation set containing 349 compounds. The models were further extensively validated by DrugBank database (1890 compounds). The proposed models are competitive with and in some cases better than other published models in terms of prediction accuracy and minimum number of descriptors. Applicability domain then was addressed

  7. A two-stage multi-view learning framework based computer-aided diagnosis of liver tumors with contrast enhanced ultrasound images.

    Science.gov (United States)

    Guo, Le-Hang; Wang, Dan; Qian, Yi-Yi; Zheng, Xiao; Zhao, Chong-Ke; Li, Xiao-Long; Bo, Xiao-Wan; Yue, Wen-Wen; Zhang, Qi; Shi, Jun; Xu, Hui-Xiong

    2018-04-04

    With the fast development of artificial intelligence techniques, we proposed a novel two-stage multi-view learning framework for the contrast-enhanced ultrasound (CEUS) based computer-aided diagnosis for liver tumors, which adopted only three typical CEUS images selected from the arterial phase, portal venous phase and late phase. In the first stage, the deep canonical correlation analysis (DCCA) was performed on three image pairs between the arterial and portal venous phases, arterial and delayed phases, and portal venous and delayed phases respectively, which then generated total six-view features. While in the second stage, these multi-view features were then fed to a multiple kernel learning (MKL) based classifier to further promote the diagnosis result. Two MKL classification algorithms were evaluated in this MKL-based classification framework. We evaluated proposed DCCA-MKL framework on 93 lesions (47 malignant cancers vs. 46 benign tumors). The proposed DCCA-MKL framework achieved the mean classification accuracy, sensitivity, specificity, Youden index, false positive rate, and false negative rate of 90.41 ± 5.80%, 93.56 ± 5.90%, 86.89 ± 9.38%, 79.44 ± 11.83%, 13.11 ± 9.38% and 6.44 ± 5.90%, respectively, by soft margin MKL classifier. The experimental results indicate that the proposed DCCA-MKL framework achieves best performance for discriminating benign liver tumors from malignant liver cancers. Moreover, it is also proved that the three-phase CEUS image based CAD is feasible for liver tumors with the proposed DCCA-MKL framework.

  8. [Feature extraction for breast cancer data based on geometric algebra theory and feature selection using differential evolution].

    Science.gov (United States)

    Li, Jing; Hong, Wenxue

    2014-12-01

    The feature extraction and feature selection are the important issues in pattern recognition. Based on the geometric algebra representation of vector, a new feature extraction method using blade coefficient of geometric algebra was proposed in this study. At the same time, an improved differential evolution (DE) feature selection method was proposed to solve the elevated high dimension issue. The simple linear discriminant analysis was used as the classifier. The result of the 10-fold cross-validation (10 CV) classification of public breast cancer biomedical dataset was more than 96% and proved superior to that of the original features and traditional feature extraction method.

  9. Cost-Sensitive Feature Selection of Numeric Data with Measurement Errors

    Directory of Open Access Journals (Sweden)

    Hong Zhao

    2013-01-01

    Full Text Available Feature selection is an essential process in data mining applications since it reduces a model’s complexity. However, feature selection with various types of costs is still a new research topic. In this paper, we study the cost-sensitive feature selection problem of numeric data with measurement errors. The major contributions of this paper are fourfold. First, a new data model is built to address test costs and misclassification costs as well as error boundaries. It is distinguished from the existing models mainly on the error boundaries. Second, a covering-based rough set model with normal distribution measurement errors is constructed. With this model, coverings are constructed from data rather than assigned by users. Third, a new cost-sensitive feature selection problem is defined on this model. It is more realistic than the existing feature selection problems. Fourth, both backtracking and heuristic algorithms are proposed to deal with the new problem. Experimental results show the efficiency of the pruning techniques for the backtracking algorithm and the effectiveness of the heuristic algorithm. This study is a step toward realistic applications of the cost-sensitive learning.

  10. A Feature Subset Selection Method Based On High-Dimensional Mutual Information

    Directory of Open Access Journals (Sweden)

    Chee Keong Kwoh

    2011-04-01

    Full Text Available Feature selection is an important step in building accurate classifiers and provides better understanding of the data sets. In this paper, we propose a feature subset selection method based on high-dimensional mutual information. We also propose to use the entropy of the class attribute as a criterion to determine the appropriate subset of features when building classifiers. We prove that if the mutual information between a feature set X and the class attribute Y equals to the entropy of Y , then X is a Markov Blanket of Y . We show that in some cases, it is infeasible to approximate the high-dimensional mutual information with algebraic combinations of pairwise mutual information in any forms. In addition, the exhaustive searches of all combinations of features are prerequisite for finding the optimal feature subsets for classifying these kinds of data sets. We show that our approach outperforms existing filter feature subset selection methods for most of the 24 selected benchmark data sets.

  11. Feature-selective attention: evidence for a decline in old age.

    Science.gov (United States)

    Quigley, Cliodhna; Andersen, Søren K; Schulze, Lars; Grunwald, Martin; Müller, Matthias M

    2010-04-19

    Although attention in older adults is an active research area, feature-selective aspects have not yet been explicitly studied. Here we report the results of an exploratory study involving directed changes in feature-selective attention. The stimuli used were two random dot kinematograms (RDKs) of different colours, superimposed and centrally presented. A colour cue with random onset after the beginning of each trial instructed young and older subjects to attend to one of the RDKs and detect short intervals of coherent motion while ignoring analogous motion events in the non-cued RDK. Behavioural data show that older adults could detect motion, but discriminated target from distracter motion less reliably than young adults. The method of frequency tagging allowed us to separate the EEG responses to the attended and ignored stimuli and directly compare steady-state visual evoked potential (SSVEP) amplitudes elicited by each stimulus before and after cue onset. We found that younger adults show a clear attentional enhancement of SSVEP amplitude in the post-cue interval, while older adults' SSVEP responses to attended and ignored stimuli do not differ. Thus, in situations where attentional selection cannot be spatially resolved, older adults show a deficit in selection that is not shared by young adults. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.

  12. Comparison of single-stage and temperature-phased two-stage anaerobic digestion of oily food waste

    International Nuclear Information System (INIS)

    Wu, Li-Jie; Kobayashi, Takuro; Li, Yu-You; Xu, Kai-Qin

    2015-01-01

    Highlights: • A single-stage and two two-stage anaerobic systems were synchronously operated. • Similar methane production 0.44 L/g VS_a_d_d_e_d from oily food waste was achieved. • The first stage of the two-stage process became inefficient due to serious pH drop. • Recycle favored the hythan production in the two-stage digestion. • The conversion of unsaturated fatty acids was enhanced by recycle introduction. - Abstract: Anaerobic digestion is an effective technology to recover energy from oily food waste. A single-stage system and temperature-phased two-stage systems with and without recycle for anaerobic digestion of oily food waste were constructed to compare the operation performances. The synchronous operation indicated the similar ability to produce methane in the three systems, with a methane yield of 0.44 L/g VS_a_d_d_e_d. The pH drop to less than 4.0 in the first stage of two-stage system without recycle resulted in poor hydrolysis, and methane or hydrogen was not produced in this stage. Alkalinity supplement from the second stage of two-stage system with recycle improved pH in the first stage to 5.4. Consequently, 35.3% of the particulate COD in the influent was reduced in the first stage of two-stage system with recycle according to a COD mass balance, and hydrogen was produced with a percentage of 31.7%, accordingly. Similar solids and organic matter were removed in the single-stage system and two-stage system without recycle. More lipid degradation and the conversion of long-chain fatty acids were achieved in the single-stage system. Recycling was proved to be effective in promoting the conversion of unsaturated long-chain fatty acids into saturated fatty acids in the two-stage system.

  13. Adaptive feature selection using v-shaped binary particle swarm optimization.

    Science.gov (United States)

    Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.

  14. Speech Emotion Feature Selection Method Based on Contribution Analysis Algorithm of Neural Network

    International Nuclear Information System (INIS)

    Wang Xiaojia; Mao Qirong; Zhan Yongzhao

    2008-01-01

    There are many emotion features. If all these features are employed to recognize emotions, redundant features may be existed. Furthermore, recognition result is unsatisfying and the cost of feature extraction is high. In this paper, a method to select speech emotion features based on contribution analysis algorithm of NN is presented. The emotion features are selected by using contribution analysis algorithm of NN from the 95 extracted features. Cluster analysis is applied to analyze the effectiveness for the features selected, and the time of feature extraction is evaluated. Finally, 24 emotion features selected are used to recognize six speech emotions. The experiments show that this method can improve the recognition rate and the time of feature extraction

  15. How do we select multiple features? Transient costs for selecting two colors rather than one, persistent costs for color-location conjunctions.

    Science.gov (United States)

    Lo, Shih-Yu; Holcombe, Alex O

    2014-02-01

    In a previous study Lo, Howard, & Holcombe (Vision Research 63:20-33, 2012), selecting two colors did not induce a performance cost, relative to selecting one color. For example, requiring possible report of both a green and a red target did not yield a worse performance than when both targets were green. Yet a cost of selecting multiple colors was observed when selection needed be contingent on both color and location. When selecting a red target to the left and a green target to the right, superimposing a green distractor to the left and a red distractor to the right impeded performance. Possibly, participants cannot confine attention to a color at a particular location. As a result, distractors that share the target colors disrupt attentional selection of the targets. The attempt to select the targets must then be repeated, which increases the likelihood that the trial terminates when selection is not effective, even for long trials. Consistent with this, here we find a persistent cost of selecting two colors when the conjunction of color and location is needed, but the cost is confined to short exposure durations when the observer just has to monitor red and green stimuli without the need to use the location information. These results suggest that selecting two colors is time-consuming but effective, whereas selection of simultaneous conjunctions is never entirely successful.

  16. MR imaging features and staging of neuroendocrine carcinomas of the uterine cervix with pathological correlations

    Energy Technology Data Exchange (ETDEWEB)

    Duan, Xiaohui; Zhang, Xiang; Hu, Huijun; Li, Guozhao; Wang, Dongye; Zhang, Fang; Shen, Jun [Sun Yat-Sen University, Department of Radiology, Sun Yat-Sen Memorial Hospital, Guangzhou (China); Ban, Xiaohua [Sun Yat-Sen University, Medical Imaging and Minimally Invasive Interventional Center and State Key Laboratory of Oncology in Southern China, Cancer Center, Guangzhou, Guangdong (China); Wang, Charles Qian [Sun Yat-Sen University, Department of Radiology, Sun Yat-Sen Memorial Hospital, Guangzhou (China); University of New South Wales, JMO, Westmead Hospital, Sydney (Australia)

    2016-12-15

    To determine MR imaging features and staging accuracy of neuroendocrine carcinomas (NECs) of the uterine cervix with pathological correlations. Twenty-six patients with histologically proven NECs, 60 patients with squamous cell carcinomas (SCCs), and 30 patients with adenocarcinomas of the uterine cervix were included. The clinical data, pathological findings, and MRI findings were reviewed retrospectively. MRI features of cervical NECs, SCCs, and adenocarcinomas were compared, and MRI staging of cervical NECs was compared with the pathological staging. Cervical NECs showed a higher tendency toward a homogeneous signal intensity on T2-weighted imaging and a homogeneous enhancement pattern, as well as a lower ADC value of tumour and a higher incidence of lymphadenopathy, compared with SCCs and adenocarcinomas (P < 0.05). An ADC value cutoff of 0.90 x 10{sup -3} mm{sup 2}/s was robust for differentiation between cervical NECs and other cervical cancers, with a sensitivity of 63.3 % and a specificity of 95 %. In 21 patients who underwent radical hysterectomy and lymphadenectomy, the overall accuracy of tumour staging by MR imaging was 85.7 % with reference to pathology staging. Homogeneous lesion texture and low ADC value are likely suggestive features of cervical NECs and MR imaging is reliable for the staging of cervical NECs. (orig.)

  17. Feature Selection using Multi-objective Genetic Algorith m: A Hybrid Approach

    OpenAIRE

    Ahuja, Jyoti; GJUST - Guru Jambheshwar University of Sciecne and Technology; Ratnoo, Saroj Dahiya; GJUST - Guru Jambheshwar University of Sciecne and Technology

    2015-01-01

    Feature selection is an important pre-processing task for building accurate and comprehensible classification models. Several researchers have applied filter, wrapper or hybrid approaches using genetic algorithms which are good candidates for optimization problems that involve large search spaces like in the case of feature selection. Moreover, feature selection is an inherently multi-objective problem with many competing objectives involving size, predictive power and redundancy of the featu...

  18. Two-Stage Centrifugal Fan

    Science.gov (United States)

    Converse, David

    2011-01-01

    Fan designs are often constrained by envelope, rotational speed, weight, and power. Aerodynamic performance and motor electrical performance are heavily influenced by rotational speed. The fan used in this work is at a practical limit for rotational speed due to motor performance characteristics, and there is no more space available in the packaging for a larger fan. The pressure rise requirements keep growing. The way to ordinarily accommodate a higher DP is to spin faster or grow the fan rotor diameter. The invention is to put two radially oriented stages on a single disk. Flow enters the first stage from the center; energy is imparted to the flow in the first stage blades, the flow is redirected some amount opposite to the direction of rotation in the fixed stators, and more energy is imparted to the flow in the second- stage blades. Without increasing either rotational speed or disk diameter, it is believed that as much as 50 percent more DP can be achieved with this design than with an ordinary, single-stage centrifugal design. This invention is useful primarily for fans having relatively low flow rates with relatively high pressure rise requirements.

  19. Feature selection is the ReliefF for multiple instance learning

    NARCIS (Netherlands)

    Zafra, A.; Pechenizkiy, M.; Ventura, S.

    2010-01-01

    Dimensionality reduction and feature selection in particular are known to be of a great help for making supervised learning more effective and efficient. Many different feature selection techniques have been proposed for the traditional settings, where each instance is expected to have a label. In

  20. Many-Objective Particle Swarm Optimization Using Two-Stage Strategy and Parallel Cell Coordinate System.

    Science.gov (United States)

    Hu, Wang; Yen, Gary G; Luo, Guangchun

    2017-06-01

    It is a daunting challenge to balance the convergence and diversity of an approximate Pareto front in a many-objective optimization evolutionary algorithm. A novel algorithm, named many-objective particle swarm optimization with the two-stage strategy and parallel cell coordinate system (PCCS), is proposed in this paper to improve the comprehensive performance in terms of the convergence and diversity. In the proposed two-stage strategy, the convergence and diversity are separately emphasized at different stages by a single-objective optimizer and a many-objective optimizer, respectively. A PCCS is exploited to manage the diversity, such as maintaining a diverse archive, identifying the dominance resistant solutions, and selecting the diversified solutions. In addition, a leader group is used for selecting the global best solutions to balance the exploitation and exploration of a population. The experimental results illustrate that the proposed algorithm outperforms six chosen state-of-the-art designs in terms of the inverted generational distance and hypervolume over the DTLZ test suite.

  1. Two-stage clustering (TSC: a pipeline for selecting operational taxonomic units for the high-throughput sequencing of PCR amplicons.

    Directory of Open Access Journals (Sweden)

    Xiao-Tao Jiang

    Full Text Available Clustering 16S/18S rRNA amplicon sequences into operational taxonomic units (OTUs is a critical step for the bioinformatic analysis of microbial diversity. Here, we report a pipeline for selecting OTUs with a relatively low computational demand and a high degree of accuracy. This pipeline is referred to as two-stage clustering (TSC because it divides tags into two groups according to their abundance and clusters them sequentially. The more abundant group is clustered using a hierarchical algorithm similar to that in ESPRIT, which has a high degree of accuracy but is computationally costly for large datasets. The rarer group, which includes the majority of tags, is then heuristically clustered to improve efficiency. To further improve the computational efficiency and accuracy, two preclustering steps are implemented. To maintain clustering accuracy, all tags are grouped into an OTU depending on their pairwise Needleman-Wunsch distance. This method not only improved the computational efficiency but also mitigated the spurious OTU estimation from 'noise' sequences. In addition, OTUs clustered using TSC showed comparable or improved performance in beta-diversity comparisons compared to existing OTU selection methods. This study suggests that the distribution of sequencing datasets is a useful property for improving the computational efficiency and increasing the clustering accuracy of the high-throughput sequencing of PCR amplicons. The software and user guide are freely available at http://hwzhoulab.smu.edu.cn/paperdata/.

  2. Information Overload in Multi-Stage Selection Procedures

    NARCIS (Netherlands)

    S.S. Ficco (Stefano); V.A. Karamychev (Vladimir)

    2004-01-01

    textabstractThe paper studies information processing imperfections in a fully rational decision-making network. It is shown that imperfect information transmission and imperfect information acquisition in a multi-stage selection game yield information overload. The paper analyses the mechanisms

  3. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Multi-Objective Particle Swarm Optimization Approach for Cost-Based Feature Selection in Classification.

    Science.gov (United States)

    Zhang, Yong; Gong, Dun-Wei; Cheng, Jian

    2017-01-01

    Feature selection is an important data-preprocessing technique in classification problems such as bioinformatics and signal processing. Generally, there are some situations where a user is interested in not only maximizing the classification performance but also minimizing the cost that may be associated with features. This kind of problem is called cost-based feature selection. However, most existing feature selection approaches treat this task as a single-objective optimization problem. This paper presents the first study of multi-objective particle swarm optimization (PSO) for cost-based feature selection problems. The task of this paper is to generate a Pareto front of nondominated solutions, that is, feature subsets, to meet different requirements of decision-makers in real-world applications. In order to enhance the search capability of the proposed algorithm, a probability-based encoding technology and an effective hybrid operator, together with the ideas of the crowding distance, the external archive, and the Pareto domination relationship, are applied to PSO. The proposed PSO-based multi-objective feature selection algorithm is compared with several multi-objective feature selection algorithms on five benchmark datasets. Experimental results show that the proposed algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.

  5. Predictive Feature Selection for Genetic Policy Search

    Science.gov (United States)

    2014-05-22

    limited manual intervention are becoming increasingly desirable as more complex tasks in dynamic and high- tempo environments are explored. Reinforcement...states in many domains causes features relevant to the reward variations to be overlooked, which hinders the policy search. 3.4 Parameter Selection PFS...the current feature subset. This local minimum may be “deceptive,” meaning that it does not clearly lead to the global optimal policy ( Goldberg and

  6. Comparison of feature selection and classification for MALDI-MS data

    Directory of Open Access Journals (Sweden)

    Yang Mary

    2009-07-01

    Full Text Available Abstract Introduction In the classification of Mass Spectrometry (MS proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS data were recently compared; however, the issue of different feature selection methods and different classification models as they relate to classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare the methods of feature selection and different learning classifiers when applied to MALDI-MS data and to provide a subsequent reference for the analysis of MS proteomics data. Results We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE, and a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS that effectively performs microarray data analysis. We also compared several learning classifiers including K-Nearest Neighbor Classifier (KNNC, Naïve Bayes Classifier (NBC, Nearest Mean Scaled Classifier (NMSC, uncorrelated normal based quadratic Bayes Classifier recorded as UDC, Support Vector Machines, and a distance metric learning for Large Margin Nearest Neighbor classifier (LMNN based on Mahanalobis distance. To compare, we conducted a comprehensive experimental study using three types of MALDI-MS data. Conclusion Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing

  7. HOSPITAL SITE SELECTION USING TWO-STAGE FUZZY MULTI-CRITERIA DECISION MAKING PROCESS

    Directory of Open Access Journals (Sweden)

    Ali Soltani

    2011-06-01

    Full Text Available Site selection for sitting of urban activities/facilities is one of the crucial policy-related decisions taken by urban planners and policy makers. The process of site selection is inherently complicated. A careless site imposes exorbitant costs on city budget and damages the environment inevitably. Nowadays, multi-attributes decision making approaches are suggested to use to improve precision of decision making and reduce surplus side effects. Two well-known techniques, analytical hierarchal process and analytical network process are among multi-criteria decision making systems which can easily be consistent with both quantitative and qualitative criteria. These are also developed to be fuzzy analytical hierarchal process and fuzzy analytical network process systems which are capable of accommodating inherent uncertainty and vagueness in multi-criteria decision-making. This paper reports the process and results of a hospital site selection within the Region 5 of Shiraz metropolitan area, Iran using integrated fuzzy analytical network process systems with Geographic Information System (GIS. The weights of the alternatives were calculated using fuzzy analytical network process. Then a sensitivity analysis was conducted to measure the elasticity of a decision in regards to different criteria. This study contributes to planning practice by suggesting a more comprehensive decision making tool for site selection.

  8. HOSPITAL SITE SELECTION USING TWO-STAGE FUZZY MULTI-CRITERIA DECISION MAKING PROCESS

    Directory of Open Access Journals (Sweden)

    Ali Soltani

    2011-01-01

    Full Text Available Site selection for sitting of urban activities/facilities is one of the crucial policy-related decisions taken by urban planners and policy makers. The process of site selection is inherently complicated. A careless site imposes exorbitant costs on city budget and damages the environment inevitably. Nowadays, multi-attributes decision making approaches are suggested to use to improve precision of decision making and reduce surplus side effects. Two well-known techniques, analytical hierarchal process and analytical network process are among multi-criteria decision making systems which can easily be consistent with both quantitative and qualitative criteria. These are also developed to be fuzzy analytical hierarchal process and fuzzy analytical network process systems which are capable of accommodating inherent uncertainty and vagueness in multi-criteria decision-making. This paper reports the process and results of a hospital site selection within the Region 5 of Shiraz metropolitan area, Iran using integrated fuzzy analytical network process systems with Geographic Information System (GIS. The weights of the alternatives were calculated using fuzzy analytical network process. Then a sensitivity analysis was conducted to measure the elasticity of a decision in regards to different criteria. This study contributes to planning practice by suggesting a more comprehensive decision making tool for site selection.

  9. Strong selection on mandible and nest features in a carpenter bee that nests in two sympatric host plants

    OpenAIRE

    Flores-Prado, Luis; Pinto, Carlos F; Rojas, Alejandra; Fontúrbel, Francisco E

    2014-01-01

    Host plants are used by herbivorous insects as feeding or nesting resources. In wood-boring insects, host plants features may impose selective forces leading to phenotypic differentiation on traits related to nest construction. Carpenter bees build their nests in dead stems or dry twigs of shrubs and trees; thus, mandibles are essential for the nesting process, and the nest is required for egg laying and offspring survival. We explored the shape and intensity of natural selection on phenotypi...

  10. Manifold regularized multi-task feature selection for multi-modality classification in Alzheimer's disease.

    Science.gov (United States)

    Jie, Biao; Zhang, Daoqiang; Cheng, Bo; Shen, Dinggang

    2013-01-01

    Accurate diagnosis of Alzheimer's disease (AD), as well as its prodromal stage (i.e., mild cognitive impairment, MCI), is very important for possible delay and early treatment of the disease. Recently, multi-modality methods have been used for fusing information from multiple different and complementary imaging and non-imaging modalities. Although there are a number of existing multi-modality methods, few of them have addressed the problem of joint identification of disease-related brain regions from multi-modality data for classification. In this paper, we proposed a manifold regularized multi-task learning framework to jointly select features from multi-modality data. Specifically, we formulate the multi-modality classification as a multi-task learning framework, where each task focuses on the classification based on each modality. In order to capture the intrinsic relatedness among multiple tasks (i.e., modalities), we adopted a group sparsity regularizer, which ensures only a small number of features to be selected jointly. In addition, we introduced a new manifold based Laplacian regularization term to preserve the geometric distribution of original data from each task, which can lead to the selection of more discriminative features. Furthermore, we extend our method to the semi-supervised setting, which is very important since the acquisition of a large set of labeled data (i.e., diagnosis of disease) is usually expensive and time-consuming, while the collection of unlabeled data is relatively much easier. To validate our method, we have performed extensive evaluations on the baseline Magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) data of Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Our experimental results demonstrate the effectiveness of the proposed method.

  11. A selective overview of feature screening for ultrahigh-dimensional data.

    Science.gov (United States)

    JingYuan, Liu; Wei, Zhong; RunZe, L I

    2015-10-01

    High-dimensional data have frequently been collected in many scientific areas including genomewide association study, biomedical imaging, tomography, tumor classifications, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and motivation for the need of model-free feature screening procedures.

  12. Minimizing coupling loss by selection of twist pitch lengths in multi-stage cable-in-conduit conductors

    International Nuclear Information System (INIS)

    Rolando, G; Nijhuis, A; Devred, A

    2014-01-01

    The numerical code JackPot-ACDC (van Lanen et al 2010 Cryogenics 50 139–48, van Lanen et al 2011 IEEE Trans. Appl. Supercond. 21 1926–9, van Lanen et al 2012 Supercond. Sci. Technol. 25 025012) allows fast parametric studies of the electro-magnetic performance of cable-in-conduit conductors (CICCs). In this paper the code is applied to the analysis of the relation between twist pitch length sequence and coupling loss in multi-stage ITER-type CICCs. The code shows that in the analysed conductors the coupling loss is at its minimum when the twist pitches of the successive cabling stages have a length ratio close to one. It is also predicted that by careful selection of the stage-to-stage twist pitch ratio, CICCs cabled according to long twist schemes in the initial stages can achieve lower coupling loss than conductors with shorter pitches. The result is validated by AC loss measurements performed on prototype conductors for the ITER Central Solenoid featuring different twist pitch sequences. (paper)

  13. Empirical study of classification process for two-stage turbo air classifier in series

    Science.gov (United States)

    Yu, Yuan; Liu, Jiaxiang; Li, Gang

    2013-05-01

    The suitable process parameters for a two-stage turbo air classifier are important for obtaining the ultrafine powder that has a narrow particle-size distribution, however little has been published internationally on the classification process for the two-stage turbo air classifier in series. The influence of the process parameters of a two-stage turbo air classifier in series on classification performance is empirically studied by using aluminum oxide powders as the experimental material. The experimental results show the following: 1) When the rotor cage rotary speed of the first-stage classifier is increased from 2 300 r/min to 2 500 r/min with a constant rotor cage rotary speed of the second-stage classifier, classification precision is increased from 0.64 to 0.67. However, in this case, the final ultrafine powder yield is decreased from 79% to 74%, which means the classification precision and the final ultrafine powder yield can be regulated through adjusting the rotor cage rotary speed of the first-stage classifier. 2) When the rotor cage rotary speed of the second-stage classifier is increased from 2 500 r/min to 3 100 r/min with a constant rotor cage rotary speed of the first-stage classifier, the cut size is decreased from 13.16 μm to 8.76 μm, which means the cut size of the ultrafine powder can be regulated through adjusting the rotor cage rotary speed of the second-stage classifier. 3) When the feeding speed is increased from 35 kg/h to 50 kg/h, the "fish-hook" effect is strengthened, which makes the ultrafine powder yield decrease. 4) To weaken the "fish-hook" effect, the equalization of the two-stage wind speeds or the combination of a high first-stage wind speed with a low second-stage wind speed should be selected. This empirical study provides a criterion of process parameter configurations for a two-stage or multi-stage classifier in series, which offers a theoretical basis for practical production.

  14. Feature-Selective Attention Adaptively Shifts Noise Correlations in Primary Auditory Cortex.

    Science.gov (United States)

    Downer, Joshua D; Rapone, Brittany; Verhein, Jessica; O'Connor, Kevin N; Sutter, Mitchell L

    2017-05-24

    Sensory environments often contain an overwhelming amount of information, with both relevant and irrelevant information competing for neural resources. Feature attention mediates this competition by selecting the sensory features needed to form a coherent percept. How attention affects the activity of populations of neurons to support this process is poorly understood because population coding is typically studied through simulations in which one sensory feature is encoded without competition. Therefore, to study the effects of feature attention on population-based neural coding, investigations must be extended to include stimuli with both relevant and irrelevant features. We measured noise correlations ( r noise ) within small neural populations in primary auditory cortex while rhesus macaques performed a novel feature-selective attention task. We found that the effect of feature-selective attention on r noise depended not only on the population tuning to the attended feature, but also on the tuning to the distractor feature. To attempt to explain how these observed effects might support enhanced perceptual performance, we propose an extension of a simple and influential model in which shifts in r noise can simultaneously enhance the representation of the attended feature while suppressing the distractor. These findings present a novel mechanism by which attention modulates neural populations to support sensory processing in cluttered environments. SIGNIFICANCE STATEMENT Although feature-selective attention constitutes one of the building blocks of listening in natural environments, its neural bases remain obscure. To address this, we developed a novel auditory feature-selective attention task and measured noise correlations ( r noise ) in rhesus macaque A1 during task performance. Unlike previous studies showing that the effect of attention on r noise depends on population tuning to the attended feature, we show that the effect of attention depends on the tuning

  15. Game Theoretic Approach for Systematic Feature Selection; Application in False Alarm Detection in Intensive Care Units

    Directory of Open Access Journals (Sweden)

    Fatemeh Afghah

    2018-03-01

    Full Text Available Intensive Care Units (ICUs are equipped with many sophisticated sensors and monitoring devices to provide the highest quality of care for critically ill patients. However, these devices might generate false alarms that reduce standard of care and result in desensitization of caregivers to alarms. Therefore, reducing the number of false alarms is of great importance. Many approaches such as signal processing and machine learning, and designing more accurate sensors have been developed for this purpose. However, the significant intrinsic correlation among the extracted features from different sensors has been mostly overlooked. A majority of current data mining techniques fail to capture such correlation among the collected signals from different sensors that limits their alarm recognition capabilities. Here, we propose a novel information-theoretic predictive modeling technique based on the idea of coalition game theory to enhance the accuracy of false alarm detection in ICUs by accounting for the synergistic power of signal attributes in the feature selection stage. This approach brings together techniques from information theory and game theory to account for inter-features mutual information in determining the most correlated predictors with respect to false alarm by calculating Banzhaf power of each feature. The numerical results show that the proposed method can enhance classification accuracy and improve the area under the ROC (receiver operating characteristic curve compared to other feature selection techniques, when integrated in classifiers such as Bayes-Net that consider inter-features dependencies.

  16. Audio-visual synchrony and feature-selective attention co-amplify early visual processing.

    Science.gov (United States)

    Keitel, Christian; Müller, Matthias M

    2016-05-01

    Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both, attending to the colour of a stimulus and its synchrony with the tone, enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space.

  17. SUCCESS FACTORS IN GROWING SMBs: A STUDY OF TWO INDUSTRIES AT TWO STAGES OF DEVELOPMENT

    Directory of Open Access Journals (Sweden)

    Tor Jarl Trondsen

    2002-01-01

    Full Text Available The study attempts to identify factors for growing SMBs. An evolutionary phase approach has been used. The study also aims to find out if there are common and different denominators for newer and older firms that can affect their profitability. The study selects a sampling frame that isolates two groups of firms in two industries at two stages of development. A variety of organizational and structural data was collected and analyzed. Amongst the conclusions that may be drawn from the study are that it is not easy to find a common definition of success, it is important to stratify SMBs when studying them, an evolutionary stage approach helps to compare firms with roughly the same external and internal dynamics and each industry has its own set of success variables.The study has identified three success variables for older firms that reflect contemporary strategic thinking such as crafting a good strategy and changing it only incrementally, building core competencies and outsourcing the rest, and keeping up with innovation and honing competitive skills.

  18. Feature selection in classification of eye movements using electrooculography for activity recognition.

    Science.gov (United States)

    Mala, S; Latha, K

    2014-01-01

    Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.

  19. Opposed piston linear compressor driven two-stage Stirling Cryocooler for cooling of IR sensors in space application

    Science.gov (United States)

    Bhojwani, Virendra; Inamdar, Asif; Lele, Mandar; Tendolkar, Mandar; Atrey, Milind; Bapat, Shridhar; Narayankhedkar, Kisan

    2017-04-01

    A two-stage Stirling Cryocooler has been developed and tested for cooling IR sensors in space application. The concept uses an opposed piston linear compressor to drive the two-stage Stirling expander. The configuration used a moving coil linear motor for the compressor as well as for the expander unit. Electrical phase difference of 80 degrees was maintained between the voltage waveforms supplied to the compressor motor and expander motor. The piston and displacer surface were coated with Rulon an anti-friction material to ensure oil less operation of the unit. The present article discusses analysis results, features of the cryocooler and experimental tests conducted on the developed unit. The two-stages of Cryo-cylinder and the expander units were manufactured from a single piece to ensure precise alignment between the two-stages. Flexure bearings were used to suspend the piston and displacer about its mean position. The objective of the work was to develop a two-stage Stirling cryocooler with 2 W at 120 K and 0.5 W at 60 K cooling capacity for the two-stages and input power of less than 120 W. The Cryocooler achieved a minimum temperature of 40.7 K at stage 2.

  20. Assessing efficiency and effectiveness of Malaysian Islamic banks: A two stage DEA analysis

    Science.gov (United States)

    Kamarudin, Norbaizura; Ismail, Wan Rosmanira; Mohd, Muhammad Azri

    2014-06-01

    Islamic banks in Malaysia are indispensable players in the financial industry with the growing needs for syariah compliance system. In the banking industry, most recent studies concerned only on operational efficiency. However rarely on the operational effectiveness. Since the production process of banking industry can be described as a two-stage process, two-stage Data Envelopment Analysis (DEA) can be applied to measure the bank performance. This study was designed to measure the overall performance in terms of efficiency and effectiveness of Islamic banks in Malaysia using Two-Stage DEA approach. This paper presents analysis of a DEA model which split the efficiency and effectiveness in order to evaluate the performance of ten selected Islamic Banks in Malaysia for the financial year period ended 2011. The analysis shows average efficient score is more than average effectiveness score thus we can say that Malaysian Islamic banks were more efficient rather than effective. Furthermore, none of the bank exhibit best practice in both stages as we can say that a bank with better efficiency does not always mean having better effectiveness at the same time.

  1. Effects of changing canopy directional reflectance on feature selection

    Science.gov (United States)

    Smith, J. A.; Oliver, R. E.; Kilpela, O. E.

    1973-01-01

    The use of a Monte Carlo model for generating sample directional reflectance data for two simplified target canopies at two different solar positions is reported. Successive iterations through the model permit the calculation of a mean vector and covariance matrix for canopy reflectance for varied sensor view angles. These data may then be used to calculate the divergence between the target distributions for various wavelength combinations and for these view angles. Results of a feature selection analysis indicate that different sets of wavelengths are optimum for target discrimination depending on sensor view angle and that the targets may be more easily discriminated for some scan angles than others. The time-varying behavior of these results is also pointed out.

  2. Less is more: Avoiding the LIBS dimensionality curse through judicious feature selection for explosive detection

    Science.gov (United States)

    Kumar Myakalwar, Ashwin; Spegazzini, Nicolas; Zhang, Chi; Kumar Anubham, Siva; Dasari, Ramachandra R.; Barman, Ishan; Kumar Gundawar, Manoj

    2015-01-01

    Despite its intrinsic advantages, translation of laser induced breakdown spectroscopy for material identification has been often impeded by the lack of robustness of developed classification models, often due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts in establishing the subset of relevant spectral features that enable a fundamental interpretation of the segmentation capability and avoid the ‘curse of dimensionality’ have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometrics classifiers –based on feature selection through prerequisite knowledge of the sample composition and genetic algorithm, respectively. While the full spectral input results in classification rate of ca.92%, selection of only carbon to hydrogen spectral window results in near identical performance. Importantly, the genetic algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete filter based detector thereby enabling a cheaper and compact system more amenable to field operations. PMID:26286630

  3. An Ensemble Method with Integration of Feature Selection and Classifier Selection to Detect the Landslides

    Science.gov (United States)

    Zhongqin, G.; Chen, Y.

    2017-12-01

    Abstract Quickly identify the spatial distribution of landslides automatically is essential for the prevention, mitigation and assessment of the landslide hazard. It's still a challenging job owing to the complicated characteristics and vague boundary of the landslide areas on the image. The high resolution remote sensing image has multi-scales, complex spatial distribution and abundant features, the object-oriented image classification methods can make full use of the above information and thus effectively detect the landslides after the hazard happened. In this research we present a new semi-supervised workflow, taking advantages of recent object-oriented image analysis and machine learning algorithms to quick locate the different origins of landslides of some areas on the southwest part of China. Besides a sequence of image segmentation, feature selection, object classification and error test, this workflow ensemble the feature selection and classifier selection. The feature this study utilized were normalized difference vegetation index (NDVI) change, textural feature derived from the gray level co-occurrence matrices (GLCM), spectral feature and etc. The improvement of this study shows this algorithm significantly removes some redundant feature and the classifiers get fully used. All these improvements lead to a higher accuracy on the determination of the shape of landslides on the high resolution remote sensing image, in particular the flexibility aimed at different kinds of landslides.

  4. Computer ranking of the sequence of appearance of 100 features of the brain and related structures in staged human embryos during the first 5 weeks of development.

    Science.gov (United States)

    O'Rahilly, R; Müller, F; Hutchins, G M; Moore, G W

    1984-11-01

    The sequence of events in the development of the brain in staged human embryos was investigated in much greater detail than in previous studies by listing 100 features in 165 embryos of the first 5 weeks. Using a computerized bubble-sort algorithm, individual embryos were ranked in ascending order of the features present. This procedure made feasible an appreciation of the slight variation found in the developmental features. The vast majority of features appeared during either one or two stages (about 2 or 3 days). In general, the soundness of the Carnegie system of embryonic staging was amply confirmed. The rhombencephalon was found to show increasing complexity around stage 13, and the postoptic portion of the diencephalon underwent considerable differentiation by stage 15. The need for similar investigations of other systems of the body is emphasized, and the importance of such studies in assessing the timing of congenital malformations and in clarifying syndromic clusters is suggested.

  5. Features of application of test techniques in selection and certification of personnel

    Directory of Open Access Journals (Sweden)

    Levina E.V.

    2017-04-01

    Full Text Available this article investigates the problem of use of test techniques in the course of selection and certification of personnel, and also for detection of features of formation of social and psychological climate in small groups and labor collectives. The group dynamics including individual preferences of players in the choice of strategy of behavior in a conflict situation in definition of the most acceptable social role, in identification of the most constructive and effective type of the administrative decision has the special importance. The author of the article offers a set of the test methods allowing to define social and psychological type of the individual at a stage of staff recruitment and to reveal its individual behavioral preferences that can structurally influence on team work.

  6. An ant colony optimization based feature selection for web page classification.

    Science.gov (United States)

    Saraç, Esra; Özel, Selma Ayşe

    2014-01-01

    The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.

  7. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    Science.gov (United States)

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a high range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled independently, the dimensions in these dimensional spaces. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages in modeling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include an stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are more informative into the arousal and valence space discrimination are the EEG, Electrooculogram/Electromiogram (EOG/EMG) and the Galvanic Skin Response (GSR).

  8. Hierarchical feature selection for erythema severity estimation

    Science.gov (United States)

    Wang, Li; Shi, Chenbo; Shu, Chang

    2014-10-01

    At present PASI system of scoring is used for evaluating erythema severity, which can help doctors to diagnose psoriasis [1-3]. The system relies on the subjective judge of doctors, where the accuracy and stability cannot be guaranteed [4]. This paper proposes a stable and precise algorithm for erythema severity estimation. Our contributions are twofold. On one hand, in order to extract the multi-scale redness of erythema, we design the hierarchical feature. Different from traditional methods, we not only utilize the color statistical features, but also divide the detect window into small window and extract hierarchical features. Further, a feature re-ranking step is introduced, which can guarantee that extracted features are irrelevant to each other. On the other hand, an adaptive boosting classifier is applied for further feature selection. During the step of training, the classifier will seek out the most valuable feature for evaluating erythema severity, due to its strong learning ability. Experimental results demonstrate the high precision and robustness of our algorithm. The accuracy is 80.1% on the dataset which comprise 116 patients' images with various kinds of erythema. Now our system has been applied for erythema medical efficacy evaluation in Union Hosp, China.

  9. A Hybrid Feature Selection Approach for Arabic Documents Classification

    NARCIS (Netherlands)

    Habib, Mena Badieh; Sarhan, Ahmed A. E.; Salem, Abdel-Badeeh M.; Fayed, Zaki T.; Gharib, Tarek F.

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge number of features. Feature selection tries to

  10. Simultaneous feature selection and classification via Minimax Probability Machine

    Directory of Open Access Journals (Sweden)

    Liming Yang

    2010-12-01

    Full Text Available This paper presents a novel method for simultaneous feature selection and classification by incorporating a robust L1-norm into the objective function of Minimax Probability Machine (MPM. A fractional programming framework is derived by using a bound on the misclassification error involving the mean and covariance of the data. Furthermore, the problems are solved by the Quadratic Interpolation method. Experiments show that our methods can select fewer features to improve the generalization compared to MPM, which illustrates the effectiveness of the proposed algorithms.

  11. Two-stage free electron laser research

    Science.gov (United States)

    Segall, S. B.

    1984-10-01

    KMS Fusion, Inc. began studying the feasibility of two-stage free electron lasers for the Office of Naval Research in June, 1980. At that time, the two-stage FEL was only a concept that had been proposed by Luis Elias. The range of parameters over which such a laser could be successfully operated, attainable power output, and constraints on laser operation were not known. The primary reason for supporting this research at that time was that it had the potential for producing short-wavelength radiation using a relatively low voltage electron beam. One advantage of a low-voltage two-stage FEL would be that shielding requirements would be greatly reduced compared with single-stage short-wavelength FEL's. If the electron energy were kept below about 10 MeV, X-rays, generated by electrons striking the beam line wall, would not excite neutron resonance in atomic nuclei. These resonances cause the emission of neutrons with subsequent induced radioactivity. Therefore, above about 10 MeV, a meter or more of concrete shielding is required for the system, whereas below 10 MeV, a few millimeters of lead would be adequate.

  12. Selection of individual features of a speech signal using genetic algorithms

    Directory of Open Access Journals (Sweden)

    Kamil Kamiński

    2016-03-01

    Full Text Available The paper presents an automatic speaker’s recognition system, implemented in the Matlab environment, and demonstrates how to achieve and optimize various elements of the system. The main emphasis was put on features selection of a speech signal using a genetic algorithm which takes into account synergy of features. The results of optimization of selected elements of a classifier have been also shown, including the number of Gaussian distributions used to model each of the voices. In addition, for creating voice models, a universal voice model has been used.[b]Keywords[/b]: biometrics, automatic speaker recognition, genetic algorithms, feature selection

  13. Selecting Optimal Feature Set in High-Dimensional Data by Swarm Search

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2013-01-01

    Full Text Available Selecting the right set of features from data of high dimensionality for inducing an accurate classification model is a tough computational challenge. It is almost a NP-hard problem as the combinations of features escalate exponentially as the number of features increases. Unfortunately in data mining, as well as other engineering applications and bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since it takes seemingly forever to use brute force in exhaustively trying every possible combination of features, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier into its fitness function and plugging in any metaheuristic algorithm to facilitate heuristic search. Simulation experiments are carried out by testing the Swarm Search over some high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experiment results show that Swarm Search is able to attain relatively low error rates in classification without shrinking the size of the feature subset to its minimum.

  14. Assessment of Itakura Distance as a valuable feature for computer-aided classification of sleep stages.

    Science.gov (United States)

    Ebrahimi, F; Mikaili, M; Estrada, E; Nazeran, H

    2007-01-01

    Staging and detection of various states of sleep derived from EEG and other biomedical signals have proven to be very helpful in diagnosis, prognosis and remedy of various sleep related disorders. The time consuming and costly process of visual scoring of sleep stages by a specialist has always motivated researchers to develop an automatic sleep scoring system and the first step toward achieving this task is finding discriminating characteristics (or features) for each stage. A vast variety of these features and methods have been investigated in the sleep literature with different degrees of success. In this study, we investigated the performance of a newly introduced measure: the Itakura Distance (ID), as a similarity measure between EEG and EOG signals. This work demonstrated and further confirmed the outcomes of our previous research that the Itakura Distance serves as a valuable similarity measure to differentiate between different sleep stages.

  15. Manifold Regularized Multi-Task Feature Selection for Multi-Modality Classification in Alzheimer’s Disease

    Science.gov (United States)

    Jie, Biao; Cheng, Bo

    2014-01-01

    Accurate diagnosis of Alzheimer’s disease (AD), as well as its pro-dromal stage (i.e., mild cognitive impairment, MCI), is very important for possible delay and early treatment of the disease. Recently, multi-modality methods have been used for fusing information from multiple different and complementary imaging and non-imaging modalities. Although there are a number of existing multi-modality methods, few of them have addressed the problem of joint identification of disease-related brain regions from multi-modality data for classification. In this paper, we proposed a manifold regularized multi-task learning framework to jointly select features from multi-modality data. Specifically, we formulate the multi-modality classification as a multi-task learning framework, where each task focuses on the classification based on each modality. In order to capture the intrinsic relatedness among multiple tasks (i.e., modalities), we adopted a group sparsity regularizer, which ensures only a small number of features to be selected jointly. In addition, we introduced a new manifold based Laplacian regularization term to preserve the geometric distribution of original data from each task, which can lead to the selection of more discriminative features. Furthermore, we extend our method to the semi-supervised setting, which is very important since the acquisition of a large set of labeled data (i.e., diagnosis of disease) is usually expensive and time-consuming, while the collection of unlabeled data is relatively much easier. To validate our method, we have performed extensive evaluations on the baseline Magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) data of Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Our experimental results demonstrate the effectiveness of the proposed method. PMID:24505676

  16. A Semidefinite Programming Based Search Strategy for Feature Selection with Mutual Information Measure.

    Science.gov (United States)

    Naghibi, Tofigh; Hoffmann, Sarah; Pfister, Beat

    2015-08-01

    Feature subset selection, as a special case of the general subset selection problem, has been the topic of a considerable number of studies due to the growing importance of data-mining applications. In the feature subset selection problem there are two main issues that need to be addressed: (i) Finding an appropriate measure function than can be fairly fast and robustly computed for high-dimensional data. (ii) A search strategy to optimize the measure over the subset space in a reasonable amount of time. In this article mutual information between features and class labels is considered to be the measure function. Two series expansions for mutual information are proposed, and it is shown that most heuristic criteria suggested in the literature are truncated approximations of these expansions. It is well-known that searching the whole subset space is an NP-hard problem. Here, instead of the conventional sequential search algorithms, we suggest a parallel search strategy based on semidefinite programming (SDP) that can search through the subset space in polynomial time. By exploiting the similarities between the proposed algorithm and an instance of the maximum-cut problem in graph theory, the approximation ratio of this algorithm is derived and is compared with the approximation ratio of the backward elimination method. The experiments show that it can be misleading to judge the quality of a measure solely based on the classification accuracy, without taking the effect of the non-optimum search strategy into account.

  17. A Two-Stage Layered Mixture Experiment Design for a Nuclear Waste Glass Application-Part 2

    International Nuclear Information System (INIS)

    Cooley, Scott K.; Piepel, Gregory F.; Gan, Hao; Kot, Wing; Pegg, Ian L.

    2003-01-01

    Part 1 (Cooley and Piepel, 2003a) describes the first stage of a two-stage experimental design to support property-composition modeling for high-level waste (HLW) glass to be produced at the Hanford Site in Washington state. Each stage used a layered design having an outer layer, an inner layer, a center point, and some replicates. However, the design variables and constraints defining the layers of the experimental glass composition region (EGCR) were defined differently for the second stage than for the first. The first-stage initial design involved 15 components, all treated as mixture variables. The second-stage augmentation design involved 19 components, with 14 treated as mixture variables and 5 treated as non-mixture variables. For each second-stage layer, vertices were generated and optimal design software was used to select alternative subsets of vertices for the design and calculate design optimality measures. A model containing 29 partial quadratic mixture terms plus 5 linear terms for the non-mixture variables was the basis for the optimal design calculations. Predicted property values were plotted for the alternative subsets of second-stage vertices and the first-stage design points. Based on the optimality measures and the predicted property distributions, a ''best'' subset of vertices was selected for each layer of the second-stage to augment the first-stage design

  18. Computer ranking of the sequence of appearance of 40 features of the brain and related structures in staged human embryos during the seventh week of development.

    Science.gov (United States)

    O'Rahilly, R; Müller, F; Hutchins, G M; Moore, G W

    1988-08-01

    The sequence of events in the development of the brain in human embryos, already published for stages 8-17, is here continued for stages 18 and 19. With the aid of a computerized bubble-sort algorithm, 58 individual embryos were ranked in ascending order of the features present. The increasing structural complexity provided 40 new features in these two stages. The chief characteristics of stage 18 (approximately 44 postovulatory days) are rapidly growing basal nuclei; appearance of the extraventricular bulge of the cerebellum (flocculus), of the superior cerebellar peduncle, and of follicles in the epiphysis cerebri; and the presence of vomeronasal organ and ganglion, of the bucconasal membrane, and of isolated semicircular ducts. The main features of stage 19 (approximately 48 days) are the cochlear nuclei, the ganglion of the nervus terminalis, nuclei of the prosencephalic septum, the appearance of the subcommissural organ, the presence of villi in the choroid plexuses of the fourth and lateral ventricles, and the stria medullaris thalami.

  19. Feature Selection Using Adaboost for Face Expression Recognition

    National Research Council Canada - National Science Library

    Silapachote, Piyanuch; Karuppiah, Deepak R; Hanson, Allen R

    2005-01-01

    We propose a classification technique for face expression recognition using AdaBoost that learns by selecting the relevant global and local appearance features with the most discriminating information...

  20. A stage is a stage is a stage: a direct comparison of two scoring systems.

    Science.gov (United States)

    Dawson, Theo L

    2003-09-01

    L. Kohlberg (1969) argued that his moral stages captured a developmental sequence specific to the moral domain. To explore that contention, the author compared stage assignments obtained with the Standard Issue Scoring System (A. Colby & L. Kohlberg, 1987a, 1987b) and those obtained with a generalized content-independent stage-scoring system called the Hierarchical Complexity Scoring System (T. L. Dawson, 2002a), on 637 moral judgment interviews (participants' ages ranged from 5 to 86 years). The correlation between stage scores produced with the 2 systems was .88. Although standard issue scoring and hierarchical complexity scoring often awarded different scores up to Kohlberg's Moral Stage 2/3, from his Moral Stage 3 onward, scores awarded with the two systems predominantly agreed. The author explores the implications for developmental research.

  1. A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

    Directory of Open Access Journals (Sweden)

    Yong Wang

    2016-02-01

    Full Text Available Currently, with the rapid increasing of data scales in network traffic classifications, how to select traffic features efficiently is becoming a big challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, the execution time was still unsatisfactory with numeral iterative computations during the processing. To address this issue, an efficient feature selection method for network traffic based on a new parallel computing framework called Spark is proposed in this paper. In our approach, the complete feature set is firstly preprocessed based on Fisher score, and a sequential forward search strategy is employed for subsets. The optimal feature subset is then selected using the continuous iterations of the Spark computing framework. The implementation demonstrates that, on the precondition of keeping the classification accuracy, our method reduces the time cost of modeling and classification, and improves the execution efficiency of feature selection significantly.

  2. Two-stage thermal/nonthermal waste treatment process

    International Nuclear Information System (INIS)

    Rosocha, L.A.; Anderson, G.K.; Coogan, J.J.; Kang, M.; Tennant, R.A.; Wantuck, P.J.

    1993-01-01

    An innovative waste treatment technology is being developed in Los Alamos to address the destruction of hazardous organic wastes. The technology described in this report uses two stages: a packed bed reactor (PBR) in the first stage to volatilize and/or combust liquid organics and a silent discharge plasma (SDP) reactor to remove entrained hazardous compounds in the off-gas to even lower levels. We have constructed pre-pilot-scale PBR-SDP apparatus and tested the two stages separately and in combined modes. These tests are described in the report

  3. Feature selection model based on clustering and ranking in pipeline for microarray data

    Directory of Open Access Journals (Sweden)

    Barnali Sahu

    2017-01-01

    Full Text Available Most of the available feature selection techniques in the literature are classifier bound. It means a group of features tied to the performance of a specific classifier as applied in wrapper and hybrid approach. Our objective in this study is to select a set of generic features not tied to any classifier based on the proposed framework. This framework uses attribute clustering and feature ranking techniques in pipeline in order to remove redundant features. On each uncovered cluster, signal-to-noise ratio, t-statistics and significance analysis of microarray are independently applied to select the top ranked features. Both filter and evolutionary wrapper approaches have been considered for feature selection and the data set with selected features are given to ensemble of predefined statistically different classifiers. The class labels of the test data are determined using majority voting technique. Moreover, with the aforesaid objectives, this paper focuses on obtaining a stable result out of various classification models. Further, a comparative analysis has been performed to study the classification accuracy and computational time of the current approach and evolutionary wrapper techniques. It gives a better insight into the features and further enhancing the classification accuracy with less computational time.

  4. Optimal Feature Space Selection in Detecting Epileptic Seizure based on Recurrent Quantification Analysis and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Saleh LAshkari

    2016-06-01

    Full Text Available Selecting optimal features based on nature of the phenomenon and high discriminant ability is very important in the data classification problems. Since it doesn't require any assumption about stationary condition and size of the signal and the noise in Recurrent Quantification Analysis (RQA, it may be useful for epileptic seizure Detection. In this study, RQA was used to discriminate ictal EEG from the normal EEG where optimal features selected by combination of algorithm genetic and Bayesian Classifier. Recurrence plots of hundred samples in each two categories were obtained with five distance norms in this study: Euclidean, Maximum, Minimum, Normalized and Fixed Norm. In order to choose optimal threshold for each norm, ten threshold of ε was generated and then the best feature space was selected by genetic algorithm in combination with a bayesian classifier. The results shown that proposed method is capable of discriminating the ictal EEG from the normal EEG where for Minimum norm and 0.1˂ε˂1, accuracy was 100%. In addition, the sensitivity of proposed framework to the ε and the distance norm parameters was low. The optimal feature presented in this study is Trans which it was selected in most feature spaces with high accuracy.

  5. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The native nature of high dimension low sample size of gene expression data make the classification task more challenging. Therefore, feature (gene) selection become an apparent need. Selecting a meaningful and relevant genes for classifier not only decrease the computational time and cost, but also improve the classification performance. Among different approaches of feature selection methods, however most of them suffer from several problems such as lack of robustness, validation issues etc. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used leukemia gene expression dataset [1]. The effectiveness of the selected features were evaluated by four different classification methods; support vector machines, k-nearest neighbor, random forest, and linear discriminate analysis. The method evaluate the importance and relevance of each gene cluster by summing the expression level for each gene belongs to this cluster. The gene cluster consider important, if it satisfies conditions depend on thresholds and percentage otherwise eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a), after applying our feature selection methodology we end up with specific 1117 genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent higher positive and lower negative threshold condition, number reduced to 58 genes have be tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows distinct pattern of refines genes discriminating between two classes of leukemia.

  6. Illumination uniformity issue explored via two-stage solar concentrator system based on Fresnel lens and compound flat concentrator

    International Nuclear Information System (INIS)

    Yeh, Naichia

    2016-01-01

    This paper illustrates details about the solar radiation distribution on the target of a two-stage solar concentrator that combines the Fresnel lens (FL) and the compound flat concentrator (CFC). The paper starts with a review of some FL development milestones such as the two-stage systems and the comparisons of flat vs. curved lenses in addition to the most noteworthy FL-based solar energy application, concentration photovoltaic (CPV). Through the review of the FL based CPV and two-stage concentrators, this study leads to the development of an algorithm to explore the spectrum distribution insight on the receiver of a two-stage (FL plus CFC) solar concentration system. It established the potential for using a correctly positioned 2nd stage reflector of right dimension to selectively redirect the desired spectrum on the target area so as to enhance the concentration flux intensity and uniformity at the same time. The study also helped to chart out the approximate locations of certain spectrum segments on the FL's target area, which is useful for exploring the spectrum control mechanism via the Fresnel lenses. - Highlights: • Map out the approximate locations of spectrum segments on FL's focal area. • Use the 2nd stage reflector to selectively reflect the desired spectrum on target. • Explore the spectrum distribution insight on FL solar concentrators' target area.

  7. Enhancing the Performance of LibSVM Classifier by Kernel F-Score Feature Selection

    Science.gov (United States)

    Sarojini, Balakrishnan; Ramaraj, Narayanasamy; Nickolas, Savarimuthu

    Medical Data mining is the search for relationships and patterns within the medical datasets that could provide useful knowledge for effective clinical decisions. The inclusion of irrelevant, redundant and noisy features in the process model results in poor predictive accuracy. Much research work in data mining has gone into improving the predictive accuracy of the classifiers by applying the techniques of feature selection. Feature selection in medical data mining is appreciable as the diagnosis of the disease could be done in this patient-care activity with minimum number of significant features. The objective of this work is to show that selecting the more significant features would improve the performance of the classifier. We empirically evaluate the classification effectiveness of LibSVM classifier on the reduced feature subset of diabetes dataset. The evaluations suggest that the feature subset selected improves the predictive accuracy of the classifier and reduce false negatives and false positives.

  8. A stratified two-stage sampling design for digital soil mapping in a Mediterranean basin

    Science.gov (United States)

    Blaschek, Michael; Duttmann, Rainer

    2015-04-01

    The quality of environmental modelling results often depends on reliable soil information. In order to obtain soil data in an efficient manner, several sampling strategies are at hand depending on the level of prior knowledge and the overall objective of the planned survey. This study focuses on the collection of soil samples considering available continuous secondary information in an undulating, 16 km²-sized river catchment near Ussana in southern Sardinia (Italy). A design-based, stratified, two-stage sampling design has been applied aiming at the spatial prediction of soil property values at individual locations. The stratification based on quantiles from density functions of two land-surface parameters - topographic wetness index and potential incoming solar radiation - derived from a digital elevation model. Combined with four main geological units, the applied procedure led to 30 different classes in the given test site. Up to six polygons of each available class were selected randomly excluding those areas smaller than 1ha to avoid incorrect location of the points in the field. Further exclusion rules were applied before polygon selection masking out roads and buildings using a 20m buffer. The selection procedure was repeated ten times and the set of polygons with the best geographical spread were chosen. Finally, exact point locations were selected randomly from inside the chosen polygon features. A second selection based on the same stratification and following the same methodology (selecting one polygon instead of six) was made in order to create an appropriate validation set. Supplementary samples were obtained during a second survey focusing on polygons that have either not been considered during the first phase at all or were not adequately represented with respect to feature size. In total, both field campaigns produced an interpolation set of 156 samples and a validation set of 41 points. The selection of sample point locations has been done using

  9. An improved strategy for skin lesion detection and classification using uniform segmentation and feature selection based approach.

    Science.gov (United States)

    Nasir, Muhammad; Attique Khan, Muhammad; Sharif, Muhammad; Lali, Ikram Ullah; Saba, Tanzila; Iqbal, Tassawar

    2018-02-21

    Melanoma is the deadliest type of skin cancer with highest mortality rate. However, the annihilation in early stage implies a high survival rate therefore, it demands early diagnosis. The accustomed diagnosis methods are costly and cumbersome due to the involvement of experienced experts as well as the requirements for highly equipped environment. The recent advancements in computerized solutions for these diagnoses are highly promising with improved accuracy and efficiency. In this article, we proposed a method for the classification of melanoma and benign skin lesions. Our approach integrates preprocessing, lesion segmentation, features extraction, features selection, and classification. Preprocessing is executed in the context of hair removal by DullRazor, whereas lesion texture and color information are utilized to enhance the lesion contrast. In lesion segmentation, a hybrid technique has been implemented and results are fused using additive law of probability. Serial based method is applied subsequently that extracts and fuses the traits such as color, texture, and HOG (shape). The fused features are selected afterwards by implementing a novel Boltzman Entropy method. Finally, the selected features are classified by Support Vector Machine. The proposed method is evaluated on publically available data set PH2. Our approach has provided promising results of sensitivity 97.7%, specificity 96.7%, accuracy 97.5%, and F-score 97.5%, which are significantly better than the results of existing methods available on the same data set. The proposed method detects and classifies melanoma significantly good as compared to existing methods. © 2018 Wiley Periodicals, Inc.

  10. Eliminating Survivor Bias in Two-stage Instrumental Variable Estimators.

    Science.gov (United States)

    Vansteelandt, Stijn; Walter, Stefan; Tchetgen Tchetgen, Eric

    2018-07-01

    Mendelian randomization studies commonly focus on elderly populations. This makes the instrumental variables analysis of such studies sensitive to survivor bias, a type of selection bias. A particular concern is that the instrumental variable conditions, even when valid for the source population, may be violated for the selective population of individuals who survive the onset of the study. This is potentially very damaging because Mendelian randomization studies are known to be sensitive to bias due to even minor violations of the instrumental variable conditions. Interestingly, the instrumental variable conditions continue to hold within certain risk sets of individuals who are still alive at a given age when the instrument and unmeasured confounders exert additive effects on the exposure, and moreover, the exposure and unmeasured confounders exert additive effects on the hazard of death. In this article, we will exploit this property to derive a two-stage instrumental variable estimator for the effect of exposure on mortality, which is insulated against the above described selection bias under these additivity assumptions.

  11. New Hybrid Features Selection Method: A Case Study on Websites Phishing

    Directory of Open Access Journals (Sweden)

    Khairan D. Rajab

    2017-01-01

    Full Text Available Phishing is one of the serious web threats that involves mimicking authenticated websites to deceive users in order to obtain their financial information. Phishing has caused financial damage to the different online stakeholders. It is massive in the magnitude of hundreds of millions; hence it is essential to minimize this risk. Classifying websites into “phishy” and legitimate types is a primary task in data mining that security experts and decision makers are hoping to improve particularly with respect to the detection rate and reliability of the results. One way to ensure the reliability of the results and to enhance performance is to identify a set of related features early on so the data dimensionality reduces and irrelevant features are discarded. To increase reliability of preprocessing, this article proposes a new feature selection method that combines the scores of multiple known methods to minimize discrepancies in feature selection results. The proposed method has been applied to the problem of website phishing classification to show its pros and cons in identifying relevant features. Results against a security dataset reveal that the proposed preprocessing method was able to derive new features datasets which when mined generate high competitive classifiers with reference to detection rate when compared to results obtained from other features selection methods.

  12. Electricity market price spike analysis by a hybrid data model and feature selection technique

    International Nuclear Information System (INIS)

    Amjady, Nima; Keynia, Farshid

    2010-01-01

    In a competitive electricity market, energy price forecasting is an important activity for both suppliers and consumers. For this reason, many techniques have been proposed to predict electricity market prices in the recent years. However, electricity price is a complex volatile signal owning many spikes. Most of electricity price forecast techniques focus on the normal price prediction, while price spike forecast is a different and more complex prediction process. Price spike forecasting has two main aspects: prediction of price spike occurrence and value. In this paper, a novel technique for price spike occurrence prediction is presented composed of a new hybrid data model, a novel feature selection technique and an efficient forecast engine. The hybrid data model includes both wavelet and time domain variables as well as calendar indicators, comprising a large candidate input set. The set is refined by the proposed feature selection technique evaluating both relevancy and redundancy of the candidate inputs. The forecast engine is a probabilistic neural network, which are fed by the selected candidate inputs of the feature selection technique and predict price spike occurrence. The efficiency of the whole proposed method for price spike occurrence forecasting is evaluated by means of real data from the Queensland and PJM electricity markets. (author)

  13. Heuristic algorithms for feature selection under Bayesian models with block-diagonal covariance structure.

    Science.gov (United States)

    Foroughi Pour, Ali; Dalton, Lori A

    2018-03-21

    Many bioinformatics studies aim to identify markers, or features, that can be used to discriminate between distinct groups. In problems where strong individual markers are not available, or where interactions between gene products are of primary interest, it may be necessary to consider combinations of features as a marker family. To this end, recent work proposes a hierarchical Bayesian framework for feature selection that places a prior on the set of features we wish to select and on the label-conditioned feature distribution. While an analytical posterior under Gaussian models with block covariance structures is available, the optimal feature selection algorithm for this model remains intractable since it requires evaluating the posterior over the space of all possible covariance block structures and feature-block assignments. To address this computational barrier, in prior work we proposed a simple suboptimal algorithm, 2MNC-Robust, with robust performance across the space of block structures. Here, we present three new heuristic feature selection algorithms. The proposed algorithms outperform 2MNC-Robust and many other popular feature selection algorithms on synthetic data. In addition, enrichment analysis on real breast cancer, colon cancer, and Leukemia data indicates they also output many of the genes and pathways linked to the cancers under study. Bayesian feature selection is a promising framework for small-sample high-dimensional data, in particular biomarker discovery applications. When applied to cancer data these algorithms outputted many genes already shown to be involved in cancer as well as potentially new biomarkers. Furthermore, one of the proposed algorithms, SPM, outputs blocks of heavily correlated genes, particularly useful for studying gene interactions and gene networks.

  14. A Hybrid Feature Subset Selection Algorithm for Analysis of High Correlation Proteomic Data

    Science.gov (United States)

    Kordy, Hussain Montazery; Baygi, Mohammad Hossein Miran; Moradi, Mohammad Hassan

    2012-01-01

    Pathological changes within an organ can be reflected as proteomic patterns in biological fluids such as plasma, serum, and urine. The surface-enhanced laser desorption and ionization time-of-flight mass spectrometry (SELDI-TOF MS) has been used to generate proteomic profiles from biological fluids. Mass spectrometry yields redundant noisy data that the most data points are irrelevant features for differentiating between cancer and normal cases. In this paper, we have proposed a hybrid feature subset selection algorithm based on maximum-discrimination and minimum-correlation coupled with peak scoring criteria. Our algorithm has been applied to two independent SELDI-TOF MS datasets of ovarian cancer obtained from the NCI-FDA clinical proteomics databank. The proposed algorithm has used to extract a set of proteins as potential biomarkers in each dataset. We applied the linear discriminate analysis to identify the important biomarkers. The selected biomarkers have been able to successfully diagnose the ovarian cancer patients from the noncancer control group with an accuracy of 100%, a sensitivity of 100%, and a specificity of 100% in the two datasets. The hybrid algorithm has the advantage that increases reproducibility of selected biomarkers and able to find a small set of proteins with high discrimination power. PMID:23717808

  15. Experimental and numerical studies on two-stage combustion of biomass

    Energy Technology Data Exchange (ETDEWEB)

    Houshfar, Eshan

    2012-07-01

    fate of the main corrosive compounds, in particular chlorine, was determined in an experimental campaign using fuel mixtures. The corrosion risk associated with three fuel mixtures was quite different. Grot (Norwegian term used for tree's tops and branches) was found to be a poor corrosion-reduction additive and could not serve as an alternative fuel for co-firing with straw. Peat was found to reduce the corrosive compounds only at high peat additions (50 wt%). Sewage sludge was the best alternative for corrosion reduction as 10 wt% addition almost eliminated chlorine from the fly ash. Numerical studies were also performed to estimate the emission level in the flue gas using a comprehensive mechanism in a configuration which simulated two-stage combustion of biomass. Furthermore, a reduction of the comprehensive chemical mechanism was performed since the mechanism is still complex and needs very long computational time and powerful hardware resources. The selected detailed mechanism in this study contains 81 species and 703 elementary reactions. Necessity analysis was used to determine which species and reactions that are of less importance for the predictability of the final result and, hence, can be discarded. For validation, numerical results using the derived reduced mechanism were compared with the results obtained with the original detailed mechanism. The reduced mechanism contains 35 species and 198 reactions, corresponding to 72% reduction in the number of reactions and, therefore, improving the computational time considerably. Yet the model based on the reduced mechanism predicts correctly concentrations of NOx and CO that are essentially identical to those of the complete mechanism in the range of reaction conditions of interest. The modeling conditions are selected in a way to mimic values in the different ranges of temperature, excess air ratio and residence time, since these variables are the main affecting parameters on NOx emission. (Author)

  16. Feature Selection for Wheat Yield Prediction

    Science.gov (United States)

    Ruß, Georg; Kruse, Rudolf

    Carrying out effective and sustainable agriculture has become an important issue in recent years. Agricultural production has to keep up with an everincreasing population by taking advantage of a field’s heterogeneity. Nowadays, modern technology such as the global positioning system (GPS) and a multitude of developed sensors enable farmers to better measure their fields’ heterogeneities. For this small-scale, precise treatment the term precision agriculture has been coined. However, the large amounts of data that are (literally) harvested during the growing season have to be analysed. In particular, the farmer is interested in knowing whether a newly developed heterogeneity sensor is potentially advantageous or not. Since the sensor data are readily available, this issue should be seen from an artificial intelligence perspective. There it can be treated as a feature selection problem. The additional task of yield prediction can be treated as a multi-dimensional regression problem. This article aims to present an approach towards solving these two practically important problems using artificial intelligence and data mining ideas and methodologies.

  17. Effect of ammoniacal nitrogen on one-stage and two-stage anaerobic digestion of food waste

    International Nuclear Information System (INIS)

    Ariunbaatar, Javkhlan; Scotto Di Perta, Ester; Panico, Antonio; Frunzo, Luigi; Esposito, Giovanni; Lens, Piet N.L.; Pirozzi, Francesco

    2015-01-01

    Highlights: • Almost 100% of the biomethane potential of food waste was recovered during AD in a two-stage CSTR. • Recirculation of the liquid fraction of the digestate provided the necessary buffer in the AD reactors. • A higher OLR (0.9 gVS/L·d) led to higher accumulation of TAN, which caused more toxicity. • A two-stage reactor is more sensitive to elevated concentrations of ammonia. • The IC 50 of TAN for the AD of food waste amounts to 3.8 g/L. - Abstract: This research compares the operation of one-stage and two-stage anaerobic continuously stirred tank reactor (CSTR) systems fed semi-continuously with food waste. The main purpose was to investigate the effects of ammoniacal nitrogen on the anaerobic digestion process. The two-stage system gave more reliable operation compared to one-stage due to: (i) a better pH self-adjusting capacity; (ii) a higher resistance to organic loading shocks; and (iii) a higher conversion rate of organic substrate to biomethane. Also a small amount of biohydrogen was detected from the first stage of the two-stage reactor making this system attractive for biohythane production. As the digestate contains ammoniacal nitrogen, re-circulating it provided the necessary alkalinity in the systems, thus preventing an eventual failure by volatile fatty acids (VFA) accumulation. However, re-circulation also resulted in an ammonium accumulation, yielding a lower biomethane production. Based on the batch experimental results the 50% inhibitory concentration of total ammoniacal nitrogen on the methanogenic activities was calculated as 3.8 g/L, corresponding to 146 mg/L free ammonia for the inoculum used for this research. The two-stage system was affected by the inhibition more than the one-stage system, as it requires less alkalinity and the physically separated methanogens are more sensitive to inhibitory factors, such as ammonium and propionic acid

  18. Staging of gastric adenocarcinoma using two-phase spiral CT: correlation with pathologic staging

    International Nuclear Information System (INIS)

    Seo, Tae Seok; Lee, Dong Ho; Ko, Young Tae; Lim, Joo Won

    1998-01-01

    To correlate the preoperative staging of gastric adenocarcinoma using two-phase spiral CT with pathologic staging. One hundred and eighty patients with gastric cancers confirmed during surgery underwent two-phase spiral CT, and were evaluated retrospectively. CT scans were obtained in the prone position after ingestion of water. Scans were performed 35 and 80 seconds after the start of infusion of 120mL of non-ionic contrast material with the speed of 3mL/sec. Five mm collimation, 7mm/sec table feed and 5mm reconstruction interval were used. T-and N-stage were determined using spiral CT images, without knowledge of the pathologic results. Pathologic staging was later compared with CT staging. Pathologic T-stage was T1 in 70 cases(38.9%), T2 in 33(18.3%), T3 in 73(40.6%), and T4 in 4(2.2%). Type-I or IIa elevated lesions accouted for 10 of 70 T1 cases(14.3%) and flat or depressed lesions(type IIb, IIc, or III) for 60(85.7%). Pathologic N-stage was NO in 85 cases(47.2%), N1 in 42(23.3%), N2 in 31(17.2%), and N3 in 22(12,2%). The detection rate of early gastric cancer using two-phase spiral CT was 100.0%(10 of 10 cases) among elevated lesions and 78.3%(47 of 60 cases) among flat or depressed lesions. With regard to T-stage, there was good correlation between CT image and pathology in 86 of 180 cases(47.8%). Overstaging occurred in 23.3%(42 of 180 cases) and understaging in 28.9%(52 of 180 cases). With regard to N-stage, good correlation between CT image and pathology was noted in 94 of 180 cases(52.2%). The rate of understaging(31.7%, 57 of 180 cases) was higher than that of overstaging(16.1%, 29 of 180 cases)(p<0.001). The detection rate of early gastric cancer using two-phase spiral CT was 81.4%, and there was no significant difference in detectability between elevated and depressed lesions. Two-phase spiral CT for determing the T-and N-stage of gastric cancer was not effective;it was accurate in abont 50% of cases understaging tended to occur.=20

  19. Frequency analysis of a two-stage planetary gearbox using two different methodologies

    Science.gov (United States)

    Feki, Nabih; Karray, Maha; Khabou, Mohamed Tawfik; Chaari, Fakher; Haddar, Mohamed

    2017-12-01

    This paper is focused on the characterization of the frequency content of vibration signals issued from a two-stage planetary gearbox. To achieve this goal, two different methodologies are adopted: the lumped-parameter modeling approach and the phenomenological modeling approach. The two methodologies aim to describe the complex vibrations generated by a two-stage planetary gearbox. The phenomenological model describes directly the vibrations as measured by a sensor fixed outside the fixed ring gear with respect to an inertial reference frame, while results from a lumped-parameter model are referenced with respect to a rotating frame and then transferred into an inertial reference frame. Two different case studies of the two-stage planetary gear are adopted to describe the vibration and the corresponding spectra using both models. Each case presents a specific geometry and a specific spectral structure.

  20. A DYNAMIC FEATURE SELECTION METHOD FOR DOCUMENT RANKING WITH RELEVANCE FEEDBACK APPROACH

    Directory of Open Access Journals (Sweden)

    K. Latha

    2010-07-01

    Full Text Available Ranking search results is essential for information retrieval and Web search. Search engines need to not only return highly relevant results, but also be fast to satisfy users. As a result, not all available features can be used for ranking, and in fact only a small percentage of these features can be used. Thus, it is crucial to have a feature selection mechanism that can find a subset of features that both meets latency requirements and achieves high relevance. In this paper we describe a 0/1 knapsack procedure for automatically selecting features to use within Generalization model for Document Ranking. We propose an approach for Relevance Feedback using Expectation Maximization method and evaluate the algorithm on the TREC Collection for describing classes of feedback textual information retrieval features. Experimental results, evaluated on standard TREC-9 part of the OHSUMED collections, show that our feature selection algorithm produces models that are either significantly more effective than, or equally effective as, models such as Markov Random Field model, Correlation Co-efficient and Count Difference method

  1. Feature-selective Attention in Frontoparietal Cortex: Multivoxel Codes Adjust to Prioritize Task-relevant Information.

    Science.gov (United States)

    Jackson, Jade; Rich, Anina N; Williams, Mark A; Woolgar, Alexandra

    2017-02-01

    Human cognition is characterized by astounding flexibility, enabling us to select appropriate information according to the objectives of our current task. A circuit of frontal and parietal brain regions, often referred to as the frontoparietal attention network or multiple-demand (MD) regions, are believed to play a fundamental role in this flexibility. There is evidence that these regions dynamically adjust their responses to selectively process information that is currently relevant for behavior, as proposed by the "adaptive coding hypothesis" [Duncan, J. An adaptive coding model of neural function in prefrontal cortex. Nature Reviews Neuroscience, 2, 820-829, 2001]. Could this provide a neural mechanism for feature-selective attention, the process by which we preferentially process one feature of a stimulus over another? We used multivariate pattern analysis of fMRI data during a perceptually challenging categorization task to investigate whether the representation of visual object features in the MD regions flexibly adjusts according to task relevance. Participants were trained to categorize visually similar novel objects along two orthogonal stimulus dimensions (length/orientation) and performed short alternating blocks in which only one of these dimensions was relevant. We found that multivoxel patterns of activation in the MD regions encoded the task-relevant distinctions more strongly than the task-irrelevant distinctions: The MD regions discriminated between stimuli of different lengths when length was relevant and between the same objects according to orientation when orientation was relevant. The data suggest a flexible neural system that adjusts its representation of visual objects to preferentially encode stimulus features that are currently relevant for behavior, providing a neural mechanism for feature-selective attention.

  2. Area Determination of Diabetic Foot Ulcer Images Using a Cascaded Two-Stage SVM-Based Classification.

    Science.gov (United States)

    Wang, Lei; Pedersen, Peder C; Agu, Emmanuel; Strong, Diane M; Tulu, Bengisu

    2017-09-01

    The standard chronic wound assessment method based on visual examination is potentially inaccurate and also represents a significant clinical workload. Hence, computer-based systems providing quantitative wound assessment may be valuable for accurately monitoring wound healing status, with the wound area the best suited for automated analysis. Here, we present a novel approach, using support vector machines (SVM) to determine the wound boundaries on foot ulcer images captured with an image capture box, which provides controlled lighting and range. After superpixel segmentation, a cascaded two-stage classifier operates as follows: in the first stage, a set of k binary SVM classifiers are trained and applied to different subsets of the entire training images dataset, and incorrectly classified instances are collected. In the second stage, another binary SVM classifier is trained on the incorrectly classified set. We extracted various color and texture descriptors from superpixels that are used as input for each stage in the classifier training. Specifically, color and bag-of-word representations of local dense scale invariant feature transformation features are descriptors for ruling out irrelevant regions, and color and wavelet-based features are descriptors for distinguishing healthy tissue from wound regions. Finally, the detected wound boundary is refined by applying the conditional random field method. We have implemented the wound classification on a Nexus 5 smartphone platform, except for training which was done offline. Results are compared with other classifiers and show that our approach provides high global performance rates (average sensitivity = 73.3%, specificity = 94.6%) and is sufficiently efficient for a smartphone-based image analysis.

  3. MATHEMATICAL МODELLING OF SELECTING INFORMATIVE FEATURES FOR ANALYZING THE LIFE CYCLE PROCESSES OF RADIO-ELECTRONIC MEANS

    Directory of Open Access Journals (Sweden)

    Николай Григорьевич Стародубцев

    2017-09-01

    Full Text Available The subject of the study are methods and models for extracting information about the processes of the life cycle of radio electronic means at the design, production and operation stages. The goal is to develop the fundamentals of the theory of holistic monitoring of the life cycle of radio electronic means at the stages of their design, production and operation, in particular the development of information models for monitoring life cycle indicators in the production of radio electronic means. The attainment of this goal is achieved by solving such problems: research and development of a methodology for solving the problems of selecting informative features characterizing the state of the life cycle of radio electronic means; choice of informative features characterizing the state of the life cycle processes of radio electronic means; identification of the state of the life cycle processes of radio electronic means. To solve these problems, general scientific methods were used: the main provisions of functional analysis, nonequilibrium thermodynamics, estimation and prediction of random processes, optimization methods, pattern recognition. The following results are obtained. Methods for solving the problems of selecting informative features for monitoring the life cycle of radioelectronic facilities are developed by classifying the states of radioelectronic means and the processes of LC in the space of characteristics, each of which has a certain significance, which allowed finding a complex criterion and formalizing the selection procedures. When the number of a priori data is insufficient for a correct classification, heuristic methods of selection according to the criteria for using basic prototypes and information priorities are proposed. Conclusions. The solution of the problem of mathematical modeling of the efficiency functions of the processes of the life cycle of radioelectronic facilities and the choice of informative features for

  4. A Two-Stage Queue Model to Optimize Layout of Urban Drainage System considering Extreme Rainstorms

    OpenAIRE

    He, Xinhua; Hu, Wenfa

    2017-01-01

    Extreme rainstorm is a main factor to cause urban floods when urban drainage system cannot discharge stormwater successfully. This paper investigates distribution feature of rainstorms and draining process of urban drainage systems and uses a two-stage single-counter queue method M/M/1→M/D/1 to model urban drainage system. The model emphasizes randomness of extreme rainstorms, fuzziness of draining process, and construction and operation cost of drainage system. Its two objectives are total c...

  5. 5-YEAR SURVIVAL OF PATIENTS WITH STAGE II UTERINE CANCER DEPENDING ON MORPHOLOGIC FEATURES OF TUMOR

    Directory of Open Access Journals (Sweden)

    Ye. A. Mustafina

    2008-01-01

    Full Text Available Retrospective data of treatment results of 109 patients with rarely observed stage II uterine cancer, admitted to N.N. Blokhin Russian Cancer Research Center from 1980 to 2000 is analyzed. Correlation of overall 5-year survival rates of stage IIA and IIB uterine can- cer patients with a number of tumor morphologic features is studied. The influence of some non-elucidated morphologic features of stage IIA and IIB uterine cancer such as the degree of cellular anaplasia, the depth of tumor invasion into the uterine neck, lymho- vascular invasion into the myometrium and uterine neck, microscopic vessels density in the area of the most extensive invasion, the presence of necrotic areas in the tumor tissue on long-term treatment results are analyzed.

  6. Tensor-based Multi-view Feature Selection with Applications to Brain Diseases

    Science.gov (United States)

    Cao, Bokai; He, Lifang; Kong, Xiangnan; Yu, Philip S.; Hao, Zhifeng; Ragin, Ann B.

    2015-01-01

    In the era of big data, we can easily access information from multiple views which may be obtained from different sources or feature subsets. Generally, different views provide complementary information for learning tasks. Thus, multi-view learning can facilitate the learning process and is prevalent in a wide range of application domains. For example, in medical science, measurements from a series of medical examinations are documented for each subject, including clinical, imaging, immunologic, serologic and cognitive measures which are obtained from multiple sources. Specifically, for brain diagnosis, we can have different quantitative analysis which can be seen as different feature subsets of a subject. It is desirable to combine all these features in an effective way for disease diagnosis. However, some measurements from less relevant medical examinations can introduce irrelevant information which can even be exaggerated after view combinations. Feature selection should therefore be incorporated in the process of multi-view learning. In this paper, we explore tensor product to bring different views together in a joint space, and present a dual method of tensor-based multi-view feature selection (dual-Tmfs) based on the idea of support vector machine recursive feature elimination. Experiments conducted on datasets derived from neurological disorder demonstrate the features selected by our proposed method yield better classification performance and are relevant to disease diagnosis. PMID:25937823

  7. Automatic sleep staging using empirical mode decomposition, discrete wavelet transform, time-domain, and nonlinear dynamics features of heart rate variability signals.

    Science.gov (United States)

    Ebrahimi, Farideh; Setarehdan, Seyed-Kamaledin; Ayala-Moyeda, Jose; Nazeran, Homer

    2013-10-01

    The conventional method for sleep staging is to analyze polysomnograms (PSGs) recorded in a sleep lab. The electroencephalogram (EEG) is one of the most important signals in PSGs but recording and analysis of this signal presents a number of technical challenges, especially at home. Instead, electrocardiograms (ECGs) are much easier to record and may offer an attractive alternative for home sleep monitoring. The heart rate variability (HRV) signal proves suitable for automatic sleep staging. Thirty PSGs from the Sleep Heart Health Study (SHHS) database were used. Three feature sets were extracted from 5- and 0.5-min HRV segments: time-domain features, nonlinear-dynamics features and time-frequency features. The latter was achieved by using empirical mode decomposition (EMD) and discrete wavelet transform (DWT) methods. Normalized energies in important frequency bands of HRV signals were computed using time-frequency methods. ANOVA and t-test were used for statistical evaluations. Automatic sleep staging was based on HRV signal features. The ANOVA followed by a post hoc Bonferroni was used for individual feature assessment. Most features were beneficial for sleep staging. A t-test was used to compare the means of extracted features in 5- and 0.5-min HRV segments. The results showed that the extracted features means were statistically similar for a small number of features. A separability measure showed that time-frequency features, especially EMD features, had larger separation than others. There was not a sizable difference in separability of linear features between 5- and 0.5-min HRV segments but separability of nonlinear features, especially EMD features, decreased in 0.5-min HRV segments. HRV signal features were classified by linear discriminant (LD) and quadratic discriminant (QD) methods. Classification results based on features from 5-min segments surpassed those obtained from 0.5-min segments. The best result was obtained from features using 5-min HRV

  8. Evaluation of the effect of one stage versus two stage full mouth disinfection on C-reactive protein and leucocyte count in patients with chronic periodontitis.

    Science.gov (United States)

    Pabolu, Chandra Mohan; Mutthineni, Ramesh Babu; Chintala, Srikanth; Naheeda; Mutthineni, Navya

    2013-07-01

    Conventional non-surgical periodontal therapy is carried out in quadrant basis with 1-2 week interval. This time lag may result in re-infection of instrumented pocket and may impair healing. Therefore, a new approach to full-mouth non-surgical therapy to be completed within two consecutive days with full-mouth disinfection has been suggested. In periodontitis, leukocyte counts and levels of C-reactive protein (CRP) are likely to be slightly elevated, indicating the presence of infection or inflammation. The aim of this study is to compare the efficacy of one stage and two stage non-surgical therapy on clinical parameters along with CRP levels and total white blood cell (TWBC) count. A total of 20 patients were selected and were divided into two groups. Group 1 received one stage full mouth dis-infection and Group 2 received two stages FMD. Plaque index, sulcus bleeding index, probing depth, clinical attachment loss, serum CRP and TWBC count were evaluated for both the groups at baseline and at 1 month post-treatment. The results were analyzed using the Student t-test. Both treatment modalities lead to a significant improvement of the clinical and hematological parameters; however comparison between the two groups showed no significant difference after 1 month. The therapeutic intervention may have a systemic effect on blood count in periodontitis patients. Though one stage FMD had limited benefits over two stages FMD, the therapy can be accomplished in a shorter duration.

  9. Prognostic Value and Reproducibility of Pretreatment CT Texture Features in Stage III Non-Small Cell Lung Cancer

    Energy Technology Data Exchange (ETDEWEB)

    Fried, David V. [Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Graduate School of Biomedical Sciences, The University of Texas Health Science Center at Houston, Houston, Texas (United States); Tucker, Susan L. [Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Zhou, Shouhao [Division of Quantitative Sciences, Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Liao, Zhongxing [Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Mawlawi, Osama [Graduate School of Biomedical Sciences, The University of Texas Health Science Center at Houston, Houston, Texas (United States); Ibbott, Geoffrey [Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Graduate School of Biomedical Sciences, The University of Texas Health Science Center at Houston, Houston, Texas (United States); Court, Laurence E., E-mail: LECourt@mdanderson.org [Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Graduate School of Biomedical Sciences, The University of Texas Health Science Center at Houston, Houston, Texas (United States)

    2014-11-15

    Purpose: To determine whether pretreatment CT texture features can improve patient risk stratification beyond conventional prognostic factors (CPFs) in stage III non-small cell lung cancer (NSCLC). Methods and Materials: We retrospectively reviewed 91 cases with stage III NSCLC treated with definitive chemoradiation therapy. All patients underwent pretreatment diagnostic contrast enhanced computed tomography (CE-CT) followed by 4-dimensional CT (4D-CT) for treatment simulation. We used the average-CT and expiratory (T50-CT) images from the 4D-CT along with the CE-CT for texture extraction. Histogram, gradient, co-occurrence, gray tone difference, and filtration-based techniques were used for texture feature extraction. Penalized Cox regression implementing cross-validation was used for covariate selection and modeling. Models incorporating texture features from the 33 image types and CPFs were compared to those with models incorporating CPFs alone for overall survival (OS), local-regional control (LRC), and freedom from distant metastases (FFDM). Predictive Kaplan-Meier curves were generated using leave-one-out cross-validation. Patients were stratified based on whether their predicted outcome was above or below the median. Reproducibility of texture features was evaluated using test-retest scans from independent patients and quantified using concordance correlation coefficients (CCC). We compared models incorporating the reproducibility seen on test-retest scans to our original models and determined the classification reproducibility. Results: Models incorporating both texture features and CPFs demonstrated a significant improvement in risk stratification compared to models using CPFs alone for OS (P=.046), LRC (P=.01), and FFDM (P=.005). The average CCCs were 0.89, 0.91, and 0.67 for texture features extracted from the average-CT, T50-CT, and CE-CT, respectively. Incorporating reproducibility within our models yielded 80.4% (±3.7% SD), 78.3% (±4.0% SD), and 78

  10. Prognostic Value and Reproducibility of Pretreatment CT Texture Features in Stage III Non-Small Cell Lung Cancer

    International Nuclear Information System (INIS)

    Fried, David V.; Tucker, Susan L.; Zhou, Shouhao; Liao, Zhongxing; Mawlawi, Osama; Ibbott, Geoffrey; Court, Laurence E.

    2014-01-01

    Purpose: To determine whether pretreatment CT texture features can improve patient risk stratification beyond conventional prognostic factors (CPFs) in stage III non-small cell lung cancer (NSCLC). Methods and Materials: We retrospectively reviewed 91 cases with stage III NSCLC treated with definitive chemoradiation therapy. All patients underwent pretreatment diagnostic contrast enhanced computed tomography (CE-CT) followed by 4-dimensional CT (4D-CT) for treatment simulation. We used the average-CT and expiratory (T50-CT) images from the 4D-CT along with the CE-CT for texture extraction. Histogram, gradient, co-occurrence, gray tone difference, and filtration-based techniques were used for texture feature extraction. Penalized Cox regression implementing cross-validation was used for covariate selection and modeling. Models incorporating texture features from the 33 image types and CPFs were compared to those with models incorporating CPFs alone for overall survival (OS), local-regional control (LRC), and freedom from distant metastases (FFDM). Predictive Kaplan-Meier curves were generated using leave-one-out cross-validation. Patients were stratified based on whether their predicted outcome was above or below the median. Reproducibility of texture features was evaluated using test-retest scans from independent patients and quantified using concordance correlation coefficients (CCC). We compared models incorporating the reproducibility seen on test-retest scans to our original models and determined the classification reproducibility. Results: Models incorporating both texture features and CPFs demonstrated a significant improvement in risk stratification compared to models using CPFs alone for OS (P=.046), LRC (P=.01), and FFDM (P=.005). The average CCCs were 0.89, 0.91, and 0.67 for texture features extracted from the average-CT, T50-CT, and CE-CT, respectively. Incorporating reproducibility within our models yielded 80.4% (±3.7% SD), 78.3% (±4.0% SD), and 78

  11. Wide-bandwidth bilateral control using two-stage actuator system

    International Nuclear Information System (INIS)

    Kokuryu, Saori; Izutsu, Masaki; Kamamichi, Norihiro; Ishikawa, Jun

    2015-01-01

    This paper proposes a two-stage actuator system that consists of a coarse actuator driven by a ball screw with an AC motor (the first stage) and a fine actuator driven by a voice coil motor (the second stage). The proposed two-stage actuator system is applied to make a wide-bandwidth bilateral control system without needing expensive high-performance actuators. In the proposed system, the first stage has a wide moving range with a narrow control bandwidth, and the second stage has a narrow moving range with a wide control bandwidth. By consolidating these two inexpensive actuators with different control bandwidths in a complementary manner, a wide bandwidth bilateral control system can be constructed based on a mechanical impedance control. To show the validity of the proposed method, a prototype of the two-stage actuator system has been developed and basic performance was evaluated by experiment. The experimental results showed that a light mechanical impedance with a mass of 10 g and a damping coefficient of 2.5 N/(m/s) that is an important factor to establish good transparency in bilateral control has been successfully achieved and also showed that a better force and position responses between a master and slave is achieved by using the proposed two-stage actuator system compared with a narrow bandwidth case using a single ball screw system. (author)

  12. A New Feature Selection Algorithm Based on the Mean Impact Variance

    Directory of Open Access Journals (Sweden)

    Weidong Cheng

    2014-01-01

    Full Text Available The selection of fewer or more representative features from multidimensional features is important when the artificial neural network (ANN algorithm is used as a classifier. In this paper, a new feature selection method called the mean impact variance (MIVAR method is proposed to determine the feature that is more suitable for classification. Moreover, this method is constructed on the basis of the training process of the ANN algorithm. To verify the effectiveness of the proposed method, the MIVAR value is used to rank the multidimensional features of the bearing fault diagnosis. In detail, (1 70-dimensional all waveform features are extracted from a rolling bearing vibration signal with four different operating states, (2 the corresponding MIVAR values of all 70-dimensional features are calculated to rank all features, (3 14 groups of 10-dimensional features are separately generated according to the ranking results and the principal component analysis (PCA algorithm and a back propagation (BP network is constructed, and (4 the validity of the ranking result is proven by training this BP network with these seven groups of 10-dimensional features and by comparing the corresponding recognition rates. The results prove that the features with larger MIVAR value can lead to higher recognition rates.

  13. Two-stage precipitation of neptunium (IV) oxalate

    International Nuclear Information System (INIS)

    Luerkens, D.W.

    1983-07-01

    Neptunium (IV) oxalate was precipitated using a two-stage precipitation system. A series of precipitation experiments was used to identify the significant process variables affecting precipitate characteristics. Process variables tested were input concentrations, solubility conditions in the first stage precipitator, precipitation temperatures, and residence time in the first stage precipitator. A procedure has been demonstrated that produces neptunium (IV) oxalate particles that filter well and readily calcine to the oxide

  14. One-stage (Warsaw) and two-stage (Oslo) repair of unilateral cleft lip and palate: Craniofacial outcomes.

    Science.gov (United States)

    Fudalej, Piotr Stanislaw; Wegrodzka, Ewa; Semb, Gunvor; Hortis-Dzierzbicka, Maria

    2015-09-01

    The aim of this study was to compare facial development in subjects with complete unilateral cleft lip and palate (CUCLP) treated with two different surgical protocols. Lateral cephalometric radiographs of 61 patients (42 boys, 19 girls; mean age, 10.9 years; SD, 1) treated consecutively in Warsaw with one-stage repair and 61 age-matched and sex-matched patients treated in Oslo with two-stage surgery were selected to evaluate craniofacial morphology. On each radiograph 13 angular and two ratio variables were measured in order to describe hard and soft tissues of the facial region. The analysis showed that differences between the groups were limited to hard tissues – the maxillary prominence in subjects from the Warsaw group was decreased by almost 4° in comparison with the Oslo group (sella-nasion-A-point (SNA) = 75.3° and 79.1°, respectively) and maxillo-mandibular morphology was less favorable in the Warsaw group than the Oslo group (ANB angle = 0.8° and 2.8°, respectively). The soft tissue contour was comparable in both groups. In conclusion, inter-group differences suggest a more favorable outcome in the Oslo group. However, the distinctiveness of facial morphology in background populations (ie, in Poles and Norwegians) could have contributed to the observed results. Copyright © 2015 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.

  15. Feature selection from a facial image for distinction of sasang constitution.

    Science.gov (United States)

    Koo, Imhoi; Kim, Jong Yeol; Kim, Myoung Geun; Kim, Keun Ho

    2009-09-01

    Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is the standardization by adopting western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of the distance, angle and the distance ratios, for which there are 1225, 61 250 and 749 700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine truly meaningful features. We suggest a process for the efficient analysis of facial features including the removal of outliers, control for missing data to guarantee data confidence and calculation of statistical significance by applying ANOVA. We show the statistical properties of selected features according to different constitutions using the nine distances, 10 angles and 10 rates of distance features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown here.

  16. Creating large out-of-plane displacement electrothermal motion stage by incorporating beams with step features

    International Nuclear Information System (INIS)

    Kim, Yong-Sik; Dagalakis, Nicholas G; Gupta, Satyandra K

    2013-01-01

    Realizing out-of-plane actuation in micro-electro-mechanical systems (MEMS) is still a challenging task. In this paper, the design, fabrication methods and experimental results for a MEMS-based out-of-plane motion stage are presented based on bulk micromachining technologies. This stage is electrothermally actuated for out-of-plane motion by incorporating beams with step features. The fabricated motion stage has demonstrated displacements of 85 µm with 0.4 µm (mA) −1 rates and generated up to 11.8 mN forces with stiffness of 138.8 N m −1 . These properties obtained from the presented stage are comparable to those for in-plane motion stages, therefore making this out-of-plane stage useful when used in combination with in-plane motion stages. (paper)

  17. The Research and Application of SURF Algorithm Based on Feature Point Selection Algorithm

    Directory of Open Access Journals (Sweden)

    Zhang Fang Hu

    2014-04-01

    Full Text Available As the pixel information of depth image is derived from the distance information, when implementing SURF algorithm with KINECT sensor for static sign language recognition, there can be some mismatched pairs in palm area. This paper proposes a feature point selection algorithm, by filtering the SURF feature points step by step based on the number of feature points within adaptive radius r and the distance between the two points, it not only greatly improves the recognition rate, but also ensures the robustness under the environmental factors, such as skin color, illumination intensity, complex background, angle and scale changes. The experiment results show that the improved SURF algorithm can effectively improve the recognition rate, has a good robustness.

  18. Effect of ammoniacal nitrogen on one-stage and two-stage anaerobic digestion of food waste

    Energy Technology Data Exchange (ETDEWEB)

    Ariunbaatar, Javkhlan, E-mail: jaka@unicas.it [Department of Civil and Mechanical Engineering, University of Cassino and Southern Lazio, Via Di Biasio 43, 03043 Cassino, FR (Italy); UNESCO-IHE Institute for Water Education, Westvest 7, 2611 AX Delft (Netherlands); Scotto Di Perta, Ester [Department of Civil, Architectural and Environmental Engineering, University of Naples Federico II, Via Claudio 21, 80125 Naples (Italy); Panico, Antonio [Telematic University PEGASO, Piazza Trieste e Trento, 48, 80132 Naples (Italy); Frunzo, Luigi [Department of Mathematics and Applications Renato Caccioppoli, University of Naples Federico II, Via Claudio, 21, 80125 Naples (Italy); Esposito, Giovanni [Department of Civil and Mechanical Engineering, University of Cassino and Southern Lazio, Via Di Biasio 43, 03043 Cassino, FR (Italy); Lens, Piet N.L. [UNESCO-IHE Institute for Water Education, Westvest 7, 2611 AX Delft (Netherlands); Pirozzi, Francesco [Department of Civil, Architectural and Environmental Engineering, University of Naples Federico II, Via Claudio 21, 80125 Naples (Italy)

    2015-04-15

    Highlights: • Almost 100% of the biomethane potential of food waste was recovered during AD in a two-stage CSTR. • Recirculation of the liquid fraction of the digestate provided the necessary buffer in the AD reactors. • A higher OLR (0.9 gVS/L·d) led to higher accumulation of TAN, which caused more toxicity. • A two-stage reactor is more sensitive to elevated concentrations of ammonia. • The IC{sub 50} of TAN for the AD of food waste amounts to 3.8 g/L. - Abstract: This research compares the operation of one-stage and two-stage anaerobic continuously stirred tank reactor (CSTR) systems fed semi-continuously with food waste. The main purpose was to investigate the effects of ammoniacal nitrogen on the anaerobic digestion process. The two-stage system gave more reliable operation compared to one-stage due to: (i) a better pH self-adjusting capacity; (ii) a higher resistance to organic loading shocks; and (iii) a higher conversion rate of organic substrate to biomethane. Also a small amount of biohydrogen was detected from the first stage of the two-stage reactor making this system attractive for biohythane production. As the digestate contains ammoniacal nitrogen, re-circulating it provided the necessary alkalinity in the systems, thus preventing an eventual failure by volatile fatty acids (VFA) accumulation. However, re-circulation also resulted in an ammonium accumulation, yielding a lower biomethane production. Based on the batch experimental results the 50% inhibitory concentration of total ammoniacal nitrogen on the methanogenic activities was calculated as 3.8 g/L, corresponding to 146 mg/L free ammonia for the inoculum used for this research. The two-stage system was affected by the inhibition more than the one-stage system, as it requires less alkalinity and the physically separated methanogens are more sensitive to inhibitory factors, such as ammonium and propionic acid.

  19. On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Asriyanti Indah Pratiwi

    2018-01-01

    Full Text Available Sentiment analysis in a movie review is the needs of today lifestyle. Unfortunately, enormous features make the sentiment of analysis slow and less sensitive. Finding the optimum feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification are proposed. The proposed method reduces more than 90% unnecessary features while the proposed classification scheme achieves 96% accuracy of sentiment classification. From the experimental results, it can be concluded that the combination of proposed feature selection and classification achieves the best performance so far.

  20. Microbial landscape features in patients with generalized periodontitis at pre-clinical and radiological stage of its development

    Directory of Open Access Journals (Sweden)

    Vatamanyuk N.V.

    2014-11-01

    Full Text Available The paper presents the results of a comparative study of microbial landscape features in patients with generalized periodontitis at pre-clinical and radiological stage of its development in 42 patients. The purpose of the study is a comparative study of the composition of microbiocenosis of periodontal tissues in patients with chronic catarrhal gingivitis (CCG and chronic generalized periodontitis (CGP at an early stage of development and development studies of microbiological criteria for early emergence of the destructive process in periodontal structures. We found that the microflora isolated from dento-gingival grooves is of importance in diagnostics to identify the etiology of chronic generalized catarrhal gingivitis (CGCG and chronic generalized periodontitis in the early stages of its development. It was established that the presence of two or more types of fixed parodonto-pathogenic microorganisms in microbial association increases the likelihood of inflammatory and destructive events in periodontal tissues in patients with GCCG and is one of the reasons of is becoming CGP.

  1. Two-stage dental implants inserted in a one-stage procedure : a prospective comparative clinical study

    NARCIS (Netherlands)

    Heijdenrijk, Kees

    2002-01-01

    The results of this study indicate that dental implants designed for a submerged implantation procedure can be used in a single-stage procedure and may be as predictable as one-stage implants. Although one-stage implant systems and two-stage.

  2. Among-year variation in selection during early life stages and the genetic basis of fitness in Arabidopsis thaliana.

    Science.gov (United States)

    Postma, Froukje M; Ågren, Jon

    2018-04-19

    Incomplete information regarding both selection regimes and the genetic basis of fitness limits our understanding of adaptive evolution. Among-year variation in the genetic basis of fitness is rarely quantified, and estimates of selection are typically based on single components of fitness, thus potentially missing conflicting selection acting during other life-history stages. Here, we examined among-year variation in selection on a key life-history trait and the genetic basis of fitness covering the whole life cycle in the annual plant Arabidopsis thaliana. We planted freshly-matured seeds of >200 recombinant inbred lines (RILs) derived from a cross between two locally-adapted populations (Italy and Sweden), and both parental genotypes at the native site of the Swedish population in three consecutive years. We quantified selection against the nonlocal Italian genotype, mapped quantitative trait loci (QTL) for fitness and its components, and quantified selection on timing of germination during different life stages. In all three years, the local Swedish genotype outperformed the non-local Italian genotype. However, both the contribution of early life stages to relative fitness, and the effects of fitness QTL varied among years. Timing of germination was under conflicting selection through seedling establishment vs. adult survival and fecundity, and both the direction and magnitude of net selection varied among years. Our results demonstrate that selection during early life stages and the genetic basis of fitness can vary markedly among years, emphasizing the need for multi-year studies considering the whole life cycle for a full understanding of natural selection and mechanisms maintaining local adaptation. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.

  3. Integrative approaches to the prediction of protein functions based on the feature selection

    Directory of Open Access Journals (Sweden)

    Lee Hyunju

    2009-12-01

    Full Text Available Abstract Background Protein function prediction has been one of the most important issues in functional genomics. With the current availability of various genomic data sets, many researchers have attempted to develop integration models that combine all available genomic data for protein function prediction. These efforts have resulted in the improvement of prediction quality and the extension of prediction coverage. However, it has also been observed that integrating more data sources does not always increase the prediction quality. Therefore, selecting data sources that highly contribute to the protein function prediction has become an important issue. Results We present systematic feature selection methods that assess the contribution of genome-wide data sets to predict protein functions and then investigate the relationship between genomic data sources and protein functions. In this study, we use ten different genomic data sources in Mus musculus, including: protein-domains, protein-protein interactions, gene expressions, phenotype ontology, phylogenetic profiles and disease data sources to predict protein functions that are labelled with Gene Ontology (GO terms. We then apply two approaches to feature selection: exhaustive search feature selection using a kernel based logistic regression (KLR, and a kernel based L1-norm regularized logistic regression (KL1LR. In the first approach, we exhaustively measure the contribution of each data set for each function based on its prediction quality. In the second approach, we use the estimated coefficients of features as measures of contribution of data sources. Our results show that the proposed methods improve the prediction quality compared to the full integration of all data sources and other filter-based feature selection methods. We also show that contributing data sources can differ depending on the protein function. Furthermore, we observe that highly contributing data sets can be similar among

  4. A data-driven multi-model methodology with deep feature selection for short-term wind forecasting

    International Nuclear Information System (INIS)

    Feng, Cong; Cui, Mingjian; Hodge, Bri-Mathias; Zhang, Jie

    2017-01-01

    Highlights: • An ensemble model is developed to produce both deterministic and probabilistic wind forecasts. • A deep feature selection framework is developed to optimally determine the inputs to the forecasting methodology. • The developed ensemble methodology has improved the forecasting accuracy by up to 30%. - Abstract: With the growing wind penetration into the power system worldwide, improving wind power forecasting accuracy is becoming increasingly important to ensure continued economic and reliable power system operations. In this paper, a data-driven multi-model wind forecasting methodology is developed with a two-layer ensemble machine learning technique. The first layer is composed of multiple machine learning models that generate individual forecasts. A deep feature selection framework is developed to determine the most suitable inputs to the first layer machine learning models. Then, a blending algorithm is applied in the second layer to create an ensemble of the forecasts produced by first layer models and generate both deterministic and probabilistic forecasts. This two-layer model seeks to utilize the statistically different characteristics of each machine learning algorithm. A number of machine learning algorithms are selected and compared in both layers. This developed multi-model wind forecasting methodology is compared to several benchmarks. The effectiveness of the proposed methodology is evaluated to provide 1-hour-ahead wind speed forecasting at seven locations of the Surface Radiation network. Numerical results show that comparing to the single-algorithm models, the developed multi-model framework with deep feature selection procedure has improved the forecasting accuracy by up to 30%.

  5. Comparative assessment of single-stage and two-stage anaerobic digestion for the treatment of thin stillage.

    Science.gov (United States)

    Nasr, Noha; Elbeshbishy, Elsayed; Hafez, Hisham; Nakhla, George; El Naggar, M Hesham

    2012-05-01

    A comparative evaluation of single-stage and two-stage anaerobic digestion processes for biomethane and biohydrogen production using thin stillage was performed to assess the impact of separating the acidogenic and methanogenic stages on anaerobic digestion. Thin stillage, the main by-product from ethanol production, was characterized by high total chemical oxygen demand (TCOD) of 122 g/L and total volatile fatty acids (TVFAs) of 12 g/L. A maximum methane yield of 0.33 L CH(4)/gCOD(added) (STP) was achieved in the two-stage process while a single-stage process achieved a maximum yield of only 0.26 L CH(4)/gCOD(added) (STP). The separation of acidification stage increased the TVFAs to TCOD ratio from 10% in the raw thin stillage to 54% due to the conversion of carbohydrates into hydrogen and VFAs. Comparison of the two processes based on energy outcome revealed that an increase of 18.5% in the total energy yield was achieved using two-stage anaerobic digestion. Copyright © 2012 Elsevier Ltd. All rights reserved.

  6. An input feature selection method applied to fuzzy neural networks for signal esitmation

    International Nuclear Information System (INIS)

    Na, Man Gyun; Sim, Young Rok

    2001-01-01

    It is well known that the performance of a fuzzy neural networks strongly depends on the input features selected for its training. In its applications to sensor signal estimation, there are a large number of input variables related with an output. As the number of input variables increases, the training time of fuzzy neural networks required increases exponentially. Thus, it is essential to reduce the number of inputs to a fuzzy neural networks and to select the optimum number of mutually independent inputs that are able to clearly define the input-output mapping. In this work, principal component analysis (PAC), genetic algorithms (GA) and probability theory are combined to select new important input features. A proposed feature selection method is applied to the signal estimation of the steam generator water level, the hot-leg flowrate, the pressurizer water level and the pressurizer pressure sensors in pressurized water reactors and compared with other input feature selection methods

  7. Two-stage neural-network-based technique for Urdu character two-dimensional shape representation, classification, and recognition

    Science.gov (United States)

    Megherbi, Dalila B.; Lodhi, S. M.; Boulenouar, A. J.

    2001-03-01

    This work is in the field of automated document processing. This work addresses the problem of representation and recognition of Urdu characters using Fourier representation and a Neural Network architecture. In particular, we show that a two-stage Neural Network scheme is used here to make classification of 36 Urdu characters into seven sub-classes namely subclasses characterized by seven proposed and defined fuzzy features specifically related to Urdu characters. We show that here Fourier Descriptors and Neural Network provide a remarkably simple way to draw definite conclusions from vague, ambiguous, noisy or imprecise information. In particular, we illustrate the concept of interest regions and describe a framing method that provides a way to make the proposed technique for Urdu characters recognition robust and invariant to scaling and translation. We also show that a given character rotation is dealt with by using the Hotelling transform. This transform is based upon the eigenvalue decomposition of the covariance matrix of an image, providing a method of determining the orientation of the major axis of an object within an image. Finally experimental results are presented to show the power and robustness of the proposed two-stage Neural Network based technique for Urdu character recognition, its fault tolerance, and high recognition accuracy.

  8. Condensate from a two-stage gasifier

    DEFF Research Database (Denmark)

    Bentzen, Jens Dall; Henriksen, Ulrik Birk; Hindsgaul, Claus

    2000-01-01

    Condensate, produced when gas from downdraft biomass gasifier is cooled, contains organic compounds that inhibit nitrifiers. Treatment with activated carbon removes most of the organics and makes the condensate far less inhibitory. The condensate from an optimised two-stage gasifier is so clean...... that the organic compounds and the inhibition effect are very low even before treatment with activated carbon. The moderate inhibition effect relates to a high content of ammonia in the condensate. The nitrifiers become tolerant to the condensate after a few weeks of exposure. The level of organic compounds...... and the level of inhibition are so low that condensate from the optimised two-stage gasifier can be led to the public sewer....

  9. Prostate cancer multi-feature analysis using trans-rectal ultrasound images

    International Nuclear Information System (INIS)

    Mohamed, S S; Salama, M M A; Kamel, M; El-Saadany, E F; Rizkalla, K; Chin, J

    2005-01-01

    This note focuses on extracting and analysing prostate texture features from trans-rectal ultrasound (TRUS) images for tissue characterization. One of the principal contributions of this investigation is the use of the information of the images' frequency domain features and spatial domain features to attain a more accurate diagnosis. Each image is divided into regions of interest (ROIs) by the Gabor multi-resolution analysis, a crucial stage, in which segmentation is achieved according to the frequency response of the image pixels. The pixels with a similar response to the same filter are grouped to form one ROI. Next, from each ROI two different statistical feature sets are constructed; the first set includes four grey level dependence matrix (GLDM) features and the second set consists of five grey level difference vector (GLDV) features. These constructed feature sets are then ranked by the mutual information feature selection (MIFS) algorithm. Here, the features that provide the maximum mutual information of each feature and class (cancerous and non-cancerous) and the minimum mutual information of the selected features are chosen, yeilding a reduced feature subset. The two constructed feature sets, GLDM and GLDV, as well as the reduced feature subset, are examined in terms of three different classifiers: the condensed k-nearest neighbour (CNN), the decision tree (DT) and the support vector machine (SVM). The accuracy classification results range from 87.5% to 93.75%, where the performance of the SVM and that of the DT are significantly better than the performance of the CNN. (note)

  10. Evidence of two-stage melting of Wigner solids

    Science.gov (United States)

    Knighton, Talbot; Wu, Zhe; Huang, Jian; Serafin, Alessandro; Xia, J. S.; Pfeiffer, L. N.; West, K. W.

    2018-02-01

    Ultralow carrier concentrations of two-dimensional holes down to p =1 ×109cm-2 are realized. Remarkable insulating states are found below a critical density of pc=4 ×109cm-2 or rs≈40 . Sensitive dc V-I measurement as a function of temperature and electric field reveals a two-stage phase transition supporting the melting of a Wigner solid as a two-stage first-order transition.

  11. A study of metaheuristic algorithms for high dimensional feature selection on microarray data

    Science.gov (United States)

    Dankolo, Muhammad Nasiru; Radzi, Nor Haizan Mohamed; Sallehuddin, Roselina; Mustaffa, Noorfa Haszlinna

    2017-11-01

    Microarray systems enable experts to examine gene profile at molecular level using machine learning algorithms. It increases the potentials of classification and diagnosis of many diseases at gene expression level. Though, numerous difficulties may affect the efficiency of machine learning algorithms which includes vast number of genes features comprised in the original data. Many of these features may be unrelated to the intended analysis. Therefore, feature selection is necessary to be performed in the data pre-processing. Many feature selection algorithms are developed and applied on microarray which including the metaheuristic optimization algorithms. This paper discusses the application of the metaheuristics algorithms for feature selection in microarray dataset. This study reveals that, the algorithms have yield an interesting result with limited resources thereby saving computational expenses of machine learning algorithms.

  12. Finger vein recognition with personalized feature selection.

    Science.gov (United States)

    Xi, Xiaoming; Yang, Gongping; Yin, Yilong; Meng, Xianjing

    2013-08-22

    Finger veins are a promising biometric pattern for personalized identification in terms of their advantages over existing biometrics. Based on the spatial pyramid representation and the combination of more effective information such as gray, texture and shape, this paper proposes a simple but powerful feature, called Pyramid Histograms of Gray, Texture and Orientation Gradients (PHGTOG). For a finger vein image, PHGTOG can reflect the global spatial layout and local details of gray, texture and shape. To further improve the recognition performance and reduce the computational complexity, we select a personalized subset of features from PHGTOG for each subject by using the sparse weight vector, which is trained by using LASSO and called PFS-PHGTOG. We conduct extensive experiments to demonstrate the promise of the PHGTOG and PFS-PHGTOG, experimental results on our databases show that PHGTOG outperforms the other existing features. Moreover, PFS-PHGTOG can further boost the performance in comparison with PHGTOG.

  13. Finger Vein Recognition with Personalized Feature Selection

    Directory of Open Access Journals (Sweden)

    Xianjing Meng

    2013-08-01

    Full Text Available Finger veins are a promising biometric pattern for personalized identification in terms of their advantages over existing biometrics. Based on the spatial pyramid representation and the combination of more effective information such as gray, texture and shape, this paper proposes a simple but powerful feature, called Pyramid Histograms of Gray, Texture and Orientation Gradients (PHGTOG. For a finger vein image, PHGTOG can reflect the global spatial layout and local details of gray, texture and shape. To further improve the recognition performance and reduce the computational complexity, we select a personalized subset of features from PHGTOG for each subject by using the sparse weight vector, which is trained by using LASSO and called PFS-PHGTOG. We conduct extensive experiments to demonstrate the promise of the PHGTOG and PFS-PHGTOG, experimental results on our databases show that PHGTOG outperforms the other existing features. Moreover, PFS-PHGTOG can further boost the performance in comparison with PHGTOG.

  14. Two-stage liquefaction of a Spanish subbituminous coal

    Energy Technology Data Exchange (ETDEWEB)

    Martinez, M.T.; Fernandez, I.; Benito, A.M.; Cebolla, V.; Miranda, J.L.; Oelert, H.H. (Instituto de Carboquimica, Zaragoza (Spain))

    1993-05-01

    A Spanish subbituminous coal has been processed in two-stage liquefaction in a non-integrated process. The first-stage coal liquefaction has been carried out in a continuous pilot plant in Germany at Clausthal Technical University at 400[degree]C, 20 MPa hydrogen pressure and anthracene oil as solvent. The second-stage coal liquefaction has been performed in continuous operation in a hydroprocessing unit at the Instituto de Carboquimica at 450[degree]C and 10 MPa hydrogen pressure, with two commercial catalysts: Harshaw HT-400E (Co-Mo/Al[sub 2]O[sub 3]) and HT-500E (Ni-Mo/Al[sub 2]O[sub 3]). The total conversion for the first-stage coal liquefaction was 75.41 wt% (coal d.a.f.), being 3.79 wt% gases, 2.58 wt% primary condensate and 69.04 wt% heavy liquids. The heteroatoms removal for the second-stage liquefaction was 97-99 wt% of S, 85-87 wt% of N and 93-100 wt% of O. The hydroprocessed liquids have about 70% of compounds with boiling point below 350[degree]C, and meet the sulphur and nitrogen specifications for refinery feedstocks. Liquids from two-stage coal liquefaction have been distilled, and the naphtha, kerosene and diesel fractions obtained have been characterized. 39 refs., 3 figs., 8 tabs.

  15. Feature Selection from a Facial Image for Distinction of Sasang Constitution

    Directory of Open Access Journals (Sweden)

    Imhoi Koo

    2009-01-01

    Full Text Available Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is the standardization by adopting western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of the distance, angle and the distance ratios, for which there are 1225, 61 250 and 749 700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine truly meaningful features. We suggest a process for the efficient analysis of facial features including the removal of outliers, control for missing data to guarantee data confidence and calculation of statistical significance by applying ANOVA. We show the statistical properties of selected features according to different constitutions using the nine distances, 10 angles and 10 rates of distance features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown here.

  16. Feature Selection from a Facial Image for Distinction of Sasang Constitution

    Science.gov (United States)

    Koo, Imhoi; Kim, Jong Yeol; Kim, Myoung Geun

    2009-01-01

    Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is the standardization by adopting western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of the distance, angle and the distance ratios, for which there are 1225, 61 250 and 749 700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine truly meaningful features. We suggest a process for the efficient analysis of facial features including the removal of outliers, control for missing data to guarantee data confidence and calculation of statistical significance by applying ANOVA. We show the statistical properties of selected features according to different constitutions using the nine distances, 10 angles and 10 rates of distance features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown here. PMID:19745013

  17. Feature diagnosticity and task context shape activity in human scene-selective cortex.

    Science.gov (United States)

    Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S

    2016-01-15

    Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Infrared face recognition based on LBP histogram and KW feature selection

    Science.gov (United States)

    Xie, Zhihua

    2014-07-01

    The conventional LBP-based feature as represented by the local binary pattern (LBP) histogram still has room for performance improvements. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract the local robust features in infrared face images, LBP is chosen to get the composition of micro-patterns of sub-blocks. Based on statistical test theory, Kruskal-Wallis (KW) feature selection method is proposed to get the LBP patterns which are suitable for infrared face recognition. The experimental results show combination of LBP and KW features selection improves the performance of infrared face recognition, the proposed method outperforms the traditional methods based on LBP histogram, discrete cosine transform(DCT) or principal component analysis(PCA).

  19. Relevant test set using feature selection algorithm for early detection ...

    African Journals Online (AJOL)

    The objective of feature selection is to find the most relevant features for classification. Thus, the dimensionality of the information will be reduced and may improve classification's accuracy. This paper proposed a minimum set of relevant questions that can be used for early detection of dyslexia. In this research, we ...

  20. A ROC-based feature selection method for computer-aided detection and diagnosis

    Science.gov (United States)

    Wang, Songyuan; Zhang, Guopeng; Liao, Qimei; Zhang, Junying; Jiao, Chun; Lu, Hongbing

    2014-03-01

    Image-based computer-aided detection and diagnosis (CAD) has been a very active research topic aiming to assist physicians to detect lesions and distinguish them from benign to malignant. However, the datasets fed into a classifier usually suffer from small number of samples, as well as significantly less samples available in one class (have a disease) than the other, resulting in the classifier's suboptimal performance. How to identifying the most characterizing features of the observed data for lesion detection is critical to improve the sensitivity and minimize false positives of a CAD system. In this study, we propose a novel feature selection method mR-FAST that combines the minimal-redundancymaximal relevance (mRMR) framework with a selection metric FAST (feature assessment by sliding thresholds) based on the area under a ROC curve (AUC) generated on optimal simple linear discriminants. With three feature datasets extracted from CAD systems for colon polyps and bladder cancer, we show that the space of candidate features selected by mR-FAST is more characterizing for lesion detection with higher AUC, enabling to find a compact subset of superior features at low cost.

  1. Raman spectral feature selection using ant colony optimization for breast cancer diagnosis.

    Science.gov (United States)

    Fallahzadeh, Omid; Dehghani-Bidgoli, Zohreh; Assarian, Mohammad

    2018-06-04

    Pathology as a common diagnostic test of cancer is an invasive, time-consuming, and partially subjective method. Therefore, optical techniques, especially Raman spectroscopy, have attracted the attention of cancer diagnosis researchers. However, as Raman spectra contain numerous peaks involved in molecular bounds of the sample, finding the best features related to cancerous changes can improve the accuracy of diagnosis in this method. The present research attempted to improve the power of Raman-based cancer diagnosis by finding the best Raman features using the ACO algorithm. In the present research, 49 spectra were measured from normal, benign, and cancerous breast tissue samples using a 785-nm micro-Raman system. After preprocessing for removal of noise and background fluorescence, the intensity of 12 important Raman bands of the biological samples was extracted as features of each spectrum. Then, the ACO algorithm was applied to find the optimum features for diagnosis. As the results demonstrated, by selecting five features, the classification accuracy of the normal, benign, and cancerous groups increased by 14% and reached 87.7%. ACO feature selection can improve the diagnostic accuracy of Raman-based diagnostic models. In the present study, features corresponding to ν(C-C) αhelix proline, valine (910-940), νs(C-C) skeletal lipids (1110-1130), and δ(CH2)/δ(CH3) proteins (1445-1460) were selected as the best features in cancer diagnosis.

  2. Numerical simulation of two-dimensional late-stage coarsening for nucleation and growth

    International Nuclear Information System (INIS)

    Akaiwa, N.; Meiron, D.I.

    1995-01-01

    Numerical simulations of two-dimensional late-stage coarsening for nucleation and growth or Ostwald ripening are performed at area fractions 0.05 to 0.4 using the monopole and dipole approximations of a boundary integral formulation for the steady state diffusion equation. The simulations are performed using two different initial spatial distributions. One is a random spatial distribution, and the other is a random spatial distribution with depletion zones around the particles. We characterize the spatial correlations of particles by the radial distribution function, the pair correlation functions, and the structure function. Although the initial spatial correlations are different, we find time-independent scaled correlation functions in the late stage of coarsening. An important feature of the late-stage spatial correlations is that depletion zones exist around particles. A log-log plot of the structure function shows that the slope at small wave numbers is close to 4 and is -3 at very large wave numbers for all area fractions. At large wave numbers we observe oscillations in the structure function. We also confirm the cubic growth law of the average particle radius. The rate constant of the cubic growth law and the particle size distribution functions are also determined. We find qualitatively good agreement between experiments and the present simulations. In addition, the present results agree well with simulation results using the Cahn-Hilliard equation

  3. The effect of destination linked feature selection in real-time network intrusion detection

    CSIR Research Space (South Africa)

    Mzila, P

    2013-07-01

    Full Text Available techniques in the network intrusion detection system (NIDS) is the feature selection technique. The ability of NIDS to accurately identify intrusion from the network traffic relies heavily on feature selection, which describes the pattern of the network...

  4. Compensatory selection for roads over natural linear features by wolves in northern Ontario: Implications for caribou conservation.

    Directory of Open Access Journals (Sweden)

    Erica J Newton

    Full Text Available Woodland caribou (Rangifer tarandus caribou in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. This behavioral mechanism-a compensatory functional response to anthropogenic

  5. One-stage and two-stage penile buccal mucosa urethroplasty

    African Journals Online (AJOL)

    G. Barbagli

    2015-12-02

    Dec 2, 2015 ... there also seems to be a trend of decreasing urethritis and an increase of instrumentation and catheter related strictures in these countries as well [4–6]. The repair of penile urethral strictures may require one- or two- stage urethroplasty [7–10]. Certainly, sexual function can be placed at risk by any surgery ...

  6. Urinary bladder cancer T-staging from T2-weighted MR images using an optimal biomarker approach

    Science.gov (United States)

    Wang, Chuang; Udupa, Jayaram K.; Tong, Yubing; Chen, Jerry; Venigalla, Sriram; Odhner, Dewey; Guzzo, Thomas J.; Christodouleas, John; Torigian, Drew A.

    2018-02-01

    Magnetic resonance imaging (MRI) is often used in clinical practice to stage patients with bladder cancer to help plan treatment. However, qualitative assessment of MR images is prone to inaccuracies, adversely affecting patient outcomes. In this paper, T2-weighted MR image-based quantitative features were extracted from the bladder wall in 65 patients with bladder cancer to classify them into two primary tumor (T) stage groups: group 1 - T stage T2, with primary tumor locally confined to the bladder, and group 2 - T stage T2, with primary tumor locally extending beyond the bladder. The bladder was divided into 8 sectors in the axial plane, where each sector has a corresponding reference standard T stage that is based on expert radiology qualitative MR image review and histopathologic results. The performance of the classification for correct assignment of T stage grouping was then evaluated at both the patient level and the sector level. Each bladder sector was divided into 3 shells (inner, middle, and outer), and 15,834 features including intensity features and texture features from local binary pattern and gray-level co-occurrence matrix were extracted from the 3 shells of each sector. An optimal feature set was selected from all features using an optimal biomarker approach. Nine optimal biomarker features were derived based on texture properties from the middle shell, with an area under the ROC curve of AUC value at the sector and patient level of 0.813 and 0.806, respectively.

  7. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System.

    Science.gov (United States)

    Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir

    2015-01-01

    The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.

  8. Multi-level gene/MiRNA feature selection using deep belief nets and active learning.

    Science.gov (United States)

    Ibrahim, Rania; Yousri, Noha A; Ismail, Mohamed A; El-Makky, Nagwa M

    2014-01-01

    Selecting the most discriminative genes/miRNAs has been raised as an important task in bioinformatics to enhance disease classifiers and to mitigate the dimensionality curse problem. Original feature selection methods choose genes/miRNAs based on their individual features regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability in representing the data in multiple levels of abstraction, allowing for better discrimination between different classes. However, the idea of using deep learning for feature selection is not widely used in the bioinformatics field yet. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension to use the technique for miRNAs is presented by considering the biological relation between miRNAs and genes. Experimental results show that the approach was able to outperform classical feature selection methods in hepatocellular carcinoma (HCC) by 9%, lung cancer by 6% and breast cancer by around 10% in F1-measure. Results also show the enhancement in F1-measure of our approach over recently related work in [1] and [2].

  9. Two-Stage Variable Sample-Rate Conversion System

    Science.gov (United States)

    Tkacenko, Andre

    2009-01-01

    A two-stage variable sample-rate conversion (SRC) system has been pro posed as part of a digital signal-processing system in a digital com munication radio receiver that utilizes a variety of data rates. The proposed system would be used as an interface between (1) an analog- todigital converter used in the front end of the receiver to sample an intermediatefrequency signal at a fixed input rate and (2) digita lly implemented tracking loops in subsequent stages that operate at v arious sample rates that are generally lower than the input sample r ate. This Two-Stage System would be capable of converting from an input sample rate to a desired lower output sample rate that could be var iable and not necessarily a rational fraction of the input rate.

  10. Automatic Feature Selection and Weighting for the Formation of Homogeneous Groups for Regional Intensity-Duration-Frequency (IDF) Curve Estimation

    Science.gov (United States)

    Yang, Z.; Burn, D. H.

    2017-12-01

    Extreme rainfall events can have devastating impacts on society. To quantify the associated risk, the IDF curve has been used to provide the essential rainfall-related information for urban planning. However, the recent changes in the rainfall climatology caused by climate change and urbanization have made the estimates provided by the traditional regional IDF approach increasingly inaccurate. This inaccuracy is mainly caused by two problems: 1) The ineffective choice of similarity indicators for the formation of a homogeneous group at different regions; and 2) An inadequate number of stations in the pooling group that does not adequately reflect the optimal balance between group size and group homogeneity or achieve the lowest uncertainty in the rainfall quantiles estimates. For the first issue, to consider the temporal difference among different meteorological and topographic indicators, a three-layer design is proposed based on three stages in the extreme rainfall formation: cloud formation, rainfall generation and change of rainfall intensity above urban surface. During the process, the impacts from climate change and urbanization are considered through the inclusion of potential relevant features at each layer. Then to consider spatial difference of similarity indicators for the homogeneous group formation at various regions, an automatic feature selection and weighting algorithm, specifically the hybrid searching algorithm of Tabu search, Lagrange Multiplier and Fuzzy C-means Clustering, is used to select the optimal combination of features for the potential optimal homogenous groups formation at a specific region. For the second issue, to compare the uncertainty of rainfall quantile estimates among potential groups, the two sample Kolmogorov-Smirnov test-based sample ranking process is used. During the process, linear programming is used to rank these groups based on the confidence intervals of the quantile estimates. The proposed methodology fills the gap

  11. A new and fast image feature selection method for developing an optimal mammographic mass detection scheme.

    Science.gov (United States)

    Tan, Maxine; Pu, Jiantao; Zheng, Bin

    2014-08-01

    Selecting optimal features from a large image feature pool remains a major challenge in developing computer-aided detection (CAD) schemes of medical images. The objective of this study is to investigate a new approach to significantly improve efficacy of image feature selection and classifier optimization in developing a CAD scheme of mammographic masses. An image dataset including 1600 regions of interest (ROIs) in which 800 are positive (depicting malignant masses) and 800 are negative (depicting CAD-generated false positive regions) was used in this study. After segmentation of each suspicious lesion by a multilayer topographic region growth algorithm, 271 features were computed in different feature categories including shape, texture, contrast, isodensity, spiculation, local topological features, as well as the features related to the presence and location of fat and calcifications. Besides computing features from the original images, the authors also computed new texture features from the dilated lesion segments. In order to select optimal features from this initial feature pool and build a highly performing classifier, the authors examined and compared four feature selection methods to optimize an artificial neural network (ANN) based classifier, namely: (1) Phased Searching with NEAT in a Time-Scaled Framework, (2) A sequential floating forward selection (SFFS) method, (3) A genetic algorithm (GA), and (4) A sequential forward selection (SFS) method. Performances of the four approaches were assessed using a tenfold cross validation method. Among these four methods, SFFS has highest efficacy, which takes 3%-5% of computational time as compared to GA approach, and yields the highest performance level with the area under a receiver operating characteristic curve (AUC) = 0.864 ± 0.034. The results also demonstrated that except using GA, including the new texture features computed from the dilated mass segments improved the AUC results of the ANNs optimized

  12. Disentangling phylogenetic constraints from selective forces in the evolution of trematode transmission stages

    NARCIS (Netherlands)

    Koehler, A.V.; Brown, B.; Poulin, R.; Thieltges, D.W.; Fredensborg, B.L.

    2012-01-01

    The transmission stages of parasites are key determinants of parasite fitness, but they also incur huge mortality. Yet the selective forces shaping the sizes of transmission stages remain poorly understood. We ran a comparative analysis of interspecific variation in the size of transmission stages

  13. Sparse Bayesian classification and feature selection for biological expression data with high correlations.

    Directory of Open Access Journals (Sweden)

    Xian Yang

    Full Text Available Classification models built on biological expression data are increasingly used to predict distinct disease subtypes. Selected features that separate sample groups can be the candidates of biomarkers, helping us to discover biological functions/pathways. However, three challenges are associated with building a robust classification and feature selection model: 1 the number of significant biomarkers is much smaller than that of measured features for which the search will be exhaustive; 2 current biological expression data are big in both sample size and feature size which will worsen the scalability of any search algorithms; and 3 expression profiles of certain features are typically highly correlated which may prevent to distinguish the predominant features. Unfortunately, most of the existing algorithms are partially addressing part of these challenges but not as a whole. In this paper, we propose a unified framework to address the above challenges. The classification and feature selection problem is first formulated as a nonconvex optimisation problem. Then the problem is relaxed and solved iteratively by a sequence of convex optimisation procedures which can be distributed computed and therefore allows the efficient implementation on advanced infrastructures. To illustrate the competence of our method over others, we first analyse a randomly generated simulation dataset under various conditions. We then analyse a real gene expression dataset on embryonal tumour. Further downstream analysis, such as functional annotation and pathway analysis, are performed on the selected features which elucidate several biological findings.

  14. Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System

    Directory of Open Access Journals (Sweden)

    Pavol Partila

    2015-01-01

    Full Text Available The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.

  15. Spatial variation in density and size structure indicate habitat selection throughout life stages of two Southwestern Atlantic snappers.

    Science.gov (United States)

    Aschenbrenner, Alexandre; Hackradt, Carlos Werner; Ferreira, Beatrice Padovani

    2016-02-01

    The early life history of Lutjanus alexandrei and Lutjanus jocu in Southwestern Atlantic is still largely unknown. Habitat use of different life stages (i.e. size categories and densities) of the Brazilian snapper (L. alexandrei) and dog snapper (L. jocu) was examined in a tropical portion of NE coast of Brazil. Visual surveys were conducted in different shallow habitats (mangroves and reefs). Both snapper species showed higher densities in early life stages in mangrove habitat, with a clear increase in fish size from mangrove to adjacent reefs. Post-settler individuals were exclusively found in mangroves for both species. Juveniles of L. alexandrei were also registered only in mangroves, while sub-adult individuals were associated with both mangrove and reef habitats. Mature individuals of L. alexandrei were only observed in reef habitats. Juvenile and sub-adult individuals of the dog snapper were both associated with mangrove and reef habitats, with high densities registered in mangroves. Mature individuals of L. jocu were not registered in the study area. This pattern suggests preference for mangrove habitat in early life stages for both species. Ontogenetic movement between habitats was also recorded. This pattern denotes habitat selection across different life cycle of both species. Such information highlights the importance of directing management and conservation efforts to these habitats to secure the continuity of contribution to adult populations. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Don't spin the pen: two alternative methods for second-stage sampling in urban cluster surveys

    Directory of Open Access Journals (Sweden)

    Rose Angela MC

    2007-06-01

    Full Text Available Abstract In two-stage cluster surveys, the traditional method used in second-stage sampling (in which the first household in a cluster is selected is time-consuming and may result in biased estimates of the indicator of interest. Firstly, a random direction from the center of the cluster is selected, usually by spinning a pen. The houses along that direction are then counted out to the boundary of the cluster, and one is then selected at random to be the first household surveyed. This process favors households towards the center of the cluster, but it could easily be improved. During a recent meningitis vaccination coverage survey in Maradi, Niger, we compared this method of first household selection to two alternatives in urban zones: 1 using a superimposed grid on the map of the cluster area and randomly selecting an intersection; and 2 drawing the perimeter of the cluster area using a Global Positioning System (GPS and randomly selecting one point within the perimeter. Although we only compared a limited number of clusters using each method, we found the sampling grid method to be the fastest and easiest for field survey teams, although it does require a map of the area. Selecting a random GPS point was also found to be a good method, once adequate training can be provided. Spinning the pen and counting households to the boundary was the most complicated and time-consuming. The two methods tested here represent simpler, quicker and potentially more robust alternatives to spinning the pen for cluster surveys in urban areas. However, in rural areas, these alternatives would favor initial household selection from lower density (or even potentially empty areas. Bearing in mind these limitations, as well as available resources and feasibility, investigators should choose the most appropriate method for their particular survey context.

  17. An Appraisal Model Based on a Synthetic Feature Selection Approach for Students’ Academic Achievement

    Directory of Open Access Journals (Sweden)

    Ching-Hsue Cheng

    2017-11-01

    Full Text Available Obtaining necessary information (and even extracting hidden messages from existing big data, and then transforming them into knowledge, is an important skill. Data mining technology has received increased attention in various fields in recent years because it can be used to find historical patterns and employ machine learning to aid in decision-making. When we find unexpected rules or patterns from the data, they are likely to be of high value. This paper proposes a synthetic feature selection approach (SFSA, which is combined with a support vector machine (SVM to extract patterns and find the key features that influence students’ academic achievement. For verifying the proposed model, two databases, namely, “Student Profile” and “Tutorship Record”, were collected from an elementary school in Taiwan, and were concatenated into an integrated dataset based on students’ names as a research dataset. The results indicate the following: (1 the accuracy of the proposed feature selection approach is better than that of the Minimum-Redundancy-Maximum-Relevance (mRMR approach; (2 the proposed model is better than the listing methods when the six least influential features have been deleted; and (3 the proposed model can enhance the accuracy and facilitate the interpretation of the pattern from a hybrid-type dataset of students’ academic achievement.

  18. Fleet Planning Decision-Making: Two-Stage Optimization with Slot Purchase

    Directory of Open Access Journals (Sweden)

    Lay Eng Teoh

    2016-01-01

    Full Text Available Essentially, strategic fleet planning is vital for airlines to yield a higher profit margin while providing a desired service frequency to meet stochastic demand. In contrast to most studies that did not consider slot purchase which would affect the service frequency determination of airlines, this paper proposes a novel approach to solve the fleet planning problem subject to various operational constraints. A two-stage fleet planning model is formulated in which the first stage selects the individual operating route that requires slot purchase for network expansions while the second stage, in the form of probabilistic dynamic programming model, determines the quantity and type of aircraft (with the corresponding service frequency to meet the demand profitably. By analyzing an illustrative case study (with 38 international routes, the results show that the incorporation of slot purchase in fleet planning is beneficial to airlines in achieving economic and social sustainability. The developed model is practically viable for airlines not only to provide a better service quality (via a higher service frequency to meet more demand but also to obtain a higher revenue and profit margin, by making an optimal slot purchase and fleet planning decision throughout the long-term planning horizon.

  19. DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

    Science.gov (United States)

    Ma, Xin; Guo, Jing; Sun, Xiao

    2016-01-01

    DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.

  20. Emotional textile image classification based on cross-domain convolutional sparse autoencoders with feature selection

    Science.gov (United States)

    Li, Zuhe; Fan, Yangyu; Liu, Weihua; Yu, Zeqi; Wang, Fengqin

    2017-01-01

    We aim to apply sparse autoencoder-based unsupervised feature learning to emotional semantic analysis for textile images. To tackle the problem of limited training data, we present a cross-domain feature learning scheme for emotional textile image classification using convolutional autoencoders. We further propose a correlation-analysis-based feature selection method for the weights learned by sparse autoencoders to reduce the number of features extracted from large size images. First, we randomly collect image patches on an unlabeled image dataset in the source domain and learn local features with a sparse autoencoder. We then conduct feature selection according to the correlation between different weight vectors corresponding to the autoencoder's hidden units. We finally adopt a convolutional neural network including a pooling layer to obtain global feature activations of textile images in the target domain and send these global feature vectors into logistic regression models for emotional image classification. The cross-domain unsupervised feature learning method achieves 65% to 78% average accuracy in the cross-validation experiments corresponding to eight emotional categories and performs better than conventional methods. Feature selection can reduce the computational cost of global feature extraction by about 50% while improving classification performance.

  1. Genetic algorithm based feature selection combined with dual classification for the automated detection of proliferative diabetic retinopathy.

    Science.gov (United States)

    Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A

    2015-07-01

    Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel map which each hold vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data

    Directory of Open Access Journals (Sweden)

    Harris Lyndsay N

    2006-04-01

    Full Text Available Abstract Background Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results We developed a recursive support vector machine (R-SVM algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination or SVM-RFE, paying special attention to the ability of recovering the true informative genes/biomarkers and the robustness to outliers in the data. Simulation experiments show that a 5 %-~20 % improvement over SVM-RFE can be achieved regard to these properties. The SVM-based methods are also compared with a conventional univariate method and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data and it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in the classification performance, but univariate methods can reveal more of the differentially expressed features especially when there are correlations between the features.

  3. Feature selection and classifier parameters estimation for EEG signals peak detection using particle swarm optimization.

    Science.gov (United States)

    Adam, Asrul; Shapiai, Mohd Ibrahim; Tumari, Mohd Zaidi Mohd; Mohamad, Mohd Saberi; Mubin, Marizan

    2014-01-01

    Electroencephalogram (EEG) signal peak detection is widely used in clinical applications. The peak point can be detected using several approaches, including time, frequency, time-frequency, and nonlinear domains depending on various peak features from several models. However, there is no study that provides the importance of every peak feature in contributing to a good and generalized model. In this study, feature selection and classifier parameters estimation based on particle swarm optimization (PSO) are proposed as a framework for peak detection on EEG signals in time domain analysis. Two versions of PSO are used in the study: (1) standard PSO and (2) random asynchronous particle swarm optimization (RA-PSO). The proposed framework tries to find the best combination of all the available features that offers good peak detection and a high classification rate from the results in the conducted experiments. The evaluation results indicate that the accuracy of the peak detection can be improved up to 99.90% and 98.59% for training and testing, respectively, as compared to the framework without feature selection adaptation. Additionally, the proposed framework based on RA-PSO offers a better and reliable classification rate as compared to standard PSO as it produces low variance model.

  4. A novel relational regularization feature selection method for joint regression and classification in AD diagnosis.

    Science.gov (United States)

    Zhu, Xiaofeng; Suk, Heung-Il; Wang, Li; Lee, Seong-Whan; Shen, Dinggang

    2017-05-01

    In this paper, we focus on joint regression and classification for Alzheimer's disease diagnosis and propose a new feature selection method by embedding the relational information inherent in the observations into a sparse multi-task learning framework. Specifically, the relational information includes three kinds of relationships (such as feature-feature relation, response-response relation, and sample-sample relation), for preserving three kinds of the similarity, such as for the features, the response variables, and the samples, respectively. To conduct feature selection, we first formulate the objective function by imposing these three relational characteristics along with an ℓ 2,1 -norm regularization term, and further propose a computationally efficient algorithm to optimize the proposed objective function. With the dimension-reduced data, we train two support vector regression models to predict the clinical scores of ADAS-Cog and MMSE, respectively, and also a support vector classification model to determine the clinical label. We conducted extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset to validate the effectiveness of the proposed method. Our experimental results showed the efficacy of the proposed method in enhancing the performances of both clinical scores prediction and disease status identification, compared to the state-of-the-art methods. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Effects of Two-stage Heat Treatment on Delayed Coke and Study of Their Surface Texture Characteristics

    Science.gov (United States)

    Im, Ui-Su; Kim, Jiyoung; Lee, Seon Ho; Lee, Byung-Rok; Peck, Dong-Hyun; Jung, Doo-Hwan

    2017-12-01

    In the present study, surface texture features and chemical properties of two types of cokes, made from coal tar by either 1-stage heat treatment or 2-stage heat treatment, were researched. The relationship between surface texture characteristics and the chemical properties was identified through molecular weight distribution, insolubility of coal tar, weight loss with temperature increase, coking yield, and polarized light microscope analysis. Rapidly cleared anisotropy texture in cokes was observed in accordance with the coking temperature rise. Quinoline insolubility and toluene insolubility of coal tar increased with a corresponding increases in coking temperature. In particular, the cokes produced by the 2-stage heat treatment (2S-C) showed surface structure of needle cokes at a temperature approximately 50°C lower than the 1-stage heat treatment (1S-C). Additionally, the coking yield of 2S-C increased by approximately 14% in comparison with 1S-C.

  6. Neural Network-Based Coronary Heart Disease Risk Prediction Using Feature Correlation Analysis

    Directory of Open Access Journals (Sweden)

    Jae Kwon Kim

    2017-01-01

    Full Text Available Background. Of the machine learning techniques used in predicting coronary heart disease (CHD, neural network (NN is popularly used to improve performance accuracy. Objective. Even though NN-based systems provide meaningful results based on clinical experiments, medical experts are not satisfied with their predictive performances because NN is trained in a “black-box” style. Method. We sought to devise an NN-based prediction of CHD risk using feature correlation analysis (NN-FCA using two stages. First, the feature selection stage, which makes features acceding to the importance in predicting CHD risk, is ranked, and second, the feature correlation analysis stage, during which one learns about the existence of correlations between feature relations and the data of each NN predictor output, is determined. Result. Of the 4146 individuals in the Korean dataset evaluated, 3031 had low CHD risk and 1115 had CHD high risk. The area under the receiver operating characteristic (ROC curve of the proposed model (0.749 ± 0.010 was larger than the Framingham risk score (FRS (0.393 ± 0.010. Conclusions. The proposed NN-FCA, which utilizes feature correlation analysis, was found to be better than FRS in terms of CHD risk prediction. Furthermore, the proposed model resulted in a larger ROC curve and more accurate predictions of CHD risk in the Korean population than the FRS.

  7. Rolling Bearing Fault Diagnosis Using Modified Neighborhood Preserving Embedding and Maximal Overlap Discrete Wavelet Packet Transform with Sensitive Features Selection

    Directory of Open Access Journals (Sweden)

    Fei Dong

    2018-01-01

    Full Text Available In order to enhance the performance of bearing fault diagnosis and classification, features extraction and features dimensionality reduction have become more important. The original statistical feature set was calculated from single branch reconstruction vibration signals obtained by using maximal overlap discrete wavelet packet transform (MODWPT. In order to reduce redundancy information of original statistical feature set, features selection by adjusted rand index and sum of within-class mean deviations (FSASD was proposed to select fault sensitive features. Furthermore, a modified features dimensionality reduction method, supervised neighborhood preserving embedding with label information (SNPEL, was proposed to realize low-dimensional representations for high-dimensional feature space. Finally, vibration signals collected from two experimental test rigs were employed to evaluate the performance of the proposed procedure. The results show that the effectiveness, adaptability, and superiority of the proposed procedure can serve as an intelligent bearing fault diagnosis system.

  8. Different cortical mechanisms for spatial vs. feature-based attentional selection in visual working memory

    Directory of Open Access Journals (Sweden)

    Anna Heuer

    2016-08-01

    Full Text Available The limited capacity of visual working memory necessitates attentional mechanisms that selectively update and maintain only the most task-relevant content. Psychophysical experiments have shown that the retroactive selection of memory content can be based on visual properties such as location or shape, but the neural basis for such differential selection is unknown. For example, it is not known if there are different cortical modules specialized for spatial versus feature-based mnemonic attention, in the same way that has been demonstrated for attention to perceptual input. Here, we used transcranial magnetic stimulation (TMS to identify areas in human parietal and occipital cortex involved in the selection of objects from memory based on cues to their location (spatial information or their shape (featural information. We found that TMS over the supramarginal gyrus (SMG selectively facilitated spatial selection, whereas TMS over the lateral occipital cortex selectively enhanced feature-based selection for remembered objects in the contralateral visual field. Thus, different cortical regions are responsible for spatial vs. feature-based selection of working memory representations. Since the same regions are involved in attention to external events, these new findings indicate overlapping mechanisms for attentional control over perceptual input and mnemonic representations.

  9. Selection of morphological features of pollen grains for chosen tree taxa

    Directory of Open Access Journals (Sweden)

    Agnieszka Kubik-Komar

    2018-05-01

    Full Text Available The basis of aerobiological studies is to monitor airborne pollen concentrations and pollen season timing. This task is performed by appropriately trained staff and is difficult and time consuming. The goal of this research is to select morphological characteristics of grains that are the most discriminative for distinguishing between birch, hazel and alder taxa and are easy to determine automatically from microscope images. This selection is based on the split attributes of the J4.8 classification trees built for different subsets of features. Determining the discriminative features by this method, we provide specific rules for distinguishing between individual taxa, at the same time obtaining a high percentage of correct classification. The most discriminative among the 13 morphological characteristics studied are the following: number of pores, maximum axis, minimum axis, axes difference, maximum oncus width, and number of lateral pores. The classification result of the tree based on this subset is better than the one built on the whole feature set and it is almost 94%. Therefore, selection of attributes before tree building is recommended. The classification results for the features easiest to obtain from the image, i.e. maximum axis, minimum axis, axes difference, and number of lateral pores, are only 2.09 pp lower than those obtained for the complete set, but 3.23 pp lower than the results obtained for the selected most discriminating attributes only.

  10. Feature Subset Selection and Instance Filtering for Cross-project Defect Prediction - Classification and Ranking

    Directory of Open Access Journals (Sweden)

    Faimison Porto

    2016-12-01

    Full Text Available The defect prediction models can be a good tool on organizing the project's test resources. The models can be constructed with two main goals: 1 to classify the software parts - defective or not; or 2 to rank the most defective parts in a decreasing order. However, not all companies maintain an appropriate set of historical defect data. In this case, a company can build an appropriate dataset from known external projects - called Cross-project Defect Prediction (CPDP. The CPDP models, however, present low prediction performances due to the heterogeneity of data. Recently, Instance Filtering methods were proposed in order to reduce this heterogeneity by selecting the most similar instances from the training dataset. Originally, the similarity is calculated based on all the available dataset features (or independent variables. We propose that using only the most relevant features on the similarity calculation can result in more accurate filtered datasets and better prediction performances. In this study we extend our previous work. We analyse both prediction goals - Classification and Ranking. We present an empirical evaluation of 41 different methods by associating Instance Filtering methods with Feature Selection methods. We used 36 versions of 11 open source projects on experiments. The results show similar evidences for both prediction goals. First, the defect prediction performance of CPDP models can be improved by associating Feature Selection and Instance Filtering. Second, no evaluated method presented general better performances. Indeed, the most appropriate method can vary according to the characteristics of the project being predicted.

  11. Dream features in the early stages of Parkinson's disease.

    Science.gov (United States)

    Bugalho, Paulo; Paiva, Teresa

    2011-11-01

    Few studies have investigated the relation between dream features and cognition in Parkinson's disease (PD), although vivid dreams, hallucinations and cognitive decline have been proposed as successive steps of a pathological continuum. Our objectives were therefore to characterize the dreams of early stage PD and to study the relation between dream characteristics, cognitive function, motor status, depression, dopaminergic treatment, and the presence of REM sleep behaviour disorder (RBD) and hallucinations. Dreams of 19 male PD patients and 21 matched control subjects were classified according to Hall and van de Castle system. h statistics was used to compare the dream content between patients and controls. We tested the relation between patients' dreams characteristics and cognitive function (Frontal assessment battery (FAB) and Mini-Mental State Examination tests) depression (Beck depression inventory), motor function (UPDRS), dopaminergic treatment, the presence of RBD (according to clinical criteria) and hallucinations, using general linear model statistics. Patients and controls differed only on FAB scores. Relevant differences in the Hall and van de Castle scale were found between patient's dreams and those of the control group, regarding animals, aggression/friendliness, physical aggression, befriender (higher in the patient group) and aggressor and bodily misfortunes (lower in the patient group) features. Cognitive and particularly frontal dysfunction had a significant influence on the frequency of physical aggression and animal related features, while dopaminergic doses, depressive symptoms, hallucinations and RBD did not. We found a pattern of dream alteration characterized by heightened aggressiveness and the presence of animals. These were related to more severe frontal dysfunction, which could be the origin of such changes.

  12. [Comparison research on two-stage sequencing batch MBR and one-stage MBR].

    Science.gov (United States)

    Yuan, Xin-Yan; Shen, Heng-Gen; Sun, Lei; Wang, Lin; Li, Shi-Feng

    2011-01-01

    Aiming at resolving problems in MBR operation, like low nitrogen and phosphorous removal efficiency, severe membrane fouling and etc, comparison research on two-stage sequencing batch MBR (TSBMBR) and one-stage aerobic MBR has been done in this paper. The results indicated that TSBMBR owned advantages of SBR in removing nitrogen and phosphorous, which could make up the deficiency of traditional one-stage aerobic MBR in nitrogen and phosphorous removal. During steady operation period, effluent average NH4(+) -N, TN and TP concentration is 2.83, 12.20, 0.42 mg/L, which could reach domestic scenic environment use. From membrane fouling control point of view, TSBMBR has lower SMP in supernatant, specific trans-membrane flux deduction rate, membrane fouling resistant than one-stage aerobic MBR. The sedimentation and gel layer resistant of TSBMBR was only 6.5% and 33.12% of one-stage aerobic MBR. Besides high efficiency in removing nitrogen and phosphorous, TSBMBR could effectively reduce sedimentation and gel layer pollution on membrane surface. Comparing with one-stage MBR, TSBMBR could operate with higher trans-membrane flux, lower membrane fouling rate and better pollutants removal effects.

  13. Feature selection for Bayesian network classifiers using the MDL-FS score

    NARCIS (Netherlands)

    Drugan, Madalina M.; Wiering, Marco A.

    When constructing a Bayesian network classifier from data, the more or less redundant features included in a dataset may bias the classifier and as a consequence may result in a relatively poor classification accuracy. In this paper, we study the problem of selecting appropriate subsets of features

  14. Energy demand in Portuguese manufacturing: a two-stage model

    International Nuclear Information System (INIS)

    Borges, A.M.; Pereira, A.M.

    1992-01-01

    We use a two-stage model of factor demand to estimate the parameters determining energy demand in Portuguese manufacturing. In the first stage, a capital-labor-energy-materials framework is used to analyze the substitutability between energy as a whole and other factors of production. In the second stage, total energy demand is decomposed into oil, coal and electricity demands. The two stages are fully integrated since the energy composite used in the first stage and its price are obtained from the second stage energy sub-model. The estimates obtained indicate that energy demand in manufacturing responds significantly to price changes. In addition, estimation results suggest that there are important substitution possibilities among energy forms and between energy and other factors of production. The role of price changes in energy-demand forecasting, as well as in energy policy in general, is clearly established. (author)

  15. DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

    KAUST Repository

    Soufan, Othman

    2015-02-26

    Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. The smaller number of features reduces the problem\\'s dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation examines and evaluates simultaneously large number of candidate collections of features. DWFS also integrates various filteringmethods thatmay be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.

  16. DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

    KAUST Repository

    Soufan, Othman; Kleftogiannis, Dimitrios A.; Kalnis, Panos; Bajic, Vladimir B.

    2015-01-01

    Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. The smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation examines and evaluates simultaneously large number of candidate collections of features. DWFS also integrates various filteringmethods thatmay be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.

  17. Feature Selection and Predictors of Falls with Foot Force Sensors Using KNN-Based Algorithms

    Directory of Open Access Journals (Sweden)

    Shengyun Liang

    2015-11-01

    Full Text Available The aging process may lead to the degradation of lower extremity function in the elderly population, which can restrict their daily quality of life and gradually increase the fall risk. We aimed to determine whether objective measures of physical function could predict subsequent falls. Ground reaction force (GRF data, which was quantified by sample entropy, was collected by foot force sensors. Thirty eight subjects (23 fallers and 15 non-fallers participated in functional movement tests, including walking and sit-to-stand (STS. A feature selection algorithm was used to select relevant features to classify the elderly into two groups: at risk and not at risk of falling down, for three KNN-based classifiers: local mean-based k-nearest neighbor (LMKNN, pseudo nearest neighbor (PNN, local mean pseudo nearest neighbor (LMPNN classification. We compared classification performances, and achieved the best results with LMPNN, with sensitivity, specificity and accuracy all 100%. Moreover, a subset of GRFs was significantly different between the two groups via Wilcoxon rank sum test, which is compatible with the classification results. This method could potentially be used by non-experts to monitor balance and the risk of falling down in the elderly population.

  18. Sleep-stage transitions during polysomnographic recordings as diagnostic features of type 1 narcolepsy

    DEFF Research Database (Denmark)

    Christensen, Julie Anja Engelhard; Carrillo, Oscar; Leary, Eileen B.

    2015-01-01

    Objective: Type 1 narcolepsy/hypocretin deficiency is characterized by excessive daytime sleepiness, sleep fragmentation, and cataplexy. Short rapid eye movement (REM) latency (≤15 min) during nocturnal polysomnography (PSG) or during naps of the multiple sleep latency test (MSLT) defines a sleep......-onset REM sleep period (SOREMP), a diagnostic hallmark. We hypothesized that abnormal sleep transitions other than SOREMPs can be identified in type 1 narcolepsy. Methods: Sleep-stage transitions (one to 10 epochs to one to five epochs of any other stage) and bout length features (one to 10 epochs) were...... of 19 cases and 708 sleep-clinic patients was used for the validation. Results: (1) ≥5 transitions from ≥5 epochs of stage N1 or W to ≥2 epochs of REM sleep, (2) ≥22 transitions from ≥3 epochs of stage N2 or N3 to ≥2 epochs of N1 or W, and (3) ≥16 bouts of ≥6 epochs of N1 or W were found to be highly...

  19. The influence of magnetic field strength in ionization stage on ion transport between two stages of a double stage Hall thruster

    International Nuclear Information System (INIS)

    Yu Daren; Song Maojiang; Li Hong; Liu Hui; Han Ke

    2012-01-01

    It is futile for a double stage Hall thruster to design a special ionization stage if the ionized ions cannot enter the acceleration stage. Based on this viewpoint, the ion transport under different magnetic field strengths in the ionization stage is investigated, and the physical mechanisms affecting the ion transport are analyzed in this paper. With a combined experimental and particle-in-cell simulation study, it is found that the ion transport between two stages is chiefly affected by the potential well, the potential barrier, and the potential drop at the bottom of potential well. With the increase of magnetic field strength in the ionization stage, there is larger plasma density caused by larger potential well. Furthermore, the potential barrier near the intermediate electrode declines first and then rises up while the potential drop at the bottom of potential well rises up first and then declines as the magnetic field strength increases in the ionization stage. Consequently, both the ion current entering the acceleration stage and the total ion current ejected from the thruster rise up first and then decline as the magnetic field strength increases in the ionization stage. Therefore, there is an optimal magnetic field strength in the ionization stage to guide the ion transport between two stages.

  20. Survival Prediction and Feature Selection in Patients with Breast Cancer Using Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shahrbanoo Goli

    2016-01-01

    Full Text Available The Support Vector Regression (SVR model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed and SVR with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: combination of SVR and statistical tests, univariate feature selection based on concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the best features associated to survival. Also, according to the obtained results, performance of linear and nonlinear kernels is comparable. The proposed SVR model performs similar to or slightly better than other models. Also, SVR performs similar to or better than Cox when all features are included in model.

  1. Computer aided diagnosis system for Alzheimer disease using brain diffusion tensor imaging features selected by Pearson's correlation.

    Science.gov (United States)

    Graña, M; Termenon, M; Savio, A; Gonzalez-Pinto, A; Echeveste, J; Pérez, J M; Besga, A

    2011-09-20

    The aim of this paper is to obtain discriminant features from two scalar measures of Diffusion Tensor Imaging (DTI) data, Fractional Anisotropy (FA) and Mean Diffusivity (MD), and to train and test classifiers able to discriminate Alzheimer's Disease (AD) patients from controls on the basis of features extracted from the FA or MD volumes. In this study, support vector machine (SVM) classifier was trained and tested on FA and MD data. Feature selection is done computing the Pearson's correlation between FA or MD values at voxel site across subjects and the indicative variable specifying the subject class. Voxel sites with high absolute correlation are selected for feature extraction. Results are obtained over an on-going study in Hospital de Santiago Apostol collecting anatomical T1-weighted MRI volumes and DTI data from healthy control subjects and AD patients. FA features and a linear SVM classifier achieve perfect accuracy, sensitivity and specificity in several cross-validation studies, supporting the usefulness of DTI-derived features as an image-marker for AD and to the feasibility of building Computer Aided Diagnosis systems for AD based on them. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  2. Optimum location of external markers using feature selection algorithms for real-time tumor tracking in external-beam radiotherapy: a virtual phantom study.

    Science.gov (United States)

    Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin

    2016-01-08

    In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation-based feature selection algorithm, in

  3. Feature Selection Methods for Robust Decoding of Finger Movements in a Non-human Primate

    Science.gov (United States)

    Padmanaban, Subash; Baker, Justin; Greger, Bradley

    2018-01-01

    Objective: The performance of machine learning algorithms used for neural decoding of dexterous tasks may be impeded due to problems arising when dealing with high-dimensional data. The objective of feature selection algorithms is to choose a near-optimal subset of features from the original feature space to improve the performance of the decoding algorithm. The aim of our study was to compare the effects of four feature selection techniques, Wilcoxon signed-rank test, Relative Importance, Principal Component Analysis (PCA), and Mutual Information Maximization on SVM classification performance for a dexterous decoding task. Approach: A nonhuman primate (NHP) was trained to perform small coordinated movements—similar to typing. An array of microelectrodes was implanted in the hand area of the motor cortex of the NHP and used to record action potentials (AP) during finger movements. A Support Vector Machine (SVM) was used to classify which finger movement the NHP was making based upon AP firing rates. We used the SVM classification to examine the functional parameters of (i) robustness to simulated failure and (ii) longevity of classification. We also compared the effect of using isolated-neuron and multi-unit firing rates as the feature vector supplied to the SVM. Main results: The average decoding accuracy for multi-unit features and single-unit features using Mutual Information Maximization (MIM) across 47 sessions was 96.74 ± 3.5% and 97.65 ± 3.36% respectively. The reduction in decoding accuracy between using 100% of the features and 10% of features based on MIM was 45.56% (from 93.7 to 51.09%) and 4.75% (from 95.32 to 90.79%) for multi-unit and single-unit features respectively. MIM had best performance compared to other feature selection methods. Significance: These results suggest improved decoding performance can be achieved by using optimally selected features. The results based on clinically relevant performance metrics also suggest that the decoding

  4. EEG-based recognition of video-induced emotions: selecting subject-independent feature set.

    Science.gov (United States)

    Kortelainen, Jukka; Seppänen, Tapio

    2013-01-01

    Emotions are fundamental for everyday life affecting our communication, learning, perception, and decision making. Including emotions into the human-computer interaction (HCI) could be seen as a significant step forward offering a great potential for developing advanced future technologies. While the electrical activity of the brain is affected by emotions, offers electroencephalogram (EEG) an interesting channel to improve the HCI. In this paper, the selection of subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In future, further analysis of the video-induced EEG changes including the topographical differences in the spectral features is needed.

  5. Fast Branch & Bound algorithms for optimal feature selection

    Czech Academy of Sciences Publication Activity Database

    Somol, Petr; Pudil, Pavel; Kittler, J.

    2004-01-01

    Roč. 26, č. 7 (2004), s. 900-912 ISSN 0162-8828 R&D Projects: GA ČR GA402/02/1271; GA ČR GA402/03/1310; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : subset search * feature selection * search tree Subject RIV: BD - Theory of Information Impact factor: 4.352, year: 2004

  6. Genetic Fuzzy System (GFS based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

    Directory of Open Access Journals (Sweden)

    Meenakshi M. Pawar

    2016-09-01

    Full Text Available Breast cancer is significant health problem diagnosed mostly in women worldwide. Therefore, early detection of breast cancer is performed with the help of digital mammography, which can reduce mortality rate. This paper presents wrapper based feature selection approach for wavelet co-occurrence feature (WCF using Genetic Fuzzy System (GFS in mammogram classification problem. The performance of GFS algorithm is explained using mini-MIAS database. WCF features are obtained from detail wavelet coefficients at each level of decomposition of mammogram image. At first level of decomposition, 18 features are applied to GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at second level it selects 9 features from 36 features and the classification success rate is improved to 56.75%. For third level, 16 features are selected from 54 features and average success rate is improved to 64.98%. Lastly, at fourth level 72 features are applied to GFS, which selects 16 features and thereby increasing average success rate to 89.47%. Hence, GFS algorithm is the effective way of obtaining optimal set of feature in breast cancer diagnosis.

  7. Two-stage Catalytic Reduction of NOx with Hydrocarbons

    Energy Technology Data Exchange (ETDEWEB)

    Umit S. Ozkan; Erik M. Holmgreen; Matthew M. Yung; Jonathan Halter; Joel Hiltner

    2005-12-21

    A two-stage system for the catalytic reduction of NO from lean-burn natural gas reciprocating engine exhaust is investigated. Each of the two stages uses a distinct catalyst. The first stage is oxidation of NO to NO{sub 2} and the second stage is reduction of NO{sub 2} to N{sub 2} with a hydrocarbon. The central idea is that since NO{sub 2} is a more easily reduced species than NO, it should be better able to compete with oxygen for the combustion reaction of hydrocarbon, which is a challenge in lean conditions. Early work focused on demonstrating that the N{sub 2} yield obtained when NO{sub 2} was reduced was greater than when NO was reduced. NO{sub 2} reduction catalysts were designed and silver supported on alumina (Ag/Al{sub 2}O{sub 3}) was found to be quite active, able to achieve 95% N{sub 2} yield in 10% O{sub 2} using propane as the reducing agent. The design of a catalyst for NO oxidation was also investigated, and a Co/TiO{sub 2} catalyst prepared by sol-gel was shown to have high activity for the reaction, able to reach equilibrium conversion of 80% at 300 C at GHSV of 50,000h{sup -1}. After it was shown that NO{sub 2} could be more easily reduced to N{sub 2} than NO, the focus shifted on developing a catalyst that could use methane as the reducing agent. The Ag/Al{sub 2}O{sub 3} catalyst was tested and found to be inactive for NOx reduction with methane. Through iterative catalyst design, a palladium-based catalyst on a sulfated-zirconia support (Pd/SZ) was synthesized and shown to be able to selectively reduce NO{sub 2} in lean conditions using methane. Development of catalysts for the oxidation reaction also continued and higher activity, as well as stability in 10% water, was observed on a Co/ZrO{sub 2} catalyst, which reached equilibrium conversion of 94% at 250 C at the same GHSV. The Co/ZrO{sub 2} catalyst was also found to be extremely active for oxidation of CO, ethane, and propane, which could potential eliminate the need for any separate

  8. A comprehensive study of task coalescing for selecting parallelism granularity in a two-stage bidiagonal reduction

    KAUST Repository

    Haidar, Azzam

    2012-05-01

    We present new high performance numerical kernels combined with advanced optimization techniques that significantly increase the performance of parallel bidiagonal reduction. Our approach is based on developing efficient fine-grained computational tasks as well as reducing overheads associated with their high-level scheduling during the so-called bulge chasing procedure that is an essential phase of a scalable bidiagonalization procedure. In essence, we coalesce multiple tasks in a way that reduces the time needed to switch execution context between the scheduler and useful computational tasks. At the same time, we maintain the crucial information about the tasks and their data dependencies between the coalescing groups. This is the necessary condition to preserve numerical correctness of the computation. We show our annihilation strategy based on multiple applications of single orthogonal reflectors. Despite non-trivial characteristics in computational complexity and memory access patterns, our optimization approach smoothly applies to the annihilation scenario. The coalescing positively influences another equally important aspect of the bulge chasing stage: the memory reuse. For the tasks within the coalescing groups, the data is retained in high levels of the cache hierarchy and, as a consequence, operations that are normally memory-bound increase their ratio of computation to off-chip communication and become compute-bound which renders them amenable to efficient execution on multicore architectures. The performance for the new two-stage bidiagonal reduction is staggering. Our implementation results in up to 50-fold and 12-fold improvement (∼130 Gflop/s) compared to the equivalent routines from LAPACK V3.2 and Intel MKL V10.3, respectively, on an eight socket hexa-core AMD Opteron multicore shared-memory system with a matrix size of 24000 x 24000. Last but not least, we provide a comprehensive study on the impact of the coalescing group size in terms of cache

  9. Optimization of Boiling Water Reactor Loading Pattern Using Two-Stage Genetic Algorithm

    International Nuclear Information System (INIS)

    Kobayashi, Yoko; Aiyoshi, Eitaro

    2002-01-01

    A new two-stage optimization method based on genetic algorithms (GAs) using an if-then heuristic rule was developed to generate optimized boiling water reactor (BWR) loading patterns (LPs). In the first stage, the LP is optimized using an improved GA operator. In the second stage, an exposure-dependent control rod pattern (CRP) is sought using GA with an if-then heuristic rule. The procedure of the improved GA is based on deterministic operators that consist of crossover, mutation, and selection. The handling of the encoding technique and constraint conditions by that GA reflects the peculiar characteristics of the BWR. In addition, strategies such as elitism and self-reproduction are effectively used in order to improve the search speed. The LP evaluations were performed with a three-dimensional diffusion code that coupled neutronic and thermal-hydraulic models. Strong axial heterogeneities and constraints dependent on three dimensions have always necessitated the use of three-dimensional core simulators for BWRs, so that optimization of computational efficiency is required. The proposed algorithm is demonstrated by successfully generating LPs for an actual BWR plant in two phases. One phase is only LP optimization applying the Haling technique. The other phase is an LP optimization that considers the CRP during reactor operation. In test calculations, candidates that shuffled fresh and burned fuel assemblies within a reasonable computation time were obtained

  10. Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets

    NARCIS (Netherlands)

    Aalaei, Shokoufeh; Shahraki, Hadi; Rowhanimanesh, Alireza; Eslami, Saeid

    2016-01-01

    This study addresses feature selection for breast cancer diagnosis. The present process uses a wrapper approach using GA-based on feature selection and PS-classifier. The results of experiment show that the proposed model is comparable to the other models on Wisconsin breast cancer datasets. To

  11. A Local Asynchronous Distributed Privacy Preserving Feature Selection Algorithm for Large Peer-to-Peer Networks

    Data.gov (United States)

    National Aeronautics and Space Administration — In this paper we develop a local distributed privacy preserving algorithm for feature selection in a large peer-to-peer environment. Feature selection is often used...

  12. Pattern Classification Using an Olfactory Model with PCA Feature Selection in Electronic Noses: Study and Application

    Directory of Open Access Journals (Sweden)

    Junbao Zheng

    2012-03-01

    Full Text Available Biologically-inspired models and algorithms are considered as promising sensor array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper describes an investigation into the classification performance of a bionic olfactory model with the increase of the dimensions of input feature vector (outer factor as well as its parallel channels (inner factor. The principal component analysis technique was applied for feature selection and dimension reduction. Two data sets of three classes of wine derived from different cultivars and five classes of green tea derived from five different provinces of China were used for experiments. In the former case the results showed that the average correct classification rate increased as more principal components were put in to feature vector. In the latter case the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We concluded that 6~8 channels of the model with principal component feature vector values of at least 90% cumulative variance is adequate for a classification task of 3~5 pattern classes considering the trade-off between time consumption and classification rate.

  13. Self-Adaptive MOEA Feature Selection for Classification of Bankruptcy Prediction Data

    Science.gov (United States)

    Gaspar-Cunha, A.; Recio, G.; Costa, L.; Estébanez, C.

    2014-01-01

    Bankruptcy prediction is a vast area of finance and accounting whose importance lies in the relevance for creditors and investors in evaluating the likelihood of getting into bankrupt. As companies become complex, they develop sophisticated schemes to hide their real situation. In turn, making an estimation of the credit risks associated with counterparts or predicting bankruptcy becomes harder. Evolutionary algorithms have shown to be an excellent tool to deal with complex problems in finances and economics where a large number of irrelevant features are involved. This paper provides a methodology for feature selection in classification of bankruptcy data sets using an evolutionary multiobjective approach that simultaneously minimise the number of features and maximise the classifier quality measure (e.g., accuracy). The proposed methodology makes use of self-adaptation by applying the feature selection algorithm while simultaneously optimising the parameters of the classifier used. The methodology was applied to four different sets of data. The obtained results showed the utility of using the self-adaptation of the classifier. PMID:24707201

  14. Train Stop Scheduling in a High-Speed Rail Network by Utilizing a Two-Stage Approach

    Directory of Open Access Journals (Sweden)

    Huiling Fu

    2012-01-01

    Full Text Available Among the most commonly used methods of scheduling train stops are practical experience and various “one-step” optimal models. These methods face problems of direct transferability and computational complexity when considering a large-scale high-speed rail (HSR network such as the one in China. This paper introduces a two-stage approach for train stop scheduling with a goal of efficiently organizing passenger traffic into a rational train stop pattern combination while retaining features of regularity, connectivity, and rapidity (RCR. Based on a three-level station classification definition, a mixed integer programming model and a train operating tactics descriptive model along with the computing algorithm are developed and presented for the two stages. A real-world numerical example is presented using the Chinese HSR network as the setting. The performance of the train stop schedule and the applicability of the proposed approach are evaluated from the perspective of maintaining RCR.

  15. Final Report on Two-Stage Fast Spectrum Fuel Cycle Options

    International Nuclear Information System (INIS)

    Yang, Won Sik; Lin, C. S.; Hader, J. S.; Park, T. K.; Deng, P.; Yang, G.; Jung, Y. S.; Kim, T. K.; Stauff, N. E.

    2016-01-01

    This report presents the performance characteristics of two ''two-stage'' fast spectrum fuel cycle options proposed to enhance uranium resource utilization and to reduce nuclear waste generation. One is a two-stage fast spectrum fuel cycle option of continuous recycle of plutonium (Pu) in a fast reactor (FR) and subsequent burning of minor actinides (MAs) in an accelerator-driven system (ADS). The first stage is a sodium-cooled FR fuel cycle starting with low-enriched uranium (LEU) fuel; at the equilibrium cycle, the FR is operated using the recovered Pu and natural uranium without supporting LEU. Pu and uranium (U) are co-extracted from the discharged fuel and recycled in the first stage, and the recovered MAs are sent to the second stage. The second stage is a sodium-cooled ADS in which MAs are burned in an inert matrix fuel form. The discharged fuel of ADS is reprocessed, and all the recovered heavy metals (HMs) are recycled into the ADS. The other is a two-stage FR/ADS fuel cycle option with MA targets loaded in the FR. The recovered MAs are not directly sent to ADS, but partially incinerated in the FR in order to reduce the amount of MAs to be sent to the ADS. This is a heterogeneous recycling option of transuranic (TRU) elements

  16. Two-stage electrolysis to enrich tritium in environmental water

    International Nuclear Information System (INIS)

    Shima, Nagayoshi; Muranaka, Takeshi

    2007-01-01

    We present a two-stage electrolyzing procedure to enrich tritium in environmental waters. Tritium is first enriched rapidly through a commercially-available electrolyser with a large 50A current, and then through a newly-designed electrolyser that avoids the memory effect, with a 6A current. Tritium recovery factor obtained by such a two-stage electrolysis was greater than that obtained when using the commercially-available device solely. Water samples collected in 2006 in lakes and along the Pacific coast of Aomori prefecture, Japan, were electrolyzed using the two-stage method. Tritium concentrations in these samples ranged from 0.2 to 0.9 Bq/L and were half or less, that in samples collected at the same sites in 1992. (author)

  17. Enhancing Web Service Selection by User Preferences of Non-Functional Features

    OpenAIRE

    Badr , Youakim; Abraham , Ajith; Biennier , Frédérique; Grosan , Crina

    2008-01-01

    International audience; Selection of an appropriate Web service for a particular task has become a difficult challenge due to the increasing number of Web services offering similar functionalities. The functional properties describe what the service can do and the nonfunctional properties depict how the service can do it. Non-functional properties involving qualitative or quantitative features have become essential criteria to enhance the selection process of services making the selection pro...

  18. Modelling of an air-cooled two-stage Rankine cycle for electricity production

    International Nuclear Information System (INIS)

    Liu, Bo

    2014-01-01

    This work considers a two stage Rankine cycle architecture slightly different from a standard Rankine cycle for electricity generation. Instead of expanding the steam to extremely low pressure, the vapor leaves the turbine at a higher pressure then having a much smaller specific volume. It is thus possible to greatly reduce the size of the steam turbine. The remaining energy is recovered by a bottoming cycle using a working fluid which has a much higher density than the water steam. Thus, the turbines and heat exchangers are more compact; the turbine exhaust velocity loss is lower. This configuration enables to largely reduce the global size of the steam water turbine and facilitate the use of a dry cooling system. The main advantage of such an air cooled two stage Rankine cycle is the possibility to choose the installation site of a large or medium power plant without the need of a large and constantly available water source; in addition, as compared to water cooled cycles, the risk regarding future operations is reduced (climate conditions may affect water availability or temperature, and imply changes in the water supply regulatory rules). The concept has been investigated by EDF R and D. A 22 MW prototype was developed in the 1970's using ammonia as the working fluid of the bottoming cycle for its high density and high latent heat. However, this fluid is toxic. In order to search more suitable working fluids for the two stage Rankine cycle application and to identify the optimal cycle configuration, we have established a working fluid selection methodology. Some potential candidates have been identified. We have evaluated the performances of the two stage Rankine cycles operating with different working fluids in both design and off design conditions. For the most acceptable working fluids, components of the cycle have been sized. The power plant concept can then be evaluated on a life cycle cost basis. (author)

  19. Residual signal feature extraction for gearbox planetary stage fault detection

    DEFF Research Database (Denmark)

    Skrimpas, Georgios Alexandros; Ursin, Thomas; Sweeney, Christian Walsted

    2017-01-01

    Faults in planetary gears and related bearings, e.g. planet bearings and planet carrier bearings, pose inherent difficulties on their accurate and consistent detection associated mainly to the low energy in slow rotating stages and the operating complexity of planetary gearboxes. In this work......, identification of the expected spectral signature for proper residual signal calculation and filtering of any frequency component not related to the planetary stage. Two field cases of planet carrier bearing defect and planet wheel spalling are presented and discussed, showing the efficiency of the followed...

  20. Update on Merkel Cell Carcinoma: Epidemiology, Etiopathogenesis, Clinical Features, Diagnosis, and Staging.

    Science.gov (United States)

    Llombart, B; Requena, C; Cruz, J

    2017-03-01

    Merkel cell carcinoma (MCC) is a rare, highly aggressive tumor, and local or regional disease recurrence is common, as is metastasis. MCC usually develops in sun-exposed skin in patients of advanced age. Its incidence has risen 4-fold in recent decades as the population has aged and immunohistochemical techniques have led to more diagnoses. The pathogenesis of MCC remains unclear but UV radiation, immunosuppression, and the presence of Merkel cell polyomavirus in the tumor genome seem to play key roles. This review seeks to update our understanding of the epidemiology, etiology, pathogenesis, and clinical features of MCC. We also review histologic and immunohistochemical features required for diagnosis. MCC staging is discussed, given its great importance in establishing a prognosis for these patients. Copyright © 2016 AEDV. Publicado por Elsevier España, S.L.U. All rights reserved.

  1. On A Two-Stage Supply Chain Model In The Manufacturing Industry ...

    African Journals Online (AJOL)

    We model a two-stage supply chain where the upstream stage (stage 2) always meet demand from the downstream stage (stage 1).Demand is stochastic hence shortages will occasionally occur at stage 2. Stage 2 must fill these shortages by expediting using overtime production and/or backordering. We derive optimal ...

  2. One- and two-stage Arrhenius models for pharmaceutical shelf life prediction.

    Science.gov (United States)

    Fan, Zhewen; Zhang, Lanju

    2015-01-01

    One of the most challenging aspects of the pharmaceutical development is the demonstration and estimation of chemical stability. It is imperative that pharmaceutical products be stable for two or more years. Long-term stability studies are required to support such shelf life claim at registration. However, during drug development to facilitate formulation and dosage form selection, an accelerated stability study with stressed storage condition is preferred to quickly obtain a good prediction of shelf life under ambient storage conditions. Such a prediction typically uses Arrhenius equation that describes relationship between degradation rate and temperature (and humidity). Existing methods usually rely on the assumption of normality of the errors. In addition, shelf life projection is usually based on confidence band of a regression line. However, the coverage probability of a method is often overlooked or under-reported. In this paper, we introduce two nonparametric bootstrap procedures for shelf life estimation based on accelerated stability testing, and compare them with a one-stage nonlinear Arrhenius prediction model. Our simulation results demonstrate that one-stage nonlinear Arrhenius method has significant lower coverage than nominal levels. Our bootstrap method gave better coverage and led to a shelf life prediction closer to that based on long-term stability data.

  3. Particle swarm optimization based feature enhancement and feature selection for improved emotion recognition in speech and glottal signals.

    Science.gov (United States)

    Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali

    2015-01-01

    In the recent years, many research works have been published using speech related features for speech emotion recognition, however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstralcoefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and its glottal waveforms(GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features respectively. Three different emotional speech databases were utilized to gauge the proposed method. Extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature.

  4. Structural features of subtype-selective EP receptor modulators.

    Science.gov (United States)

    Markovič, Tijana; Jakopin, Žiga; Dolenc, Marija Sollner; Mlinarič-Raščan, Irena

    2017-01-01

    Prostaglandin E2 is a potent endogenous molecule that binds to four different G-protein-coupled receptors: EP1-4. Each of these receptors is a valuable drug target, with distinct tissue localisation and signalling pathways. We review the structural features of EP modulators required for subtype-selective activity, as well as the structural requirements for improved pharmacokinetic parameters. Novel EP receptor subtype selective agonists and antagonists appear to be valuable drug candidates in the therapy of many pathophysiological states, including ulcerative colitis, glaucoma, bone healing, B cell lymphoma, neurological diseases, among others, which have been studied in vitro, in vivo and in early phase clinical trials. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  5. Automatic sleep stage classification of single-channel EEG by using complex-valued convolutional neural network.

    Science.gov (United States)

    Zhang, Junming; Wu, Yan

    2018-03-28

    Many systems are developed for automatic sleep stage classification. However, nearly all models are based on handcrafted features. Because of the large feature space, there are so many features that feature selection should be used. Meanwhile, designing handcrafted features is a difficult and time-consuming task because the feature designing needs domain knowledge of experienced experts. Results vary when different sets of features are chosen to identify sleep stages. Additionally, many features that we may be unaware of exist. However, these features may be important for sleep stage classification. Therefore, a new sleep stage classification system, which is based on the complex-valued convolutional neural network (CCNN), is proposed in this study. Unlike the existing sleep stage methods, our method can automatically extract features from raw electroencephalography data and then classify sleep stage based on the learned features. Additionally, we also prove that the decision boundaries for the real and imaginary parts of a complex-valued convolutional neuron intersect orthogonally. The classification performances of handcrafted features are compared with those of learned features via CCNN. Experimental results show that the proposed method is comparable to the existing methods. CCNN obtains a better classification performance and considerably faster convergence speed than convolutional neural network. Experimental results also show that the proposed method is a useful decision-support tool for automatic sleep stage classification.

  6. Gene features selection for three-class disease classification via multiple orthogonal partial least square discriminant analysis and S-plot using microarray data.

    Science.gov (United States)

    Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu

    2013-01-01

    DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.

  7. Optimum location of external markers using feature selection algorithms for real‐time tumor tracking in external‐beam radiotherapy: a virtual phantom study

    Science.gov (United States)

    Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin

    2016-01-01

    In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation‐based feature

  8. Treatment of corn ethanol distillery wastewater using two-stage anaerobic digestion.

    Science.gov (United States)

    Ráduly, B; Gyenge, L; Szilveszter, Sz; Kedves, A; Crognale, S

    In this study the mesophilic two-stage anaerobic digestion (AD) of corn bioethanol distillery wastewater is investigated in laboratory-scale reactors. Two-stage AD technology separates the different sub-processes of the AD in two distinct reactors, enabling the use of optimal conditions for the different microbial consortia involved in the different process phases, and thus allowing for higher applicable organic loading rates (OLRs), shorter hydraulic retention times (HRTs) and better conversion rates of the organic matter, as well as higher methane content of the produced biogas. In our experiments the reactors have been operated in semi-continuous phase-separated mode. A specific methane production of 1,092 mL/(L·d) has been reached at an OLR of 6.5 g TCOD/(L·d) (TCOD: total chemical oxygen demand) and a total HRT of 21 days (5.7 days in the first-stage, and 15.3 days in the second-stage reactor). Nonetheless the methane concentration in the second-stage reactor was very high (78.9%); the two-stage AD outperformed the reference single-stage AD (conducted at the same reactor loading rate and retention time) by only a small margin in terms of volumetric methane production rate. This makes questionable whether the higher methane content of the biogas counterbalances the added complexity of the two-stage digestion.

  9. Improving permafrost distribution modelling using feature selection algorithms

    Science.gov (United States)

    Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail

    2016-04-01

    The availability of an increasing number of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional dataset is the number of input features (variables) involved. Application of ML classification algorithms to this large number of variables leads to the risk of overfitting, with the consequence of a poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps simplifying the amount of factors required and improves the knowledge on adopted features and their relation with the studied phenomenon. Moreover, taking away irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidences (geophysical and thermal data and rock glacier inventories) that serve as training permafrost data. Used FS algorithms informed about variables that appeared less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its

  10. Rapid Two-stage Versus One-stage Surgical Repair of Interrupted Aortic Arch with Ventricular Septal Defect in Neonates

    Directory of Open Access Journals (Sweden)

    Meng-Lin Lee

    2008-11-01

    Conclusion: The outcome of rapid two-stage repair is comparable to that of one-stage repair. Rapid two-stage repair has the advantages of significantly shorter cardiopulmonary bypass duration and AXC time, and avoids deep hypothermic circulatory arrest. LVOTO remains an unresolved issue, and postoperative aortic arch restenosis can be dilated effectively by percutaneous balloon angioplasty.

  11. Arc-Welding Spectroscopic Monitoring based on Feature Selection and Neural Networks.

    Science.gov (United States)

    Garcia-Allende, P Beatriz; Mirapeix, Jesus; Conde, Olga M; Cobo, Adolfo; Lopez-Higuera, Jose M

    2008-10-21

    A new spectral processing technique designed for application in the on-line detection and classification of arc-welding defects is presented in this paper. A noninvasive fiber sensor embedded within a TIG torch collects the plasma radiation originated during the welding process. The spectral information is then processed in two consecutive stages. A compression algorithm is first applied to the data, allowing real-time analysis. The selected spectral bands are then used to feed a classification algorithm, which will be demonstrated to provide an efficient weld defect detection and classification. The results obtained with the proposed technique are compared to a similar processing scheme presented in previous works, giving rise to an improvement in the performance of the monitoring system.

  12. Arc-Welding Spectroscopic Monitoring based on Feature Selection and Neural Networks

    Directory of Open Access Journals (Sweden)

    Jose M. Lopez- Higuera

    2008-10-01

    Full Text Available A new spectral processing technique designed for application in the on-line detection and classification of arc-welding defects is presented in this paper. A noninvasive fiber sensor embedded within a TIG torch collects the plasma radiation originated during the welding process. The spectral information is then processed in two consecutive stages. A compression algorithm is first applied to the data, allowing real-time analysis. The selected spectral bands are then used to feed a classification algorithm, which will be demonstrated to provide an efficient weld defect detection and classification. The results obtained with the proposed technique are compared to a similar processing scheme presented in previous works, giving rise to an improvement in the performance of the monitoring system.

  13. Teaching basic life support with an automated external defibrillator using the two-stage or the four-stage teaching technique.

    Science.gov (United States)

    Bjørnshave, Katrine; Krogh, Lise Q; Hansen, Svend B; Nebsbjerg, Mette A; Thim, Troels; Løfgren, Bo

    2018-02-01

    Laypersons often hesitate to perform basic life support (BLS) and use an automated external defibrillator (AED) because of self-perceived lack of knowledge and skills. Training may reduce the barrier to intervene. Reduced training time and costs may allow training of more laypersons. The aim of this study was to compare BLS/AED skills' acquisition and self-evaluated BLS/AED skills after instructor-led training with a two-stage versus a four-stage teaching technique. Laypersons were randomized to either two-stage or four-stage teaching technique courses. Immediately after training, the participants were tested in a simulated cardiac arrest scenario to assess their BLS/AED skills. Skills were assessed using the European Resuscitation Council BLS/AED assessment form. The primary endpoint was passing the test (17 of 17 skills adequately performed). A prespecified noninferiority margin of 20% was used. The two-stage teaching technique (n=72, pass rate 57%) was noninferior to the four-stage technique (n=70, pass rate 59%), with a difference in pass rates of -2%; 95% confidence interval: -18 to 15%. Neither were there significant differences between the two-stage and four-stage groups in the chest compression rate (114±12 vs. 115±14/min), chest compression depth (47±9 vs. 48±9 mm) and number of sufficient rescue breaths between compression cycles (1.7±0.5 vs. 1.6±0.7). In both groups, all participants believed that their training had improved their skills. Teaching laypersons BLS/AED using the two-stage teaching technique was noninferior to the four-stage teaching technique, although the pass rate was -2% (95% confidence interval: -18 to 15%) lower with the two-stage teaching technique.

  14. Novel Mahalanobis-based feature selection improves one-class classification of early hepatocellular carcinoma.

    Science.gov (United States)

    Thomaz, Ricardo de Lima; Carneiro, Pedro Cunha; Bonin, João Eliton; Macedo, Túlio Augusto Alves; Patrocinio, Ana Claudia; Soares, Alcimar Barbosa

    2018-05-01

    Detection of early hepatocellular carcinoma (HCC) is responsible for increasing survival rates in up to 40%. One-class classifiers can be used for modeling early HCC in multidetector computed tomography (MDCT), but demand the specific knowledge pertaining to the set of features that best describes the target class. Although the literature outlines several features for characterizing liver lesions, it is unclear which is most relevant for describing early HCC. In this paper, we introduce an unconstrained GA feature selection algorithm based on a multi-objective Mahalanobis fitness function to improve the classification performance for early HCC. We compared our approach to a constrained Mahalanobis function and two other unconstrained functions using Welch's t-test and Gaussian Data Descriptors. The performance of each fitness function was evaluated by cross-validating a one-class SVM. The results show that the proposed multi-objective Mahalanobis fitness function is capable of significantly reducing data dimensionality (96.4%) and improving one-class classification of early HCC (0.84 AUC). Furthermore, the results provide strong evidence that intensity features extracted at the arterial to portal and arterial to equilibrium phases are important for classifying early HCC.

  15. Linear feature selection in texture analysis - A PLS based method

    DEFF Research Database (Denmark)

    Marques, Joselene; Igel, Christian; Lillholm, Martin

    2013-01-01

    We present a texture analysis methodology that combined uncommitted machine-learning techniques and partial least square (PLS) in a fully automatic framework. Our approach introduces a robust PLS-based dimensionality reduction (DR) step to specifically address outliers and high-dimensional feature...... and considering all CV groups, the methods selected 36 % of the original features available. The diagnosis evaluation reached a generalization area-under-the-ROC curve of 0.92, which was higher than established cartilage-based markers known to relate to OA diagnosis....

  16. Feature Selection of Network Intrusion Data using Genetic Algorithm and Particle Swarm Optimization

    Directory of Open Access Journals (Sweden)

    Iwan Syarif

    2016-12-01

    Full Text Available This paper describes the advantages of using Evolutionary Algorithms (EA for feature selection on network intrusion dataset. Most current Network Intrusion Detection Systems (NIDS are unable to detect intrusions in real time because of high dimensional data produced during daily operation. Extracting knowledge from huge data such as intrusion data requires new approach. The more complex the datasets, the higher computation time and the harder they are to be interpreted and analyzed. This paper investigates the performance of feature selection algoritms in network intrusiona data. We used Genetic Algorithms (GA and Particle Swarm Optimizations (PSO as feature selection algorithms. When applied to network intrusion datasets, both GA and PSO have significantly reduces the number of features. Our experiments show that GA successfully reduces the number of attributes from 41 to 15 while PSO reduces the number of attributes from 41 to 9. Using k Nearest Neighbour (k-NN as a classifier,the GA-reduced dataset which consists of 37% of original attributes, has accuracy improvement from 99.28% to 99.70% and its execution time is also 4.8 faster than the execution time of original dataset. Using the same classifier, PSO-reduced dataset which consists of 22% of original attributes, has the fastest execution time (7.2 times faster than the execution time of original datasets. However, its accuracy is slightly reduced 0.02% from 99.28% to 99.26%. Overall, both GA and PSO are good solution as feature selection techniques because theyhave shown very good performance in reducing the number of features significantly while still maintaining and sometimes improving the classification accuracy as well as reducing the computation time.

  17. Hybrid feature selection algorithm using symmetrical uncertainty and a harmony search algorithm

    Science.gov (United States)

    Salameh Shreem, Salam; Abdullah, Salwani; Nazri, Mohd Zakree Ahmad

    2016-04-01

    Microarray technology can be used as an efficient diagnostic system to recognise diseases such as tumours or to discriminate between different types of cancers in normal tissues. This technology has received increasing attention from the bioinformatics community because of its potential in designing powerful decision-making tools for cancer diagnosis. However, the presence of thousands or tens of thousands of genes affects the predictive accuracy of this technology from the perspective of classification. Thus, a key issue in microarray data is identifying or selecting the smallest possible set of genes from the input data that can achieve good predictive accuracy for classification. In this work, we propose a two-stage selection algorithm for gene selection problems in microarray data-sets called the symmetrical uncertainty filter and harmony search algorithm wrapper (SU-HSA). Experimental results show that the SU-HSA is better than HSA in isolation for all data-sets in terms of the accuracy and achieves a lower number of genes on 6 out of 10 instances. Furthermore, the comparison with state-of-the-art methods shows that our proposed approach is able to obtain 5 (out of 10) new best results in terms of the number of selected genes and competitive results in terms of the classification accuracy.

  18. Final Report on Two-Stage Fast Spectrum Fuel Cycle Options

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Won Sik [Purdue Univ., West Lafayette, IN (United States); Lin, C. S. [Purdue Univ., West Lafayette, IN (United States); Hader, J. S. [Purdue Univ., West Lafayette, IN (United States); Park, T. K. [Purdue Univ., West Lafayette, IN (United States); Deng, P. [Purdue Univ., West Lafayette, IN (United States); Yang, G. [Purdue Univ., West Lafayette, IN (United States); Jung, Y. S. [Purdue Univ., West Lafayette, IN (United States); Kim, T. K. [Argonne National Lab. (ANL), Argonne, IL (United States); Stauff, N. E. [Argonne National Lab. (ANL), Argonne, IL (United States)

    2016-01-30

    This report presents the performance characteristics of twotwo-stage” fast spectrum fuel cycle options proposed to enhance uranium resource utilization and to reduce nuclear waste generation. One is a two-stage fast spectrum fuel cycle option of continuous recycle of plutonium (Pu) in a fast reactor (FR) and subsequent burning of minor actinides (MAs) in an accelerator-driven system (ADS). The first stage is a sodium-cooled FR fuel cycle starting with low-enriched uranium (LEU) fuel; at the equilibrium cycle, the FR is operated using the recovered Pu and natural uranium without supporting LEU. Pu and uranium (U) are co-extracted from the discharged fuel and recycled in the first stage, and the recovered MAs are sent to the second stage. The second stage is a sodium-cooled ADS in which MAs are burned in an inert matrix fuel form. The discharged fuel of ADS is reprocessed, and all the recovered heavy metals (HMs) are recycled into the ADS. The other is a two-stage FR/ADS fuel cycle option with MA targets loaded in the FR. The recovered MAs are not directly sent to ADS, but partially incinerated in the FR in order to reduce the amount of MAs to be sent to the ADS. This is a heterogeneous recycling option of transuranic (TRU) elements

  19. Feature Selection for Nonstationary Data: Application to Human Recognition Using Medical Biometrics.

    Science.gov (United States)

    Komeili, Majid; Louis, Wael; Armanfard, Narges; Hatzinakos, Dimitrios

    2018-05-01

    Electrocardiogram (ECG) and transient evoked otoacoustic emission (TEOAE) are among the physiological signals that have attracted significant interest in biometric community due to their inherent robustness to replay and falsification attacks. However, they are time-dependent signals and this makes them hard to deal with in across-session human recognition scenario where only one session is available for enrollment. This paper presents a novel feature selection method to address this issue. It is based on an auxiliary dataset with multiple sessions where it selects a subset of features that are more persistent across different sessions. It uses local information in terms of sample margins while enforcing an across-session measure. This makes it a perfect fit for aforementioned biometric recognition problem. Comprehensive experiments on ECG and TEOAE variability due to time lapse and body posture are done. Performance of the proposed method is compared against seven state-of-the-art feature selection algorithms as well as another six approaches in the area of ECG and TEOAE biometric recognition. Experimental results demonstrate that the proposed method performs noticeably better than other algorithms.

  20. An Empirical Study of Wrappers for Feature Subset Selection based on a Parallel Genetic Algorithm: The Multi-Wrapper Model

    KAUST Repository

    Soufan, Othman

    2012-09-01

    Feature selection is the first task of any learning approach that is applied in major fields of biomedical, bioinformatics, robotics, natural language processing and social networking. In feature subset selection problem, a search methodology with a proper criterion seeks to find the best subset of features describing data (relevance) and achieving better performance (optimality). Wrapper approaches are feature selection methods which are wrapped around a classification algorithm and use a performance measure to select the best subset of features. We analyze the proper design of the objective function for the wrapper approach and highlight an objective based on several classification algorithms. We compare the wrapper approaches to different feature selection methods based on distance and information based criteria. Significant improvement in performance, computational time, and selection of minimally sized feature subsets is achieved by combining different objectives for the wrapper model. In addition, considering various classification methods in the feature selection process could lead to a global solution of desirable characteristics.

  1. A Permutation Importance-Based Feature Selection Method for Short-Term Electricity Load Forecasting Using Random Forest

    Directory of Open Access Journals (Sweden)

    Nantian Huang

    2016-09-01

    Full Text Available The prediction accuracy of short-term load forecast (STLF depends on prediction model choice and feature selection result. In this paper, a novel random forest (RF-based feature selection method for STLF is proposed. First, 243 related features were extracted from historical load data and the time information of prediction points to form the original feature set. Subsequently, the original feature set was used to train an RF as the original model. After the training process, the prediction error of the original model on the test set was recorded and the permutation importance (PI value of each feature was obtained. Then, an improved sequential backward search method was used to select the optimal forecasting feature subset based on the PI value of each feature. Finally, the optimal forecasting feature subset was used to train a new RF model as the final prediction model. Experiments showed that the prediction accuracy of RF trained by the optimal forecasting feature subset was higher than that of the original model and comparative models based on support vector regression and artificial neural network.

  2. A stereo remote sensing feature selection method based on artificial bee colony algorithm

    Science.gov (United States)

    Yan, Yiming; Liu, Pigang; Zhang, Ye; Su, Nan; Tian, Shu; Gao, Fengjiao; Shen, Yi

    2014-05-01

    To improve the efficiency of stereo information for remote sensing classification, a stereo remote sensing feature selection method is proposed in this paper presents, which is based on artificial bee colony algorithm. Remote sensing stereo information could be described by digital surface model (DSM) and optical image, which contain information of the three-dimensional structure and optical characteristics, respectively. Firstly, three-dimensional structure characteristic could be analyzed by 3D-Zernike descriptors (3DZD). However, different parameters of 3DZD could descript different complexity of three-dimensional structure, and it needs to be better optimized selected for various objects on the ground. Secondly, features for representing optical characteristic also need to be optimized. If not properly handled, when a stereo feature vector composed of 3DZD and image features, that would be a lot of redundant information, and the redundant information may not improve the classification accuracy, even cause adverse effects. To reduce information redundancy while maintaining or improving the classification accuracy, an optimized frame for this stereo feature selection problem is created, and artificial bee colony algorithm is introduced for solving this optimization problem. Experimental results show that the proposed method can effectively improve the computational efficiency, improve the classification accuracy.

  3. Runway Operations Planning: A Two-Stage Heuristic Algorithm

    Science.gov (United States)

    Anagnostakis, Ioannis; Clarke, John-Paul

    2003-01-01

    The airport runway is a scarce resource that must be shared by different runway operations (arrivals, departures and runway crossings). Given the possible sequences of runway events, careful Runway Operations Planning (ROP) is required if runway utilization is to be maximized. From the perspective of departures, ROP solutions are aircraft departure schedules developed by optimally allocating runway time for departures given the time required for arrivals and crossings. In addition to the obvious objective of maximizing throughput, other objectives, such as guaranteeing fairness and minimizing environmental impact, can also be incorporated into the ROP solution subject to constraints introduced by Air Traffic Control (ATC) procedures. This paper introduces a two stage heuristic algorithm for solving the Runway Operations Planning (ROP) problem. In the first stage, sequences of departure class slots and runway crossings slots are generated and ranked based on departure runway throughput under stochastic conditions. In the second stage, the departure class slots are populated with specific flights from the pool of available aircraft, by solving an integer program with a Branch & Bound algorithm implementation. Preliminary results from this implementation of the two-stage algorithm on real-world traffic data are presented.

  4. Information Theory for Gabor Feature Selection for Face Recognition

    Directory of Open Access Journals (Sweden)

    Shen Linlin

    2006-01-01

    Full Text Available A discriminative and robust feature—kernel enhanced informative Gabor feature—is proposed in this paper for face recognition. Mutual information is applied to select a set of informative and nonredundant Gabor features, which are then further enhanced by kernel methods for recognition. Compared with one of the top performing methods in the 2004 Face Verification Competition (FVC2004, our methods demonstrate a clear advantage over existing methods in accuracy, computation efficiency, and memory cost. The proposed method has been fully tested on the FERET database using the FERET evaluation protocol. Significant improvements on three of the test data sets are observed. Compared with the classical Gabor wavelet-based approaches using a huge number of features, our method requires less than 4 milliseconds to retrieve a few hundreds of features. Due to the substantially reduced feature dimension, only 4 seconds are required to recognize 200 face images. The paper also unified different Gabor filter definitions and proposed a training sample generation algorithm to reduce the effects caused by unbalanced number of samples available in different classes.

  5. Information Theory for Gabor Feature Selection for Face Recognition

    Science.gov (United States)

    Shen, Linlin; Bai, Li

    2006-12-01

    A discriminative and robust feature—kernel enhanced informative Gabor feature—is proposed in this paper for face recognition. Mutual information is applied to select a set of informative and nonredundant Gabor features, which are then further enhanced by kernel methods for recognition. Compared with one of the top performing methods in the 2004 Face Verification Competition (FVC2004), our methods demonstrate a clear advantage over existing methods in accuracy, computation efficiency, and memory cost. The proposed method has been fully tested on the FERET database using the FERET evaluation protocol. Significant improvements on three of the test data sets are observed. Compared with the classical Gabor wavelet-based approaches using a huge number of features, our method requires less than 4 milliseconds to retrieve a few hundreds of features. Due to the substantially reduced feature dimension, only 4 seconds are required to recognize 200 face images. The paper also unified different Gabor filter definitions and proposed a training sample generation algorithm to reduce the effects caused by unbalanced number of samples available in different classes.

  6. Profiles and drivers of antibiotic resistance genes distribution in one-stage and two-stage sludge anaerobic digestion based on microwave-H2O2 pretreatment.

    Science.gov (United States)

    Zhang, Junya; Liu, Jibao; Wang, Yawei; Yu, Dawei; Sui, Qianwen; Wang, Rui; Chen, Meixue; Tong, Juan; Wei, Yuansong

    2017-10-01

    Three anaerobic digestion (AD) processes of waste activated sludge (WAS) were established including the control (mono-WAS), one-stage AD and two-stage AD along with microwave-H 2 O 2 pre-treatment (MW-H 2 O 2 ) to investigate the profiles and drivers of antibiotic resistance genes (ARGs) distribution concerning co-selection from heavy metals, intI1 and microbial community through qPCR and high-throughput sequencing method. Results showed that MW-H 2 O 2 could reduce the absolute gene copies of all ARGs while increased the relative abundance of most ARGs. After subsequent AD, both total ARGs quantities and relative abundance were enriched while two-stage AD showed some advantages over ARGs abundance reduction. Besides, AD was more effective on the potential pathogens reduction than MW-H 2 O 2 . AD could reduce the role of intI1 on the spread of ARGs, while mantel test and procrustes analysis indicated that the variation of ARGs abundance was closely associated with the discrepancy of bacterial community. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Two-stage residual inclusion estimation: addressing endogeneity in health econometric modeling.

    Science.gov (United States)

    Terza, Joseph V; Basu, Anirban; Rathouz, Paul J

    2008-05-01

    The paper focuses on two estimation methods that have been widely used to address endogeneity in empirical research in health economics and health services research-two-stage predictor substitution (2SPS) and two-stage residual inclusion (2SRI). 2SPS is the rote extension (to nonlinear models) of the popular linear two-stage least squares estimator. The 2SRI estimator is similar except that in the second-stage regression, the endogenous variables are not replaced by first-stage predictors. Instead, first-stage residuals are included as additional regressors. In a generic parametric framework, we show that 2SRI is consistent and 2SPS is not. Results from a simulation study and an illustrative example also recommend against 2SPS and favor 2SRI. Our findings are important given that there are many prominent examples of the application of inconsistent 2SPS in the recent literature. This study can be used as a guide by future researchers in health economics who are confronted with endogeneity in their empirical work.

  8. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics.

    Science.gov (United States)

    Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai

    2017-12-26

    Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  9. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Xiaohui Lin

    2017-12-01

    Full Text Available Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  10. Maximally efficient two-stage screening: Determining intellectual disability in Taiwanese military conscripts.

    Science.gov (United States)

    Chien, Chia-Chang; Huang, Shu-Fen; Lung, For-Wey

    2009-01-27

    The purpose of this study was to apply a two-stage screening method for the large-scale intelligence screening of military conscripts. We collected 99 conscripted soldiers whose educational levels were senior high school level or lower to be the participants. Every participant was required to take the Wisconsin Card Sorting Test (WCST) and the Wechsler Adult Intelligence Scale-Revised (WAIS-R) assessments. Logistic regression analysis showed the conceptual level responses (CLR) index of the WCST was the most significant index for determining intellectual disability (ID; FIQ ≤ 84). We used the receiver operating characteristic curve to determine the optimum cut-off point of CLR. The optimum one cut-off point of CLR was 66; the two cut-off points were 49 and 66. Comparing the two-stage window screening with the two-stage positive screening, the area under the curve and the positive predictive value increased. Moreover, the cost of the two-stage window screening decreased by 59%. The two-stage window screening is more accurate and economical than the two-stage positive screening. Our results provide an example for the use of two-stage screening and the possibility of the WCST to replace WAIS-R in large-scale screenings for ID in the future.

  11. Spin-Selective Transmission and Devisable Chirality in Two-Layer Metasurfaces.

    Science.gov (United States)

    Li, Zhancheng; Liu, Wenwei; Cheng, Hua; Chen, Shuqi; Tian, Jianguo

    2017-08-15

    Chirality is a nearly ubiquitous natural phenomenon. Its minute presence in most naturally occurring materials makes it incredibly difficult to detect. Recent advances in metasurfaces indicate that they exhibit devisable chirality in novel forms; this finding offers an effective opening for studying chirality and its features in such nanostructures. These metasurfaces display vast possibilities for highly sensitive chirality discrimination in biological and chemical systems. Here, we show that two-layer metasurfaces based on twisted nanorods can generate giant spin-selective transmission and support engineered chirality in the near-infrared region. Two designed metasurfaces with opposite spin-selective transmission are proposed for treatment as enantiomers and can be used widely for spin selection and enhanced chiral sensing. Specifically, we demonstrate that the chirality in these proposed metasurfaces can be adjusted effectively by simply changing the orientation angle between the twisted nanorods. Our results offer simple and straightforward rules for chirality engineering in metasurfaces and suggest intriguing possibilities for the applications of such metasurfaces in spin optics and chiral sensing.

  12. An enhanced PSO-DEFS based feature selection with biometric authentication for identification of diabetic retinopathy

    Directory of Open Access Journals (Sweden)

    Umarani Balakrishnan

    2016-11-01

    Full Text Available Recently, automatic diagnosis of diabetic retinopathy (DR from the retinal image is the most significant research topic in the medical applications. Diabetic macular edema (DME is the major reason for the loss of vision in patients suffering from DR. Early identification of the DR enables to prevent the vision loss and encourage diabetic control activities. Many techniques are developed to diagnose the DR. The major drawbacks of the existing techniques are low accuracy and high time complexity. To overcome these issues, this paper proposes an enhanced particle swarm optimization-differential evolution feature selection (PSO-DEFS based feature selection approach with biometric authentication for the identification of DR. Initially, a hybrid median filter (HMF is used for pre-processing the input images. Then, the pre-processed images are embedded with each other by using least significant bit (LSB for authentication purpose. Simultaneously, the image features are extracted using convoluted local tetra pattern (CLTrP and Tamura features. Feature selection is performed using PSO-DEFS and PSO-gravitational search algorithm (PSO-GSA to reduce time complexity. Based on some performance metrics, the PSO-DEFS is chosen as a better choice for feature selection. The feature selection is performed based on the fitness value. A multi-relevance vector machine (M-RVM is introduced to classify the 13 normal and 62 abnormal images among 75 images from 60 patients. Finally, the DR patients are further classified by M-RVM. The experimental results exhibit that the proposed approach achieves better accuracy, sensitivity, and specificity than the existing techniques.

  13. A New Feature Ensemble with a Multistage Classification Scheme for Breast Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Idil Isikli Esener

    2017-01-01

    Full Text Available A new and effective feature ensemble with a multistage classification is proposed to be implemented in a computer-aided diagnosis (CAD system for breast cancer diagnosis. A publicly available mammogram image dataset collected during the Image Retrieval in Medical Applications (IRMA project is utilized to verify the suggested feature ensemble and multistage classification. In achieving the CAD system, feature extraction is performed on the mammogram region of interest (ROI images which are preprocessed by applying a histogram equalization followed by a nonlocal means filtering. The proposed feature ensemble is formed by concatenating the local configuration pattern-based, statistical, and frequency domain features. The classification process of these features is implemented in three cases: a one-stage study, a two-stage study, and a three-stage study. Eight well-known classifiers are used in all cases of this multistage classification scheme. Additionally, the results of the classifiers that provide the top three performances are combined via a majority voting technique to improve the recognition accuracy on both two- and three-stage studies. A maximum of 85.47%, 88.79%, and 93.52% classification accuracies are attained by the one-, two-, and three-stage studies, respectively. The proposed multistage classification scheme is more effective than the single-stage classification for breast cancer diagnosis.

  14. Two-dimensional wavelet transform feature extraction for porous silicon chemical sensors.

    Science.gov (United States)

    Murguía, José S; Vergara, Alexander; Vargas-Olmos, Cecilia; Wong, Travis J; Fonollosa, Jordi; Huerta, Ramón

    2013-06-27

    Designing reliable, fast responding, highly sensitive, and low-power consuming chemo-sensory systems has long been a major goal in chemo-sensing. This goal, however, presents a difficult challenge because having a set of chemo-sensory detectors exhibiting all these aforementioned ideal conditions are still largely un-realizable to-date. This paper presents a unique perspective on capturing more in-depth insights into the physicochemical interactions of two distinct, selectively chemically modified porous silicon (pSi) film-based optical gas sensors by implementing an innovative, based on signal processing methodology, namely the two-dimensional discrete wavelet transform. Specifically, the method consists of using the two-dimensional discrete wavelet transform as a feature extraction method to capture the non-stationary behavior from the bi-dimensional pSi rugate sensor response. Utilizing a comprehensive set of measurements collected from each of the aforementioned optically based chemical sensors, we evaluate the significance of our approach on a complex, six-dimensional chemical analyte discrimination/quantification task problem. Due to the bi-dimensional aspects naturally governing the optical sensor response to chemical analytes, our findings provide evidence that the proposed feature extractor strategy may be a valuable tool to deepen our understanding of the performance of optically based chemical sensors as well as an important step toward attaining their implementation in more realistic chemo-sensing applications. Copyright © 2013 Elsevier B.V. All rights reserved.

  15. The Two-Word Stage: Motivated by Linguistic or Cognitive Constraints?

    Science.gov (United States)

    Berk, Stephanie; Lillo-Martin, Diane

    2012-01-01

    Child development researchers often discuss a "two-word" stage during language acquisition. However, there is still debate over whether the existence of this stage reflects primarily cognitive or linguistic constraints. Analyses of longitudinal data from two Deaf children, Mei and Cal, not exposed to an accessible first language (American Sign…

  16. A two-stage heating scheme for heat assisted magnetic recording

    Science.gov (United States)

    Xiong, Shaomin; Kim, Jeongmin; Wang, Yuan; Zhang, Xiang; Bogy, David

    2014-05-01

    Heat Assisted Magnetic Recording (HAMR) has been proposed to extend the storage areal density beyond 1 Tb/in.2 for the next generation magnetic storage. A near field transducer (NFT) is widely used in HAMR systems to locally heat the magnetic disk during the writing process. However, much of the laser power is absorbed around the NFT, which causes overheating of the NFT and reduces its reliability. In this work, a two-stage heating scheme is proposed to reduce the thermal load by separating the NFT heating process into two individual heating stages from an optical waveguide and a NFT, respectively. As the first stage, the optical waveguide is placed in front of the NFT and delivers part of laser energy directly onto the disk surface to heat it up to a peak temperature somewhat lower than the Curie temperature of the magnetic material. Then, the NFT works as the second heating stage to heat a smaller area inside the waveguide heated area further to reach the Curie point. The energy applied to the NFT in the second heating stage is reduced compared with a typical single stage NFT heating system. With this reduced thermal load to the NFT by the two-stage heating scheme, the lifetime of the NFT can be extended orders longer under the cyclic load condition.

  17. TEHRAN AIR POLLUTANTS PREDICTION BASED ON RANDOM FOREST FEATURE SELECTION METHOD

    Directory of Open Access Journals (Sweden)

    A. Shamsoddini

    2017-09-01

    Full Text Available Air pollution as one of the most serious forms of environmental pollutions poses huge threat to human life. Air pollution leads to environmental instability, and has harmful and undesirable effects on the environment. Modern prediction methods of the pollutant concentration are able to improve decision making and provide appropriate solutions. This study examines the performance of the Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Networks methods, in order to achieve an efficient model to estimate carbon monoxide and nitrogen dioxide, sulfur dioxide and PM2.5 contents in the air. The results indicated that Artificial Neural Networks fed by the attributes selected by Random Forest feature selection method performed more accurate than other models for the modeling of all pollutants. The estimation accuracy of sulfur dioxide emissions was lower than the other air contaminants whereas the nitrogen dioxide was predicted more accurate than the other pollutants.

  18. Tehran Air Pollutants Prediction Based on Random Forest Feature Selection Method

    Science.gov (United States)

    Shamsoddini, A.; Aboodi, M. R.; Karami, J.

    2017-09-01

    Air pollution as one of the most serious forms of environmental pollutions poses huge threat to human life. Air pollution leads to environmental instability, and has harmful and undesirable effects on the environment. Modern prediction methods of the pollutant concentration are able to improve decision making and provide appropriate solutions. This study examines the performance of the Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Networks methods, in order to achieve an efficient model to estimate carbon monoxide and nitrogen dioxide, sulfur dioxide and PM2.5 contents in the air. The results indicated that Artificial Neural Networks fed by the attributes selected by Random Forest feature selection method performed more accurate than other models for the modeling of all pollutants. The estimation accuracy of sulfur dioxide emissions was lower than the other air contaminants whereas the nitrogen dioxide was predicted more accurate than the other pollutants.

  19. A proposed framework on hybrid feature selection techniques for handling high dimensional educational data

    Science.gov (United States)

    Shahiri, Amirah Mohamed; Husain, Wahidah; Rashid, Nur'Aini Abd

    2017-10-01

    Huge amounts of data in educational datasets may cause the problem in producing quality data. Recently, data mining approach are increasingly used by educational data mining researchers for analyzing the data patterns. However, many research studies have concentrated on selecting suitable learning algorithms instead of performing feature selection process. As a result, these data has problem with computational complexity and spend longer computational time for classification. The main objective of this research is to provide an overview of feature selection techniques that have been used to analyze the most significant features. Then, this research will propose a framework to improve the quality of students' dataset. The proposed framework uses filter and wrapper based technique to support prediction process in future study.

  20. Automatic Sleep Staging using Multi-dimensional Feature Extraction and Multi-kernel Fuzzy Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Yanjun Zhang

    2014-01-01

    Full Text Available This paper employed the clinical Polysomnographic (PSG data, mainly including all-night Electroencephalogram (EEG, Electrooculogram (EOG and Electromyogram (EMG signals of subjects, and adopted the American Academy of Sleep Medicine (AASM clinical staging manual as standards to realize automatic sleep staging. Authors extracted eighteen different features of EEG, EOG and EMG in time domains and frequency domains to construct the vectors according to the existing literatures as well as clinical experience. By adopting sleep samples self-learning, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM were learned and the multi-kernel FSVM (MK-FSVM was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.

  1. Efficacy of single-stage and two-stage Fowler–Stephens laparoscopic orchidopexy in the treatment of intraabdominal high testis

    Directory of Open Access Journals (Sweden)

    Chang-Yuan Wang

    2017-11-01

    Conclusion: In the case of testis with good collateral circulation, single-stage F-S laparoscopic orchidopexy had the same safety and efficacy as the two-stage F-S procedure. Surgical options should be based on comprehensive consideration of intraoperative testicular location, testicular ischemia test, and collateral circumstances surrounding the testes. Under the appropriate conditions, we propose single-stage F-S laparoscopic orchidopexy be preferred. It may be appropriate to avoid unnecessary application of the two-stage procedure that has a higher cost and causes more pain for patients.

  2. An opinion formation based binary optimization approach for feature selection

    Science.gov (United States)

    Hamedmoghadam, Homayoun; Jalili, Mahdi; Yu, Xinghuo

    2018-02-01

    This paper proposed a novel optimization method based on opinion formation in complex network systems. The proposed optimization technique mimics human-human interaction mechanism based on a mathematical model derived from social sciences. Our method encodes a subset of selected features to the opinion of an artificial agent and simulates the opinion formation process among a population of agents to solve the feature selection problem. The agents interact using an underlying interaction network structure and get into consensus in their opinions, while finding better solutions to the problem. A number of mechanisms are employed to avoid getting trapped in local minima. We compare the performance of the proposed method with a number of classical population-based optimization methods and a state-of-the-art opinion formation based method. Our experiments on a number of high dimensional datasets reveal outperformance of the proposed algorithm over others.

  3. Influence of various chlorine additives on the partitioning of heavy metals during low-temperature two-stage fluidized bed incineration.

    Science.gov (United States)

    Peng, Tzu-Huan; Lin, Chiou-Liang

    2014-12-15

    In this study, a pilot-scale low-temperature two-stage fluidized bed incinerator was evaluated for the control of heavy metal emissions using various chlorine (Cl) additives. Artificial waste containing heavy metals was selected to simulate municipal solid waste (MSW). Operating parameters considered included the first-stage combustion temperature, gas velocity, and different kinds of Cl additives. Results showed that the low-temperature two-stage fluidized bed reactor can be an effective system for the treatment of MSW because of its low NO(x), CO, HCl, and heavy metal emissions. The NO(x) and HCl emissions could be decreased by 42% and 70%, respectively. Further, the results showed that heavy metal emissions were reduced by bed material adsorption and filtration in the second stage. Regarding the Cl addition, although the Cl addition would reduce the metal capture in the first-stage sand bed, but those emitted metals could be effectively captured by the filtration of second stage. No matter choose what kind of additive, metal emissions in the low-temperature two-stage system are still lower than in a traditional high-temperature one-stage system. The results also showed that metal emissions depend not only on the combustion temperature but also on the physicochemical properties of the different metal species. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Two-Stage Regularized Linear Discriminant Analysis for 2-D Data.

    Science.gov (United States)

    Zhao, Jianhua; Shi, Lei; Zhu, Ji

    2015-08-01

    Fisher linear discriminant analysis (LDA) involves within-class and between-class covariance matrices. For 2-D data such as images, regularized LDA (RLDA) can improve LDA due to the regularized eigenvalues of the estimated within-class matrix. However, it fails to consider the eigenvectors and the estimated between-class matrix. To improve these two matrices simultaneously, we propose in this paper a new two-stage method for 2-D data, namely a bidirectional LDA (BLDA) in the first stage and the RLDA in the second stage, where both BLDA and RLDA are based on the Fisher criterion that tackles correlation. BLDA performs the LDA under special separable covariance constraints that incorporate the row and column correlations inherent in 2-D data. The main novelty is that we propose a simple but effective statistical test to determine the subspace dimensionality in the first stage. As a result, the first stage reduces the dimensionality substantially while keeping the significant discriminant information in the data. This enables the second stage to perform RLDA in a much lower dimensional subspace, and thus improves the two estimated matrices simultaneously. Experiments on a number of 2-D synthetic and real-world data sets show that BLDA+RLDA outperforms several closely related competitors.

  5. Two-stage precipitation of plutonium trifluoride

    International Nuclear Information System (INIS)

    Luerkens, D.W.

    1984-04-01

    Plutonium trifluoride was precipitated using a two-stage precipitation system. A series of precipitation experiments identified the significant process variables affecting precipitate characteristics. A mathematical precipitation model was developed which was based on the formation of plutonium fluoride complexes. The precipitation model relates all process variables, in a single equation, to a single parameter that can be used to control particle characteristics

  6. License Application Design Selection Feature Report: Waste Package Self Shielding Design Feature 13

    International Nuclear Information System (INIS)

    Tang, J.S.

    2000-01-01

    In the Viability Assessment (VA) reference design, handling of waste packages (WPs) in the emplacement drifts is performed remotely, and human access to the drifts is precluded when WPs are present. This report will investigate the feasibility of using a self-shielded WP design to reduce the radiation levels in the emplacement drifts to a point that, when coupled with ventilation, will create an acceptable environment for human access. This provides the benefit of allowing human entry to emplacement drifts to perform maintenance on ground support and instrumentation, and carry out performance confirmation activities. More direct human control of WP handling and emplacement operations would also be possible. However, these potential benefits must be weighed against the cost of implementation, and potential impacts on pre- and post-closure performance of the repository and WPs. The first section of this report will provide background information on previous investigations of the self-shielded WP design feature, summarize the objective and scope of this document, and provide quality assurance and software information. A shielding performance and cost study that includes several candidate shield materials will then be performed in the subsequent section to allow selection of two self-shielded WP design options for further evaluation. Finally, the remaining sections will evaluate the impacts of the two WP self-shielding options on the repository design, operations, safety, cost, and long-term performance of the WPs with respect to the VA reference design

  7. Behavioral performance follows the time course of neural facilitation and suppression during cued shifts of feature-selective attention

    OpenAIRE

    Andersen, S. K.; Müller, M. M.

    2010-01-01

    A central question in the field of attention is whether visual processing is a strictly limited resource, which must be allocated by selective attention. If this were the case, attentional enhancement of one stimulus should invariably lead to suppression of unattended distracter stimuli. Here we examine voluntary cued shifts of feature-selective attention to either one of two superimposed red or blue random dot kinematograms (RDKs) to test whether such a reciprocal relationship between enhanc...

  8. Use of a two-stage light-gas gun as an injector for electromagnetic railguns

    International Nuclear Information System (INIS)

    Shahinpoor, M.

    1989-01-01

    Ablation of wall materials is known to be a major factor limiting the performance of railguns. To minimize this effect, it is desirable too inject projectiles into railgun at velocities greater than the ablation threshold velocity (6-8 km/s for copper rails). Because two-stage light-gas guns are capable of achieving such velocities, a program was initiated to design, build and evaluate the performance of a two-stage light gas gun, utilizing hydrogen gas, for use as an injector to an electromagnetic railgun. This effort is part of a project to develop a hypervelocity electromagnetic launcher (HELEOS) for use in equation-of-state studies. In this paper, the specific design features that enhance compatibility of the injector with the railgun, including a slip-joint between the injector launch tube and the coupling section to the railgun are described. The operational capabilities for using all major projectile velocity measuring techniques, such as in-bore pressure gauges, laser and CW x-ray interrupt techniques, flash x-ray and continuous in-bore velocity measurements using VISAR interferometry are also discussed. Finally an internal ballistics code for optimizing gun performance has been utilized to interpret performance data of the gun

  9. Data Visualization and Feature Selection Methods in Gel-based Proteomics

    DEFF Research Database (Denmark)

    Silva, Tomé Santos; Richard, Nadege; Dias, Jorge P.

    2014-01-01

    -based proteomics, summarizing the current state of research within this field. Particular focus is given on discussing the usefulness of available multivariate analysis tools both for data visualization and feature selection purposes. Visual examples are given using a real gel-based proteomic dataset as basis....

  10. Features of construction of structures in long-term training acrobatics at the modern stage

    Directory of Open Access Journals (Sweden)

    N.V. Bachynska

    2015-02-01

    Full Text Available Purpose: the basic directions of the structure of long-term training in sports acrobatics are ground. The objectives of the study was to determine the leading requirements and criteria, the main stages of a multi-year training in acrobatics. Material : analysis of special scientific and methodical literature, revealing the specific features of the construction of long-term training in sports and gymnastics, acrobatic rock 'n' roll, a number of other sports. Results : general structure, goals, objectives and provisions of the basic stages of a multi-year training in sports acrobatics. Singled leading indicators and criteria for each of the main stages of long-term sports training in acrobatics. Recommended duration of training sessions and key requirements for the preparation of acrobats. Conclusions : outlines the main requirements and benchmarks that can guide the trainer in a training and competitive activity when working with acrobats all age groups and different sports qualification.

  11. Two-Stage Dynamic Pricing and Advertising Strategies for Online Video Services

    Directory of Open Access Journals (Sweden)

    Zhi Li

    2017-01-01

    Full Text Available As the demands for online video services increase intensively, the selection of business models has drawn the great attention of online providers. Among them, pay-per-view mode and advertising mode are two important resource modes, where the reasonable fee charge and suitable volume of ads need to be determined. This paper establishes an analytical framework studying the optimal dynamic pricing and advertising strategies for online providers; it shows how the strategies are influenced by the videos available time and the viewers’ emotional factor. We create the two-stage strategy of revenue models involving a single fee mode and a mixed fee-free mode and find out the optimal fee charge and advertising level of online video services. According to the results, the optimal video price and ads volume dynamically vary over time. The viewer’s aversion level to advertising has direct effects on both the volume of ads and the number of viewers who have selected low-quality content. The optimal volume of ads decreases with the increase of ads-aversion coefficient, while increasing as the quality of videos increases. The results also indicate that, in the long run, a pure fee mode or free mode is the optimal strategy for online providers.

  12. On the robustness of two-stage estimators

    KAUST Repository

    Zhelonkin, Mikhail; Genton, Marc G.; Ronchetti, Elvezio

    2012-01-01

    The aim of this note is to provide a general framework for the analysis of the robustness properties of a broad class of two-stage models. We derive the influence function, the change-of-variance function, and the asymptotic variance of a general

  13. Technical Evaluation Report 27: Educational Wikis: Features and selection criteria

    Directory of Open Access Journals (Sweden)

    Jim Rudolph

    2004-04-01

    Full Text Available This report discusses the educational uses of the ‘wiki,’ an increasingly popular approach to online community development. Wikis are defined and compared with ‘blogging’ methods; characteristics of major wiki engines are described; and wiki features and selection criteria are examined.

  14. Features of tennis methods of teaching 5-6 years old children in the initial stages

    Directory of Open Access Journals (Sweden)

    E.V. Kurmaeva

    2014-06-01

    Full Text Available Purpose : theoretical and methodological justification for the existing teaching methods tennis of 5-6 years old children. Material : 17 special analysis and scientific and methodological sources. Results : the features of the existing methods of teaching children at an early stage of training. The main theses of the existing methods: 1 the training process is carried out in the form of games; 2 the level of general physical preparedness level exceeds special; 3 The first two years of the children do not participate in official competitions; 4 education of children begins with " School Ball ", with a gradual transition to employment with racket and ball; 5 training is built on two levels: theoretical - each " part" in the form of pre- formation of a mental model of rational behavior, and practical - the formation of the ability to perform motor actions. Conclusions : it was found that the existing methods of constructing the training process for children 5-6 years do not account for their physiological characteristics, therefore proposed to use computer technology and animation, that will shorten the formation of motor skills of children.

  15. Second feature of the matter two-point function

    Science.gov (United States)

    Tansella, Vittorio

    2018-05-01

    We point out the existence of a second feature in the matter two-point function, besides the acoustic peak, due to the baryon-baryon correlation in the early Universe and positioned at twice the distance of the peak. We discuss how the existence of this feature is implied by the well-known heuristic argument that explains the baryon bump in the correlation function. A standard χ2 analysis to estimate the detection significance of the second feature is mimicked. We conclude that, for realistic values of the baryon density, a SKA-like galaxy survey will not be able to detect this feature with standard correlation function analysis.

  16. Notes on the evolution of feature selection methodology

    Czech Academy of Sciences Publication Activity Database

    Somol, Petr; Novovičová, Jana; Pudil, Pavel

    2007-01-01

    Roč. 43, č. 5 (2007), s. 713-730 ISSN 0023-5954 R&D Projects: GA ČR GA102/07/1594; GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : feature selection * branch and bound * sequential search * mixture model Subject RIV: IN - Informatics, Computer Science Impact factor: 0.552, year: 2007

  17. EVALUATION OF A TWO-STAGE PASSIVE TREATMENT APPROACH FOR MINING INFLUENCE WATERS

    Science.gov (United States)

    A two-stage passive treatment approach was assessed at bench-scale using two Colorado Mining Influenced Waters (MIWs). The first-stage was a limestone drain with the purpose of removing iron and aluminum and mitigating the potential effects of mineral acidity. The second stage w...

  18. Feature-size dependent selective edge enhancement of x-ray images

    International Nuclear Information System (INIS)

    Herman, S.

    1988-01-01

    Morphological filters are nonlinear signal transformations that operate on a picture directly in the space domain. Such filters are based on the theory of mathematical morphology previously formulated. The filt4er being presented here features a ''mask'' operator (called a ''structuring element'' in some of the literature) which is a function of the two spatial coordinates x and y. The two basic mathematical operations are called ''masked erosion'' and ''masked dilation''. In the case of masked erosion the mask is passed over the input image in a raster pattern. At each position of the mask, the pixel values under the mask are multiplied by the mask pixel values. Then the output pixel value, located at the center position of the mask,is set equal to the minimum of the product of the mask and input values. Similarity, for masked dilation, the output pixel value is the maximum of the product of the input and the mask pixel values. The two basic processes of dilation and erosion can be used to construct the next level of operations the ''positive sieve'' (also called ''opening'') and the ''negative sieve'' (''closing''). The positive sieve modifies the peaks in the image whereas the negative sieve works on image valleys. The positive sieve is implemented by passing the output of the masked erosion step through the masked dilation function. The negative sieve reverses this procedure, using a dilation followed by an erosion. Each such sifting operator is characterized by a ''hole size''. It will be shown that the choice of hole size will select the range of pixel detail sizes which are to be enhanced. The shape of the mask will govern the shape of the enhancement. Finally positive sifting is used to enhance positive-going (peak) features, whereas negative enhances the negative-going (valley) landmarks

  19. Selection is stronger in early-versus-late stages of divergence in a Neotropical livebearing fish.

    Science.gov (United States)

    Ingley, Spencer J; Johnson, Jerald B

    2016-03-01

    How selection acts to drive trait evolution at different stages of divergence is of fundamental importance in our understanding of the origins of biodiversity. Yet, most studies have focused on a single point along an evolutionary trajectory. Here, we provide a case study evaluating the strength of divergent selection acting on life-history traits at early-versus-late stages of divergence in Brachyrhaphis fishes. We find that the difference in selection is stronger in the early-diverged population than the late-diverged population, and that trait differences acquired early are maintained over time. © 2016 The Author(s).

  20. Emotion of Physiological Signals Classification Based on TS Feature Selection

    Institute of Scientific and Technical Information of China (English)

    Wang Yujing; Mo Jianlin

    2015-01-01

    This paper propose a method of TS-MLP about emotion recognition of physiological signal.It can recognize emotion successfully by Tabu search which selects features of emotion’s physiological signals and multilayer perceptron that is used to classify emotion.Simulation shows that it has achieved good emotion classification performance.

  1. Transport fuels from two-stage coal liquefaction

    Energy Technology Data Exchange (ETDEWEB)

    Benito, A.; Cebolla, V.; Fernandez, I.; Martinez, M.T.; Miranda, J.L.; Oelert, H.; Prado, J.G. (Instituto de Carboquimica CSIC, Zaragoza (Spain))

    1994-03-01

    Four Spanish lignites and their vitrinite concentrates were evaluated for coal liquefaction. Correlationships between the content of vitrinite and conversion in direct liquefaction were observed for the lignites but not for the vitrinite concentrates. The most reactive of the four coals was processed in two-stage liquefaction at a higher scale. First-stage coal liquefaction was carried out in a continuous unit at Clausthal University at a temperature of 400[degree]C at 20 MPa hydrogen pressure and with anthracene oil as a solvent. The coal conversion obtained was 75.41% being 3.79% gases, 2.58% primary condensate and 69.04% heavy liquids. A hydroprocessing unit was built at the Instituto de Carboquimica for the second-stage coal liquefaction. Whole and deasphalted liquids from the first-stage liquefaction were processed at 450[degree]C and 10 MPa hydrogen pressure, with two commercial catalysts: Harshaw HT-400E (Co-Mo/Al[sub 2]O[sub 3]) and HT-500E (Ni-Mo/Al[sub 2]O[sub 3]). The effects of liquid hourly space velocity (LHSV), temperature, gas/liquid ratio and catalyst on the heteroatom liquids, and levels of 5 ppm of nitrogen and 52 ppm of sulphur were reached at 450[degree]C, 10 MPa hydrogen pressure, 0.08 kg H[sub 2]/kg feedstock and with Harshaw HT-500E catalyst. The liquids obtained were hydroprocessed again at 420[degree]C, 10 MPa hydrogen pressure and 0.06 kg H[sub 2]/kg feedstock to hydrogenate the aromatic structures. In these conditions, the aromaticity was reduced considerably, and 39% of naphthas and 35% of kerosene fractions were obtained. 18 refs., 4 figs., 4 tabs.

  2. Maximally efficient two-stage screening: Determining intellectual disability in Taiwanese military conscripts

    Directory of Open Access Journals (Sweden)

    Chia-Chang Chien

    2009-01-01

    Full Text Available Chia-Chang Chien1, Shu-Fen Huang1,2,3,4, For-Wey Lung1,2,3,41Department of Psychiatry, Kaohsiung Armed Forces General Hospital, Kaohsiung, Taiwan; 2Graduate Institute of Behavioral Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan; 3Department of Psychiatry, National Defense Medical Center, Taipei, Taiwan; 4Calo Psychiatric Center, Pingtung County, TaiwanObjective: The purpose of this study was to apply a two-stage screening method for the large-scale intelligence screening of military conscripts.Methods: We collected 99 conscripted soldiers whose educational levels were senior high school level or lower to be the participants. Every participant was required to take the Wisconsin Card Sorting Test (WCST and the Wechsler Adult Intelligence Scale-Revised (WAIS-R assessments.Results: Logistic regression analysis showed the conceptual level responses (CLR index of the WCST was the most significant index for determining intellectual disability (ID; FIQ ≤ 84. We used the receiver operating characteristic curve to determine the optimum cut-off point of CLR. The optimum one cut-off point of CLR was 66; the two cut-off points were 49 and 66. Comparing the two-stage window screening with the two-stage positive screening, the area under the curve and the positive predictive value increased. Moreover, the cost of the two-stage window screening decreased by 59%.Conclusion: The two-stage window screening is more accurate and economical than the two-stage positive screening. Our results provide an example for the use of two-stage screening and the possibility of the WCST to replace WAIS-R in large-scale screenings for ID in the future.Keywords: intellectual disability, intelligence screening, two-stage positive screening, Wisconsin Card Sorting Test, Wechsler Adult Intelligence Scale-Revised

  3. Two-stage model of development of heterogeneous uranium-lead systems in zircon

    International Nuclear Information System (INIS)

    Mel'nikov, N.N.; Zevchenkov, O.A.

    1985-01-01

    Behaviour of isotope systems of multiphase zircons at their two-stage distortion is considered. The results of calculations testify to the fact that linear correlations on the diagram with concordance can be explained including two-stage discovery of U-Pb systems of cogenetic zircons if zircon is considered physically heterogeneous and losing in its different part different ratios of accumulated radiogenic lead. ''Metamorphism ages'' obtained by these two-stage opening zircons are intermediate, and they not have geochronological significance while ''crystallization ages'' remain rather close to real ones. Two-stage opening zircons in some cases can be diagnosed by discordance of their crystal component

  4. Discrete Biogeography Based Optimization for Feature Selection in Molecular Signatures.

    Science.gov (United States)

    Liu, Bo; Tian, Meihong; Zhang, Chunhua; Li, Xiangtao

    2015-04-01

    Biomarker discovery from high-dimensional data is a complex task in the development of efficient cancer diagnoses and classification. However, these data are usually redundant and noisy, and only a subset of them present distinct profiles for different classes of samples. Thus, selecting high discriminative genes from gene expression data has become increasingly interesting in the field of bioinformatics. In this paper, a discrete biogeography based optimization is proposed to select the good subset of informative gene relevant to the classification. In the proposed algorithm, firstly, the fisher-markov selector is used to choose fixed number of gene data. Secondly, to make biogeography based optimization suitable for the feature selection problem; discrete migration model and discrete mutation model are proposed to balance the exploration and exploitation ability. Then, discrete biogeography based optimization, as we called DBBO, is proposed by integrating discrete migration model and discrete mutation model. Finally, the DBBO method is used for feature selection, and three classifiers are used as the classifier with the 10 fold cross-validation method. In order to show the effective and efficiency of the algorithm, the proposed algorithm is tested on four breast cancer dataset benchmarks. Comparison with genetic algorithm, particle swarm optimization, differential evolution algorithm and hybrid biogeography based optimization, experimental results demonstrate that the proposed method is better or at least comparable with previous method from literature when considering the quality of the solutions obtained. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Diagnostic value of sleep stage dissociation as visualized on a 2-dimensional sleep state space in human narcolepsy

    DEFF Research Database (Denmark)

    Olsen, Anders Vinther; Stephansen, Jens; Leary, Eileen B.

    2017-01-01

    Type 1 narcolepsy (NT1) is characterized by symptoms believed to represent Rapid Eye Movement (REM) sleep stage dissociations, occurrences where features of wake and REM sleep are intermingled, resulting in a mixed state. We hypothesized that sleep stage dissociations can be objectively detected...... through the analysis of nocturnal Polysomnography (PSG) data, and that those affecting REM sleep can be used as a diagnostic feature for narcolepsy. A Linear Discriminant Analysis (LDA) model using 38 features extracted from EOG, EMG and EEG was used in control subjects to select features differentiating...... wake, stage N1, N2, N3 and REM sleep. Sleep stage differentiation was next represented in a 2D projection. Features characteristic of sleep stage differences were estimated from the residual sleep stage probability in the 2D space. Using this model we evaluated PSG data from NT1 and non...

  6. Two technicians apply insulation to S-II second stage

    Science.gov (United States)

    1964-01-01

    Two technicians apply insulation to the outer surface of the S-II second stage booster for the Saturn V moon rocket. The towering 363-foot Saturn V was a multi-stage, multi-engine launch vehicle standing taller than the Statue of Liberty. Altogether, the Saturn V engines produced as much power as 85 Hoover Dams.

  7. Recent Development of the Two-Stroke Engine. II - Design Features. 2; Design Features

    Science.gov (United States)

    Zeman, J.

    1945-01-01

    Completing the first paper dealing with charging methods and arrangements, the present paper discusses the design forms of two-stroke engines. Features which largely influence piston running are: (a) The shape and surface condition of the sliding parts. (b) The cylinder and piston materials. (c) Heat conditions in the piston, and lubrication. There is little essential difference between four-stroke and two-stroke engines with ordinary pistons. In large engines, for example, are always found separately cast or welded frames in which the stresses are taken up by tie rods. Twin piston and timing piston engines often differ from this design. Examples can be found in many engines of German or foreign make. Their methods of operation will be dealt with in the third part of the present paper, which also includes the bibliography. The development of two-stroke engine design is, of course, mainly concerned with such features as are inherently difficult to master; that is, the piston barrel and the design of the gudgeon pin bearing. Designers of four-stroke engines now-a-days experience approximately the same difficulties, since heat stresses have increased to the point of influencing conditions in the piston barrel. Features which notably affect this are: (a) The material. (b) Prevailing heat conditions.

  8. Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

    Directory of Open Access Journals (Sweden)

    Nicole A Capela

    Full Text Available Human activity recognition (HAR, using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter. The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree. Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.

  9. Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

    Science.gov (United States)

    Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie

    2015-01-01

    Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.

  10. Feature selection for neural network based defect classification of ceramic components using high frequency ultrasound.

    Science.gov (United States)

    Kesharaju, Manasa; Nagarajah, Romesh

    2015-09-01

    The motivation for this research stems from a need for providing a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make possible the checking of each ceramic component and immediately alert the operator about the presence of defects. Generally, in many classification problems a choice of features or dimensionality reduction is significant and simultaneously very difficult, as a substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms are used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially wavelet based feature extraction is implemented from the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to improved performance in terms of classification percentage with 96% than Genetic algorithm with 94%. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Mining potential biomarkers associated with space flight in Caenorhabditis elegans experienced Shenzhou-8 mission with multiple feature selection techniques

    International Nuclear Information System (INIS)

    Zhao, Lei; Gao, Ying; Mi, Dong; Sun, Yeqing

    2016-01-01

    Highlights: • A combined algorithm is proposed to mine biomarkers of spaceflight in C. elegans. • This algorithm makes the feature selection more reliable and robust. • Apply this algorithm to predict 17 positive biomarkers to space environment stress. • The strategy can be used as a general method to select important features. - Abstract: To identify the potential biomarkers associated with space flight, a combined algorithm, which integrates the feature selection techniques, was used to deal with the microarray datasets of Caenorhabditis elegans obtained in the Shenzhou-8 mission. Compared with the ground control treatment, a total of 86 differentially expressed (DE) genes in responses to space synthetic environment or space radiation environment were identified by two filter methods. And then the top 30 ranking genes were selected by the random forest algorithm. Gene Ontology annotation and functional enrichment analyses showed that these genes were mainly associated with metabolism process. Furthermore, clustering analysis showed that 17 genes among these are positive, including 9 for space synthetic environment and 8 for space radiation environment only. These genes could be used as the biomarkers to reflect the space environment stresses. In addition, we also found that microgravity is the main stress factor to change the expression patterns of biomarkers for the short-duration spaceflight.

  12. Mining potential biomarkers associated with space flight in Caenorhabditis elegans experienced Shenzhou-8 mission with multiple feature selection techniques

    Energy Technology Data Exchange (ETDEWEB)

    Zhao, Lei [Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026 (China); Gao, Ying [Center of Medical Physics and Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Shushanhu Road 350, Hefei 230031 (China); Mi, Dong, E-mail: mid@dlmu.edu.cn [Department of Physics, Dalian Maritime University, Dalian 116026 (China); Sun, Yeqing, E-mail: yqsun@dlmu.edu.cn [Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026 (China)

    2016-09-15

    Highlights: • A combined algorithm is proposed to mine biomarkers of spaceflight in C. elegans. • This algorithm makes the feature selection more reliable and robust. • Apply this algorithm to predict 17 positive biomarkers to space environment stress. • The strategy can be used as a general method to select important features. - Abstract: To identify the potential biomarkers associated with space flight, a combined algorithm, which integrates the feature selection techniques, was used to deal with the microarray datasets of Caenorhabditis elegans obtained in the Shenzhou-8 mission. Compared with the ground control treatment, a total of 86 differentially expressed (DE) genes in responses to space synthetic environment or space radiation environment were identified by two filter methods. And then the top 30 ranking genes were selected by the random forest algorithm. Gene Ontology annotation and functional enrichment analyses showed that these genes were mainly associated with metabolism process. Furthermore, clustering analysis showed that 17 genes among these are positive, including 9 for space synthetic environment and 8 for space radiation environment only. These genes could be used as the biomarkers to reflect the space environment stresses. In addition, we also found that microgravity is the main stress factor to change the expression patterns of biomarkers for the short-duration spaceflight.

  13. Magnetic Resonance Imaging Features of the Nigrostriatal System: Biomarkers of Parkinson’s Disease Stages?

    Science.gov (United States)

    Hopes, Lucie; Grolez, Guillaume; Moreau, Caroline; Lopes, Renaud; Ryckewaert, Gilles; Carrière, Nicolas; Auger, Florent; Laloux, Charlotte; Petrault, Maud; Devedjian, Jean-Christophe; Bordet, Regis; Defebvre, Luc; Jissendi, Patrice; Delmaire, Christine; Devos, David

    2016-01-01

    Introduction Magnetic resonance imaging (MRI) can be used to identify biomarkers in Parkinson’s disease (PD); R2* values reflect iron content related to high levels of oxidative stress, whereas volume and/or shape changes reflect neuronal death. We sought to assess iron overload in the nigrostriatal system and characterize its relationship with focal and overall atrophy of the striatum in the pivotal stages of PD. Methods Twenty controls and 70 PD patients at different disease stages (untreated de novo patients, treated early-stage patients and advanced-stage patients with L-dopa-related motor complications) were included in the study. We determined the R2* values in the substantia nigra, putamen and caudate nucleus, together with striatal volume and shape analysis. We also measured R2* in an acute MPTP mouse model and in a longitudinal follow-up two years later in the early-stage PD patients. Results The R2* values in the substantia nigra, putamen and caudate nucleus were significantly higher in de novo PD patients than in controls. Early-stage patients displayed significantly higher R2* values in the substantia nigra (with changes in striatal shape), relative to de novo patients. Measurements after a two-year follow-up in early-stage patients and characterization of the acute MPTP mouse model confirmed that R2* changed rapidly with disease progression. Advanced-stage patients displayed significant atrophy of striatum, relative to earlier disease stages. Conclusion Each pivotal stage in PD appears to be characterized by putative nigrostriatal MRI biomarkers: iron overload at the de novo stage, striatal shape changes at early-stage disease and generalized striatal atrophy at advanced disease. PMID:27035571

  14. Investigation of Power Losses of Two-Stage Two-Phase Converter with Two-Phase Motor

    Directory of Open Access Journals (Sweden)

    Michal Prazenica

    2011-01-01

    Full Text Available The paper deals with determination of losses of two-stage power electronic system with two-phase variable orthogonal output. The simulation is focused on the investigation of losses in the converter during one period in steady-state operation. Modeling and simulation of two matrix converters with R-L load is shown in the paper. The simulation results confirm a very good time-waveform of the phase current and the system seems to be suitable for low-cost application in automotive/aerospace industries and in application with high frequency voltage sources.

  15. The co-feature ratio, a novel method for the measurement of chromatographic and signal selectivity in LC-MS-based metabolomics

    International Nuclear Information System (INIS)

    Elmsjö, Albert; Haglöf, Jakob; Engskog, Mikael K.R.; Nestor, Marika; Arvidsson, Torbjörn; Pettersson, Curt

    2017-01-01

    Evaluation of analytical procedures, especially in regards to measuring chromatographic and signal selectivity, is highly challenging in untargeted metabolomics. The aim of this study was to suggest a new straightforward approach for a systematic examination of chromatographic and signal selectivity in LC-MS-based metabolomics. By calculating the ratio between each feature and its co-eluting features (the co-features), a measurement of the chromatographic selectivity (i.e. extent of co-elution) as well as the signal selectivity (e.g. amount of adduct formation) of each feature could be acquired, the co-feature ratio. This approach was used to examine possible differences in chromatographic and signal selectivity present in samples exposed to three different sample preparation procedures. The capability of the co-feature ratio was evaluated both in a classical targeted setting using isotope labelled standards as well as without standards in an untargeted setting. For the targeted analysis, several metabolites showed a skewed quantitative signal due to poor chromatographic selectivity and/or poor signal selectivity. Moreover, evaluation of the untargeted approach through multivariate analysis of the co-feature ratios demonstrated the possibility to screen for metabolites displaying poor chromatographic and/or signal selectivity characteristics. We conclude that the co-feature ratio can be a useful tool in the development and evaluation of analytical procedures in LC-MS-based metabolomics investigations. Increased selectivity through proper choice of analytical procedures may decrease the false positive and false negative discovery rate and thereby increase the validity of any metabolomic investigation. - Highlights: • The co-feature ratio (CFR) is introduced. • CFR measures chromatographic and signal selectivity of a feature. • CFR can be used for evaluating experimental procedures in metabolomics. • CFR can aid in locating features with poor selectivity.

  16. The co-feature ratio, a novel method for the measurement of chromatographic and signal selectivity in LC-MS-based metabolomics

    Energy Technology Data Exchange (ETDEWEB)

    Elmsjö, Albert, E-mail: Albert.Elmsjo@farmkemi.uu.se [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden); Haglöf, Jakob; Engskog, Mikael K.R. [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden); Nestor, Marika [Department of Immunology, Genetics and Pathology, Uppsala University (Sweden); Arvidsson, Torbjörn [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden); Medical Product Agency, Uppsala (Sweden); Pettersson, Curt [Department of Medicinal Chemistry, Division of Analytical Pharmaceutical Chemistry, Uppsala University (Sweden)

    2017-03-01

    Evaluation of analytical procedures, especially in regards to measuring chromatographic and signal selectivity, is highly challenging in untargeted metabolomics. The aim of this study was to suggest a new straightforward approach for a systematic examination of chromatographic and signal selectivity in LC-MS-based metabolomics. By calculating the ratio between each feature and its co-eluting features (the co-features), a measurement of the chromatographic selectivity (i.e. extent of co-elution) as well as the signal selectivity (e.g. amount of adduct formation) of each feature could be acquired, the co-feature ratio. This approach was used to examine possible differences in chromatographic and signal selectivity present in samples exposed to three different sample preparation procedures. The capability of the co-feature ratio was evaluated both in a classical targeted setting using isotope labelled standards as well as without standards in an untargeted setting. For the targeted analysis, several metabolites showed a skewed quantitative signal due to poor chromatographic selectivity and/or poor signal selectivity. Moreover, evaluation of the untargeted approach through multivariate analysis of the co-feature ratios demonstrated the possibility to screen for metabolites displaying poor chromatographic and/or signal selectivity characteristics. We conclude that the co-feature ratio can be a useful tool in the development and evaluation of analytical procedures in LC-MS-based metabolomics investigations. Increased selectivity through proper choice of analytical procedures may decrease the false positive and false negative discovery rate and thereby increase the validity of any metabolomic investigation. - Highlights: • The co-feature ratio (CFR) is introduced. • CFR measures chromatographic and signal selectivity of a feature. • CFR can be used for evaluating experimental procedures in metabolomics. • CFR can aid in locating features with poor selectivity.

  17. A consistency-based feature selection method allied with linear SVMs for HIV-1 protease cleavage site prediction.

    Directory of Open Access Journals (Sweden)

    Orkun Oztürk

    Full Text Available BACKGROUND: Predicting type-1 Human Immunodeficiency Virus (HIV-1 protease cleavage site in protein molecules and determining its specificity is an important task which has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug design (especially for HIV-1 protease inhibitors against this life-threatening virus. However, some drawbacks (like the shortage of the available training data and the high dimensionality of the feature space turn this task into a difficult classification problem. Thus, various machine learning techniques, and specifically several classification methods have been proposed in order to increase the accuracy of the classification model. In addition, for several classification problems, which are characterized by having few samples and many features, selecting the most relevant features is a major factor for increasing classification accuracy. RESULTS: We propose for HIV-1 data a consistency-based feature selection approach in conjunction with recursive feature elimination of support vector machines (SVMs. We used various classifiers for evaluating the results obtained from the feature selection process. We further demonstrated the effectiveness of our proposed method by comparing it with a state-of-the-art feature selection method applied on HIV-1 data, and we evaluated the reported results based on attributes which have been selected from different combinations. CONCLUSION: Applying feature selection on training data before realizing the classification task seems to be a reasonable data-mining process when working with types of data similar to HIV-1. On HIV-1 data, some feature selection or extraction operations in conjunction with different classifiers have been tested and noteworthy outcomes have been reported. These facts motivate for the work presented in this paper. SOFTWARE AVAILABILITY: The software is available at http

  18. Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading.

    Science.gov (United States)

    Sahran, Shahnorbanun; Albashish, Dheeb; Abdullah, Azizi; Shukor, Nordashima Abd; Hayati Md Pauzi, Suria

    2018-04-18

    Feature selection (FS) methods are widely used in grading and diagnosing prostate histopathological images. In this context, FS is based on the texture features obtained from the lumen, nuclei, cytoplasm and stroma, all of which are important tissue components. However, it is difficult to represent the high-dimensional textures of these tissue components. To solve this problem, we propose a new FS method that enables the selection of features with minimal redundancy in the tissue components. We categorise tissue images based on the texture of individual tissue components via the construction of a single classifier and also construct an ensemble learning model by merging the values obtained by each classifier. Another issue that arises is overfitting due to the high-dimensional texture of individual tissue components. We propose a new FS method, SVM-RFE(AC), that integrates a Support Vector Machine-Recursive Feature Elimination (SVM-RFE) embedded procedure with an absolute cosine (AC) filter method to prevent redundancy in the selected features of the SV-RFE and an unoptimised classifier in the AC. We conducted experiments on H&E histopathological prostate and colon cancer images with respect to three prostate classifications, namely benign vs. grade 3, benign vs. grade 4 and grade 3 vs. grade 4. The colon benchmark dataset requires a distinction between grades 1 and 2, which are the most difficult cases to distinguish in the colon domain. The results obtained by both the single and ensemble classification models (which uses the product rule as its merging method) confirm that the proposed SVM-RFE(AC) is superior to the other SVM and SVM-RFE-based methods. We developed an FS method based on SVM-RFE and AC and successfully showed that its use enabled the identification of the most crucial texture feature of each tissue component. Thus, it makes possible the distinction between multiple Gleason grades (e.g. grade 3 vs. grade 4) and its performance is far superior to

  19. Artificial immune system and sheep flock algorithms for two-stage fixed-charge transportation problem

    DEFF Research Database (Denmark)

    Kannan, Devika; Govindan, Kannan; Soleimani, Hamed

    2014-01-01

    In this paper, we cope with a two-stage distribution planning problem of supply chain regarding fixed charges. The focus of the paper is on developing efficient solution methodologies of the selected NP-hard problem. Based on computational limitations, common exact and approximation solution...... approaches are unable to solve real-world instances of such NP-hard problems in a reasonable time. These approaches involve cumbersome computational steps in real-size cases. In order to solve the mixed integer linear programming model, we develop an artificial immune system and a sheep flock algorithm...

  20. Device for two-stage cementing of casing

    Energy Technology Data Exchange (ETDEWEB)

    Kudimov, D A; Goncharevskiy, Ye N; Luneva, L G; Shchelochkov, S N; Shil' nikova, L N; Tereshchenko, V G; Vasiliev, V A; Volkova, V V; Zhdokov, K I

    1981-01-01

    A device is claimed for two-stage cementing of casing. It consists of a body with lateral plugging vents, upper and lower movable sleeves, a check valve with axial channels that's situated in the lower sleeve, and a displacement limiting device for the lower sleeve. To improve the cementing process of the casing by preventing overflow of cementing fluids from the annular space into the first stage casing, the limiter is equipped with a spring rod that is capable of covering the axial channels of the check valve while it's in an operating mode. In addition, the rod in the upper part is equipped with a reinforced area under the axial channels of the check valve.

  1. An adaptive two-stage analog/regression model for probabilistic prediction of small-scale precipitation in France

    Science.gov (United States)

    Chardon, Jérémy; Hingray, Benoit; Favre, Anne-Catherine

    2018-01-01

    Statistical downscaling models (SDMs) are often used to produce local weather scenarios from large-scale atmospheric information. SDMs include transfer functions which are based on a statistical link identified from observations between local weather and a set of large-scale predictors. As physical processes driving surface weather vary in time, the most relevant predictors and the regression link are likely to vary in time too. This is well known for precipitation for instance and the link is thus often estimated after some seasonal stratification of the data. In this study, we present a two-stage analog/regression model where the regression link is estimated from atmospheric analogs of the current prediction day. Atmospheric analogs are identified from fields of geopotential heights at 1000 and 500 hPa. For the regression stage, two generalized linear models are further used to model the probability of precipitation occurrence and the distribution of non-zero precipitation amounts, respectively. The two-stage model is evaluated for the probabilistic prediction of small-scale precipitation over France. It noticeably improves the skill of the prediction for both precipitation occurrence and amount. As the analog days vary from one prediction day to another, the atmospheric predictors selected in the regression stage and the value of the corresponding regression coefficients can vary from one prediction day to another. The model allows thus for a day-to-day adaptive and tailored downscaling. It can also reveal specific predictors for peculiar and non-frequent weather configurations.

  2. Optimization of Two-Stage Peltier Modules: Structure and Exergetic Efficiency

    Directory of Open Access Journals (Sweden)

    Cesar Ramirez-Lopez

    2012-08-01

    Full Text Available In this paper we undertake the theoretical analysis of a two-stage semiconductor thermoelectric module (TEM which contains an arbitrary and different number of thermocouples, n1 and n2, in each stage (pyramid-styled TEM. The analysis is based on a dimensionless entropy balance set of equations. We study the effects of n1 and n2, the flowing electric currents through each stage, the applied temperatures and the thermoelectric properties of the semiconductor materials on the exergetic efficiency. Our main result implies that the electric currents flowing in each stage must necessarily be different with a ratio about 4.3 if the best thermal performance and the highest temperature difference possible between the cold and hot side of the device are pursued. This fact had not been pointed out before for pyramid-styled two stage TEM. The ratio n1/n2 should be about 8.

  3. Biometric templates selection and update using quality measures

    Science.gov (United States)

    Abboud, Ali J.; Jassim, Sabah A.

    2012-06-01

    To deal with severe variation in recording conditions, most biometric systems acquire multiple biometric samples, at the enrolment stage, for the same person and then extract their individual biometric feature vectors and store them in the gallery in the form of biometric template(s), labelled with the person's identity. The number of samples/templates and the choice of the most appropriate templates influence the performance of the system. The desired biometric template(s) selection technique must aim to control the run time and storage requirements while improving the recognition accuracy of the biometric system. This paper is devoted to elaborating on and discussing a new two stages approach for biometric templates selection and update. This approach uses a quality-based clustering, followed by a special criterion for the selection of an ultimate set of biometric templates from the various clusters. This approach is developed to select adaptively a specific number of templates for each individual. The number of biometric templates depends mainly on the performance of each individual (i.e. gallery size should be optimised to meet the needs of each target individual). These experiments have been conducted on two face image databases and their results will demonstrate the effectiveness of proposed quality-guided approach.

  4. Two stage treatment of dairy effluent using immobilized Chlorella pyrenoidosa

    Science.gov (United States)

    2013-01-01

    Background Dairy effluents contains high organic load and unscrupulous discharge of these effluents into aquatic bodies is a matter of serious concern besides deteriorating their water quality. Whilst physico-chemical treatment is the common mode of treatment, immobilized microalgae can be potentially employed to treat high organic content which offer numerous benefits along with waste water treatment. Methods A novel low cost two stage treatment was employed for the complete treatment of dairy effluent. The first stage consists of treating the diary effluent in a photobioreactor (1 L) using immobilized Chlorella pyrenoidosa while the second stage involves a two column sand bed filtration technique. Results Whilst NH4+-N was completely removed, a 98% removal of PO43--P was achieved within 96 h of two stage purification processes. The filtrate was tested for toxicity and no mortality was observed in the zebra fish which was used as a model at the end of 96 h bioassay. Moreover, a significant decrease in biological oxygen demand and chemical oxygen demand was achieved by this novel method. Also the biomass separated was tested as a biofertilizer to the rice seeds and a 30% increase in terms of length of root and shoot was observed after the addition of biomass to the rice plants. Conclusions We conclude that the two stage treatment of dairy effluent is highly effective in removal of BOD and COD besides nutrients like nitrates and phosphates. The treatment also helps in discharging treated waste water safely into the receiving water bodies since it is non toxic for aquatic life. Further, the algal biomass separated after first stage of treatment was highly capable of increasing the growth of rice plants because of nitrogen fixation ability of the green alga and offers a great potential as a biofertilizer. PMID:24355316

  5. GAIN RATIO BASED FEATURE SELECTION METHOD FOR PRIVACY PRESERVATION

    Directory of Open Access Journals (Sweden)

    R. Praveena Priyadarsini

    2011-04-01

    Full Text Available Privacy-preservation is a step in data mining that tries to safeguard sensitive information from unsanctioned disclosure and hence protecting individual data records and their privacy. There are various privacy preservation techniques like k-anonymity, l-diversity and t-closeness and data perturbation. In this paper k-anonymity privacy protection technique is applied to high dimensional datasets like adult and census. since, both the data sets are high dimensional, feature subset selection method like Gain Ratio is applied and the attributes of the datasets are ranked and low ranking attributes are filtered to form new reduced data subsets. K-anonymization privacy preservation technique is then applied on reduced datasets. The accuracy of the privacy preserved reduced datasets and the original datasets are compared for their accuracy on the two functionalities of data mining namely classification and clustering using naïve Bayesian and k-means algorithm respectively. Experimental results show that classification and clustering accuracy are comparatively the same for reduced k-anonym zed datasets and the original data sets.

  6. Model-Based Learning of Local Image Features for Unsupervised Texture Segmentation

    Science.gov (United States)

    Kiechle, Martin; Storath, Martin; Weinmann, Andreas; Kleinsteuber, Martin

    2018-04-01

    Features that capture well the textural patterns of a certain class of images are crucial for the performance of texture segmentation methods. The manual selection of features or designing new ones can be a tedious task. Therefore, it is desirable to automatically adapt the features to a certain image or class of images. Typically, this requires a large set of training images with similar textures and ground truth segmentation. In this work, we propose a framework to learn features for texture segmentation when no such training data is available. The cost function for our learning process is constructed to match a commonly used segmentation model, the piecewise constant Mumford-Shah model. This means that the features are learned such that they provide an approximately piecewise constant feature image with a small jump set. Based on this idea, we develop a two-stage algorithm which first learns suitable convolutional features and then performs a segmentation. We note that the features can be learned from a small set of images, from a single image, or even from image patches. The proposed method achieves a competitive rank in the Prague texture segmentation benchmark, and it is effective for segmenting histological images.

  7. Automatic Diagnose of the Stages of Breast Cancer using Intelligent Technique

    Science.gov (United States)

    Jiji, G. W.; Marsilin, J. R.

    2012-12-01

    The objective of this paper is to locate the stage of the breast cancer and to locate the tumor tissues. The proposed work is performed in two stages. In the first stage, location of tumour is done and in the second phase retrieving of the tumor images and its stages based on the "query input". Texture and Shape features are used here as Feature descriptors. Shape features used are asymmetry, aspect ratio, eccentricity and Bending Energy. Texture features used are contrast, energy and Gabor Filter. KNN classifier is used for classification and for Pattern matching, Euclidean distance is used. The proposed approach which gives 95.6% accuracy has been compared with earlier approaches. The proposed approach was tested with 32 samples and got annotated by radiologists.

  8. Sleep Stage Transition Dynamics Reveal Specific Stage 2 Vulnerability in Insomnia.

    Science.gov (United States)

    Wei, Yishul; Colombo, Michele A; Ramautar, Jennifer R; Blanken, Tessa F; van der Werf, Ysbrand D; Spiegelhalder, Kai; Feige, Bernd; Riemann, Dieter; Van Someren, Eus J W

    2017-09-01

    Objective sleep impairments in insomnia disorder (ID) are insufficiently understood. The present study evaluated whether whole-night sleep stage dynamics derived from polysomnography (PSG) differ between people with ID and matched controls and whether sleep stage dynamic features discriminate them better than conventional sleep parameters. Eighty-eight participants aged 21-70 years, including 46 with ID and 42 age- and sex-matched controls without sleep complaints, were recruited through www.sleepregistry.nl and completed two nights of laboratory PSG. Data of 100 people with ID and 100 age- and sex-matched controls from a previously reported study were used to validate the generalizability of findings. The second night was used to obtain, in addition to conventional sleep parameters, probabilities of transitions between stages and bout duration distributions of each stage. Group differences were evaluated with nonparametric tests. People with ID showed higher empirical probabilities to transition from stage N2 to the lighter sleep stage N1 or wakefulness and a faster decaying stage N2 bout survival function. The increased transition probability from stage N2 to stage N1 discriminated people with ID better than any of their deviations in conventional sleep parameters, including less total sleep time, less sleep efficiency, more stage N1, and more wake after sleep onset. Moreover, adding this transition probability significantly improved the discriminating power of a multiple logistic regression model based on conventional sleep parameters. Quantification of sleep stage dynamics revealed a particular vulnerability of stage N2 in insomnia. The feature characterizes insomnia better than-and independently of-any conventional sleep parameter. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved. For permissions, please e-mail journals.permissions@oup.com.

  9. Feature Selection and Kernel Learning for Local Learning-Based Clustering.

    Science.gov (United States)

    Zeng, Hong; Cheung, Yiu-ming

    2011-08-01

    The performance of the most clustering algorithms highly relies on the representation of data in the input space or the Hilbert space of kernel methods. This paper is to obtain an appropriate data representation through feature selection or kernel learning within the framework of the Local Learning-Based Clustering (LLC) (Wu and Schölkopf 2006) method, which can outperform the global learning-based ones when dealing with the high-dimensional data lying on manifold. Specifically, we associate a weight to each feature or kernel and incorporate it into the built-in regularization of the LLC algorithm to take into account the relevance of each feature or kernel for the clustering. Accordingly, the weights are estimated iteratively in the clustering process. We show that the resulting weighted regularization with an additional constraint on the weights is equivalent to a known sparse-promoting penalty. Hence, the weights of those irrelevant features or kernels can be shrunk toward zero. Extensive experiments show the efficacy of the proposed methods on the benchmark data sets.

  10. Improving neuromodulation technique for refractory voiding dysfunctions: two-stage implant.

    Science.gov (United States)

    Janknegt, R A; Weil, E H; Eerdmans, P H

    1997-03-01

    Neuromodulation is a new technique that uses electrical stimulation of the sacral nerves for patients with refractory urinary urge/frequency or urge-incontinence, and some forms of urinary retention. The limiting factor for receiving an implant is often a failure of the percutaneous nerve evaluation (PNE) test. Present publications mention only about a 50% success score for PNE of all patients, although the micturition diaries and urodynamic parameters are similar. We wanted to investigate whether PNE results improved by using a permanent electrode as a PNE test. This would show that improvement of the PNE technique is feasible. In 10 patients where the original PNE had failed to improve the micturition diary parameters more than 50%, a permanent electrode was implanted by operation. It was connected to an external stimulator. In those cases where the patients improved according to their micturition diary by more than 50% during a period of 4 days, the external stimulator was replaced by a permanent subcutaneous neurostimulator. Eight of the 10 patients had a good to very good result (60% to 90% improvement) during the testing period and received their implant 5 to 14 days after the first stage. The good results of the two-stage implant technique we used indicate that the development of better PNE electrodes may lead to an improvement of the testing technique and better selection between nonresponders and technical failures.

  11. Repetitive, small-bore two-stage light gas gun

    International Nuclear Information System (INIS)

    Combs, S.K.; Foust, C.R.; Fehling, D.T.; Gouge, M.J.; Milora, S.L.

    1991-01-01

    A repetitive two-stage light gas gun for high-speed pellet injection has been developed at Oak Ridge National Laboratory. In general, applications of the two-stage light gas gun have been limited to only single shots, with a finite time (at least minutes) needed for recovery and preparation for the next shot. The new device overcomes problems associated with repetitive operation, including rapidly evacuating the propellant gases, reloading the gun breech with a new projectile, returning the piston to its initial position, and refilling the first- and second-stage gas volumes to the appropriate pressure levels. In addition, some components are subjected to and must survive severe operating conditions, which include rapid cycling to high pressures and temperatures (up to thousands of bars and thousands of kelvins) and significant mechanical shocks. Small plastic projectiles (4-mm nominal size) and helium gas have been used in the prototype device, which was equipped with a 1-m-long pump tube and a 1-m-long gun barrel, to demonstrate repetitive operation (up to 1 Hz) at relatively high pellet velocities (up to 3000 m/s). The equipment is described, and experimental results are presented. 124 refs., 6 figs., 5 tabs

  12. Two-Stage Liver Transplantation with Temporary Porto-Middle Hepatic Vein Shunt

    Directory of Open Access Journals (Sweden)

    Giovanni Varotti

    2010-01-01

    Full Text Available Two-stage liver transplantation (LT has been reported for cases of fulminant liver failure that can lead to toxic hepatic syndrome, or massive hemorrhages resulting in uncontrollable bleeding. Technically, the first stage of the procedure consists of a total hepatectomy with preservation of the recipient's inferior vena cava (IVC, followed by the creation of a temporary end-to-side porto-caval shunt (TPCS. The second stage consists of removing the TPCS and implanting a liver graft when one becomes available. We report a case of a two-stage total hepatectomy and LT in which a temporary end-to-end anastomosis between the portal vein and the middle hepatic vein (TPMHV was performed as an alternative to the classic end-to-end TPCS. The creation of a TPMHV proved technically feasible and showed some advantages compared to the standard TPCS. In cases in which a two-stage LT with side-to-side caval reconstruction is utilized, TPMHV can be considered as a safe and effective alternative to standard TPCS.

  13. Two-stage perceptual learning to break visual crowding.

    Science.gov (United States)

    Zhu, Ziyun; Fan, Zhenzhi; Fang, Fang

    2016-01-01

    When a target is presented with nearby flankers in the peripheral visual field, it becomes harder to identify, which is referred to as crowding. Crowding sets a fundamental limit of object recognition in peripheral vision, preventing us from fully appreciating cluttered visual scenes. We trained adult human subjects on a crowded orientation discrimination task and investigated whether crowding could be completely eliminated by training. We discovered a two-stage learning process with this training task. In the early stage, when the target and flankers were separated beyond a certain distance, subjects acquired a relatively general ability to break crowding, as evidenced by the fact that the breaking of crowding could transfer to another crowded orientation, even a crowded motion stimulus, although the transfer to the opposite visual hemi-field was weak. In the late stage, like many classical perceptual learning effects, subjects' performance gradually improved and showed specificity to the trained orientation. We also found that, when the target and flankers were spaced too finely, training could only reduce, rather than completely eliminate, the crowding effect. This two-stage learning process illustrates a learning strategy for our brain to deal with the notoriously difficult problem of identifying peripheral objects in clutter. The brain first learned to solve the "easy and general" part of the problem (i.e., improving the processing resolution and segmenting the target and flankers) and then tackle the "difficult and specific" part (i.e., refining the representation of the target).

  14. Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests.

    Science.gov (United States)

    Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A

    2017-09-15

    Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations p≫n , these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. Code

  15. A two-stage approach for improved prediction of residue contact maps

    Directory of Open Access Journals (Sweden)

    Pollastri Gianluca

    2006-03-01

    Full Text Available Abstract Background Protein topology representations such as residue contact maps are an important intermediate step towards ab initio prediction of protein structure. Although improvements have occurred over the last years, the problem of accurately predicting residue contact maps from primary sequences is still largely unsolved. Among the reasons for this are the unbalanced nature of the problem (with far fewer examples of contacts than non-contacts, the formidable challenge of capturing long-range interactions in the maps, the intrinsic difficulty of mapping one-dimensional input sequences into two-dimensional output maps. In order to alleviate these problems and achieve improved contact map predictions, in this paper we split the task into two stages: the prediction of a map's principal eigenvector (PE from the primary sequence; the reconstruction of the contact map from the PE and primary sequence. Predicting the PE from the primary sequence consists in mapping a vector into a vector. This task is less complex than mapping vectors directly into two-dimensional matrices since the size of the problem is drastically reduced and so is the scale length of interactions that need to be learned. Results We develop architectures composed of ensembles of two-layered bidirectional recurrent neural networks to classify the components of the PE in 2, 3 and 4 classes from protein primary sequence, predicted secondary structure, and hydrophobicity interaction scales. Our predictor, tested on a non redundant set of 2171 proteins, achieves classification performances of up to 72.6%, 16% above a base-line statistical predictor. We design a system for the prediction of contact maps from the predicted PE. Our results show that predicting maps through the PE yields sizeable gains especially for long-range contacts which are particularly critical for accurate protein 3D reconstruction. The final predictor's accuracy on a non-redundant set of 327 targets is 35

  16. Benefit salience and consumers' selective attention to product features

    OpenAIRE

    Ratneshwar, S; Warlop, Luk; Mick, DG; Seeger, G

    1997-01-01

    Although attention is a key construct in models of marketing communication and consumer choice, its selective nature has rarely been examined in common time-pressured conditions. We focus on the role of benefit salience, that is, the readiness with which particular benefits are brought to mind by consumers in relation to a given product category. Study I demonstrated that when product feature information was presented rapidly, individuals for whom the benefit of personalised customer service ...

  17. Conditional Mutual Information Based Feature Selection for Classification Task

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Somol, Petr; Haindl, Michal; Pudil, Pavel

    2007-01-01

    Roč. 45, č. 4756 (2007), s. 417-426 ISSN 0302-9743 R&D Projects: GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Pattern classification * feature selection * conditional mutual information * text categorization Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.402, year: 2005

  18. Research on Two-channel Interleaved Two-stage Paralleled Buck DC-DC Converter for Plasma Cutting Power Supply

    DEFF Research Database (Denmark)

    Yang, Xi-jun; Qu, Hao; Yao, Chen

    2014-01-01

    As for high power plasma power supply, due to high efficiency and flexibility, multi-channel interleaved multi-stage paralleled Buck DC-DC Converter becomes the first choice. In the paper, two-channel interleaved two- stage paralleled Buck DC-DC Converter powered by three-phase AC power supply...

  19. Production of endo-pectate lyase by two stage cultivation of Erwinia carotovora

    Energy Technology Data Exchange (ETDEWEB)

    Fukuoka, Satoshi; Kobayashi, Yoshiaki

    1987-02-26

    The productivity of endo-pectate lyase from Erwinia carotovora GIR 1044 was found to be greatly improved by two stage cultivation: in the first stage the bacterium was grown with an inducing carbon source, e.g., pectin, and in the second stage it was cultivated with glycerol, xylose, or fructose with the addition of monosodium L-glutamate as nitrogen source. In the two stage cultivation using pectin or glycerol as the carbon source the enzyme activity reached 400 units/ml, almost 3 times as much as that of one stage cultivation in a 10 liter fermentor. Using two stage cultivation in the 200 liter fermentor improved enzyme productivity over that in the 10 liter fermentor, with 500 units/ml of activity. Compared with the cultivation in Erlenmeyer flasks, fermentor cultivation improved enzyme productivity. The optimum cultivating conditions were agitation of 480 rpm with aeration of 0.5 vvm at 28 /sup 0/C. (4 figs, 4 tabs, 14 refs)

  20. Multi-stage selective catalytic reduction of NOx in lean burn engine exhaust

    Energy Technology Data Exchange (ETDEWEB)

    Penetrante, B.M.; Hsaio, M.C.; Merritt, B.T.; Vogtlin, G.E. [Lawrence Livermore National Lab., CA (United States)

    1997-12-31

    Many studies suggest that the conversion of NO to NO{sub 2} is an important intermediate step in the selective catalytic reduction (SCR) of NO{sub x} to N{sub 2}. Some effort has been devoted to separating the oxidative and reductive functions of the catalyst in a multi-stage system. This method works fine for systems that require hydrocarbon addition. The hydrocarbon has to be injected between the NO oxidation catalyst and the NO{sub 2} reduction catalyst; otherwise, the first-stage oxidation catalyst will also oxidize the hydrocarbon and decrease its effectiveness as a reductant. The multi-stage catalytic scheme is appropriate for diesel engine exhausts since they contain insufficient hydrocarbons for SCR, and the hydrocarbons can be added at the desired location. For lean-burn gasoline engine exhausts, the hydrocarbons already present in the exhausts will make it necessary to find an oxidation catalyst that can oxidize NO to NO{sub 2} but not oxidize the hydrocarbon. A plasma can also be used to oxidize NO to NO{sub 2}. Plasma oxidation has several advantages over catalytic oxidation. Plasma-assisted catalysis can work well for both diesel engine and lean-burn gasoline engine exhausts. This is because the plasma can oxidize NO in the presence of hydrocarbons without degrading the effectiveness of the hydrocarbon as a reductant for SCR. In the plasma, the hydrocarbon enhances the oxidation of NO, minimizes the electrical energy requirement, and prevents the oxidation of SO{sub 2}. This paper discusses the use of multi-stage systems for selective catalytic reduction of NO{sub x}. The multi-stage catalytic scheme is compared to the plasma-assisted catalytic scheme.

  1. Chronic infections in hip arthroplasties: comparing risk of reinfection following one-stage and two-stage revision: a systematic review and meta-analysis

    Directory of Open Access Journals (Sweden)

    Lange J

    2012-03-01

    Full Text Available Jeppe Lange1,2, Anders Troelsen3, Reimar W Thomsen4, Kjeld Søballe1,51Lundbeck Foundation Centre for Fast-Track Hip and Knee Surgery, Aarhus C, 2Center for Planned Surgery, Silkeborg Regional Hospital, Silkeborg, 3Department of Orthopaedics, Hvidovre Hospital, Hvidovre, 4Department of Clinical Epidemiology, Aarhus University Hospital, Aalborg, 5Department of Orthopaedics, Aarhus University Hospital, Aarhus C, DenmarkBackground: Two-stage revision is regarded by many as the best treatment of chronic infection in hip arthroplasties. Some international reports, however, have advocated one-stage revision. No systematic review or meta-analysis has ever compared the risk of reinfection following one-stage and two-stage revisions for chronic infection in hip arthroplasties.Methods: The review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis. Relevant studies were identified using PubMed and Embase. We assessed studies that included patients with a chronic infection of a hip arthroplasty treated with either one-stage or two-stage revision and with available data on occurrence of reinfections. We performed a meta-analysis estimating absolute risk of reinfection using a random-effects model.Results: We identified 36 studies eligible for inclusion. None were randomized controlled trials or comparative studies. The patients in these studies had received either one-stage revision (n = 375 or two-stage revision (n = 929. Reinfection occurred with an estimated absolute risk of 13.1% (95% confidence interval: 10.0%–17.1% in the one-stage cohort and 10.4% (95% confidence interval: 8.5%–12.7% in the two-stage cohort. The methodological quality of most included studies was considered low, with insufficient data to evaluate confounding factors.Conclusions: Our results may indicate three additional reinfections per 100 reimplanted patients when performing a one-stage versus two-stage revision. However, the

  2. Hypospadias repair: Byar's two stage operation revisited.

    Science.gov (United States)

    Arshad, A R

    2005-06-01

    Hypospadias is a congenital deformity characterised by an abnormally located urethral opening, that could occur anywhere proximal to its normal location on the ventral surface of glans penis to the perineum. Many operations had been described for the management of this deformity. One hundred and fifteen patients with hypospadias were treated at the Department of Plastic Surgery, Hospital Kuala Lumpur, Malaysia between September 1987 and December 2002, of which 100 had Byar's procedure performed on them. The age of the patients ranged from neonates to 26 years old. Sixty-seven patients had penoscrotal (58%), 20 had proximal penile (18%), 13 had distal penile (11%) and 15 had subcoronal hypospadias (13%). Operations performed were Byar's two-staged (100), Bracka's two-staged (11), flip-flap (2) and MAGPI operation (2). The most common complication encountered following hypospadias surgery was urethral fistula at a rate of 18%. There is a higher incidence of proximal hypospadias in the Malaysian community. Byar's procedure is a very versatile technique and can be used for all types of hypospadias. Fistula rate is 18% in this series.

  3. A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification

    Science.gov (United States)

    Wang, Suge; Li, Deyu; Wei, Yingjie; Li, Hongxia

    With the rapid growth of e-commerce, product reviews on the Web have become an important information source for customers' decision making when they intend to buy some product. As the reviews are often too many for customers to go through, how to automatically classify them into different sentiment orientation categories (i.e. positive/negative) has become a research problem. In this paper, based on Fisher's discriminant ratio, an effective feature selection method is proposed for product review text sentiment classification. In order to validate the validity of the proposed method, we compared it with other methods respectively based on information gain and mutual information while support vector machine is adopted as the classifier. In this paper, 6 subexperiments are conducted by combining different feature selection methods with 2 kinds of candidate feature sets. Under 1006 review documents of cars, the experimental results indicate that the Fisher's discriminant ratio based on word frequency estimation has the best performance with F value 83.3% while the candidate features are the words which appear in both positive and negative texts.

  4. A low-voltage Op Amp with rail-to-rail constant-gm input stage and a class AB rail-to-rail output stage

    NARCIS (Netherlands)

    Botma, J.H.; Wassenaar, R.F.; Wiegerink, Remco J.

    1993-01-01

    In this paper a low-voltage two-stage Op Amp is presented. The Op Amp features rail-to-rail operation and has an @put stage with a constant transconductance (%) over the entire common-mode input range. The input stage consists of an n- and a PMOS differential pair connected in parallel. The constant

  5. Investigation of selected surface integrity features of duplex stainless steel (DSS after turning

    Directory of Open Access Journals (Sweden)

    G. Krolczyk

    2015-01-01

    Full Text Available The article presents surface roughness profiles and Abbott - Firestone curves with vertical and amplitude parameters of surface roughness after turning by means of a coated sintered carbide wedge with a coating with ceramic intermediate layer. The investigation comprised the influence of cutting speed on the selected features of surface integrity in dry machining. The material under investigation was duplex stainless steel with two-phase ferritic-austenitic structure. The tests have been performed under production conditions during machining of parts for electric motors and deep-well pumps. The obtained results allow to draw conclusions about the characteristics of surface properties of the machined parts.

  6. Feature selection based on SVM significance maps for classification of dementia

    NARCIS (Netherlands)

    E.E. Bron (Esther); M. Smits (Marion); J.C. van Swieten (John); W.J. Niessen (Wiro); S. Klein (Stefan)

    2014-01-01

    textabstractSupport vector machine significance maps (SVM p-maps) previously showed clusters of significantly different voxels in dementiarelated brain regions. We propose a novel feature selection method for classification of dementia based on these p-maps. In our approach, the SVM p-maps are

  7. Two-Stage orders sequencing system for mixed-model assembly

    Science.gov (United States)

    Zemczak, M.; Skolud, B.; Krenczyk, D.

    2015-11-01

    In the paper, the authors focus on the NP-hard problem of orders sequencing, formulated similarly to Car Sequencing Problem (CSP). The object of the research is the assembly line in an automotive industry company, on which few different models of products, each in a certain number of versions, are assembled on the shared resources, set in a line. Such production type is usually determined as a mixed-model production, and arose from the necessity of manufacturing customized products on the basis of very specific orders from single clients. The producers are nowadays obliged to provide each client the possibility to determine a huge amount of the features of the product they are willing to buy, as the competition in the automotive market is large. Due to the previously mentioned nature of the problem (NP-hard), in the given time period only satisfactory solutions are sought, as the optimal solution method has not yet been found. Most of the researchers that implemented inaccurate methods (e.g. evolutionary algorithms) to solving sequencing problems dropped the research after testing phase, as they were not able to obtain reproducible results, and met problems while determining the quality of the received solutions. Therefore a new approach to solving the problem, presented in this paper as a sequencing system is being developed. The sequencing system consists of a set of determined rules, implemented into computer environment. The system itself works in two stages. First of them is connected with the determination of a place in the storage buffer to which certain production orders should be sent. In the second stage of functioning, precise sets of sequences are determined and evaluated for certain parts of the storage buffer under certain criteria.

  8. Two-dimensional echocardiographic features of right ventricular infarction

    International Nuclear Information System (INIS)

    D'Arcy, B.; Nanda, N.C.

    1982-01-01

    Real-time, two-dimensional echocardiographic studies were performed in 10 patients with acute myocardial infarction who had clinical features suggestive of right ventricular involvement. All patients showed right ventricular wall motion abnormalities. In the four-chamber view, seven patients showed akinesis of the entire right ventricular diaphragmatic wall and three showed akinesis of segments of the diaphragmatic wall. Segmental dyskinetic areas involving the right ventricular free wall were identified in four patients. One patient showed a large right ventricular apical aneurysm. Other echocardiographic features included enlargement of the right ventricle in eight cases, paradoxical ventricular septal motion in seven cases, tricuspid incompetence in eight cases, dilation of the stomach in four cases and localized pericardial effusion in two cases. Right ventricular infarction was confirmed by radionuclide methods in seven patients, at surgery in one patient and at autopsy in two patients

  9. A two-stage method for inverse medium scattering

    KAUST Repository

    Ito, Kazufumi; Jin, Bangti; Zou, Jun

    2013-01-01

    We present a novel numerical method to the time-harmonic inverse medium scattering problem of recovering the refractive index from noisy near-field scattered data. The approach consists of two stages, one pruning step of detecting the scatterer

  10. Two-stage plasma gun based on a gas discharge with a self-heating hollow emitter.

    Science.gov (United States)

    Vizir, A V; Tyunkov, A V; Shandrikov, M V; Oks, E M

    2010-02-01

    The paper presents the results of tests of a new compact two-stage bulk gas plasma gun. The plasma gun is based on a nonself-sustained gas discharge with an electron emitter based on a discharge with a self-heating hollow cathode. The operating characteristics of the plasma gun are investigated. The discharge system makes it possible to produce uniform and stable gas plasma in the dc mode with a plasma density up to 3x10(9) cm(-3) at an operating gas pressure in the vacuum chamber of less than 2x10(-2) Pa. The device features high power efficiency, design simplicity, and compactness.

  11. Two-stage plasma gun based on a gas discharge with a self-heating hollow emitter

    International Nuclear Information System (INIS)

    Vizir, A. V.; Tyunkov, A. V.; Shandrikov, M. V.; Oks, E. M.

    2010-01-01

    The paper presents the results of tests of a new compact two-stage bulk gas plasma gun. The plasma gun is based on a nonself-sustained gas discharge with an electron emitter based on a discharge with a self-heating hollow cathode. The operating characteristics of the plasma gun are investigated. The discharge system makes it possible to produce uniform and stable gas plasma in the dc mode with a plasma density up to 3x10 9 cm -3 at an operating gas pressure in the vacuum chamber of less than 2x10 -2 Pa. The device features high power efficiency, design simplicity, and compactness.

  12. Combining evidence from multiple electronic health care databases: performances of one-stage and two-stage meta-analysis in matched case-control studies.

    Science.gov (United States)

    La Gamba, Fabiola; Corrao, Giovanni; Romio, Silvana; Sturkenboom, Miriam; Trifirò, Gianluca; Schink, Tania; de Ridder, Maria

    2017-10-01

    Clustering of patients in databases is usually ignored in one-stage meta-analysis of multi-database studies using matched case-control data. The aim of this study was to compare bias and efficiency of such a one-stage meta-analysis with a two-stage meta-analysis. First, we compared the approaches by generating matched case-control data under 5 simulated scenarios, built by varying: (1) the exposure-outcome association; (2) its variability among databases; (3) the confounding strength of one covariate on this association; (4) its variability; and (5) the (heterogeneous) confounding strength of two covariates. Second, we made the same comparison using empirical data from the ARITMO project, a multiple database study investigating the risk of ventricular arrhythmia following the use of medications with arrhythmogenic potential. In our study, we specifically investigated the effect of current use of promethazine. Bias increased for one-stage meta-analysis with increasing (1) between-database variance of exposure effect and (2) heterogeneous confounding generated by two covariates. The efficiency of one-stage meta-analysis was slightly lower than that of two-stage meta-analysis for the majority of investigated scenarios. Based on ARITMO data, there were no evident differences between one-stage (OR = 1.50, CI = [1.08; 2.08]) and two-stage (OR = 1.55, CI = [1.12; 2.16]) approaches. When the effect of interest is heterogeneous, a one-stage meta-analysis ignoring clustering gives biased estimates. Two-stage meta-analysis generates estimates at least as accurate and precise as one-stage meta-analysis. However, in a study using small databases and rare exposures and/or outcomes, a correct one-stage meta-analysis becomes essential. Copyright © 2017 John Wiley & Sons, Ltd.

  13. Different underlying mechanisms for face emotion and gender processing during feature-selective attention: Evidence from event-related potential studies.

    Science.gov (United States)

    Wang, Hailing; Ip, Chengteng; Fu, Shimin; Sun, Pei

    2017-05-01

    Face recognition theories suggest that our brains process invariant (e.g., gender) and changeable (e.g., emotion) facial dimensions separately. To investigate whether these two dimensions are processed in different time courses, we analyzed the selection negativity (SN, an event-related potential component reflecting attentional modulation) elicited by face gender and emotion during a feature selective attention task. Participants were instructed to attend to a combination of face emotion and gender attributes in Experiment 1 (bi-dimensional task) and to either face emotion or gender in Experiment 2 (uni-dimensional task). The results revealed that face emotion did not elicit a substantial SN, whereas face gender consistently generated a substantial SN in both experiments. These results suggest that face gender is more sensitive to feature-selective attention and that face emotion is encoded relatively automatically on SN, implying the existence of different underlying processing mechanisms for invariant and changeable facial dimensions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Fuzzy Mutual Information Based min-Redundancy and Max-Relevance Heterogeneous Feature Selection

    Directory of Open Access Journals (Sweden)

    Daren Yu

    2011-08-01

    Full Text Available Feature selection is an important preprocessing step in pattern classification and machine learning, and mutual information is widely used to measure relevance between features and decision. However, it is difficult to directly calculate relevance between continuous or fuzzy features using mutual information. In this paper we introduce the fuzzy information entropy and fuzzy mutual information for computing relevance between numerical or fuzzy features and decision. The relationship between fuzzy information entropy and differential entropy is also discussed. Moreover, we combine fuzzy mutual information with qmin-Redundancy-Max-Relevanceq, qMax-Dependencyq and min-Redundancy-Max-Dependencyq algorithms. The performance and stability of the proposed algorithms are tested on benchmark data sets. Experimental results show the proposed algorithms are effective and stable.

  15. Features of selection of children for occupations by artistic gymnastics in modern Kurdistan

    OpenAIRE

    Abdulvahid Dlshad Nihad

    2015-01-01

    Purpose: to study the organizational and pedagogical conditions of selection of children for occupations existing in the republic Kurdistan artistic gymnastics Material and Methods: questioning of 24 trainers on artistic gymnastics and experts in physical culture of the republic Kurdistan was carried out. The general questions of selection and methodical features of selection of children for occupations by artistic gymnastics in Kurdistan were studied. Results: questioning revealed absence of...

  16. FIRST DIRECT EVIDENCE OF TWO STAGES IN FREE RECALL

    Directory of Open Access Journals (Sweden)

    Eugen Tarnow

    2015-12-01

    Full Text Available I find that exactly two stages can be seen directly in sequential free recall distributions. These distributions show that the first three recalls come from the emptying of working memory, recalls 6 and above come from a second stage and the 4th and 5th recalls are mixtures of the two.A discontinuity, a rounded step function, is shown to exist in the fitted linear slope of the recall distributions as the recall shifts from the emptying of working memory (positive slope to the second stage (negative slope. The discontinuity leads to a first estimate of the capacity of working memory at 4-4.5 items. The total recall is shown to be a linear combination of the content of working memory and items recalled in the second stage with 3.0-3.9 items coming from working memory, a second estimate of the capacity of working memory. A third, separate upper limit on the capacity of working memory is found (3.06 items, corresponding to the requirement that the content of working memory cannot exceed the total recall, item by item. This third limit is presumably the best limit on the average capacity of unchunked working memory.The second stage of recall is shown to be reactivation: The average times to retrieve additional items in free recall obey a linear relationship as a function of the recall probability which mimics recognition and cued recall, both mechanisms using reactivation (Tarnow, 2008.

  17. Labor Union Effects on Innovation and Commercialization Productivity: An Integrated Propensity Score Matching and Two-Stage Data Envelopment Analysis

    Directory of Open Access Journals (Sweden)

    Dongphil Chun

    2015-04-01

    Full Text Available Research and development (R&D is a critical factor in sustaining a firm’s competitive advantage. Accurate measurement of R&D productivity and investigation of its influencing factors are of value for R&D productivity improvements. This study is divided into two sections. The first section outlines the innovation and commercialization stages of firm-level R&D activities. This section analyzes the productivity of each stage using a propensity score matching (PSM and two-stage data envelopment analysis (DEA integrated model to solve the selection bias problem. Second, this study conducts a comparative analysis among subgroups categorized as labor unionized or non-labor unionized on productivity at each stage. We used Korea Innovation Survey (KIS data for analysis using a sample of 400 Korean manufacturers. The key findings of this study include: (1 firm innovation and commercialization productivity are balanced and show relatively low innovation productivity; and (2 labor unions have a positive effect on commercialization productivity. Moreover, labor unions are an influential factor in determining manufacturing firms’ commercialization productivity.

  18. Orthogonal feature selection method. [For preprocessing of man spectral data

    Energy Technology Data Exchange (ETDEWEB)

    Kowalski, B R [Univ. of Washington, Seattle; Bender, C F

    1976-01-01

    A new method of preprocessing spectral data for extraction of molecular structural information is desired. This SELECT method generates orthogonal features that are important for classification purposes and that also retain their identity to the original measurements. A brief introduction to chemical pattern recognition is presented. A brief description of the method and an application to mass spectral data analysis follow. (BLM)

  19. Modal intersection types, two-level languages, and staged synthesis

    DEFF Research Database (Denmark)

    Henglein, Fritz; Rehof, Jakob

    2016-01-01

    -linguistic framework for staged program synthesis, where metaprograms are automatically synthesized which, when executed, generate code in a target language. We survey the basic theory of staged synthesis and illustrate by example how a two-level language theory specialized from λ∩ ⎕ can be used to understand......A typed λ-calculus, λ∩ ⎕, is introduced, combining intersection types and modal types. We develop the metatheory of λ∩ ⎕, with particular emphasis on the theory of subtyping and distributivity of the modal and intersection type operators. We describe how a stratification of λ∩ ⎕ leads to a multi...... the process of staged synthesis....

  20. YamiPred: A novel evolutionary method for predicting pre-miRNAs and selecting relevant features

    KAUST Repository

    Kleftogiannis, Dimitrios A.; Theofilatos, Konstantinos; Likothanassis, Spiros; Mavroudi, Seferina

    2015-01-01

    MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of Support Vector Machines (SVM) with Genetic Algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.

  1. YamiPred: A novel evolutionary method for predicting pre-miRNAs and selecting relevant features

    KAUST Repository

    Kleftogiannis, Dimitrios A.

    2015-01-23

    MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of Support Vector Machines (SVM) with Genetic Algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.

  2. Measuring demand for flat water recreation using a two-stage/disequilibrium travel cost model with adjustment for overdispersion and self-selection

    Science.gov (United States)

    McKean, John R.; Johnson, Donn; Taylor, R. Garth

    2003-04-01

    An alternate travel cost model is applied to an on-site sample to estimate the value of flat water recreation on the impounded lower Snake River. Four contiguous reservoirs would be eliminated if the dams are breached to protect endangered Pacific salmon and steelhead trout. The empirical method applies truncated negative binomial regression with adjustment for endogenous stratification. The two-stage decision model assumes that recreationists allocate their time among work and leisure prior to deciding among consumer goods. The allocation of time and money among goods in the second stage is conditional on the predetermined work time and income. The second stage is a disequilibrium labor market which also applies if employers set work hours or if recreationists are not in the labor force. When work time is either predetermined, fixed by contract, or nonexistent, recreationists must consider separate prices and budgets for time and money.

  3. Evaluating the Validity of a Two-stage Sample in a Birth Cohort Established from Administrative Databases.

    Science.gov (United States)

    El-Zein, Mariam; Conus, Florence; Benedetti, Andrea; Parent, Marie-Elise; Rousseau, Marie-Claude

    2016-01-01

    When using administrative databases for epidemiologic research, a subsample of subjects can be interviewed, eliciting information on undocumented confounders. This article presents a thorough investigation of the validity of a two-stage sample encompassing an assessment of nonparticipation and quantification of the extent of bias. Established through record linkage of administrative databases, the Québec Birth Cohort on Immunity and Health (n = 81,496) aims to study the association between Bacillus Calmette-Guérin vaccination and asthma. Among 76,623 subjects classified in four Bacillus Calmette-Guérin-asthma strata, a two-stage sampling strategy with a balanced design was used to randomly select individuals for interviews. We compared stratum-specific sociodemographic characteristics and healthcare utilization of stage 2 participants (n = 1,643) with those of eligible nonparticipants (n = 74,980) and nonrespondents (n = 3,157). We used logistic regression to determine whether participation varied across strata according to these characteristics. The effect of nonparticipation was described by the relative odds ratio (ROR = ORparticipants/ORsource population) for the association between sociodemographic characteristics and asthma. Parental age at childbirth, area of residence, family income, and healthcare utilization were comparable between groups. Participants were slightly more likely to be women and have a mother born in Québec. Participation did not vary across strata by sex, parental birthplace, or material and social deprivation. Estimates were not biased by nonparticipation; most RORs were below one and bias never exceeded 20%. Our analyses evaluate and provide a detailed demonstration of the validity of a two-stage sample for researchers assembling similar research infrastructures.

  4. An adaptive two-stage analog/regression model for probabilistic prediction of small-scale precipitation in France

    Directory of Open Access Journals (Sweden)

    J. Chardon

    2018-01-01

    Full Text Available Statistical downscaling models (SDMs are often used to produce local weather scenarios from large-scale atmospheric information. SDMs include transfer functions which are based on a statistical link identified from observations between local weather and a set of large-scale predictors. As physical processes driving surface weather vary in time, the most relevant predictors and the regression link are likely to vary in time too. This is well known for precipitation for instance and the link is thus often estimated after some seasonal stratification of the data. In this study, we present a two-stage analog/regression model where the regression link is estimated from atmospheric analogs of the current prediction day. Atmospheric analogs are identified from fields of geopotential heights at 1000 and 500 hPa. For the regression stage, two generalized linear models are further used to model the probability of precipitation occurrence and the distribution of non-zero precipitation amounts, respectively. The two-stage model is evaluated for the probabilistic prediction of small-scale precipitation over France. It noticeably improves the skill of the prediction for both precipitation occurrence and amount. As the analog days vary from one prediction day to another, the atmospheric predictors selected in the regression stage and the value of the corresponding regression coefficients can vary from one prediction day to another. The model allows thus for a day-to-day adaptive and tailored downscaling. It can also reveal specific predictors for peculiar and non-frequent weather configurations.

  5. Two-stage, high power X-band amplifier experiment

    International Nuclear Information System (INIS)

    Kuang, E.; Davis, T.J.; Ivers, J.D.; Kerslick, G.S.; Nation, J.A.; Schaechter, L.

    1993-01-01

    At output powers in excess of 100 MW the authors have noted the development of sidebands in many TWT structures. To address this problem an experiment using a narrow bandwidth, two-stage TWT is in progress. The TWT amplifier consists of a dielectric (e = 5) slow-wave structure, a 30 dB sever section and a 8.8-9.0 GHz passband periodic, metallic structure. The electron beam used in this experiment is a 950 kV, 1 kA, 50 ns pencil beam propagating along an applied axial field of 9 kG. The dielectric first stage has a maximum gain of 30 dB measured at 8.87 GHz, with output powers of up to 50 MW in the TM 01 mode. In these experiments the dielectric amplifier output power is about 3-5 MW and the output power of the complete two-stage device is ∼160 MW at the input frequency. The sidebands detected in earlier experiments have been eliminated. The authors also report measurements of the energy spread of the electron beam resulting from the amplification process. These experimental results are compared with MAGIC code simulations and analytic work they have carried out on such devices

  6. A two-stage approach for multi-objective decision making with applications to system reliability optimization

    International Nuclear Information System (INIS)

    Li Zhaojun; Liao Haitao; Coit, David W.

    2009-01-01

    This paper proposes a two-stage approach for solving multi-objective system reliability optimization problems. In this approach, a Pareto optimal solution set is initially identified at the first stage by applying a multiple objective evolutionary algorithm (MOEA). Quite often there are a large number of Pareto optimal solutions, and it is difficult, if not impossible, to effectively choose the representative solutions for the overall problem. To overcome this challenge, an integrated multiple objective selection optimization (MOSO) method is utilized at the second stage. Specifically, a self-organizing map (SOM), with the capability of preserving the topology of the data, is applied first to classify those Pareto optimal solutions into several clusters with similar properties. Then, within each cluster, the data envelopment analysis (DEA) is performed, by comparing the relative efficiency of those solutions, to determine the final representative solutions for the overall problem. Through this sequential solution identification and pruning process, the final recommended solutions to the multi-objective system reliability optimization problem can be easily determined in a more systematic and meaningful way.

  7. Feature extraction and selection for objective gait analysis and fall risk assessment by accelerometry

    Directory of Open Access Journals (Sweden)

    Cremer Gerald

    2011-01-01

    Full Text Available Abstract Background Falls in the elderly is nowadays a major concern because of their consequences on elderly general health and moral states. Moreover, the aging of the population and the increasing life expectancy make the prediction of falls more and more important. The analysis presented in this article makes a first step in this direction providing a way to analyze gait and classify hospitalized elderly fallers and non-faller. This tool, based on an accelerometer network and signal processing, gives objective informations about the gait and does not need any special gait laboratory as optical analysis do. The tool is also simple to use by a non expert and can therefore be widely used on a large set of patients. Method A population of 20 hospitalized elderlies was asked to execute several classical clinical tests evaluating their risk of falling. They were also asked if they experienced any fall in the last 12 months. The accelerations of the limbs were recorded during the clinical tests with an accelerometer network distributed on the body. A total of 67 features were extracted from the accelerometric signal recorded during a simple 25 m walking test at comfort speed. A feature selection algorithm was used to select those able to classify subjects at risk and not at risk for several classification algorithms types. Results The results showed that several classification algorithms were able to discriminate people from the two groups of interest: fallers and non-fallers hospitalized elderlies. The classification performances of the used algorithms were compared. Moreover a subset of the 67 features was considered to be significantly different between the two groups using a t-test. Conclusions This study gives a method to classify a population of hospitalized elderlies in two groups: at risk of falling or not at risk based on accelerometric data. This is a first step to design a risk of falling assessment system that could be used to provide

  8. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated. We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is

  9. Noncausal two-stage image filtration at presence of observations with anomalous errors

    OpenAIRE

    S. V. Vishnevyy; S. Ya. Zhuk; A. N. Pavliuchenkova

    2013-01-01

    Introduction. It is necessary to develop adaptive algorithms, which allow to detect such regions and to apply filter with respective parameters for suppression of anomalous noises for the purposes of image filtration, which consist of regions with anomalous errors. Development of adaptive algorithm for non-causal two-stage images filtration at pres-ence of observations with anomalous errors. The adaptive algorithm for noncausal two-stage filtration is developed. On the first stage the adaptiv...

  10. CFD simulations of compressed air two stage rotary Wankel expander – Parametric analysis

    International Nuclear Information System (INIS)

    Sadiq, Ghada A.; Tozer, Gavin; Al-Dadah, Raya; Mahmoud, Saad

    2017-01-01

    Highlights: • CFD ANSYS-Fluent 3D simulation of Wankel expander is developed. • Single and two-stage expander’s performance is compared. • Inlet and outlet ports shape and configurations are investigated. • Isentropic efficiency of two stage Wankel expander of 91% is achieved. - Abstract: A small scale volumetric Wankel expander is a powerful device for small-scale power generation in compressed air energy storage (CAES) systems and Organic Rankine cycles powered by different heat sources such as, biomass, low temperature geothermal, solar and waste heat leading to significant reduction in CO_2 emissions. Wankel expanders outperform other types of expander due to their ability to produce two power pulses per revolution per chamber additional to higher compactness, lower noise and vibration and lower cost. In this paper, a computational fluid dynamics (CFD) model was developed using ANSYS 16.2 to simulate the flow dynamics for a single and two stage Wankel expanders and to investigate the effect of port configurations, including size and spacing, on the expander’s power output and isentropic efficiency. Also, single-stage and two-stage expanders were analysed with different operating conditions. Single-stage 3D CFD results were compared to published work showing close agreement. The CFD modelling was used to investigate the performance of the rotary device using air as an ideal gas with various port diameters ranging from 15 mm to 50 mm; port spacing varying from 28 mm to 66 mm; different Wankel expander sizes (r = 48, e = 6.6, b = 32) mm and (r = 58, e = 8, b = 40) mm both as single-stage and as two-stage expanders with different configurations and various operating conditions. Results showed that the best Wankel expander design for a single-stage was (r = 48, e = 6.6, b = 32) mm, with the port diameters 20 mm and port spacing equal to 50 mm. Moreover, combining two Wankel expanders horizontally, with a larger one at front, produced 8.52 kW compared

  11. Modulation of neuronal responses during covert search for visual feature conjunctions.

    Science.gov (United States)

    Buracas, Giedrius T; Albright, Thomas D

    2009-09-29

    While searching for an object in a visual scene, an observer's attentional focus and eye movements are often guided by information about object features and spatial locations. Both spatial and feature-specific attention are known to modulate neuronal responses in visual cortex, but little is known of the dynamics and interplay of these mechanisms as visual search progresses. To address this issue, we recorded from directionally selective cells in visual area MT of monkeys trained to covertly search for targets defined by a unique conjunction of color and motion features and to signal target detection with an eye movement to the putative target. Two patterns of response modulation were observed. One pattern consisted of enhanced responses to targets presented in the receptive field (RF). These modulations occurred at the end-stage of search and were more potent during correct target identification than during erroneous saccades to a distractor in RF, thus suggesting that this modulation is not a mere presaccadic enhancement. A second pattern of modulation was observed when RF stimuli were nontargets that shared a feature with the target. The latter effect was observed during early stages of search and is consistent with a global feature-specific mechanism. This effect often terminated before target identification, thus suggesting that it interacts with spatial attention. This modulation was exhibited not only for motion but also for color cue, although MT neurons are known to be insensitive to color. Such cue-invariant attentional effects may contribute to a feature binding mechanism acting across visual dimensions.

  12. An application of locally linear model tree algorithm with combination of feature selection in credit scoring

    Science.gov (United States)

    Siami, Mohammad; Gholamian, Mohammad Reza; Basiri, Javad

    2014-10-01

    Nowadays, credit scoring is one of the most important topics in the banking sector. Credit scoring models have been widely used to facilitate the process of credit assessing. In this paper, an application of the locally linear model tree algorithm (LOLIMOT) was experimented to evaluate the superiority of its performance to predict the customer's credit status. The algorithm is improved with an aim of adjustment by credit scoring domain by means of data fusion and feature selection techniques. Two real world credit data sets - Australian and German - from UCI machine learning database were selected to demonstrate the performance of our new classifier. The analytical results indicate that the improved LOLIMOT significantly increase the prediction accuracy.

  13. Testing of Selective Laser Melting Turbomachinery Applicable to Exploration Upper Stage

    Science.gov (United States)

    Calvert, Marty; Turpin, Jason; Nettles, Mindy

    2015-01-01

    This task is to design, fabricate, and spin test to failure a Ti6-4 hydrogen turbopump impeller that was built using the selective laser melting (SLM) fabrication process (fig. 1). The impeller is sized around upper stage engine requirements. In addition to the spin burst test, material testing will be performed on coupons that are built with the impeller.

  14. Human activity recognition based on feature selection in smart home using back-propagation algorithm.

    Science.gov (United States)

    Fang, Hongqing; He, Lei; Si, Hao; Liu, Peng; Xie, Xiaolei

    2014-09-01

    In this paper, Back-propagation(BP) algorithm has been used to train the feed forward neural network for human activity recognition in smart home environments, and inter-class distance method for feature selection of observed motion sensor events is discussed and tested. And then, the human activity recognition performances of neural network using BP algorithm have been evaluated and compared with other probabilistic algorithms: Naïve Bayes(NB) classifier and Hidden Markov Model(HMM). The results show that different feature datasets yield different activity recognition accuracy. The selection of unsuitable feature datasets increases the computational complexity and degrades the activity recognition accuracy. Furthermore, neural network using BP algorithm has relatively better human activity recognition performances than NB classifier and HMM. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.

  15. Multi-stage classification method oriented to aerial image based on low-rank recovery and multi-feature fusion sparse representation.

    Science.gov (United States)

    Ma, Xu; Cheng, Yongmei; Hao, Shuai

    2016-12-10

    Automatic classification of terrain surfaces from an aerial image is essential for an autonomous unmanned aerial vehicle (UAV) landing at an unprepared site by using vision. Diverse terrain surfaces may show similar spectral properties due to the illumination and noise that easily cause poor classification performance. To address this issue, a multi-stage classification algorithm based on low-rank recovery and multi-feature fusion sparse representation is proposed. First, color moments and Gabor texture feature are extracted from training data and stacked as column vectors of a dictionary. Then we perform low-rank matrix recovery for the dictionary by using augmented Lagrange multipliers and construct a multi-stage terrain classifier. Experimental results on an aerial map database that we prepared verify the classification accuracy and robustness of the proposed method.

  16. Randomised study on single stage laparo-endoscopic rendezvous (intra-operative ERCP) procedure versus two stage approach (Pre-operative ERCP followed by laparoscopic cholecystectomy) for the management of cholelithiasis with choledocholithiasis.

    Science.gov (United States)

    Sahoo, Manash Ranjan; Kumar, Anil T; Patnaik, Aashish

    2014-07-01

    The 'Rendezvous' technique consists of laparoscopic cholecystectomy (LC) standards with intra-operative cholangiography followed by endoscopic sphincterotomy. The sphincterotome is driven across the papilla through a guidewire inserted by the transcystic route. In this study, we intended to compare the two methods in a prospective randomised trial. From 2005 to 2012, we enrolled 83 patients with a diagnosis of cholecysto-choledocolithiasis. They were randomised into two groups. In 'group-A',41 patients were treated with two stages management, first by pre-operative endoscopic retrograde cholangiopancreatography (ERCP) and common bile duct (CBD) clearance and second by LC. In 'group-B', 42 patients were treated with LC and intra-operative cholangiography; and when diagnosis of choledocholithiasis was confirmed, patients had undergone one stage management of by Laparo-endoscopic Rendezvous technique. In arm-A and arm-B groups, complete CBD clearance was achieved in 29 and 38 patients, respectively. Failure of the treatment in arm-A was 29% and in arm-B was 9.5%. In arm-A, selective CBD cannulation was achieved in 33 cases (80.5%) and in arm-B in 39 cases (93%). In arm-Agroup, post-ERCP hyperamylasia was presented in nine patients (22%) and severe pancreatitis in five patients (12%) versus none of the patients (0%) in arm-B group, respectively. Mean post-operative hospital stay in arm-A and arm-B groups are 10.9 and 6.8 days, respectively. One stage laparo-endoscopic rendezvous approach increases selective cannulation of CBD, reduces post-ERCP pancreatitis, reduces days of hospital stay, increases patient's compliance and prevents unnecessary intervention to CBD.

  17. Real-time hypothesis driven feature extraction on parallel processing architectures

    DEFF Research Database (Denmark)

    Granmo, O.-C.; Jensen, Finn Verner

    2002-01-01

    the problem of higher-order feature-content/feature-feature correlation, causally complexly interacting features are identified through Bayesian network d-separation analysis and combined into joint features. When used on a moderately complex object-tracking case, the technique is able to select...... extraction, which selectively extract relevant features one-by-one, have in some cases achieved real-time performance on single processing element architectures. In this paperwe propose a novel technique which combines the above two approaches. Features are selectively extracted in parallelizable sets...

  18. Wastewater treatment and biodiesel production by Scenedesmus obliquus in a two-stage cultivation process.

    Science.gov (United States)

    Álvarez-Díaz, P D; Ruiz, J; Arbib, Z; Barragán, J; Garrido-Pérez, M C; Perales, J A

    2015-04-01

    The microalga Scenedesmus obliquus was cultured in two cultivation stages: (1) in batch with real wastewater; (2) maintaining the stationary phase with different conditions of CO2, light and salinity according to a factorial design in order to improve the lipid content. The presence of the three factors increased lipid content from 35.8% to 49% at the end of the second stage; CO2 presence presented the highest direct effect increasing lipid content followed by light presence and salt presence. The ω-3 fatty acids content increased with CO2 and light presence acting in isolation, nevertheless, when both factors acted together the interaction effect was negative. The ω-3 eicosapentaenoic acid content of the oil from S. obliquus slightly exceeded the 1% maximum to be used as biodiesel source (EU normative). Therefore, it is suggested the blend with other oils or the selective extraction of the ω-3 fatty acids from S. obliquus oil. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Multi stage electrodialysis for separation of two metal ion species

    Energy Technology Data Exchange (ETDEWEB)

    Takahashi, K.; Sakurai, H.; Nii, S.; Sugiura, K. [Nagoya Univ., Nagoya (Japan)

    1995-04-20

    In this article, separation of two metal ions by electrodialysis with a cation exchange membrane has been investigated. In other words, separation of potassium ion and sodium ion has been investigated by using batch dialysis with and without an electric field and continuous electrodialysis with a four-stage dialyzer. As a result, the difference in the permselectivity between the dialysis with and without an electric field has not been appreciable for the system of potassium and sodium ions with the cation exchange membrane. Concerning the continuous electrodialysis, the concentration ratio between potassium and sodium ions in the outlet solution from the recovery side of the dialyzer has increased with the reflux flow rate and the number of stages. In case when the reflux flow rate has been zero, the concentration ratio with the four-stage dialyzer has become 1.5 which is almost the same as with that with a two-stage dialyzer consisting of a simple membrane. When the reflux flow ratio has been 0.7, the concentration ratio has reached 3.6. 20 refs., 8 figs.

  20. Multi-channel EEG-based sleep stage classification with joint collaborative representation and multiple kernel learning.

    Science.gov (United States)

    Shi, Jun; Liu, Xiao; Li, Yan; Zhang, Qi; Li, Yingjie; Ying, Shihui

    2015-10-30

    Electroencephalography (EEG) based sleep staging is commonly used in clinical routine. Feature extraction and representation plays a crucial role in EEG-based automatic classification of sleep stages. Sparse representation (SR) is a state-of-the-art unsupervised feature learning method suitable for EEG feature representation. Collaborative representation (CR) is an effective data coding method used as a classifier. Here we use CR as a data representation method to learn features from the EEG signal. A joint collaboration model is established to develop a multi-view learning algorithm, and generate joint CR (JCR) codes to fuse and represent multi-channel EEG signals. A two-stage multi-view learning-based sleep staging framework is then constructed, in which JCR and joint sparse representation (JSR) algorithms first fuse and learning the feature representation from multi-channel EEG signals, respectively. Multi-view JCR and JSR features are then integrated and sleep stages recognized by a multiple kernel extreme learning machine (MK-ELM) algorithm with grid search. The proposed two-stage multi-view learning algorithm achieves superior performance for sleep staging. With a K-means clustering based dictionary, the mean classification accuracy, sensitivity and specificity are 81.10 ± 0.15%, 71.42 ± 0.66% and 94.57 ± 0.07%, respectively; while with the dictionary learned using the submodular optimization method, they are 80.29 ± 0.22%, 71.26 ± 0.78% and 94.38 ± 0.10%, respectively. The two-stage multi-view learning based sleep staging framework outperforms all other classification methods compared in this work, while JCR is superior to JSR. The proposed multi-view learning framework has the potential for sleep staging based on multi-channel or multi-modality polysomnography signals. Copyright © 2015 Elsevier B.V. All rights reserved.