Grandidier, F.; Sabourin, R.; Suen, C.Y.; Gilloux, M.
In this paper we introduce a new strategy for improving a discrete HMM-based handwriting recognition system by integrating several information sources from specialized feature sets. For a given system, the basic idea is to keep the most discriminative features, and to replace the others with new
Chherawala, Youssouf; Roy, Partha Pratim; Cheriet, Mohamed
The performance of handwriting recognition systems is dependent on the features extracted from the word image. A large body of features exists in the literature, but no method has yet been proposed to identify the most promising of these, other than a straightforward comparison based on the recognition rate. In this paper, we propose a framework for feature set evaluation based on a collaborative setting. We use a weighted vote combination of recurrent neural network (RNN) classifiers, each trained with a particular feature set. This combination is modeled in a probabilistic framework as a mixture model and two methods for weight estimation are described. The main contribution of this paper is to quantify the importance of feature sets through the combination weights, which reflect their strength and complementarity. We chose the RNN classifier because of its state-of-the-art performance. Also, we provide the first feature set benchmark for this classifier. We evaluated several feature sets on the IFN/ENIT and RIMES databases of Arabic and Latin script, respectively. The resulting combination model is competitive with state-of-the-art systems.
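The weighted vote described in this abstract can be illustrated with a minimal sketch. It assumes each classifier already outputs a posterior distribution over a fixed set of classes; the function names `weighted_vote` and `predict` are hypothetical, and the abstract's two weight-estimation methods are not reproduced here (the weights are simply given).

```python
from typing import List, Sequence


def weighted_vote(posteriors: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Mixture-style combination: p(c|x) = sum_k w_k * p_k(c|x).

    posteriors[k][c] is classifier k's posterior for class c.
    Weights are normalized to sum to 1, so the result is again a distribution.
    """
    if len(posteriors) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n_classes = len(posteriors[0])
    combined = [0.0] * n_classes
    for w, p in zip(weights, posteriors):
        for c in range(n_classes):
            combined[c] += (w / total) * p[c]
    return combined


def predict(posteriors: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Return the index of the class with the highest combined posterior."""
    combined = weighted_vote(posteriors, weights)
    return max(range(len(combined)), key=combined.__getitem__)
```

In this reading, a feature set whose classifier receives a larger weight contributes more to the final decision, which is how the combination weights can serve as an importance measure.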
Lin, Yuan-Pin; Chen, Jyh-Horng; Duann, Jeng-Ren; Lin, Chin-Teng; Jung, Tzyy-Ping
Electroencephalogram (EEG)-based emotion recognition has been a rapidly growing field. Yet how to achieve acceptable accuracy in a practical system with as few electrodes as possible has received less attention. This study evaluates a set of subject-independent features, based on differential power asymmetry of symmetric electrode pairs, with emphasis on its applicability to subject variability in the music-induced emotion classification problem. The results validate the feasibility of using subject-independent EEG features to classify four emotional states with acceptable accuracy at second-scale temporal resolution. These features could be generalized across subjects to detect emotion induced by music excerpts not limited to the music database that was used to derive the emotion-specific features.
Kortelainen, Jukka; Seppänen, Tapio
Emotions are fundamental to everyday life, affecting our communication, learning, perception, and decision making. Including emotions in human-computer interaction (HCI) could be seen as a significant step forward, offering great potential for developing advanced future technologies. Since the electrical activity of the brain is affected by emotions, the electroencephalogram (EEG) offers an interesting channel for improving HCI. In this paper, the selection of a subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying a person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In future, further analysis of the video-induced EEG changes, including the topographical differences in the spectral features, is needed.
Das, Nibaran; Mollah, Ayatullah Faruk; Sarkar, Ram; Basu, Subhadip
The work presents a comparative assessment of seven different feature sets for recognition of handwritten Arabic numerals using a Multi Layer Perceptron (MLP) based classifier. The seven feature sets employed here consist of shadow features, octant centroids, longest runs, angular distances, effective spans, dynamic centers of gravity, and some of their combinations. On experimentation with a database of 3000 samples, the maximum recognition rate of 95.80% is observed with both of two separat...
In this paper, a novel approach for identifying normal and obscene videos is proposed. In order to classify different episodes of a video independently and avoid processing all frames, key frames are first extracted and skin regions are detected for groups of video frames starting with key frames. In the second step, three types of features are extracted for each episode of video: (1) structural features based on single-frame information, (2) features based on spatiotemporal volume, and (3) motion-based features. The PCA-LDA method is then applied to reduce the size of structural features and select more distinctive features. For the final step, we use a fuzzy or a Weighted Support Vector Machine (WSVM) classifier to identify video episodes. We also employ a multilayer Kohonen network as an initial clustering algorithm to improve discrimination of the extracted features between the two classes of videos. Features based on motion and periodicity characteristics increase the efficiency of the proposed algorithm in videos with bad illumination and skin colour variation. The proposed method is evaluated using 1100 videos in different environmental and illumination conditions. The experimental results show a correct recognition rate of 94.2% for the proposed algorithm.
Jain, Lalit Prithviraj
Real-world tasks in computer vision, pattern recognition and machine learning often touch upon the open set recognition problem: multi-class recognition with incomplete knowledge of the world and many unknown inputs. An obvious way to approach such problems is to develop a recognition system that thresholds probabilities to reject unknown classes. Traditional rejection techniques are not about the unknown; they are about the uncertain boundary and rejection around that boundary. Thus traditional techniques only represent the "known unknowns". However, a proper open set recognition algorithm is needed to reduce the risk from the "unknown unknowns". This dissertation examines this concept and finds that existing probabilistic multi-class recognition approaches are ineffective for true open set recognition. We hypothesize that the cause is weak ad hoc assumptions combined with closed-world assumptions made by existing calibration techniques. Intuitively, if we could accurately model just the positive data for any known class without overfitting, we could reject the large set of unknown classes even under this assumption of incomplete class knowledge. For this, we formulate the problem as one of modeling positive training data by invoking statistical extreme value theory (EVT) near the decision boundary of positive data with respect to negative data. We provide a new algorithm called the PI-SVM for estimating the unnormalized posterior probability of class inclusion. This dissertation also introduces a new open set recognition model called Compact Abating Probability (CAP), where the probability of class membership decreases in value (abates) as points move from known data toward open space. We show that CAP models improve open set recognition for multiple algorithms. Leveraging the CAP formulation, we go on to describe the novel Weibull-calibrated SVM (W-SVM) algorithm, which combines the useful properties of statistical EVT for score calibration with one-class and binary
The Department of Environmental Analysis at the Kentucky Transportation Cabinet has expressed an interest in feature-recognition capability because it may help analysts identify environmentally sensitive features in the landscape, including those r...
da Costa, R M; Gonzaga, A
The human eye is sensitive to visible light. Increasing illumination on the eye causes the pupil of the eye to contract, while decreasing illumination causes the pupil to dilate. Visible light causes specular reflections inside the iris ring. On the other hand, the human retina is less sensitive to near infra-red (NIR) radiation in the wavelength range from 800 nm to 1400 nm, but iris detail can still be imaged with NIR illumination. In order to measure the dynamic movement of the human pupil and iris while keeping the light-induced reflexes from affecting the quality of the digitalized image, this paper describes a device based on the consensual reflex. This biological phenomenon contracts and dilates the two pupils synchronously when illuminating one of the eyes by visible light. In this paper, we propose to capture images of the pupil of one eye using NIR illumination while illuminating the other eye using a visible-light pulse. This new approach extracts iris features called "dynamic features (DFs)." This innovative methodology proposes the extraction of information about the way the human eye reacts to light, and to use such information for biometric recognition purposes. The results demonstrate that these features are discriminating features, and, even using the Euclidean distance measure, an average accuracy of recognition of 99.1% was obtained. The proposed methodology has the potential to be "fraud-proof," because these DFs can only be extracted from living irises.
Gilbert, Andrew; Illingworth, John; Bowden, Richard
The field of Action Recognition has seen a large increase in activity in recent years. Much of the progress has been through incorporating ideas from single-frame object recognition and adapting them for temporal-based action recognition. Inspired by the success of interest points in the 2D spatial domain, their 3D (space-time) counterparts typically form the basic components used to describe actions, and in action recognition the features used are often engineered to fire sparsely. This is to ensure that the problem is tractable; however, this can sacrifice recognition accuracy as it cannot be assumed that the optimum features in terms of class discrimination are obtained from this approach. In contrast, we propose to initially use an overcomplete set of simple 2D corners in both space and time. These are grouped spatially and temporally using a hierarchical process, with an increasing search area. At each stage of the hierarchy, the most distinctive and descriptive features are learned efficiently through data mining. This allows large amounts of data to be searched for frequently reoccurring patterns of features. At each level of the hierarchy, the mined compound features become more complex, discriminative, and sparse. This results in fast, accurate recognition with real-time performance on high-resolution video. As the compound features are constructed and selected based upon their ability to discriminate, their speed and accuracy increase at each level of the hierarchy. The approach is tested on four state-of-the-art data sets, the popular KTH data set to provide a comparison with other state-of-the-art approaches, the Multi-KTH data set to illustrate performance at simultaneous multiaction classification, despite no explicit localization information provided during training. Finally, the recent Hollywood and Hollywood2 data sets provide challenging complex actions taken from commercial movie sequences. For all four data sets, the proposed hierarchical
Hildebrandt, Mario; Kiltz, Stefan; Dittmann, Jana; Vielhauer, Claus
In crime scene forensics, latent fingerprints are found on various substrates. Nowadays, primarily physical or chemical preprocessing techniques are applied to enhance the visibility of the fingerprint trace. In order to avoid altering the trace, it has been shown that contact-less sensors offer a non-destructive acquisition approach. Here, the exploitation of fingerprint or substrate properties and the use of signal processing techniques are essential to enhance the fingerprint visibility. However, the optimal sensor technology in particular is often substrate-dependent. An enhanced generic pattern-recognition-based contrast enhancement approach for scans of a chromatic white light sensor is introduced in Hildebrandt et al. [1] using statistical, structural and Benford's law [2] features for blocks of 50 microns. This approach achieves very good results for latent fingerprints on cooperative, non-textured, smooth substrates. However, on textured and structured substrates the error rates are very high, and the approach is thus unsuitable for forensic use cases. We propose the extension of the feature set with semantic features derived from known Gabor filter based exemplar fingerprint enhancement techniques by suggesting an Epsilon-neighborhood of each block in order to achieve an improved accuracy (called fingerprint ridge orientation semantics). Furthermore, we use rotation-invariant Hu moments as an extension of the structural features and two additional preprocessing methods (separate X and Y Sobel operators). This results in a 408-dimensional feature space. In our experiments we investigate and report the recognition accuracy for eight substrates, each with ten latent fingerprints: white furniture surface, veneered plywood, brushed stainless steel, aluminum foil, "Golden-Oak" veneer, non-metallic matte car body finish, metallic car body finish and blued metal. In comparison to Hildebrandt et al. [1], our evaluation shows a significant reduction of the error rates
Wan Min; Xiang Rujian; Wan Yongxing
The characteristics of objects, especially flying objects, are analyzed, including characteristics of spectrum, image and motion. Feature extraction is also achieved. To improve the speed of object recognition, a feature database is used to simplify the data in the source database. The feature vs. object relationship maps are stored in the feature database. An object recognition model based on the feature database is presented, and the way to achieve object recognition is also explained.
Ren, X; Tian, Q; Zhang, J; Wu, S; Zeng, Y
In iris recognition, feature extraction can be influenced by factors such as illumination and contrast, and thus the features extracted may be unreliable, which can cause a high rate of false results in iris pattern recognition. In order to obtain stable features, an algorithm was proposed in this paper to extract key features of a pattern from multiple images. The proposed algorithm built an iris feature template by extracting key features and performed iris identity enrolment. Simulation results showed that the selected key features have high recognition accuracy on the CASIA Iris Set, where both contrast and illumination variance exist.
Nasrollahi, Kamal; Moeslund, Thomas B.
Biometric recognition is still a very difficult task in real-world scenarios wherein unforeseen changes in degradation factors like noise, occlusion, blurriness and illumination can drastically affect the features extracted from the biometric signals. Very recently, Haar-like rectangular features, which have usually been used for object detection, were introduced for biometric recognition, resulting in systems that are robust against most of the mentioned degradations. The problem with these features is that one can define many different such features for a given biometric signal, and it is not clear whether all of these features are required for the actual recognition or not. This is exactly what we are dealing with in this paper: How can an initial set of Haar-like rectangular features, that have been used for biometric recognition, be reduced to a set of most influential features...
Jamal Ahmad Dargham
Face recognition is an important biometric method because of its potential applications in many fields, such as access control, surveillance, and human-computer interaction. In this paper, a face recognition system that fuses the outputs of three face recognition systems based on Gabor jets is presented. The first system uses the magnitude, the second uses the phase, and the third uses the phase-weighted magnitude of the jets. The jets are generated from facial landmarks selected using three selection methods. It was found that fusing the facial features gives a better recognition rate than any facial feature used individually, regardless of the landmark selection method.
ROH Yong-Wan; KIM Dong-Ju; LEE Woo-Seok; HONG Kwang-Seok
This paper focuses on acoustic features that effectively improve the recognition of emotion in human speech. The novel features in this paper are based on spectral-based entropy parameters such as fast Fourier transform (FFT) spectral entropy, delta FFT spectral entropy, Mel-frequency filter bank (MFB) spectral entropy, and delta MFB spectral entropy. Spectral-based entropy features are simple. They reflect the frequency characteristics of speech and how those characteristics change. We implement an emotion rejection module using the probability distribution of recognized-scores and rejected-scores. This reduces the false recognition rate to improve overall performance. Recognized-scores and rejected-scores refer to probabilities of recognized and rejected emotion recognition results, respectively. These scores are first obtained from a pattern recognition procedure. The pattern recognition phase uses the Gaussian mixture model (GMM). We classify the four emotional states as anger, sadness, happiness and neutrality. The proposed method is evaluated using 45 sentences in each emotion for 30 subjects, 15 males and 15 females. Experimental results show that the proposed method is superior to the existing emotion recognition methods based on GMM using energy, Zero Crossing Rate (ZCR), linear prediction coefficient (LPC), and pitch parameters. We demonstrate the effectiveness of the proposed approach. One of the proposed features, the combined MFB and delta MFB spectral entropy, improves performance by approximately 10% compared to the existing feature parameters for speech emotion recognition methods. We demonstrate a 4% performance improvement when emotion rejection is applied to results with low confidence scores.
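To make the spectral entropy idea concrete, here is a minimal sketch under stated assumptions: the power spectrum of a frame is normalized into a probability distribution and its Shannon entropy (in bits) is taken, and the delta feature is the frame-to-frame difference. The abstract does not give the exact definitions, so these choices, the plain DFT, and the names `power_spectrum`, `spectral_entropy` and `delta` are all illustrative rather than the authors' method.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """One-sided power spectrum via a direct DFT (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame: Sequence[float]) -> float:
    """Shannon entropy (bits) of the normalized power spectrum."""
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0  # silent frame: define entropy as zero
    probs = [p / total for p in spectrum]
    return sum(-p * math.log2(p) for p in probs if p > 0.0)


def delta(values: Sequence[float]) -> List[float]:
    """First-order difference between consecutive frame-level values."""
    return [b - a for a, b in zip(values, values[1:])]
```

A pure tone concentrates its energy in one frequency bin and so has entropy near zero, while an impulse spreads energy evenly over all bins and reaches the maximum entropy for that frame length.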
Bakircioglu, Hakan; Gelenbe, Erol
Detection and recognition of target signatures in sensory data obtained by synthetic aperture radar (SAR), forward- looking infrared, or laser radar, have received considerable attention in the literature. In this paper, we propose a feature based target classification methodology to detect and classify targets in cluttered SAR images, that makes use of selective signature data from sensory data, together with a neural network technique which uses a set of trained networks based on the Random Neural Network (RNN) model (Gelenbe 89, 90, 91, 93) which is trained to act as a matched filter. We propose and investigate radial features of target shapes that are invariant to rotation, translation, and scale, to characterize target and clutter signatures. These features are then used to train a set of learning RNNs which can be used to detect targets within clutter with high accuracy, and to classify the targets or man-made objects from natural clutter. Experimental data from SAR imagery is used to illustrate and validate the proposed method, and to calculate Receiver Operating Characteristics which illustrate the performance of the proposed algorithm.
This paper proposes a boosted linear discriminant analysis (LDA) solution on features extracted by the multilinear principal component analysis (MPCA) to enhance gait recognition performance. Three-dimensional gait objects are projected in the MPCA space first to obtain low-dimensional tensorial features. Then, lower-dimensional vectorial features are obtained through discriminative feature selection. These feature vectors are then fed into an LDA-style booster, where several regularized and weakened LDA learners work together to produce a strong learner through a novel feature weighting and sampling process. The LDA learner employs a simple nearest-neighbor classifier with a weighted angle distance measure for classification. The experimental results on the NIST/USF “Gait Challenge” data-sets show that the proposed solution has successfully improved the gait recognition performance and outperformed several state-of-the-art gait recognition algorithms.
This brief presents a comprehensive introduction to feature coding, which serves as a key module for the typical object recognition pipeline. The text offers a rich blend of theory and practice while reflecting recent developments in feature coding, covering the following five aspects: (1) Review the state-of-the-art, analyzing the motivations and mathematical representations of various feature coding methods; (2) Explore how various feature coding algorithms have evolved over the years; (3) Summarize the main characteristics of typical feature coding algorithms and categorize them accordingly; (4) D
Lately, a lot of research effort has been devoted to recognizing human beings by their biometric characteristics. Biometric recognition systems are used in various applications, e.g., identification for state border crossing or a firearm that allows only enrolled persons to use it. In this paper, biometric characteristics and their properties are reviewed. Development of a high-accuracy system requires distinctive and permanent characteristics, whereas development of a user-friendly system requires collectable and acceptable characteristics. It is shown that properties of biometric characteristics do not influence research effort significantly. Properties of biometric characteristic features and their influence are discussed. Article in Lithuanian.
Prevost, Donald; Doucet, Michel; Bergeron, Alain; Veilleux, Luc; Chevrette, Paul C.; Gingras, Denis J.
A rotation, scale and translation invariant pattern recognition technique is proposed. It is based on Fourier-Mellin Descriptors (FMD). Each FMD is taken as an independent feature of the object, and a set of those features forms a signature. FMDs are naturally rotation invariant. Translation invariance is achieved through pre-processing. A proper normalization of the FMDs gives the scale invariance property. This approach offers the double advantage of providing invariant signatures of the objects, and a dramatic reduction of the amount of data to process. The compressed invariant feature signature is next presented to a multi-layered perceptron neural network. This final step provides some robustness to the classification of the signatures, enabling good recognition behavior under anamorphically scaled distortion. We also present an original feature extraction technique, adapted to optical calculation of the FMDs. A prototype optical set-up was built, and experimental results are presented.
The classification and recognition of underwater acoustic signals has always been an important research topic in the field of underwater acoustic signal processing. Currently, the wavelet transform, Hilbert-Huang transform, and Mel frequency cepstral coefficients are used as methods of underwater acoustic signal feature extraction. In this paper, a method for feature extraction and identification of underwater noise data based on CNN and ELM is proposed. An automatic feature extraction method for underwater acoustic signals is proposed using a deep convolutional network, and an underwater target recognition classifier is based on an extreme learning machine (ELM). Although convolutional neural networks can perform both feature extraction and classification, their classification mainly relies on a fully connected layer trained by gradient descent, whose generalization ability is limited and suboptimal, so an ELM is used in the classification stage. First, the CNN learns deep and robust features, after which the fully connected layers are removed. Then an ELM fed with the CNN features is used as the classifier. Experiments on an actual data set of civil ships obtained a 93.04% recognition rate; compared to the traditional Mel frequency cepstral coefficients and Hilbert-Huang features, the recognition rate is greatly improved.
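The classifier stage mentioned in this abstract can be sketched as a basic extreme learning machine: the hidden layer is random and fixed, and only the output weights are fitted in closed form. The sketch below assumes feature vectors are already available (the CNN feature extractor is not shown), uses tanh hidden units and a small ridge term, and solves the regularized normal equations by Gaussian elimination. The class name `ELM`, the helper `solve`, and all parameter values are illustrative assumptions, not the paper's configuration.

```python
import math
import random
from typing import List, Sequence

Matrix = List[List[float]]


def solve(a: Matrix, b: Matrix) -> Matrix:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n, m = len(a), len(b[0])
    aug = [a[i][:] + b[i][:] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + m):
                aug[r][c] -= factor * aug[col][c]
    x = [[0.0] * m for _ in range(n)]
    for i in range(n - 1, -1, -1):  # back substitution
        for j in range(m):
            rest = sum(aug[i][k] * x[k][j] for k in range(i + 1, n))
            x[i][j] = (aug[i][n + j] - rest) / aug[i][i]
    return x


class ELM:
    """Single-hidden-layer network with random, untrained hidden weights."""

    def __init__(self, n_inputs: int, n_hidden: int, n_classes: int,
                 ridge: float = 1e-3, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.n_classes = n_classes
        self.ridge = ridge
        self.beta: Matrix = []

    def _hidden(self, x: Sequence[float]) -> List[float]:
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs: Sequence[Sequence[float]], labels: Sequence[int]) -> None:
        """Fit output weights: beta = (H^T H + ridge * I)^-1 H^T T."""
        h = [self._hidden(x) for x in xs]
        n_hidden = len(self.w)
        hth = [[sum(row[i] * row[j] for row in h)
                + (self.ridge if i == j else 0.0)
                for j in range(n_hidden)] for i in range(n_hidden)]
        # T is the one-hot encoding of the labels.
        htt = [[sum(row[i] for row, lab in zip(h, labels) if lab == c)
                for c in range(self.n_classes)] for i in range(n_hidden)]
        self.beta = solve(hth, htt)

    def predict(self, x: Sequence[float]) -> int:
        h = self._hidden(x)
        scores = [sum(h[i] * self.beta[i][c] for i in range(len(h)))
                  for c in range(self.n_classes)]
        return max(range(self.n_classes), key=scores.__getitem__)
```

Because no gradient descent is involved, training reduces to one linear solve, which is the usual argument for pairing an ELM with features learned elsewhere.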
Odinokikh, G.; Fartukov, A.; Korobkin, M.; Yoo, J.
One of the basic stages of the iris recognition pipeline is the iris feature vector construction procedure. The procedure represents the extraction of iris texture information relevant to its subsequent comparison. Thorough investigation of feature vectors obtained from the iris showed that not all the vector elements are equally relevant. There are two characteristics which determine the vector element utility: fragility and discriminability. Conventional iris feature extraction methods consider the concept of fragility as the feature vector instability without respect to the nature of such instability. This work separates sources of the instability into natural and encoding-induced, which helps to investigate each source of instability independently and in depth. According to the separation concept, a novel approach to iris feature vector construction is proposed. The approach consists of two steps: iris feature extraction using Gabor filtering with optimal parameters, and quantization with separate, preliminarily optimized fragility thresholds. The proposed method has been tested on two different datasets of iris images captured under changing environmental conditions. The testing results show that the proposed method surpasses all the prior-art methods considered in recognition accuracy on both datasets.
Xi, Xiaoming; Yang, Gongping; Yin, Yilong; Meng, Xianjing
Finger veins are a promising biometric pattern for personalized identification in terms of their advantages over existing biometrics. Based on the spatial pyramid representation and the combination of more effective information such as gray, texture and shape, this paper proposes a simple but powerful feature, called Pyramid Histograms of Gray, Texture and Orientation Gradients (PHGTOG). For a finger vein image, PHGTOG can reflect the global spatial layout and local details of gray, texture and shape. To further improve the recognition performance and reduce the computational complexity, we select a personalized subset of features from PHGTOG for each subject by using a sparse weight vector, which is trained by using LASSO; the resulting feature is called PFS-PHGTOG. We conduct extensive experiments to demonstrate the promise of PHGTOG and PFS-PHGTOG. Experimental results on our databases show that PHGTOG outperforms the other existing features. Moreover, PFS-PHGTOG can further boost the performance in comparison with PHGTOG.
Long-term place recognition in outdoor environments remains a challenge due to high appearance changes in the environment. The problem becomes even more difficult when the matching between two scenes has to be made with information coming from different visual sources, particularly with different spectral ranges. For instance, an infrared camera is helpful for night vision in combination with a visible camera. In this paper, we emphasize our work on testing usual feature point extractors under both constraints: repeatability across spectral ranges and long-term appearance. We develop a new feature extraction method dedicated to improving the repeatability across spectral ranges. We conduct an evaluation of feature robustness on long-term datasets coming from different imaging sources (optics, sensor size and spectral ranges) with a Bag-of-Words approach. The tests we perform demonstrate that our method brings a significant improvement on the image retrieval issue in a visual place recognition context, particularly when there is a need to associate images from various spectral ranges such as infrared and visible: we have evaluated our approach using visible, Near InfraRed (NIR), Short Wavelength InfraRed (SWIR) and Long Wavelength InfraRed (LWIR).
Khotimah, C.; Juniati, D.
Biometrics is a science that is now growing rapidly. Iris recognition is a biometric modality which captures a photo of the eye pattern. The markings of the iris are so distinctive that they have been proposed as a means of identification, instead of fingerprints. Iris recognition was chosen for identification in this research because the iris pattern differs between individuals and the iris is protected by the cornea, so its shape remains fixed. The iris recognition process consists of three steps: data pre-processing, feature extraction, and feature matching. The Hough transform is used during pre-processing to locate the iris area, and Daugman’s rubber sheet model is used to normalize the iris data set into rectangular blocks. To characterize the iris, the box-counting method was used to obtain the fractal dimension of the iris. Tests were carried out using k-fold cross-validation with k = 5. Each test used 10 different values of K for the K-Nearest Neighbor (KNN) classifier. The best iris recognition accuracy was 92.63% for K = 3 with the KNN method.
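The box-counting step named in this abstract can be illustrated with a minimal sketch. It assumes the normalized iris block has already been binarized into a 2D grid of 0/1 values; the fractal dimension is then estimated as the least-squares slope of log N(s) against log(1/s), where N(s) is the number of s-by-s boxes containing at least one foreground pixel. The function names and the default box sizes are hypothetical choices, not taken from the paper.

```python
import math
from typing import Sequence


def count_boxes(image: Sequence[Sequence[int]], size: int) -> int:
    """Count size-by-size boxes that contain at least one nonzero pixel."""
    rows, cols = len(image), len(image[0])
    occupied = 0
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            if any(image[r][c]
                   for r in range(top, min(top + size, rows))
                   for c in range(left, min(left + size, cols))):
                occupied += 1
    return occupied


def box_counting_dimension(image: Sequence[Sequence[int]],
                           sizes: Sequence[int] = (1, 2, 4, 8)) -> float:
    """Estimate the fractal dimension as the slope of log N(s) vs log(1/s)."""
    counts = [count_boxes(image, s) for s in sizes]
    if min(counts) == 0:
        raise ValueError("image has no foreground pixels")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(n) for n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

As a sanity check, a filled square gives a dimension of 2 and a straight line gives 1; textured patterns such as iris blocks would fall in between, and that single value (or one value per block) can serve as the feature passed to a KNN classifier.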
feature space ... 5.3.1 Preserving neighborhood relationships during coding ... 5.3.2 Letting only neighbors vote ... Letting only neighbors vote during pooling. Pooling involves extracting an ensemble statistic from a potentially large group of inputs. However ... element. For slicing the 4D tensor S we adopt the MATLAB notation for simplicity. function ConvCoD(x, D, α) Set: S = D^T ∗ D Initialize: z = 0; β
Shen, Linlin; Bai, Li
A discriminative and robust feature, the kernel enhanced informative Gabor feature, is proposed in this paper for face recognition. Mutual information is applied to select a set of informative and nonredundant Gabor features, which are then further enhanced by kernel methods for recognition. Compared with one of the top performing methods in the 2004 Face Verification Competition (FVC2004), our method demonstrates a clear advantage over existing methods in accuracy, computational efficiency, and memory cost. The proposed method has been fully tested on the FERET database using the FERET evaluation protocol. Significant improvements on three of the test data sets are observed. Compared with the classical Gabor wavelet-based approaches using a huge number of features, our method requires less than 4 milliseconds to retrieve a few hundred features. Due to the substantially reduced feature dimension, only 4 seconds are required to recognize 200 face images. The paper also unifies different Gabor filter definitions and proposes a training sample generation algorithm to reduce the effects caused by unbalanced numbers of samples available in different classes.
Speech recognition concerns what is being said, irrespective of who is saying it. Speech recognition is a growing field, and major progress is taking place in automatic speech recognition (ASR) technology. Still, there are many barriers in this field in terms of recognition rate, background noise, speaker variability, speaking rate, accent, etc. The speech recognition rate depends mainly on the selection of features and feature extraction methods. This paper outlines feature extraction techniques for speaker-dependent speech recognition of isolated words. A brief survey of different feature extraction techniques, namely Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPCC), Perceptual Linear Prediction (PLP) and Relative Spectra Perceptual Linear Predictive (RASTA-PLP) analysis, is presented and an evaluation is carried out. Speech recognition has various applications, from daily use to commercial use. We have built a speaker-dependent system, which can be useful in many areas, such as controlling a patient vehicle using simple commands.
The Tifinagh alphabet-IRCAM is the official alphabet of the Amazigh language widely used in North Africa. It includes thirty-one basic letters and two letters each composed of a base letter followed by the sign of labialization. Normalized only in 2003 (Unicode), IRCAM-Tifinagh is a young character repertoire that needs more work on all levels. In this context we propose a data set of handwritten Tifinagh characters composed of 1376 images, 43 images for each character. The dataset can be used to train a Tifinagh character recognition system, or to extract the meaningful characteristics of each character.
Wong, Wai Keung; Lai, Zhihui; Xu, Yong; Wen, Jiajun; Ho, Chu Po
Tensor-based object recognition has been widely studied in the past several years. This paper focuses on the issue of joint feature selection from tensor data and proposes a novel method called joint tensor feature analysis (JTFA) for tensor feature extraction and recognition. In order to obtain a set of jointly sparse projections for tensor feature extraction, we define the modified within-class tensor scatter value and the modified between-class tensor scatter value for regression. The k-mode optimization technique and the L(2,1)-norm jointly sparse regression are combined to compute the optimal solutions. The convergence analysis, computational complexity analysis and the essence of the proposed method/model are also presented. Interestingly, the proposed method is very similar to singular value decomposition of the scatter matrix with a sparsity constraint on the right singular value matrix, or to eigen-decomposition of the scatter matrix in a sparse manner. Experimental results on some tensor datasets indicate that JTFA outperforms some well-known tensor feature extraction and selection algorithms.
Irhebhude, Martins E.; Edirisinghe, Eran A.
This paper presents an automatic, machine vision based, military personnel identification and classification system. Classification is done using a Support Vector Machine (SVM) on datasets of personnel in Army, Air Force and Navy camouflage uniforms. In the proposed system, the arm of service of personnel is recognised from the camouflage of a person's uniform, the type of cap and the type of badge/logo. The detailed analysis includes: camouflage cap and plain cap differentiation using gray level co-occurrence matrix (GLCM) texture features; classification of Army, Air Force and Navy camouflaged uniforms using GLCM texture and colour histogram bin features; and plain cap badge classification into Army, Air Force and Navy using Speeded Up Robust Features (SURF). The proposed method recognised the arm of service of camouflaged personnel on sets of data retrieved from Google Images and selected military websites. Correlation-based Feature Selection (CFS) was used to improve recognition and reduce dimensionality, thereby speeding up the classification process. With this method, the success rates recorded during the analysis include 93.8% for the camouflage appearance category, and 100%, 90% and 100% for the plain cap and camouflage cap categories for Army, Air Force and Navy, respectively. Accurate recognition was recorded using SURF for the plain cap badge category. Substantial analysis has been carried out, and the results show that the proposed method can correctly classify military personnel into various arms of service. We show that the proposed method can be integrated into a face recognition system, which will recognise personnel in addition to determining the arm of service to which the personnel belong. Such a system can be used to enhance the security of a military base or facility.
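To make the GLCM texture feature in the abstract above concrete, here is a minimal sketch of a gray level co-occurrence matrix and two classic statistics derived from it. It assumes an image already quantized to integer gray levels in `range(levels)`; the offset and the choice of contrast and energy are illustrative assumptions, since the abstract does not list the statistics or offsets the authors used.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized gray level co-occurrence matrix for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]

def glcm_contrast(p):
    """Contrast: large when neighbouring pixels differ strongly in gray level."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def glcm_energy(p):
    """Energy (angular second moment): large for uniform textures."""
    return sum(v * v for row in p for v in row)
```

In a pipeline like the one described, such statistics would typically be concatenated with colour histogram bins into a single feature vector before classification.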
Turmon, Michael; Jones, Harrison P.; Malanushenko, Olena V.; Pap, Judit M.
A maximum a posteriori (MAP) technique is developed to identify solar features in cotemporal and cospatial images of line-of-sight magnetic flux, continuum intensity, and equivalent width observed with the NASA/National Solar Observatory (NSO) Spectromagnetograph (SPM). The technique facilitates human understanding of patterns in large data sets and enables systematic studies of feature characteristics for comparison with models and observations of long-term solar activity and variability. The method uses Bayes’ rule to compute the posterior probability of any feature segmentation of a trio of observed images from per-pixel, class-conditional probabilities derived from independently-segmented training images. Simulated annealing is used to find the most likely segmentation. New algorithms for computing class-conditional probabilities from three-dimensional Gaussian mixture models and interpolated histogram densities are described and compared. A new extension to the spatial smoothing in the Bayesian prior model is introduced, which can incorporate a spatial dependence such as center-to-limb variation. How the spatial scale of training segmentations affects the results is discussed, and a new method for statistical separation of quiet Sun and quiet network is presented.
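The core of the Bayes' rule step in the abstract above can be sketched for a single pixel. This is a deliberate simplification: it ignores the spatial prior and the simulated annealing search the abstract describes, and the class names and probability values used with it are hypothetical.

```python
def posterior(priors, likelihoods):
    """Bayes' rule for one pixel:
    P(class | obs) is proportional to P(obs | class) * P(class)."""
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}

def map_label(priors, likelihoods):
    """Return the class with the highest posterior probability."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)
```

With a prior that strongly favours one class, a pixel is assigned to a rarer class only when its class-conditional likelihood is high enough to outweigh that prior.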
Petridis, Stavros; Pantic, Maja
Deep bottleneck features (DBNFs) have been used successfully in the past for acoustic speech recognition from audio. However, research on extracting DBNFs for visual speech recognition is very limited. In this work, we present an approach to extract deep bottleneck visual features based on deep
Image scene recognition is a core technology for many aerial remote sensing applications. Different landforms are input as different scenes in aerial imaging, and all landform information is regarded as valuable for aerial image scene recognition. However, the conventional features of the Bag-of-Words model are designed using local points or other related information and thus are unable to fully describe landform areas. This limitation cannot be ignored when the aim is to ensure accurate aerial scene recognition. A novel superpixel-based feature is proposed in this study to characterize aerial image scenes. Then, based on the proposed feature, a scene recognition method of the Bag-of-Words model for aerial imaging is designed. The proposed superpixel-based feature that utilizes landform information establishes top-task superpixel extraction of landforms to bottom-task expression of feature vectors. This characterization technique comprises the following steps: simple linear iterative clustering based superpixel segmentation, adaptive filter bank construction, Lie group-based feature quantification, and visual saliency model-based feature weighting. Image scene recognition experiments are carried out using real image data captured by an unmanned aerial vehicle (UAV). The recognition accuracy of the proposed superpixel-based feature is 95.1%, which is higher than those of scene recognition algorithms based on other local features.
Surinta, Olarik; Karaaba, Mahir F.; Schomaker, Lambert R.B.; Wiering, Marco A.
In this paper we propose to use local gradient feature descriptors, namely the scale invariant feature transform keypoint descriptor and the histogram of oriented gradients, for handwritten character recognition. The local gradient feature descriptors are used to extract feature vectors
Duan, Xiaodong; Tan, Zheng-Hua
In this paper, we present a local feature learning method for face recognition to deal with varying poses. As opposed to the commonly used approaches of recovering frontal face images from profile views, the proposed method extracts the subject related part from a local feature by removing the pose related part in it on the basis of a pose feature. The method has a closed-form solution, hence being time efficient. For performance evaluation, cross pose face recognition experiments are conducted on two public face recognition databases FERET and FEI. The proposed method shows a significant recognition improvement under varying poses over general local feature approaches and outperforms or is comparable with related state-of-the-art pose invariant face recognition approaches. Copyright ©2015 by IEEE.
To overcome the limitations of single-mode emotion recognition, this paper describes a novel multimodal emotion recognition algorithm that takes the speech signal and the facial expression signal as its research subjects. First, the speech signal features and facial expression signal features are fused, sample sets are obtained by sampling with replacement, and classifiers are trained with a BP neural network (BPNN). Second, the difference between two classifiers is measured by a double error difference selection strategy. Finally, the final recognition result is obtained by the majority voting rule. Experiments show that the method improves the accuracy of emotion recognition by exploiting the advantages of both decision-level fusion and feature-level fusion, and brings the whole fusion process closer to human emotion recognition, with a recognition rate of 90.4%.
Both static features and motion features have shown promising performance in the human activity recognition task. However, the information included in these features is insufficient for complex human activities. In this paper, we propose extracting relational information between static features and motion features for human activity recognition. The videos are represented by a classical Bag-of-Words (BoW) model, which has proved useful in many works. To get a compact and discriminative codebook of small dimension, we employ a divisive algorithm based on KL-divergence to reconstruct the codebook. After that, to further capture strong relational information, we construct a bipartite graph to model the relationship between words of different feature sets. Then we use a k-way partition to create a new codebook in which similar words are grouped together. With this new codebook, videos can be represented by a new BoW vector with strong relational information. Moreover, we propose a method to compute new clusters from the divisive algorithm’s projective function. We test our work on several datasets and obtain very promising results.
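The pipeline in the abstract above (feature-level fusion, sampling with replacement, then majority voting) can be sketched as follows. Note the loud assumption: the classifier here is a simple nearest-centroid stand-in, not the BP neural network the authors used, and the double error difference selection step is omitted. All function names are hypothetical.

```python
import random
from collections import Counter

def fuse(speech_feats, face_feats):
    """Feature-level fusion: concatenate the two feature vectors."""
    return list(speech_feats) + list(face_feats)

def bootstrap(samples, rng):
    """Draw len(samples) items with replacement (one bagging sample set)."""
    return [rng.choice(samples) for _ in samples]

def train_centroids(samples):
    """Stand-in classifier (NOT a BPNN): one mean vector per class label."""
    sums, counts = {}, Counter()
    for x, label in samples:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] += 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], x)))

def majority_vote(labels):
    """Decision-level fusion: the most frequent predicted label wins."""
    return Counter(labels).most_common(1)[0][0]

def ensemble_predict(models, x):
    return majority_vote([predict(m, x) for m in models])
```

A typical use would train one model per bootstrap sample, for example `models = [train_centroids(bootstrap(data, rng)) for _ in range(5)]` with `rng = random.Random(0)`, and then call `ensemble_predict(models, fuse(speech, face))`.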
Clemmensen, Line Katrine Harder; Gomez, David Delgado; Ersbøll, Bjarne Kjær
The accuracy of data classification methods depends considerably on the data representation and on the selected features. In this work, the elastic net model selection is used to identify meaningful and important features in face recognition. Modelling the characteristics which distinguish one person from another using only subsets of features will both decrease the computational cost and increase the generalization capacity of the face recognition algorithm. Moreover, identifying which are the features that better discriminate between persons will also provide a deeper understanding of the face recognition problem. The elastic net model is able to select a subset of features with low computational effort compared to other state-of-the-art feature selection methods. Furthermore, the fact that the number of features usually is larger than the number of images in the data base makes feature...
Pawan Kumar Singh
Handwritten digit recognition plays a significant role in many user authentication applications in the modern world. Because handwritten digits vary in size, thickness, style, and orientation, these challenges must be addressed to solve the problem. A lot of work has been done for various non-Indic scripts, particularly Roman, but research on Indic scripts is limited. This paper presents a script-invariant handwritten digit recognition system for identifying digits written in five popular scripts of the Indian subcontinent, namely, Indo-Arabic, Bangla, Devanagari, Roman, and Telugu. A 130-element feature set, which is a combination of six different types of moments, namely, geometric moment, moment invariant, affine moment invariant, Legendre moment, Zernike moment, and complex moment, has been estimated for each digit sample. Finally, the technique is evaluated on the CMATER and MNIST databases using multiple classifiers and, after performing statistical significance tests, it is observed that the Multilayer Perceptron (MLP) classifier outperforms the others. Satisfactory recognition accuracies are attained for all five scripts.
Goltsev, Alexander; Gritsenko, Vladimir
In the paper, effective and simple features for image recognition (named LiRA-features) are investigated in the task of handwritten digit recognition. Two neural network classifiers are considered: a modified 3-layer perceptron LiRA and a modular assembly neural network. A method of feature selection is proposed that analyses connection weights formed in the preliminary learning process of a neural network classifier. In the experiments using the MNIST database of handwritten digits, the feature selection procedure allows reduction of the number of features (from 60 000 to 7000) while preserving comparable recognition capability and accelerating computations. An experimental comparison between the LiRA perceptron and the modular assembly neural network shows that the recognition capability of the modular assembly neural network is somewhat better. Copyright © 2011 Elsevier Ltd. All rights reserved.
Sun, Bo; Li, Liandong; Zhou, Guoyan; He, Jun
Facial expression recognition in the wild is a very challenging task. We describe our work in static and continuous facial expression recognition in the wild. We evaluate the recognition results of gray deep features and color deep features, and explore the fusion of multimodal texture features. For the continuous facial expression recognition, we design two temporal-spatial dense scale-invariant feature transform (SIFT) features and combine multimodal features to recognize expression from image sequences. For the static facial expression recognition based on video frames, we extract dense SIFT and some deep convolutional neural network (CNN) features, including our proposed CNN architecture. We train linear support vector machine and partial least squares classifiers for those kinds of features on the static facial expression in the wild (SFEW) and acted facial expression in the wild (AFEW) datasets, and we propose a fusion network to combine all the extracted features at decision level. Our final recognition rates are 56.32% on the SFEW testing set and 50.67% on the AFEW validation set, which are much better than the baseline recognition rates of 35.96% and 36.08%.
Zhang, Ming; Xie, Fei; Zhao, Jing; Sun, Rui; Zhang, Lei; Zhang, Yue
The advance of license plate recognition technology has contributed greatly to the development of Intelligent Transport Systems (ITS). In this paper, a robust and efficient license plate recognition method is proposed, based on a combined feature extraction model and the BPNN (Back Propagation Neural Network) algorithm. First, a method for detecting and segmenting the candidate license plate region is developed. Second, a new feature extraction model is designed that combines three sets of features. Third, a license plate classification and recognition method using the combined feature model and the BPNN algorithm is presented. Finally, the experimental results indicate that both license plate segmentation and recognition can be achieved effectively by the proposed algorithm. Compared with three traditional methods, the recognition accuracy of the proposed method increased to 95.7% and the processing time decreased to 51.4 ms.
Wei, Wei; Jia, Qingxuan
Emotion recognition with weighted features based on facial expression is a challenging research topic and has attracted great attention in the past few years. This paper presents a novel method that uses subregion recognition rates to weight the kernel function. First, we divide the facial expression image into uniform subregions and calculate the corresponding recognition rates and weights. Then, we obtain a weighted feature Gaussian kernel function and construct a classifier based on the Support Vector Machine (SVM). Finally, the experimental results suggest that the approach based on the weighted feature Gaussian kernel function performs well in terms of correct rate in emotion recognition. The experiments on the extended Cohn-Kanade (CK+) dataset show that our method achieves encouraging recognition results compared to the state-of-the-art methods.
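One plausible reading of a weighted feature Gaussian kernel is sketched below. The abstract above does not give the formula, so both the kernel form (per-subregion squared distances scaled by a weight inside the exponent) and the way weights are derived from recognition rates (simple normalization) are assumptions made for illustration only.

```python
import math

def weights_from_rates(rates):
    """Assumed weighting: normalize per-subregion recognition rates to sum to 1."""
    total = sum(rates)
    return [r / total for r in rates]

def weighted_gaussian_kernel(x_regions, y_regions, weights, sigma=1.0):
    """Gaussian kernel in which each subregion's squared distance is scaled
    by its weight before entering the exponent (hypothetical form)."""
    dist = 0.0
    for xr, yr, w in zip(x_regions, y_regions, weights):
        dist += w * sum((a - b) ** 2 for a, b in zip(xr, yr))
    return math.exp(-dist / (2.0 * sigma ** 2))
```

Under this form, a subregion with weight zero has no influence on the similarity, while highly weighted subregions dominate it.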
Rao, K Sreenivasa
In this brief, the authors discuss recently explored spectral (sub-segmental and pitch synchronous) and prosodic (global and local features at word and syllable levels in different parts of the utterance) features for discerning emotions in a robust manner. The authors also delve into the complementary evidences obtained from excitation source, vocal tract system and prosodic features for the purpose of enhancing emotion recognition performance. Features based on speaking rate characteristics are explored with the help of multi-stage and hybrid models for further improving emotion recognition performance. Proposed spectral and prosodic features are evaluated on real life emotional speech corpus.
Sahar Yousefi; Morteza Zahedi
This paper proposes a robust approach for face detection and gender classification in color images. Previous research on gender recognition assumes a computationally expensive and time-consuming pre-processing step for alignment, in which face images are aligned so that facial landmarks such as the eyes, nose, lips and chin are placed in uniform locations in the image. In this paper, a novel technique based on mathematical analysis is presented in three stages that eliminates align...
Silapachote, Piyanuch; Karuppiah, Deepak R; Hanson, Allen R
We propose a classification technique for face expression recognition using AdaBoost that learns by selecting the relevant global and local appearance features with the most discriminating information...
Yan, Yan; Lee, Feifei; Wu, Xueqian; Chen, Qiu
In this paper, we propose a face recognition algorithm based on a combination of vector quantization (VQ) and Markov stationary features (MSF). The VQ algorithm has been shown to be an effective method for generating features; it extracts a codevector histogram as a facial feature representation for face recognition. Still, the VQ histogram features are unable to convey spatial structural information, which to some extent limits their usefulness in discrimination. To alleviate this limitation of VQ histograms, we utilize Markov stationary features (MSF) to extend the VQ histogram-based features so as to add spatial structural information. We demonstrate the effectiveness of our proposed algorithm by achieving recognition results superior to those of several state-of-the-art methods on publicly available face databases.
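The codevector histogram described in the abstract above can be sketched as follows. It assumes image blocks are already flattened into vectors and that a codebook is given; how the codebook is trained is outside this sketch, and the Markov stationary feature extension is not shown.

```python
def nearest_codevector(block, codebook):
    """Index of the codevector closest to block (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(block, codebook[k])))

def vq_histogram(blocks, codebook):
    """Normalized histogram of codevector indices over all blocks of one image."""
    hist = [0] * len(codebook)
    for block in blocks:
        hist[nearest_codevector(block, codebook)] += 1
    n = len(blocks)
    return [h / n for h in hist]
```

The histogram records how often each codevector occurs but not where, which is the loss of spatial structure that the abstract says the Markov stationary features are meant to address.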
Rao, K Sreenivasa
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Wang, Yuehao; Peng, Lingling; Zhe, Fuchuan
In this paper we propose a novel face recognition approach based on slow feature analysis (SFA) in the contourlet transform domain. The method first uses the contourlet transform to decompose the face image into low-frequency and high-frequency parts, and then exploits slow feature analysis for facial feature extraction. We name the new method, which combines slow feature analysis and the contourlet transform, CT-SFA. The experimental results on an international standard face database demonstrate that the new face recognition method is effective and competitive.
Huo, Guang; Liu, Yuanning; Zhu, Xiaodong; Dong, Hongxing
This paper proposes a secondary iris recognition method based on local features. The energy-orientation feature (EOF) is first extracted from the iris with a two-dimensional Gabor filter and used for a first recognition by a similarity threshold, which divides the whole iris database into two categories: a correctly recognized class and a class still to be recognized. The former are accepted, and the latter are transformed by histogram to obtain an energy-orientation histogram feature (EOHF), which is followed by a second recognition with the chi-square distance. The experiment shows that the proposed method, because of its higher correct recognition rate, could be designated as the most efficient and effective among its companion studies in iris recognition algorithms.
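The two-stage decision in the abstract above can be sketched as below. The chi-square distance is written in one common form (some definitions include an extra factor of 1/2), and both threshold values and the function names are hypothetical placeholders, since the abstract does not report them.

```python
def chi_square_distance(h1, h2):
    """Chi-square distance between two histograms: sum of (a - b)^2 / (a + b).
    Bins where both histograms are empty contribute nothing."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total

def two_stage_match(similarity, probe_hist, gallery_hist,
                    sim_threshold=0.9, chi_threshold=0.1):
    """Stage 1: accept if the first-pass similarity clears the threshold.
    Stage 2: otherwise compare histogram features by chi-square distance.
    Both thresholds are illustrative assumptions."""
    if similarity >= sim_threshold:
        return True
    return chi_square_distance(probe_hist, gallery_hist) <= chi_threshold
```

Only samples that fail the first stage pay the cost of the histogram comparison, which is the point of arranging the recognition in two passes.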
Xiong, Yudian; Lu, Tongwei; Jiang, Yongyuan
As an important application in the field of text line recognition and office automation, Chinese character recognition has become an important subject of pattern recognition. However, due to the large number of Chinese characters and the complexity of their structure, Chinese character recognition is very difficult. To solve this problem, this paper proposes a method of printed Chinese character recognition based on Gabor feature extraction and a Convolutional Neural Network (CNN). The main steps are preprocessing, feature extraction, and training and classification. First, the gray-scale Chinese character image is binarized and normalized to reduce the redundancy of the image data. Second, each image is convolved with Gabor filters of different orientations, and the feature maps of the eight orientations of the Chinese characters are extracted. Third, the feature maps from the Gabor filters and the original image are convolved with learned kernels, and the results of the convolution are the input of the pooling layer. Finally, the feature vector is used for classification and recognition. In addition, the generalization capacity of the network is improved by the Dropout technique. The experimental results show that this method can effectively extract the characteristics of Chinese characters and recognize Chinese characters.
Handwritten character recognition is a challenging area of research. Many research activities in the area of character recognition have already been carried out for Indian languages such as Hindi, Bangla, Kannada, Tamil and Telugu. A literature review on handwritten character recognition indicates that, in comparison with other Indian scripts, research activities on Gujarati handwritten character recognition are very few. This paper aims to bring attention to Gujarati character recognition. Recognition of isolated Gujarati handwritten characters is proposed using three different kinds of features and their fusion. Chain code based, zone based and projection profile based features are utilized as individual features. One of the significant contributions of the proposed work is the generation of a large and representative dataset of 88,000 handwritten Gujarati characters. Experiments are carried out on this dataset. Artificial Neural Network (ANN), Support Vector Machine (SVM) and Naive Bayes (NB) classifier based methods are implemented for handwritten Gujarati character recognition. Experimental results show substantial enhancement over the state of the art and support our proposals.
Kamminga, Jacob Wilhelm; Le Viet Duc, Duc Viet; Meijers, Jan Pieter; Bisby, Helena C.; Meratnia, Nirvana; Havinga, Paul J.M.
Fundamental challenges faced by real-time animal activity recognition include variation in motion data due to changing sensor orientations, numerous features, and energy and processing constraints of animal tags. This paper aims at finding small optimal feature sets that are lightweight and robust
Oikonomopoulos, A.; Pantic, Maja
In this paper we address the problem of human activity modelling and recognition by means of a hierarchical representation of mined dense spatiotemporal features. At each level of the hierarchy, the proposed method selects feature constellations that are increasingly discriminative and
This paper introduces a method for human action recognition based on optical flow motion feature extraction. Automatic spatial and temporal alignments are combined in order to encourage temporal consistency within each action by means of an enhanced dynamic time warping (DTW) algorithm. At the same time, a fast method based on a coarse-to-fine DTW constraint is introduced to improve computational performance without reducing accuracy. The main contributions of this study include (1) a joint spatial-temporal multiresolution optical flow computation method which encodes more informative motion information than recently proposed methods, (2) an enhanced DTW method to improve the temporal consistency of motion in action recognition, and (3) a coarse-to-fine DTW constraint on motion feature pyramids to speed up recognition. Using this method, high recognition accuracy is achieved on different action databases such as the Weizmann database and the KTH database.
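For readers unfamiliar with DTW, the textbook algorithm that the abstract above builds on can be sketched as follows. This is the standard dynamic programming formulation with an optional band constraint, not the enhanced or coarse-to-fine variant the authors propose; scalar sequences and absolute difference as the local cost are assumptions made to keep the example short.

```python
def dtw_distance(a, b, window=None):
    """Classic dynamic time warping cost between two scalar sequences.
    window, if given, limits |i - j| (a Sakoe-Chiba style band)."""
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide or no warping path exists.
    w = max(n, m) if window is None else max(window, abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A narrower band reduces the number of cells evaluated, which is the same kind of trade-off between speed and alignment freedom that a coarse-to-fine constraint aims to manage.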
Li, Ming; Wang, Zengfu
Recently, dynamic facial expression recognition in videos has attracted growing attention. In this paper, we propose a novel dynamic facial expression recognition method by using geometric and texture features. In our system, the facial landmark movements and texture variations upon pairwise images are used to perform the dynamic facial expression recognition tasks. For one facial expression sequence, pairwise images are created between the first frame and each of its subsequent frames. Integration of both geometric and texture features further enhances the representation of the facial expressions. Finally, Support Vector Machine is used for facial expression recognition. Experiments conducted on the extended Cohn-Kanade database show that our proposed method can achieve a competitive performance with other methods.
In flexible automated manufacturing, robots can perform routine operations as well as recover from atypical events, provided that process-relevant information is available to the robot controller. Real time vision is among the most versatile sensing tools, yet the reliability of machine-based scene interpretation can be questionable. The effort described here is focused on the development of machine-based vision methods to support autonomous nuclear fuel manufacturing operations in hot cells. This thesis presents a method to efficiently recognize 3D objects from 2D images based on feature-based indexing. Object recognition is the identification of correspondences between parts of a current scene and stored views of known objects, using chains of segments or indexing vectors. To create indexed object models, characteristic model image features are extracted during preprocessing. Feature vectors representing model object contours are acquired from several points of view around each object and stored. Recognition is the process of matching stored views with features or patterns detected in a test scene. Two sets of algorithms were developed, one for preprocessing and indexed database creation, and one for pattern searching and matching during recognition. At recognition time, those indexing vectors with the highest match probability are retrieved from the model image database, using a nearest neighbor search algorithm. The nearest neighbor search predicts the best possible match candidates. Extended searches are guided by a search strategy that employs knowledge-base (KB) selection criteria. The knowledge-based system simplifies the recognition process and minimizes the number of iterations and memory usage. Novel contributions include the use of a feature-based indexing data structure together with a knowledge base. Both components improve the efficiency of the recognition process by improved structuring of the database of object features and reducing data base size
Intensive research on speech emotion recognition has produced a huge collection of speech emotion features. Large feature sets complicate the speech emotion recognition task. Alongside various feature selection and transformation techniques for one-stage classification, multiple classifier systems have been proposed. The main idea of multiple classifiers is to arrange the emotion classification process in stages. Besides parallel and serial cases, the hierarchical arrangement of multi-stage classification is most widely used for speech emotion recognition. In this paper, we present a sequential-forward-feature-selection-based multi-stage classification scheme. The Sequential Forward Selection (SFS) and Sequential Floating Forward Selection (SFFS) techniques were employed for every stage of the multi-stage classification scheme. Experimental testing of the proposed scheme was performed using the German and Lithuanian emotional speech datasets. Sequential-feature-selection-based multi-stage classification outperformed the single-stage scheme by 12–42% for different emotion sets. The multi-stage scheme has shown higher robustness to the growth of the emotion set. The decrease in recognition rate with the increase in emotion set for the multi-stage scheme was lower by 10–20% in comparison with the single-stage case. Differences between SFS and SFFS for feature selection were negligible.
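The greedy selection step named in the abstract above can be illustrated with a minimal sketch of plain Sequential Forward Selection. This is not the authors' implementation: the function name, the `score` callback, and the toy criterion are assumptions made for illustration, and the floating variant (SFFS), which adds a conditional backward removal step, is not shown.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> List[int]:
    """Greedy SFS: repeatedly add the single feature that most improves the score."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Score each candidate together with the features chosen so far.
        cand_score, cand = max((score(selected + [f]), f) for f in candidates)
        if cand_score <= best_score:
            break  # no remaining feature improves the criterion
        selected.append(cand)
        best_score = cand_score
    return selected


# Hypothetical criterion: features 1 and 3 help, every feature has a small cost.
def toy_score(subset: Sequence[int]) -> float:
    useful = {1: 0.5, 3: 0.3}
    return sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


print(sequential_forward_selection(5, toy_score, max_features=4))
```

In a multi-stage scheme such as the one described, a routine like this would presumably be run once per stage, with `score` replaced by a validation accuracy for that stage's classifier.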
Tao, Dacheng; Li, Xuelong; Wu, Xindong; Maybank, Stephen J
Traditional image representations are not suited to conventional classification methods, such as linear discriminant analysis (LDA), because of the under sample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of two-dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA compared with existing preprocessing methods, e.g., principal component analysis (PCA) and 2DLDA, include 1) the USP is reduced in subsequent classification by, for example, LDA; 2) the discriminative information in the training tensors is preserved; and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm to obtain a solution of GTDA converges, while that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor function based image decompositions for image understanding and object recognition, we develop three different Gabor function based image representations: 1) the GaborD representation is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS and GaborSD representations are applied to the problem of recognizing people from their averaged gait images. A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS or GaborSD image representation, then using GTDA to extract features and finally using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the USF HumanID Database. Experimental comparisons are made with nine
Kawewong, Aram; Tangruamsub, Sirinart; Hasegawa, Osamu
A novel Position-Invariant Robust Feature, designated as PIRF, is presented to address the problem of highly dynamic scene recognition. The PIRF is obtained by identifying existing local features (e.g., SIFT) that have a wide baseline visibility within a place (one place contains more than one sequential image). These wide-baseline visible features are then represented as a single PIRF, which is computed as an average of all descriptors associated with the PIRF. In particular, PIRFs are robust against highly dynamic changes in the scene: a single PIRF can be matched correctly against many features from many dynamic images. This paper also describes an approach to using these features for scene recognition. Recognition proceeds by matching an individual PIRF to a set of features from test images, with subsequent majority voting to identify the place with the most matched PIRFs. The PIRF system is trained and tested on 2000+ outdoor omnidirectional images and on COLD datasets. Despite its simplicity, PIRF offers a markedly better rate of recognition for dynamic outdoor scenes (ca. 90%) than the use of other features. Additionally, a robot navigation system based on PIRF (PIRF-Nav) can outperform other incremental topological mapping methods in terms of time (70% less) and memory. The number of PIRFs can be reduced further to reduce the time while retaining high accuracy, which makes it suitable for long-term recognition and localization.
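The averaging step described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: it assumes the matching of local descriptors across sequential images has already produced "tracks" (one list of descriptors per physical feature), and the names `pirf_from_tracks` and `min_images` are hypothetical.

```python
from typing import List, Sequence

Descriptor = Sequence[float]


def pirf_from_tracks(tracks: List[List[Descriptor]], min_images: int) -> List[List[float]]:
    """Average every descriptor track that is visible in at least `min_images` images."""
    pirfs: List[List[float]] = []
    for track in tracks:
        if len(track) < min_images:
            continue  # not visible over a wide enough baseline; discard
        dim = len(track[0])
        # Element-wise mean of all descriptors belonging to this track.
        pirfs.append([sum(d[i] for d in track) / len(track) for i in range(dim)])
    return pirfs


tracks = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # feature seen in three sequential images
    [[9.0, 9.0]],                          # feature seen in a single image only
]
print(pirf_from_tracks(tracks, min_images=3))  # [[3.0, 4.0]]
```

Real SIFT descriptors would be 128-dimensional; two dimensions are used here only to keep the example readable.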
Nie, Aiqing; Jiang, Jingguo; Fu, Qiao
Previous research has found that conjunction faces (whose internal features, e.g. eyes, nose, and mouth, and external features, e.g. hairstyle and ears, are from separate studied faces) and feature faces (partial features of these are studied) can produce higher false alarms than both old and new faces (i.e. those that are exactly the same as the studied faces and those that have not been previously presented) in recognition. The event-related potentials (ERPs) that relate to conjunction and feature faces at recognition, however, have not been described as yet; in addition, the contributions of different facial features toward ERPs have not been differentiated. To address these issues, the present study compared the ERPs elicited by old faces, conjunction faces (the internal and the external features were from two studied faces), old internal feature faces (whose internal features were studied), and old external feature faces (whose external features were studied) with those of new faces separately. The results showed that old faces not only elicited an early familiarity-related FN400, but a more anterior distributed late old/new effect that reflected recollection. Conjunction faces evoked similar late brain waveforms as old internal feature faces, but not to old external feature faces. These results suggest that, at recognition, old faces hold higher familiarity than compound faces in the profiles of ERPs and internal facial features are more crucial than external ones in triggering the brain waveforms that are characterized as reflecting the result of familiarity.
Sébastien M Crouzet
Research progress in machine vision has been very significant in recent years. Robust face detection and identification algorithms are already readily available to consumers, and modern computer vision algorithms for generic object recognition are now coping with the richness and complexity of natural visual scenes. Unlike early vision models of object recognition that emphasized the role of figure-ground segmentation and spatial information between parts, recent successful approaches are based on the computation of loose collections of image features without prior segmentation or any explicit encoding of spatial relations. While these models remain simplistic models of visual processing, they suggest that, in principle, bottom-up activation of a loose collection of image features could support the rapid recognition of natural object categories and provide an initial coarse visual representation before more complex visual routines and attentional mechanisms take place. Focusing on biologically-plausible computational models of (bottom-up) pre-attentive visual recognition, we review some of the key visual features that have been described in the literature. We discuss the consistency of these feature-based representations with classical theories from visual psychology and test their ability to account for human performance on a rapid object categorization task.
Turati, Chiara; Macchi Cassia, Viola; Simion, Francesca; Leo, Irene
Existing data indicate that newborns are able to recognize individual faces, but little is known about what perceptual cues drive this ability. The current study showed that either the inner or outer features of the face can act as sufficient cues for newborns' face recognition (Experiment 1), but the outer part of the face enjoys an advantage…
Zhao, Yuanshen; Gong, Liang; Huang, Yixiang; Liu, Chengliang
Automatic recognition of mature fruits in a complex agricultural environment is still a challenge for an autonomous harvesting robot due to various disturbances existing in the background of the image. The bottleneck to robust fruit recognition is reducing influence from two main disturbances: illumination and overlapping. In order to recognize the tomato in the tree canopy using a low-cost camera, a robust tomato recognition algorithm based on multiple feature images and image fusion was studied in this paper. Firstly, two novel feature images, the a*-component image and the I-component image, were extracted from the L*a*b* color space and luminance, in-phase, quadrature-phase (YIQ) color space, respectively. Secondly, wavelet transformation was adopted to fuse the two feature images at the pixel level, which combined the feature information of the two source images. Thirdly, in order to segment the target tomato from the background, an adaptive threshold algorithm was used to get the optimal threshold. The final segmentation result was processed by morphology operation to reduce a small amount of noise. In the detection tests, 93% target tomatoes were recognized out of 200 overall samples. It indicates that the proposed tomato recognition method is available for robotic tomato harvesting in the uncontrolled environment with low cost.
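To illustrate why the I component of YIQ can help separate a red tomato from green foliage, here is a small sketch using Python's standard `colorsys` module. It shows only the per-pixel colour conversion; the wavelet fusion, adaptive thresholding, and morphology steps from the abstract are not reproduced, and the sample pixel values are made up for illustration.

```python
import colorsys
from typing import Tuple


def i_component(pixel_rgb: Tuple[int, int, int]) -> float:
    """Return the in-phase (I) component of an 8-bit RGB pixel in YIQ space."""
    r, g, b = (v / 255.0 for v in pixel_rgb)  # colorsys expects values in [0, 1]
    _, i, _ = colorsys.rgb_to_yiq(r, g, b)
    return i


# Hypothetical sample pixels: a reddish fruit pixel and a greenish leaf pixel.
print(i_component((200, 30, 30)))   # positive for reddish colours
print(i_component((30, 160, 40)))   # negative for greenish colours
```

Because the sign of I differs between the two colour groups, a threshold on an I-component image is a plausible basis for segmentation, which is consistent with its use as a feature image in the abstract.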
This paper explores the feature reduction properties of independent component analysis (ICA) in a breast cancer decision support system. The Wisconsin diagnostic breast cancer (WDBC) dataset is reduced to a one-dimensional feature vector by computing an independent component (IC). The original data with 30 features and the reduced single feature (IC) are used to evaluate the diagnostic accuracy of classifiers such as k-nearest neighbor (k-NN), artificial neural network (ANN), radial basis function neural network (RBFNN), and support vector machine (SVM). The comparison of the proposed classification using the IC with the original feature set is also tested with different validation (5/10-fold cross-validation) and partitioning (20%–40%) methods. The classifiers are evaluated on how effectively they categorize tumors as benign or malignant in terms of specificity, sensitivity, accuracy, F-score, Youden's index, discriminant power, and the receiver operating characteristic (ROC) curve with its criterion values, including area under the curve (AUC) and 95% confidence interval (CI). This represents an improvement in diagnostic decision support systems, while reducing computational complexity.
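Several of the evaluation measures listed above follow directly from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the counts in the example are invented for illustration and are not results from the paper. It assumes every denominator is non-zero.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    youden = sensitivity + specificity - 1  # Youden's index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
        "youden": youden,
    }


# Hypothetical counts, with "malignant" treated as the positive class.
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```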
Zhang, Xuesong; Zhuang, Yan; Wang, Wei; Pedrycz, Witold
In this paper, we introduce a new research problem termed online feature transformation learning in the context of multiclass object category recognition. The learning of a feature transformation is viewed as learning a global similarity metric function in an online manner. We first consider the problem of online learning a feature transformation matrix expressed in the original feature space and propose an online passive aggressive feature transformation algorithm. Then these original features are mapped to kernel space and an online single kernel feature transformation (OSKFT) algorithm is developed to learn a nonlinear feature transformation. Based on the OSKFT and the existing Hedge algorithm, a novel online multiple kernel feature transformation algorithm is also proposed, which can further improve the performance of online feature transformation learning in large-scale application. The classifier is trained with k nearest neighbor algorithm together with the learned similarity metric function. Finally, we experimentally examined the effect of setting different parameter values in the proposed algorithms and evaluate the model performance on several multiclass object recognition data sets. The experimental results demonstrate the validity and good performance of our methods on cross-domain and multiclass object recognition application.
Partila, Pavol; Voznak, Miroslav; Tovarek, Jaromir
The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture model is measured considering the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and group of features for stress detection in human speech. The research contribution lies in the design of the speech emotion recognition system due to its accuracy and efficiency.
The most popular features for speaker recognition are Mel frequency cepstral coefficients (MFCCs) and linear prediction cepstral coefficients (LPCCs). These features are used extensively because they characterize the vocal tract configuration, which is known to be highly speaker-dependent. In this work, several features are introduced that can characterize the vocal system in order to complement the traditional features and produce better speaker recognition models. The spectral centroid (SC), spectral bandwidth (SBW), spectral band energy (SBE), spectral crest factor (SCF), spectral flatness measure (SFM), Shannon entropy (SE), and Renyi entropy (RE) were utilized for this purpose. This work demonstrates that these features are robust in noisy conditions by simulating some common distortions that are found in the speakers' environment and a typical telephone channel. Babble noise, additive white Gaussian noise (AWGN), and a bandpass channel with 1 dB of ripple were used to simulate these noisy conditions. The results show significant improvements in classification performance for all noise conditions when these features were used to complement the MFCC and ΔMFCC features. In particular, the SC and SCF improved performance in almost all noise conditions within the examined SNR range (10–40 dB). For example, in cases where there was only one source of distortion, classification improvements of up to 8% and 10% were achieved under babble noise and AWGN, respectively, using the SCF feature.
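Three of the spectral descriptors named above can be sketched from a magnitude spectrum as follows. The formulas are common textbook definitions (centroid as a magnitude-weighted mean frequency, flatness as geometric over arithmetic mean, crest factor as peak over arithmetic mean); the paper may compute them per sub-band or on the power spectrum instead, so treat this as an assumption. All magnitudes must be strictly positive for the flatness measure.

```python
import math
from typing import Sequence


def spectral_centroid(freqs: Sequence[float], mags: Sequence[float]) -> float:
    """Magnitude-weighted mean frequency."""
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)


def spectral_flatness(mags: Sequence[float]) -> float:
    """Geometric mean divided by arithmetic mean (1.0 for a flat spectrum)."""
    n = len(mags)
    geometric = math.exp(sum(math.log(m) for m in mags) / n)
    return geometric / (sum(mags) / n)


def spectral_crest(mags: Sequence[float]) -> float:
    """Peak magnitude divided by arithmetic mean (1.0 for a flat spectrum)."""
    return max(mags) / (sum(mags) / len(mags))


freqs = [100.0, 200.0, 300.0, 400.0]   # hypothetical bin centre frequencies in Hz
flat = [1.0, 1.0, 1.0, 1.0]            # noise-like spectrum
peaky = [0.1, 0.1, 4.0, 0.1]           # tone-like spectrum

print(spectral_centroid(freqs, flat), spectral_flatness(flat), spectral_crest(flat))
print(spectral_centroid(freqs, peaky), spectral_flatness(peaky), spectral_crest(peaky))
```

A tone-like spectrum yields low flatness and a high crest factor, while a noise-like spectrum pushes both toward 1.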
In recent years, fire recognition based on image features has become a hotspot in fire monitoring. However, due to the complexity of the forest environment, the accuracy of forest fire recognition based on image features is low. To address this, this paper proposes a feature extraction algorithm based on the YCrCb color space and K-means clustering. First, the paper prepares and analyzes the color characteristics of a large number of forest fire image samples. Using the K-means clustering algorithm, the forest flame model is obtained by comparing the two commonly used color spaces, and the suspected flame area is discriminated and extracted. The experimental results show that the extraction accuracy of the flame area based on the YCrCb color model is higher than that of the HSI color model, that the method can be applied to forest fire identification in different scenes, and that it is feasible in practice.
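As a rough illustration of the colour-space step, the sketch below converts an 8-bit RGB pixel to Y, Cr, Cb using the full-range ITU-R BT.601 (JPEG-style) equations. The abstract does not say which YCrCb variant was used, so this choice is an assumption, and the K-means clustering stage is not shown. The sample pixel values are invented.

```python
from typing import Tuple


def rgb_to_ycrcb(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Full-range BT.601 conversion of an 8-bit RGB pixel to (Y, Cr, Cb)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return y, cr, cb


# Hypothetical pixels: an orange flame-like pixel and a green canopy pixel.
flame = rgb_to_ycrcb(255, 120, 0)
canopy = rgb_to_ycrcb(40, 120, 50)
print(flame)   # Cr well above 128, Cb well below 128
print(canopy)  # Cr below 128
```

Flame-coloured pixels have a high red-difference chroma (Cr) relative to vegetation, which is one reason chroma channels are a natural input to a clustering step such as K-means.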
Sparse representation based on compressed sensing theory has been widely used in the field of face recognition and has achieved good recognition results. However, face feature extraction based on sparse representation is too simple, and the sparse coefficients are not sparse. In this paper, we improve the classification algorithm based on the fusion of sparse representation and Gabor features, and then improve the Gabor feature algorithm, which overcomes the problem of the large dimension of the feature vector, reduces the computation and storage cost, and enhances the robustness of the algorithm to changes in the environment. The classification efficiency of sparse representation is determined by the collaborative representation; we simplify the sparse constraint based on the L1 norm to a least squares constraint, which makes the sparse coefficients positive and reduces the complexity of the algorithm. Experimental results show that the proposed method is robust to illumination, facial expression, and pose variations in face recognition, and that the recognition rate of the algorithm is improved.
Nasrollahi, Kamal; Moeslund, Thomas B.; Rashidi, Maryam
Developing a reliable, fast, and robust biometric recognition system is still a challenging task. This is because the inputs to these systems can be noisy, occluded, poorly illuminated, rotated, and of very low resolution. This paper proposes a probabilistic classifier using Haar-like features, which mostly have been used for detection, for biometric recognition. The proposed system has been tested for three different biometrics: ear, iris, and hand vein patterns and it is shown that it is robust against most of the mentioned degradations and it outperforms state-of-the-art systems...
This paper proposes an improved face recognition algorithm to identify two face mismatch pairs in cases of incorrect decisions. The primary feature of this method is to deploy the similarity score with respect to Gaussian components between two previously unseen faces. Unlike the conventional classical vector distance measurement, our algorithms also consider the plot of the summation of the similarity index versus face feature vector distance. A mixture of Gaussian models of labeled faces is also widely applicable to different biometric system parameters. Comparative evaluations show that the proposed algorithm outperforms the conventional algorithms by an average accuracy of up to 1.15% and 16.87% when compared with 3x3 Multi-Region Histogram (MRH) direct-bag-of-features and Principal Component Analysis (PCA)-based face recognition systems, respectively. The experimental results show that similarity score consideration is more discriminative for face recognition than feature distance. Experimental results on the Labeled Faces in the Wild (LFW) data set demonstrate that our algorithms are suitable for real applications of probe-to-gallery identification in face recognition systems. Moreover, the proposed method can also be applied to other recognition systems and therefore additionally improves recognition scores.
Oztel, Ismail; Yolcu, Gozde; Oz, Cemil; Kazan, Serap; Bunyak, Filiz
Facial expressions have an important role in interpersonal communications and estimation of emotional states or intentions. Automatic recognition of facial expressions has led to many practical applications and has become one of the important topics in computer vision. We present a facial expression recognition system that relies on geometry-based features extracted from eye and eyebrow regions of the face. The proposed system detects keypoints on frontal face images and forms a feature set using geometric relationships among groups of detected keypoints. The obtained feature set is refined and reduced using the sequential forward selection (SFS) algorithm and fed to a support vector machine classifier to recognize five facial expression classes. The proposed system, iFER (eye-eyebrow only facial expression recognition), is robust to lower face occlusions that may be caused by beards, mustaches, scarves, etc. and lower face motion during speech production. Preliminary experiments on benchmark datasets produced promising results outperforming previous facial expression recognition studies using partial face features, and comparable results to studies using whole face information, only slightly lower by ~2.5% compared to the best whole face facial recognition system while using only ~1/3 of the facial region.
J. Del Rio Vera
This paper presents a new supervised classification approach for automated target recognition (ATR) in SAS images. The recognition procedure starts with a novel segmentation stage based on the Hilbert transform. A number of geometrical features are then extracted and used to classify observed objects against a previously compiled database of target and non-target features. The proposed approach has been tested on a set of 1528 simulated images created by the NURC SIGMAS sonar model, achieving up to 95% classification accuracy.
HU Yue-li; CAO Jia-lin; ZHAO Qian; FENG Xu
Automatic recognition of skin micro-image symptoms is important in skin diagnosis and treatment. Feature selection aims to improve the classification performance for skin micro-image symptoms. This paper proposes a hybrid approach based on the support vector machine (SVM) technique and genetic algorithm (GA) to select an optimum feature subset from the feature group extracted from the skin micro-images. An adaptive GA is introduced for maintaining the convergence rate. With the proposed method, the average cross validation accuracy is increased from 88.25% using all features to 96.92% using only selected features provided by a classifier for classification of 5 classes of skin symptoms. The experimental results are satisfactory.
Yuan, Baohua; Li, Shijin; Li, Ning
The features extracted from deep convolutional neural networks (CNNs) have shown their promise as generic descriptors for land-use scene recognition. However, most of the work directly adopts the deep features for the classification of remote sensing images, and does not encode the deep features for improving their discriminative power, which can affect the performance of deep feature representations. To address this issue, we propose an effective framework, LASC-CNN, obtained by locality-constrained affine subspace coding (LASC) pooling of a CNN filter bank. LASC-CNN obtains more discriminative deep features than directly extracted from CNNs. Furthermore, LASC-CNN builds on the top convolutional layers of CNNs, which can incorporate multiscale information and regions of arbitrary resolution and sizes. Our experiments have been conducted using two widely used remote sensing image databases, and the results show that the proposed method significantly improves the performance when compared to other state-of-the-art methods.
Gadh, Rajit; Lu, Yong; Tautges, Timothy J.
Considerable progress has been made on automatic hexahedral mesh generation in recent years. Several automatic meshing algorithms have proven to be very reliable on certain classes of geometry. While it is always worth pursuing general algorithms viable on more general geometry, a combination of the well-established algorithms is ready to take on classes of complicated geometry. By partitioning the entire geometry into meshable pieces, each matched with an appropriate meshing algorithm, the original geometry becomes meshable and may achieve better mesh quality. Each meshable portion is recognized as a meshing feature. This paper, which is a part of the feature based meshing methodology, presents the work on shape recognition and volume decomposition to automatically decompose a CAD model into meshable volumes. There are four phases in this approach: (1) Feature Determination to extract decomposition features, (2) Cutting Surfaces Generation to form the "tailored" cutting surfaces, (3) Body Decomposition to get the imprinted volumes, and (4) Meshing Algorithm Assignment to match the decomposed volumes with appropriate meshing algorithms. The feature determination procedure is based on the CLoop feature recognition algorithm, which is extended to be more general. Results are demonstrated over several parts with complicated topology and geometry.
To make human-computer interaction (HCI) as good as human-human interaction, an efficient approach for human emotion recognition is required. These emotions could be fused from several modalities such as facial expression, hand gesture, acoustic data, and biophysiological data. In this paper, we address the frame-based perception of the universal human facial expressions (happiness, surprise, anger, disgust, fear, and sadness), with the help of several geometrical features. Unlike many other geometry-based approaches, the frame-based method does not rely on prior knowledge of a person-specific neutral expression; this knowledge is gained through human intervention and is not available in real scenarios. Additionally, we provide a method to investigate the performance of the geometry-based approaches under various facial point localization errors. From an evaluation on two public benchmark datasets, we have found that using eight facial points, we can achieve the state-of-the-art recognition rate. However, this state-of-the-art geometry-based approach exploits features derived from 68 facial points and requires prior knowledge of the person-specific neutral expression. The expression recognition rate using geometrical features is adversely affected by the errors in the facial point localization, especially for the expressions with subtle facial deformations.
Sanpachai, H.; Settapong, M.
Biometrics is a promising technique that is used to identify individual traits and characteristics. Iris recognition is one of the most reliable biometric methods. Iris texture and color are fully developed within a year of birth and remain unchanged throughout a person's life, in contrast to fingerprints, which can be altered by several factors including accidental damage, dry or oily skin, and dust. Although iris recognition has been studied for more than a decade, there are limited commercial products available due to its demanding requirements, such as camera resolution, hardware size, expensive equipment, and computational complexity. However, at the present time, technology has overcome these obstacles. Iris recognition can be done through several sequential steps, which include pre-processing, feature extraction, post-processing, and a matching stage. In this paper, we adopted the directional high-low pass filter for feature extraction. A box-counting fractal dimension and Iris code have been proposed as feature representations. Our approach has been tested on the CASIA Iris Image database and the results are considered successful.
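The box-counting fractal dimension mentioned above can be estimated as the slope of log N(s) against log(1/s), where N(s) is the number of s-by-s boxes that contain at least one foreground pixel. The sketch below is a generic estimator on a binary image, not the authors' pipeline; the function names and the choice of box sizes are assumptions, and the image must contain at least one foreground pixel.

```python
import math
from typing import List, Sequence


def box_count(image: Sequence[Sequence[int]], size: int) -> int:
    """Count size-by-size boxes that contain at least one non-zero pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(
                image[i][j]
                for i in range(r, min(r + size, rows))
                for j in range(c, min(c + size, cols))
            ):
                count += 1
    return count


def box_counting_dimension(image: Sequence[Sequence[int]], sizes: List[int]) -> float:
    """Least-squares slope of log N(s) versus log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


filled = [[1] * 8 for _ in range(8)]                  # solid square, dimension 2
line = [[1] * 8] + [[0] * 8 for _ in range(7)]        # single row, dimension 1
print(box_counting_dimension(filled, [1, 2, 4]))
print(box_counting_dimension(line, [1, 2, 4]))
```

Textured patterns such as a filtered iris image typically fall between these two integer extremes, which is what makes the estimate usable as a compact texture feature.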
Luo, Dan; Gao, Hua; Ekenel, Hazim Kemal; Ohya, Jun
The use of gesture as a natural interface plays a very important role in achieving intelligent Human Computer Interaction (HCI). Human gestures include different components of visual actions, such as motion of the hands, facial expression, and torso, to convey meaning. So far, in the field of gesture recognition, most previous works have focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework, which combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We consider 12 classes of human gestures with facial expressions, including neutral, negative, and positive meanings, from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, an early feature combination can be performed by concatenating and weighting different feature groups, and LDA is used to choose the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from single modalities are fused at a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with the combination techniques. Experimental results showed that facial analysis improves hand gesture recognition and that decision-level fusion performs better than feature-level fusion.
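Decision-level fusion of the kind described above can be sketched as a weighted sum of per-class scores from each modality. This is a generic illustration, not the condensation-based classifier from the paper; the modality names, class labels, scores, and weights are all hypothetical.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]


def fuse_decisions(
    scores_by_modality: Dict[str, Scores], weights: Dict[str, float]
) -> Tuple[str, Scores]:
    """Weighted sum of per-class scores across modalities; returns best class and fused scores."""
    classes = next(iter(scores_by_modality.values())).keys()
    fused = {
        c: sum(weights[m] * scores[c] for m, scores in scores_by_modality.items())
        for c in classes
    }
    return max(fused, key=fused.get), fused


# Hypothetical per-class scores from two single-modality classifiers.
scores = {
    "hand": {"yes": 0.55, "no": 0.45},
    "face": {"yes": 0.20, "no": 0.80},
}
weights = {"hand": 0.6, "face": 0.4}

label, fused = fuse_decisions(scores, weights)
print(label, fused)
```

In this toy case the hand classifier alone would pick "yes", but the confident facial-expression score tips the fused decision to "no", which is the kind of correction that decision-level fusion is meant to allow.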
This paper proposes an algorithm for determining froth image features based on amplitude spectrum energy statistics, applying the Fast Fourier Transform to analyze the energy distribution of froth of various sizes. The proposed algorithm has been used to analyze froth images from an alumina flotation processing site, and the results show that the consistency rate reaches 98.1% and the usability rate 94.2%; with its good robustness and high efficiency, the algorithm is well suited to flotation processing state recognition.
Xi, Xiaoming; Yang, Gongping; Yin, Yilong; Yang, Lu
The finger vein is a promising biometric pattern for personal identification due to its advantages over other existing biometrics. In finger vein recognition, feature extraction is a critical step, and many feature extraction methods have been proposed to extract the gray, texture, or shape of the finger vein. We treat them as low-level features and present a high-level feature extraction framework. Under this framework, base attribute is first defined to represent the characteristics of a certain subcategory of a subject. Then, for an image, the correlation coefficient is used for constructing the high-level feature, which reflects the correlation between this image and all base attributes. Since the high-level feature can reveal characteristics of more subcategories and contain more discriminative information, we call it hyperinformation feature (HIF). Compared with low-level features, which only represent the characteristics of one subcategory, HIF is more powerful and robust. In order to demonstrate the potential of the proposed framework, we provide a case study to extract HIF. We conduct comprehensive experiments to show the generality of the proposed framework and the efficiency of HIF on our databases, respectively. Experimental results show that HIF significantly outperforms the low-level features.
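As a rough illustration of the correlation step, the sketch below builds a high-level vector whose entries are Pearson correlation coefficients between one image's low-level feature vector and each base attribute. How the base attributes are actually derived is not stated in the abstract, so the vectors here are hypothetical placeholders, as are the function names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hyperinformation_feature(low_level, base_attributes):
    """One correlation per base attribute; the result is the high-level vector."""
    return [pearson(low_level, attr) for attr in base_attributes]


# Hypothetical base attributes (one vector per subcategory) and one image feature.
base_attributes = [
    [1.0, 2.0, 3.0, 4.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
]
image_feature = [2.0, 4.0, 6.0, 8.0]
hif = hyperinformation_feature(image_feature, base_attributes)
print(hif)
```

The length of the resulting vector equals the number of base attributes, so it summarises the image's relation to every subcategory rather than to a single one.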
Zhang, Guoliang; Jia, Songmin; Li, Xiuzhi; Zhang, Xiangyin
The majority of human action recognition methods use a multifeature fusion strategy to improve classification performance, but the contribution of different features to a specific action has not received enough attention. We present an extensible and universal weighted score-level feature fusion method using the Dempster-Shafer (DS) evidence theory based on the bag-of-visual-words pipeline. First, the partially distinctive samples in the training set are selected to construct the validation set. Then, local spatiotemporal features and pose features are extracted from these samples to obtain evidence information. The DS evidence theory and the proposed rule of survival of the fittest are employed to achieve evidence combination and calculate optimal weight vectors of every feature type belonging to each action class. Finally, the recognition results are deduced via the weighted summation strategy. The performance of the established recognition framework is evaluated on the Penn Action dataset and a subset of the joint-annotated human motion database (sub-JHMDB). The experimental results demonstrate that the proposed feature fusion method can adequately exploit the complementarity among multiple features and improve upon most of the state-of-the-art algorithms on the Penn Action and sub-JHMDB datasets.
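For readers unfamiliar with evidence combination, here is a minimal sketch of the standard Dempster rule for two mass functions. It does not reproduce the paper's "survival of the fittest" rule or its weight estimation; the action labels and mass values are hypothetical.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts keyed by frozenset hypotheses)."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {h: v / (1.0 - conflict) for h, v in combined.items()}


RUN, JUMP = frozenset({"run"}), frozenset({"jump"})
EITHER = RUN | JUMP  # mass left on the whole frame expresses ignorance

# Hypothetical evidence from a spatiotemporal feature and a pose feature.
m_spatiotemporal = {RUN: 0.6, JUMP: 0.1, EITHER: 0.3}
m_pose = {RUN: 0.5, JUMP: 0.3, EITHER: 0.2}
fused = dempster_combine(m_spatiotemporal, m_pose)
print(fused)
```

Both sources lean towards "run", so the combined belief in "run" ends up higher than either source gave it on its own.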
Li, Baopu; Meng, Max Q-H
Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.
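The uniform local binary pattern mentioned above can be sketched for a single 3x3 neighbourhood as follows. This is the basic grayscale operator only; the colour handling and wavelet integration from the paper are not shown, and the patches are made-up examples.

```python
def lbp_bits(patch):
    """Threshold the 8 neighbours of a 3x3 patch against its centre (clockwise)."""
    c = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return [1 if p >= c else 0 for p in neighbours]


def is_uniform(bits):
    """A pattern is uniform if it has at most two 0/1 transitions around the circle."""
    n = len(bits)
    transitions = sum(bits[i] != bits[(i + 1) % n] for i in range(n))
    return transitions <= 2


edge_patch = [[9, 9, 9], [1, 5, 9], [1, 1, 1]]     # an edge-like structure
checker_patch = [[9, 1, 9], [1, 5, 1], [9, 1, 9]]  # a noisy, alternating structure
brighter = [[v + 50 for v in row] for row in edge_patch]

print(lbp_bits(edge_patch), is_uniform(lbp_bits(edge_patch)))
print(lbp_bits(checker_patch), is_uniform(lbp_bits(checker_patch)))
print(lbp_bits(brighter) == lbp_bits(edge_patch))
```

Because only the sign of each neighbour-minus-centre difference is kept, adding a constant brightness offset leaves the pattern unchanged, which is one way to read the illumination invariance claimed in the abstract.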
In recent years the wavelet transform has been found to be an effective tool for time–frequency analysis. It has been used for feature extraction in speech recognition applications and has proved to be an effective technique for unvoiced phoneme classification. In this paper a new filter structure using admissible wavelet packets is analyzed for English phoneme recognition. These filters have the benefit of frequency band spacing similar to the auditory Equivalent Rectangular Bandwidth (ERB) scale. Central frequencies of the ERB scale are equally distributed along the frequency response of the human cochlea. A new set of features is derived using the multi-resolution capabilities of the wavelet packet transform and is found to be better than conventional features for unvoiced phoneme problems. Some of the noises from the NOISEX-92 database have been used to prepare an artificial noisy database for testing the robustness of the wavelet-based features.
Li, Hong; Wei, Yantao; Li, Luoqing; Chen, C L P
In this paper, a hierarchical feature extraction method is proposed for image recognition. The key idea of the proposed method is to extract an effective feature, called local neural response (LNR), of the input image with nontrivial discrimination and invariance properties by alternating between local coding and maximum pooling operation. The local coding, which is carried out on the locally linear manifold, can extract the salient feature of image patches and leads to a sparse measure matrix on which maximum pooling is carried out. The maximum pooling operation builds the translation invariance into the model. We also show that other invariant properties, such as rotation and scaling, can be induced by the proposed model. In addition, a template selection algorithm is presented to reduce computational complexity and to improve the discrimination ability of the LNR. Experimental results show that our method is robust to local distortion and clutter compared with state-of-the-art algorithms.
Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin
Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPL's Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region-of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aid in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine (SVM) and a neural net (NN) classifier.
Dong, Pei; Li, Jie; Dong, Junyu; Qi, Lin
Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. The challenge for action recognition is to capture and fuse the multi-dimension information in video data. In order to take into account these characteristics simultaneously, we present a novel method that fuses multiple dimensional features, such as chromatic images, depth and optical flow fields. We built our model based on the multi-stream deep convolutional networks with the help of temporal segment networks and extract discriminative spatial and temporal features by fusing ConvNets towers multi-dimension, in which different feature weights are assigned in order to take full advantage of this multi-dimension information. Our architecture is trained and evaluated on the currently largest and most challenging benchmark NTU RGB-D dataset. The experiments demonstrate that the performance of our method outperforms the state-of-the-art methods.
Zubair, A. F.; Abu Mansor, M. S.
Computer Aided Process Planning (CAPP) is the bridge between CAD and CAM, and pre-processing of the CAD data in the CAPP system is essential. For CNC turning parts, conical faces of the part model inevitably have to be recognised alongside cylindrical and planar faces. As the sine and cosine values of the cone radius structure differ between models, face identification in automatic feature recognition of the part model needs special attention. This paper focuses on hole features on conical faces that can be detected by the CAD solid modeller ACIS via the SAT file. Detection algorithms for face topology were generated and compared. The study shows different face setups for similar conical part models with different hole-type features. Three types of holes were compared, and the differences between merged and unmerged faces were studied.
P. C. PANCHARIYA
This paper describes an approach for extraction of features from data generated by an electronic tongue based on large amplitude pulse voltammetry. In this approach, statistical features of the meaningful selected variables from current response signals are extracted and used for recognition of beverage samples. The proposed feature extraction approach reduces not only the computational complexity but also the computation time and the data storage required for developing an E-tongue for field applications. With the reduced information, a probabilistic neural network (PNN) was trained for qualitative analysis of different beverages. Before the qualitative analysis of the beverages, the methodology was tested on the basic artificial taste solutions, i.e. sweet, sour, salt, bitter, and umami. The proposed procedure was compared with the more conventional linear feature extraction technique employing principal component analysis combined with PNN. Using the extracted feature vectors, highly correct classification by PNN was achieved for eight types of juices and six types of soft drinks. The results indicated that the electronic tongue based on large amplitude pulse voltammetry with reduced features was capable of discriminating not only basic artificial taste solutions but also the various sorts of the same type of natural beverages (fruit juices, vegetable juices, soft drinks, etc.).
Tsai, Chung-Chih; Lin, Heng-Yi; Taur, Jinshiuh; Tao, Chin-Wang
In this paper, we propose a novel possibilistic fuzzy matching strategy with invariant properties, which can provide a robust and effective matching scheme for two sets of iris feature points. In addition, the nonlinear normalization model is adopted to provide more accurate position before matching. Moreover, an effective iris segmentation method is proposed to refine the detected inner and outer boundaries to smooth curves. For feature extraction, the Gabor filters are adopted to detect the local feature points from the segmented iris image in the Cartesian coordinate system and to generate a rotation-invariant descriptor for each detected point. After that, the proposed matching algorithm is used to compute a similarity score for two sets of feature points from a pair of iris images. The experimental results show that the performance of our system is better than those of the systems based on the local features and is comparable to those of the typical systems.
P. V. L. Suvarchala
Iris recognition is considered one of the most promising noninvasive biometric systems providing automated human identification. Numerous programs, like the unique ID program in India - Aadhar, include iris biometrics to provide distinctive identity identification to citizens. The active area is usually captured under non-ideal imaging conditions. It usually suffers from poor brightness, low contrast, blur due to relative movement of the camera or subject, and eyelid and eyelash occlusions. Besides the technical challenges, iris recognition has started facing sophisticated threats like spoof attacks. Therefore it is vital that the integrity of such large-scale iris deployments be preserved. This paper presents the development of a new spoof-resistant approach which exploits the statistical dependencies of both general eye and localized iris regions in the textural domain using the spatial gray level dependence matrix (SGLDM) and the gray level run length matrix (GLRLM), and in the transform domain using contourlets. We did experiments on publicly available fake and lens iris image databases. The correct classification rate obtained with the ATVS-FIr iris database is 100%, while it is 95.63% and 88.83% with the IITD spoof iris databases, respectively.
Pisharady, Pramod Kumar; Poh, Loh Ai
This book presents a collection of computational intelligence algorithms that addresses issues in visual pattern recognition such as high computational complexity, abundance of pattern features, sensitivity to size and shape variations and poor performance against complex backgrounds. The book has 3 parts. Part 1 describes various research issues in the field with a survey of the related literature. Part 2 presents computational intelligence based algorithms for feature selection and classification. The algorithms are discriminative and fast. The main application area considered is hand posture recognition. The book also discusses utility of these algorithms in other visual as well as non-visual pattern recognition tasks including face recognition, general object recognition and cancer / tumor classification. Part 3 presents biologically inspired algorithms for feature extraction. The visual cortex model based features discussed have invariance with respect to appearance and size of the hand, and provide good...
Cui, Chen; Asari, Vijayan K.
Biometric features such as fingerprints, iris patterns, and face features help to identify people and restrict access to secure areas by performing advanced pattern analysis and matching. Face recognition is one of the most promising biometric methodologies for human identification in a non-cooperative security environment. However, the recognition results obtained by face recognition systems are affected by several variations that may happen to the patterns in an unrestricted environment. As a result, several algorithms have been developed for extracting different facial features for face recognition. Due to the various possible challenges of data captured at different lighting conditions, viewing angles, facial expressions, and partial occlusions in natural environmental conditions, automatic facial recognition remains a difficult issue that needs to be resolved. In this paper, we propose a novel approach to tackling some of these issues by analyzing the local textural descriptions for facial feature representation. The textural information is extracted by an enhanced local binary pattern (ELBP) description of all the local regions of the face. The relationship of each pixel with respect to its neighborhood is extracted and employed to calculate the new representation. ELBP reconstructs a much better textural feature extraction vector from an original gray level image in different lighting conditions. The dimensionality of the texture image is reduced by principal component analysis performed on each local face region. Each low dimensional vector representing a local region is then weighted based on the significance of the sub-region. The weight of each sub-region is determined by employing the local variance estimate of the respective region, which represents the significance of the region. The final facial textural feature vector is obtained by concatenating the reduced dimensional weight sets of all the modules (sub-regions) of the face image.
Pan, Hong; Olsen, Søren Ingvor; Zhu, Yaping
Feature extraction and learning is critical for object recognition and detection. By embedding context cue of image attributes into the kernel descriptors, we propose a set of novel kernel descriptors called context kernel descriptors (CKD). The motivation of CKD is to use the spatial consistency...... even in high-dimensional space. In addition, the latent connection between Rényi quadratic entropy and the mapping data in kernel feature space further facilitates us to capture the geometric structure as well as the information about the underlying labels of the CKD using CSQMI. Thus the resulting...... codebook and reduced CKD are discriminative. We report superior performance of our algorithm for object recognition on benchmark datasets like Caltech-101 and CIFAR-10, as well as for detection on a challenging chicken feet dataset....
Zhu, Rong; Hu, Xueying; Tang, Jiajun; Hu, Sheng
Although image/video based fire recognition has received growing attention, an efficient and robust fire detection strategy is rarely explored. In this paper, we propose a novel approach to automatically identify the flame or smoke regions in an image. It is composed of three stages: (1) block processing is applied to divide an image into several nonoverlapping image blocks, and these image blocks are identified as suspicious fire regions or not by using two color models and a color histogram-based similarity matching method in the HSV color space; (2) because flame and smoke regions have significant visual characteristics compared to other information, two kinds of image features are extracted for fire recognition, where local features are obtained based on the Scale Invariant Feature Transform (SIFT) descriptor and the Bags of Keypoints (BOK) technique, and texture features are extracted based on the Gray Level Co-occurrence Matrices (GLCM) and the Wavelet-based Analysis (WA) methods; and (3) a manifold learning-based classifier is constructed based on two image manifolds, designed via an improved Globular Neighborhood Locally Linear Embedding (GNLLE) algorithm, and the extracted hybrid features are used as input feature vectors to train the classifier, which decides between fire and non-fire images. Experiments and comparative analyses with four approaches are conducted on the collected image sets. The results show that the proposed approach is superior to the others in detecting fire, achieving a high recognition accuracy and a low error rate.
Zhang, Chongsheng; Masseglia, Florent; Zhang, Xiangliang
For many textual collections, the number of features is often overly large. These features can be very redundant, it is therefore desirable to have a small, succinct, yet highly informative collection of features that describes the key characteristics of a dataset. Information theory is one such tool for us to obtain this feature collection. With this paper, we mainly contribute to the improvement of efficiency for the process of selecting the most informative feature set over high-dimensional unlabeled data. We propose a heuristic theory for informative feature set selection from high dimensional data. Moreover, we design data structures that enable us to compute the entropies of the candidate feature sets efficiently. We also develop a simple pruning strategy that eliminates the hopeless candidates at each forward selection step. We test our method through experiments on real-world data sets, showing that our proposal is very efficient. © 2012 IEEE.
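To make the idea of entropy-driven forward selection concrete, the sketch below greedily adds the feature that most increases the joint entropy of the selected set on unlabeled data. It is a plain baseline under that assumption; the paper's heuristic, its data structures, and its pruning strategy are not reproduced, and the tiny data set is hypothetical.

```python
import math
from collections import Counter


def joint_entropy(rows, features):
    """Shannon entropy (bits) of the joint distribution of the chosen columns."""
    counts = Counter(tuple(row[f] for f in features) for row in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def forward_select(rows, n_features, k):
    """Greedy forward selection of k features by maximum joint entropy."""
    selected = []
    while len(selected) < k:
        candidates = [f for f in range(n_features) if f not in selected]
        best = max(candidates, key=lambda f: joint_entropy(rows, selected + [f]))
        selected.append(best)
    return selected


# Hypothetical binary term-presence data: column 1 duplicates column 0.
rows = [
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]
selected = forward_select(rows, n_features=3, k=2)
print(selected)
```

The redundant column adds no entropy once its twin is selected, so the second pick is the independent column, which is the behaviour a succinct yet informative feature set calls for.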
Wu, Jianwei; Wang, Zongyue
Vehicle license plate (VLP) recognition is of great importance to many traffic applications. Though researchers have paid much attention to VLP recognition, for many reasons there is not yet a fully operational VLP recognition system. This paper discusses a valid and practical method for vehicle license plate recognition based on geometry restraints and multi-feature decision including statistical and structural features. In general, VLP recognition includes the following steps: location of the VLP, character segmentation, and character recognition. This paper discusses the three steps in detail. The characters of a VLP are often inclined for a variety of reasons, which makes them more difficult to recognize; therefore geometry restraints, such as the general ratio of length to width and the perpendicularity of adjacent edges, are used for incline correction. Image moments have been proved to be invariant to translation, rotation and scaling, so they are used as one feature for character recognition. The stroke is the basic element of writing, and hence taking it as a feature is helpful to character recognition. Finally, we take the image moment, the strokes and the number of each stroke for each character image, together with some other structural and statistical features, as the multi-feature to match each character image with sample character images, so that each character image can be recognized by a BP neural net. The proposed method combines statistical and structural features for VLP recognition, and the result shows its validity and efficiency.
El Moubtahij Hicham
This paper presents an analytical approach to an offline handwritten Arabic text recognition system. It is based on the Hidden Markov Model Toolkit (HTK) without explicit segmentation. The first phase is preprocessing, where the data is introduced into the system after quality enhancements. Then, a set of characteristics (features of local densities and feature statistics) is extracted using the sliding-window technique. Subsequently, the resulting feature vectors are fed to the Hidden Markov Model Toolkit (HTK). The simple database "Arabic-Numbers" and IFN/ENIT are used to evaluate the performance of this system. Keywords: Hidden Markov Models (HMM) Toolkit (HTK), Sliding windows
The features proposed in this paper are derived from minutiae quadruplets and are applicable in matching and indexing fingerprint images. In this work nineteen different possibilities of features were explored for indexing and the performances of some of the feature sets were mixed: some giving good performances on ...
Humans are able to recognize a small number of people they know well by the way they walk. This ability is the basic motivation for using human gait as a means of biometric identification. Such biometrics can be captured in public places from a distance without the subject's collaboration, awareness, or even consent. Although current approaches give encouraging results, we are still far from effective use in real-life applications. In general, methods set various constraints to circumvent the influence of covariate factors, such as changes of walking speed, view, clothing, footwear, and object carrying, that have a negative impact on recognition performance. In this paper we propose a skeleton-model-based gait recognition system focusing on modelling gait dynamics and eliminating the influence of the subject's appearance on recognition. Furthermore, we tackle the problem of walking speed variation and propose a space transformation and feature fusion that mitigate its influence on recognition performance. With an evaluation on the OU-ISIR gait dataset, we demonstrate state-of-the-art performance of the proposed methods.
Alexander Mikhailovich Alyushin
Algorithms for recognizing and processing dynamic spectrogram images and the sound speech signature (SS) were developed. Software for mobile phones that can recognize speech signatures was prepared. The dependence of SS recognition speed on boundary type was investigated. Recommendations were given on choosing the boundary type for an optimal ratio of recognition speed to required space.
Kent, Christopher; Lamberts, Koen; Patton, Richard
Previous studies on how people set and modify decision criteria in old-new recognition tasks (in which they have to decide whether or not a stimulus was seen in a study phase) have almost exclusively focused on properties of the study items, such as presentation frequency or study list length. In contrast, in the three studies reported here, we manipulated the quality of the test cues in a scene-recognition task, either by degrading through Gaussian blurring (Experiment 1) or by limiting presentation duration (Experiment 2 and 3). In Experiments 1 and 2, degradation of the test cue led to worse old-new discrimination. Most importantly, however, participants were more liberal in their responses to degraded cues (i.e., more likely to call the cue "old"), demonstrating strong within-list, item-by-item, criterion shifts. This liberal response bias toward degraded stimuli came at the cost of increasing the false alarm rate while maintaining a constant hit rate. Experiment 3 replicated Experiment 2 with additional stimulus types (words and faces) but did not provide accuracy feedback to participants. The criterion shifts in Experiment 3 were smaller in magnitude than Experiments 1 and 2 and varied in consistency across stimulus type, suggesting, in line with previous studies, that feedback is important for participants to shift their criteria.
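The pattern reported above (a constant hit rate with more false alarms for degraded cues) can be expressed with the standard equal-variance signal detection measures, sensitivity d' and criterion c. The abstract does not say which measures the authors computed, so this is an assumption, and the hit and false alarm rates below are hypothetical.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)  # negative values mean a liberal bias
    return d_prime, criterion


# Hypothetical rates: same hit rate, more false alarms for degraded test cues.
clear = sdt_measures(0.80, 0.20)
degraded = sdt_measures(0.80, 0.40)
print(clear, degraded)
```

With these numbers the degraded condition shows both lower d' (worse old-new discrimination) and a more negative c (a more liberal criterion), matching the direction of the effects described in Experiments 1 and 2.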
Pohl, Rüdiger F; Michalkiewicz, Martha; Erdfelder, Edgar; Hilbig, Benjamin E
According to the recognition-heuristic theory, decision makers solve paired comparisons in which one object is recognized and the other not by recognition alone, inferring that recognized objects have higher criterion values than unrecognized ones. However, success-and thus usefulness-of this heuristic depends on the validity of recognition as a cue, and adaptive decision making, in turn, requires that decision makers are sensitive to it. To this end, decision makers could base their evaluation of the recognition validity either on the selected set of objects (the set's recognition validity), or on the underlying domain from which the objects were drawn (the domain's recognition validity). In two experiments, we manipulated the recognition validity both in the selected set of objects and between domains from which the sets were drawn. The results clearly show that use of the recognition heuristic depends on the domain's recognition validity, not on the set's recognition validity. In other words, participants treat all sets as roughly representative of the underlying domain and adjust their decision strategy adaptively (only) with respect to the more general environment rather than the specific items they are faced with.
Wang, Xiaohua; Xia, Chen; Hu, Min; Ren, Fuji
Facial expression recognition under partial occlusion is a challenging research problem. This paper proposes a novel framework for facial expression recognition under occlusion by fusing global and local features. For the global aspect, information entropy is first employed to locate the occluded region. Second, Principal Component Analysis (PCA) is adopted to reconstruct the occluded region of the image. After that, a replacement strategy is applied to reconstruct the image by replacing the occluded region with the corresponding region of the best-matched image in the training set, and the Pyramid Weber Local Descriptor (PWLD) feature is then extracted. Finally, the outputs of the SVM are fitted to the probabilities of the target class using a sigmoid function. For the local aspect, an overlapping block-based method is adopted to extract WLD features, and each block is weighted adaptively by information entropy; Chi-square distance and similar block summation methods are then applied to obtain the probabilities of each emotion class. Finally, decision-level fusion of the global and local features is performed based on the Dempster-Shafer theory of evidence. Experimental results on the Cohn-Kanade and JAFFE databases demonstrate the effectiveness and fault tolerance of this method.
As a typical deep-learning model, Convolutional Neural Networks (CNNs) can be exploited to automatically extract features from images using the hierarchical structure inspired by mammalian visual system. For image classification tasks, traditional CNN models employ the softmax function for classification. However, owing to the limited capacity of the softmax function, there are some shortcomings of traditional CNN models in image classification. To deal with this problem, a new method combining Biomimetic Pattern Recognition (BPR) with CNNs is proposed for image classification. BPR performs class recognition by a union of geometrical cover sets in a high-dimensional feature space and therefore can overcome some disadvantages of traditional pattern recognition. The proposed method is evaluated on three famous image classification benchmarks, that is, MNIST, AR, and CIFAR-10. The classification accuracies of the proposed method for the three datasets are 99.01%, 98.40%, and 87.11%, respectively, which are much higher in comparison with the other four methods in most cases. PMID:28316614
the employment of other algorithms and commands so as to better present and demonstrate the obtained results. Edge detection and enhancing images for use in an iris recognition system allow for efficient recognition and extraction of iris patterns. REFERENCES... Gonzalez, R.C. and Woods, R.E. 2002. Digital Image Processing 2nd Edition, Instructor's manual. Englewood Cliffs, Prentice Hall, pp 17-36. Proença, H. and Alexandre, L.A. 2007. Toward Noncooperative Iris Recognition: A classification approach using...
Feature fusion from separate sources is a current technical difficulty in cross-corpus speech emotion recognition. The purpose of this paper is to use Deep Belief Nets (DBN) from Deep Learning to treat the emotional information hidden in the speech spectrogram as image features and then fuse them with traditional emotion features. First, based on spectrogram analysis with the STB/Itti model, new spectrogram features are extracted from the color, the brightness, and the orientation, respectively; then two alternative DBN models are used to fuse the traditional and the spectrogram features, which increases the scale of the feature subset and its ability to characterize emotion. In experiments on the ABC database and Chinese corpora, the new feature subset improves the cross-corpus recognition result by 8.8% compared with traditional speech emotion features. The proposed method provides a new idea for feature fusion in emotion recognition.
Spreeuwers, Lieuwe Jan
Biometrics - recognition of persons based on how they look or behave, is the main subject of research at the Chair of Biometric Pattern Recognition (BPR) of the Services, Cyber Security and Safety Group (SCS) of the EEMCS Faculty at the University of Twente. Examples are finger print recognition,
Heba Hamdy Ali
Depth map-based Human Activity Recognition is the process of associating depth sequences with a particular activity. Applications of this problem provide robust solutions in domains such as surveillance systems, computer vision applications, and video retrieval systems. The task is challenging because of variations within one class, the need to distinguish between activities of different classes, and differences in video recording settings. In this study, we present a detailed review of current advances in depth map-based image representations and the feature extraction process. Moreover, we discuss the state-of-the-art datasets and the subsequent classification procedure. A comparative study of some of the more popular depth-map approaches is also provided in greater detail. The proposed methods are evaluated on three depth-based datasets: “MSR Action 3D”, “MSR Hand Gesture”, and “MSR Daily Activity 3D”. Experimental results achieved 100%, 95.83%, and 96.55%, respectively. Combining depth and color features on the “RGBD-HuDaAct” dataset achieved 89.1%. Keywords: Activity recognition, Depth, Feature extraction, Video, Human body detection, Hand gesture
Ren, Bo; Liu, Deyin; Qi, Lin
To address the shortcomings of recent algorithms based on the Two-Dimensional Fractional Fourier Transform (2D-FrFT), this paper proposes a multiple-order feature based method for emotion recognition. Most existing methods use the features of a single order or a couple of orders of the 2D-FrFT. However, different orders of the 2D-FrFT contribute differently to feature extraction for emotion recognition, and combining these features can enhance the performance of an emotion recognition system. The proposed approach obtains numerous features extracted at different orders of the 2D-FrFT in the x-axis and y-axis directions, and uses their statistical magnitudes as the final feature vectors for recognition. A Support Vector Machine (SVM) is used for classification, and the RML Emotion database and the Cohn-Kanade (CK) database are used for the experiments. The experimental results demonstrate the effectiveness of the proposed method.
representations. For representing objects, we derive global descriptors encoding shape using viewpoint-invariant features obtained from multiple sensors observing the scene. Objects are also described using color independently. This allows for combining color and shape when it is required for the task. For more...... robust color description, color calibration is performed. The framework was used in three recognition tasks: object instance recognition, object category recognition, and object spatial relationship recognition. For the object instance recognition task, we present a system that utilizes color and scale...
Doni, Andrea; Musso, Tiziana; Morone, Diego; Bastone, Antonio; Zambelli, Vanessa; Sironi, Marina; Castagnoli, Carlotta; Cambieri, Irene; Stravalaci, Matteo; Pasqualini, Fabio; Laface, Ilaria; Valentino, Sonia; Tartari, Silvia; Ponzetta, Andrea; Maina, Virginia; Barbieri, Silvia S.; Tremoli, Elena; Catapano, Alberico L.; Norata, Giuseppe D.; Bottazzi, Barbara; Garlanda, Cecilia
Pentraxin 3 (PTX3) is a fluid-phase pattern recognition molecule and a key component of the humoral arm of innate immunity. In four different models of tissue damage in mice, PTX3 deficiency was associated with increased fibrin deposition and persistence, and thicker clots, followed by increased collagen deposition, when compared with controls. Ptx3-deficient macrophages showed defective pericellular fibrinolysis in vitro. PTX3 bound fibrinogen/fibrin and plasminogen at acidic pH and increased plasmin-mediated fibrinolysis. The second exon-encoded N-terminal domain of PTX3 recapitulated the activity of the intact molecule. Thus, a prototypic component of humoral innate immunity, PTX3, plays a nonredundant role in the orchestration of tissue repair and remodeling. Tissue acidification resulting from metabolic adaptation during tissue repair sets PTX3 in a tissue remodeling and repair mode, suggesting that matrix and microbial recognition are common, ancestral features of the humoral arm of innate immunity. PMID:25964372
Sharma, O.; Anton, François; Mioc, Darka
Polygon features are of interest in many GEOProcessing applications like shoreline mapping, boundary delineation, change detection, etc. This paper presents a unique new GPU-based methodology to automate feature extraction combining level sets, or mean shift based segmentation together with Voron...
The objective of feature selection is to find the most relevant features for classification. Doing so reduces the dimensionality of the data and may improve classification accuracy. This paper proposes a minimum set of relevant questions that can be used for early detection of dyslexia. In this research, we ...
Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz
Local features for any pattern recognition system are based on information extracted locally. In this paper, a local feature extraction technique was developed. The feature is extracted in the time-frequency plane by taking the moving average along the diagonal directions of that plane. It captures time-frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we refer to this technique as the voice print-based local feature. The proposed feature was compared to other features, including mel-frequency cepstral coefficients (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database consisting of two short sentences uttered by 182 speakers. On the LDC subset, the proposed feature attained a 98.35% recognition rate compared to 96.7% for MFCC.
German Ignacio Parisi
The visual recognition of complex, articulated human movements is fundamental for a wide range of artificial systems oriented towards human-robot communication, action classification, and action-driven perception. These challenging tasks generally involve processing a huge amount of visual information and learning-based mechanisms for generalizing a set of training actions and classifying new samples. To operate in natural environments, a crucial property is the efficient and robust recognition of actions, also under noisy conditions caused by, for instance, systematic sensor errors and temporarily occluded persons. Studies of the mammalian visual system and its superior ability to process biological motion information suggest separate neural pathways for the distinct processing of pose and motion features at multiple levels and the subsequent integration of these visual cues for action perception. We present a neurobiologically-motivated approach to achieve noise-tolerant action recognition in real time. Our model consists of self-organizing Growing When Required (GWR) networks that obtain progressively generalized representations of sensory inputs and learn inherent spatiotemporal dependencies. During training, the GWR networks dynamically change their topological structure to better match the input space. We first extract pose and motion features from video sequences and then cluster actions in terms of prototypical pose-motion trajectories. Multi-cue trajectories from matching action frames are subsequently combined to provide action dynamics in the joint feature space. Reported experiments show that our approach outperforms previous results on a dataset of full-body actions captured with a depth sensor, and ranks among the best 21 results for a public benchmark of domestic daily actions.
Li, Huibin; Ding, Huaxiong; Huang, Di; Wang, Yunhong; Zhao, Xi; Morvan, Jean-Marie; Chen, Liming
We present a fully automatic multimodal 2D + 3D feature-based facial expression recognition approach and demonstrate its performance on the BU-3DFE database. Our approach combines multi-order gradient-based local texture and shape descriptors in order to achieve efficiency and robustness. First, a large set of fiducial facial landmarks of 2D face images along with their 3D face scans are localized using a novel algorithm namely incremental Parallel Cascade of Linear Regression (iPar-CLR). Then, a novel Histogram of Second Order Gradients (HSOG) based local image descriptor in conjunction with the widely used first-order gradient based SIFT descriptor are used to describe the local texture around each 2D landmark. Similarly, the local geometry around each 3D landmark is described by two novel local shape descriptors constructed using the first-order and the second-order surface differential geometry quantities, i.e., Histogram of mesh Gradients (meshHOG) and Histogram of mesh Shape index (curvature quantization, meshHOS). Finally, the Support Vector Machine (SVM) based recognition results of all 2D and 3D descriptors are fused at both feature-level and score-level to further improve the accuracy. Comprehensive experimental results demonstrate that there exist impressive complementary characteristics between the 2D and 3D descriptors. We use the BU-3DFE benchmark to compare our approach to the state-of-the-art ones. Our multimodal feature-based approach outperforms the others by achieving an average recognition accuracy of 86.32%. Moreover, a good generalization ability is shown on the Bosphorus database.
Xie, Zhihua; Zhang, Shuai; Liu, Guodong; Xiong, Jinquan
Visible-light face recognition systems, being vulnerable to illumination, expression, and pose, cannot achieve robust performance in unconstrained situations. Near-infrared face images, being light-independent, can avoid or limit the drawbacks of face recognition in visible light, but their main challenges are low resolution and low signal-to-noise ratio (SNR). Therefore, near-infrared and visible fusion face recognition has become an important direction in unconstrained face recognition research. To extract the discriminative complementary features between near-infrared and visible images, this paper proposes a novel near-infrared and visible face fusion recognition algorithm based on DCT and LBP features. First, the effective features in the near-infrared face image are extracted from the low-frequency part of the DCT coefficients and the partition histograms of the LBP operator. Second, the LBP features of the visible-light face image are extracted to compensate for the detail features lacking in the near-infrared face image. Then, the LBP features of the visible-light face image and the DCT and LBP features of the near-infrared face image are each sent to a classifier for labeling. Finally, a decision-level fusion strategy is used to obtain the final recognition result. The method is tested on the HITSZ Lab2 visible and near-infrared face database. The experimental results show that the proposed method extracts the complementary features of near-infrared and visible face images and improves the robustness of unconstrained face recognition. In particular, with small training samples, the recognition rate of the proposed method reaches 96.13%, a significant improvement over the 92.75% of the method based on statistical feature fusion.
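As a rough illustration of the LBP features mentioned in the abstract above, the following sketch computes the basic 3×3 local binary pattern code and a 256-bin histogram over an image. It is a minimal sketch, not the authors' implementation: the neighbour ordering, the `>=` threshold convention, and the function names `lbp_code` and `lbp_histogram` are assumptions, and the block-wise partitioning of the histogram is omitted.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the interior pixel at (r, c).

    img is a list of equal-length rows of grey values (hypothetical layout).
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

A partition histogram in the sense of the abstract would presumably apply `lbp_histogram` to each image block and concatenate the results.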
The current era of standards and accountability in U.S. public schooling narrows recognition and assessment to an almost exclusive focus on the production of test scores as legitimate markers of student achievement. This climate prevents rather than encourages democratic forms of exchange within and across social worlds. Via a case study of one…
Nguyen, Phuong Giang; Andersen, Hans Jørgen
The vast growth of image databases creates many challenges for computer vision applications, for instance image retrieval and object recognition. Large variation in imaging conditions such as illumination and geometrical properties (including scale, rotation, and viewpoint) gives rise to the need...
Robust and fast traffic sign recognition is very important but difficult for safe driving assistance systems. This study addresses fast and robust traffic sign recognition to enhance driving safety. The proposed method includes three stages. First, a typical Hough transformation is adopted for coarse-grained location of the candidate regions of traffic signs. Second, a RIBP (Rotation Invariant Binary Pattern)-based feature in the affine and Gaussian space is proposed to reduce the time of traffic sign detection and achieve detection that is robust to scale, rotation, and illumination. Third, ANN (Artificial Neural Network)-based feature dimension reduction and classification techniques are designed to reduce the traffic sign recognition time. Experimental results on public datasets show that, compared with current work, this work achieves robust traffic sign recognition with comparable recognition accuracy and faster processing speed, including training speed and recognition speed.
Alm, Rebekka; Waltemath, Dagmar; Wolfien, Markus; Wolkenhauer, Olaf; Henkel, Ron
Model repositories such as BioModels Database provide computational models of biological systems for the scientific community. These models contain rich semantic annotations that link model entities to concepts in well-established bio-ontologies such as Gene Ontology. Consequently, thematically similar models are likely to share similar annotations. Based on this assumption, we argue that semantic annotations are a suitable tool to characterize sets of models. These characteristics improve model classification, allow the identification of additional features for model retrieval tasks, and enable the comparison of sets of models. In this paper we discuss four methods for annotation-based feature extraction from model sets. We tested all methods on sets of models in SBML format which were composed from BioModels Database. To characterize each of these sets, we analyzed and extracted concepts from three frequently used ontologies, namely Gene Ontology, ChEBI and SBO. We find that three of the four methods are suitable to determine characteristic features for arbitrary sets of models: the selected features vary depending on the underlying model set, and they are also specific to the chosen model set. We show that the identified features map onto concepts that are higher up in the hierarchy of the ontologies than the concepts used for model annotations. Our analysis also reveals that the information content of concepts in ontologies and their usage for model annotation do not correlate. Annotation-based feature extraction enables the comparison of model sets, as opposed to existing methods for model-to-keyword comparison, or model-to-model comparison.
Mala, S; Latha, K
Activity recognition is needed in different applications, for example, reconnaissance systems, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. To select a subset of features, Differential Evolution (DE), an efficient evolutionary optimizer, is used to find informative features from eye movements recorded with electrooculography (EOG). Many researchers use EOG signals in human-computer interaction, with various computational intelligence methods to analyze eye movements. The proposed system analyzes EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates on the DE-based feature selection algorithm in order to improve classification for accurate activity recognition.
Karadogan, Seliz; Larsen, Jan
The recognition of affect in speech has attracted a lot of interest recently; especially in the area of cognitive and computer sciences. Most of the previous studies focused on the recognition of basic emotions (such as happiness, sadness and anger) using categorical approach. Recently, the focus...... has been shifting towards dimensional affect recognition based on the idea that emotional states are not independent from one another but related in a systematic manner. In this paper, we design a continuous dimensional speech affect recognition model that combines acoustic and semantic features. We...... show that combining semantic and acoustic information for dimensional speech recognition improves the results. Moreover, we show that valence is better estimated using semantic features while arousal is better estimated using acoustic features....
Fang, Hongqing; He, Lei; Si, Hao; Liu, Peng; Xie, Xiaolei
In this paper, the back-propagation (BP) algorithm is used to train a feed-forward neural network for human activity recognition in smart home environments, and an inter-class distance method for feature selection of observed motion sensor events is discussed and tested. The human activity recognition performance of the neural network trained with BP is then evaluated and compared with two probabilistic algorithms: the Naïve Bayes (NB) classifier and the Hidden Markov Model (HMM). The results show that different feature datasets yield different activity recognition accuracy. Selecting unsuitable feature datasets increases the computational complexity and degrades the activity recognition accuracy. Furthermore, the neural network trained with BP has relatively better human activity recognition performance than the NB classifier and HMM. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
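The abstract above does not define its inter-class distance criterion, so the following is only one plausible reading, offered as a sketch: each feature is scored by the summed absolute difference between its per-class means, and features with larger scores are considered more discriminative. The function name, the data layout, and the scoring formula itself are assumptions.

```python
from itertools import combinations
from statistics import mean


def inter_class_distance_scores(samples, labels):
    """Score each feature by the summed distance between its class means.

    samples: list of equal-length feature vectors (hypothetical layout).
    labels:  class label for each sample.
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    scores = []
    for f in range(len(samples[0])):
        # Mean of feature f within each class.
        class_means = [mean(x[f] for x in xs) for xs in by_class.values()]
        # Sum of pairwise distances between class means (assumed criterion).
        scores.append(sum(abs(a - b) for a, b in combinations(class_means, 2)))
    return scores
```

A feature whose class means coincide scores zero and would be a candidate for removal; a practical criterion would likely also normalize by within-class spread.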
Liang, Wei; Zhang, Laibin; Mingda, Wang; Jinqiu, Hu [College of Mechanical and Transportation Engineering, China University of Petroleum, Beijing, (China)
The negative pressure wave method is one of the techniques used to detect leaks in oil pipelines. Developing new leakage recognition techniques is difficult because it is practically impossible to collect leakage pressure samples. The method of leakage feature extraction and the selection of the recognition model are also important in pipeline leakage detection. This study investigated a new feature extraction approach, Singular Value Projection (SVP), which projects the singular values onto a standard basis. A new pipeline recognition model based on multi-class Support Vector Machines (SVM) was also developed. SVP was found to be a clear and concise recognition feature of the negative pressure wave. Field experiments showed that the model provided a high recognition accuracy rate. This approach to pipeline leakage detection based on SVP and SVM has high application value.
Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie
Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.
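The segment-then-extract step that the abstract describes as typical of HAR systems can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the window length, step, the three statistics chosen, and the function name `window_features` are all hypothetical, and the study's seventy-six features are not reproduced.

```python
from statistics import mean, pstdev


def window_features(signal, window, step):
    """Slide a fixed-length window over a 1-D signal and summarize each segment.

    Returns one (mean, standard deviation, range) tuple per full window.
    Trailing samples that do not fill a window are ignored.
    """
    features = []
    for start in range(0, len(signal) - window + 1, step):
        segment = signal[start:start + window]
        features.append((mean(segment),
                         pstdev(segment),
                         max(segment) - min(segment)))
    return features
```

Each tuple would then form one row of the feature matrix passed to a feature selection method and a classifier.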
All PROVE-IT(FRiV) project reports are listed below. 1. E. Granger, P. Radtke, and D. Gorodnichy, “Survey of academic research and prototypes for face...recognition in video”, Border Technology Division, Division Report 2014-25 (TR). 2. D. Gorodnichy, E. Granger, and P. Radtke, “Survey of commercial...Gorodnichy, E. Choy, W. Khreich, P. Radtke, J. Bergeron, and D. Bissessar, “Results from evaluation of three commercial off-the-shelf face
Zuo, F.; With, de P.H.N.; Ebrahimi, T.; Sikora, T.
In a home environment, video surveillance employing face detection and recognition is attractive for new applications. Facial feature (e.g. eyes and mouth) localization in the face is an essential task for face recognition because it constitutes an indispensable step for face geometry normalization.
Bapst, Aleksander B.; Tran, Jonathan; Koch, Mark W.; Moya, Mary M.; Swahn, Robert
Fast, accurate and robust automatic target recognition (ATR) in optical aerial imagery can provide game-changing advantages to military commanders and personnel. ATR algorithms must reject non-targets with a high degree of confidence in a world with an infinite number of possible input images. Furthermore, they must learn to recognize new targets without requiring massive data collections. Whereas most machine learning algorithms classify data in a closed set manner by mapping inputs to a fixed set of training classes, open set recognizers incorporate constraints that allow for inputs to be labelled as unknown. We have adapted two template-based open set recognizers to use computer generated synthetic images of military aircraft as training data, to provide a baseline for military-grade ATR: (1) a frequentist approach based on probabilistic fusion of extracted image features, and (2) an open set extension to the one-class support vector machine (SVM). These algorithms both use histograms of oriented gradients (HOG) as features as well as artificial augmentation of both real and synthetic image chips to take advantage of minimal training data. Our results show that open set recognizers trained with synthetic data and tested with real data can successfully discriminate real target inputs from non-targets. However, there is still a requirement for some knowledge of the real target in order to calibrate the relationship between synthetic template and target score distributions. We conclude by proposing algorithm modifications that may improve the ability of synthetic data to represent real data.
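The closed-set versus open-set distinction drawn in the abstract above can be illustrated with a very small sketch: a nearest-template classifier that returns "unknown" when no template is close enough. This is neither of the two recognizers the authors adapted; the distance threshold rule, the function name, and the feature layout are assumptions made for illustration.

```python
import math


def classify_open_set(features, templates, max_distance):
    """Return the label of the nearest template, or "unknown" if it is too far.

    templates:    dict mapping a class label to a template feature vector.
    max_distance: rejection threshold (hypothetical; would need calibration).
    """
    best_label, best_dist = None, math.inf
    for label, template in templates.items():
        d = math.dist(features, template)  # Euclidean distance, Python 3.8+
        if d < best_dist:
            best_label, best_dist = label, d
    # A closed-set classifier would always return best_label here.
    return best_label if best_dist <= max_distance else "unknown"
```

The threshold plays the role that score calibration plays in the abstract: it has to be set with some knowledge of how real target scores are distributed.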
Takacs, Gabriel; Chandrasekhar, Vijay; Tsai, Sam; Chen, David; Grzeszczuk, Radek; Girod, Bernd
We present an end-to-end feature description pipeline which uses a novel interest point detector and Rotation-Invariant Fast Feature (RIFF) descriptors. The proposed RIFF algorithm is 15× faster than SURF while producing large-scale retrieval results that are comparable to SIFT. Such high-speed features benefit a range of applications from Mobile Augmented Reality (MAR) to web-scale image retrieval and analysis.
Kim, Jonghwa; André, Elisabeth
This paper investigates the potential of physiological signals as a reliable channel for automatic recognition of a user's emotional state. In emotion recognition, little attention has been paid so far to physiological signals compared with audio-visual emotion channels such as facial expression or speech. All essential stages of an automatic recognition system using biosignals are discussed, from recording a physiological dataset to feature-based multiclass classification. Four-channel biosensors are used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to find the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is demonstrated by emotion recognition results.
Cleary, Anne M; Ryals, Anthony J; Wagner, Samantha R
Research suggests that a feature-matching process underlies cue familiarity-detection when cued recall with graphemic cues fails. When a test cue (e.g., potchbork) overlaps in graphemic features with multiple unrecalled studied items (e.g., patchwork, pitchfork, pocketbook, pullcork), higher cue familiarity ratings are given during recall failure of all of the targets than when the cue overlaps in graphemic features with only one studied target and that target fails to be recalled (e.g., patchwork). The present study used semantic feature production norms (McRae et al., Behavior Research Methods, Instruments, & Computers, 37, 547-559, 2005) to examine whether the same holds true when the cues are semantic in nature (e.g., jaguar is used to cue cheetah). Indeed, test cues (e.g., cedar) that overlapped in semantic features (e.g., a_tree, has_bark, etc.) with four unretrieved studied items (e.g., birch, oak, pine, willow) received higher cue familiarity ratings during recall failure than test cues that overlapped in semantic features with only two (also unretrieved) studied items (e.g., birch, oak), which in turn received higher familiarity ratings during recall failure than cues that did not overlap in semantic features with any studied items. These findings suggest that the feature-matching theory of recognition during recall failure can accommodate recognition of semantic cues during recall failure, providing a potential mechanism for conceptually-based forms of cue recognition during target retrieval failure. They also provide converging evidence for the existence of the semantic features envisaged in feature-based models of semantic knowledge representation and for those more concretely specified by the production norms of McRae et al. (Behavior Research Methods, Instruments, & Computers, 37, 547-559, 2005).
Veacheslav L. Perju
A new informative feature, image complexity, is proposed. An experimental estimation of image complexity is carried out. Two optical-electronic processors for image complexity calculation are developed. The necessary number of image digitization elements is determined as a function of image complexity. The accuracy of the image complexity feature calculation is evaluated.
Zhang, Shijun; Jing, Zhongliang; Li, Jianxun
The rotation-invariant feature of the target is obtained using the multi-direction feature extraction property of the steerable filter. The adaptive topological region is selected by combining the morphological top-hat transform with the self-organizing feature map neural network, and the topological region is shrunk using the erosion operation. The steerable filter based morphological self-organizing feature map neural network is applied to automatic target recognition of binary standard patterns and real-world infrared sequence images. Compared with the Hamming network and morphological shared-weight networks, the proposed method achieves a higher correct recognition rate, robust adaptability, quicker training, and better generalization.
Iqtait, M.; Mohamad, F. S.; Mamat, M.
Biometrics is a pattern recognition approach used for the automatic recognition of persons based on the characteristics and features of an individual. Face recognition with a high recognition rate is still a challenging task and is usually accomplished in three phases: face detection, feature extraction, and expression classification. Precise and robust location of trait points is a complicated and difficult issue in face recognition. Cootes proposed a Multi Resolution Active Shape Models (ASM) algorithm, which can extract a specified shape accurately and efficiently. Furthermore, as an improvement on ASM, the Active Appearance Models (AAM) algorithm was proposed to extract both the shape and the texture of a specified object simultaneously. In this paper we give more details about the two algorithms and report the results of experiments testing their performance on one dataset of faces. We found that the ASM is faster and locates trait points more accurately than the AAM, but the AAM achieves a better match to the texture.
Amol D. Rahulkar
Feature extraction plays a very important role in iris recognition. Recent research on multiscale analysis provides a good opportunity to extract more accurate information for iris recognition. In this work, new directional iris texture features based on the 2-D Fast Discrete Curvelet Transform (FDCT) are proposed. The proposed approach divides the normalized iris image into six sub-images and applies the curvelet transform independently to each sub-image. The anisotropic feature vector for each sub-image is derived from the directional energies of the curvelet coefficients. These six feature vectors are combined to create the resultant feature vector. During recognition, a nearest neighbor classifier based on Euclidean distance is used for authentication. The effectiveness of the proposed approach has been tested on two different databases, namely UBIRIS and MMU1. Experimental results show the superiority of the proposed approach.
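The overall shape of the pipeline in the abstract above (six sub-images, one energy-style feature per sub-image, nearest-neighbor matching by Euclidean distance) can be sketched as below. The curvelet transform is not implemented here: plain mean squared pixel value is swapped in as a stand-in for the directional curvelet energies, and the vertical-strip partition, function names, and data layout are assumptions.

```python
import math


def block_energies(img, n_blocks=6):
    """Split an image into n_blocks vertical strips and return one energy each.

    img is a list of equal-length rows with at least n_blocks columns.
    Mean squared pixel value stands in for curvelet directional energy.
    Columns left over when the width is not divisible are ignored.
    """
    step = len(img[0]) // n_blocks
    features = []
    for b in range(n_blocks):
        values = [row[c] ** 2
                  for row in img
                  for c in range(b * step, (b + 1) * step)]
        features.append(sum(values) / len(values))
    return features


def nearest_neighbour(query, gallery):
    """Return the label of the gallery entry closest to query.

    gallery: list of (label, feature_vector) pairs; distance is Euclidean.
    """
    return min(gallery, key=lambda item: math.dist(query, item[1]))[0]
```

In an authentication setting, the smallest distance would normally also be compared against an acceptance threshold rather than accepted unconditionally.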
The iris image is easily polluted by noise and uneven light. This paper proposes an improved extreme learning machine (ELM) based iris recognition algorithm with a hybrid feature. 2D-Gabor filters and GLCM are employed to generate a multi-granularity hybrid feature vector. The 2D-Gabor filter and GLCM features capture low-to-intermediate frequency and high frequency texture information, respectively. Finally, we use the extreme learning machine for iris recognition. Experimental results show that the proposed ELM based multi-granularity iris recognition algorithm (ELM-MGIR) reaches a higher accuracy of 99.86% and a lower EER of 0.12% while maintaining real-time performance. The proposed ELM-MGIR algorithm outperforms other mainstream iris recognition algorithms.
databases. This shows that the evaluation of algorithms on just one or two databases is not sufficient to confirm the performance of techniques as they may be database-dependent. Much work was done to find a feature-set that would have a good performance across three FVC databases of the FVC 2000, 2002 and 2004 ...
Hesselink, Wim H.
The Euclidean distance transform of a binary image is the function that assigns to every pixel the Euclidean distance to the background. The Euclidean feature transform is the function that assigns to every pixel the set of background pixels with this distance. We present an algorithm to compute the
...-optimal classification accuracy. Therefore, to improve the classification accuracy, a new feature vector, combining joint angles and the relative position of the arm joints with respect to the head, is proposed. A k-means classifier is used to cluster each gesture. New...
Teulings, Hans-leo L; Schomaker, L R; Impedovo, S.
A handwriting pattern is considered as a sequence of ballistic strokes. Replications of a pattern may be generated from a single, higher-level memory representation, acting as a motor program. Therefore, those stroke features which show the most invariant pattern are probably related to the
This paper addresses the problems arising from the use of data acquired with two different remote sensing techniques, high-resolution satellite imagery (HRSI) and terrestrial laser scanning (TLS), for the extraction of digital elevation models (DEMs) used in the geomorphological analysis and recognition of landslides, taking into account the uncertainties associated with DEM production. In order to obtain a georeferenced and edited point cloud, the two data sets require quite different processes, which are more complex for satellite images than for TLS data. The differences between the two processes are highlighted. The point clouds are interpolated on a DEM with a 1 m grid size using kriging. Starting from these DEMs, a number of contour, slope, and aspect maps are extracted, together with their associated uncertainty maps. Comparative analysis of selected landslide features drawn from the two data sources allows recognition and classification of hierarchical and multiscale landslide components. Taking into account the uncertainty related to the maps makes it possible to locate areas for which one data source gave more reliable results than the other. Our case study is located in Southern Italy, in an area known for active landslides.
Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang
Object recognition has wide applications in the areas of human-machine interaction and multimedia retrieval. However, due to the problems of visual polysemy and concept polymorphism, it is still a great challenge to obtain reliable recognition results for 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel brings more information. A very special and important case of object recognition is hand-held object recognition, as the hand is a direct and natural means of both human-human and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogeneous features with different modalities and extraction techniques. Although hand-crafted features preserve low-level information such as shape and color, they have shown weakness in representing high-level semantic information compared with automatically learned features, especially deep features. Deep features have shown great advantages in large-scale dataset recognition but are not always robust to rotation or scale variance compared with hand-crafted features. We propose a method to combine hand-crafted point cloud features and deep learned features in the RGB and depth channels. First, hand-held object segmentation is implemented using depth cues and human skeleton information. Second, we combine the extracted heterogeneous 3D features in different stages using linear concatenation and multiple kernel learning (MKL). Then a trained model is used to recognize 3D hand-held objects. Experimental results validate the effectiveness and generalization ability of the proposed method.
In this paper, we present a Synthetic Aperture Radar (SAR) image target recognition algorithm based on multi-feature multiple representation learning classifier fusion. First, it extracts three features from the SAR images, namely principal component analysis, wavelet transform, and Two-Dimensional Slice Zernike Moments (2DSZM) features. Second, we apply the sparse representation classifier and the cooperative representation classifier to the above-mentioned features to get six predictive labels. Finally, we adopt classifier fusion to obtain the final recognition decision. We studied three different classifier fusion algorithms in our experiments, and the results demonstrate that using Bayesian decision fusion gives the best recognition performance. The method based on multi-feature multiple representation learning classifier fusion integrates the discrimination of multiple features and combines the sparse and cooperative representation classification performance to gain complementary advantages and to improve recognition accuracy. The experiments are based on the Moving and Stationary Target Acquisition and Recognition (MSTAR) database, and they demonstrate the effectiveness of the proposed approach.
Kristensen, Rasmus Lyngby; Tan, Zheng-Hua; Ma, Zhanyu
This paper conducts a survey of modern binary pattern flavored feature extractors applied to the Facial Expression Recognition (FER) problem. In total, 26 different feature extractors are included, of which six are selected for in depth description. In addition, the paper unifies important FER...
Freire, Alejo; Lee, Kang
Tested in two studies 4- to 7-year-olds' face recognition by manipulating the faces' configural and featural information. Found that even with only a single 5-second exposure, most children could use configural and featural cues to make identity judgments. Repeated exposure and feedback improved others' performance. Even proficient memories were…
The conventional LBP-based feature, as represented by the local binary pattern (LBP) histogram, still has room for performance improvement. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on the LBP histogram representation. To extract robust local features in infrared face images, LBP is chosen to obtain the composition of micro-patterns of sub-blocks. Based on statistical test theory, a Kruskal-Wallis (KW) feature selection method is proposed to obtain the LBP patterns that are suitable for infrared face recognition. The experimental results show that the combination of LBP and KW feature selection improves the performance of infrared face recognition, and that the proposed method outperforms the traditional methods based on the LBP histogram, the discrete cosine transform (DCT) or principal component analysis (PCA).
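The LBP histogram that this abstract starts from can be sketched as below. This is a minimal sketch of the basic 8-neighbour, radius-1 LBP operator only, assuming a grayscale image stored as a list of rows; the bit ordering is one common convention among several, and the Kruskal-Wallis selection step, which would keep a subset of the 256 bins, is not shown.

```python
# Clockwise neighbour offsets starting at the top-left pixel (a convention,
# not something the abstract specifies).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): a neighbour >= centre sets its bit."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of one block."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

# Toy 3x3 block: only the centre pixel has a full neighbourhood.
ramp = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
ramp_code = lbp_code(ramp, 1, 1)
```

For a face image, one histogram is usually computed per sub-block and the histograms are concatenated, which is what makes the dimension reduction discussed in the abstract worthwhile.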
Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali
In recent years, many research works have been published using speech-related features for speech emotion recognition; however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features, and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features, respectively. Three different emotional speech databases were used to evaluate the proposed method. An extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted, and the results show that the proposed method significantly improves speech emotion recognition performance compared to previous works published in the literature.
Diaz-Escobar, Julia; Kober, Vitaly
Nowadays most digital information is obtained using mobile devices, especially smartphones. In particular, this brings the opportunity for optical character recognition in camera-captured images. For this reason many recognition applications have recently been developed, such as recognition of license plates, business cards, receipts and street signs, document classification, augmented reality, language translation and so on. Camera-captured images are usually affected by geometric distortions, nonuniform illumination, shadow and noise, which make the recognition task difficult for existing systems. It is well known that the Fourier phase contains a lot of important information regardless of the Fourier magnitude. So, in this work we propose a phase-based recognition system exploiting phase-congruency features for illumination/scale invariance. The performance of the proposed system is tested in terms of misclassifications and false alarms with the help of computer simulation.
Nasrollahi, Kamal; Moeslund, Thomas B.
Face recognition is still a very challenging task when the input face image is noisy, occluded by some obstacles, of very low resolution, not facing the camera, or not properly illuminated. These problems make the feature extraction, and consequently the face recognition system, unstable. The proposed system in this paper introduces the novel idea of using Haar-like features, which have commonly been used for object detection, along with a probabilistic classifier for face recognition. The proposed system is simple, real-time, effective and robust against most of the mentioned problems. Experimental results on public databases show that the proposed system indeed outperforms the state-of-the-art face recognition systems.
Keong Chen Wong; Yusof Yusri
This paper presents an algorithm for efficiently recognizing and determining the convexity of an edge blend feature. The algorithm first recognizes all of the edge blend features from the Boundary Representation of a part; then a series of convexity tests is run on the recognized edge blend features. The novelty of the presented algorithm lies in, instead of each recognized blend feature is suppressed as most of researchers did, the recognized blend features of this research are gone th...
in saving labeling costs while simultaneously achieving good performance. Most semi-supervised learning methods assume that nearby points are likely... 3, 5, 10 and 15) per category in the training set, thus resulting in , , , and randomly labeled videos, with the remaining training videos unlabeled... with the increase of labeled training samples, the performance of all algorithms rises. Meanwhile, the performance differences between our method and
Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III
Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
Babloyantz, A.; Ivanov, V.V.; Zrelov, P.V.
A new approach for the detection of slight changes in the form of the ECG signal is proposed. It is based on the approximation of raw ECG data inside each RR-interval by the expansion in polynomials of special type and on the classification of samples represented by sets of expansion coefficients using a layered feed-forward neural network. The transformation applied provides significantly simpler data structure, stability to noise and to other accidental factors. A by-product of the method is the compression of ECG data with factor 5
V. K. Hohlov
The article gives the rationale for selecting the informative features of helicopter and aircraft acoustic signals for the recognition problem and shows that the most informative ones are the counts of extrema in the energy spectra of the input signals, which are non-centered random variables. The apparatus of multiple initial regression coefficients was selected as the mathematical tool of the research. Digital recirculators with positive and negative feedback, which have comb-like frequency characteristics, solve the problem of selecting informative features. A distinguishing feature of this approach is that the comb frequency characteristics are easily tuned simply by changing the delay of the digital signal in the feedback circuit. Adding an adaptation block to the feature selection block ensures the invariance of the informative feature used and of the counts of local extrema of the spectral power density to the airspeed of a helicopter. The paper justifies the principle of adaptation and the structure of the adaptation block. The discriminator characteristics are formed from the cross-correlation statistical characteristics of the quadrature components of acoustic signal realizations, obtained by the Hilbert transform. The paper proposes to perform signal recognition using a regression algorithm that can handle non-centered informative features and uses a priori information about the initial regression coefficients of signal and noise. Simulation in Matlab Simulink has shown that the selected informative features have good separability under regression processing of signal realizations of 0.5 s duration and, based on a set of 100 acoustic signal realizations in each class under full-scale conditions, has confirmed a correct recognition probability of at least 0.975. The considered principles of informative features selection and adaptation can
This study examines the complexities of using netted radar to recognize and resolve ballistic midcourse targets. The application of micro-motion feature extraction to ballistic midcourse targets is analyzed, and the current status of application and research on micro-motion feature recognition is summarized for single-function radar networks such as low- and high-resolution imaging radar networks. Advantages and disadvantages of these networks are discussed with respect to target recognition. Hybrid-mode radar networks combine low- and high-resolution imaging radar and provide a specific reference frequency that is the basis for ballistic target recognition. Main research trends are discussed for hybrid-mode networks that apply micro-motion feature extraction to ballistic midcourse targets.
Du, C; Zhou, S; Sun, J; Zhao, J
On the basis of manifold learning theory, a new feature extraction method for synthetic aperture radar (SAR) target recognition is proposed. First, the proposed algorithm estimates the within-class and between-class local neighbourhood surrounding each SAR sample. After computing the local tangent space for each neighbourhood, the proposed algorithm seeks the optimal projecting matrix by preserving the local within-class property and simultaneously maximizing the local between-class separability. The use of an uncorrelated constraint can also enhance the discriminating power of the optimal projecting matrix. Finally, the nearest neighbour classifier is applied to recognize SAR targets in the projected feature subspace. Experimental results on MSTAR datasets demonstrate that the proposed method can provide a higher recognition rate than traditional feature extraction algorithms in SAR target recognition.
Islam, Md Rabiul
The aim of this work is to propose a new feature and score fusion based iris recognition approach in which a voting method on the Multiple Classifier Selection technique is applied. The outputs of four Discrete Hidden Markov Model classifiers, that is, a left iris based unimodal system, a right iris based unimodal system, a left-right iris feature fusion based multimodal system, and a left-right iris likelihood ratio score fusion based multimodal system, are combined using the voting method to achieve the final recognition result. The CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, the recognition accuracy of the proposed system has been compared with the existing N Hamming distance score fusion approach proposed by Ma et al., the log-likelihood ratio score fusion approach proposed by Schmid et al., and the single level feature fusion approach proposed by Hollingsworth et al.
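Combining the four classifier outputs by voting can be illustrated with a plain majority vote. The abstract does not say which voting rule or tie-breaking rule is used, so both are assumptions here, as are the function name `majority_vote` and the subject labels.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    predictions is a list with one label per classifier. Ties are broken in
    favour of the earliest-listed classifier (an assumption for this sketch).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical outputs of the four systems: left iris, right iris,
# feature fusion, and likelihood ratio score fusion.
outputs = ["subject_07", "subject_12", "subject_07", "subject_07"]
decision = majority_vote(outputs)
```

Ordering the classifiers from most to least trusted turns the tie-break into a simple priority rule, which is one way a selection technique could interact with voting.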
Feng, Guang; Li, Hengjian; Dong, Jiwen; Chen, Xi; Yang, Huiru
In this paper, we propose a joint and collaborative representation with Volterra kernel convolution features (JCRVK) for face recognition. First, the candidate face images are divided into sub-blocks of equal size. Features are extracted from the blocks using two-dimensional Volterra kernel discriminant analysis, which can better capture the discriminative information from different faces. Next, the proposed joint and collaborative representation is employed to optimize and classify the local Volterra kernel features (JCR-VK) individually. JCR-VK is very efficient because its implementation depends only on matrix multiplication. Finally, recognition is completed using the majority voting principle. Extensive experiments on the Extended Yale B and AR face databases are conducted, and the results show that the proposed approach can outperform other recently presented similar dictionary algorithms in recognition accuracy.
Van der Walt, Christiaan M
artificial data sets to construct a meta-classifier. 4.1. Classifiers We will use model-based and discriminative classifiers to construct our meta-classifier; these classifiers are the Naïve Bayes (NB), Gaussian (Gauss), Gaussian Mixture Model (GMM... of these classifiers for real-world data sets. 4.2. Artificial data We will make use of artificial data sets to construct a meta-classification training set; these artificial data sets are generated with very specific data properties that influence classifi...
Chidananda, H.; Reddy, T. Hanumantha
This paper presents a natural representation of numerical digits using hand activity analysis, based on the number of fingers outstretched for each numerical digit in a sequence extracted from a video. The analysis is based on determining a set of six features from a hand image. The most important features used from each frame in a video are the first fingertip from the top, the palm-line, the palm-center, and the valley points between the fingers that lie above the palm-line. Using this work, a user can convey any number of numerical digits using the right hand, the left hand, or both hands naturally in a video. Each numerical digit ranges from 0 to 9. The hands (right/left/both) used to convey digits can be recognized accurately using the valley points, and with this recognition it can be determined whether the user is right- or left-handed in practice. In this work, first the hand(s) and face parts are detected using the YCbCr color space, and the face part is removed using an ellipse based method. Then, the hand(s) are analyzed to recognize the activity that represents a series of numerical digits in a video. This work uses a pixel continuity algorithm based on a 2D coordinate geometry system and does not make regular use of calculus, contours, convex hulls or datasets.
Zhang, Yu; Wu, Jianxin; Cai, Jianfei
In large-scale visual recognition and image retrieval tasks, feature vectors such as the Fisher vector (FV) or the vector of locally aggregated descriptors (VLAD) have achieved state-of-the-art results. However, the combination of large numbers of examples and high-dimensional vectors necessitates dimensionality reduction in order to reduce storage and CPU costs to a reasonable range. In spite of the popularity of various feature compression methods, this paper shows that feature (dimension) selection is a better choice for high-dimensional FV/VLAD than feature (dimension) compression methods, e.g., product quantization. We show that strong correlation among the feature dimensions in the FV and the VLAD may not exist, which renders feature selection a natural choice. We also show that many dimensions in FV/VLAD are noise. Throwing them away using feature selection is better than compressing them together with the useful dimensions using feature compression methods. To choose features, we propose an efficient importance sorting algorithm considering both the supervised and unsupervised cases, for visual recognition and image retrieval, respectively. Combined with 1-bit quantization, feature selection has achieved both higher accuracy and lower computational cost than feature compression methods, such as product quantization, on the FV and the VLAD image representations.
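The idea of sorting dimensions by importance, keeping the top ones, and then storing one bit per kept dimension can be sketched as follows. The abstract does not state the importance measure, so per-dimension variance is used here purely as a stand-in for the unsupervised case; the sign-based 1-bit rule and the names `select_dimensions` and `one_bit` are likewise assumptions for illustration.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def select_dimensions(vectors, keep):
    """Indices of the `keep` most important dimensions, in ascending order.

    Importance is the variance of each dimension across all vectors
    (an assumed score, not necessarily the paper's).
    """
    dims = range(len(vectors[0]))
    scores = [variance([vec[d] for vec in vectors]) for d in dims]
    ranked = sorted(dims, key=lambda d: scores[d], reverse=True)
    return sorted(ranked[:keep])

def one_bit(vector, dims):
    """Keep only the selected dimensions and store each as a sign bit."""
    return [1 if vector[d] >= 0 else 0 for d in dims]

# Toy vectors: the middle dimension barely varies and is treated as noise.
vectors = [
    [0.9, 0.01, -0.5],
    [-0.8, 0.02, 0.6],
    [0.7, 0.00, -0.4],
]
kept = select_dimensions(vectors, keep=2)
codes = [one_bit(vec, kept) for vec in vectors]
```

Selection only needs one pass to score the dimensions and an index lookup at query time, which is where the cost advantage over learning a compression codebook would come from under these assumptions.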
Noor Abdalrazak Shnain
Facial recognition is one of the most challenging and interesting problems within the field of computer vision and pattern recognition. During the last few years, it has gained special attention due to its importance in relation to current issues such as security, surveillance systems and forensic analysis. Despite this high level of attention to facial recognition, success is still limited by certain conditions; there is no method which gives reliable results in all situations. In this paper, we propose an efficient similarity index that resolves the shortcomings of the existing measures of feature and structural similarity. This measure, called the Feature-Based Structural Measure (FSM), combines the best features of the well-known SSIM (structural similarity index measure) and FSIM (feature similarity index measure) approaches, striking a balance between performance for similar and dissimilar images of human faces. In addition to the statistical structural properties provided by SSIM, edge detection is incorporated in FSM as a distinctive structural feature. Its performance is tested for a wide range of PSNR (peak signal-to-noise ratio) values, using the ORL (Olivetti Research Laboratory, now AT&T Laboratory Cambridge) and FEI (Faculty of Industrial Engineering, São Bernardo do Campo, São Paulo, Brazil) databases. The proposed measure is tested under conditions of Gaussian noise; simulation results show that the proposed FSM outperforms the well-known SSIM and FSIM approaches in its efficiency of similarity detection and recognition of human faces.
Petar S. Aleksic
We describe an audio-visual automatic continuous speech recognition system, which significantly improves speech recognition performance over a wide range of acoustic noise levels, as well as under clean audio conditions. The system utilizes facial animation parameters (FAPs) supported by the MPEG-4 standard for the visual representation of speech. We also describe a robust and automatic algorithm we have developed to extract FAPs from visual data, which does not require hand labeling or extensive training procedures. Principal component analysis (PCA) was performed on the FAPs in order to decrease the dimensionality of the visual feature vectors, and the derived projection weights were used as visual features in the audio-visual automatic speech recognition (ASR) experiments. Both single-stream and multistream hidden Markov models (HMMs) were used to model the ASR system, integrate audio and visual information, and perform relatively large vocabulary (approximately 1000 words) speech recognition experiments. The experiments use clean audio data and audio data corrupted by stationary white Gaussian noise at various SNRs. The proposed system reduces the word error rate (WER) by 20% to 23% relative to audio-only speech recognition WERs at various SNRs (0–30 dB) with additive white Gaussian noise, and by 19% relative to audio-only speech recognition WER under clean audio conditions.
Forensic speaker recognition is experiencing a remarkable paradigm shift in terms of the evaluation framework and presentation of voice evidence. This paper proposes a new method of forensic automatic speaker recognition using the likelihood ratio framework to quantify the strength of voice evidence. The proposed method uses a reference database to calculate the within- and between-speaker variability. Some acoustic-phonetic features are extracted automatically using the software VoiceSauce. The effectiveness of the approach was tested using two Mandarin databases: a mobile telephone database and a landline database. The experimental results indicate that these acoustic-phonetic features do have some discriminating potential and are worth trying in discrimination. The automatic acoustic-phonetic features have acceptable discriminative performance and can provide more reliable results in evidence analysis when fused with other kinds of voice features.
Miwa, Shotaro; Kage, Hiroshi; Hirai, Takashi; Sumi, Kazuhiko
We propose a probabilistic face recognition algorithm for Access Control Systems (ACSs). Compared with existing ACSs using low-cost IC cards, face recognition has advantages in usability and security: it does not require people to hold cards over scanners and does not accept imposters with authorized cards. Therefore face recognition attracts more interest in security markets than IC cards. But in security markets where low-cost ACSs exist, price competition is important, and there is a limitation on the quality of available cameras and image control. Therefore ACSs using face recognition are required to handle much lower quality images, such as defocused and poorly gain-controlled images, than high-security systems such as immigration control. To tackle such image quality problems we developed a face recognition algorithm based on a probabilistic model which combines a variety of image-difference features trained by Real AdaBoost with their prior probability distributions. This makes it possible to evaluate and utilize only the reliable features among the trained ones during each authentication, and to achieve high recognition performance rates. The field evaluation using a pseudo Access Control System installed in our office shows that the proposed system achieves a consistently high recognition performance rate independent of face image quality, that is, about four times lower EER (Equal Error Rate) under a variety of image conditions than a system without any prior probability distributions. On the other hand, image-difference features used without any prior probabilities are sensitive to image quality. We also evaluated PCA, which has worse but constant performance rates because of its general optimization on overall data. Compared with PCA, Real AdaBoost without any prior distribution performs twice as well under good image conditions, but degrades to a performance as good as PCA under poor image conditions.
Joao Ricardo Sato
Attention-Deficit/Hyperactivity Disorder (ADHD) is a neurodevelopmental disorder and one of the most prevalent psychiatric disorders in childhood. The neural substrates associated with this condition, both from structural and functional perspectives, are not yet well established. Recent studies have highlighted the relevance of neuroimaging not only to provide a more solid understanding of the disorder but also for possible clinical support. The ADHD-200 Consortium organized the ADHD-200 global competition, making publicly available hundreds of structural magnetic resonance imaging (MRI) and functional MRI (fMRI) datasets of both ADHD patients and typically developing controls for research use. In the current study, we evaluate the predictive power of a set of three different feature extraction methods and 10 different pattern recognition methods. The features tested were regional homogeneity (ReHo), amplitude of low frequency fluctuations (ALFF) and independent components analysis maps (RSN). Our findings suggest that the combination of ALFF+ReHo maps contains relevant information to discriminate ADHD patients from typically developing controls, but with limited accuracy. All classifiers provided almost the same performance in this case. In addition, the combination ALFF+ReHo+RSN was relevant in combined vs inattentive ADHD classification, achieving an accuracy of 67%. In this latter case, the performances of the classifiers were not equivalent, and L2-regularized logistic regression (both in primal and dual space) provided the most accurate predictions. The analysis of the brain regions containing the most discriminative information suggested that in both classifications (ADHD vs typically developing controls and combined vs inattentive), the relevant information is not confined to a small set of regions but is spatially distributed across the whole brain.
Zhang, Yu; Zhou, Guoxu; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
Canonical correlation analysis (CCA) has been successfully applied to steady-state visual evoked potential (SSVEP) recognition for brain-computer interface (BCI) application. Although the CCA method outperforms the traditional power spectral density analysis through multi-channel detection, it requires additionally pre-constructed reference signals of sine-cosine waves. It is likely to encounter overfitting in using a short time window since the reference signals include no features from training data. We consider that a group of electroencephalogram (EEG) data trials recorded at a certain stimulus frequency on a same subject should share some common features that may bear the real SSVEP characteristics. This study therefore proposes a common feature analysis (CFA)-based method to exploit the latent common features as natural reference signals in using correlation analysis for SSVEP recognition. Good performance of the CFA method for SSVEP recognition is validated with EEG data recorded from ten healthy subjects, in contrast to CCA and a multiway extension of CCA (MCCA). Experimental results indicate that the CFA method significantly outperformed the CCA and the MCCA methods for SSVEP recognition in using a short time window (i.e., less than 1s). The superiority of the proposed CFA method suggests it is promising for the development of a real-time SSVEP-based BCI. Copyright © 2014 Elsevier B.V. All rights reserved.
Bartko, Susan J.; Winters, Boyer D.; Cowell, Rosemary A.; Saksida, Lisa M.; Bussey, Timothy J.
The perirhinal cortex (PRh) has a well-established role in object recognition memory. More recent studies suggest that PRh is also important for two-choice visual discrimination tasks. Specifically, it has been suggested that PRh contains conjunctive representations that help resolve feature ambiguity, which occurs when a task cannot easily be…
Ribeiro, Alexandra B.; Nielsen, Allan Aasbjerg
qualitative microprobe results: present elements Al, Si, Cr, Fe, As (associated with others). Selected groups of calibrated images (same light conditions and magnification) submitted to discriminant analysis, in order to find a pattern of recognition in the soil features corresponding to contamination already...
Gutta, Sandeep; Cheng, Qi
Traditional biometric recognition systems often utilize physiological traits such as fingerprint, face, iris, etc. Recent years have seen a growing interest in electrocardiogram (ECG)-based biometric recognition techniques, especially in the field of clinical medicine. In existing ECG-based biometric recognition methods, feature extraction and classifier design are usually performed separately. In this paper, a multitask learning approach is proposed, in which feature extraction and classifier design are carried out simultaneously. Weights are assigned to the features within the kernel of each task. We decompose the matrix consisting of all the feature weights into sparse and low-rank components. The sparse component determines the features that are relevant to identify each individual, and the low-rank component determines the common feature subspace that is relevant to identify all the subjects. A fast optimization algorithm is developed, which requires only the first-order information. The performance of the proposed approach is demonstrated through experiments using the MIT-BIH Normal Sinus Rhythm database.
Li, Chunyong; Xue, Jiguo; Quan, Cheng; Yue, Jingwei; Zhang, Chenggang
Biometric recognition technology based on eye-movement dynamics has been in development for more than ten years. Different visual tasks, feature extraction and feature recognition methods are proposed to improve the performance of eye movement biometric system. However, the correct identification and verification rates, especially in long-term experiments, as well as the effects of visual tasks and eye trackers' temporal and spatial resolution are still the foremost considerations in eye movement biometrics. With a focus on these issues, we proposed a new visual searching task for eye movement data collection and a new class of eye movement features for biometric recognition. In order to demonstrate the improvement of this visual searching task being used in eye movement biometrics, three other eye movement feature extraction methods were also tested on our eye movement datasets. Compared with the original results, all three methods yielded better results as expected. In addition, the biometric performance of these four feature extraction methods was also compared using the equal error rate (EER) and Rank-1 identification rate (Rank-1 IR), and the texture features introduced in this paper were ultimately shown to offer some advantages with regard to long-term stability and robustness over time and spatial precision. Finally, the results of different combinations of these methods with a score-level fusion method indicated that multi-biometric methods perform better in most cases.
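For readers unfamiliar with the equal error rate used above, a minimal sketch of how it might be estimated from match scores is given below. It assumes higher scores mean greater similarity and simply picks the threshold where the false accept and false reject rates are closest; the function name and the example scores are hypothetical and not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine and impostor similarity scores.

    A sample is accepted when its score is >= the threshold. The EER is
    approximated at the threshold where FAR and FRR are closest.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer

# Hypothetical scores: one genuine score is low and one impostor score is high.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
```

A finer estimate would interpolate between thresholds; this sketch only evaluates the observed score values.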
Zhang, Daming; Zhang, Xueyong; Li, Lu; Liu, Huayong
This paper investigates a face recognition approach based on Scale Invariant Feature Transform (SIFT) features and sparse representation. The approach takes advantage of SIFT, which is a local feature rather than the holistic feature used in the classical Sparse Representation based Classification (SRC) algorithm, and which is strongly robust to expression, pose and illumination variations. Since a hexagonal image has more inherent merits than a square image for making the recognition process efficient, we extract SIFT keypoints from a hexagonally sampled image. Instead of matching SIFT features, first the sparse representation of each SIFT keypoint is computed according to the constructed dictionary; second, these sparse vectors are quantized according to the dictionary; finally, each face image is represented by a histogram, and these so-called Bag-of-Words vectors are classified by an SVM. Owing to the use of local features, the proposed method achieves better results even when the number of training samples is small. In the experiments, the proposed method gave higher face recognition rates than other methods on the ORL and Yale B face databases; the effectiveness of hexagonal sampling in the proposed method is also verified.
We present results of a study into the performance of a variety of different image transform-based feature types for speaker-independent visual speech recognition of isolated digits. This includes the first reported use of features extracted using a discrete curvelet transform. The study will show a comparison of some methods for selecting features of each feature type and show the relative benefits of both static and dynamic visual features. The performance of the features will be tested on both clean video data and also video data corrupted in a variety of ways to assess each feature type's robustness to potential real-world conditions. One of the test conditions involves a novel form of video corruption we call jitter which simulates camera and/or head movement during recording.
Background Cholera mainly affects developing countries where safe water supply and sanitation infrastructure are often rudimentary. Sub-Saharan Africa is a cholera hotspot. Effective cholera control requires not only a professional assessment, but also consideration of community-based priorities. The present work compares local sociocultural features of endemic cholera in urban and rural sites from three field studies in southeastern Democratic Republic of Congo (SE-DRC), western Kenya and Zanzibar. Methods A vignette-based semistructured interview was used in 2008 in Zanzibar to study sociocultural features of cholera-related illness among 356 men and women from urban and rural communities. Similar cross-sectional surveys were performed in western Kenya (n = 379) and in SE-DRC (n = 360) in 2010. Systematic comparison across all settings considered the following domains: illness identification; perceived seriousness, potential fatality and past household episodes; illness-related experience; meaning; knowledge of prevention; help-seeking behavior; and perceived vulnerability. Results Cholera is well known in all three settings and is understood to have a significant impact on people’s lives. Its social impact was mainly characterized by financial concerns. Problems with unsafe water, sanitation and dirty environments were the most common perceived causes across settings; nonetheless, non-biomedical explanations were widespread in rural areas of SE-DRC and Zanzibar. Safe food and water and vaccines were prioritized for prevention in SE-DRC. Safe water was prioritized in western Kenya along with sanitation and health education. The latter two were also prioritized in Zanzibar. Use of oral rehydration solutions and rehydration was a top priority everywhere; healthcare facilities were universally reported as a primary source of help. Respondents in SE-DRC and Zanzibar reported cholera as affecting almost everybody without differentiating much for gender, age
Schafer, Phillip B; Jin, Dezhe Z
Speech recognition in noisy conditions is a major challenge for computer systems, but the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems that are inspired by neuroscience can potentially bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes from a population of simulated auditory feature-detecting neurons. Each neuron is trained to respond selectively to a brief spectrotemporal pattern, or feature, drawn from the simulated auditory nerve response to speech. The neural population conveys the time-dependent structure of a sound by its sequence of spikes. We compare two methods for decoding the spike sequences--one using a hidden Markov model-based recognizer, the other using a novel template-based recognition scheme. In the latter case, words are recognized by comparing their spike sequences to template sequences obtained from clean training data, using a similarity measure based on the length of the longest common sub-sequence. Using isolated spoken digits from the AURORA-2 database, we show that our combined system outperforms a state-of-the-art robust speech recognizer at low signal-to-noise ratios. Both the spike-based encoding scheme and the template-based decoding offer gains in noise robustness over traditional speech recognition methods. Our system highlights potential advantages of spike-based acoustic coding and provides a biologically motivated framework for robust ASR development.
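The template-based decoding idea, comparing a spike sequence to stored templates by the length of their longest common subsequence, can be sketched as follows. This is a minimal illustration, not the authors' recognizer: the normalization by the longer sequence length, the function names and the toy neuron labels are all assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def lcs_similarity(a, b):
    """LCS length normalized by the longer sequence (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))

def recognize(spikes, templates):
    """Return the word whose template sequence is most similar to `spikes`."""
    return max(templates, key=lambda word: lcs_similarity(spikes, templates[word]))

# Toy example: each integer stands for the index of the neuron that fired.
templates = {"one": [3, 1, 4, 2, 5], "two": [9, 8, 7]}
word = recognize([3, 1, 4, 2], templates)
```

A subsequence match tolerates inserted or missing spikes, which is one plausible reason such a measure could help under noise.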
Sagara, Kaoru; Abe, Akinori; Ozaku, Hiromi Itoh; Kuwahara, Noriaki; Kogure, Kiyoshi
This paper reports the features of, and relationships between, standardized nursing terminology sets used in Japan. First, we analyzed the common parts in five standardized nursing terminology sets: the Japan Nursing Practice Standard Master (JNPSM) that includes the names of nursing activities and is built by the Medical Information System Development Center (MEDIS-DC); the labels of the Japan Classification of Nursing Practice (JCNP), built by the term advisory committee in the Japan Academy of Nursing Science; the labels of the International Classification for Nursing Practice (ICNP) translated to Japanese; the labels, domain names, and class names of the North American Nursing Diagnosis Association (NANDA) Nursing Diagnoses 2003-2004 translated to Japanese; and the terms included in the labels of Nursing Interventions Classification (NIC) translated to Japanese. Then we compared them with terms in a thesaurus dictionary, the Bunrui Goihyo, that contains general Japanese words and is built by the National Institute for Japanese Language. The analysis showed that 1) the level of interchangeability between four standardized nursing terminology sets is quite low; 2) abbreviations and katakana words are frequently used to express nursing activities; 3) general Japanese words are usually used to express the status or situation of patients.
Guo, Dongwei; Wang, Zhe
Convolutional neural networks (CNNs) have achieved great success in computer vision: they can learn hierarchical representations from raw pixels and perform outstandingly in various image recognition tasks. However, CNNs are easily fooled, in that it is possible to produce images totally unrecognizable to human eyes that CNNs believe with near certainty are familiar objects. In this paper, an associative memory model based on multiple features is proposed. Within this model, feature extraction and classification are carried out by CNN, T-SNE and exponential bidirectional associative memory neural network (EBAM). The geometric features extracted from CNN and the digital features extracted from T-SNE are associated by EBAM. Thus we ensure robust recognition through a comprehensive assessment of the two features. In our model, the error rate on fraudulent data is only 8%. In systems that require a high safety factor, or in some key areas, strong robustness is extremely important; if image recognition robustness can be ensured, network security will be greatly improved and social production efficiency will be greatly enhanced.
Zhu, Jianwei; Zhang, Haicang; Li, Shuai Cheng; Wang, Chao; Kong, Lupeng; Sun, Shiwei; Zheng, Wei-Mou; Bu, Dongbo
Accurate recognition of protein fold types is a key step for template-based prediction of protein structures. The existing approaches to fold recognition mainly exploit features derived from alignments of the query protein against templates. These approaches have been shown to be successful for fold recognition at the family level, but they usually fail at the superfamily and fold levels. To overcome this limitation, one of the key points is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit this information for fold recognition remains a challenge. In this study, we present an approach (called DeepFR) to improve fold recognition at the superfamily and fold levels. The basic idea of our approach is to extract fold-specific features from predicted residue-residue contacts of proteins using a deep convolutional neural network (DCNN). Based on these fold-specific features, we calculated the similarity between the query protein and the templates, and then assigned the query protein the fold type of the most similar template. DCNNs have shown excellent performance in image feature extraction and image recognition; the rationale underlying the application of DCNNs to fold recognition is that contact likelihood maps are essentially analogous to images, as both display compositional hierarchy. Experimental results on the LINDAHL dataset suggest that even using the extracted fold-specific features alone, our approach achieved a success rate comparable to the state-of-the-art approaches. When further combining these features with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at fold level, 6% higher at superfamily level and 1% higher at family level. An independent assessment on SCOP_TEST dataset showed consistent
Kim, J. Y.; Kim, C. H.; Kim, B. H.
In this study, artificial and natural flaws in welded parts are classified using pattern recognition technology. For this purpose, a signal pattern recognition package including user-defined functions was developed, and the whole procedure, including digital signal processing, feature extraction, feature selection and classifier selection, is handled as a batch. In particular, statistical classifiers such as the linear discriminant function classifier and the empirical Bayesian classifier are implemented and discussed. The pattern recognition technology is also applied to the classification of natural flaws (i.e., the multi-class problem of crack, lack of penetration, lack of fusion, porosity and slag inclusion, and the planar versus volumetric flaw classification problem). According to the results, an appropriately trained neural network classifier performs better than the statistical classifiers in classifying natural flaws, and a recognition rate above 80% can be achieved, although it varies somewhat with the domain from which the features are extracted and with the classifier.
Khayat, Omid; Afarideh, Hossein
Track counting algorithms, as one of the fundamental principles of nuclear science, have received increasing emphasis in recent years. Accurate measurement of nuclear tracks on solid-state nuclear track detectors is the aim of track counting systems. Track counting systems commonly comprise a hardware system for imaging and software for analysing the track images. In this paper, a track recognition algorithm based on 12 defined textual and shape-based features and a neuro-fuzzy classifier is proposed. Features are defined so as to discern the tracks from the background and small objects. Then, according to the defined features, tracks are detected using a trained neuro-fuzzy system. Features and the classifier are finally validated via 100 Alpha track images and 40 training samples. It is shown that the principal textual and shape-based features together yield a high rate of track detection compared with single-feature based methods.
Kolesov, V. I.; Danilov, O. F.; Petrov, A. I.
Road traffic safety (RTS) management is inherently a branch of cybernetics and therefore requires clear formalization of the task. The paper aims at identification of the specific features of goal setting in RTS management under the system approach. The paper presents the results of cybernetic modeling of the cause-to-effect mechanism of a road traffic accident (RTA); in here, the mechanism itself is viewed as a complex system. A designed management goal function is focused on minimizing the difficulty in achieving the target goal. Optimization of the target goal has been performed using the Lagrange principle. The created working algorithms have passed the soft testing. The key role of the obtained solution in the tactical and strategic RTS management is considered. The dynamics of the management effectiveness indicator has been analyzed based on the ten-year statistics for Russia.
Sa'adillah Maylawati, Dian; Putri Saptawati, G. A.
Indonesian slang is commonly used in social media. Because of its unstructured syntax, it is difficult to extract features from it based on Indonesian grammar for text mining. We therefore propose the Set of Frequent Word Item sets (SFWI) as a text representation considered suitable for Indonesian slang. In addition, SFWI is able to keep the meaning of Indonesian slang with regard to the order in which words appear in a sentence. We use the FP-Growth algorithm, with a sentence separation function added to the algorithm, to extract the SFWI features. The experiments were done with text data from social media such as Facebook, Twitter, and personal websites. The results of the experiments show that Indonesian slang was more correctly interpreted based on SFWI.
Zhou, Daoxiang; Yang, Dan; Zhang, Xiaohong; Huang, Sheng; Feng, Shu
Currently, considerable efforts have been devoted to devise image representation. However, handcrafted methods need strong domain knowledge and show low generalization ability, and conventional feature learning methods require enormous training data and rich parameters tuning experience. A lightened feature learner is presented to solve these problems with application to face recognition, which shares similar topology architecture as a convolutional neural network. Our model is divided into three components: cascaded convolution filters bank learning layer, nonlinear processing layer, and feature pooling layer. Specifically, in the filters learning layer, we use K-means to learn convolution filters. Features are extracted via convoluting images with the learned filters. Afterward, in the nonlinear processing layer, hyperbolic tangent is employed to capture the nonlinear feature. In the feature pooling layer, to remove the redundancy information and incorporate the spatial layout, we exploit multilevel spatial pyramid second-order pooling technique to pool the features in subregions and concatenate them together as the final representation. Extensive experiments on four representative datasets demonstrate the effectiveness and robustness of our model to various variations, yielding competitive recognition results on extended Yale B and FERET. In addition, our method achieves the best identification performance on AR and labeled faces in the wild datasets among the comparative methods.
Yang, Guang; Yin, Yafeng; Park, Jeanrok; Man, Hong
As an uncommon biometric modality, human gait recognition has the great advantage of identifying people at a distance without high-resolution images. It has attracted much attention in recent years, especially in the fields of computer vision and remote sensing. In this paper, we propose a human gait recognition framework that consists of a reliable background subtraction method followed by pyramid of Histogram of Gradient (pHOG) feature extraction on the silhouette image, and a Hidden Markov Model (HMM) based classifier. Through background subtraction, the silhouette of human gait in each frame is extracted and normalized from the raw video sequence. After removing the shadow and noise in each region of interest (ROI), pHOG features are computed on the silhouette images. Then the pHOG features of each gait class are used to train a corresponding HMM. In the test stage, pHOG features are extracted from each test sequence and used to calculate the posterior probability under each trained HMM. Experimental results on the CASIA Gait Dataset B1 demonstrate that our proposed method can achieve a very competitive recognition rate.
Kang, Wenxiong; Liu, Yang; Wu, Qiuxia; Yue, Xishun
Contact-free palm-vein recognition is one of the most challenging and promising areas in hand biometrics. In view of the existing problems in contact-free palm-vein imaging, including projection transformation, uneven illumination and difficulty in extracting exact ROIs, this paper presents a novel recognition approach for contact-free palm-vein recognition that performs feature extraction and matching on all vein textures distributed over the palm surface, including finger veins and palm veins, to minimize the loss of feature information. First, a hierarchical enhancement algorithm, which combines a DOG filter and histogram equalization, is adopted to alleviate uneven illumination and to highlight vein textures. Second, RootSIFT, a more stable local invariant feature extraction method in comparison to SIFT, is adopted to overcome the projection transformation in contact-free mode. Subsequently, a novel hierarchical mismatching removal algorithm based on neighborhood searching and LBP histograms is adopted to improve the accuracy of feature matching. Finally, we rigorously evaluated the proposed approach using two different databases and obtained 0.996% and 3.112% Equal Error Rates (EERs), respectively, which demonstrate the effectiveness of the proposed approach.
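As background for the RootSIFT step, the usual formulation L1-normalizes a SIFT descriptor and then takes the element-wise square root. The sketch below assumes that formulation and a plain list of non-negative descriptor values; the function name and the small `eps` guard are illustrative choices, not details from the paper.

```python
import math

def root_sift(descriptor, eps=1e-12):
    """Convert a SIFT-style descriptor to RootSIFT.

    Step 1: L1-normalize. Step 2: element-wise square root.
    The result has (approximately) unit L2 norm.
    """
    l1 = sum(abs(v) for v in descriptor) + eps  # eps avoids division by zero
    return [math.sqrt(abs(v) / l1) for v in descriptor]

# Toy 4-dimensional descriptor (real SIFT descriptors have 128 dimensions).
r = root_sift([1.0, 3.0, 0.0, 12.0])
```

Comparing RootSIFT vectors with Euclidean distance corresponds to comparing the original descriptors with the Hellinger kernel, which is the usual motivation for the transform.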
Liu, Yahui; Zhang, Bob; Lu, Guangming; Zhang, David
The three-dimensional shape of the ear has been proven to be a stable candidate for biometric authentication because of its desirable properties such as universality, uniqueness, and permanence. In this paper, a special laser scanner designed for online three-dimensional ear acquisition was described. Based on the dataset collected by our scanner, two novel feature classes were defined from a three-dimensional ear image: the global feature class (empty centers and angles) and local feature class (points, lines, and areas). These features are extracted and combined in an optimal way for three-dimensional ear recognition. Using a large dataset consisting of 2,000 samples, the experimental results illustrate the effectiveness of fusing global and local features, obtaining an equal error rate of 2.2%.
Yang, Jinfeng; Hong, Bofeng
Multimodal biometrics based on finger identification has been a hot topic in recent years. In this paper, a novel fingerprint-vein based biometric method is proposed to improve the reliability and accuracy of the finger recognition system. First, second-order steerable filters are used to enhance and extract the minutiae features of the fingerprint (FP) and finger-vein (FV). Second, the texture features of the fingerprint and finger-vein are extracted by a bank of Gabor filters. Third, a new triangle-region fusion method is proposed to integrate all the fingerprint and finger-vein features at the feature level. Thus, the fused features contain both the finger texture information and the triangular geometric structure of the minutiae. Finally, experimental results on the self-constructed finger-vein and fingerprint databases show that the proposed method is reliable and precise in personal identification.
Jayanti Yusmah Sari
In recent years, palm vein recognition has been studied to overcome problems in conventional systems in biometrics technology (fingerprint, face, and iris). Those problems include convenience and performance. However, owing to the limited clarity of palm vein images, the veins cannot always be segmented properly. To overcome this problem, we propose a palm vein recognition system using the Local Line Binary Pattern (LLBP) method, which can extract robust features from palm vein images that have unclear veins. LLBP is an advanced variant of the Local Binary Pattern (LBP), a texture descriptor based on the gray-level comparison of a neighborhood of pixels. There are four major steps in this paper: Region of Interest (ROI) detection, image preprocessing, feature extraction using the LLBP method, and matching using a Fuzzy k-NN classifier. The proposed method was applied to the CASIA Multi-Spectral Image Database. Experimental results showed that the proposed method using LLBP performs well, with a recognition accuracy of 97.3%. In the future, experiments are needed to determine which parameters affect the processing time and recognition accuracy of LLBP.
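To make the gray-level comparison concrete, here is a minimal sketch of the basic 3x3 LBP that LLBP builds on. It is not the LLBP method itself. The clockwise bit ordering starting at the top-left neighbor and the `>=` comparison are common conventions assumed here, and the function names are hypothetical.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbor LBP code of the pixel at row r, column c.

    Each neighbor contributes one bit: 1 if neighbor >= center, else 0.
    """
    center = img[r][c]
    # Clockwise from the top-left neighbor (an assumed convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

# Only the top-left neighbor is at least as bright as the center here.
patch = [[9, 1, 1],
         [1, 5, 1],
         [1, 1, 1]]
code = lbp_code(patch, 1, 1)
```

The histogram of codes, rather than the codes themselves, is what would typically be passed to a classifier.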
Hasanuzzaman, Faiz M; Yang, Xiaodong; Tian, YingLi
Camera-based computer vision technology is able to assist visually impaired people to automatically recognize banknotes. A good banknote recognition algorithm for blind or visually impaired people should have the following features: 1) 100% accuracy, and 2) robustness to various conditions in different environments and occlusions. Most existing algorithms of banknote recognition are limited to work for restricted conditions. In this paper we propose a component-based framework for banknote recognition by using Speeded Up Robust Features (SURF). The component-based framework is effective in collecting more class-specific information and robust in dealing with partial occlusion and viewpoint changes. Furthermore, the evaluation of SURF demonstrates its effectiveness in handling background noise, image rotation, scale, and illumination changes. To authenticate the robustness and generalizability of the proposed approach, we have collected a large dataset of banknotes from a variety of conditions including occlusion, cluttered background, rotation, and changes of illumination, scaling, and viewpoints. The proposed algorithm achieves 100% recognition rate on our challenging dataset.
Spoken language recognition (SLR) has been of increasing interest in multilingual speech recognition for identifying the languages of speech utterances. Most existing SLR approaches apply statistical modeling techniques with acoustic and phonotactic features. Among the popular approaches, the acoustic approach has become of greater interest than others because it does not require any prior language-specific knowledge. Previous research on the acoustic approach has shown less interest in applying linguistic knowledge; it was only used as supplementary features, while the current state-of-the-art system assumes independency among features. This paper proposes an SLR system based on the latent-dynamic conditional random field (LDCRF) model using phonological features (PFs). We use PFs to represent acoustic characteristics and linguistic knowledge. The LDCRF model was employed to capture the dynamics of the PFs sequences for language classification. Baseline systems were conducted to evaluate the features and methods including Gaussian mixture model (GMM) based systems using PFs, GMM using cepstral features, and the CRF model using PFs. Evaluated on the NIST LRE 2007 corpus, the proposed method showed an improvement over the baseline systems. Additionally, it showed comparable result with the acoustic system based on i-vector. This research demonstrates that utilizing PFs can enhance the performance.
Andersen, Hans Jørgen; Nguyen, Phuong Giang
In image recognition, the common approach for extracting local features using a scale-space representation has usually three main steps; first interest points are extracted at different scales, next from a patch around each interest point the rotation is calculated with corresponding orientation...... and compensation, and finally a descriptor is computed for the derived patch (i.e. feature of the patch). To avoid the memory and computational intensive process of constructing the scale-space, we use a method where no scale-space is required This is done by dividing the given image into a number of triangles...... with sizes dependent on the content of the image, at the location of each triangle. In this paper, we will demonstrate that by rotation of the interest regions at the triangles it is possible in grey scale images to achieve a recognition precision comparable with that of MOPS. The test of the proposed method...
Tian, Fuyang; Cao, Dong; Dong, Xiaoning; Zhao, Xinqiang; Li, Fade; Wang, Zhonghua
Recognition of behavioral features is an important means of detecting oestrus and sickness in dairy herds, and there is a need for heat detection aids. In this paper, the detection method is based on measuring the individual behavioural activity, standing time, and temperature of dairy cows using a vibration sensor and a temperature sensor. The behavioural activity index, standing time, lying time and walking time were sent to a computer via a low-power wireless communication system. A fast approximate K-means algorithm (FAKM) was proposed to process the sensor data for behavioral feature recognition. As a result of technical progress in monitoring cows using computers, automatic oestrus detection has become possible.
Yu, Jason S.; Dagli, Cihan H.
The invariant image preprocessing of moment invariants generates an invariant representation of object features which are insensitive to position, orientation, size, illusion, and contrast change. In this study ARTMAP is used for 3-D object recognition of manufacturing parts through these invariant characteristics. The analog of moment invariants created through the image preprocessing is interpreted by a binary code which is used to predict the manufacturing part through ARTMAP.
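The idea of an invariant moment representation can be illustrated with a minimal sketch. It assumes Hu's first moment invariant (the abstract does not say which invariants were used), and the function names `central_moment` and `hu_phi1` are hypothetical. The binary coding and the ARTMAP stage are not shown.

```python
def central_moment(img, p, q):
    """Central moment mu_pq of a 2D grey-level image given as a list of rows."""
    m00 = sum(sum(row) for row in img)
    # Intensity-weighted centroid (x is the column index, y the row index).
    xbar = sum(x * v for row in img for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(img) for v in row) / m00
    return sum(((x - xbar) ** p) * ((y - ybar) ** q) * v
               for y, row in enumerate(img) for x, v in enumerate(row))


def hu_phi1(img):
    """First Hu invariant (an assumed choice): eta20 + eta02,
    where eta_pq = mu_pq / mu00 ** (1 + (p + q) / 2)."""
    mu00 = central_moment(img, 0, 0)
    return (central_moment(img, 2, 0) + central_moment(img, 0, 2)) / mu00 ** 2


# Toy L-shaped part, the same part translated, and rotated by 90 degrees.
shape = [[1, 1, 0],
         [1, 0, 0],
         [1, 0, 0]]
shifted = [[0, 0, 0, 0, 0],
           [0, 0, 1, 1, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 0, 0]]
rotated = [list(r) for r in zip(*shape[::-1])]

print(hu_phi1(shape), hu_phi1(shifted), hu_phi1(rotated))
```

The three printed values agree up to floating-point error, which is the property that lets a downstream classifier ignore the position and orientation of the part.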
Memiş, Abbas; Albayrak, Songül
This paper presents a sign language recognition system that uses spatio-temporal features on RGB video images and depth maps for dynamic gestures of Turkish Sign Language. The proposed system uses a motion difference and accumulation approach for temporal gesture analysis. The motion accumulation method, which is an effective method for temporal domain analysis of gestures, produces an accumulated motion image by combining differences of successive video frames. Then, the 2D Discrete Cosine Transform (DCT) is applied to the accumulated motion images, and the temporal domain features are transformed into the spatial domain. These processes are performed on both RGB images and depth maps separately. DCT coefficients that represent sign gestures are picked up via zigzag scanning, and feature vectors are generated. To recognize sign gestures, a K-Nearest Neighbor classifier with Manhattan distance is used. The performance of the proposed sign language recognition system is evaluated on a sign database that contains 1002 isolated dynamic signs belonging to 111 words of Turkish Sign Language (TSL) in three different categories. The proposed sign language recognition system has promising success rates.
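The processing chain described above (frame differencing, accumulation, 2D DCT, zigzag scanning, nearest-neighbour matching with Manhattan distance) can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: frames are small lists of lists, absolute differences are summed, the DCT is a naive orthonormal DCT-II, and all function names are hypothetical.

```python
import math
from collections import Counter


def accumulate_motion(frames):
    """Sum absolute differences of successive frames into one motion image."""
    rows, cols = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * cols for _ in range(rows)]
    for prev, cur in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                acc[r][c] += abs(cur[r][c] - prev[r][c])
    return acc


def dct2(img):
    """Naive orthonormal 2D DCT-II (fine for tiny images, O(n^4) overall)."""
    n, m = len(img), len(img[0])

    def alpha(k, size):
        return math.sqrt((1.0 if k == 0 else 2.0) / size)

    out = [[0.0] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            s = sum(img[x][y]
                    * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                    * math.cos(math.pi * (2 * y + 1) * v / (2 * m))
                    for x in range(n) for y in range(m))
            out[u][v] = alpha(u, n) * alpha(v, m) * s
    return out


def zigzag(mat, count):
    """First `count` entries in zigzag order (low frequencies first)."""
    n, m = len(mat), len(mat[0])
    order = sorted(((r, c) for r in range(n) for c in range(m)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [mat[r][c] for r, c in order[:count]]


def manhattan(a, b):
    """L1 distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=1):
    """Majority label among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: manhattan(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def gesture_features(frames, count=4):
    """Accumulated motion image -> DCT -> leading zigzag coefficients."""
    return zigzag(dct2(accumulate_motion(frames)), count)
```

In the paper the same steps are applied to RGB frames and depth maps separately; here a single call to `gesture_features` stands in for one such stream.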
Komeili, Majid; Louis, Wael; Armanfard, Narges; Hatzinakos, Dimitrios
Electrocardiogram (ECG) and transient evoked otoacoustic emission (TEOAE) are among the physiological signals that have attracted significant interest in the biometric community due to their inherent robustness to replay and falsification attacks. However, they are time-dependent signals, which makes them hard to deal with in an across-session human recognition scenario where only one session is available for enrollment. This paper presents a novel feature selection method to address this issue. It is based on an auxiliary dataset with multiple sessions, from which it selects a subset of features that are more persistent across different sessions. It uses local information in terms of sample margins while enforcing an across-session measure. This makes it well suited to the aforementioned biometric recognition problem. Comprehensive experiments on ECG and TEOAE variability due to time lapse and body posture are carried out. The performance of the proposed method is compared against seven state-of-the-art feature selection algorithms as well as another six approaches in the area of ECG and TEOAE biometric recognition. Experimental results demonstrate that the proposed method performs noticeably better than other algorithms.
Barros, Pablo; Jirak, Doreen; Weber, Cornelius; Wermter, Stefan
Emotional state recognition has become an important topic for human-robot interaction in the past years. By determining emotion expressions, robots can identify important variables of human behavior and use these to communicate in a more human-like fashion and thereby extend the interaction possibilities. Human emotions are multimodal and spontaneous, which makes them hard for robots to recognize. Each modality has its own restrictions and constraints which, together with the non-structured behavior of spontaneous expressions, create several difficulties for the approaches present in the literature, which are based on several explicit feature extraction techniques and manual modality fusion. Our model uses a hierarchical feature representation to deal with spontaneous emotions, and learns how to integrate multiple modalities for non-verbal emotion recognition, making it suitable for use in an HRI scenario. Our experiments show that a significant improvement of recognition accuracy is achieved when we use hierarchical features and multimodal information, and our model improves the accuracy of state-of-the-art approaches from 82.5% reported in the literature to 91.3% for a benchmark dataset on spontaneous emotion expressions. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Md. Rabiul Islam
The aim of this work is to propose a new feature and score fusion based iris recognition approach in which a voting method is applied within a Multiple Classifier Selection technique. The outputs of four Discrete Hidden Markov Model classifiers, that is, a left iris based unimodal system, a right iris based unimodal system, a left-right iris feature fusion based multimodal system, and a left-right iris likelihood ratio score fusion based multimodal system, are combined using the voting method to achieve the final recognition result. The CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, the recognition accuracy of the proposed system has been compared with the existing N hamming distance score fusion approach proposed by Ma et al., the log-likelihood ratio score fusion approach proposed by Schmid et al., and the single level feature fusion approach proposed by Hollingsworth et al.
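Combining the decisions of four classifiers by voting can be sketched as below. The abstract does not specify the voting rule or how ties are resolved, so this minimal example assumes plain majority voting with ties broken in favour of the earliest classifier in the list; the function name and the identity labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    Ties are broken by list order (an assumption, not from the paper):
    the first label in `predictions` that reaches the top count wins.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical outputs of the four systems for one probe image:
# left iris, right iris, feature fusion, score fusion.
decisions = ["subject_07", "subject_12", "subject_07", "subject_07"]
print(majority_vote(decisions))
```

Ordering the list so that the most trusted classifier comes first gives that classifier the casting vote in a two-against-two split.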
Chao, Tien-Hsin; Stoner, William W.
An optical neural network based on the neocognitron paradigm is introduced. A novel aspect of the architecture design is shift-invariant multichannel Fourier optical correlation within each processing layer. Multilayer processing is achieved by feeding back the output of the feature correlator iteratively to the input spatial light modulator and by updating the Fourier filters. By training the neural net with characteristic features extracted from the target images, successful pattern recognition with intraclass fault tolerance and interclass discrimination is achieved. A detailed system description is provided. Experimental demonstrations of a two-layer neural network for space-object discrimination are also presented.
An optical neural network based upon the Neocognitron paradigm (K. Fukushima et al. 1983) is introduced. A novel aspect of the architectural design is shift-invariant multichannel Fourier optical correlation within each processing layer. Multilayer processing is achieved by iteratively feeding back the output of the feature correlator to the input spatial light modulator and updating the Fourier filters. By training the neural net with characteristic features extracted from the target images, successful pattern recognition with intra-class fault tolerance and inter-class discrimination is achieved. A detailed system description is provided. Experimental demonstration of a two-layer neural network for space objects discrimination is also presented.
Zhila Esna Ashari
Type IV secretion systems (T4SS) are multi-protein complexes in a number of bacterial pathogens that can translocate proteins and DNA to the host. Most T4SSs function in conjugation and translocate DNA; however, approximately 13% function to secrete proteins, delivering effector proteins into the cytosol of eukaryotic host cells. Upon entry, these effectors manipulate the host cell's machinery for their own benefit, which can result in serious illness or death of the host. For this reason recognition of T4SS effectors has become an important subject. Much previous work has focused on verifying effectors experimentally, a costly endeavor in terms of money, time, and effort. Having good predictions for effectors will help to focus experimental validations and decrease testing costs. In recent years, several scoring and machine learning-based methods have been suggested for the purpose of predicting T4SS effector proteins. These methods have used different sets of features for prediction, and their predictions have been inconsistent. In this paper, an optimal set of features is presented for predicting T4SS effector proteins using a statistical approach. A thorough literature search was performed to find features that have been proposed. Feature values were calculated for datasets of known effectors and non-effectors for T4SS-containing pathogens for four genera with a sufficient number of known effectors, Legionella pneumophila, Coxiella burnetii, Brucella spp, and Bartonella spp. The features were ranked, and less important features were filtered out. Correlations between remaining features were removed, and dimensional reduction was accomplished using principal component analysis and factor analysis. Finally, the optimal features for each pathogen were chosen by building logistic regression models and evaluating each model. The results based on evaluation of our logistic regression models confirm the effectiveness of our four optimal sets of
Stentiford, F W
An automatic evolutionary search is applied to the problem of feature extraction in an OCR application. A performance measure based on feature independence is used to generate features which do not appear to suffer from peaking effects. Features are extracted from a training set of 30 600 machine-printed, 34-class alphanumeric characters derived from British mail. Classification results on the training set and a test set of 10 200 characters are reported for an increasing number of features. A 1.01 percent forced decision error rate is obtained on the test data using 316 features. The hardware implementation should be cheap and fast to operate. The performance compares favorably with current low cost OCR page readers.
Kee Moe Han
Music is the combination of melody, linguistic information and the vocalist's emotion. Since music is a work of art, analyzing emotion in music by computer is a difficult task. Many approaches have been developed to detect the emotions included in music, but the results are not satisfactory because emotion is very complex. In this paper, evaluations of audio features from music files are presented. The extracted features are used to classify the different emotion classes of the vocalists. Musical feature extraction is done using the Music Information Retrieval (MIR) toolbox. A database of 100 music clips is used to classify the emotions perceived in music clips. Music may contain many emotions according to the vocalist's mood, such as happy, sad, nervous, bored, peace, etc. In this paper, the audio features related to the emotions of the vocalists are extracted for use in an emotion recognition system based on music.
Andrews, Timothy J; Baseler, Heidi; Jenkins, Rob; Burton, A Mike; Young, Andrew W
A full understanding of face recognition will involve identifying the visual information that is used to discriminate different identities and how this is represented in the brain. The aim of this study was to explore the importance of shape and surface properties in the recognition and neural representation of familiar faces. We used image morphing techniques to generate hybrid faces that mixed shape properties (more specifically, second order spatial configural information as defined by feature positions in the 2D-image) from one identity and surface properties from a different identity. Behavioural responses showed that recognition and matching of these hybrid faces was primarily based on their surface properties. These behavioural findings contrasted with neural responses recorded using a block design fMRI adaptation paradigm to test the sensitivity of Haxby et al.'s (2000) core face-selective regions in the human brain to the shape or surface properties of the face. The fusiform face area (FFA) and occipital face area (OFA) showed a lower response (adaptation) to repeated images of the same face (same shape, same surface) compared to different faces (different shapes, different surfaces). From the behavioural data indicating the critical contribution of surface properties to the recognition of identity, we predicted that brain regions responsible for familiar face recognition should continue to adapt to faces that vary in shape but not surface properties, but show a release from adaptation to faces that vary in surface properties but not shape. However, we found that the FFA and OFA showed an equivalent release from adaptation to changes in both shape and surface properties. The dissociation between the neural and perceptual responses suggests that, although they may play a role in the process, these core face regions are not solely responsible for the recognition of facial identity. Copyright © 2016 Elsevier Ltd. All rights reserved.
The interval valued neutrosophic soft set, introduced by Irfan Deli in 2014, is a generalization of the neutrosophic set introduced by F. Smarandache in 1995, and can be used in real scientific and engineering applications. In this paper the Hamming and Euclidean distances between two interval valued neutrosophic soft sets (IVNS sets) are defined, and similarity measures based on distances between two interval valued neutrosophic soft sets are proposed. A similarity measure based on a set theoretic approach is also proposed. Some basic properties of similarity measures between two interval valued neutrosophic soft sets are also studied. A decision making method is established for the interval valued neutrosophic soft set setting using similarity measures between IVNS sets. Finally, an example is given to demonstrate the possible application of similarity measures in pattern recognition problems.
We present dynamic interval-valued intuitionistic fuzzy sets (DIVIFS), which can improve the recognition accuracy when they are applied to pattern recognition. By analyzing the degree of hesitancy, we propose some DIVIFS models from intuitionistic fuzzy sets (IFS) and interval-valued IFS (IVIFS). We then present a novel ranking condition on the distance of IFS and IVIFS and introduce some distance measures of DIVIFS satisfying the ranking condition. Finally, a pattern recognition example applied to medical diagnosis decision making is given to demonstrate the application of DIVIFS and its distances. The simulation results show that the DIVIFS method is more comprehensive and flexible than the IFS method and the IVIFS method.
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias.
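For readers unfamiliar with wavelet threshold denoising, the following is a minimal sketch of the standard technique only. It assumes a single-level Haar transform and classic soft thresholding; the paper's improved threshold method is not described in the abstract and is not reproduced here. The function names and the sample values are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def haar_forward(signal):
    """Single-level Haar DWT of an even-length signal: (approx, detail)."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient towards zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, t):
    """Threshold only the detail band, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))


# Hypothetical noisy samples (even length is required by this sketch).
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 1.0, 0.8]
print(denoise(noisy, t=0.3))
```

With `t=0` the signal is reconstructed unchanged; a large `t` replaces each pair of samples by its mean, which shows the trade-off between noise removal and loss of detail.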
LiDAR technology can provide very detailed and highly accurate geospatial information on an urban scene for the creation of Virtual Geographic Environments (VGEs) for different applications. However, automatic 3D modeling and feature recognition from LiDAR point clouds are very complex tasks. This becomes even more complex when the data is incomplete (occlusion problem) or uncertain. In this paper, we propose to build a knowledge base comprising an ontology and semantic rules, aiming at automatic feature recognition from point clouds in support of 3D modeling. First, several modules for the ontology are defined from different perspectives to describe an urban scene. For instance, the spatial relations module allows the formalized representation of possible topological relations extracted from point clouds. Then, a knowledge base is proposed that contains different concepts, their properties and their relations, together with constraints and semantic rules. Next, instances and their specific relations form an urban scene and are added to the knowledge base as facts. Based on the knowledge and semantic rules, a reasoning process is carried out to extract semantic features of the objects and their components in the urban scene. Finally, several experiments are presented to show the validity of our approach to recognize different semantic features of buildings from LiDAR point clouds.
Sormaz, Mladen; Young, Andrew W; Andrews, Timothy J
Theoretical accounts of face processing often emphasise feature shapes as the primary visual cue to the recognition of facial expressions. However, changes in facial expression also affect the surface properties of the face. In this study, we investigated whether this surface information can also be used in the recognition of facial expression. First, participants identified facial expressions (fear, anger, disgust, sadness, happiness) from images that were manipulated such that they varied mainly in shape or mainly in surface properties. We found that the categorization of facial expression is possible in either type of image, but that different expressions are relatively dependent on surface or shape properties. Next, we investigated the relative contributions of shape and surface information to the categorization of facial expressions. This employed a complementary method that involved combining the surface properties of one expression with the shape properties from a different expression. Our results showed that the categorization of facial expressions in these hybrid images was equally dependent on the surface and shape properties of the image. Together, these findings provide a direct demonstration that both feature shape and surface information make significant contributions to the recognition of facial expressions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Jin Jing; Wei Biao; Feng Peng; Tang Yuelin; Zhou Mi
Based on the interdependent relationship between fission neutrons (²⁵²Cf) and the fission chain (²³⁵U system), the paper presents time-frequency feature analysis and recognition of the fission neutron signal based on a support vector machine (SVM), through analysis of the signal characteristics and the measuring principle of the ²⁵²Cf fission neutron signal. The time-frequency characteristics and energy features of the fission neutron signal are extracted using wavelet decomposition and de-noising wavelet packet decomposition, and then applied to training and classification by means of a support vector machine based on statistical learning theory. The results show that it is effective to obtain features of the nuclear signal via wavelet decomposition and de-noising wavelet packet decomposition, and that the latter better reflects the internal characteristics of the fission neutron system. With the training accomplished, the SVM classifier achieves an accuracy rate above 70%, overcoming the lack of training samples and verifying the effectiveness of the algorithm.
The automatic analysis of speech to detect affective states may improve the way users interact with electronic devices. However, analysis at the acoustic level alone may not be enough to determine the emotion of a user in a realistic scenario. In this paper we analyzed the spontaneous speech recordings of the FAU Aibo Corpus at the acoustic and linguistic levels to extract two sets of features. The acoustic set was reduced by a greedy procedure selecting the most relevant features to optimize the learning stage. We compared two versions of this greedy selection algorithm by performing the search for the relevant features forwards and backwards. We experimented with three classification approaches: Naïve-Bayes, a support vector machine and a logistic model tree, and two fusion schemes: decision-level fusion, merging the hard decisions of the acoustic and linguistic classifiers by means of a decision tree; and feature-level fusion, concatenating both sets of features before the learning stage. Despite the low performance achieved by the linguistic data, a dramatic improvement was achieved after its combination with the acoustic information, improving the results achieved by this second modality on its own. The results achieved by the classifiers using the parameters merged at feature level outperformed the classification results of the decision-level fusion scheme, despite the simplicity of the scheme. Moreover, the extremely reduced set of acoustic features obtained by the greedy forward search selection algorithm improved the results provided by the full set.
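The forward variant of the greedy search can be sketched as follows. This is a generic illustration, not the authors' procedure: the stopping rule (stop when no candidate improves the score), the function name `forward_select`, the feature names and the toy scoring function are all assumptions. In practice `score` would be a cross-validated classifier accuracy.

```python
def forward_select(features, score, max_features=None):
    """Greedy forward search: repeatedly add the feature that raises
    score(subset) the most, and stop when nothing improves it."""
    selected, remaining = [], list(features)
    best = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        cand_score, cand = max((score(selected + [f]), f) for f in remaining)
        if cand_score <= best:
            break  # no remaining feature helps
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected, best


# Toy stand-in for a real evaluation: two useful features, others cost a little.
_gain = {"pitch_mean": 0.30, "energy_std": 0.20}


def toy_score(subset):
    return sum(_gain.get(name, -0.05) for name in subset)


chosen, value = forward_select(["pitch_mean", "jitter", "energy_std"], toy_score)
print(chosen, value)
```

The backward variant mentioned in the abstract would start from the full set and remove one feature at a time under the same kind of criterion.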
Anzures, Gizelle; Kelly, David J.; Pascalis, Olivier; Quinn, Paul C.; Slater, Alan M.; de Viviés, Xavier; Lee, Kang
We used a matching-to-sample task and manipulated facial pose and feature composition to examine the other-race effect (ORE) in face identity recognition between 5 and 10 years of age. Overall, the present findings provide a genuine measure of own- and other-race face identity recognition in children that is independent of photographic and image…
To improve the performance of phoneme based Automatic Speech Recognition (ASR) in noisy environments, we developed a new technique that could add robustness to clean phoneme features. These robust features are obtained from Complex Wavelet Packet Transform (CWPT) coefficients. Since the CWPT coefficients represent all the different frequency bands of the input signal, decomposing the input signal into a complete CWPT tree would also cover all frequencies involved in the recognition process. For time overlapping signals with different frequency contents, e.g. a phoneme signal with noise, the CWPT coefficients are the combination of the CWPT coefficients of the phoneme signal and the CWPT coefficients of the noise. The CWPT coefficients of the phoneme signal would be changed according to the frequency components contained in the noise. Since the number of phonemes in every language is relatively small (limited) and already well known, one could easily derive principal component vectors from a clean training dataset using Principal Component Analysis (PCA). These principal component vectors could then be used to add robustness and minimize noise effects in the testing phase. Simulation results, using Alpha Numeric 4 (AN4) from Carnegie Mellon University and NOISEX-92 examples from Rice University, showed that this new technique could be used as a feature extractor that improves the robustness of phoneme based ASR systems in various adverse noisy conditions and still preserves the performance in clean environments.
Chen, Yameng; Sun, Gengxin; Lei, Yiming; Zhang, Jinpeng
Liver disease is one of the main causes of human health problems. Cirrhosis is a critical phase in the development of liver lesions, especially hepatoma. Many clinical cases are still influenced to some degree by the subjectivity of physicians, and objective factors such as illumination, scale and edge blurring also affect the judgment of clinicians. This subjectivity in turn affects the accuracy of diagnosis and the treatment of patients. To address these difficulties and improve the recognition rate of liver cirrhosis, we propose a multi-feature fusion method to obtain more robust representations of texture in ultrasound liver images. The texture features we extract include local binary pattern (LBP), gray level co-occurrence matrix (GLCM) and histogram of oriented gradient (HOG). In this paper, we first fuse the multiple features, based on the parallel combination concept, to distinguish cirrhosis from normal liver. The experimental results show that the classifier is effective for cirrhosis recognition, as evaluated by the classification rate, the sensitivity and specificity of the receiver operating characteristic (ROC), and the time cost. The proposed method should help to improve the accuracy of cirrhosis diagnosis and to prevent the development of liver lesions towards hepatoma.
Finger knuckle print is considered one of the emerging hand biometric traits due to its potential for the identification of individuals. This paper contributes a new method for personal recognition using finger knuckle print based on two approaches, namely geometric and texture analyses. In the first approach, the shape oriented features of the finger knuckle print are extracted by means of angular geometric analysis and then integrated to achieve a better precision rate. The knuckle texture feature analysis, in turn, is carried out by means of a multi-resolution transform known as the Curvelet transform. The Curvelet transform has the ability to approximate curved singularities with a minimum number of Curvelet coefficients. Since finger knuckle patterns mainly consist of lines and curves, the Curvelet transform is highly suitable for their representation. Further, the Curvelet transform decomposes the finger knuckle image into Curvelet sub-bands, which are termed ‘Curvelet knuckle’. Finally, principal component analysis is applied to each Curvelet knuckle to extract its feature vector through the covariance matrix derived from its Curvelet coefficients. Extensive experiments were conducted using the PolyU database and the IIT finger knuckle database. The experimental results confirm that our proposed method shows a high recognition rate of 98.72% with a lower false acceptance rate of 0.06%.
Abbas, Qaisar; Fondon, Irene; Sarmiento, Auxiliadora; Jiménez, Soledad; Alemany, Pedro
Diabetic retinopathy (DR) is a leading cause of blindness among diabetic patients. Ophthalmologists need to recognize the severity level in order to detect and diagnose DR early. However, this is a challenging task for both medical experts and computer-aided diagnosis systems because it requires extensive domain expert knowledge. In this article, a novel automatic recognition system for the five severity levels of diabetic retinopathy (SLDR) is developed without performing any pre- and post-processing steps on retinal fundus images, through learning of deep visual features (DVFs). These DVFs are extracted from each image by using color dense in scale-invariant and gradient location-orientation histogram techniques. To learn these DVFs, a semi-supervised multilayer deep-learning algorithm is utilized along with a new compressed layer and fine-tuning steps. This SLDR system was evaluated and compared with state-of-the-art techniques using the measures of sensitivity (SE), specificity (SP) and area under the receiver operating characteristic curve (AUC). On 750 fundus images (150 per category), an SE of 92.18%, an SP of 94.50% and an AUC of 0.924 were obtained on average. These results demonstrate that the SLDR system is appropriate for early detection of DR and provide an effective treatment for prediction type of diabetes.
Anthimopoulos, Marios M; Gianola, Lauro; Scarnato, Luca; Diem, Peter; Mougiakakou, Stavroula G
Computer vision-based food recognition could be used to estimate a meal's carbohydrate content for diabetic patients. This study proposes a methodology for automatic food recognition, based on the bag-of-features (BoF) model. An extensive technical investigation was conducted for the identification and optimization of the best performing components involved in the BoF architecture, as well as the estimation of the corresponding parameters. For the design and evaluation of the prototype system, a visual dataset with nearly 5000 food images was created and organized into 11 classes. The optimized system computes dense local features, using the scale-invariant feature transform on the HSV color space, builds a visual dictionary of 10000 visual words by using the hierarchical k-means clustering and finally classifies the food images with a linear support vector machine classifier. The system achieved classification accuracy of the order of 78%, thus proving the feasibility of the proposed approach in a very challenging image dataset.
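The encoding step of a bag-of-features pipeline can be sketched as below: each local descriptor is assigned to its nearest visual word and the image is represented by the normalised word histogram. This is a toy illustration with hypothetical names and a two-word dictionary; the paper's descriptor extraction (SIFT on HSV), hierarchical k-means dictionary of 10000 words and linear SVM are not reproduced here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, dictionary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(dictionary)),
               key=lambda i: squared_distance(descriptor, dictionary[i]))


def bof_histogram(descriptors, dictionary):
    """Normalised histogram of visual-word assignments for one image."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_word(d, dictionary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(dictionary)


# Hypothetical 2-D descriptors and a two-word dictionary.
words = [[0.0, 0.0], [10.0, 10.0]]
image_descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [0.5, 0.5]]
print(bof_histogram(image_descriptors, words))
```

The resulting fixed-length histogram is what a classifier such as a linear SVM would take as input, regardless of how many local descriptors the image produced.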
Kevin J.Y. Lam
Embodied theories of language postulate that language meaning is stored in modality-specific brain areas generally involved in perception and action in the real world. However, the temporal dynamics of the interaction between modality-specific information and lexical-semantic processing remain unclear. We investigated the relative timing at which two types of modality-specific information (action-based and visual-form information) contribute to lexical-semantic comprehension. To this end, we applied a behavioral priming paradigm in which prime and target words were related with respect to (1) action features, (2) visual features, or (3) semantically associative information. Using a Go/No-Go lexical decision task, priming effects were measured across four different inter-stimulus intervals (ISI = 100 ms, 250 ms, 400 ms, and 1,000 ms) to determine the relative time course of the different features. Notably, action priming effects were found in ISIs of 100 ms, 250 ms, and 1,000 ms, whereas a visual priming effect was seen only in the ISI of 1,000 ms. Importantly, our data suggest that features follow different time courses of activation during word recognition. In this regard, feature activation is dynamic, measurable in specific time windows but not in others. Thus the current study (1) demonstrates how multiple ISIs can be used within an experiment to help chart the time course of feature activation and (2) provides new evidence for embodied theories of language.
Li, Frédéric; Shirahama, Kimiaki; Nisar, Muhammad Adeel; Köping, Lukas; Grzegorzek, Marcin
Getting a good feature representation of data is paramount for Human Activity Recognition (HAR) using wearable sensors. An increasing number of feature learning approaches, in particular deep-learning based ones, have been proposed to extract an effective feature representation by analyzing large amounts of data. However, getting an objective interpretation of their performances faces two problems: the lack of a baseline evaluation setup, which makes a strict comparison between them impossible, and the insufficiency of implementation details, which can hinder their use. In this paper, we attempt to address both issues: we first propose an evaluation framework allowing a rigorous comparison of features extracted by different methods, and use it to carry out extensive experiments with state-of-the-art feature learning approaches. We then provide all the code and implementation details to make both the reproduction of the results reported in this paper and the re-use of our framework easier for other researchers. Our studies carried out on the OPPORTUNITY and UniMiB-SHAR datasets highlight the effectiveness of hybrid deep-learning architectures involving convolutional and Long-Short-Term-Memory (LSTM) layers to obtain features characterising both short- and long-term time dependencies in the data.
Creating and maintaining accurate bindings of elementary features (e.g., color and shape) in visual short-term memory (VSTM) is fundamental for veridical perception. How are low-level features bound in memory? The present work harnessed a multivariate model of perception - the General Recognition Theory (GRT) - to unravel the internal representations underlying feature binding in VSTM. On each trial, preview and target colored shapes were presented in succession, appearing in either repeated or altered spatial locations. Participants gave two same/different responses: one with respect to color and one with respect to shape. Converging GRT analyses on the accuracy confusion matrices provided substantial evidence for binding in the form of violations of perceptual independence at the level of the individual stimulus, such that positive correlations were obtained when both features repeated or alternated together, while negative correlations were obtained when one feature repeated and the other alternated. This "cloverleaf" GRT pattern of binding was similar whether the spatial location of the preview and target repeated or altered. The current results are consistent with: (a) the discrete memory "slots" model of VSTM, and (b) the notion that spatial location is not necessary for the formation of "object files." The GRT approach presented here offers a viable quantitative model for testing various questions regarding feature binding in VSTM.
Emotion recognition from speech may play a crucial role in many applications related to human–computer interaction or understanding the affective state of users in certain tasks, where other modalities such as video or physiological parameters are unavailable. In general, a human's emotions may be recognized using several modalities, such as facial expressions, speech, and physiological parameters (e.g., electroencephalograms, electrocardiograms, etc.). However, measuring these modalities may be difficult, obtrusive, or require expensive hardware. In that context, speech may be the best alternative modality in many practical applications. In this work we present an approach that uses a Convolutional Neural Network (CNN) functioning as a visual feature extractor and trained using raw speech information. In contrast to traditional machine learning approaches, CNNs are responsible for identifying the important features of the input, thus making hand-crafted feature engineering optional in many tasks. In this paper no extra features are required other than the spectrogram representations, and hand-crafted features were only extracted to validate our method. Moreover, the approach does not require any linguistic model and is not specific to any particular language. We compare the proposed approach using cross-language datasets and demonstrate that it is able to provide superior results vs. traditional ones that use hand-crafted features.
Wei-Jong Yang; Wei-Hau Du; Pau-Choo Chang; Jar-Ferr Yang; Pi-Hsia Hung
Demand for smart visual thing recognition in various devices has increased rapidly in recent years for daily smart production, living, and learning systems. This paper proposes a visual thing recognition system that combines the binary scale-invariant feature transform (SIFT), the bag of words model (BoW), and a support vector machine (SVM) by using color information. Since the traditional SIFT features and SVM classifiers only use the gray information, color information is still an importan...
Chen, Ying; Liu, Yuanning; Zhu, Xiaodong; Chen, Huiling; He, Fei; Pang, Yutong
For building a new iris template, this paper proposes a strategy to fuse different portions of iris based on machine learning method to evaluate local quality of iris. There are three novelties compared to previous work. Firstly, the normalized segmented iris is divided into multitracks and then each track is estimated individually to analyze the recognition accuracy rate (RAR). Secondly, six local quality evaluation parameters are adopted to analyze texture information of each track. Besides, particle swarm optimization (PSO) is employed to get the weights of these evaluation parameters and corresponding weighted coefficients of different tracks. Finally, all tracks' information is fused according to the weights of different tracks. The experimental results based on subsets of three public and one private iris image databases demonstrate three contributions of this paper. (1) Our experimental results prove that partial iris image cannot completely replace the entire iris image for iris recognition system in several ways. (2) The proposed quality evaluation algorithm is a self-adaptive algorithm, and it can automatically optimize the parameters according to iris image samples' own characteristics. (3) Our feature information fusion strategy can effectively improve the performance of iris recognition system.
Oliver, Lindsay D; Virani, Karim; Finger, Elizabeth C; Mitchell, Derek G V
Frontotemporal dementia (FTD) is a debilitating neurodegenerative disorder characterized by severely impaired social and emotional behaviour, including emotion recognition deficits. Though fear recognition impairments seen in particular neurological and developmental disorders can be ameliorated by reallocating attention to critical facial features, the possibility that similar benefits can be conferred to patients with FTD has yet to be explored. In the current study, we examined the impact of presenting distinct regions of the face (whole face, eyes-only, and eyes-removed) on the ability to recognize expressions of anger, fear, disgust, and happiness in 24 patients with FTD and 24 healthy controls. A recognition deficit was demonstrated across emotions by patients with FTD relative to controls. Crucially, removal of diagnostic facial features resulted in an appropriate decline in performance for both groups; furthermore, patients with FTD demonstrated a lack of disproportionate improvement in emotion recognition accuracy as a result of isolating critical facial features relative to controls. Thus, unlike some neurological and developmental disorders featuring amygdala dysfunction, the emotion recognition deficit observed in FTD is not likely driven by selective inattention to critical facial features. Patients with FTD also mislabelled negative facial expressions as happy more often than controls, providing further evidence for abnormalities in the representation of positive affect in FTD. This work suggests that the emotional expression recognition deficit associated with FTD is unlikely to be rectified by adjusting selective attention to diagnostic features, as has proven useful in other select disorders. Copyright © 2014 Elsevier Ltd. All rights reserved.
Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L
In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called 'shadow features' are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one uses a wearable sensor and the other a Kinect-based remote sensor. Our experiments demonstrate the advantages of the new method, which will have an impact on human activity detection research.
Omara, Ibrahim; Xiao, Gang; Amrani, Moussa; Yan, Zifei; Zuo, Wangmeng
Recently, multimodal biometric systems have received considerable research interest in many applications, especially in the field of security. Multimodal systems can increase the resistance to spoof attacks, provide more details and flexibility, and lead to better performance and a lower error rate. In this paper, we present a multimodal biometric system based on face and ear, and propose how to exploit the deep features extracted by Convolutional Neural Networks (CNNs) from the face and ear images to introduce more powerful discriminative features and robust representation ability for them. First, the deep features for face and ear images are extracted based on VGG-M Net. Second, the extracted deep features are fused by using a traditional concatenation and a Discriminant Correlation Analysis (DCA) algorithm. Third, a multiclass support vector machine is adopted for matching and classification. The experimental results show that the proposed multimodal system based on deep features is efficient and achieves a promising recognition rate of up to 100% by using face and ear. In addition, the results indicate that the fusion based on DCA is superior to traditional fusion.
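The "traditional concatenation" fusion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the L2 normalization step, and the toy feature vectors are all assumptions made for illustration, and DCA-based fusion is not shown.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return list(vec) if norm == 0 else [v / norm for v in vec]


def fuse_by_concatenation(face_features, ear_features):
    """Serial fusion: normalize each modality separately, then join the vectors.

    Normalizing first is an assumption of this sketch; it keeps one modality
    from dominating the fused vector purely because of its scale.
    """
    return l2_normalize(face_features) + l2_normalize(ear_features)


# Hypothetical 3-D face and 2-D ear descriptors, for illustration only.
fused = fuse_by_concatenation([3.0, 0.0, 4.0], [0.0, 2.0])
```

The fused vector has the combined dimensionality of both inputs and could then be passed to any multiclass classifier, such as the support vector machine named in the abstract.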
Li, Qin; Wang, Hua Jing; You, Jane; Li, Zhao Ming; Li, Jin Xue
In some large-scale face recognition tasks, such as driver license identification and law enforcement, the training set only contains one image per person. This situation is referred to as the one sample problem. Because many face recognition techniques implicitly assume that several (at least two) images per person are available for training, they cannot deal with the one sample problem. This paper investigates principal component analysis (PCA), Fisher linear discriminant analysis (LDA), and locality preserving projections (LPP) and shows why they cannot perform well in the one sample problem. After that, this paper presents four reasons that make the one sample problem itself difficult: the small sample size problem; the lack of representative samples; the underestimated intra-class variation; and the overestimated inter-class variation. Based on the analysis, this paper proposes to enlarge the training set based on the inter-class relationship. This paper also extends LDA and LPP to extract features from the enlarged training set. The experimental results show the effectiveness of the proposed method.
Hortos, William S.
Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at
Reşit Kavsaoğlu, A; Polat, Kemal; Recep Bozkurt, M
This study describes the application of the photoplethysmography (PPG) signal and the time-domain features acquired from its first and second derivatives for biometric identification. For this purpose, a total of 40 features has been extracted and a feature-ranking algorithm is proposed. The proposed algorithm calculates the contribution of each feature to biometric recognition and orders the features from the greatest contribution to the smallest. While identifying the contribution of the features, the Euclidean distance and absolute distance formulas are used. The efficiency of the proposed algorithm is demonstrated by the results of applying the k-NN (k-nearest neighbor) classifier to the features. During application, 15-period PPG signals from two different recording sessions for each of the thirty healthy subjects were acquired with a PPG data acquisition card. The first PPG signals recorded from the subjects were evaluated as the 1st configuration, the PPG signals recorded later at a different time as the 2nd configuration, and the combination of both as the 3rd configuration. When the results were evaluated for the k-NN classifier model created along with the proposed algorithm, identification rates of 90.44% for the 1st configuration, 94.44% for the 2nd configuration, and 87.22% for the 3rd configuration were attained. The obtained results showed that both the proposed algorithm and the biometric identification model based on the PPG signal are very promising for contactless recognition of people. Copyright © 2014 Elsevier Ltd. All rights reserved.
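To make the role of the two distance formulas concrete, the following is a minimal sketch of a k-NN classifier that can use either the Euclidean or the absolute (Manhattan) distance. It is not the feature-ranking algorithm of the paper; the function names and the toy training data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Absolute (city-block) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=1, distance=euclidean):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_vector, subject_label) pairs.
    """
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature samples for two subjects, for illustration only.
train = [([0.0, 0.0], "A"), ([0.1, 0.2], "A"), ([1.0, 1.0], "B"), ([0.9, 1.1], "B")]
label_euclidean = knn_predict(train, [0.05, 0.1], k=3)
label_manhattan = knn_predict(train, [0.95, 1.0], k=1, distance=manhattan)
```

Passing the distance as a parameter makes it easy to compare how each metric affects identification on the same feature subset.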
diverse. In ants, social interactions are regulated by at least three levels of recognition. Nestmate recognition occurs between colonies, is very effective, and involves fast processing. Within a colony, division of labor is enhanced by recognition of different classes of individuals. Ultimately......, in particular circumstances, such as cooperative colony founding with stable dominance hierarchies, ants are capable of individual recognition. The underlying recognition cues and mechanisms appear to be specific to each recognition level, and their integrated understanding could contribute...
Face detection and recognition is the first step for many applications in various fields, such as identification, access control for electronic devices, video surveillance, human-computer interfaces, and image database management. This paper focuses on feature extraction from an image using a Gabor filter; the extracted image feature vector is then given as input to a neural network. The neural network is trained with the input data. The Gabor wavelet concentrates on the important components of the face, including the eyes, mouth, nose, and cheeks. The main requirement of this technique is the threshold, which gives privileged sensitivity. The threshold values are the feature vectors taken from the faces. These feature vectors are fed into a feed-forward neural network to train the network. Using the feed-forward neural network as a classifier, faces are classified as recognized or unrecognized. This classifier attains a higher face detection rate. As more input vectors are used for training, the system proves to be effective. The effectiveness of the proposed method is demonstrated by the experimental results.
Reyes-Galaviz, Orion Fausto; Reyes-García, Carlos Alberto
Data compression is always advisable when it comes to handling and processing information quickly and efficiently. Two main problems need to be solved when handling data: storing information in smaller spaces and processing it in the shortest possible time. In infant cry analysis (ICA), there is always the need to construct large sound repositories from crying babies. These samples have to be analyzed and used to train and test pattern recognition algorithms, which is a time-consuming task when working with uncompressed feature vectors. In this work, we show a simple but efficient method that uses the Fuzzy Relational Product (FRP) to compress the information inside a feature vector, building a compressed matrix that helps us recognize two kinds of pathologies in infants: asphyxia and deafness. We describe the sound analysis, which consists of the extraction of Mel Frequency Cepstral Coefficients that generate vectors which are later compressed using FRP. There is also a description of the infant cry database used in this work, along with the training and testing of a Time Delay Neural Network with the compressed features, which shows a performance of 96.44% with our proposed feature vector compression.
Yuan, Xue; Hao, Xiaoli; Chen, Houjin; Wei, Xueye
A new context-aware scale-invariant feature transform (CASIFT) approach is proposed, which is designed for use in traffic sign recognition (TSR) systems. The following issues remain in previous works in which SIFT is used for matching or recognition: (1) SIFT is unable to provide color information; (2) SIFT only focuses on local features while ignoring the distribution of global shapes; (3) selecting the template with the maximum number of matching points as the final result is unstable, especially for images with simple patterns; and (4) SIFT is liable to result in errors when different images share the same local features. In order to resolve these problems, a new CASIFT approach is proposed. The contributions of the work are as follows: (1) color angular patterns are used to provide the color distinguishing information; (2) a CASIFT which effectively combines local and global information is proposed; and (3) a method for computing the similarity between two images is proposed, which focuses on the distribution of the matching points, rather than using the traditional SIFT approach of selecting the template with the maximum number of matching points as the final result. The proposed approach is particularly effective in dealing with traffic signs which have rich colors and varied global shape distribution. Experiments are performed to validate the effectiveness of the proposed approach in TSR systems, and the experimental results are satisfying even for images containing traffic signs that have been rotated, damaged, altered in color, or have undergone affine transformations, and for images photographed under different weather or illumination conditions.
A logo is a graphical symbol that serves as the identity of an organization, institution, or company. Logos are generally used to make the existence of an organization, institution, or company known to the public. Through its logo, the existence of an agency can be seen by the public. Feature recognition is one of the processes within an augmented reality system. One use of augmented reality is to recognize the identity of a logo through a camera. The first step in feature recognition is corner detection. Combining several methods, such as FAST, SURF, and FLANN TREE, for corner-based feature detection through to feature matching gives a better ability to detect the presence of a logo. Additionally, several issues arise when running the feature extraction process, such as scale invariance and rotation invariance. In this study, logos are the research objects for the feature recognition process. The FAST, SURF, and FLANN TREE methods are used to detect logos under scale-invariant and rotation-invariant feature conditions. The results of this study demonstrate the accuracy of the FAST, SURF, and FLANN TREE methods in solving the scale-invariant and rotation-invariant feature problems.
Rahulkar, Amol D
This book provides new results in wavelet filter bank based feature extraction and classifier design in the field of iris image recognition. It provides a broad treatment of the design of separable and non-separable wavelet filter banks and of the classifier. The design techniques presented in the book are applied to iris image analysis for person authentication. This book also brings together three strands of research (wavelets, iris image analysis, and classifiers). It compares the performance of the presented techniques with state-of-the-art available schemes. This book contains a compilation of basic material on the design of wavelets, which avoids the need to read many different books. Therefore, it provides an easier path for newcomers and researchers to master the contents. In addition, the designed filter banks and classifier can also be used more effectively than existing filter banks in many signal processing applications like pattern classification, data compression, watermarking, denoising etc. that will...
Dore, Kelly L.; Brooks, Lee R.; Weaver, Bruce; Norman, Geoffrey R.
Medical diagnosis can be viewed as a categorization task. There are two mechanisms whereby humans make categorical judgments: "analytical reasoning," based on explicit consideration of features and "nonanalytical reasoning," an unconscious holistic process of matching against prior exemplars. However, there is evidence that prior experience can…
Hamm, Lisa M; Yeoman, Janice P; Anstice, Nicola; Dakin, Steven C
When measuring recognition acuity in a research setting, the most widely used symbols are the Early Treatment of Diabetic Retinopathy Study (ETDRS) set of 10 Sloan letters. However, the symbols are not appropriate for patients unfamiliar with letters, and acuity for individual letters is variable. Alternative pictogram sets are available, but are generally comprised of fewer items. We set out to develop an open-access set of 10 pictograms that would elicit more consistent estimates of acuity across items than the ETDRS letters from visually normal adults. We measured monocular acuity for individual uncrowded optotypes within a newly designed set (The Auckland Optotype [TAO]), the ETDRS set, and Landolt Cs. Eleven visually normal adults were assessed on regular and vanishing formats of each set. Inter-optotype reliability and ability to detect subtle differences between participants were assessed using intraclass correlations (ICC) and fractional rank precision (FRP). The TAO vanishing set showed the strongest performance (ICC = 0.97, FRP = 0.90), followed by the other vanishing sets (Sloan ICC = 0.88, FRP = 0.74; Landolt ICC = 0.86, FRP = 0.80). Within the regular format, TAO again outperformed the existing sets (TAO ICC = 0.77, FRP = 0.75; Sloan ICC = 0.65, FRP = 0.64; Landolt ICC = 0.48, FRP = 0.63). For adults with normal visual acuity, the new optotypes (in both regular and vanishing formats) are more equally legible and sensitive to subtle individual differences than their Sloan counterparts. As this set does not require observers to be able to name Roman letters, and is freely available to use and modify, it may have wide application for measurement of acuity.
Christiansen, Thomas Ulrich; Greenberg, Steven
The perceptual basis of consonant recognition was experimentally investigated through a study of how information associated with phonetic features (Voicing, Manner, and Place of Articulation) combines across the acoustic-frequency spectrum. The speech signals, 11 Danish consonants embedded...... in Consonant + Vowel + Liquid syllables, were partitioned into 3/4-octave bands (“slits”) centered at 750 Hz, 1500 Hz, and 3000 Hz, and presented individually and in two- or three-slit combinations. The amount of information transmitted (IT) was calculated from consonant- confusion matrices for each feature...... the bands are essentially independent in terms of decoding this feature. Because consonant recognition and Place decoding are highly correlated (correlation coefficient r2 = 0.99), these results imply that the auditory processes underlying consonant recognition are not strictly linear. This may account...
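As an illustration of how information transmitted (IT) can be computed from a confusion matrix, the sketch below treats IT as the mutual information, in bits, between presented and reported categories. The abstract above does not state its exact formula, so this reading is an assumption, and the function name and toy matrices are hypothetical.

```python
import math


def transmitted_information(confusion):
    """Mutual information in bits between stimulus and response.

    confusion[i][j] is the number of trials on which stimulus i
    was reported as response j.
    """
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    bits = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:
                # p(i, j) * log2( p(i, j) / (p(i) * p(j)) ), written with raw counts.
                ratio = count * total / (row_sums[i] * col_sums[j])
                bits += (count / total) * math.log2(ratio)
    return bits


# Two toy 2x2 matrices: perfect identification and pure guessing.
perfect = transmitted_information([[10, 0], [0, 10]])
chance = transmitted_information([[5, 5], [5, 5]])
```

With two equally likely categories, perfect identification transmits 1 bit and pure guessing transmits 0 bits, which bounds the values a binary feature such as Voicing could take under this definition.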
Kakoty, Nayan M; Hazarika, Shyamanta M
With the advancement in machine learning and signal processing techniques, electromyogram (EMG) signals have increasingly gained importance in man-machine interaction. Multifingered hand prostheses using surface EMG for control have appeared in the market. However, EMG based control is still rudimentary, being limited to a few hand postures based on a higher number of EMG channels. Moreover, control is non-intuitive, in the sense that the user is required to learn to associate remnant muscle actions with unrelated postures of the prosthesis. Herein lies the promise of a low channel EMG based grasp classification architecture for development of an embedded intelligent prosthetic controller. This paper reports classification of six grasp types used during 70% of daily living activities based on two channel forearm EMG. A feature vector is derived through principal component analysis of features based on discrete wavelet transform coefficients of the EMG signal. Classification is through a radial basis function kernel based support vector machine following preprocessing and maximum voluntary contraction normalization of EMG signals. 10-fold cross validation is done. We have achieved an average recognition rate of 97.5%. © 2011 IEEE
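The 10-fold cross validation step named in the abstract above can be sketched as an index-splitting routine. This is a generic illustration, not the authors' procedure: the function name is hypothetical, and the sketch neither shuffles nor stratifies the samples, both of which are common in practice.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Samples are split into k contiguous folds whose sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    remainder = n_samples % k
    fold_sizes = [n_samples // k + (1 if i < remainder else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example with a hypothetical set of 25 recordings.
folds = list(k_fold_indices(25, k=10))
```

A classifier would be trained on each `train` list and scored on the matching `test` list, and the reported recognition rate would be the mean of the per-fold scores.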
Schuh, Michael A.; Angryk, Rafal A.; Martens, Petrus C.
The massive repository of images of the Sun captured by the Solar Dynamics Observatory (SDO) mission has ushered in the era of Big Data for Solar Physics. In this work, we investigate the entire public collection of events reported to the Heliophysics Event Knowledgebase (HEK) from automated solar feature recognition modules operated by the SDO Feature Finding Team (FFT). With the SDO mission recently surpassing five years of operations, and over 280,000 event reports for seven types of solar phenomena, we present the broadest and most comprehensive large-scale dataset of the SDO FFT modules to date. We also present numerous statistics on these modules, providing valuable contextual information for better understanding and validating of the individual event reports and the entire dataset as a whole. After extensive data cleaning through exploratory data analysis, we highlight several opportunities for knowledge discovery from data (KDD). Through these important prerequisite analyses presented here, the results of KDD from Solar Big Data will be overall more reliable and better understood. As the SDO mission remains operational over the coming years, these datasets will continue to grow in size and value. Future versions of this dataset will be analyzed in the general framework established in this work and maintained publicly online for easy access by the community.
Zhang, Qiang; Li, Jiafeng; Zhuo, Li; Zhang, Hui; Li, Xiaoguang
Color is one of the most stable attributes of vehicles and is often used as a valuable cue in some important applications. Various complex environmental factors, such as illumination, weather, and noise, cause the visual characteristics of vehicle color to vary considerably. Vehicle color recognition in complex environments has therefore been a challenging task. State-of-the-art methods roughly take the whole image for color recognition, but many parts of the image, such as car windows, wheels, and background, contain no color information, which has a negative impact on the recognition accuracy. In this paper, a novel vehicle color recognition method using local vehicle-color saliency detection and dual-orientational dimensionality reduction of convolutional neural network (CNN) deep features is proposed. The novelty of the proposed method includes two parts: (1) a local vehicle-color saliency detection method is proposed to determine the vehicle color region of the vehicle image and exclude the influence of non-color regions on the recognition accuracy; (2) a dual-orientational dimensionality reduction strategy is designed to greatly reduce the dimensionality of deep features learnt from the CNN, which greatly mitigates the storage and computational burden of the subsequent processing while improving the recognition accuracy. Furthermore, a linear support vector machine is adopted as the classifier and trained on the dimensionality-reduced features to obtain the recognition model. The experimental results on a public dataset demonstrate that the proposed method can achieve superior recognition performance over state-of-the-art methods.
In this article, I shall examine the cognitive, heuristic and theoretical functions of the concept of recognition. To evaluate both the explanatory power and the limitations of a sociological concept, the theory construction must be analysed and its actual productivity for sociological theory mus...
Nguyen, Dat Tien; Pham, Tuyen Danh; Baek, Na Rae; Park, Kang Ryoung
Although face recognition systems have wide application, they are vulnerable to presentation attack samples (fake samples). Therefore, a presentation attack detection (PAD) method is required to enhance the security level of face recognition systems. Most of the previously proposed PAD methods for face recognition systems have focused on using handcrafted image features, which are designed by expert knowledge of designers, such as Gabor filter, local binary pattern (LBP), local ternary pattern (LTP), and histogram of oriented gradients (HOG). As a result, the extracted features reflect limited aspects of the problem, yielding a detection accuracy that is low and varies with the characteristics of presentation attack face images. The deep learning method has been developed in the computer vision research community, which is proven to be suitable for automatically training a feature extractor that can be used to enhance the ability of handcrafted features. To overcome the limitations of previously proposed PAD methods, we propose a new PAD method that uses a combination of deep and handcrafted features extracted from the images by visible-light camera sensor. Our proposed method uses the convolutional neural network (CNN) method to extract deep image features and the multi-level local binary pattern (MLBP) method to extract skin detail features from face images to discriminate the real and presentation attack face images. By combining the two types of image features, we form a new type of image features, called hybrid features, which has stronger discrimination ability than single image features. Finally, we use the support vector machine (SVM) method to classify the image features into real or presentation attack class. Our experimental results indicate that our proposed method outperforms previous PAD methods by yielding the smallest error rates on the same image databases. PMID:29495417
Nowadays, many robots have been developed that can receive commands and perform speech recognition and face recognition. In this research, we develop a humanoid robot system with a controller based on Raspberry Pi 2. The methods we used are based on audio recognition and detection, and on face recognition using PCA (Principal Component Analysis) with OpenCV and Python. PCA is one of the algorithms used for face detection; it reduces the number of dimensions of the image. The result of this reduction is known as an eigenface and is used in the face recognition process. In this research, we still find false recognitions. These can be caused by many things, such as the database condition (the images may be too dark or insufficiently varied), a blurred test image, etc. The accuracy from 3 tests on different people is about 93% (28 correct recognitions out of 30).
Prima Dewi Purnamasari
The development of automatic emotion detection systems has recently gained significant attention due to the growing possibility of their implementation in several applications, including affective computing and various fields within biomedical engineering. Use of the electroencephalograph (EEG) signal is preferred over facial expression, as people cannot control the EEG signal generated by their brain; the EEG ensures a stronger reliability in the psychological signal. However, because of its uniqueness between individuals and its vulnerability to noise, use of EEG signals can be rather complicated. In this paper, we propose a methodology to conduct EEG-based emotion recognition by using a filtered bispectrum as the feature extraction subsystem and an artificial neural network (ANN) as the classifier. The bispectrum is theoretically superior to the power spectrum because it can identify phase coupling between the nonlinear process components of the EEG signal. In the feature extraction process, to extract the information contained in the bispectrum matrices, a 3D pyramid filter is used for sampling and quantifying the bispectrum value. Experiment results show that the mean percentage of the bispectrum value from 5 × 5 non-overlapped 3D pyramid filters produces the highest recognition rate. We found that reducing the number of EEG channels down to only eight in the frontal area of the brain does not significantly affect the recognition rate, and the number of data samples used in the training process is then increased to improve the recognition rate of the system. We have also utilized a probabilistic neural network (PNN) as another classifier and compared its recognition rate with that of the back-propagation neural network (BPNN), and the results show that the PNN produces a comparable recognition rate and lower computational costs. Our research shows that the extracted bispectrum values of an EEG signal using 3D filtering as a feature extraction
Sadeghi, Zahra; Testolin, Alberto
In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: A generative model that captures the statistical structure of the letters distribution should therefore also support the recognition of written digits. To this aim, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.
Hand posture recognition is an essential module in applications such as human-computer interaction (HCI), games, and sign language systems, in which performance and robustness are the primary requirements. In this paper, we propose automatic classification to recognize 21 hand postures that represent letters in Thai finger-spelling, based on the Histogram of Orientation Gradient (HOG) feature (which is applied with more focus on the information within a certain region of the image rather than each single pixel) and the Adaptive Boost (i.e., AdaBoost) learning technique to select the best weak classifier and to construct a strong classifier that consists of several weak classifiers to be cascaded in the detection architecture. We collected 21 static hand posture images from 10 subjects for testing and training in Thai letters finger-spelling. The parameters for the training process have been adjusted in three experiments, false positive rates (FPR), true positive rates (TPR), and number of training stages (N), to achieve the most suitable training model for each hand posture. All cascaded classifiers are loaded into the system simultaneously to classify different hand postures. A correlation coefficient is computed to distinguish the hand postures that are similar. The system achieves approximately 78% accuracy on average on all classifier experiments.
Maity, Priti Prasanna; Chatterjee, Subhamoy; Das, Raunak Kumar; Mukhopadhyay, Subhalaxmi; Maity, Ashok; Maulik, Dhrubajyoti; Ray, Ajoy Kumar; Dhara, Santanu; Chatterjee, Jyotirmoy
Benign phyllodes and fibroadenoma are two well-known breast tumors with remarkable diagnostic ambiguity. The present study aims to determine an optimum set of immuno-histochemical features to distinguish them by analyzing the expression of important genes in fibro-glandular tissue. Immuno-histochemically, the expressions of p63 and α-SMA in myoepithelial cells and of collagen I, III and CD105 in the stroma of tumors and their normal counterpart were studied. Semi-quantified features were analyzed primarily by ANOVA and ranked through F-scores to understand the relative importance of groups of features in discriminating the three classes. The dimension of the F-score-arranged feature space was then reduced, and inter-class Bhattacharyya distances were applied to distinguish the tumors with an optimum set of features. All but one of the thirteen studied features differed significantly among the three study classes. F-ranking of features revealed the highest discriminative potential for collagen III (initial region). Reducing the F-score-arranged feature space and applying the Bhattacharyya distance gave rise to a feature set of lower dimension that can discriminate benign phyllodes and fibroadenoma effectively. The work definitely separated normal breast, fibroadenoma and benign phyllodes through an optimal set of immuno-histochemical features, which are useful not only for addressing the diagnostic ambiguity of the tumors but also for indicating malignant potential. Copyright © 2013 Elsevier Ltd. All rights reserved.
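The ranking and separation steps named in this abstract can be illustrated with a minimal sketch. It assumes one-way ANOVA F statistics computed per feature and a Bhattacharyya distance between two classes modelled as univariate Gaussians; the feature names and values are hypothetical and are not data from the study, and the authors' actual procedure may differ.

```python
import math


def anova_f_score(groups):
    """One-way ANOVA F statistic for one feature.

    groups: list of per-class value lists (one list per class).
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-class and within-class sums of squares
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (between / df_between) / (within / df_within)


def bhattacharyya_distance(a, b):
    """Bhattacharyya distance between two samples, each modelled as a
    univariate Gaussian (an assumption made for this sketch)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    mean_term = 0.25 * (mean_a - mean_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


# Hypothetical semi-quantified scores per class: normal, fibroadenoma, phyllodes
features = {
    "feature_x": [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    "feature_y": [[1, 2, 3], [2, 3, 4], [3, 4, 5]],
}
scores = {name: anova_f_score(groups) for name, groups in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # most discriminative first
tumor_distance = bhattacharyya_distance(features["feature_x"][1],
                                        features["feature_x"][2])
```

Features with a higher F statistic sit earlier in `ranked`, so truncating that list is one simple way to reduce the feature space before measuring inter-class distances.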
Echeagaray-Patrón, B. A.; Kober, Vitaly
3D face recognition has attracted attention in the last decade due to improvement of technology of 3D image acquisition and its wide range of applications such as access control, surveillance, human-computer interaction and biometric identification systems. Most research on 3D face recognition has focused on analysis of 3D still data. In this work, a new method for face recognition using dynamic 3D range sequences is proposed. Experimental results are presented and discussed using 3D sequences in the presence of pose variation. The performance of the proposed method is compared with that of conventional face recognition algorithms based on descriptors.
This paper introduces a novel method for the recognition of human faces in digital images using a new feature extraction method that combines the global and local information in frontal view of facial images. Radial basis function (RBF) neural network with a hybrid learning algorithm (HLA) has been used as a classifier. The proposed feature extraction method includes human face localization derived from the shape information. An efficient distance measure as facial candidate threshold (FCT) is defined to distinguish between face and nonface images. Pseudo-Zernike moment invariant (PZMI) with an efficient method for selecting moment order has been used. A newly defined parameter named axis correction ratio (ACR) of images for disregarding irrelevant information of face images is introduced. In this paper, the effect of these parameters in disregarding irrelevant information in recognition rate improvement is studied. Also we evaluate the effect of orders of PZMI in recognition rate of the proposed technique as well as RBF neural network learning speed. Simulation results on the face database of Olivetti Research Laboratory (ORL) indicate that the proposed method for human face recognition yielded a recognition rate of 99.3%.
Handwritten character recognition plays an important role in transforming raw visual image data obtained from handwritten documents using for example scanners to a format which is understandable by a computer. It is an important application in the field of pattern recognition, machine learning and
Biesbroek, J Matthijs; van Zandvoort, Martine J E; Kappelle, L Jaap; Schoo, Linda; Kuijf, Hugo J; Velthuis, Birgitta K; Biessels, Geert Jan; Postma, Albert
Recognition memory, that is, the ability to judge whether an item has been previously encountered in a particular context, depends on two factors: discriminability and criterion setting. Discriminability draws on memory processes while criterion setting (i.e., the application of a threshold
Zhu, Xiao Ran; Zhang, You Yun; Zhu, Yong Sheng [Xi'an Jiaotong Univ., Xi'an (China)]
Intelligent fault diagnosis benefits from efficient feature selection. Neighborhood rough sets are effective in feature selection. However, determining the neighborhood value accurately remains a challenge. The wrapper feature selection algorithm is designed by combining the kernel method and neighborhood rough sets to self-adaptively select sensitive features. The combination effectively solves the shortcomings in selecting the neighborhood value in the previous application process. The statistical features of time and frequency domains are used to describe the characteristic of the rolling bearing to make the intelligent fault diagnosis approach work. Three classification algorithms, namely, classification and regression tree (CART), commercial version 4.5 (C4.5), and radial basis function support vector machines (RBFSVM), are used to test UCI datasets and 10 fault datasets of rolling bearing. The results indicate that the diagnostic approach presented could effectively select the sensitive fault features and simultaneously identify the type and degree of the fault.
Podbielska, Halina; Bauer, Joanna
The human body possesses many unique, singular features that are impossible to copy or forge. Nowadays, establishing and ensuring public security requires specially designed devices and systems. Biometrics is a field of science and technology that exploits human body characteristics for recognizing people. It identifies the most characteristic and unique ones in order to design and construct systems capable of recognizing people. This paper gives an overview of the achievements in biometrics. The verification and identification process is explained, along with the way biometric recognition systems are evaluated. The human biometrics most frequently used in practice are briefly presented, including fingerprints, facial imaging (including thermal characteristics), hand geometry and iris patterns.
Xu, Huile; Liu, Jinyi; Hu, Haibo; Zhang, Yi
Wearable sensors-based human activity recognition introduces many useful applications and services in health care, rehabilitation training, elderly monitoring and many other areas of human interaction. Existing works in this field mainly focus on recognizing activities by using traditional features extracted from Fourier transform (FT) or wavelet transform (WT). However, these signal processing approaches are suitable for a linear signal but not for a nonlinear signal. In this paper, we investigate the characteristics of the Hilbert-Huang transform (HHT) for dealing with activity data with properties such as nonlinearity and non-stationarity. A multi-features extraction method based on HHT is then proposed to improve the effect of activity recognition. The extracted multi-features include instantaneous amplitude (IA) and instantaneous frequency (IF) by means of empirical mode decomposition (EMD), as well as instantaneous energy density (IE) and marginal spectrum (MS) derived from Hilbert spectral analysis. Experimental studies are performed to verify the proposed approach by using the PAMAP2 dataset from the University of California, Irvine for wearable sensors-based activity recognition. Moreover, the effect of combining multi-features vs. a single-feature are investigated and discussed in the scenario of a dependent subject. The experimental results show that multi-features combination can further improve the performance measures. Finally, we test the effect of multi-features combination in the scenario of an independent subject. Our experimental results show that we achieve four performance indexes: recall, precision, F-measure, and accuracy to 0.9337, 0.9417, 0.9353, and 0.9377 respectively, which are all better than the achievements of related works.
R, Elakkiya; K, Selvamani
Subunit segmentation and modelling in medical sign language is one of the important areas of study in linguistic-oriented and vision-based Sign Language Recognition (SLR). Many earlier efforts focused on functional subunits from the view of linguistic syllables, but implementing such syllable-based subunit extraction is not feasible with real-world computer vision techniques. In addition, present recognition systems are designed to detect signer-dependent actions under restricted laboratory conditions. This research paper aims to solve these two important issues in visual sign language recognition: (1) subunit extraction and (2) signer-independent action. Subunit extraction involves the sequential and parallel breakdown of sign gestures without any prior knowledge of syllables or of the number of subunits. A novel Bayesian Parallel Hidden Markov Model (BPaHMM) is introduced for subunit extraction to combine the features of manual and non-manual parameters and yield better results in classification and recognition of signs. Signer-independent action aims at using a single web camera for different signer behaviour patterns and for cross-signer validation. Experimental results show that the proposed signer-independent subunit-level modelling for sign language classification and recognition shows improvement and variations when compared with other existing works.
Prakash, Ammu; Ocana Macias, Mariano; Hewko, Mark; Sowa, Michael; Sherif, Sherif
Vascular plaque can be detected in optical coherence tomography (OCT) images by using the full set of 26 Haralick textural features and a standard K-means clustering algorithm. However, using the full set of 26 textural features is computationally expensive and may not be feasible for real-time implementation. In this work, we identified a reduced set of 3 textural features that characterizes vascular plaque and used a generalized Fuzzy C-means clustering algorithm. Our work involves three steps: 1) the reduction of the full set of 26 textural features to a reduced set of 3 textural features by using a genetic algorithm (GA) optimization method, 2) the implementation of an unsupervised generalized clustering algorithm (Fuzzy C-means) on the reduced feature space, and 3) the validation of our results using histology and actual photographic images of vascular plaque. Our results show an excellent match with histology and actual photographic images of vascular tissue. Therefore, our results could provide an efficient pre-clinical tool for the detection of vascular plaque in real-time OCT imaging.
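Step 2 of this abstract can be illustrated with a minimal sketch of the standard Fuzzy C-means update rules on 3-dimensional feature vectors. This is the textbook algorithm, not the generalized variant the authors describe, and the sample points, initial centres, and fuzzifier `m = 2` are hypothetical choices made for illustration.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Standard Fuzzy C-means.

    points:  list of feature vectors (lists of floats)
    centers: initial cluster centres, one vector per cluster
    m:       fuzzifier (> 1); larger values give softer memberships
    Returns the final centres and the memberships from the last iteration.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    exponent = 2.0 / (m - 1.0)
    n_clusters = len(centers)
    dim = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1))
        memberships = []
        for p in points:
            d = [max(dist(p, c), 1e-12) for c in centers]  # avoid division by zero
            memberships.append([
                1.0 / sum((d[i] / d[j]) ** exponent for j in range(n_clusters))
                for i in range(n_clusters)
            ])
        # Centre update: mean of the points weighted by u_ik^m
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dim)
            ])
        centers = new_centers
    return centers, memberships


# Hypothetical 3-feature texture vectors forming two well-separated groups
points = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
          [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [5.0, 5.1, 5.0]]
centers, memberships = fuzzy_c_means(points, [[1.0, 1.0, 1.0], [4.0, 4.0, 4.0]])
```

Each row of `memberships` sums to 1, so a pixel or region can be assigned to the cluster with the largest membership or kept as a soft label.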
Zou, Zheng; Wang, Niannian; Zhao, Peng; Zhao, Xuefeng
Ancient architecture has very high historical and artistic value. Ancient buildings have a wide variety of textures and decorative paintings, which carry a lot of historical meaning. Therefore, researching and cataloguing these compositional and decorative features plays an important role in subsequent research. However, until recently, statistics on those components have mainly been compiled manually, which is inefficient and consumes a lot of labor and time. At present, with the strong support of big data and GPU-accelerated training, machine vision with deep learning at its core has developed rapidly and is widely used in many fields. This paper proposes an approach to recognizing and detecting the textures, decorations and other features of ancient buildings based on machine vision. First, a large number of surface texture images of ancient building components are classified manually to form a set of samples. Then, a convolutional neural network is trained on the samples to obtain a classification detector. Finally, its precision is verified.
Mioulet, L.; Bideault, G.; Chatelain, C.; Paquet, T.; Brunessaux, S.
The BLSTM-CTC is a novel recurrent neural network architecture that has outperformed previous state of the art algorithms in tasks such as speech recognition or handwriting recognition. It has the ability to process long term dependencies in temporal signals in order to label unsegmented data. This paper describes different ways of combining features using a BLSTM-CTC architecture. Not only do we explore the low level combination (feature space combination) but we also explore high level combination (decoding combination) and mid-level (internal system representation combination). The results are compared on the RIMES word database. Our results show that the low level combination works best, thanks to the powerful data modeling of the LSTM neurons.
Selecting the right set of features from data of high dimensionality for inducing an accurate classification model is a tough computational challenge. It is almost an NP-hard problem, as the number of feature combinations escalates exponentially as the number of features increases. Unfortunately, in data mining, as well as in other engineering applications and bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since it takes seemingly forever to use brute force in exhaustively trying every possible combination of features, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier into its fitness function and plugging in any metaheuristic algorithm to facilitate heuristic search. Simulation experiments are carried out by testing the Swarm Search over some high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experiment results show that Swarm Search is able to attain relatively low error rates in classification without shrinking the size of the feature subset to its minimum.
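The wrapper idea in this abstract, a stochastic search over feature subsets with a pluggable fitness function, can be sketched as follows. This is a deliberately simpler stand-in (stochastic hill climbing with single bit flips), not the Swarm Search scheme itself; the function names and the toy fitness are hypothetical, and in practice the fitness would be a classifier's validation accuracy.

```python
import random


def stochastic_wrapper_search(n_features, fitness, iterations=200, seed=0):
    """Stochastic hill climbing over feature subsets.

    fitness: any callable taking a list of booleans (feature mask) and
             returning a score to maximise, e.g. a classifier's accuracy.
    """
    rng = random.Random(seed)
    current = [True] * n_features  # start from the full feature set
    current_score = fitness(current)
    for _ in range(iterations):
        candidate = list(current)
        flip = rng.randrange(n_features)
        candidate[flip] = not candidate[flip]  # add or drop one feature
        if not any(candidate):
            continue  # never evaluate the empty subset
        score = fitness(candidate)
        if score > current_score:  # keep only strict improvements
            current, current_score = candidate, score
    selected = [i for i, keep in enumerate(current) if keep]
    return selected, current_score


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: features 0 and 2 are
    informative, every other selected feature costs a small penalty."""
    relevant = {0, 2}
    chosen = {i for i, keep in enumerate(mask) if keep}
    return len(chosen & relevant) - 0.1 * len(chosen - relevant)


selected, best_score = stochastic_wrapper_search(4, toy_fitness)
n_subsets = 2 ** 4 - 1  # non-empty subsets an exhaustive search would visit
```

With `n` features an exhaustive search visits `2**n - 1` non-empty subsets, which is why a sampled search with a bounded number of fitness evaluations becomes attractive as `n` grows.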
Giardino, Marco; Magagna, Alessandra; Ferrero, Elena; Perrone, Gianluigi
Digital field mapping has certainly provided geoscientists with the opportunity to map and gather data in the field directly using digital tools and software rather than using paper maps, notebooks and analogue devices and then subsequently transferring the data to a digital format for subsequent analysis. But, the same opportunity has to be recognized for Geoscience education, as well as for stimulating and helping students in the recognition of landforms and interpretation of the geological and geomorphological components of a landscape. More, an early exposure to mapping during school and prior to university can optimise the ability to "read" and identify uncertainty in 3d models. During 2014, about 200 Secondary School students (aged 12-15) of the Piedmont region (NW Italy) participated in a research program involving the use of mobile devices (smartphone and tablet) in the field. Students, divided in groups, used the application Trimble Outdoors Navigators for tracking a geological trail in the Sangone Valley and for taking georeferenced pictures and notes. Back to school, students downloaded the digital data in a .kml file for the visualization on Google Earth. This allowed them: to compare the hand tracked trail on a paper map with the digital trail, and to discuss about the functioning and the precision of the tools; to overlap a digital/semitransparent version of the 2D paper map (a Regional Technical Map) used during the field trip on the 2.5D landscape of Google Earth, as to help them in the interpretation of conventional symbols such as contour lines; to perceive the landforms seen during the field trip as a part of a more complex Pleistocene glacial landscape; to understand the classical and innovative contributions from different geoscientific disciplines to the generation of a 3D structural geological model of the Rivoli-Avigliana Morainic Amphitheatre. In 2013 and 2014, some other pilot projects have been carried out in different areas of the
The great amount of multispectral VHR satellite images, even available free of charge in Google earth has opened new strategic challenges in the field of remote sensing for archaeological studies. These challenges substantially deal with: (i) the strategic exploitation of satellite data as much as possible, (ii) the setting up of effective and reliable automatic and/or semiautomatic data processing strategies and (iii) the integration with other data sources from documentary resources to the traditional ground survey, historical documentation, geophysical prospection, etc. VHR satellites provide high resolution data which can improve knowledge on past human activities providing precious qualitative and quantitative information developed to such an extent that currently they share many of the physical characteristics of aerial imagery. This makes them ideal for investigations ranging from a local to a regional scale (see. for example, Lasaponara and Masini 2006a,b, 2007a, 2011; Masini and Lasaponara 2006, 2007, Sparavigna, 2010). Moreover, satellite data are still the only data source for research performed in areas where aerial photography is restricted because of military or political reasons. Among the main advantages of using satellite remote sensing compared to traditional field archaeology herein we briefly focalize on the use of wavelet data processing for enhancing google earth satellite data with particular reference to multitemporal datasets. Study areas selected from Southern Italy, Middle East and South America are presented and discussed. Results obtained point out the use of automatic image enhancement can successfully applied as first step of supervised classification and intelligent data analysis for semiautomatic identification of features of archaeological interest. Reference Lasaponara R, Masini N (2006a) On the potential of panchromatic and multispectral Quickbird data for archaeological prospection. Int J Remote Sens 27: 3607-3614. Lasaponara R
Gert Van Dijck
For two decades, wavelet packet decompositions have been shown to be effective as a generic approach to feature extraction from time series and images for the prediction of a target variable. Redundancies exist between the wavelet coefficients and between the energy features that are derived from the wavelet coefficients. We assess these redundancies in wavelet packet decompositions by means of the Markov blanket filtering theory. We introduce the concept of joint Markov blankets. It is shown that joint Markov blankets are a natural extension of Markov blankets, which are defined for single features, to a set of features. We show that these joint Markov blankets exist in feature sets consisting of the wavelet coefficients. Furthermore, we prove that wavelet energy features from the highest frequency resolution level form a joint Markov blanket for all other wavelet energy features. The joint Markov blanket theory indicates that one can expect an increase of classification accuracy with the increase of the frequency resolution level of the energy features.
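The redundancy between energy features at different levels can be illustrated with a small sketch. It assumes an orthonormal Haar filter and a signal whose length is divisible by 2**levels; the function names `haar_step` and `wavelet_packet_energies` are hypothetical and are not taken from the paper. With an orthonormal filter, the energy of each parent node equals the sum of the energies of its two children, so the coarser-level energies are fully determined by the deepest-level ones. This is offered only as an intuition for the result stated above, not as the authors' proof.

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def wavelet_packet_energies(signal, levels):
    """Energies of all wavelet packet nodes, one list per decomposition level.

    Assumes len(signal) is divisible by 2**levels (illustrative restriction).
    """
    nodes = [np.asarray(signal, dtype=float)]
    energies = []
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Full packet tree: split both approximation and detail branches.
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
        energies.append([float(np.sum(n ** 2)) for n in nodes])
    return energies

signal = np.arange(8.0)
level_energies = wavelet_packet_energies(signal, levels=2)
# Parent energy at level 1 equals the sum of its two children at level 2.
parent = level_energies[0][0]
children = level_energies[1][0] + level_energies[1][1]
print(parent, children)
```

Because the children of node `i` at one level are stored at positions `2*i` and `2*i + 1` of the next level, the additivity check generalizes to every node in the tree.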
Sultana, Maryam; Bhatti, Naeem; Javed, Sajid; Jung, Soon Ki
Facial expression recognition (FER) is an important task for various computer vision applications. The task becomes challenging when it requires the detection and encoding of macro- and micropatterns of facial expressions. We present a two-stage texture feature extraction framework based on the local binary pattern (LBP) variants and evaluate its significance in recognizing posed and nonposed facial expressions. We focus on the parametric limitations of the LBP variants and investigate their effects for optimal FER. The size of the local neighborhood is an important parameter of the LBP technique for its extraction in images. To make the LBP adaptive, we exploit the granulometric information of the facial images to find the local neighborhood size for the extraction of center-symmetric LBP (CS-LBP) features. Our two-stage texture representations consist of an LBP variant and the adaptive CS-LBP features. Among the presented two-stage texture feature extractions, the binarized statistical image features and adaptive CS-LBP features were found to yield high FER rates. Evaluation of the adaptive texture features shows competitive and higher performance than the nonadaptive features and other state-of-the-art approaches, respectively.
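For readers unfamiliar with CS-LBP, the following is a minimal sketch of the basic, nonadaptive operator. It assumes a fixed 8-neighbourhood at radius 1 and a scalar threshold; the adaptive neighbourhood size derived from granulometric information, which is the contribution described above, is not reproduced here. The function name `cs_lbp` and its parameters are hypothetical.

```python
import numpy as np

def cs_lbp(image, threshold=0.0):
    """CS-LBP codes (0..15) for the interior pixels of a 2-D grayscale image.

    Each of the four bits compares one neighbour with the neighbour
    diametrically opposite to it (fixed radius 1, an illustrative choice).
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    # Offsets (dy, dx) of four neighbours; each is paired with its opposite.
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1)]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        a = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        b = img[1 - dy : h - 1 - dy, 1 - dx : w - 1 - dx]
        codes |= ((a - b) > threshold).astype(np.uint8) << bit
    return codes

# A horizontal intensity ramp: only the pairs with a left-to-right
# component fire, so every interior pixel gets the same code.
ramp = np.tile(np.arange(5.0), (5, 1))
codes = cs_lbp(ramp)
histogram = np.bincount(codes.ravel(), minlength=16)
print(codes)
print(histogram)
```

The 16-bin histogram of codes, typically computed per image block, is what would serve as the texture descriptor; comparing opposite neighbours instead of neighbours against the centre is what halves the number of bits relative to the standard 8-neighbour LBP.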
Tunmer, William E.; Chapman, James W.
This study investigated the hypothesis that vocabulary influences word recognition skills indirectly through "set for variability", the ability to determine the correct pronunciation of approximations to spoken English words. One hundred forty children participating in a 3-year longitudinal study were administered reading and…
Betta, G.; Capriglione, D.; Crenna, F.; Rossi, G. B.; Gasparetto, M.; Zappa, E.; Liguori, C.; Paolillo, A.
Security systems based on face recognition through video surveillance systems deserve great interest. Their use is important in several areas including airport security, identification of individuals and access control to critical areas. These systems are based either on the measurement of details of a human face or on a global approach whereby faces are considered as a whole. The recognition is then performed by comparing the measured parameters with reference values stored in a database. The result of this comparison is not deterministic because measurement results are affected by uncertainty due to random variations and/or to systematic effects. In these circumstances the recognition of a face is subject to the risk of a faulty decision. Therefore, a proper metrological characterization is needed to improve the performance of such systems. Suitable methods are proposed for a quantitative metrological characterization of face measurement systems, on which recognition procedures are based. The proposed methods are applied to three different algorithms based either on linear discrimination, on eigenface analysis, or on feature detection.
Carbonetto , Peter; De Freitas , Nando; Gustafson , Paul; Thompson , Natalie
We present a method for variable selection/weighting in an unsupervised learning context using Bayesian shrinkage. The basis for the model parameters and cluster assignments can be computed simultaneously using an efficient EM algorithm. Applying our Bayesian shrinkage model to a complex problem in object recognition (Duygulu, Barnard, de Freitas and Forsyth 2002), our experiments yield good results.
Kim, Jaebok; Truong, Khiet; Englebienne, Gwenn; Evers, Vanessa
In this paper, we propose to use deep 3-dimensional convolutional networks (3D CNNs) in order to address the challenge of modelling spectro-temporal dynamics for speech emotion recognition (SER). Compared to a hybrid of Convolutional Neural Network and Long-Short-Term-Memory (CNN-LSTM), our proposed
Yebes, J Javier; Bergasa, Luis M; García-Garrido, Miguel Ángel
Driver assistance systems and autonomous robotics rely on the deployment of several sensors for environment perception. Compared to LiDAR systems, the inexpensive vision sensors can capture the 3D scene as perceived by a driver in terms of appearance and depth cues. Indeed, providing 3D image understanding capabilities to vehicles is an essential target in order to infer scene semantics in urban environments. One of the challenges that arises from the navigation task in naturalistic urban scenarios is the detection of road participants (e.g., cyclists, pedestrians and vehicles). In this regard, this paper tackles the detection and orientation estimation of cars, pedestrians and cyclists, employing the challenging and naturalistic KITTI images. This work proposes 3D-aware features computed from stereo color images in order to capture the appearance and depth peculiarities of the objects in road scenes. The successful part-based object detector, known as DPM, is extended to learn richer models from the 2.5D data (color and disparity), while also carrying out a detailed analysis of the training pipeline. A large set of experiments evaluate the proposals, and the best performing approach is ranked on the KITTI website. Indeed, this is the first work that reports results with stereo data for the KITTI object challenge, achieving increased detection ratios for the classes car and cyclist compared to a baseline DPM.
Sirakov, Nikolay M.; Suh, Sang; Attardo, Salvatore
This paper presents a further step in research toward the development of a quick and accurate weapons identification methodology and system. A basic stage of this methodology is the automatic acquisition and updating of a weapons ontology as a source for deriving high-level weapons information. The present paper outlines the main ideas used to approach this goal. In the next stage, a clustering approach is suggested on the basis of a hierarchy of concepts. An inherent slot of every node of the proposed ontology is a low-level features vector (LLFV), which facilitates the search through the ontology. Part of the LLFV is the information about the object's parts. To partition an object, a new approach is presented that is capable of defining the object's concavities, which are used to mark the end points of weapon parts, considered as convexities. Further, an existing matching approach is optimized to determine whether an ontological object matches the objects from an input image. Objects from derived ontological clusters will be considered for the matching process. Image resizing is studied and applied to decrease the runtime of the matching approach and to investigate its rotational and scaling invariance. A set of experiments is performed to validate the theoretical concepts.
Carlos E. Galván-Tejada
This work presents a human activity recognition (HAR) model based on audio features. The use of sound as an information source for HAR models represents a challenge because sound wave analyses generate very large amounts of data. However, feature selection techniques may reduce the amount of data required to represent an audio signal sample. Some of the audio features that were analyzed include Mel-frequency cepstral coefficients (MFCC). Although MFCC are commonly used in voice and instrument recognition, their utility within HAR models is yet to be confirmed, and this work validates their usefulness. Additionally, statistical features were extracted from the audio samples to generate the proposed HAR model. The amount of information needed to build a HAR model directly affects the accuracy of the model. This problem was also tackled in the present work; our results indicate that we are capable of recognizing a human activity with an accuracy of 85% using the proposed HAR model. This means that minimum computational costs are needed, thus allowing portable devices to identify human activities using audio as an information source.
The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). In this paper, the purpose is to present a novel feature extraction method based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for the characterization and classification of different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the texture properties of the multi-resolution emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, multi-resolution texture analysis can give a clearer discrimination between emotions than uniform-resolution texture analysis. In order to provide high accuracy of emotional discrimination, especially in real life, an acoustic activity detection (AAD) algorithm must be applied in the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, this paper makes use of two corpora of naturally occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features can also improve the correct classification rates of the proposed systems across different language databases. Experimental results show that the proposed MRTII-based feature information, inspired by human visual perception of the spectrogram image, can provide significant classification for real-life emotional recognition in speech.
Iisaka, Joji; Sakurai-Amano, Takako
This paper describes an integrated approach to terrain feature detection and several methods to estimate spatial information from SAR (synthetic aperture radar) imagery. Spatial information of image features as well as spatial association are key elements in terrain feature detection. After applying a small feature preserving despeckling operation, spatial information such as edginess, texture (smoothness), region-likeliness and line-likeness of objects, target sizes, and target shapes were estimated. Then a trapezoid shape fuzzy membership function was assigned to each spatial feature attribute. Fuzzy classification logic was employed to detect terrain features. Terrain features such as urban areas, mountain ridges, lakes and other water bodies as well as vegetated areas were successfully identified from a sub-image of a JERS-1 SAR image. In the course of shape analysis, a quantitative method was developed to classify spatial patterns by expanding a spatial pattern through the use of a series of pattern primitives.
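The fuzzy classification step can be sketched as follows. This is a minimal illustration, assuming attribute values normalized to [0, 1], a minimum operator for combining attribute memberships, and a maximum over classes for the decision. The class names, attribute names and all breakpoint values below are hypothetical and are not taken from the paper.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between.

    Requires a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def classify(features, rules):
    """Score each class as the minimum membership over its attributes
    (fuzzy AND) and return the best-scoring label with all scores."""
    scores = {
        label: min(trapezoid(features[name], *params)
                   for name, params in attributes.items())
        for label, attributes in rules.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical rule base: (a, b, c, d) breakpoints per attribute and class.
rules = {
    "water": {"texture": (-0.1, 0.0, 0.2, 0.4), "edginess": (-0.1, 0.0, 0.1, 0.3)},
    "urban": {"texture": (0.5, 0.7, 1.0, 1.1), "edginess": (0.4, 0.6, 1.0, 1.1)},
}
pixel = {"texture": 0.1, "edginess": 0.2}
label, scores = classify(pixel, rules)
print(label, scores)
```

The flat top of the trapezoid lets a class accept a whole range of attribute values with full confidence, while the sloped shoulders give a graded response near the class boundaries instead of a hard cut-off.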
Yoshimi, Takehiko; Kotani, Katsunori; Isahara, Hitoshi
The present paper proposes and evaluates a readability assessment method designed for Japanese learners of EFL (English as a foreign language). The proposed readability assessment method is constructed by a regression algorithm using a new set of linguistic features that were employed separately in previous studies. The results showed that the…
Wind, Anke; Lobo, Mariana Fernandes; van Dijk, Joris; Lepage-Nefkens, Isabelle; Laranja-Pontes, Jose; da Conceicao Goncalves, Vitor; van Harten, Willem H.; Rocha-Goncalves, Francisco Nuno
The specific aim of this study is to identify the performance features of cancer centers in the European Union by using a fuzzy-set qualitative comparative analysis (fsQCA). The fsQCA method represents cases (cancer centers) as a combination of explanatory and outcome conditions. This study uses
Zawistowski, Jacek; Kurzejamski, Grzegorz; Garbat, Piotr; Naruniec, Jacek
This paper presents a system designed for multi-object detection and adapted to product search on market shelves. The system uses well-known binary keypoint detection algorithms to find characteristic points in the image. One of the main ideas is object recognition based on the Implicit Shape Model method. The authors propose many improvements to the algorithm. Originally, fiducial points are matched with a very simple function. This limits the number of object parts that can be successfully separated, while various classification methods may be validated in order to achieve higher performance. Such an extension implies research on a training procedure able to deal with many object categories. The proposed solution opens new possibilities for many algorithms demanding fast and robust multi-object recognition.
He, Fei; Liu, Yuanning; Zhu, Xiaodong; Huang, Chun; Han, Ye; Dong, Hongxing
Gabor descriptors have been widely used in iris texture representations. However, fixed basic Gabor functions cannot match the changing nature of diverse iris datasets. Furthermore, a single form of iris feature cannot overcome difficulties in iris recognition, such as illumination variations, environmental conditions, and device variations. This paper provides multiple local feature representations and their fusion scheme based on a support vector regression (SVR) model for iris recognition using optimized Gabor filters. In our iris system, a particle swarm optimization (PSO)- and a Boolean particle swarm optimization (BPSO)-based algorithm is proposed to provide suitable Gabor filters for each involved test dataset without predefinition or manual modulation. Several comparative experiments on JLUBR-IRIS, CASIA-I, and CASIA-V4-Interval iris datasets are conducted, and the results show that our work can generate improved local Gabor features by using optimized Gabor filters for each dataset. In addition, our SVR fusion strategy may make full use of their discriminative ability to improve accuracy and reliability. Other comparative experiments show that our approach may outperform other popular iris systems.
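To make concrete what "optimizing Gabor filters" involves, the sketch below builds a real-valued Gabor kernel from a handful of parameters. It uses the standard textbook parameterization (Gaussian envelope times a cosine carrier); which parameters the PSO and BPSO algorithms in the paper actually tune is not stated in the abstract, so treating `sigma`, `theta`, `wavelength` and `gamma` as the search variables is an assumption, and the function names are hypothetical.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor filter sampled on a size x size grid (size odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the sinusoid runs along direction theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

def gabor_bank(size, sigma, wavelength, n_orientations):
    """Kernels at evenly spaced orientations in [0, pi)."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    return [gabor_kernel(size, sigma, t, wavelength) for t in thetas]

bank = gabor_bank(size=7, sigma=2.0, wavelength=4.0, n_orientations=4)
print(len(bank), bank[0].shape)
```

A dataset-specific optimizer would evaluate a recognition score for each candidate parameter vector and keep the best one, which is what removes the need for predefined or manually tuned filters.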
Modal frequency is an important indicator for structural health assessment. Previous studies have shown that this indicator is substantially affected by the fluctuation of ambient conditions, such as temperature and humidity. Therefore, recognizing the pattern between modal frequency and ambient conditions is necessary for reliable long-term structural health assessment. In this article, a novel machine-learning algorithm is proposed to automatically select relevant features in modal frequency-ambient condition pattern recognition based on structural dynamic response and ambient condition measurement. In contrast to traditional feature selection approaches, which examine a large number of combinations of extracted features, the proposed algorithm conducts continuous relevance feature selection by introducing a sophisticated hyperparameterization on the weight parameter vector controlling the relevancy of different features in the prediction model. The proposed algorithm is then utilized for structural health assessment of a reinforced concrete building based on 1-year daily measurements. It turns out that the optimal model class including the relevant features for each vibrational mode is capable of capturing the pattern between the corresponding modal frequency and the ambient conditions.
Nguyen, Dat Tien; Kim, Ki Wan; Hong, Hyung Gil; Koo, Ja Hyung; Kim, Min Cheol; Park, Kang Ryoung
Extracting powerful image features plays an important role in computer vision systems. Many methods have previously been proposed to extract image features for various computer vision applications, such as the scale-invariant feature transform (SIFT), speed-up robust feature (SURF), local binary patterns (LBP), histogram of oriented gradients (HOG), and weighted HOG. Recently, the convolutional neural network (CNN) method for image feature extraction and classification in computer vision has been used in various applications. In this research, we propose a new gender recognition method for recognizing males and females in observation scenes of surveillance systems based on feature extraction from visible-light and thermal camera videos through CNN. Experimental results confirm the superiority of our proposed method over state-of-the-art recognition methods for the gender recognition problem using human body images. PMID:28335510
Hand-based biometrics plays a significant role in establishing security for real-time environments involving human interaction and is found to be more successful in terms of high speed and accuracy. This paper investigates an integrated approach for personal authentication using the Finger Back Knuckle Surface (FBKS), based on two methodologies, viz., the Angular Geometric Analysis based Feature Extraction Method (AGFEM) and the Contourlet Transform based Feature Extraction Method (CTFEM). Based on these methods, this personal authentication system simultaneously extracts shape-oriented feature information and textural pattern information of the FBKS for authenticating an individual. Furthermore, the proposed geometric and textural analysis methods extract feature information from both the proximal phalanx and distal phalanx knuckle regions (FBKS), while existing works in the literature concentrate only on the features of the proximal phalanx knuckle region. The finger joint region found nearer to the tip of the finger is called the distal phalanx region of the FBKS; it is a unique feature and has greater potential for identification. Extensive experiments were conducted using a newly created database with 5400 FBKS images, and the obtained results indicate that the integration of shape-oriented features with texture feature information yields an excellent accuracy rate of 99.12% with a lowest equal error rate of 1.04%.
Zhang, Jingfa; Qin, Qiming
Many types of feature extraction from RS images are analyzed, and a work procedure for pattern recognition in RS images of seismic disasters is proposed. The aerial RS image of the Tangshan Great Earthquake is processed, and the digital features of various typical seismic disasters in the RS image are calculated.
Morishita, Masayo; Di Luccio, Eric
Highlights: → NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1 are histone methyltransferases linked to numerous cancers. → Little is known about the NSD pathways and HMTase inhibitors are sorely needed in the epigenetic therapy of cancers. → We investigate the regulation and the recognition of histone marks by the SET domain of NSD1. → A unique and key mechanism is driven by a loop at the interface of the SET and postSET region. → Implications for developing specific and selective HMTase inhibitors are presented. -- Abstract: The development of epigenetic therapies fuels hope for cancer treatment. DNA-methylation inhibitors, histone-deacetylase and histone-methyltransferase (HMTase) inhibitors are being developed as the utilization of epigenetic targets is emerging as an effective and valuable approach to chemotherapy as well as chemoprevention of cancer. The nuclear receptor binding SET domain (NSD) proteins are a family of three HMTases, NSD1, NSD2/MMSET/WHSC1, and NSD3/WHSC1L1, that are critical in maintaining chromatin integrity. A growing number of studies have reported alterations or amplifications of NSD1, NSD2, or NSD3 in numerous carcinogenic events. Reducing NSD activity through specific lysine-HMTase inhibitors appears promising to help suppress cancer growth. However, little is known about the NSD pathways and our understanding of the histone lysine-HMTase mechanism is partial. To shed some light on both the recognition and the regulation of epigenetic marks by the SET domain of the NSD family, we investigate the structural mechanisms of the docking of the histone-H4 tail on the SET domain of NSD1. Our finding exposes a key regulatory and recognition mechanism driven by the flexibility of a loop at the interface of the SET and postSET region. Finally, we consider the particular value of this regulatory region for developing specific and selective NSD inhibitors for the epigenetic therapy of cancers.
Liu, Haihong; Liu, Sha; Wang, Suju; Liu, Chang; Kong, Ying; Zhang, Ning; Li, Shujing; Yang, Yilin; Han, Demin; Zhang, Luo
The purpose of this study was to examine the open-set word recognition performance of Mandarin Chinese-speaking children who had received a multichannel cochlear implant (CI) and examine the effects of lexical characteristics and demographic factors (i.e., age at implantation and duration of implant use) on Mandarin Chinese open-set word recognition in these children. Participants were 230 prelingually deafened children with CIs. Age at implantation ranged from 0.9 to 16.0 years, with a mean of 3.9 years. The Standard-Chinese version of the Monosyllabic Lexical Neighborhood test and the Multisyllabic Lexical Neighborhood test were used to evaluate the open-set word identification abilities of the children. A two-way analysis of variance was performed to delineate the lexical effects on the open-set word identification, with word difficulty and syllable length as the two main factors. The effects of age at implantation and duration of implant use on open-set word recognition performance were examined using correlational/regressional models. First, the average percent-correct scores for the disyllabic "easy" list, disyllabic "hard" list, monosyllabic "easy" list, and monosyllabic "hard" list were 65.0%, 51.3%, 58.9%, and 46.2%, respectively. For both the easy and hard lists, the percentage of words correctly identified was higher for disyllabic words than for monosyllabic words. Second, the CI group scored 26.3, 31.3, and 18.8 percentage points lower than their hearing-age-matched normal-hearing peers for 4, 5, and 6 years of hearing age, respectively. The corresponding gaps between the CI group and the chronological-age-matched normal-hearing group were 47.6, 49.6, and 42.4, respectively. The individual variations in performance were much greater in the CI group than in the normal-hearing group. Third, the children exhibited steady improvements in performance as the duration of implant use increased, especially 1 to 6 years postimplantation. Last, age at implantation had
... be recognized simultaneously, and occlusion and clutter (through distracter objects) are common. We propose a representation for object viewpoints using Hough transform based geometric matching features, which are robust in such circumstances. We show how...
This study examined the use of speech recognition (SR) technology to support a group of elementary school children's learning of English as a foreign language (EFL). SR technology has been used in various language learning contexts. Its application to EFL teaching and learning is still relatively recent, but a solid understanding of its…
The author uses pattern recognition methods for detecting word boundaries, and monitors incoming speech at 12 millisecond intervals. Frequency is divided into eight bands and analysis is achieved in an analogue interface integrated circuit, a pipeline digital processor and a control integrated circuit. Applications are suggested, including speech input to personal computers. 3 references.
Elhaj, Fatin A; Salim, Naomie; Harris, Arief R; Swee, Tan Tian; Ahmed, Taqwa
Arrhythmia is a cardiac condition caused by abnormal electrical activity of the heart, and an electrocardiogram (ECG) is the non-invasive method used to detect arrhythmias or heart abnormalities. Due to the presence of noise, the non-stationary nature of the ECG signal (i.e. the changing morphology of the ECG signal with respect to time) and the irregularity of the heartbeat, physicians face difficulties in the diagnosis of arrhythmias. The computer-aided analysis of ECG results helps physicians detect cardiovascular diseases. The development of many existing arrhythmia systems has depended on the findings from linear experiments on ECG data, which achieve high performance on noise-free data. However, nonlinear experiments characterize the ECG signal more effectively, extract hidden information in the ECG signal, and achieve good performance under noisy conditions. This paper investigates the representation ability of linear and nonlinear features and proposes a combination of such features in order to improve the classification of ECG data. In this study, five types of beat classes of arrhythmia as recommended by the Association for Advancement of Medical Instrumentation are analyzed: non-ectopic beats (N), supra-ventricular ectopic beats (S), ventricular ectopic beats (V), fusion beats (F) and unclassifiable and paced beats (U). The characterization ability of nonlinear features such as high order statistics and cumulants and nonlinear feature reduction methods such as independent component analysis are combined with linear features, namely, the principal component analysis of discrete wavelet transform coefficients. The features are tested for their ability to differentiate different classes of data using different classifiers, namely, the support vector machine and neural network methods with tenfold cross-validation. Our proposed method is able to classify the N, S, V, F and U arrhythmia classes with high accuracy (98.91%) using a combined support
.... Each data set is divided into a training set, which is made available to developers, and a carefully matched equal-sized set of closely analogous samples, which is reserved for testing of the developers' products...
Automatic target recognition (ATR) in synthetic aperture radar (SAR) images plays an important role in both national defense and civil applications. Although many methods have been proposed, SAR ATR is still very challenging due to the complex application environment. Feature extraction and classification are key points in SAR ATR. In this paper, we first design a novel feature, which is a histogram of oriented gradients (HOG)-like feature for SAR ATR (called SAR-HOG). Then, we propose a supervised discriminative dictionary learning (SDDL) method to learn a discriminative dictionary for SAR ATR and propose a strategy to simplify the optimization problem. Finally, we propose a SAR ATR classifier based on SDDL and sparse representation (called SDDLSR), in which both the reconstruction error and the classification error are considered. Extensive experiments are performed on the MSTAR database under standard operating conditions and extended operating conditions. The experimental results show that SAR-HOG can reliably capture the structures of targets in SAR images, and SDDL can further capture subtle differences among the different classes. By virtue of the SAR-HOG feature and SDDLSR, the proposed method achieves state-of-the-art performance on the MSTAR database. Especially for the extended operating conditions (EOC) scenario "Training 17°-Testing 45°", the proposed method improves remarkably with respect to previous works.
Dung, Le; Mizukawa, Makoto
Adding a reject output to a pattern recognition neural network is an approach that helps the network classify almost all patterns of a training data set by using many sets of weights and biases, even if the network is small. With a smaller number of neurons, the neural network can be implemented on a hardware-based platform more easily, and its response time is reduced. With the reject output the neural network can produce not only right or wrong results but also reject...
implementing, and evaluating many feature selection algorithms. Mucciardi and Gose compared seven different techniques for choosing subsets of pattern... A. Mucciardi and E. Gose, “A comparison of seven techniques for
Ok, Jiheon; Kim, Soochang; Kim, Young-hoon; Lee, Chulhee
Recently, a new concept of device-to-device (D2D) communication called "point-and-link communication" has attracted great attention due to its intuitive and simple operation. This approach enables a user to communicate with target devices without any pre-identification information, such as SSIDs or MAC addresses, by selecting the target image displayed on the user's own device. In this paper, we present an efficient object matching algorithm that can be applied to look(point)-and-link communications for mobile services. Due to the limited channel bandwidth and low computational power of mobile terminals, the matching algorithm should satisfy low-complexity, low-memory and real-time requirements. To meet these requirements, we propose fast and robust feature extraction that takes the descriptor size and processing time into account. The proposed algorithm utilizes an HSV color histogram, SIFT (Scale Invariant Feature Transform) features and object aspect ratios. To reduce the descriptor size to under 300 bytes, a limited number of SIFT key points were chosen as feature points and histograms were binarized while maintaining the required performance. Experimental results show the robustness and the efficiency of the proposed algorithm.
Azmi, Mohd Sanusi; Omar, Khairuddin; Nasrudin, Mohamad Faidzul; Idrus, Bahari; Wan Mohd Ghazali, Khadijah
A novel method is proposed to recognize Arabic/Jawi and Roman digits. This new method is based on features from triangle geometry, normalized into nine features. The features are used for zoning, which results in five and 25 zones. The algorithm is validated using three standard datasets which are publicly available and used by researchers in this field. The first dataset is HODA, which contains 60,000 images for training and 20,000 images for testing. The second dataset is IFHCDB. This dataset has 52,380 isolated characters and 17,740 digits. Only the 17,740 images of digits are used for this research. For Roman digits, MNIST is chosen. The MNIST dataset has 60,000 images for training and 10,000 images for testing. Supervised (SML) and Unsupervised Machine Learning (UML) are used to test the nine features. The SML methods used are Neural Network (NN) and Support Vector Machine (SVM), whereas the UML uses the Euclidean Distance Method with data mining algorithms, namely Mean Average Precision (eMAP) and Frequency Based (eFB). Results of SML testing on the HODA dataset are 98.07% accuracy for SVM and 96.73% for NN. For IFHCDB and MNIST the accuracies are 91.75% and 93.095% respectively. For the UML tests, the accuracy is 93.91% for HODA, 85.94% for IFHCDB and 86.61% for MNIST. The training and test images are selected both randomly and according to the original dataset's distribution. The results show that the accuracy of the proposed algorithm is over 90% for each dataset under SML, and that the highest result is obtained with the 25-zone features.
Background: The signal recognition particle (SRP) receptor plays a vital role in co-translational protein targeting, because it connects the soluble SRP-ribosome-nascent chain complex (SRP-RNCs) to the membrane bound Sec translocon. The eukaryotic SRP receptor (SR) is a heterodimeric protein complex, consisting of two unrelated GTPases. The SRβ subunit is an integral membrane protein, which tethers the SRP-interacting SRα subunit permanently to the endoplasmic reticulum membrane. The prokaryotic SR lacks the SRβ subunit and consists of only the SRα homologue FtsY. Strikingly, although FtsY requires membrane contact for functionality, cell fractionation studies have localized FtsY predominantly to the cytosolic fraction of Escherichia coli. So far, the exact function of the soluble SR in E. coli is unknown, but it has been suggested that, in contrast to eukaryotes, the prokaryotic SR might bind SRP-RNCs already in the cytosol and only then initiate membrane targeting. Results: In the current study we have determined the contribution of soluble FtsY to co-translational targeting in vitro and have re-analysed the localization of FtsY in vivo by fluorescence microscopy. Our data show that FtsY can bind to SRP-ribosome nascent chains (RNCs) in the absence of membranes. However, these soluble FtsY-SRP-RNC complexes are not efficiently targeted to the membrane. In contrast, we observed effective targeting of SRP-RNCs to membrane-bound FtsY. These data show that soluble FtsY does not contribute significantly to cotranslational targeting in E. coli. In agreement with this observation, our in vivo analyses of FtsY localization in bacterial cells by fluorescence microscopy revealed that the vast majority of FtsY was localized to the inner membrane and that soluble FtsY constituted only a negligible species in vivo. Conclusion: The exact function of the SRP receptor (SR) in bacteria has so far been enigmatic. Our data show that the bacterial SR is
Ghabri, Sawsen; Ouarda, Wael; Alimi, Adel M.
Security and surveillance are vital issues in today's world. The recent acts of terrorism have highlighted the urgent need for efficient surveillance. There is indeed a need for an automated video surveillance system which can detect the identity and activity of a person. In this article, we propose a new paradigm to recognize aggressive human behavior such as a boxing action. Our proposed system for human activity detection uses a fusion of Spatio Temporal Interest Point (STIP) and Histogram of Oriented Gradient (HoG) features. The novel feature is called Spatio Temporal Histogram Oriented Gradient (STHOG). To evaluate the robustness of our proposed paradigm, which applies the HoG technique locally on STIP points, we ran experiments on the KTH human action dataset using Multi Class Support Vector Machines classification. The proposed scheme outperforms basic descriptors like HoG and STIP, achieving a classification accuracy of 82.26%.
Shi Jiazheng; Sahiner, Berkman; Chan Heangping; Ge Jun; Hadjiiski, Lubomir; Helvie, Mark A.; Nees, Alexis; Wu Yita; Wei Jun; Zhou Chuan; Zhang Yiheng; Cui Jing
Computer-aided diagnosis (CAD) for characterization of mammographic masses as malignant or benign has the potential to assist radiologists in reducing the biopsy rate without increasing false negatives. The purpose of this study was to develop an automated method for mammographic mass segmentation and explore new image based features in combination with patient information in order to improve the performance of mass characterization. The authors' previous CAD system, which used the active contour segmentation, and morphological, textural, and spiculation features, has achieved promising results in mass characterization. The new CAD system is based on the level set method and includes two new types of image features related to the presence of microcalcifications with the mass and abruptness of the mass margin, and patient age. A linear discriminant analysis (LDA) classifier with stepwise feature selection was used to merge the extracted features into a classification score. The classification accuracy was evaluated using the area under the receiver operating characteristic curve. The authors' primary data set consisted of 427 biopsy-proven masses (200 malignant and 227 benign) in 909 regions of interest (ROIs) (451 malignant and 458 benign) from multiple mammographic views. Leave-one-case-out resampling was used for training and testing. The new CAD system based on the level set segmentation and the new mammographic feature space achieved a view-based A z value of 0.83±0.01. The improvement compared to the previous CAD system was statistically significant (p=0.02). When patient age was included in the new CAD system, view-based and case-based A z values were 0.85±0.01 and 0.87±0.02, respectively. The study also demonstrated the consistency of the newly developed CAD system by evaluating the statistics of the weights of the LDA classifiers in leave-one-case-out classification. Finally, an independent test on the publicly available digital database for screening
Salekin, Randall T; Lester, Whitney S; Sellers, Mary-Kate
The purpose of the current study was to examine the effect of a motivational intervention on conduct problem youth with psychopathic features. Specifically, the current study examined conduct problem youths' mental set (or theory) regarding intelligence (entity vs. incremental) upon task performance. We assessed 36 juvenile offenders with psychopathic features and tested whether providing them with two different messages regarding intelligence would affect their functioning on a task related to academic performance. The study employed a MANOVA design with two motivational conditions and three outcomes including fluency, flexibility, and originality. Results showed that youth with psychopathic features who were given a message that intelligence grows over time, were more fluent and flexible than youth who were informed that intelligence is static. There were no significant differences between the groups in terms of originality. The implications of these findings are discussed including the possible benefits of interventions for adolescent offenders with conduct problems and psychopathic features. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
Zhang, Yuhan; Lu, Dr. Thomas
The objectives of this project were to develop a ROI (Region of Interest) detector using Haar-like features, similar to the face detection in Intel's OpenCV library, implement it in Matlab code, and test the performance of the new ROI detector against the existing ROI detector that uses the Optimal Trade-off Maximum Average Correlation Height filter (OTMACH). The ROI detector included 3 parts: (1) automated Haar-like feature selection to find a small set of the most relevant Haar-like features for detecting ROIs that contained a target; (2) given the small set of Haar-like features from the previous step, a neural network needed to be trained to recognize ROIs with targets by taking the Haar-like features as inputs; (3) using the trained neural network from the previous step, a filtering method needed to be developed to process the neural network responses into a small set of regions of interest. All 3 parts needed to be coded in Matlab. The parameters in the detector needed to be trained by machine learning and tested with specific datasets. Since the OpenCV library and Haar-like features were not available in Matlab, the Haar-like feature calculation needed to be implemented in Matlab. The codes for Adaptive Boosting and max/min filters in Matlab could be found on the Internet but needed to be integrated to serve the purpose of this project. The performance of the new detector was tested by comparing its accuracy and speed against the existing OTMACH detector. The speed was defined as the average speed of finding the regions of interest in an image. The accuracy was measured by the number of false positives (false alarms) at the same detection rate for the two detectors.
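The Haar-like feature calculation mentioned above is usually built on an integral image (summed-area table), which lets any rectangle sum be read with four lookups. The following is a minimal sketch in Python rather than the project's Matlab code; the function names and the toy image are hypothetical, and only one feature type (a horizontal two-rectangle feature) is shown.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half (width must be even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical 3x4 image: dark left half, bright right half.
image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 1, 5, 5],
]
table = integral_image(image)
feature = haar_two_rect_horizontal(table, 0, 0, 3, 4)
```

A strongly negative or positive value indicates a vertical edge inside the window; a feature selection step such as Adaptive Boosting would then pick the windows and feature types that best separate target from non-target ROIs.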
Howard, John J.; Etter, Delores M.
Iris recognition is increasingly being deployed on population-wide scales for important applications such as border security, social service administration, criminal identification and general population management. The error rates for this highly accurate form of biometric identification are established using well-known, laboratory-quality datasets. However, it has long been acknowledged in biometric theory that not all individuals have the same likelihood of being correctly serviced by a biometric system. Typically, techniques for identifying clients that are likely to experience a false non-match or a false match error are carried out on a per-subject basis. This research makes the novel hypothesis that certain ethnic groups are more or less likely to experience a biometric error. Through established statistical techniques, we demonstrate this hypothesis to be true and document the notable effect that the ethnicity of the client has on iris similarity scores. Understanding the expected impact of ethnic diversity on iris recognition accuracy is crucial to the future success of this technology as it is deployed in areas where the target population consists of clientele from a range of geographic backgrounds, such as border crossings and immigration check points.
Xiang, Jie; Chen, Jianping; Lai, ZiLi; Yang, Wei
The Yuan Yang terraces are among the most famous terraces in China and were successfully added to the World Heritage List at the 37th world heritage convention. On the one hand, the Yuan Yang terraces retain more soil and water, which reduces both hydrological connectivity and erosion and supports irrigation. On the other hand, they have important tourism value and bring substantial revenue to local residents. In order to better protect and make use of the Yuan Yang terraces, this study analyzed the spatial distribution and spectral characteristics of the terraces: (1) Through visual interpretation, the study recognized the terraces based on a spatially adjusted remote sensing image (2010 GeoEye-1 with a resolution of 1 m/pixel), and extracted topographic features (elevation, slope, aspect, etc.) based on a digital elevation model with a resolution of 20 m/pixel. The terraces cover a total area of about 11.58 km², accounting for 24.4% of the whole study area. The terraces occur at elevations from 1400 m to 1800 m, on slopes from 10° to 20°, and with aspects from northwest to northeast; (2) Using the weight-of-evidence method, this study assessed the importance of the different topographic features. The results rank their importance as elevation > slope > aspect; (3) The study tracked the changes in the Normalized Difference Vegetation Index (NDVI) of the terraces throughout the year, based on Landsat-5 images with a resolution of 30 m/pixel. The results show that the NDVI changes of the terraces are larger than those of other land cover types (e.g. forest, road, house, etc.). This work lays the groundwork for establishing a dynamic remote sensing monitoring system for the Yuan Yang terraces.
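For readers unfamiliar with NDVI, the index is computed per pixel as (NIR - Red) / (NIR + Red); for the Landsat-5 Thematic Mapper the red and near-infrared channels are bands 3 and 4. The sketch below is illustrative only: the reflectance values are invented, and the helper names are hypothetical rather than taken from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def annual_ndvi_range(series):
    """Spread between the highest and lowest NDVI of a pixel over the year."""
    return max(series) - min(series)


# Invented (NIR, red) reflectance pairs for one terrace pixel and one forest
# pixel at four dates; these are NOT measurements from the study.
terrace = [ndvi(n, r) for n, r in [(0.10, 0.08), (0.35, 0.07), (0.45, 0.05), (0.20, 0.10)]]
forest = [ndvi(n, r) for n, r in [(0.40, 0.05), (0.45, 0.05), (0.46, 0.04), (0.42, 0.05)]]

terrace_change = annual_ndvi_range(terrace)
forest_change = annual_ndvi_range(forest)
```

With values of this kind, a flooded and later cropped terrace swings through a wider NDVI range than evergreen forest, which is the sort of contrast the abstract reports.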
Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.
Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
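As a rough illustration of what "kmer sequence features" means, the sketch below turns a DNA sequence into a normalized k-mer frequency vector that a classifier such as an SVM could consume. It is a simplified assumption-laden example, not the kmer-SVM implementation: it ignores reverse complements and any kernel details, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count every overlapping k-mer made only of A, C, G and T."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
            counts[kmer] += 1
    return counts


def kmer_feature_vector(sequence, k):
    """Frequencies of all 4**k possible k-mers, in lexicographic order."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]


example_counts = kmer_counts("ACGTAC", 2)
example_vector = kmer_feature_vector("ACGTAC", 2)
```

Each position in the vector corresponds to one fixed k-mer, so vectors from different genomic regions are directly comparable.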
Abdul Wahab Muzaffar
Information extraction from unstructured text segments is a complex task. Although manual information extraction often produces the best results, it is harder to manage biomedical data extraction manually because of the exponential increase in data size. Thus, there is a need for automatic tools and techniques for information extraction in biomedical text mining. Relation extraction is a significant area under biomedical information extraction that has gained much importance in the last two decades. A lot of work has been done on biomedical relation extraction focusing on rule-based and machine learning techniques. In the last decade, the focus has changed to hybrid approaches showing better results. This research presents a hybrid feature set for classification of relations between biomedical entities. The main contribution of this research lies in the semantic feature set, where verb phrases are ranked using the Unified Medical Language System (UMLS) and a ranking algorithm. Support Vector Machine and Naïve Bayes, two effective machine learning techniques, are used to classify these relations. Our approach has been validated on the standard biomedical text corpus obtained from MEDLINE 2001. In conclusion, our framework outperforms all state-of-the-art approaches used for relation extraction on the same corpus.
In this paper, we propose a robust tactile sensing image recognition scheme for automatic robotic assembly. First, an image reprocessing procedure is designed to enhance the contrast of the tactile image. In the second layer, geometric features and Fourier descriptors are extracted from the image. Then, kernel principal component analysis (kernel PCA) is applied to transform the features into ones with better discriminating ability, which is the kernel PCA-based feature fusion. The transformed features are fed into the third layer for classification. In this paper, we design a classifier by combining the multiple kernel learning (MKL) algorithm and support vector machine (SVM). We also design and implement a tactile sensing array consisting of 10-by-10 sensing elements. Experimental results, carried out on real tactile images acquired by the designed tactile sensing array, show that the kernel PCA-based feature fusion can significantly improve the discriminating performance of the geometric features and Fourier descriptors. Also, the designed MKL-SVM outperforms the regular SVM in terms of recognition accuracy. The proposed recognition scheme is able to achieve a high recognition rate of over 85% for the classification of 12 commonly used metal parts in industrial applications.
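The abstract does not say how its Fourier descriptors are computed, so the following is only a generic sketch of the idea: treat a closed contour as a sequence of complex numbers, take its discrete Fourier transform, drop the zeroth coefficient (translation), divide by the magnitude of the first (scale), and keep magnitudes only (rotation and starting point). The function name and the toy contour are hypothetical.

```python
import cmath
import math


def fourier_descriptors(contour, n_descriptors):
    """Scale-, translation- and rotation-invariant descriptors of a closed contour.

    contour is a list of (x, y) points in traversal order.
    """
    n = len(contour)
    if n_descriptors + 2 > n:
        raise ValueError("contour has too few points for the requested descriptors")
    points = [complex(x, y) for x, y in contour]

    def dft(k):
        # k-th DFT coefficient of the complex contour signal
        return sum(p * cmath.exp(-2j * math.pi * k * t / n) for t, p in enumerate(points)) / n

    scale = abs(dft(1))
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")
    return [abs(dft(k)) / scale for k in range(2, n_descriptors + 2)]


# Toy rectangle traversed counterclockwise, and a scaled and shifted copy.
rectangle = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]
moved = [(3 * x + 10, 3 * y - 4) for x, y in rectangle]

original_fd = fourier_descriptors(rectangle, 4)
moved_fd = fourier_descriptors(moved, 4)
```

The two descriptor lists agree up to floating-point error, which is the property that makes such descriptors useful for recognizing the same part at different positions and sizes.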
The Myo Armband has become an immersive technology that helps deaf people communicate with each other. The problem with the Myo sensor is its unstable clock rate, which produces data of different lengths for the same period, even for the same gesture. This research proposes the Moment Invariant Method to extract features from the Myo sensor data. This method reduces the amount of data and yields data of equal length. This research is user-dependent, in accordance with the characteristics of the Myo Armband. Testing was performed using the alphabet A to Z in SIBI, an Indonesian sign language, with static and dynamic finger movements. There are 26 alphabet classes with 10 variants in each class. We use min-max normalization to guarantee the range of the data, and the K-Nearest Neighbor method to classify the dataset. Performance analysis with the leave-one-out validation method produced an accuracy of 82.31%. A more advanced classification method is required to improve the detection results.
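The evaluation pipeline described here (min-max normalization, K-Nearest Neighbor, leave-one-out validation) can be sketched in a few lines. This is an illustrative example with invented two-dimensional features, not the moment-invariant features or the data from the study; the function names and the choice k=1 are assumptions.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k):
    """Majority label among the k training samples closest to query."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(features, labels, k=1):
    """Hold out each sample once, classify it with the rest, and average the hits."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, features[i], k) == labels[i]:
            correct += 1
    return correct / len(features)


# Invented features for two gesture classes; the second column has a much
# larger scale, which is why normalization matters for Euclidean distance.
raw = [[1.0, 200.0], [1.2, 220.0], [0.9, 210.0], [3.0, 900.0], [3.2, 950.0], [2.9, 880.0]]
labels = ["A", "A", "A", "B", "B", "B"]

normalized = min_max_normalize(raw)
accuracy = leave_one_out_accuracy(normalized, labels, k=1)
```

One design caveat: normalizing with the minimum and maximum of the whole dataset, as done here for brevity, lets the held-out sample influence the scaling; a stricter protocol would recompute the range inside each fold.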
In recent years, several studies have proposed making use of the Twitter micro-blogging service to track various trends in online media and discussion. In this study, we specifically examine the use of Twitter to track discussions of food safety in the Korean language. Given the irregularity of keyword use in most tweets, we focus on optimistic machine-learning and feature set selection to classify collected tweets. We build the classifier model using Naive Bayes & Naive Bayes Multinomial, Support Vector Machine, and Decision Tree Algorithms, all of which show good performance. To select an optimum feature set, we construct a basic feature set as a standard for performance comparison, so that further test feature sets can be evaluated. Experiments show that precision and F-measure performance are best when using a Naive Bayes Multinomial classifier model with a test feature set defined by extracting Substantive, Predicate, Modifier, and Interjection parts of speech.
Understanding the relationship between protein sequence and molecular recognition selectivity remains a major challenge. The antibody fragment scFv1F4 recognizes with sub nM affinity a decapeptide (sequence 6TAMFQDPQER15) derived from the N-terminal end of human papilloma virus E6 oncoprotein. Using this decapeptide as antigen, we had previously shown that only the wild type amino-acid or conservative replacements were allowed at positions 9 to 12 and 15 of the peptide, indicating a strong binding selectivity. Nevertheless phenylalanine (F) was equally well tolerated as the wild type glutamine (Q) at position 13, while all other amino acids led to weaker scFv binding. The interfaces of complexes involving either Q or F are expected to diverge, due to the different physico-chemistry of these residues. This would imply that high-affinity binding can be achieved through distinct interfacial geometries. In order to investigate this point, we disrupted the scFv-peptide interface by modifying one or several peptide positions. We then analyzed the effect on binding of amino acid changes at the remaining positions, an altered susceptibility being indicative of an altered role in complex formation. The 23 starting variants analyzed contained replacements whose effects on scFv1F4 binding ranged from minor to drastic. A permutation analysis (effect of replacing each peptide position by all other amino acids except cysteine) was carried out on the 23 variants using the PEPperCHIP® Platform technology. A comparison of their permutation patterns with that of the wild type peptide indicated that starting replacements at position 11, 12 or 13 modified the tolerance to amino-acid changes at the other two positions. The interdependence between the three positions was confirmed by SPR (Biacore® technology). Our data demonstrate that binding selectivity does not preclude the existence of alternative high-affinity recognition modes.
As a generalization of the ordinary fuzzy set, the intuitionistic fuzzy set (IFS), characterized by both a membership degree and a nonmembership degree, is a more flexible way to cope with uncertainty. Similarity measures of intuitionistic fuzzy sets are used to indicate the degree of similarity between intuitionistic fuzzy sets. Although many similarity measures for intuitionistic fuzzy sets have been proposed in previous studies, some of them do not satisfy the axioms of similarity or produce counterintuitive results. In this paper, a new similarity measure and a weighted similarity measure between IFSs are proposed. It is proved that the proposed similarity measures satisfy the properties of the axiomatic definition of similarity measures. Comparison between the previous similarity measures and the proposed similarity measure indicates that the proposed similarity measure does not produce any counterintuitive cases. Moreover, it is demonstrated that the proposed similarity measure is capable of discriminating differences between patterns.
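To make the notion of an IFS similarity measure concrete, the sketch below implements a simple Hamming-type similarity, one minus the average absolute difference of membership and nonmembership degrees. It is explicitly not the measure proposed in the paper, whose formula is not given in the abstract; the function names and the optional weights are assumptions.

```python
def validate_ifs(ifs):
    """Check that every (membership, nonmembership) pair is a valid IFS element."""
    for mu, nu in ifs:
        if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
            raise ValueError(f"invalid intuitionistic fuzzy pair: ({mu}, {nu})")


def hamming_similarity(a, b, weights=None):
    """Similarity in [0, 1] between two IFSs defined over the same elements."""
    if len(a) != len(b):
        raise ValueError("both sets must be defined over the same universe")
    validate_ifs(a)
    validate_ifs(b)
    if weights is None:
        weights = [1.0 / len(a)] * len(a)  # unweighted case: equal weights
    distance = sum(
        w * (abs(mu_a - mu_b) + abs(nu_a - nu_b)) / 2.0
        for w, (mu_a, nu_a), (mu_b, nu_b) in zip(weights, a, b)
    )
    return 1.0 - distance


# Hypothetical pattern and two candidate samples over three elements.
pattern = [(0.7, 0.2), (0.5, 0.4), (0.1, 0.8)]
sample_close = [(0.6, 0.3), (0.5, 0.3), (0.2, 0.8)]
sample_far = [(0.1, 0.8), (0.2, 0.7), (0.9, 0.0)]

close_score = hamming_similarity(pattern, sample_close)
far_score = hamming_similarity(pattern, sample_far)
```

In a pattern recognition setting, a sample is assigned to the pattern with the highest similarity; the axioms mentioned in the abstract include boundedness in [0, 1], symmetry, and similarity 1 for identical sets, which this simple measure satisfies.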
Pattern recognition is a scientific discipline that is becoming increasingly important in the age of automation and information handling and retrieval. Pattern Recognition, 2e covers the entire spectrum of pattern recognition applications, from image analysis to speech recognition and communications. This book presents cutting-edge material on neural networks (a set of linked microprocessors that can form associations and use pattern recognition to "learn") and enhances student motivation by approaching pattern recognition from the designer's point of view. A direct result of more than 10
Du, Genyuan; Tian, Shengli; Qiu, Yingyu; Xu, Chunyan
This paper presents an effective and efficient kernel approach to recognizing image sets, each of which is represented as a point on an extended Grassmannian manifold. Several recent studies focus on the applicability of discriminant analysis on the Grassmannian manifold but fail to capture the inherent nonlinear structure of the data itself. Therefore, we propose an extension of the Grassmannian manifold to address this issue. Instead of using a linear data embedding with PCA, we develop a non-linear data embedding of such a manifold using kernel PCA. This paper makes three main contributions: 1) it introduces a non-linear data embedding of the extended Grassmannian manifold, 2) it derives a distance metric on the Grassmannian manifold, and 3) it develops an effective and efficient Grassmannian kernel for SVM classification. The extended Grassmannian manifold naturally arises in image-set-based recognition, such as face and object recognition. Experiments on several standard databases show better classification accuracy. Furthermore, experimental results indicate that our proposed approach significantly reduces time complexity in comparison to graph embedding discriminant analysis.
Nguyen, Dat Tien; Park, Kang Ryoung
With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images.
Popuri, Karteek; Cobzas, Dana; Murtha, Albert; Jägersand, Martin
Brain tumor segmentation is a required step before any radiation treatment or surgery. When performed manually, segmentation is time consuming and prone to human errors. Therefore, there have been significant efforts to automate the process. But, automatic tumor segmentation from MRI data is a particularly challenging task. Tumors have a large diversity in shape and appearance with intensities overlapping the normal brain tissues. In addition, an expanding tumor can also deflect and deform nearby tissue. In our work, we propose an automatic brain tumor segmentation method that addresses these last two difficult problems. We use the available MRI modalities (T1, T1c, T2) and their texture characteristics to construct a multidimensional feature set. Then, we extract clusters which provide a compact representation of the essential information in these features. The main idea in this work is to incorporate these clustered features into the 3D variational segmentation framework. In contrast to previous variational approaches, we propose a segmentation method that evolves the contour in a supervised fashion. The segmentation boundary is driven by the learned region statistics in the cluster space. We incorporate prior knowledge about the normal brain tissue appearance during the estimation of these region statistics. In particular, we use a Dirichlet prior that discourages the clusters from the normal brain region to be in the tumor region. This leads to a better disambiguation of the tumor from brain tissue. We evaluated the performance of our automatic segmentation method on 15 real MRI scans of brain tumor patients, with tumors that are inhomogeneous in appearance, small in size and in proximity to the major structures in the brain. Validation with the expert segmentation labels yielded encouraging results: Jaccard (58%), Precision (81%), Recall (67%), Hausdorff distance (24 mm). Using priors on the brain/tumor appearance, our proposed automatic 3D variational
Yan, Su; Spangler, W Scott; Chen, Ying
Automating the extraction of chemical names from text has significant value to biomedical and life science research. A major barrier in this task is the difficulty of obtaining a sizable, good-quality data set to train a reliable entity extraction model. Another difficulty is the selection of informative features of chemical names, since comprehensive domain knowledge of chemistry nomenclature is required. Leveraging random text generation techniques, we explore the idea of automatically creating training sets for the task of chemical name extraction. Assuming the availability of an incomplete list of chemical names, called a dictionary, we are able to generate well-controlled, random, yet realistic chemical-like training documents. We statistically analyze the construction of chemical names based on the incomplete dictionary, and propose a series of new features, without relying on any domain knowledge. Compared to state-of-the-art models learned from manually labeled data and domain knowledge, our solution shows better or comparable results in annotating real-world data with less human effort. Moreover, we report an interesting observation about the language of chemical names: both the structural and semantic components of chemical names follow a Zipfian distribution, which resembles many natural languages.
The EMG signal indicates the electrophysiological response to activities of daily living, particularly to lower-limb knee exercises. Literature reports have shown numerous benefits of wavelet analysis in EMG feature extraction for pattern recognition. However, its application to typical knee exercises when using only a single EMG channel is limited. In this study, three types of knee exercises, i.e., flexion of the leg up (standing), hip extension from a sitting position (sitting) and gait (walking), are investigated in 14 healthy untrained subjects, while EMG signals from the muscle group of vastus medialis and the goniometer on the knee joint of the detected leg are synchronously monitored and recorded. Four types of lower-limb motions, including standing, sitting, stance phase of walking, and swing phase of walking, are segmented. A Wavelet Transform (WT) based Singular Value Decomposition (SVD) approach is proposed for the classification of the four lower-limb motions using a single-channel EMG signal from the muscle group of vastus medialis. Based on lower-limb motions from all subjects, the combination of five-level wavelet decomposition and SVD is used to form the feature vector. A Support Vector Machine (SVM) is then configured to build a multiple-subject classifier, for which the subject-independent accuracy is given across all subjects for the classification of the four types of lower-limb motions. To benchmark the classification performance, EMG features from the time domain (e.g., Mean Absolute Value (MAV), Root-Mean-Square (RMS), integrated EMG (iEMG), Zero Crossing (ZC)) and the frequency domain (e.g., Mean Frequency (MNF) and Median Frequency (MDF)) are also used to classify lower-limb motions. Five-fold cross validation is performed and repeated fifty times in order to obtain a robust subject-independent accuracy. Results show that the proposed WT-based SVD approach has a classification accuracy of 91.85%±0
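The time-domain baseline features named in this abstract (MAV, RMS, iEMG, ZC) have standard textbook definitions, sketched below for one analysis window. This is an illustration only: the function name, the `zc_threshold` parameter and the toy window are assumptions, and the abstract does not state the windowing or thresholds the authors used.

```python
import math


def time_domain_features(signal, zc_threshold=0.0):
    """Standard time-domain EMG features for a single window of samples."""
    n = len(signal)
    if n == 0:
        raise ValueError("empty window")
    iemg = sum(abs(x) for x in signal)                 # integrated EMG
    mav = iemg / n                                     # mean absolute value
    rms = math.sqrt(sum(x * x for x in signal) / n)    # root mean square
    # A zero crossing is a sign change between neighbours whose amplitude
    # difference exceeds the (assumed) noise threshold.
    zc = sum(
        1
        for a, b in zip(signal, signal[1:])
        if a * b < 0 and abs(a - b) > zc_threshold
    )
    return {"MAV": mav, "RMS": rms, "iEMG": iemg, "ZC": zc}


# Toy window, not real EMG data.
window = [1.0, -1.0, 2.0, -2.0]
features = time_domain_features(window)
```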
Human motion sensing technology has gained tremendous popularity, with practical applications such as video surveillance for security, hand signing, smart homes and gaming. These applications capture human motions in real time from video sensors, and the data patterns are nonstationary and ever changing. While the hardware technology of such motion sensing devices and their data collection process have become relatively mature, the computational challenge lies in the real-time analysis of these live feeds. In this paper we argue that traditional data mining methods fall short of accurately analyzing human activity patterns from the sensor data stream. The shortcoming is due to an algorithmic design that is not adaptive to the dynamic changes in gesture motions. The successor of these algorithms, known as data stream mining, is evaluated against traditional data mining through a case of gesture recognition over motion data using Microsoft Kinect sensors. Three different subjects were asked to read three comic strips and to tell the stories in front of the sensor. The data stream contains coordinates of articulation points and various positions of the parts of the human body corresponding to the actions that the user performs. In particular, a novel feature selection technique using swarm search and accelerated PSO is proposed to enable fast preprocessing for inducing an improved classification model in real time. Superior results are shown in the experiment run on this empirical data stream. The contribution of this paper is a comparative study of traditional and data stream mining algorithms, and the incorporation of the novel improved feature selection technique in a scenario where different gesture patterns are to be recognized from streaming sensor data.
Ma, Wei Ji; Zhou, Xiang; Ross, Lars A; Foxe, John J; Parra, Lucas C
Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
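For intuition about optimal cue integration, the simplest textbook case is two independent Gaussian cues about a one-dimensional quantity, where the optimal estimate weights each cue by its inverse variance. The sketch below shows only that special case; it is not the multidimensional word-recognition model of the paper, and the function and parameter names are hypothetical.

```python
def combine_cues(mu_auditory, var_auditory, mu_visual, var_visual):
    """Inverse-variance weighted fusion of two independent Gaussian cues."""
    if var_auditory <= 0 or var_visual <= 0:
        raise ValueError("variances must be positive")
    precision_a = 1.0 / var_auditory
    precision_v = 1.0 / var_visual
    weight_a = precision_a / (precision_a + precision_v)
    fused_mean = weight_a * mu_auditory + (1.0 - weight_a) * mu_visual
    fused_var = 1.0 / (precision_a + precision_v)
    return fused_mean, fused_var


# A noisy auditory cue (variance 4) and a cleaner visual cue (variance 1).
fused_mean, fused_var = combine_cues(0.0, 4.0, 1.0, 1.0)
```

The fused variance is never larger than that of the better cue, which is the sense in which the combination is optimal in this simplified setting.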
Clustering is a popular technique for explorative analysis of data, as it can reveal subgroupings and similarities between data in an unsupervised manner. While clustering is routinely applied to gene expression data, there is a lack of appropriate general methodology for clustering of sequence-level genomic and epigenomic data, e.g. ChIP-based data. We here introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering to be performed for genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks", and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.
The increasing number of elderly people living independently calls for special care in the form of healthcare monitoring systems. Recent advancements in depth video technologies have made human activity recognition (HAR) realizable for elderly healthcare applications. In this paper, a novel depth video-based method for HAR is presented using robust multi-features and embedded Hidden Markov Models (HMMs) to recognize daily life activities of elderly people living alone in indoor environments such as smart homes. In the proposed HAR framework, depth maps are first analyzed by a temporal motion identification method to segment human silhouettes from the noisy background and compute the depth silhouette area for each activity to track human movements in a scene. Several representative features, including invariant, multi-view differentiation and spatiotemporal body joints features, were fused together to explore gradient orientation change, intensity differentiation, temporal variation and local motion of specific body parts. Then, these features are processed by the dynamics of their respective class and learned, modeled, trained and recognized with a specific embedded HMM having active feature values. Furthermore, we construct a new online human activity dataset with a depth sensor to evaluate the proposed features. Our experiments on three depth datasets demonstrated that the proposed multi-features are efficient and robust compared with state-of-the-art features for human action and activity recognition.
Ryals, Anthony J.; Cleary, Anne M.
Among cues that fail to elicit successful recall, participants can still discriminate between cues that do and do not resemble studied items. This ability is referred to as recognition without cued recall (RWCR). We hypothesized that whereas recognition with cued recall is at least partly based on recalled studied information, RWCR results from a…
In this paper, feature extraction based on data-wave is proposed. The concept of data-wave is introduced to describe the rising and falling trends of the data over the long-term which are detected based on ripple and wave filters. Supported by data-wave, a novel symbol identifier with significant structure features is designed and these features are extracted by constructing pixel chains. On this basis, the corresponding recognition and positioning approach is presented. The effectiveness of the proposed approach is verified by experiments.
T-cell receptors (TCR) play an important role in the adaptive immune system as they recognize pathogen- or cancer-based epitopes and thus initiate the cell-mediated immune response. Therefore there exists a growing interest in the optimization of TCRs for medical purposes like adoptive T-cell therapy. However, the molecular mechanisms behind T-cell signaling are still predominantly unknown. For small sets of TCRs it was observed that the angle between their Vα- and Vβ-domains, which bind the epitope, can vary and might be important for epitope recognition. Here we present a comprehensive, quantitative study of the variation in the Vα/Vβ interdomain-angle and its influence on epitope recognition, performing a systematic bioinformatics analysis based on a representative set of experimental TCR structures. For this purpose we developed a new, cuboid-based superpositioning method, which allows a unique, quantitative analysis of the Vα/Vβ-angles. Angle-based clustering led to six significantly different clusters. Analysis of these clusters revealed the unexpected result that the angle is predominantly influenced by the TCR-clonotype, whereas the bound epitope has only a minor influence. Furthermore we could identify a previously unknown center of rotation (CoR), which is shared by all TCRs. All TCR geometries can be obtained by rotation around this center, rendering it a new, common TCR feature with the potential of improving the accuracy of TCR structure prediction considerably. The importance of Vα/Vβ rotation for signaling was confirmed as we observed larger variances in the Vα/Vβ-angles in unbound TCRs compared to epitope-bound TCRs. Our results strongly support a two-step mechanism for TCR-epitope: first, preformation of a flexible TCR geometry in the unbound state and second, locking of the Vα/Vβ-angle in a TCR-type specific geometry upon epitope-MHC association, the latter being driven by rotation around the unique center of rotation.
Kroll, Christine; von der Werth, Monika; Leuck, Holger; Stahl, Christoph; Schertler, Klaus
For Intelligence, Surveillance, Reconnaissance (ISR) missions of manned and unmanned air systems, typical electro-optical payloads provide high-definition video data which has to be exploited with respect to relevant ground targets in real time by automatic/assisted target recognition software. Airbus Defence and Space has been developing the required technologies for real-time sensor exploitation for years and has combined the latest advances of Deep Convolutional Neural Networks (CNN) with a proprietary high-speed Support Vector Machine (SVM) learning method into a powerful object recognition system with impressive results on relevant high-definition video scenes compared to conventional target recognition approaches. This paper describes the principal requirements for real-time target recognition in high-definition video for ISR missions and the Airbus approach of combining an invariant feature extraction using pre-trained CNNs and the high-speed training and classification ability of a novel frequency-domain SVM training method. The frequency-domain approach allows for a highly optimized implementation for General Purpose Computation on a Graphics Processing Unit (GPGPU) and also an efficient training on large training samples. The selected CNN, which is pre-trained only once on domain-extrinsic data, provides a highly invariant feature extraction. This allows for a significantly reduced adaptation and training of the target recognition method for new target classes and mission scenarios. A comprehensive training and test dataset was defined and prepared using relevant high-definition airborne video sequences. The assessment concept is explained and performance results are given using the established precision-recall diagrams, average precision and runtime figures on representative test data. A comparison to legacy target recognition approaches shows the impressive performance increase by the proposed CNN+SVM machine-learning approach and the capability of real-time high
Han, Yi; Wang, Guoyin; Yang, Yong; He, Kun
Human emotions can be expressed through many bio-symbols, of which speech and facial expression are two. Both are regarded as emotional information that plays an important role in human-computer interaction. Based on our previous studies on emotion recognition, an audiovisual emotion recognition system is developed and presented in this paper. The system is designed for real-time practice, which is ensured by several integrated modules. These modules include speech enhancement for eliminating noise, rapid face detection for locating the face in the background image, example-based shape learning for facial feature alignment, and an optical flow based tracking algorithm for facial feature tracking. It is known that irrelevant features and high dimensionality of the data can hurt the performance of a classifier. Rough set-based feature selection is a good method for dimension reduction. Accordingly, 13 of 37 speech features and 10 of 33 facial features are selected to represent emotional information, and 52 audiovisual features are selected owing to the synchronization when speech and video are fused together. The experimental results demonstrate that this system performs well in real-time practice and has a high recognition rate. Our results also suggest that multi-module fused recognition will become the trend in emotion recognition.
Li, Shelly-Anne; Jeffs, Lianne; Barwick, Melanie; Stevens, Bonnie
Organizational contextual features have been recognized as important determinants for implementing evidence-based practices across healthcare settings for over a decade. However, implementation scientists have not reached consensus on which features are most important for implementing evidence-based practices. The aims of this review were to identify the most commonly reported organizational contextual features that influence the implementation of evidence-based practices across healthcare settings, and to describe how these features affect implementation. An integrative review was undertaken following literature searches in CINAHL, MEDLINE, PsycINFO, EMBASE, Web of Science, and Cochrane databases from January 2005 to June 2017. English language, peer-reviewed empirical studies exploring organizational context in at least one implementation initiative within a healthcare setting were included. Quality appraisal of the included studies was performed using the Mixed Methods Appraisal Tool. Inductive content analysis informed data extraction and reduction. The search generated 5152 citations. After removing duplicates and applying eligibility criteria, 36 journal articles were included. The majority (n = 20) of the study designs were qualitative, 11 were quantitative, and 5 used a mixed methods approach. Six main organizational contextual features (organizational culture; leadership; networks and communication; resources; evaluation, monitoring and feedback; and champions) were most commonly reported to influence implementation outcomes in the selected studies across a wide range of healthcare settings. We identified six organizational contextual features that appear to be interrelated and work synergistically to influence the implementation of evidence-based practices within an organization. Organizational contextual features did not influence implementation efforts independently from other features. Rather, features were interrelated and often influenced each
Alam, Nadia; Doerga, Kirtiedevi B. N. S; Hussain, Tahira; Hussain, Sadia; Holleman, Frits; Kramer, Mark H. H.; Nanayakkara, Prabath W. B.
General practitioners (GPs) and the emergency medical services (EMS) personnel have a pivotal role as points of entry into the acute care chain. This study was conducted to investigate the recognition of sepsis by GPs and EMS personnel and to evaluate the associations between recognition of sepsis
Akroyd, Mike; Jordan, Gary; Rowlands, Paul
People with serious mental illness have reduced life expectancy compared with a control population, much of which is accounted for by significant physical comorbidity. Frontline clinical staff in mental health often lack confidence in recognition, assessment and management of such 'medical' problems. Simulation provides one way for staff to practise these skills in a safe setting. We produced a multidisciplinary simulation course around recognition and assessment of medical problems in psychiatric settings. We describe an audit of strategic and design aspects of the recognition and assessment of medical problems in psychiatric settings course, using the Department of Health's 'Framework for Technology Enhanced Learning' as our audit standards. At the same time as highlighting areas where recognition and assessment of medical problems in psychiatric settings adheres to these identified principles, such as the strategic underpinning of the approach, and the means by which information is collected, reviewed and shared, it also helps us to identify areas where we can improve. © The Author(s) 2014.
Eakins, John P.; Edwards, Jonathan D.; Riley, K. Jonathan; Rosin, Paul L.
Many different kinds of features have been used as the basis for shape retrieval from image databases. This paper investigates the relative effectiveness of several types of global shape feature, both singly and in combination. The features compared include well-established descriptors such as Fourier coefficients and moment invariants, as well as recently-proposed measures of triangularity and ellipticity. Experiments were conducted within the framework of the ARTISAN shape retrieval system, and retrieval effectiveness assessed on a database of over 10,000 images, using 24 queries and associated ground truth supplied by the UK Patent Office . Our experiments revealed only minor differences in retrieval effectiveness between different measures, suggesting that a wide variety of shape feature combinations can provide adequate discriminating power for effective shape retrieval in multi-component image collections such as trademark registries. Marked differences between measures were observed for some individual queries, suggesting that there could be considerable scope for improving retrieval effectiveness by providing users with an improved framework for searching multi-dimensional feature space.
Deng, Jeremiah D; Simmermacher, Christian; Cranefield, Stephen
In tackling data mining and pattern recognition tasks, finding a compact but effective set of features has often been found to be a crucial step in the overall problem-solving process. In this paper, we present an empirical study on feature analysis for recognition of classical instruments, using machine learning techniques to select and evaluate features extracted from a number of different feature schemes. It is revealed that there is significant redundancy between and within feature schemes commonly used in practice. Our results suggest that further feature analysis research is necessary in order to optimize feature selection and achieve better results for the instrument recognition problem.
van der Burg, Eeke; de Leeuw, Jan; Verdegaal, Renée
Homogeneity analysis, or multiple correspondence analysis, is usually applied to k separate variables. In this paper we apply it to sets of variables by using sums within sets. The resulting technique is called OVERALS. It uses the notion of optimal scaling, with transformations that can be multiple
Eimer, Martin; Kiss, Monika; Press, Clare; Sauter, Disa
We investigated the roles of top-down task set and bottom-up stimulus salience for feature-specific attentional capture. Spatially nonpredictive cues preceded search arrays that included a color-defined target. For target-color singleton cues, behavioral spatial cueing effects were accompanied by cue-induced N2pc components, indicative of…
Baeyens, Frank; Vervliet, Bram; Vansteenwegen, Debora; Beckers, Tom; Hermans, Dirk; Eelen, Paul
Using a conditioned suppression task, we investigated simultaneous (XA-/A+) vs. sequential (X [right arrow] A-/A+) Feature Negative (FN) discrimination learning in humans. We expected the simultaneous discrimination to result in X (or alternatively the XA configuration) becoming an inhibitor acting directly on the US, and the sequential…
5.1 Marginal PMFs for the cylinder scene at coarse zoom. 5.2 SAR image of a Nissan Sentra with canonical...of a Nissan Sentra with canonical features extracted by the SPLIT algorithm. 5.2.4 Experiment Summary. A notional algorithm is presented in Figure 5.3
Kalogeropoulou, Zampeta; Jagadeesh, Akshay V; Ohl, Sven; Rolfs, Martin
Many everyday tasks require prioritizing some visual features over competing ones, both during the selection from the rich sensory input and while maintaining information in visual short-term memory (VSTM). Here, we show that observers can change priorities in VSTM when, initially, they attended to a different feature. Observers reported from memory the orientation of one of two spatially interspersed groups of black and white gratings. Using colored pre-cues (presented before stimulus onset) and retro-cues (presented after stimulus offset) predicting the to-be-reported group, we manipulated observers' feature priorities independently during stimulus encoding and maintenance, respectively. Valid pre-cues reliably increased observers' performance (reduced guessing, increased report precision) as compared to neutral ones; invalid pre-cues had the opposite effect. Valid retro-cues also consistently improved performance (by reducing random guesses), even if the unexpected group suddenly became relevant (invalid-valid condition). Thus, feature-based attention can reshape priorities in VSTM protecting information that would otherwise be forgotten.
Khan, Zafar Ali; Sohn, Won
The growing population of elderly people living alone increases the need for automatic healthcare monitoring systems for elderly care. Automatic vision sensor-based systems are increasingly used for human activity recognition (HAR) in recent years. This study presents an improved model, tested using actors, of a sensor-based HAR system to recognize daily life activities of elderly people at home and generate an alert in case of abnormal HAR. Datasets consisting of six abnormal activities (falling backward, falling forward, falling rightward, falling leftward, chest pain, and fainting) and four normal activities (walking, rushing, sitting down, and standing up) are generated from different view angles (90°, -90°, 45°, -45°). Feature extraction and dimensions reduction are performed by R-transform followed by generalized discriminant analysis (GDA) methods. R-transform extracts symmetric, scale, and translation-invariant features from the sequences of activities. GDA increases the discrimination between different classes of highly similar activities. Silhouette sequences are quantified by the Linde-Buzo-Gray algorithm and recognized by hidden conditional random fields. Experimental results provide an average recognition rate of 94.2% for abnormal activities and 92.7% for normal activities. The recognition rate for the highly similar activities from different view angles shows the flexibility and efficacy of the proposed abnormal HAR and alert generation system for elderly care.
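The Linde-Buzo-Gray (LBG) step mentioned above is a vector quantization procedure: a codebook is grown by repeatedly splitting codewords and refining them with nearest-neighbour reassignment. The sketch below is a generic pure-Python version, assuming an additive split perturbation `eps`, a fixed number of refinement passes and a codebook size that is a power of two; none of these settings come from the abstract, and the toy vectors are not silhouette features.

```python
def _nearest(vector, codebook):
    """Index of the codeword closest to `vector` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vector, codebook[i])),
    )


def _centroid(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def lbg_codebook(vectors, size, eps=0.01, passes=20):
    """Grow a codebook by splitting; `size` is assumed to be a power of two."""
    codebook = [_centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c + s for c in word] for word in codebook for s in (eps, -eps)]
        for _ in range(passes):
            cells = [[] for _ in codebook]
            for vector in vectors:
                cells[_nearest(vector, codebook)].append(vector)
            # Empty cells keep their previous codeword.
            codebook = [
                _centroid(cell) if cell else word
                for cell, word in zip(cells, codebook)
            ]
    return codebook


# Toy 2-D feature vectors forming two obvious groups.
vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
codebook = lbg_codebook(vectors, size=2)
symbols = [_nearest(v, codebook) for v in vectors]
```

The resulting `symbols` sequence is the kind of discrete observation stream that a sequence model such as a hidden conditional random field can then consume.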
Marhöfer, David Maximilian; Tosello, Guido; Islam, Aminul
. In the reported work, process simulations using Autodesk Moldflow Insight 2015® are applied to a micro mechanical part to be fabricated by micro injection molding and with over-all dimensions of 12.0 × 3.0 × 0.8 mm³ and micro features (micro hole, diameter of 580 μm, and sharp radii down to 100 μm). Three...
Because face recognition requires storing a large amount of image feature data and executing complex calculations, it has so far been realized only on high-performance PCs. In this paper, OpenCV Haar-like features were used to identify the face region; Principal Component Analysis (PCA) was employed for quick extraction of face features, and the Euclidean distance was adopted for face recognition. In this way, the data volume and computational complexity of face recognition are reduced effectively, and face recognition can be carried out on an embedded platform. Finally, an embedded face recognition system was built on the Tiny6410 embedded platform. The test results showed that the system operates stably, achieves a high recognition rate, and can be used in portable and mobile identification and authentication.
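The PCA-plus-Euclidean-distance matching described above can be illustrated with a deliberately tiny sketch. It keeps only the leading principal axis (found by power iteration) and matches a probe to the nearest gallery entry in that one-dimensional space; the function names, labels and three-element "images" are all hypothetical, and a real system would keep several components and work on face-region pixels.

```python
import math


def mean_vector(rows):
    return [sum(column) / len(rows) for column in zip(*rows)]


def centre(row, mean):
    return [a - m for a, m in zip(row, mean)]


def leading_axis(centred, iterations=50):
    """Power iteration on X^T X; assumes the rows are not all identical."""
    dim = len(centred[0])
    axis = [1.0] * dim  # fixed start vector keeps the sketch deterministic
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, axis)) for row in centred]
        axis = [sum(s * row[j] for s, row in zip(scores, centred)) for j in range(dim)]
        norm = math.sqrt(sum(a * a for a in axis))
        axis = [a / norm for a in axis]
    return axis


def project(row, mean, axis):
    return sum(a * b for a, b in zip(centre(row, mean), axis))


def recognise(probe, gallery, labels):
    """Return the label of the gallery entry nearest to the probe in PCA space."""
    mean = mean_vector(gallery)
    axis = leading_axis([centre(row, mean) for row in gallery])
    codes = [project(row, mean, axis) for row in gallery]
    target = project(probe, mean, axis)
    # With a single component, Euclidean distance is an absolute difference.
    best = min(range(len(codes)), key=lambda i: abs(codes[i] - target))
    return labels[best]


gallery = [[0.0, 0.0, 1.0], [2.0, 2.0, 1.0], [4.0, 4.0, 1.0], [6.0, 6.0, 1.0]]
labels = ["anna", "ben", "cara", "dev"]
match = recognise([3.9, 4.1, 1.0], gallery, labels)
```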
Bijhold, J.; Herk, M. van; Vijlbrief, R.; Lebesque, J.V.
A new fast method is presented for the quantification of patient set-up errors during radiotherapy with external photon beams. The set-up errors are described as deviations in relative position and orientation of specified anatomical structures relative to specified field shaping devices. These deviations are determined from parameters of the image transformations that make their features in a portal image align with the corresponding features in a simulator image. Knowledge of some set-up parameters during treatment simulation is required. The method does not require accurate knowledge about the position of the portal imaging device as long as the positions of some of the field shaping devices are verified independently during treatment. By applying this method, deviations in a pelvic phantom set-up can be measured with a precision of 2 mm within 1 minute. Theoretical considerations and experiments have shown that the method is not applicable when there are out-of-plane rotations larger than 2 degrees or translations larger than 1 cm. Inter-observer variability proved to be a source of large systematic errors, which could be reduced by offering a precise protocol for the feature alignment. (author)
Ahmed, Waseem; Sufyan Beg, M. M.; Ahmad, Tanvir
In Set-valued Information Systems (SIS), several objects contain more than one value for some attributes. Tolerance relation used for handling SIS sometimes leads to loss of certain information. To surmount this problem, fuzzy rough model was introduced. However, in some cases, SIS may contain some real or continuous set-values. Therefore, the existing fuzzy rough model for handling Information system with fuzzy set-values needs some changes. In this paper, Fuzzy Set-valued Information System (FSIS) is proposed and fuzzy similarity relation for FSIS is defined. Yager's relative conditional entropy was studied to find the significance measure of a candidate attribute of FSIS. Later, using these significance values, three greedy forward algorithms are discussed for finding the reduct and relative reduct for the proposed FSIS. An experiment was conducted on a sample population of the real dataset and a comparison of classification accuracies of the proposed FSIS with the existing SIS and single-valued Fuzzy Information Systems was made, which demonstrated the effectiveness of proposed FSIS.
Khan, Adil Mehmood; Siddiqi, Muhammad Hameed; Lee, Seok-Won
Smartphone-based activity recognition (SP-AR) recognizes users' activities using the embedded accelerometer sensor. Only a small number of previous works can be classified as online systems, i.e., the whole process (pre-processing, feature extraction, and classification) is performed on the device. Most of these online systems use either a high sampling rate (SR) or long data-window (DW) to achieve high accuracy, resulting in short battery life or delayed system response, respectively. This paper introduces a real-time/online SP-AR system that solves this problem. Exploratory data analysis was performed on acceleration signals of 6 activities, collected from 30 subjects, to show that these signals are generated by an autoregressive (AR) process, and an accurate AR-model in this case can be built using a low SR (20 Hz) and a small DW (3 s). The high within class variance resulting from placing the phone at different positions was reduced using kernel discriminant analysis to achieve position-independent recognition. Neural networks were used as classifiers. Unlike previous works, true subject-independent evaluation was performed, where 10 new subjects evaluated the system at their homes for 1 week. The results show that our features outperformed three commonly used features by 40% in terms of accuracy for the given SR and DW.
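To make the autoregressive feature idea concrete, the sketch below estimates AR coefficients for one window of 60 samples (3 s at 20 Hz, matching the sampling rate and window length quoted above) using the Yule-Walker equations solved by the Levinson-Durbin recursion. The estimator choice, the model order and the synthetic sine window are assumptions; the abstract does not say how the authors fitted their AR model.

```python
import math


def autocorrelation(samples, max_lag):
    """Biased autocovariance estimates for lags 0..max_lag."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [x - mean for x in samples]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def ar_coefficients(samples, order):
    """Levinson-Durbin solution of Yule-Walker: x[t] ~ sum(a[k] * x[t-1-k])."""
    r = autocorrelation(samples, order)
    coeffs = []
    error = r[0]
    for k in range(1, order + 1):
        acc = r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / error
        coeffs = [
            coeffs[j] - reflection * coeffs[k - 2 - j] for j in range(k - 1)
        ] + [reflection]
        error *= 1.0 - reflection * reflection
    return coeffs


SAMPLING_RATE_HZ = 20
WINDOW_SECONDS = 3
# Synthetic 2 Hz oscillation standing in for one acceleration axis.
window = [
    math.sin(2 * math.pi * 2.0 * t / SAMPLING_RATE_HZ)
    for t in range(SAMPLING_RATE_HZ * WINDOW_SECONDS)
]
features = ar_coefficients(window, order=2)
```

The coefficient list would then serve as the feature vector for that window, before any discriminant analysis or classification.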
Georgiadis, Pantelis; Cavouras, Dionisis; Kalatzis, Ioannis; Glotsos, Dimitris; Athanasiadis, Emmanouil; Kostopoulos, Spiros; Sifaki, Koralia; Malamas, Menelaos; Nikiforidis, George; Solomou, Ekaterini
Three-dimensional (3D) texture analysis of volumetric brain magnetic resonance (MR) images has been identified as an important indicator for discriminating among different brain pathologies. The purpose of this study was to evaluate the efficiency of 3D textural features using a pattern recognition system in the task of discriminating benign, malignant and metastatic brain tissues on T1 postcontrast MR imaging (MRI) series. The dataset consisted of 67 brain MRI series obtained from patients with verified and untreated intracranial tumors. The pattern recognition system was designed as an ensemble classification scheme employing a support vector machine classifier, specially modified in order to integrate the least squares features transformation logic in its kernel function. The latter, in conjunction with using 3D textural features, enabled boosting up the performance of the system in discriminating metastatic, malignant and benign brain tumors with 77.14%, 89.19% and 93.33% accuracy, respectively. The method was evaluated using an external cross-validation process; thus, results might be considered indicative of the generalization performance of the system to "unseen" cases. The proposed system might be used as an assisting tool for brain tumor characterization on volumetric MRI series.
Credit scoring methods are widely used for evaluating loan applications in financial and banking institutions. A credit score identifies whether an applicant belongs to the good-risk or the bad-risk applicant group. These decisions are based on the demographic data of the customers, the customer's overall business with the bank, and the loan payment history of the applicants. The advantages of using credit scoring models include reducing the cost of credit analysis, enabling faster credit decisions and diminishing possible risk. Many statistical and machine learning techniques such as Logistic Regression, Support Vector Machines, Neural Networks and Decision tree algorithms have been used independently and as hybrid credit scoring models. This paper proposes an ensemble-based technique combining seven individual models to increase the classification accuracy. Feature selection has also been used for selecting important attributes for classification. Cross classification was conducted using three data partitions. The German credit dataset, which has 1000 instances and 21 attributes, is used in the present study. The results of the experiments revealed that the ensemble model yielded a very good accuracy when compared to individual models. In all three partitions, the ensemble model correctly classified more than 80% of the loan customers as good creditors. Also, for the 70:30 partition, feature selection had a good impact on the accuracy of the classifiers. The results were improved for almost all individual models as well as for the ensemble model.
Technology, with its rapid leaps forward, has an undeniable impact on every aspect of our life in the new millennium. It supplies us with new affordances almost daily or, more precisely, in a matter of hours. Technology and computers appear to be a breakthrough in terms of their roles in the twenty-first-century educational system. Examples are numerous, among which CALL, CMC, and virtual learning spaces come to mind instantly. Among today's newly developed gadgets are sophisticated smart hand phones, which have moved far beyond the communication tool they were once designed to be. The development of the hand phone as a widespread multi-tasking gadget has urged researchers to investigate its effect on every aspect of the learning process, including language learning. This study attempts to explore the effects of Iranian EFL learners' use of the cell phone audio-recording feature on the development of their speaking skills. Thirty-five sophomore students were enrolled in a study with a pre-test/post-test design. Data on their English speaking experience using the audio-recording features of their hand phones were collected. At the end of the semester, the performance of both groups, treatment and control, was observed, evaluated, and analyzed; it was then examined qualitatively in the next phase. The quantitative outcome lent support to integrating hand phones as part of the language learning curriculum.
In this paper, we present a new approach to offline OCR (optical character recognition) for printed Persian subwords using the wavelet packet transform. The proposed algorithm is used to extract font-invariant and size-invariant features from 87804 subwords in 4 fonts and 3 sizes. The feature vectors are compressed using PCA. The obtained feature vectors yield a pictorial dictionary in which each entry is the mean of a group consisting of the same subword in 4 fonts and 3 sizes. These feature sets are combined with the dot features for the recognition of printed Persian subwords. To evaluate the feature extraction results, the algorithm was tested on a set of 2000 subwords in printed Persian text documents. An encouraging recognition rate of 97.9% was obtained at the subword level.
Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim of monitoring the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA made it possible to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values. PMID:29522443
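The projection of d sensorial features onto k = 2 principal component scores mentioned in this abstract can be sketched as follows. This is a minimal example assuming NumPy and an SVD-based PCA. The function name `pca_scores` and the synthetic feature matrix are hypothetical, and the abstract does not say how the authors computed their components.

```python
import numpy as np


def pca_scores(features, k=2):
    """Project rows of `features` (n samples x d features) onto the top-k components."""
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)          # PCA operates on mean-centred data
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:k].T           # linear projection into k dimensions
    explained = s**2 / np.sum(s**2)        # fraction of variance per component
    return scores, explained[:k]


# Synthetic stand-in for sensor features: 6 columns driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 6))
sensor_features = latent @ mixing

scores, explained_ratio = pca_scores(sensor_features, k=2)
```

Because the synthetic data have only two underlying factors, two components capture essentially all of the variance. Real sensor features would need the explained-variance ratio checked before settling on k.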
Mahieu, Nathaniel G; Patti, Gary J
When using liquid chromatography/mass spectrometry (LC/MS) to perform untargeted metabolomics, it is now routine to detect tens of thousands of features from biological samples. Poor understanding of the data, however, has complicated interpretation and masked the number of unique metabolites actually being measured in an experiment. Here we place an upper bound on the number of unique metabolites detected in Escherichia coli samples analyzed with one untargeted metabolomics method. We first group multiple features arising from the same analyte, which we call "degenerate features", using a context-driven annotation approach. Surprisingly, this analysis revealed thousands of previously unreported degeneracies that reduced the number of unique analytes to ∼2961. We then applied an orthogonal approach to remove nonbiological features from the data using the ¹³C-based credentialing technology. This further reduced the number of unique analytes to fewer than 1000. Our 90% reduction in data is 5-fold greater than previously published studies. On the basis of the results, we propose an alternative approach to untargeted metabolomics that relies on thoroughly annotated reference data sets. To this end, we introduce the creDBle database (http://creDBle.wustl.edu), which contains accurate mass, retention time, and MS/MS fragmentation data as well as annotations of all credentialed features.
Pardos, Maria; Korostenskaja, Milena; Xiang, Jing; Fujiwara, Hisako; Lee, Ki H.; Horn, Paul S.; Byars, Anna; Vannest, Jennifer; Wang, Yingying; Hemasilpin, Nat; Rose, Douglas F.
Objective evaluation of language function is critical for children with intractable epilepsy under consideration for epilepsy surgery. The purpose of this preliminary study was to evaluate word recognition in children with intractable epilepsy by using magnetoencephalography (MEG). Ten children with intractable epilepsy (M/F 6/4, mean ± SD 13.4 ± 2.2 years) were matched on age and sex to healthy controls. Common nouns were presented simultaneously from visual and auditory sensory inputs in “match” and “mismatch” conditions. Neuromagnetic responses M1, M2, M3, M4, and M5 with latencies of ~100 ms, ~150 ms, ~250 ms, ~350 ms, and ~450 ms, respectively, elicited during the “match” condition were identified. Compared to healthy children, epilepsy patients had both significantly delayed latency of the M1 and reduced amplitudes of M3 and M5 responses. These results provide neurophysiologic evidence of altered word recognition in children with intractable epilepsy. PMID:26146459
Popp, Margot; Trumpp, Natalie M; Kiefer, Markus
Grounded cognition theories suggest that conceptual representations essentially depend on modality-specific sensory and motor systems. Feature-specific brain activation across different feature types such as action or audition has been intensively investigated in nouns, while research on feature-specific conceptual category differences in verbs has mainly focused on body part specific effects. The present work aimed at assessing whether feature-specific event-related potential (ERP) differences between action and sound concepts, as previously observed in nouns, can also be found within the word class of verbs. In Experiment 1, participants were visually presented with carefully matched sound and action verbs within a lexical decision task, which provides implicit access to word meaning and minimizes strategic access to semantic word features. Experiment 2 tested whether pre-activating the verb concept in a context phase, in which the verb is presented with a related context noun, modulates subsequent feature-specific action vs. sound verb processing within the lexical decision task. In Experiment 1, ERP analyses revealed a differential ERP polarity pattern for action and sound verbs at parietal and central electrodes similar to previous results in nouns. Pre-activation of the meaning of verbs in the preceding context phase in Experiment 2 resulted in a polarity reversal of feature-specific ERP effects in the lexical decision task compared with Experiment 1. This parallels analogous earlier findings for primed action and sound related nouns. In line with grounded cognition theories, our ERP study provides evidence for a differential processing of action and sound verbs similar to earlier observations for concrete nouns. Although the localizational value of ERPs must be viewed with caution, our results indicate that the meaning of verbs is linked to different neural circuits depending on conceptual feature relevance.
Hargreaves, Jo; Blomberg, Davinia
The nature of apprenticeships is changing. Increasing proportions of adult apprentices are prompting demand for various alternative pathways to completion. One option for an alternative pathway to accelerate completion is the use of recognition of prior learning (RPL) to identify existing skills and knowledge in combination with gap training. This…
Despite not knowing the exact age of individuals, humans can estimate their rough age using age-related physical features. Nonhuman primates show some age-related physical features; however, the cognitive traits underlying their recognition of age class have not been revealed. Here, we tested the ability of two species of Old World monkey, Japanese macaques (JM) and Campbell's monkeys (CM), to spontaneously discriminate age classes using visual paired comparison (VPC) tasks based on the two distinct categories of infant and adult images. First, VPCs were conducted in JM subjects using conspecific JM stimuli. When analyzing the side of the first look, JM subjects significantly looked more often at novel images. Based on analyses of total looking durations, JM subjects looked at a novel infant image longer than they looked at a familiar adult image, suggesting the ability to spontaneously discriminate between the two age classes and a preference for infant over adult images. Next, VPCs were tested in CM subjects using heterospecific JM stimuli. CM subjects showed no difference in the side of their first look, but looked at infant JM images longer than they looked at adult images; the fact that CMs were totally naïve to JMs suggested that the attractiveness of infant images transcends species differences. This is the first report of visual age class recognition and a preference for infant over adult images in nonhuman primates. Our results suggest not only species-specific processing for age class recognition but also the evolutionary origins of the instinctive human perception of baby cuteness schema, proposed by the ethologist Konrad Lorenz.
Ziegler, Philipp; Wartzack, Sandro
Usually, the geometry of the manufactured product inherently varies from the nominal geometry. This may negatively affect the product functions and properties (such as quality and reliability), as well as the assemblability of the single components. In order to avoid this, the geometric variation of these component surfaces and associated geometry elements (like hole axes) is restricted by tolerances. Since tighter tolerances lead to significantly higher manufacturing costs, tolerances should be specified carefully. Therefore, the impact of deviating component surfaces on functions, properties and assemblability of the product has to be analyzed. As physical experiments are expensive, statistical tolerance analysis tools are widely used in engineering design. Current tolerance simulation tools lack an appropriate indicator for the impact of deviating component surfaces. In the adoption of Sensitivity Analysis methods, there are several challenges, which arise from the specific framework in tolerancing. This paper presents an approach to adopt Sensitivity Analysis methods on current tolerance simulations with an interface module, which is based on level sets of constraint functions for parameters of the simulation model. The paper is an extension and generalization of Ziegler and Wartzack . Mathematical properties of the constraint functions (convexity, homogeneity), which are important for the computational costs of the Sensitivity Analysis, are shown. The practical use of the method is illustrated in a case study of a plain bearing. - Highlights: • Alternative definition of Deviation Domains. • Proof of mathematical properties of the Deviation Domains. • Definition of the interface between Deviation Domains and Sensitivity Analysis. • Sensitivity analysis of a gearbox to show the method's practical use
Madsen, Rasmus Elsborg; Larsen, Jan; Hansen, Lars Kai
Language independent 'bag-of-words' representations are surprisingly effective for text classification. In this communication our aim is to elucidate the synergy between language independent features and simple language model features. We consider term tag features estimated by a so-called part...... and a probabilistic neural network classifier. Three medium size data-sets are analyzed and we find consistent synergy between the term and natural language features in all three sets for a range of training set sizes. The most significant enhancement is found for small text databases where high recognition...
Biondich, Paul G; Overhage, J Marc; Dexter, Paul R; Downs, Stephen M; Lemmon, Larry; McDonald, Clement J
Advances in optical character recognition (OCR) software and computer hardware have stimulated a reevaluation of the technology and its ability to capture structured clinical data from preexisting paper forms. In our pilot evaluation, we measured the accuracy and feasibility of capturing vitals data from a pediatric encounter form that has been in use for over twenty years. We found that the software had a digit recognition rate of 92.4% (95% confidence interval: 91.6 to 93.2) overall. More importantly, this system was approximately three times as fast as our existing method of data entry. These preliminary results suggest that with further refinements in the approach and additional development, we may be able to incorporate OCR as another method for capturing structured clinical data.
A. V. Kaminsky
Infertility is one of those conditions that significantly affect the psycho-emotional status of a person, causing a state of chronic stress. In turn, chronic stress can lead to the development of stress-induced infertility. The aim of the study was to identify features of the reproductive setting of men and women who are patients of assisted reproductive technology (ART) programs in connection with reproductive behavior. Material and methods. The study observed 233 women and men who needed infertility treatment using ART methods, and 142 fertile women and men who had already had births and applied for pre-gestational preparation before planning another pregnancy. Methods of psychological testing were used. Results. It has been established that the reproductive setting of infertile men and women is uncertain (contradictory); there is discrepancy and ambivalence in the content of its affective, cognitive and conative components. The reproductive setting of individuals having children is definite (harmonious); there is consistency in the content of affective, cognitive and conative components. There are gender differences in the components of the reproductive setting, both among infertile individuals and among those with children. There is a connection between the type of reproductive setting and personality characteristics, the relation to the spouse, and the motives for the birth of the child. Conclusions. The reproductive settings of infertile men and women who are ART patients differ from those of mothers and fathers with newborn babies and require psychological correction.
Mosquera Lopez, Clara; Agaian, Sos
Prostate cancer detection and staging is an important step towards patient treatment selection. Advancements in digital pathology allow the application of new quantitative image analysis algorithms for computer-assisted diagnosis (CAD) on digitized histopathology images. In this paper, we introduce a new set of features to automatically grade pathological images using the well-known Gleason grading system. The goal of this study is to classify biopsy images belonging to Gleason patterns 3, 4, and 5 by using a combination of wavelet and fractal features. For image classification we use pairwise coupling Support Vector Machine (SVM) classifiers. The accuracy of the system, which is close to 97%, is estimated through three different cross-validation schemes. The proposed system offers the potential for automating classification of histological images and supporting prostate cancer diagnosis.
Polycomb repressive complex 2 (PRC2), a histone H3 lysine 27 methyltransferase, plays a key role in gene regulation and is a known epigenetic drug target for cancer therapy. The WD40 domain-containing protein EED is the regulatory subunit of PRC2. It binds to the tri-methylated lysine 27 of histone H3 (H3K27me3), and thereby stimulates the activity of PRC2 allosterically. Recently, we disclosed a novel PRC2 inhibitor, EED226, which binds to the K27me3 pocket on EED and showed strong antitumor activity in a xenograft mouse model. Here, we further report the identification and validation of four other EED binders along with EED162, the parental compound of EED226. The crystal structures of all five compounds in complex with EED revealed a common deep pocket induced by the binding of this diverse set of compounds. This pocket was created after significant conformational rearrangement of the aromatic cage residues (Y365, Y148 and F97) in the H3K27me3 binding pocket of EED, the width of which was delineated by the side chains of these rearranged residues. In addition, all five compounds interact with Arg367 at the bottom of the pocket. Each compound also displays unique features in its interaction with EED, suggesting the dynamics of the H3K27me3 pocket in accommodating the binding of different compounds. Our results provide structural insights for the rational design of novel EED binders for the inhibition of PRC2 complex activity.
Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther
Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...
Dowding, Dawn; Lichtner, Valentina; Allcock, Nick; Briggs, Michelle; James, Kirstin; Keady, John; Lasrado, Reena; Sampson, Elizabeth L; Swarbrick, Caroline; José Closs, S
The recognition, assessment and management of pain in hospital settings is suboptimal, and is a particular challenge in patients with dementia. The existing process guiding pain assessment and management in clinical settings is based on the assumption that nurses follow a sequential linear approach to decision making. In this paper we re-evaluate this theoretical assumption drawing on findings from a study of pain recognition, assessment and management in patients with dementia. To provide a revised conceptual model of pain recognition, assessment and management based on sense-making theories of decision making. The research we refer to is an exploratory ethnographic study using nested case sites. Patients with dementia (n=31) were the unit of data collection, nested in 11 wards (vascular, continuing care, stroke rehabilitation, orthopaedic, acute medicine, care of the elderly, elective and emergency surgery), located in four NHS hospital organizations in the UK. Data consisted of observations of patients at bedside (170h in total); observations of the context of care; audits of patient hospital records; documentary analysis of artefacts; semi-structured interviews (n=56) and informal open conversations with staff and carers (family members). Existing conceptualizations of pain recognition, assessment and management do not fully explain how the decision process occurs in clinical practice. Our research indicates that pain recognition, assessment and management is not an individual cognitive activity; rather it is carried out by groups of individuals over time and within a specific organizational culture or climate, which influences both health care professional and patient behaviour. We propose a revised theoretical model of decision making related to pain assessment and management for patients with dementia based on theories of sense-making, which is reflective of the reality of clinical decision making in acute hospital wards. The revised model recognizes the
Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean
This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group, known as in-group advantage, results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception.
Lowman, Douglas W; Greene, Rachel R; Bearden, Daniel W; Kruppa, Michael D; Pottier, Max; Monteiro, Mario A; Soldatov, Dmitriy V; Ensley, Harry E; Cheng, Shih-Chin; Netea, Mihai G; Williams, David L
The innate immune system differentially recognizes Candida albicans yeast and hyphae. It is not clear how the innate immune system effectively discriminates between yeast and hyphal forms of C. albicans. Glucans are major components of the fungal cell wall and key fungal pathogen-associated molecular patterns. C. albicans yeast glucan has been characterized; however, little is known about glucan structure in C. albicans hyphae. Using an extraction procedure that minimizes degradation of the native structure, we extracted glucans from C. albicans hyphal cell walls. ¹H NMR data analysis revealed that, when compared with reference (1→3,1→6) β-linked glucans and C. albicans yeast glucan, hyphal glucan has a unique cyclical or "closed chain" structure that is not found in yeast glucan. GC/MS analyses showed a high abundance of 3- and 6-linked glucose units when compared with yeast β-glucan. In addition to the expected (1→3), (1→6), and 3,6 linkages, we also identified a 2,3 linkage that has not been reported previously in C. albicans. Hyphal glucan induced robust immune responses in human peripheral blood mononuclear cells and macrophages via a Dectin-1-dependent mechanism. In contrast, C. albicans yeast glucan was a much less potent stimulus. We also demonstrated the capacity of C. albicans hyphal glucan, but not yeast glucan, to induce IL-1β processing and secretion. This finding provides important evidence for understanding the immune discrimination between colonization and invasion at the mucosal level. When taken together, these data provide a structural basis for differential innate immune recognition of C. albicans yeast versus hyphae.
Bascil, M Serdar; Tesneli, Ahmet Y; Temurtas, Feyzullah
A brain-computer interface (BCI) is a new means of communication between man and machine. It identifies mental task patterns stored in the electroencephalogram (EEG): it extracts brain electrical activity recorded by EEG and transforms it into machine control commands. The main goal of BCI is to make assistive environmental devices, such as computers, available to paralyzed people and to make their lives easier. This study deals with feature extraction and mental task pattern recognition for 2-D cursor control from EEG as an offline analysis approach. The hemispherical power density changes are computed and compared on alpha-beta frequency bands with only mental imagination of cursor movements. First, power spectral density (PSD) features of the EEG signals are extracted, and the high-dimensional data are reduced by principal component analysis (PCA) and independent component analysis (ICA), which are statistical algorithms. In the last stage, all features are classified with two types of support vector machine (SVM), linear and least squares (LS-SVM), and three different artificial neural network (ANN) structures, namely learning vector quantization (LVQ), multilayer neural network (MLNN) and probabilistic neural network (PNN), and mental task patterns are successfully identified via the k-fold cross-validation technique.
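A band-limited PSD feature and a left/right hemispheric comparison of the kind this abstract describes could be sketched as below. The example assumes NumPy and a plain periodogram. The band limits (8 to 13 Hz for alpha, 13 to 30 Hz for beta), the 128 Hz sampling rate, the function name `band_power` and the two synthetic channels are all assumptions made for illustration, not values taken from the study.

```python
import numpy as np


def band_power(signal, fs, low_hz, high_hz):
    """Sum of periodogram PSD values with low_hz <= f < high_hz."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                               # remove the DC offset
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(spectrum) ** 2 / (fs * len(x))    # simple periodogram estimate
    in_band = (freqs >= low_hz) & (freqs < high_hz)
    return psd[in_band].sum()


fs = 128                                           # assumed sampling rate in Hz
t = np.arange(2 * fs) / fs                         # two seconds of samples
left_channel = np.sin(2 * np.pi * 10 * t)          # synthetic 10 Hz rhythm
right_channel = 0.5 * np.sin(2 * np.pi * 10 * t)   # same rhythm, half amplitude

alpha_left = band_power(left_channel, fs, 8, 13)
alpha_right = band_power(right_channel, fs, 8, 13)
beta_left = band_power(left_channel, fs, 13, 30)

# Hemispheric difference in alpha power, one possible feature for a classifier.
alpha_asymmetry = alpha_left - alpha_right
```

A windowed estimator such as Welch's method would usually be preferred on real EEG; the bare periodogram is used here only to keep the sketch dependency-free.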
Sun, Xin; Yang, Jianping; Wang, Changgang; Dong, Junyu; Wang, Xinhua
Quantitative and statistical analysis of ocean creatures is critical to ecological and environmental studies, and living fish recognition is one of the most essential requirements for the fishery industry. However, light attenuation and scattering are present in the underwater environment, which makes underwater images low-contrast and blurry. This paper aims to design a robust framework for accurate fish recognition. The framework introduces a two-stage PCA Network to extract abstract features from fish images. On a real-world fish recognition dataset, we use a linear SVM classifier and set penalty coefficients to address the data imbalance issue. Feature visualization results show that our method can avoid feature distortion in the boundary regions of underwater images. Experimental results show that the PCA Network can extract discriminative features and achieve promising recognition accuracy. The framework improves the recognition accuracy of underwater living fishes and can be easily applied to the marine fishery industry.
Buccheri, R.; Coffaro, P.; Di Gesu, V.; Salemi, S.; Colomba, G.
Preliminary results are given of the application of a direct non-parametric pattern recognition method to the classification of the pictures of a multiwire spark chamber. The method, developed in an earlier work for an optical spark chamber, looks promising. With respect to the previous one, the picture sample used has the following characteristics: a) the event pictures have a more complicated structure; b) the amount of background sparks in an event is greater; c) there exists a kind of noise which is almost always present in some structured way (double sparkling, bursts...). New features have been used to characterize the event pictures; the results show that the method could also be used as a super filter to reduce the cost of further analysis. (Auth.)
Basharirad, Babak; Moradhaseli, Mohammadreza
Recently, research attention to emotional speech signals in human-machine interfaces has grown, owing to the availability of high computation capability. Many systems have been proposed in the literature to identify the emotional state through speech. Selecting suitable feature sets, designing proper classification methods and preparing an appropriate dataset are the key issues of speech emotion recognition systems. This paper critically analyzes the currently available approaches to speech emotion recognition based on three evaluation parameters (feature set, classification of features, accurate usage). In addition, this paper evaluates the performance and limitations of the available methods. Furthermore, it highlights the current promising directions for improvement of speech emotion recognition systems.
Background: The objective of this study was to describe the audiologic and related characteristics of a group of patients with speech perception affected out of proportion to pure tone hearing loss. A case series of patients was referred for evaluation and management to the Hearing Research Center. The aim was to describe the clinical picture of patients whose key clinical feature is hearing loss for pure tones with a reduction in speech discrimination out of proportion to the pure tone loss, having some of the criteria of auditory neuropathy (i.e. normal otoacoustic emissions, OAE, and abnormal auditory brainstem evoked potentials, ABR) and lacking others (e.g. present auditory reflexes). Methods: Hearing abilities were measured by Pure Tone Audiometry (PTA) and Speech Discrimination Scores (SDS), measured in all patients using a standardized list of 25 monosyllabic Farsi words at MCL in quiet. Auditory pathway integrity was measured using Auditory Brainstem Response (ABR) and Otoacoustic Emission (OAE), and anatomical lesions were assessed by Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) of the brain and retrocochlea. The series included 35 patients with SDS disproportionately low with regard to PTA, absent ABR waves and normal OAE. Results: All patients reported the beginning of their problem around adolescence. None of them had an anatomical lesion in imaging studies and none had any finding suggestive of a conductive hearing lesion. Although in most cases the hearing loss was more apparent in the lower frequencies (i.e. 1000 Hz and less), a stronger correlation was found between SDS and hearing threshold at higher frequencies. These patients may not benefit from hearing aids, as the outer hair cells are functional and amplification doesn't seem to help, though it was tried for all. Conclusion: These patients share a pattern of sensorineural loss with no detectable lesion. The age of onset and the gradual
NSGIC Local Govt | GIS Inventory — Election Districts and Precincts dataset current as of 1991. PrecinctPoly-The data set is a polygon feature consisting of 220 segments representing voter precinct...
Shiri, Isaac; Abdollahi, Hamid [Iran University of Medical Sciences, Department of Medical Physics, School of Medicine, Tehran (Iran, Islamic Republic of); Rahmim, Arman [Johns Hopkins University, Department of Radiology, Baltimore, MD (United States); Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD (United States); Ghaffarian, Pardis [Shahid Beheshti University of Medical Sciences, Chronic Respiratory Diseases Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Tehran (Iran, Islamic Republic of); Shahid Beheshti University of Medical Sciences, PET/CT and Cyclotron Center, Masih Daneshvari Hospital, Tehran (Iran, Islamic Republic of); Geramifar, Parham [Tehran University of Medical Sciences, Research Center for Nuclear Medicine, Shariati Hospital, Tehran (Iran, Islamic Republic of); Bitarafan-Rajabi, Ahmad [Iran University of Medical Sciences, Department of Medical Physics, School of Medicine, Tehran (Iran, Islamic Republic of); Iran University of Medical Sciences, Department of Nuclear Medicine, Rajaei Cardiovascular, Medical and Research Center, Tehran (Iran, Islamic Republic of)
The purpose of this study was to investigate the robustness of different PET/CT image radiomic features over a wide range of different reconstruction settings. Phantom and patient studies were conducted, including two PET/CT scanners. Different reconstruction algorithms and parameters including number of sub-iterations, number of subsets, full width at half maximum (FWHM) of Gaussian filter, scan time per bed position and matrix size were studied. Lesions were delineated and one hundred radiomic features were extracted. All radiomics features were categorized based on coefficient of variation (COV). Forty seven percent features showed COV ≤ 5% and 10% of which showed COV > 20%. All geometry based, 44% and 41% of intensity based and texture based features were found as robust respectively. In regard to matrix size, 56% and 6% of all features were found non-robust (COV > 20%) and robust (COV ≤ 5%) respectively. Variability and robustness of PET/CT image radiomics in advanced reconstruction settings is feature-dependent, and different settings have different effects on different features. Radiomic features with low COV can be considered as good candidates for reproducible tumour quantification in multi-center studies. (orig.)
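The coefficient of variation (COV) categorization described above can be illustrated with a short sketch. The thresholds (COV ≤ 5% robust, COV > 20% non-robust) come from the abstract; the "intermediate" label for values in between, the use of the sample standard deviation, and the function names are assumptions made for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """COV in percent: sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("COV is undefined for a zero mean")
    return 100.0 * stdev(values) / abs(m)


def robustness_category(cov_percent):
    """Map a COV value to the categories named in the abstract."""
    if cov_percent <= 5:
        return "robust"
    if cov_percent > 20:
        return "non-robust"
    return "intermediate"  # label assumed; the abstract names only the extremes


# Hypothetical values of one feature across four reconstruction settings.
feature_values = [10.1, 9.9, 10.0, 10.2]
cov = coefficient_of_variation(feature_values)
print(round(cov, 2), robustness_category(cov))
```

Here each list would hold one radiomic feature measured on the same lesion under different reconstruction settings.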
.... (4) Invariants: both geometric and other types. (5) Human faces: Analysis of images of human faces, including feature extraction, face recognition, compression, and recognition of facial expressions...
... digital filtering for noise cancellation which interfaces to speech recognition software. It uses auditory features in speech recognition training, and provides applications to multilingual spoken language translation...
Emotion recognition has become a fundamental task in human-computer interaction systems. In this article, we propose an emotion recognition approach based on biologically inspired methods. Specifically, emotion classification is performed using a long short-term memory (LSTM) recurrent neural network which is able to recognize long-range dependencies between successive temporal patterns. We propose to represent data using features derived from two different models: mel-frequency cepstral coefficients (MFCC) and the Lyon cochlear model. In the experimental phase, results obtained from the LSTM network and the two different feature sets are compared, showing that features derived from the Lyon cochlear model give better recognition results in comparison with those obtained with the traditional MFCC representation.
A large number of parameters are acquired during practical water quality monitoring. If all the parameters are used in water quality assessment, the computational complexity will definitely increase. In order to reduce the input space dimensions, a fuzzy rough set was introduced to perform attribute reduction. Then, an attribute recognition theoretical model and the entropy method were combined to assess water quality in the Harbin reach of the Songhuajiang River in China. A dataset consisting of ten parameters was collected from January to October in 2012. The fuzzy rough set was applied to reduce the ten parameters to four parameters: BOD5, NH3-N, TP, and F. coli (Reduct A). Considering that DO is a commonly used parameter in water quality assessment, another reduct, including DO, BOD5, NH3-N, TP, TN, F, and F. coli (Reduct B), was obtained. The assessment results of Reduct B show a good consistency with those of Reduct A, and this means that DO is not always necessary to assess water quality. The results with attribute reduction are not exactly the same as those without attribute reduction, which can be attributed to the α value decided by subjective experience. The assessment results gained by the fuzzy rough set obviously reduce computational complexity, and are acceptable and reliable. The model proposed in this paper enhances the water quality assessment system.
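The abstract above names an "entropy method" without giving its form. A minimal sketch of the widely used entropy weight method is shown below, on the assumption that this is the variant meant: each parameter (column) gets a weight that grows with how unevenly its values are distributed across samples. The function name and the sample matrix are hypothetical, and the values must be non-negative.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: one weight per column of a non-negative matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [x / total for x in column]
        # Shannon entropy normalized to [0, 1] by dividing by ln(n_rows).
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n_rows)
        # Clamp tiny negative values caused by floating-point rounding.
        diversities.append(max(0.0, 1.0 - entropy))
    total_diversity = sum(diversities)
    if total_diversity == 0:
        return [1.0 / n_cols] * n_cols  # all columns uniform: equal weights
    return [d / total_diversity for d in diversities]


# Hypothetical data: rows are sampling months, columns are two parameters.
samples = [[1.0, 1.0], [1.0, 3.0], [1.0, 5.0]]
print(entropy_weights(samples))
```

A column whose values never change carries no discriminating information and therefore receives a weight close to zero.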
Boyle, Vicki L; Roychoudhury, Canopy; Beniak, Renee; Cohn, Lisa; Bayer, Albert; Katz, Ira
Depression is a common disorder associated with suffering, morbidity, and mortality in nursing home residents. It is treatable, and improving the quality of treatment can have a major impact. MPRO, Michigan's Quality Improvement Organization, initiated a quality-improvement project in 14 nursing facilities to improve the accuracy of assessments, targeting, and monitoring of care. Electronic Minimum Data Set (MDS) data and medical-record abstraction results were combined to form the analytic dataset. Findings from the baseline phase demonstrated that, according to medical and administrative records, 26% of newly admitted nursing home residents had symptoms of depression that were apparent at admission, and an additional 12% were recognized early in their stay. Eighty-one percent of residents with depression were receiving treatment on admission to the facility, and 79% of those with depression recognized by Day 14 were treated by then. These data demonstrate progress toward improving the initiation of treatment for depression in nursing homes; however, there are still opportunities for improving the quality of care and, especially, the quality of assessments. The authors recommend the addition of the Geriatric Depression Scale to the federally mandated MDS for cognitively intact patients. There could also be mechanisms to ensure that providers and facilities follow recommended practice guidelines. Initiating treatment with antidepressant medications should be followed with monitoring of residents to identify those who still have depressive symptoms and to modify or intensify their treatment.
Vecchiarelli, Kelly; Amar, Arun Paul; Emanuele, Donna
Pulsatile tinnitus is a whooshing sound heard synchronous with the heartbeat. It is an uncommon symptom affecting fewer than 10% of patients with tinnitus. It often goes unrecognized in the primary care setting. Failure to recognize this symptom can result in a missed or delayed diagnosis of a potentially life-threatening condition known as a dural arteriovenous fistula. The purpose of this case study is to provide a structured approach to the identification of pulsatile tinnitus and provide management recommendations. A case study and review of pertinent literature. Pulsatile tinnitus usually has a vascular treatable cause. A comprehensive history and physical examination will alert the nurse practitioner (NP) when pulsatile tinnitus is present. Auscultation in specific areas of the head can detect audible or objective pulsatile tinnitus. Pulsatile tinnitus that is audible to the examiner is an urgent medical condition requiring immediate consultation and referral. Knowledge of pulsatile tinnitus and awareness of this often treatable condition directs the NP to perform a detailed assessment when patients present with tinnitus, directs appropriate referral for care and treatment, and can reduce the risk of delayed or missed diagnosis. ©2017 American Association of Nurse Practitioners.
Xing, Yin; Chen, Chuang; Liu, Li-Long
To address the large amount of redundant data in high-dimensional speech emotion features, we analyze the extracted speech emotion features in depth and select the better ones. Firstly, a given emotion is classified by each feature. Secondly, the recognition rates are ranked in descending order. Then, the optimal feature threshold is determined by a rate criterion. Finally, the better features are obtained. When applied to the Berlin and Chinese emotional data sets, the experimental results show that the feature selection method outperforms the other traditional methods.
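The ranking-and-threshold procedure in the abstract above can be sketched as follows. The abstract does not define its rate criterion, so the threshold is simply passed in as a parameter here; the function name, feature names and rates are all hypothetical.

```python
def select_features(recognition_rates, threshold):
    """Rank features by single-feature recognition rate, keep those at or above threshold."""
    ranked = sorted(recognition_rates.items(), key=lambda item: item[1], reverse=True)
    return [name for name, rate in ranked if rate >= threshold]


# Hypothetical per-feature recognition rates for one emotion.
rates = {"pitch": 0.62, "energy": 0.55, "mfcc": 0.71, "jitter": 0.40}
print(select_features(rates, threshold=0.5))
```

The returned list is ordered from the most to the least discriminative retained feature.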
Tauscher, Keith; Rapetti, David; Burns, Jack O.; Switzer, Eric
The sky-averaged (global) highly redshifted 21 cm spectrum from neutral hydrogen is expected to appear in the VHF range of ∼20–200 MHz and its spectral shape and strength are determined by the heating properties of the first stars and black holes, by the nature and duration of reionization, and by the presence or absence of exotic physics. Measurements of the global signal would therefore provide us with a wealth of astrophysical and cosmological knowledge. However, the signal has not yet been detected because it must be seen through strong foregrounds weighted by a large beam, instrumental calibration errors, and ionospheric, ground, and radio-frequency-interference effects, which we collectively refer to as “systematics.” Here, we present a signal extraction method for global signal experiments which uses Singular Value Decomposition of “training sets” to produce systematics basis functions specifically suited to each observation. Instead of requiring precise absolute knowledge of the systematics, our method effectively requires precise knowledge of how the systematics can vary. After calculating eigenmodes for the signal and systematics, we perform a weighted least square fit of the corresponding coefficients and select the number of modes to include by minimizing an information criterion. We compare the performance of the signal extraction when minimizing various information criteria and find that minimizing the Deviance Information Criterion most consistently yields unbiased fits. The methods used here are built into our widely applicable, publicly available Python package, pylinex, which analytically calculates constraints on signals and systematics from given data, errors, and training sets.
Sun, Zhenan; Tan, Tieniu
Images of a human iris contain rich texture information useful for identity authentication. A key and still open issue in iris recognition is how best to represent such textural information using a compact set of features (iris features). In this paper, we propose using ordinal measures for iris feature representation with the objective of characterizing qualitative relationships between iris regions rather than precise measurements of iris image structures. Such a representation may lose some image-specific information, but it achieves a good trade-off between distinctiveness and robustness. We show that ordinal measures are intrinsic features of iris patterns and largely invariant to illumination changes. Moreover, compactness and low computational complexity of ordinal measures enable highly efficient iris recognition. Ordinal measures are a general concept useful for image analysis and many variants can be derived for ordinal feature extraction. In this paper, we develop multilobe differential filters to compute ordinal measures with flexible intralobe and interlobe parameters such as location, scale, orientation, and distance. Experimental results on three public iris image databases demonstrate the effectiveness of the proposed ordinal feature models.
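The idea of an ordinal measure can be illustrated with a deliberately simplified sketch: each bit records only which of two image regions is brighter on average, and two codes are compared by Hamming distance. This two-region comparison is an assumption made for illustration and is much simpler than the multilobe differential filters of the paper; all names are hypothetical.

```python
def region_mean(image, top, left, height, width):
    """Average intensity of a rectangular region of a 2-D list image."""
    total = sum(image[r][c]
                for r in range(top, top + height)
                for c in range(left, left + width))
    return total / (height * width)


def ordinal_code(image, region_pairs):
    """One bit per region pair: 1 if the first region is brighter than the second."""
    return [1 if region_mean(image, *a) > region_mean(image, *b) else 0
            for a, b in region_pairs]


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(code_a, code_b))


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
# Regions are (top, left, height, width) tuples.
pairs = [((0, 0, 2, 2), (2, 2, 2, 2)),
         ((0, 2, 2, 2), (2, 0, 2, 2)),
         ((2, 0, 2, 2), (0, 0, 2, 2))]
# A brighter, higher-contrast copy of the same image keeps the same ordering.
brighter = [[2 * v + 10 for v in row] for row in image]
print(hamming_distance(ordinal_code(image, pairs), ordinal_code(brighter, pairs)))
```

Because only the ordering of region averages is kept, any monotonically increasing change of intensity leaves the code unchanged, which is the robustness to illumination that the abstract describes.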
This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artificial intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. Basic for various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.
Grabar, Natalia; Krivine, Sonia; Jaulent, Marie-Christine
Making the distinction between expert and non expert health documents can help users to select the information which is more suitable for them, according to whether they are familiar or not with medical terminology. This issue is particularly important for the information retrieval area. In our work we address this purpose through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that this distinction can be performed on the basis of a small number of features and that such features can be language and domain independent. The used features were acquired in source corpus (Russian language, diabetes topic) and then tested on target (French language, pneumology topic) and source corpora. These cross-language features show 90% precision and 93% recall with non expert documents in source language; and 85% precision and 74% recall with expert documents in target language.
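The precision and recall figures quoted above follow the standard definitions, which the sketch below makes explicit for one class of interest. The labels and predictions are invented for illustration and do not reproduce the paper's data.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one class, from paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical documents labelled expert or non-expert.
truth = ["expert", "expert", "non-expert", "non-expert", "non-expert"]
guess = ["expert", "non-expert", "non-expert", "non-expert", "expert"]
print(precision_recall(truth, guess, positive="non-expert"))
```

Precision is the share of documents predicted as the class that truly belong to it; recall is the share of the class's documents that were found.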
Wan, Lulu; Crookes, Kate; Reynolds, Katherine J; Irons, Jessica L; McKone, Elinor
Competing approaches to the other-race effect (ORE) see its primary cause as either a lack of motivation to individuate social outgroup members, or a lack of perceptual experience with other-race faces. Here, we argue that the evidence supporting the social-motivational approach derives from a particular cultural setting: a high socio-economic status group (typically US Whites) looking at the faces of a lower status group (US Blacks) with whom observers typically have at least moderate perceptual experience. In contrast, we test motivation-to-individuate instructions across five studies covering an extremely wide range of perceptual experience, in a cultural setting of more equal socio-economic status, namely Asian and Caucasian participants (N = 480) tested on Asian and Caucasian faces. We find no social-motivational component at all to the ORE, specifically: no reduction in the ORE with motivation instructions, including for novel images of the faces, and at all experience levels; no increase in correlation between own- and other-race face recognition, implying no increase in shared processes; and greater (not the predicted less) effort applied to distinguishing other-race faces than own-race faces under normal ("no instructions") conditions. Instead, the ORE was predicted by level of contact with the other-race. Our results reject both pure social-motivational theories and also the recent Categorization-Individuation model of Hugenberg, Young, Bernstein, and Sacco (2010). We propose a new dual-route approach to the ORE, in which there are two causes of the ORE (lack of motivation, and lack of experience) that contribute differently across varying world locations and cultural settings. Copyright © 2015 Elsevier B.V. All rights reserved.
Feature extraction is a key step in radar target recognition. The quality of the extracted features determines the performance of target recognition. However, obtaining the deep nature of the data is difficult using the traditional method. The autoencoder can learn features by making use of data and can obtain feature expressions at different levels of data. To eliminate the influence of noise, the method of radar target recognition based on stacked denoising sparse autoencoder is proposed in this paper. This method can extract features directly and efficiently by setting different hidden layers and numbers of iterations. Experimental results show that the proposed method is superior to the K-nearest neighbor method and the traditional stacked autoencoder.
Facial expressions communicate non-verbal cues that play an important role in interpersonal relations. Automatic recognition of facial expressions can be an important element of natural human-machine interfaces; it may likewise be used in behavioral science and in clinical practice. Although people recognize facial expressions virtually immediately, reliable expression recognition by machine is still a challenge. From the point of view of automatic recognition, a facial expression can be considered to consist of deformations of the facial parts and their spatial relations, or changes in the face's pigmentation. Research into automatic recognition of facial expressions addresses the issues surrounding the representation and classification of static or dynamic characteristics of these deformations or of face pigmentation. We obtain results using CVIPtools. The training data set covers six facial expressions of three persons, with 90 border mask samples in total for training and 30 border mask samples for testing. We use RST-invariant features and texture features for feature analysis and then classify them using the k-nearest neighbor classification algorithm. The maximum accuracy is 90.
Siddiqi, Muhammad Hameed; Lee, Sungyoung; Lee, Young-Koo; Khan, Adil Mehmood; Truc, Phan Tran Ho
Over the last decade, human facial expressions recognition (FER) has emerged as an important research area. Several factors make FER a challenging research problem. These include varying light conditions in training and test images; need for automatic and accurate face detection before feature extraction; and high similarity among different expressions that makes it difficult to distinguish these expressions with a high accuracy. This work implements a hierarchical linear discriminant analysis-based facial expressions recognition (HL-FER) system to tackle these problems. Unlike the previous systems, the HL-FER uses a pre-processing step to eliminate light effects, incorporates a new automatic face detection scheme, employs methods to extract both global and local features, and utilizes a HL-FER to overcome the problem of high similarity among different expressions. Unlike most of the previous works that were evaluated using a single dataset, the performance of the HL-FER is assessed using three publicly available datasets under three different experimental settings: n-fold cross validation based on subjects for each dataset separately; n-fold cross validation rule based on datasets; and, finally, a last set of experiments to assess the effectiveness of each module of the HL-FER separately. Weighted average recognition accuracy of 98.7% across three different datasets, using three classifiers, indicates the success of employing the HL-FER for human FER. PMID:24316568
Nunes, N; Ambler, G; Foo, X; Widschwendter, M; Jurkovic, D
.8-97.4%), specificity 96.3% (95% CI, 93.8-98.0%)), the specificities of the IOTA models were significantly lower (P IOTA models instead of pattern recognition (213/489 (43.6%) vs 142/489 (29.0%); P IOTA models maintained their high sensitivity when used in an outpatient setting. Specificity was relatively low, which indicates that a significant proportion of the women would have been offered unnecessary surgery for suspected ovarian cancer. These findings show that the IOTA models could be used as a first-stage test to diagnose ovarian cancer in an outpatient setting, but a different second-stage test is required to minimize the number of false-positive findings. Copyright © 2017 ISUOG. Published by John Wiley & Sons Ltd.
Bauer, Michael; Haesler, Emily; Fetherstonhaugh, Deirdre
To report on the findings of a systematic review which examined the experiences and views of older people aged 65 years and over on health professionals' recognition of sexuality and sexual health and whether these aspects of the person are incorporated into care. The review followed the methods laid out by the Joanna Briggs Institute. Eleven electronic databases were searched using the terms sexual*, aged, ageing/aging, attitudes and care in any health-care setting. Only quantitative and qualitative research and opinion papers written in English and offering unique commentary published between January 2004 and January 2015 were eligible. A total of 999 papers were initially identified and of these, 148 were assessed by two reviewers. Eighteen studies - seven quantitative, eight qualitative and three opinion papers - met the inclusion criteria and were appraised. The importance of sexuality to well-being, language used, expressing sexuality, discomfort discussing sexuality, inadequate sexuality health education and treatment and deficient communication with health-care professionals were all identified as significant issues in a range of settings. Fourteen categories and five syntheses summarize the 43 findings. Sexuality remains important for many older people; however, embarrassment, dissatisfaction with treatment, negative attitudes and seeming disinterest by health professionals can all inhibit discussions. Professionals and health-care services need to adopt strategies and demonstrate characteristics which create environments that are more supportive of sexuality. Issues related to sexuality and sexual health should be able to be discussed without anxiety or discomfort so that older people receive optimal care and treatment. © 2015 The Authors. Health Expectations Published by John Wiley & Sons Ltd.
Khaligh-Razavi, Seyed-Mahdi; Henriksson, Linda; Kay, Kendrick; Kriegeskorte, Nikolaus
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and discovering nonlinear representational features appropriate for the task (e.g. object classification). Model representations can be compared to brain representations in terms of the representational dissimilarities they predict for an image set. This method, called representational similarity analysis (RSA), enables us to test the representational feature space as is (fixed RSA) or to fit a linear transformation that mixes the nonlinear model features so as to best explain a cortical area's representational space (mixed RSA). Like voxel/population-receptive-field modelling, mixed RSA uses a training set (different stimuli) to fit one weight per model feature and response channel (voxels here), so as to best predict the response profile across images for each response channel. We analysed response patterns elicited by natural images, which were measured with functional magnetic resonance imaging (fMRI). We found that early visual areas were best accounted for by shallow models, such as a Gabor wavelet pyramid (GWP). The GWP model performed similarly with and without mixing, suggesting that the original features already approximated the representational space, obviating the need for mixing. However, a higher ventral-stream visual representation (lateral occipital region) was best explained by the higher layers of a deep convolutional network and mixing of its feature set was essential for this model to explain the representation. We suspect that mixing was essential because the convolutional network had been trained to discriminate a set of 1000 categories, whose frequencies
Wan, Qian; Li, Yiran; Li, Changzhi; Pal, Ranadip
In this article, we consider the design of a human gesture recognition system based on pattern recognition of signatures from a portable smart radar sensor. Powered by AAA batteries, the smart radar sensor operates in the 2.4 GHz industrial, scientific and medical (ISM) band. We analyzed the feature space using principal components and application-specific time and frequency domain features extracted from radar signals for two different sets of gestures. We illustrate that a nearest neighbor based classifier can achieve greater than 95% accuracy for multi-class classification using 10-fold cross-validation when features are extracted based on magnitude differences and Doppler shifts, as compared to features extracted through orthogonal transformations. The reported results illustrate the potential of intelligent radars integrated with a pattern recognition system for high accuracy smart home and health monitoring purposes.
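A nearest neighbor classifier of the kind mentioned above can be sketched in a few lines. This is a generic 1-nearest-neighbor rule with Euclidean distance, both of which are assumptions since the abstract does not give the distance metric or the number of neighbors; the feature vectors and gesture labels are invented.

```python
import math


def nearest_neighbor_predict(train_features, train_labels, query):
    """Return the label of the training vector closest to query (Euclidean distance)."""
    if not train_features:
        raise ValueError("training set is empty")
    best = min(range(len(train_features)),
               key=lambda i: math.dist(train_features[i], query))
    return train_labels[best]


# Hypothetical 2-D features, e.g. (magnitude difference, Doppler shift).
features = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels = ["swipe", "swipe", "push", "push"]
print(nearest_neighbor_predict(features, labels, (0.85, 0.75)))
```

Accuracy under 10-fold cross-validation would then be the fraction of held-out samples whose predicted label matches the true one, averaged over the folds.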
An important task in machine learning is to reduce data set dimensionality, which in turn contributes to reducing computational load and data collection costs, while improving human understanding and interpretation of models. We introduce an operational guideline for determining the minimum number of instances sufficient to identify correct ranks of features with the highest impact. We conduct tests based on qualitative B2B sales forecasting data. The results show that a relatively small instance subset is sufficient for identifying the most important features when rank is not important.
Luts, J.; Heerschap, A.; Suykens, J.A.; Huffel, S. van
OBJECTIVE: This study investigates the use of automated pattern recognition methods on magnetic resonance data with the ultimate goal to assist clinicians in the diagnosis of brain tumours. Recently, the combined use of magnetic resonance imaging (MRI) and magnetic resonance spectroscopic imaging
Yao, Shoukui; Qin, Xiaojuan
Since the resolution of remote sensing infrared images is low, the features of ship targets become unstable. How to recognize ships with fuzzy features remains an open problem. In this paper, we propose a novel ship target recognition algorithm based on Gaussian mixture models (GMMs). The proposed algorithm has two main steps. In the first step, the Hu moments of the ship target images are calculated, and the GMMs are trained on the moment features of the ships. In the second step, the moment feature of each ship image is assigned to the trained GMMs for recognition. Because of the scale, rotation, and translation invariance of Hu moments and the powerful feature-space description ability of GMMs, the GMM-based ship target recognition algorithm can recognize ships reliably. Experimental results on a large simulated image set show that our approach is effective in distinguishing different ship types and obtains satisfactory ship recognition performance.
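The invariance property that the abstract above relies on can be checked with a short sketch. It computes only the first two of the seven Hu moment invariants from normalized central moments and omits the GMM training step entirely; the function name and the small test shape are assumptions for illustration, not taken from the paper.

```python
import math


def hu_first_two(image):
    """Return the first two Hu moment invariants of a 2-D image (list of rows)."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("image has no mass")
    # Centroid of the intensity distribution.
    xc = sum(x * v for row in image for x, v in enumerate(row)) / m00
    yc = sum(y * v for y, row in enumerate(image) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized for scale by m00^(1 + (p+q)/2).
        mu = sum(
            ((x - xc) ** p) * ((y - yc) ** q) * v
            for y, row in enumerate(image)
            for x, v in enumerate(row)
        )
        return mu / m00 ** (1 + (p + q) / 2)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


# A small L-shaped binary blob (hypothetical stand-in for a ship silhouette).
shape = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# The same blob translated by one pixel right and one pixel down.
shifted = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
# The same blob rotated by 90 degrees clockwise.
rotated = [list(row) for row in zip(*shape[::-1])]

for name, img in [("original", shape), ("shifted", shifted), ("rotated", rotated)]:
    print(name, hu_first_two(img))
```

The three printed pairs agree up to floating-point rounding, which is what makes such moments usable as pose-independent features for a downstream classifier.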
Marko Bohanec; Mirjana Kljajić Borštnar; Marko Robnik-Šikonja
An important task in machine learning is to reduce data set dimensionality, which in turn contributes to reducing computational load and data collection costs, while improving human understanding and interpretation of models. We introduce an operational guideline for determining the minimum number of instances sufficient to identify correct ranks of features with the highest impact. We conduct tests based on qualitative B2B sales forecasting data. The results show that a relatively small instance subset is sufficient for identifying the most important features when rank is not important.
Ripollés, Tomás; Martínez-Pérez, María Jesús; Gómez Valencia, Diana Patricia; Vizuete, José; Martín, Gregorio
To retrospectively evaluate the accuracy of ultrasound as a diagnostic method for differentiating acute diverticulitis from colon cancer in patients with sigmoid colon stenosis. Ultrasound examinations of 91 consecutive patients with sigmoid stenosis (50 diverticulitis and 41 colon cancers) were reviewed by two trained radiologists. Sixty-five (71%) patients presented with acute abdominal symptoms. Thirteen sonographic criteria retrieved from the literature were evaluated to differentiate benign from malignant strictures. A score including all parameters that showed significant differences between benign and malignant strictures was built. Sensitivity, specificity, accuracy, and positive and negative predictive values of each sonographic sign, the overall diagnosis, and the sonographic score were calculated. Loss of the bowel wall stratification was the most reliable criterion for the diagnosis of malignancy (92% and 94% sensitivity and specificity, respectively) and showed the best inter-radiologist agreement (κ = 0.848). Adjacent lymph nodes were the most specific feature (98%) for colon cancer, but their sensitivity was low. Global assessment could differentiate both diseases with high sensitivity (92-94.9%) and specificity (98-100%). Sonographic score >3 enabled differentiation of carcinoma from diverticulitis with 95% sensitivity and 92-94% specificity, with an area under the ROC curve of 0.98-0.987. There were no significant differences in the results between patients with acute and nonacute abdominal symptoms. The combination of several morphological sonographic findings using a score can differentiate most cases of diverticulitis from colon carcinoma in sigmoid strictures.
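The diagnostic measures listed in the abstract above all follow from a 2x2 table of test result against true diagnosis. The sketch below shows the arithmetic; the counts are illustrative values chosen only to be consistent with the stated group sizes (41 cancers, 50 diverticulitis) and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from 2x2 confusion counts.

    tp: diseased, test positive    fn: diseased, test negative
    fp: healthy, test positive     tn: healthy, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased detected
        "specificity": tn / (tn + fp),  # fraction of non-diseased cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts: 41 malignant strictures (38 flagged, 3 missed) and
# 50 benign strictures (47 correctly cleared, 3 falsely flagged).
metrics = diagnostic_metrics(tp=38, fp=3, fn=3, tn=47)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the test, whereas the predictive values also depend on how common the disease is in the sample, which is why the abstract reports them separately.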
V. V. Streltsov
The psychological trauma of pulmonary tuberculosis and long-term treatment may cause the development and progression of different borderline neuropsychic disorders in patients, lower therapeutic effectiveness, and lead to premature discontinuation of therapy. The main practical tasks of psychological rehabilitation during intensive treatment are to render care for a patient during his adaptation to the hospital setting, to correct inadequate attitude towards disease, and to motivate active cooperation with specialists. Competent psychological support of drug therapy promotes a reduction in the intensity of psychic and somatic experiences in the patient and an increase in his psychological resources. A respective microclimate in the tuberculosis control facility and a patient-centered doctor-patient model should be considered as the most important rehabilitation factors.
A pattern recognition system is described for the surveillance of a PWR reactor. This report contains four chapters. The first one succinctly deals with statistical pattern recognition principles. In the second chapter we show how a surveillance problem may be treated by pattern recognition and we present methods for surveillance (detection of abnormalities), control (recognition of the kind of operation) and diagnostics (recognition of the kind of abnormality). The third chapter shows a surveillance method for a nuclear plant. The signals used are the neutron noise observations made by the ionization chambers inserted in the reactor. Abnormality is defined in opposition to the training set, which is assumed to be an exhaustive summary of normality. In the fourth chapter we propose a scheme for adaptive recognition and a method based on modelling classes by hyperspheres. This method has been tested on simulated training sets in two-dimensional feature spaces. It gives solutions to problems of non-linear separability.
Wang, Bo; Li, Liwei; Hurley, Thomas D; Meroueh, Samy O
End-point free energy calculations using MM-GBSA and MM-PBSA provide a detailed understanding of molecular recognition in protein-ligand interactions. The binding free energy can be used to rank-order protein-ligand structures in virtual screening for compound or target identification. Here, we carry out free energy calculations for a diverse set of 11 proteins bound to 14 small molecules using extensive explicit-solvent MD simulations. The structure of these complexes was previously solved by crystallography and their binding studied with isothermal titration calorimetry (ITC) data enabling direct comparison to the MM-GBSA and MM-PBSA calculations. Four MM-GBSA and three MM-PBSA calculations reproduced the ITC free energy within 1 kcal·mol⁻¹, highlighting the challenges in reproducing the absolute free energy from end-point free energy calculations. MM-GBSA exhibited better rank-ordering with a Spearman ρ of 0.68 compared to 0.40 for MM-PBSA with dielectric constant (ε = 1). An increase in ε resulted in significantly better rank-ordering for MM-PBSA (ρ = 0.91 for ε = 10), but larger ε significantly reduced the contributions of electrostatics, suggesting that the improvement is due to the nonpolar and entropy components, rather than a better representation of the electrostatics. The SVRKB scoring function applied to MD snapshots resulted in excellent rank-ordering (ρ = 0.81). Calculations of the configurational entropy using normal-mode analysis led to free energies that correlated significantly better to the ITC free energy than the MD-based quasi-harmonic approach, but the computed entropies showed no correlation with the ITC entropy. When the adaptation energy is taken into consideration by running separate simulations for complex, apo, and ligand (MM-PBSAADAPT), there is less agreement with the ITC data for the individual free energies, but remarkably good rank-ordering is observed (ρ = 0.89). Interestingly, filtering MD snapshots by prescoring
Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang
recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease. PMID:27977767
Pauca, V. Paúl; Forkin, Michael; Xu, Xiao; Plemmons, Robert; Ross, Arun A.
Ocular recognition is a new area of biometric investigation targeted at overcoming the limitations of iris recognition performance in the presence of non-ideal data. There are several advantages for increasing the area beyond the iris, yet there are also key issues that must be addressed such as size of the ocular region, factors affecting performance, and appropriate corpora to study these factors in isolation. In this paper, we explore and identify some of these issues with the goal of better defining parameters for ocular recognition. An empirical study is performed where iris recognition methods are contrasted with texture and point operators on existing iris and face datasets. The experimental results show a dramatic recognition performance gain when additional features are considered in the presence of poor quality iris data, offering strong evidence for extending interest beyond the iris. The experiments also highlight the need for the direct collection of additional ocular imagery.
Multimodal signal analysis based on sophisticated sensors, efficient communication systems and fast parallel processing methods has a rapidly increasing range of multidisciplinary applications. The present paper is devoted to pattern recognition, machine learning, and the analysis of sleep stages in the detection of sleep disorders using polysomnography (PSG) data, including electroencephalography (EEG), breathing (Flow), and electro-oculogram (EOG) signals. The proposed method is based on the classification of selected features by a neural network system with sigmoidal and softmax transfer functions using Bayesian methods for the evaluation of the probabilities of the separate classes. The application is devoted to the analysis of the sleep stages of 184 individuals with different diagnoses, using EEG and further PSG signals. Data analysis points to an average increase of the length of the Wake stage by 2.7% per 10 years and a decrease of the length of the Rapid Eye Movement (REM) stages by 0.8% per 10 years. The mean classification accuracy for given sets of records and single EEG and multimodal features is 88.7% (standard deviation, STD: 2.1) and 89.6% (STD: 1.9), respectively. The proposed methods enable the use of adaptive learning processes for the detection and classification of health disorders based on prior specialist experience and man–machine interaction.
Easley, Glenn R.; Colonna, Flavia
We introduce a method for classifying objects based on special cases of the generalized discrete Radon transform. We adjust the transform and the corresponding ridgelet transform by means of circular shifting and a singular value decomposition (SVD) to obtain a translation, rotation and scaling invariant set of feature vectors. We then use a back-propagation neural network to classify the input feature vectors. We conclude with experimental results and compare these with other invariant recognition methods.
Siuly; Li, Yan; Paul Wen, Peng
Motor imagery (MI) task classification provides an important basis for designing brain-computer interface (BCI) systems. If the MI tasks are reliably distinguished through identifying typical patterns in electroencephalography (EEG) data, motor-disabled people could communicate with a device by composing sequences of these mental states. In our earlier study, we developed a cross-correlation based logistic regression (CC-LR) algorithm for the classification of MI tasks for BCI applications, but its performance was not satisfactory. This study develops a modified version of the CC-LR algorithm, exploring a suitable feature set that can improve the performance. The modified CC-LR algorithm uses the C3 electrode channel (in the international 10-20 system) as a reference channel for the cross-correlation (CC) technique and applies three diverse feature sets separately as the input to the logistic regression (LR) classifier. The present algorithm investigates which feature set best characterizes the distribution of MI-task-based EEG data. This study also provides insight into how to select a reference channel for the CC technique with EEG signals, considering the anatomical structure of the human brain. The proposed algorithm is compared with eight of the most recently reported well-known methods, including the BCI III Winner algorithm. The findings of this study indicate that the modified CC-LR algorithm has the potential to improve the identification performance of MI tasks in BCI systems. The results demonstrate that the proposed technique provides a classification improvement over the existing methods tested.
In this paper, I present a novel hybrid face recognition approach based on a convolutional neural architecture, designed to robustly detect highly variable face patterns. The convolutional network extracts successively larger features in a hierarchical set of layers. The weights of the trained neural networks are used to create kernel windows for feature extraction in a 3-stage algorithm. I present experimental results illustrating the efficiency of the proposed approach. I use a database of 796 images of 159 individuals from Reims University which contains quite a high degree of variability in expression, pose, and facial details.
Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.
In the processes of visual perception and recognition human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one to another point of fixation, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This has provided for our model, the ability for invariant representation of complex objects in gray-level images, but demands realization of behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision which extracts a set of primary features (edges) in each fixation, and high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary features extraction decreases with distances from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and
Doctor, P.G.; Harrington, T.P.; Hutton, P.H.
Models have been developed that relate the rate of acoustic emissions to structural integrity. The implementation of these techniques in the field has been hindered by the noisy environment in which the data must be taken. Acoustic emissions from noncritical sources are recorded in addition to those produced by critical sources, such as flaws. A technique is discussed for prescreening acoustic events and filtering out those that are produced by noncritical sources. The methodology that was investigated is pattern recognition. Three different pattern recognition techniques were applied to a data set that consisted of acoustic emissions caused by crack growth and acoustic signals caused by extraneous noise sources. Examination of the acoustic emission data presented has uncovered several features of the data that can provide a reasonable filter. Two of the most valuable features are the frequency of maximum response and the autocorrelation coefficient at Lag 13. When these two features and several others were combined with a least squares decision algorithm, 90% of the acoustic emissions in the data set were correctly classified. It appears possible to design filters that eliminate extraneous noise sources from flaw-growth acoustic emissions using pattern recognition techniques
Acevedo, Elena; Acevedo, Antonio; Felipe, Federico; Avilés, Pedro
In this work, we present a system for pattern recognition that combines the problem-solving power of genetic algorithms with the efficiency of morphological associative memories. We use a set of 48 tire prints divided into 8 brands of tires. The images have dimensions of 200 x 200 pixels. We applied the Hough transform to obtain lines as the main features. The number of lines obtained is 449. The genetic algorithm reduces the number of features to ten suitable lines that yield 100% recognition. Morphological associative memories were used as the evaluation function. The selection algorithms were Tournament and Roulette wheel. For reproduction, we applied one-point, two-point and uniform crossover.
Wang, Yue; Lu, Jingjing; Jiang, Bo; Feng, Feng; Jin, Zhengyu [Peking Union Medical College, Chinese Academy of Medical Sciences, Department of Radiology, Peking Union Medical College Hospital, Beijing (China)]; Zhu, Lan; Sun, Zhijing [Chinese Academy of Medical Sciences, Department of Obstetrics and Gynaecology, Peking Union Medical College Hospital, Peking Union Medical College, Beijing (China)]
To characterize the anatomical features and clinical settings of Mayer-Rokitansky-Kuester-Hauser (MRKH) syndrome and correlate them with patterns of uterine involvement. Pelvic magnetic resonance images and medical records of 92 MRKH patients were retrospectively reviewed. Patients were subgrouped by uterine morphology: uterine agenesis, unilateral rudimentary uterus and bilateral rudimentary uteri. Uterine volume, presence of endometrium, location of ovary, endometriosis and pelvic pain were compared among groups. The mean uterine volume was 33.5 ml (17.5-90.0 ml) for unilateral uterine remnants, and 16.1 ml (3.5-21.5 ml) for bilateral uterine rudiments (p<0.01). The incidence of presence of endometrium (100% vs. 22%, p<0.001), haematometra (56% vs. 3%, p<0.001) and ovarian endometriosis (22% vs. 3%, p<0.01) was significantly increased in the group of unilateral rudimentary uteri as compared with the group of bilateral uterine remnants. Thirty-one patients (38%) showed ectopic ovaries. Pelvic pain was more common in individuals with unilateral rudimentary uterus than those who had no (56% vs. 5%, p<0.01) or bilateral uterine remnants (56% vs. 14%, p<0.05). MRKH patients with different patterns of uterine involvement may have different anatomical features and clinical settings.
Prabusankarlal, Kadayanallur Mahadevan; Thirumoorthy, Palanisamy; Manavalan, Radhakrishnan
A method using rough set feature selection and extreme learning machine (ELM) whose learning strategy and hidden node parameters are optimized by self-adaptive differential evolution (SaDE) algorithm for classification of breast masses is investigated. A pathologically proven database of 140 breast ultrasound images, including 80 benign and 60 malignant, is used for this study. A fast nonlocal means algorithm is applied for speckle noise removal, and multiresolution analysis of undecimated discrete wavelet transform is used for accurate segmentation of breast lesions. A total of 34 features, including 29 textural and five morphological, are applied to a [Formula: see text]-fold cross-validation scheme, in which more relevant features are selected by quick-reduct algorithm, and the breast masses are discriminated into benign or malignant using SaDE-ELM classifier. The diagnosis accuracy of the system is assessed using parameters, such as accuracy (Ac), sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV), Matthew's correlation coefficient (MCC), and area ([Formula: see text]) under receiver operating characteristics curve. The performance of the proposed system is also compared with other classifiers, such as support vector machine and ELM. The results indicated that the proposed SaDE algorithm has superior performance with [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text] compared to other classifiers.
containing screenshots of computer programs as part of their instructions. Additionally, web browsers such as Firefox capture and cache screenshots of user...provide indications as to the class of image. At an even higher level, there is often distinguishing metadata associated with digital images. Mac’s OS X...mac os x malware discovered, takes screenshots and uploads them to unknown servers without user’s consent. [Online]. Available: http
Virginia B. Bustos
With the emergence of research on real-time visual feedback to supplement vocal pedagogy, the utilization of technology in the world of music is now seen to accelerate skills learning and enhance cognitive development. The researchers of this project aim to further analyze vowel intelligibility and develop software applications intended to be used not only by professional singers but also by individuals who wish to improve their singing capability. Data in the form of sung vowels and song pieces were obtained from 46 singers. A Listening Test was then conducted on these samples to obtain the ground truth for vowel classification based on human perception. Simulation of the human auditory perception of sung Filipino vowels was performed using formant frequencies and Mel-frequency cepstral coefficients as feature vector inputs to a two-stage Discriminant Analysis classifier. The setup resulted in an overall Training Set accuracy of 89.4% and an overall Test Set accuracy of 90.9%. The accuracy of the classifier, measured in terms of the correspondence of vowel classifications obtained from the classifier with the results of the Listening Test, reached 92.3%. Using information obtained from the classifier, offline and online/real-time software applications were developed. The main application features include the display of the spectral envelope and spectrogram, pitch and vibrato analysis and direct feedback on the classification of the sung vowel. These features were recommended by singers who were surveyed and were incorporated in the applications to help singers adjust formant locations, directly determine the listener's perception of sung vowels, perform modeling effectively and carry out vowel migration.
Jørgensen, Jan Guldager; Schröder, Philipp
The present paper examines trade liberalization driven by the coordination of product standards. For oligopolistic firms situated in separate markets that are initially sheltered by national standards, mutual recognition of standards implies entry and reduced profits at home paired with the oppor... countries and three firms, where firms first lobby for the policy coordination regime (harmonization versus mutual recognition), and subsequently, in case of harmonization, the global standard is auctioned among the firms. We discuss welfare effects and conclude with policy implications. In particular..., harmonized standards may fail to harvest the full pro-competitive effects from trade liberalization compared to mutual recognition; moreover, the issue is most pronounced in markets featuring price competition....
This paper presents a method of speech recognition by pattern recognition techniques. Learning consists in determining the unique characteristics of a word (cepstral coefficients) by eliminating those characteristics that are different from one word to another. For learning and recognition, the system builds a dictionary of words by determining the characteristics of each word to be used in the recognition. Determining the characteristics of an audio signal consists of the following steps: removing noise, sampling the signal, applying a Hamming window, switching to the frequency domain through the Fourier transform, calculating the magnitude spectrum, filtering the data, and determining the cepstral coefficients.
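The core of the feature-extraction chain listed in the abstract above (Hamming window, Fourier transform, magnitude spectrum, cepstral coefficients) can be sketched in a few lines. This is a minimal illustration under stated assumptions: it computes the plain real cepstrum of a single frame, skips the noise-removal and filtering steps, uses a slow direct DFT for self-containment, and the function names and the synthetic test frame are invented for the example rather than taken from the paper.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft(x):
    """Direct discrete Fourier transform (O(n^2), adequate for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, n_coeffs=13):
    """Return the first n_coeffs real-cepstrum coefficients of one frame."""
    n = len(frame)
    # 1. Taper the frame to reduce spectral leakage.
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    # 2. Move to the frequency domain and keep only the magnitude.
    magnitude = [abs(c) for c in dft(windowed)]
    # 3. Log-compress; the small offset avoids log(0).
    log_mag = [math.log(m + 1e-12) for m in magnitude]
    # 4. Inverse DFT of the log magnitude gives the real cepstrum.
    cepstrum = [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]
    return cepstrum[:n_coeffs]


# A synthetic 64-sample frame (hypothetical stand-in for a sampled speech frame).
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
coeffs = real_cepstrum(frame)
print(len(coeffs), "coefficients computed")
```

In a recognizer of the kind described, one such coefficient vector per frame would be stored in the word dictionary and compared against the vectors of an unknown utterance.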
Converso, L.; Hocek, S.
This paper describes computer-based optical character recognition (OCR) systems, focusing on their components (the computer, the scanner, the OCR, and the output device); how the systems work; and features to consider in selecting a system. A list of 26 questions to ask to evaluate systems for potential purchase is included. (JDD)
Pantic, Maja; Li, S.; Jain, A.
Facial expression recognition is a process performed by humans or computers, which consists of: 1. Locating faces in the scene (e.g., in an image; this step is also referred to as face detection), 2. Extracting facial features from the detected face region (e.g., detecting the shape of facial
Al-Talabani, Abdulbasit; Sellahewa, Harin; Jassim, Sabah A.
Human emotion recognition from speech is studied frequently for its importance in many applications, e.g. human-computer interaction. There is wide diversity and disagreement about the basic emotion or emotion-related states on the one hand, and about where the emotion-related information lies in the speech signal on the other. These diversities motivate our investigations into extracting Meta-features using the PCA approach, or using a non-adaptive random projection (RP), which significantly reduce the high-dimensional speech feature vectors that may contain a wide range of emotion-related information. Subsets of Meta-features are fused to increase the performance of the recognition model that adopts the score-based LDC classifier. We shall demonstrate that our scheme outperforms the state-of-the-art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between accuracy rates achieved on the different types of speech datasets raises questions about the way emotions modulate speech. In particular, we shall argue that emotion recognition from speech should not be dealt with as a classification problem. We shall demonstrate the presence of a spectrum of different emotions in the same speech portion, especially in the non-prompted datasets, which tend to be more "natural" than the acted datasets where the subjects attempt to suppress all but one emotion.
Abrams, Robert C; Nathanson, Mark; Silver, Stephanie; Ramirez, Mildred; Toner, John A; Teresi, Jeanne A
Low levels of symptom recognition by staff have been "gateway" barriers to the management of depression in long-term care. The study aims were to refine a depression training program for front-line staff in long-term care and provide evaluative knowledge outcome data. Three primary training modules provide an overview of depression symptoms; a review of causes and situational and environmental contributing factors; and communication strategies, medications, and clinical treatment strategies. McNemar's chi-square tests and paired t-tests were used to examine change in knowledge. Data were analyzed for up to 143 staff members, the majority from nursing. Significant changes (p depressive disorder.
Yamashita, Rikiya; Yamaoka, Toshihide; Nishitai, Ryuta; Isoda, Hiroyoshi; Taura, Kojiro; Arizono, Shigeki; Furuta, Akihiro; Ohno, Tsuyoshi; Ono, Ayako; Togashi, Kaori
This study aimed to evaluate the common features and variations of portal vein anatomy in right-sided round ligament (RSRL), which can help propose a method to detect and diagnose this anomaly. In this retrospective study of 14 patients with RSRL, the branching order of the portal tree was analyzed, with special focus on the relationship between the dorsal branch of the right anterior segmental portal vein (P A-D ) and the lateral segmental portal vein (P LL ), to determine the common features. The configuration of the portal vein from the main portal trunk to the right umbilical portion (RUP), the inclination of the RUP, and the number and thickness of the ramifications branching from the right anterior segmental portal vein (P A ) were evaluated for variations. In all subjects, the diverging point of the P A-D was constantly distal to that of the P LL . The portal vein configuration was I- and Z-shaped in nine and five subjects, respectively. The RUP was tilted to the right in all subjects. In Z-shaped subjects, the portal trunk between the branching point of the right posterior segmental portal vein and that of the P LL was tilted to the left in one subject and was almost parallel to the vertical plane in four subjects. Multiple ramifications were radially distributed from the P A in eight subjects, whereas one predominant P A-D branched from the P A in six subjects. Based on the diverging points of the P A-D and P LL , we proposed a three-step method for the detection and diagnosis of RSRL.
Reiss, David; Walter, Ondine; Bourgoin, Lucie; Kieffer, Brigitte L; Ouagazzal, Abdel-Mouttalib
Recognition memory is an important aspect of human declarative memory and is one of the routine memory abilities altered in patients with amnestic syndrome and Alzheimer's disease. In rodents, recognition memory has been most widely assessed using the novel object preference paradigm, which exploits the spontaneous preference that animals display for novel objects. Here, we used nose-poke units instead of objects to design a simple automated method for assessing context recognition memory in mice. In the acquisition trial, mice are exposed for the first time to an operant chamber with one blinking nose-poke unit. In the choice session, a novel nonblinking nose-poke unit is inserted into an empty spatial location and the number of nose pokes directed at each nose-poke unit is used as an index of recognition memory. We report that recognition performance varies as a function of the length of the acquisition period and the retention delay and is sensitive to conventional amnestic treatments. By manipulating the features of the operant chamber during a brief retrieval episode (3-min long), we further demonstrate that reconsolidation of the original contextual memory depends on the magnitude and the type of environmental changes introduced into the familiar spatial environment. These results show that the nose-poke recognition task provides a rapid and reliable way of assessing context recognition memory in mice and offers new possibilities for deciphering the brain mechanisms governing the reconsolidation process.
Zhang, Yajun; Liu, Zongtian; Zhou, Wen
Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%.
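The F-measure values quoted above combine precision and recall into one number. The abstract does not say which variant was used, so the following is only a minimal sketch assuming the balanced F1 form (`beta=1.0`); the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Return (precision, recall, F_beta) computed from raw detection counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical counts for trigger-word detection (illustrative only).
precision, recall, f1 = f_measure(true_positives=80, false_positives=20, false_negatives=10)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

With `beta=1.0` the result equals 2TP / (2TP + FP + FN), which is a convenient sanity check on any reported figure.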
He, Y.; He, Y.
Urban shanty towns are communities that have contiguous old and dilapidated houses with more than 2000 square meters of built-up area or more than 50 households. This study attempts to extract shanty towns in Nanning City using census products and TripleSat satellite images. With 0.8-meter high-resolution remote sensing images, five texture characteristics (energy, contrast, maximum probability, and inverse difference moment) of shanty towns are trained and analyzed through the gray-level co-occurrence matrix (GLCM). In this study, samples of shanty towns are well classified, with 98.2% producer accuracy for unsupervised classification and 73.2% correctness for supervised classification. Low-rise and mid-rise residential blocks in Nanning City are classified into 4 different types by using k-means clustering and nearest neighbour classification, respectively. This study establishes initial texture feature descriptions of different types of residential areas, especially low-rise and mid-rise buildings, which would help city administrators evaluate residential blocks and reconstruct shanty towns.
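The GLCM texture measures named in this abstract can be illustrated with a small sketch. It assumes a single pixel offset (one step to the right), a non-symmetric co-occurrence matrix, and the common textbook definitions of each statistic; the study's actual offsets, quantization, and window size are not given here, and the 4 x 4 patch is made up.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Four texture statistics of a normalized co-occurrence matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
        "inverse_difference_moment": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Made-up 4 x 4 patch quantized to 4 gray levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(patch, levels=4))
print(features)
```

Some references define energy as the square root of the sum used here, so the convention should be checked before comparing values across tools.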
Road and Street Centerlines, Street. The data set is a line feature consisting of 13948 line segments representing streets. It was created to maintain the location of city- and county-based streets. Published in 1989, Davis County Government.
NSGIC Local Govt | GIS Inventory — Road and Street Centerlines dataset current as of 1989. Street-The data set is a line feature consisting of 13948 line segments representing streets. It was created...
The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance and human-computer interfacing. The general procedure of a keyword spotting system involves feature generation and matching. In this work, a new set of features based on the psycho-acoustic masking nature of human speech is proposed. After developing these features, a time-aligned pattern matching process was implemented to locate the words in a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. The keyword spotting model was validated against widely used cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.
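The abstract does not name its time-aligned pattern matching algorithm. One classic technique of that kind is dynamic time warping (DTW), so the sketch below assumes DTW with a Euclidean frame distance; the templates, feature values, and function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed alignment moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_keyword(templates, segment):
    """Return the keyword whose template aligns to the segment with the lowest cost."""
    return min(templates, key=lambda word: dtw_distance(templates[word], segment))


# Hypothetical one-dimensional feature tracks standing in for real speech features.
templates = {
    "alpha": [(0.0,), (1.0,), (2.0,)],
    "bravo": [(2.0,), (1.0,), (0.0,)],
}
slow_alpha = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
print(best_keyword(templates, slow_alpha))
```

DTW tolerates differences in speaking rate because a frame in one sequence may be matched to several frames in the other.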
Gfeller, Kate; Olszewski, Carol; Rychener, Marly; Sena, Kimberly; Knutson, John F; Witt, Shelley; Macpherson, Beth
The purposes of this study were (a) to compare recognition of "real-world" music excerpts by postlingually deafened adults using cochlear implants and normal-hearing adults; (b) to compare the performance of cochlear implant recipients using different devices and processing strategies; and (c) to examine the variability among implant recipients in recognition of musical selections in relation to performance on speech perception tests, performance on cognitive tests, and demographic variables. Seventy-nine cochlear implant users and 30 normal-hearing adults were tested on open-set recognition of systematically selected excerpts from musical recordings heard in real life. The recognition accuracy of the two groups was compared for three musical genres: classical, country, and pop. Recognition accuracy was correlated with speech recognition scores, cognitive measures, and demographic measures, including musical background. Cochlear implant recipients were significantly less accurate in recognition of previously familiar (known before hearing loss) musical excerpts than normal-hearing adults (p genre. Implant recipients were most accurate in the recognition of country items and least accurate in the recognition of classical items. There were no significant differences among implant recipients due to implant type (Nucleus, Clarion, or Ineraid), or programming strategy (SPEAK, CIS, or ACE). For cochlear implant recipients, correlations between melody recognition and other measures were moderate to weak in strength; those with statistically significant correlations included age at time of testing (negatively correlated), performance on selected speech perception tests, and the amount of focused music listening following implantation. Current-day cochlear implants are not effective in transmitting several key structural features (i.e., pitch, harmony, timbral blends) of music essential to open-set recognition of well-known musical selections. Consequently, implant
Gait is one of the few biometrics that can be measured at a distance, and is hence useful for passive surveillance as well as biometric applications. Gait recognition research is still in its infancy, however, and we have yet to solve the fundamental issue of finding gait features which at once have sufficient discrimination power and can be extracted robustly and accurately from low-resolution video. This paper describes a novel gait recognition technique based on the image self-similarity of a walking person. We contend that the similarity plot encodes a projection of gait dynamics. It is also correspondence-free, robust to segmentation noise, and works well with low-resolution video. The method is tested on multiple data sets of varying sizes and degrees of difficulty. Performance is best for fronto-parallel viewpoints, whereby a recognition rate of 98% is achieved for a data set of 6 people, and 70% for a data set of 54 people.
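A self-similarity plot of the kind described above can be sketched as a matrix of pairwise frame differences. The abstract does not give the exact measure, so this example assumes the sum of absolute pixel differences between flattened, already aligned frames; the tiny frames are made up.

```python
def self_similarity(frames):
    """Pairwise sum-of-absolute-differences between flattened image frames."""
    n = len(frames)
    plot = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = float(sum(abs(p - q) for p, q in zip(frames[i], frames[j])))
            plot[i][j] = plot[j][i] = d  # the plot is symmetric by construction
    return plot


# Made-up two-pixel "frames" that repeat with a period of two time steps.
frames = [[0, 0], [1, 1], [0, 0], [1, 1]]
plot = self_similarity(frames)
for row in plot:
    print(row)
```

Periodic motion such as walking shows up as low-valued diagonals parallel to the main diagonal, which is why no feature correspondence between frames is needed.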
Pal, Sankar K; Ganivada, Avatharam
This book provides a uniform framework describing how fuzzy rough granular neural network technologies can be formulated and used in building efficient pattern recognition and mining models. It also discusses the formation of granules in the notion of both fuzzy and rough sets. Judicious integration in forming fuzzy-rough information granules based on lower approximate regions enables the network to determine the exactness in class shape as well as to handle the uncertainties arising from overlapping regions, resulting in efficient and speedy learning with enhanced performance. Layered network and self-organizing analysis maps, which have a strong potential in big data, are considered as basic modules. The book is structured according to the major phases of a pattern recognition system (e.g., classification, clustering, and feature selection) with a balanced mixture of theory, algorithm, and application. It covers the latest findings as well as directions for future research, particularly highlighting bioinf...
In this paper, we develop a novel framework for action recognition in videos. The framework is based on automatically learning the discriminative trajectory groups that are relevant to an action. Different from previous approaches, our method does not require complex computation for graph matching or complex latent models to localize the parts. We model a video as a structured bag of trajectory groups with latent class variables. We model action recognition problem in a weakly supervised setting and learn discriminative trajectory groups by employing multiple instance learning (MIL) based Support Vector Machine (SVM) using pre-computed kernels. The kernels depend on the spatio-temporal relationship between the extracted trajectory groups and their associated features. We demonstrate both quantitatively and qualitatively that the classification performance of our proposed method is superior to baselines and several state-of-the-art approaches on three challenging standard benchmark datasets.
Solís-V., J.-Francisco; Toxqui-Quitl, Carina; Martínez-Martínez, David; H.-G., Margarita
This work presents a framework designed for Mexican Sign Language (MSL) recognition. A data set was recorded with 24 static signs from the MSL using 5 different versions; this MSL dataset was captured using a digital camera in incoherent light conditions. Digital image processing was used to segment hand gestures; a uniform background was selected to avoid using gloved hands or special markers. Feature extraction was performed by calculating normalized geometric moments of gray-scaled signs. An artificial neural network then performs the recognition, evaluated with 10-fold cross-validation in Weka. The best result achieved a recognition rate of 95.83%.
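The abstract does not define its normalized geometric moments. A common reading is the normalized central moment, eta_pq = mu_pq / mu_00^(1 + (p + q)/2), which does not change when the shape is translated. The sketch below assumes that definition; the function names and the small test image are hypothetical.

```python
def raw_moment(img, p, q):
    """Geometric moment m_pq of a gray-scale image given as a list of rows."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def normalized_central_moment(img, p, q):
    """Normalized central moment eta_pq (assumed definition, see lead-in)."""
    m00 = raw_moment(img, 0, 0)
    cx = raw_moment(img, 1, 0) / m00  # centroid column
    cy = raw_moment(img, 0, 1) / m00  # centroid row
    mu = sum(((x - cx) ** p) * ((y - cy) ** q) * v
             for y, row in enumerate(img)
             for x, v in enumerate(row))
    return mu / m00 ** (1 + (p + q) / 2)


# A 2 x 2 blob in the middle of the image, and the same blob moved to a corner.
centered = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
shifted = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(normalized_central_moment(centered, 2, 0), normalized_central_moment(shifted, 2, 0))
```

A feature vector for a classifier would typically stack several such moments of different orders (p, q).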
We proposed a face recognition algorithm based on both multilinear principal component analysis (MPCA) and linear discriminant analysis (LDA). Compared with existing traditional face recognition methods, our approach treats face images as multidimensional tensors in order to find the optimal tensor subspace for accomplishing dimension reduction. The LDA is used to project samples to a new discriminant feature space, while the K nearest neighbor (KNN) classifier is adopted for sample set classification. The results of our study and the developed algorithm are validated with the face databases ORL, FERET, and YALE and compared with PCA, MPCA, and PCA + LDA methods, which demonstrates an improvement in face recognition accuracy.
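The final classification step mentioned above, K nearest neighbor, can be sketched on its own. The example assumes that samples have already been projected into a low-dimensional discriminant space, uses Euclidean distance and majority voting, and works on made-up two-dimensional points; the MPCA and LDA stages are not reproduced here.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical projected features for two subjects, "A" and "B".
train = [
    ((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "A"),
    ((2.0, 2.1), "B"), ((2.2, 1.9), "B"), ((1.8, 2.0), "B"),
]
print(knn_predict(train, (0.1, 0.1)), knn_predict(train, (2.0, 2.0)))
```

The choice of k and of the distance metric are tuning decisions that the abstract does not report.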
Brouwer, A.; Hoogendoorn, M.; Naarding, E.
In this paper we evaluate the International Accounting Standards Board’s (IASB) efforts, in a discussion paper (DP) of 2013, to develop a new conceptual framework (CF) in the light of its stated ambition to establish a robust and consistent basis for future standard setting, thereby guiding standard
Background: The waveform morphology of intracranial pressure (ICP) pulses is an essential indicator for monitoring and forecasting critical intracranial and cerebrovascular pathophysiological variations. While current ICP pulse analysis frameworks offer satisfying results on most pulses, we observed that the performance of several of them deteriorates significantly on abnormal, or simply more challenging, pulses. Methods: This paper provides two contributions to this problem. First, it introduces MOCAIP++, a generic ICP pulse processing framework that generalizes MOCAIP (Morphological Clustering and Analysis of ICP Pulse). Its strength is to integrate several peak recognition methods to describe ICP morphology, and to exploit different ICP features to improve peak recognition. Second, it investigates the effect of incorporating automatically identified challenging pulses into the training set of peak recognition models. Results: Experiments on a large dataset of ICP signals, as well as on a representative collection of sampled challenging ICP pulses, demonstrate that both contributions are complementary and significantly improve peak recognition performance in clinical conditions. Conclusion: The proposed framework allows the extraction of more reliable statistics about the ICP waveform morphology on challenging pulses, in order to investigate the predictive power of these pulses on the condition of the patient.
Monro, Donald M; Rakshit, Soumyadip; Zhang, Dexin
This paper presents a novel iris coding method based on differences of discrete cosine transform (DCT) coefficients of overlapped angular patches from normalized iris images. The feature extraction capabilities of the DCT are optimized on the two largest publicly available iris image data sets, 2,156 images of 308 eyes from the CASIA database and 2,955 images of 150 eyes from the Bath database. On this data, we achieve 100 percent Correct Recognition Rate (CRR) and perfect Receiver-Operating Characteristic (ROC) Curves with no registered false accepts or rejects. Individual feature bit and patch position parameters are optimized for matching through a product-of-sum approach to Hamming distance calculation. For verification, a variable threshold is applied to the distance metric and the False Acceptance Rate (FAR) and False Rejection Rate (FRR) are recorded. A new worst-case metric is proposed for predicting practical system performance in the absence of matching failures, and the worst case theoretical Equal Error Rate (EER) is predicted to be as low as 2.59 × 10^-4 on the available data sets.
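The verification step described above (threshold a distance, then record FAR and FRR) can be illustrated with a plain normalized Hamming distance. This is only a sketch: it does not implement the paper's product-of-sum variant, and the bit strings, distances, and threshold are invented for illustration.

```python
def hamming_distance(code_a, code_b):
    """Fraction of positions at which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def far_frr(genuine, impostor, threshold):
    """FAR and FRR when a comparison is accepted if its distance <= threshold."""
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return far, frr


# Invented distances: same-eye comparisons (genuine) and different-eye ones (impostor).
genuine = [0.10, 0.20, 0.40]
impostor = [0.45, 0.50, 0.30]
print(hamming_distance("1010", "1001"))
print(far_frr(genuine, impostor, threshold=0.35))
```

Sweeping the threshold and plotting FAR against FRR gives the ROC curve; the Equal Error Rate is the point where the two rates coincide.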
Ramirez, J.; Gorriz, J. M.; Segura, J. C.
This chapter has presented an overview of the main challenges in robust speech detection and a review of the state of the art and applications. Voice activity detectors (VADs) are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter has summarized three robust VAD methods that yield high speech/non-speech discri...
speaker diarization code was optimized to execute faster and yield a lower Diarization Error Rate (DER). Minimizing the file read and write operations...PLP features were calculated using the same procedure described in Section 2.1.1 A second set of models was estimated that include Speaker Adaptive...non-SAT HMMs. Constrained Maximum Likelihood Linear Regression (CMLLR) transforms were estimated for each speaker , and recognition lattices were
Dielmann, Alfred; Renals, Steve
Joint Dialogue Act segmentation and classification of the new AMI meeting corpus has been performed through an integrated framework based on a switching dynamic Bayesian network and a set of continuous features and language models. The recognition process is based on a dictionary of 15 DA classes tailored for group decision-making. Experimental results show that a novel interpolated Factored Language Model results in a low error rate on the automatic segmentation task, an...
A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
Vivek Shrivastava; Navdeep Sharma
Optical Character Recognition deals with the recognition and classification of characters from an image. For the recognition to be accurate, certain topological and geometrical properties are calculated, based on which a character is classified and recognized. Humans also perceive characters by their overall shape and by features such as strokes, curves, protrusions, enclosures, etc. These properties, also called features, are extracted from the image by means of spatial pixel-...
Choe, Howard C.; Karlsen, Robert E.; Gerhart, Grant R.; Meitzler, Thomas J.
We present, in this paper, a wavelet-based acoustic signal analysis to remotely recognize military vehicles using their sound intercepted by acoustic sensors. Since expedited signal recognition is imperative in many military and industrial situations, we developed an algorithm that provides an automated, fast signal recognition once implemented in a real-time hardware system. This algorithm consists of wavelet preprocessing, feature extraction and compact signal representation, and a simple but effective statistical pattern matching. The current status of the algorithm does not require any training. The training is replaced by human selection of reference signals (e.g., squeak or engine exhaust sound) distinctive to each individual vehicle based on human perception. This allows a fast archiving of any new vehicle type in the database once the signal is collected. The wavelet preprocessing provides time-frequency multiresolution analysis using discrete wavelet transform (DWT). Within each resolution level, feature vectors are generated from statistical parameters and energy content of the wavelet coefficients. After applying our algorithm on the intercepted acoustic signals, the resultant feature vectors are compared with the reference vehicle feature vectors in the database using statistical pattern matching to determine the type of vehicle from where the signal originated. Certainly, statistical pattern matching can be replaced by an artificial neural network (ANN); however, the ANN would require training data sets and time to train the net. Unfortunately, this is not always possible for many real world situations, especially collecting data sets from unfriendly ground vehicles to train the ANN. Our methodology using wavelet preprocessing and statistical pattern matching provides robust acoustic signal recognition. We also present an example of vehicle recognition using acoustic signals collected from two different military ground vehicles. In this paper, we will
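The per-level feature generation described above can be sketched with a hand-written Haar transform. The abstract names the discrete wavelet transform but not the wavelet family or the exact statistics, so the Haar choice, the (mean, variance, energy) triple, and the sample signal are all assumptions made for illustration.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def level_features(signal, levels):
    """Mean, variance and energy of the detail coefficients at each level."""
    feats = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        n = len(detail)
        mean = sum(detail) / n
        var = sum((d - mean) ** 2 for d in detail) / n
        energy = sum(d * d for d in detail)
        feats.append((mean, var, energy))
    return feats, approx


# Made-up 8-sample signal; its length must be divisible by 2 ** levels.
signal = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
features, residual = level_features(signal, levels=3)
print(features)
```

Because this transform is orthonormal, the detail energies plus the energy of the final approximation add up to the energy of the input, which gives a quick correctness check.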
Fisher, Jane Rw; de Mello, Meena Cabral; Izutsu, Takashi; Tran, Tuan
Mental health problems in women during pregnancy and after childbirth and their adverse consequences for child health and development have received sustained detailed attention in high-income countries. In contrast, evidence has only been generated more recently in resource-constrained settings. In June 2007 the United Nations Population Fund, the World Health Organization, the Key Centre for Women's Health in Society, a WHO Collaborating Centre for Women's Health and the Research and Training Centre for Community Development in Vietnam convened the first international expert meeting on maternal mental health and child health and development in resource-constrained settings. It aimed to appraise the evidence about the nature, prevalence and risks for common perinatal mental disorders in women; the consequences of these for child health and development and ameliorative strategies in these contexts. The substantial disparity in rates of perinatal mental disorders between women living in high- and low-income settings suggests social rather than biological determinants. Risks in resource-constrained contexts include: poverty; crowded living situations; limited reproductive autonomy; unintended pregnancy; lack of empathy from the intimate partner; rigid gender stereotypes about responsibility for household work and infant care; family violence; poor physical health and discrimination. Development is adversely affected if infants lack day-to-day interactions with a caregiver who can interpret their cues and respond effectively. Women with compromised mental health are less able to provide sensitive, responsive infant care. In resource-constrained settings infants whose mothers are depressed are less likely to thrive and to receive optimal care than those whose mothers are well. The meeting outcome is the Hanoi Expert Statement (Additional file 1). It argues that the Millennium Development Goals to improve maternal health, reduce child mortality, promote gender equality
Anne, Koteswara Rao; Vankayalapati, Hima Deepthi
This book presents state-of-the-art research in speech emotion recognition. Readers are first presented with basic research and applications; gradually more advanced information is provided, giving readers comprehensive guidance for classifying emotions through speech. Simulated databases are used and results extensively compared, with the features and the algorithms implemented using MATLAB. Various emotion recognition models, such as Linear Discriminant Analysis (LDA), Regularized Discriminant Analysis (RDA), Support Vector Machines (SVM) and K-Nearest Neighbor (KNN), are explored in detail using prosody and spectral features, and feature fusion techniques.
Jones, Todd C; Bartlett, James C
In three experiments, a dual-process approach to face recognition memory is examined, with a specific focus on the idea that a recollection process can be used to retrieve configural information of a studied face. Subjects could avoid, with confidence, a recognition error to conjunction lure faces (each a reconfiguration of features from separate studied faces) or feature lure faces (each based on a set of old features and a set of new features) by recalling a studied configuration. In Experiment 1, study repetition (one vs. eight presentations) was manipulated, and in Experiments 2 and 3, retention interval over a short number of trials (0-20) was manipulated. Different measures converged on the conclusion that subjects were unable to use a recollection process to retrieve configural information in an effort to temper recognition errors for conjunction or feature lure faces. A single process, familiarity, appears to be the sole process underlying recognition of conjunction and feature faces, and familiarity contributes, perhaps in whole, to discrimination of old from conjunction faces.
Nor Aziyatul Izni Mohd Rosli
Gender recognition is trivial for a physiotherapist, but it is considered a challenge for computers. Electromyography (EMG) and heart rate variability (HRV) were utilized in this work for gender recognition during exercise using a stepper. The relevant features were extracted and selected. The selected features were then fused to automatically predict gender. However, selecting features for gender classification in a way that ensures better accuracy is a challenge. Thus, in this paper, a feature selection approach based on both the performance and the diversity between two features, derived from the rank-score characteristic (RSC) function in a combinatorial fusion approach (CFA) (Hsu et al.), was employed. Then, the features from the selected feature sets were fused using a CFA. The results were then compared with other fusion techniques such as naive Bayes (NB), decision tree (J48), k-nearest neighbor (KNN) and support vector machine (SVM). In addition, the results were compared with previous research in gender recognition. The experimental results showed that the CFA was efficient and effective for feature selection. The fusion method was also able to improve the gender recognition rate. The CFA provides a much better gender classification result of 94.51%, compared to Barani's work (90.34%), Nazarloo's work (92.50%), and other classifiers.
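The rank-score characteristic idea can be illustrated roughly as follows. This sketch assumes that each feature acts as a scoring system over the same samples, that the RSC function lists normalized scores in rank order, that diversity is the mean absolute gap between two RSC functions, and that fusion is a simple score average. These are common conventions in the combinatorial fusion literature, not details confirmed by this abstract, and all numbers are invented.

```python
def rsc_function(scores):
    """Min-max normalized scores listed from rank 1 (highest) downward."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all scores are equal
    return sorted(((s - lo) / span for s in scores), reverse=True)


def diversity(scores_a, scores_b):
    """Mean absolute difference between two rank-score characteristic functions."""
    fa, fb = rsc_function(scores_a), rsc_function(scores_b)
    return sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)


def score_combination(scores_a, scores_b):
    """Fuse two scoring systems by averaging their scores sample by sample."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


# Invented scores that two features assign to the same four samples.
emg_scores = [1.0, 3.0, 2.0, 4.0]
hrv_scores = [1.0, 1.0, 1.0, 4.0]
print(rsc_function(emg_scores))
print(diversity(emg_scores, hrv_scores))
print(score_combination(emg_scores, hrv_scores))
```

Under this reading, pairs of features with high individual performance and high diversity are the preferred candidates for fusion.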
Wang, Liqiang; Wang, Xin; Xi, Fubiao; Dong, Jian
One of the important parts of object target recognition is feature extraction, which can be classified into manual feature extraction and automatic feature extraction. The traditional neural network is one of the automatic feature extraction methods, but its global connectivity makes over-fitting highly likely. The deep learning algorithm used in this paper is a hierarchical automatic feature extraction method, trained with a layer-by-layer convolutional neural network (CNN), which can extract features from lower layers to higher layers. The resulting features are more discriminative, which benefits object target recognition.
1 Department of Electronics and Communication, G.L.A. University, 17-km stone, NH#2, Delhi-Mathura Road, .... Based upon these range of values, a decision is taken about the ...... triplet half-band filter bank and flexible k-out-of-n: A post.
Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan
Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, and the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed, and a caving dataset with 10 feature variables and three classes is obtained. The best combination of feature variables can be decided automatically by using multi-class F-score (MF-Score) feature selection. To handle the nonlinear mapping in this real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct the MEB-SVM classifier for coal-rock recognition, where the data exhibit an inherently complex distribution. The proposed method is examined on UCI data sets and the caving dataset, and compared with some recent SVM classifiers. We conduct experiments using accuracy and the Friedman test to compare multiple classifiers over multiple UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show better performance, which makes the approach promising for feature selection and multi-class recognition in coal-rock recognition. PMID:28937987
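The abstract does not spell out the MF-Score formula. A plausible reading generalizes the two-class F-score: for one feature, sum the squared distances of the class means from the overall mean and divide by the sum of the within-class sample variances. The sketch below implements that assumed form on invented values; it is not the paper's exact definition.

```python
def multiclass_f_score(values, labels):
    """Assumed multi-class F-score of one feature (needs >= 2 samples per class)."""
    overall = sum(values) / len(values)
    by_class = {}
    for v, c in zip(values, labels):
        by_class.setdefault(c, []).append(v)
    between = 0.0
    within = 0.0
    for members in by_class.values():
        mean = sum(members) / len(members)
        between += (mean - overall) ** 2
        within += sum((v - mean) ** 2 for v in members) / (len(members) - 1)
    return between / within if within else float("inf")


# Invented feature values for three classes with three samples each.
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
separating = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9]
uninformative = [1.0, 5.0, 9.0, 1.1, 5.1, 9.1, 0.9, 4.9, 8.9]
print(multiclass_f_score(separating, labels), multiclass_f_score(uninformative, labels))
```

Features would then be ranked by this score, and the highest-ranked subset passed on to the classifier.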
Choi, Io Teng; Leong, Chi Chong; Hong, Ka Wo; Pun, Chi-Man
With the progress of smartphone hardware, image recognition techniques such as face detection are now easy to run on a smartphone. In addition, the development of indoor navigation systems has been much slower than that of outdoor navigation systems. Hence, this research demonstrates the use of image recognition for navigation in indoor environments. In this paper, we introduce an indoor navigation application that uses features of the indoor environment to determine the user's location and a route calculation algorithm to generate an appropriate path for the user. The application is implemented on an Android smartphone rather than an iPhone. Yet the application design can also be applied to iOS, because it does not rely on Android-specific features. We found that a digital navigation system provides better and clearer location information than a paper map. Also, the indoor environment is well suited to image recognition processing. Hence, the results motivate us to design an indoor navigation system using image recognition.
This paper describes research in computer-aided vision on the design of a vision system that can recognize isolated handwritten characters written on a mobile support. We use a technique that analyzes the information contained in the contours of the polygon circumscribed to the character's shape. These contours are segmented and labelled to give a new set of features consisting of right and left 'profiles' and topological and algebraic invariant properties. A new character recognition method derived from this representation and based on a multilevel hierarchical technique is then described. At the primary level, we use fuzzy classification with a dynamic programming technique applied to the 'profiles'. The other levels refine the recognition using the topological and algebraic invariant properties. Several results are presented, and an accuracy of 99% was reached for handwritten numerals, attesting to the robustness of our algorithm.
Gambone, Elisabeth A.
Spacecraft control algorithms must know the expected vehicle response to any command to the available control effectors, such as reaction thrusters or torque devices. Spacecraft control system design approaches have traditionally relied on the estimated vehicle mass properties to determine the desired force and moment, as well as knowledge of the effector performance to efficiently control the spacecraft. A pattern recognition approach was used to investigate the relationship between the control effector commands and spacecraft responses. Instead of supplying the approximated vehicle properties and the thruster performance characteristics, a database of information relating the thruster ring commands and the desired vehicle response was used for closed-loop control. A Monte Carlo simulation data set of the spacecraft dynamic response to effector commands was analyzed to establish the influence a command has on the behavior of the spacecraft. A tool developed at NASA Johnson Space Center to analyze flight dynamics Monte Carlo data sets through pattern recognition methods was used to perform this analysis. Once a comprehensive data set relating spacecraft responses with commands was established, it was used in place of traditional control methods and gains set. This pattern recognition approach was compared with traditional control algorithms to determine the potential benefits and uses.
Swapnil Vitthal Tathe
Advances in computer vision technology and the availability of video capture devices such as surveillance cameras have given rise to new video processing applications. Research in video face recognition is mostly biased towards law enforcement applications. Applications involve human recognition based on face and iris, human-computer interaction, behavior analysis, video surveillance, etc. This paper presents a face tracking framework capable of face detection using Haar features, recognition using Gabor feature extraction, matching using a correlation score, and tracking using a Kalman filter. The method has a good recognition rate on real-life videos and is robust to changes in illumination, environmental factors, scale, pose, and orientation.
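As a rough illustration of the tracking stage mentioned above, the following sketch runs a scalar Kalman filter over noisy measurements of one face coordinate. The abstract does not describe the state model, so a simple random-walk model is assumed here; the function name `kalman_track` and the noise parameters `q` and `r` are hypothetical choices for illustration only.

```python
def kalman_track(measurements, x0=0.0, p0=1.0, q=0.01, r=1.0):
    """Filter a sequence of scalar position measurements.

    Assumes a random-walk state model (an illustrative choice):
    x0, p0 -- initial estimate and its variance
    q      -- process noise variance
    r      -- measurement noise variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the position is assumed unchanged, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# A face detector that keeps reporting x = 10 pulls the estimate toward 10.
estimates = kalman_track([10.0] * 50)
```

A tracker for real video would more likely use a state with position and velocity in both image axes; the scalar version only shows the predict and update cycle.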
Yuan, Chunfeng; Li, Xi; Hu, Weiming; Ling, Haibin; Maybank, Stephen J
In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidean space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves state-of-the-art performance and improves on the recognition performance of the bag-of-visual-words (BOVW) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy while the BOVW approach only achieves 88.06%. On both Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved.
In dialect speech recognition, the feature of tone in one dialect is subject to changes in pitch frequency as well as the length of tone. It is beneficial for the recognition if a representation can be derived to account for the frequency and length changes of tone in an effective and meaningful way. In this paper, we propose a method for learning such a representation from a set of unlabeled speech sentences containing the features of the dialect changed from various pitch frequencies and time length. Topographic independent component analysis (TICA) is applied for the unsupervised learning to produce an emergent result that is a topographic matrix made up of basis components. The dialect speech is topographic in the following sense: the basis components as the units of the speech are ordered in the feature matrix such that components of one dialect are grouped in one axis and changes in time windows are accounted for in the other axis. This provides a meaningful set of basis vectors that may be used to construct dialect subspaces for dialect speech recognition.
Lee, Jen-Chun; Chang, Chien-Ping; Chen, Wei-Kuei
Directional empirical mode decomposition (DEMD) has recently been proposed to make empirical mode decomposition suitable for texture analysis. Using DEMD, samples are decomposed into a series of images, referred to as two-dimensional intrinsic mode functions (2-D IMFs), from fine to coarse scale. A DEMD-based two-directional linear discriminant analysis (2LDA) for palm vein recognition is proposed. The proposed method progresses through three steps: (i) a set of 2-D IMF features at various scales and orientations is extracted using DEMD, (ii) the 2LDA method is then applied to reduce the dimensionality of the feature space in both the row and column directions, and (iii) the nearest neighbor classifier is used for classification. We also propose two strategies for using the set of 2-D IMF features: ensemble DEMD vein representation (EDVR) and multichannel DEMD vein representation (MDVR). In experiments using palm vein databases, the proposed MDVR-based 2LDA method achieved a recognition accuracy of 99.73%, thereby demonstrating its feasibility for palm vein recognition.
van de Sande, Koen E A; Gevers, Theo; Snoek, Cees G M
Image category recognition is important to access visual information on the level of objects and scene types. So far, intensity-based descriptors have been widely used for feature extraction at salient points. To increase illumination invariance and discriminative power, color descriptors have been proposed. Because many different descriptors exist, a structured overview is required of color invariant descriptors in the context of image category recognition. Therefore, this paper studies the invariance properties and the distinctiveness of color descriptors (software to compute the color descriptors from this paper is available from http://www.colordescriptors.com) in a structured way. The analytical invariance properties of color descriptors are explored, using a taxonomy based on invariance properties with respect to photometric transformations, and tested experimentally using a data set with known illumination conditions. In addition, the distinctiveness of color descriptors is assessed experimentally using two benchmarks, one from the image domain and one from the video domain. From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition. The results further reveal that, for light intensity shifts, the usefulness of invariance is category-specific. Overall, when choosing a single descriptor and no prior knowledge about the data set and object and scene categories is available, the OpponentSIFT is recommended. Furthermore, a combined set of color descriptors outperforms intensity-based SIFT and improves category recognition by 8 percent on the PASCAL VOC 2007 and by 7 percent on the Mediamill Challenge.
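To make the notion of invariance to light intensity shifts concrete, the sketch below converts RGB values to the opponent color space on which OpponentSIFT descriptors are computed. It uses the usual definition of that space; the function name `rgb_to_opponent` and the sample pixel values are hypothetical, and the SIFT descriptor computation itself is not shown.

```python
import math


def rgb_to_opponent(r, g, b):
    """Map an RGB triple to the opponent color channels (O1, O2, O3).

    O1 and O2 carry color information; O3 carries intensity.
    """
    o1 = (r - g) / math.sqrt(2)
    o2 = (r + g - 2 * b) / math.sqrt(6)
    o3 = (r + g + b) / math.sqrt(3)
    return o1, o2, o3


# Adding the same offset to every channel models a light intensity shift.
pixel = (100, 50, 20)
shifted = tuple(value + 30 for value in pixel)
original_channels = rgb_to_opponent(*pixel)
shifted_channels = rgb_to_opponent(*shifted)
```

Because the offset cancels in O1 and O2, those two channels are identical for `pixel` and `shifted`, while O3 changes; this is the shift invariance that the study evaluates per category.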
Action recognition from video is a problem that has many important applications to human motion analysis. In real-world settings, the viewpoint of the camera cannot always be fixed relative to the subject, so view-invariant action recognition methods are needed. Previous view-invariant methods use multiple cameras in both the training and testing phases of action recognition or require storing many examples of a single action from multiple viewpoints. In this paper, we present a framework for learning a compact representation of primitive actions (e.g., walk, punch, kick, sit) that can be used for video obtained from a single camera for simultaneous action recognition and viewpoint estimation. Using our method, which models the low-dimensional structure of these actions relative to viewpoint, we show recognition rates on a publicly available dataset previously only achieved using multiple simultaneous views.
Alvi, Fahad Bashir; Pears, Russel
This study proposes a novel method for face recognition based on anthropometric features, using an integrated approach that comprises a global model and personalized models. The system is aimed at situations where lighting, illumination, and pose variations cause problems in face recognition. A personalized model covers individual aging patterns, while a global model captures general aging patterns in the database. We introduce a de-aging factor that de-ages each individual in the test and training sets of the database. We used the k-nearest-neighbor approach to build the personalized and global models, and regression analysis was applied to build the models. During the test phase, we resort to voting on different features. We evaluated our technique on the FG-Net database and achieved a 65 percent Rank-1 identification rate.
Petridis, Stavros; Li, Zuwei; Pantic, Maja
Traditional visual speech recognition systems consist of two stages, feature extraction and classification. Recently, several deep learning approaches have been presented which automatically extract features from the mouth images and aim to replace the feature extraction stage. However, research on
Kim, Woong-Ki; Park, Soon-Yong; Lee, Yong-Bum; Kim, Seung-Ho; Lee, Jong-Min; Chien, Sung-Il.
Nuclear fuel rods must be identified and managed systematically by the numeric characters printed at the end of each rod during fuel assembly production. The characters are also used to trace the manufacturing process of a fuel rod during inspection of irradiated fuel rods. Automatic character recognition is therefore one of the most important technologies for establishing an automated fuel assembly manufacturing process. In the developed character recognition system, a mesh feature set extracted from each character written on the fuel rod is used to train a back-propagation neural network as the classifier. Performance was evaluated on a test set that is not included in the training character set.
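A mesh feature of the kind mentioned above can be sketched as follows: the binary character image is divided into a grid of cells and the ink density of each cell becomes one input to the classifier. The abstract does not specify the grid size or normalization, so the function name `mesh_features`, the 2 x 2 grid, and the toy image are assumptions made for illustration.

```python
def mesh_features(image, rows, cols):
    """Split a binary image into rows x cols cells and return each cell's ink density.

    `image` is a list of equal-length lists of 0/1 pixels. The image must have
    at least `rows` pixel rows and `cols` pixel columns so no cell is empty.
    """
    height, width = len(image), len(image[0])
    features = []
    for i in range(rows):
        r0, r1 = i * height // rows, (i + 1) * height // rows
        for j in range(cols):
            c0, c1 = j * width // cols, (j + 1) * width // cols
            cell = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cell) / len(cell))  # fraction of ink pixels
    return features


# Toy 4 x 4 "character": a solid top-left block and a partly filled bottom-right block.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
feature_vector = mesh_features(glyph, 2, 2)
```

The resulting vector, listed row by row over the grid, would then be fed to the neural network; the network itself is outside the scope of this sketch.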
Voss, Joel L.; Baym, Carol L.; Paller, Ken A.
Recognition confidence and the explicit awareness of memory retrieval commonly accompany accurate responding in recognition tests. Memory performance in recognition tests is widely assumed to measure explicit memory, but the generality of this assumption is questionable. Indeed, whether recognition in nonhumans is always supported by explicit memory is highly controversial. Here we identified circumstances wherein highly accurate recognition was unaccompanied by hallmark features of explicit ...
van Kasteren, T.; Noulas, A.; Englebienne, G.; Kröse, B.
A sensor system capable of automatically recognizing activities would allow many potential ubiquitous applications. In this paper, we present an easy to install sensor network and an accurate but inexpensive annotation method. A recorded dataset consisting of 28 days of sensor data and its
The extraction of a valuable set of features and the design of a discriminative classifier are crucial for target recognition in SAR images. Although various features and classifiers have been proposed over the years, target recognition under extended operating conditions (EOCs) is still a challenging problem, e.g., targets with configuration variation, different capture orientations, and articulation. To address these problems, this paper presents a new strategy for target recognition. We first propose a low-dimensional representation model by incorporating a multi-manifold regularization term into the low-rank matrix factorization framework. Two rules, pairwise similarity and local linearity, are employed for constructing the multiple manifold regularization. By alternately optimizing the matrix factorization and manifold selection, the feature representation model can not only acquire the optimal low-rank approximation of the original samples but also capture the intrinsic manifold structure information. Then, to take full advantage of the local structure property of the features and further improve the discriminative ability, local sparse representation is proposed for classification. Finally, extensive experiments on the moving and stationary target acquisition and recognition (MSTAR) database demonstrate the effectiveness of the proposed strategy, including target recognition under EOCs, as well as its capability with small training sizes.
Peng, Liangrui; Liu, Changsong; Ding, Xiaoqing; Wang, Hua; Jin, Jianming
Mongolian is one of the major ethnic languages in China. A large number of printed Mongolian documents need to be digitized for digital libraries and various applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. Because traditional Mongolian script has some special characteristics, for example, one character may be part of another character, we define the character set for recognition according to the segmented components, and the components are combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation points by analyzing the properties of projections and connected components. As Mongolian has different font types, which are categorized into two major groups, the segmentation parameter is adjusted for each group. A font-type classification method for the two font-type groups is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and the relevant character recognition kernels are integrated. Experiments show that the presented methods are effective. The text recognition rate is 96.9% on test samples from practical documents with multiple font types and mixed scripts.
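The projection analysis mentioned for character segmentation can be illustrated with a minimal sketch that computes a projection profile of a binary image and splits it at empty gaps. This is only the simplest form of the idea: the abstract does not describe the actual criteria, which also involve connected components and per-font-group parameters. The function names `projection_profile` and `split_on_gaps` and the toy image are hypothetical.

```python
def projection_profile(image, axis):
    """Count ink pixels (1s) per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(column) for column in zip(*image)]


def split_on_gaps(profile):
    """Return (start, end) pairs, end exclusive, for runs of non-empty entries."""
    segments, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a new component begins
        elif count == 0 and start is not None:
            segments.append((start, i))    # an empty line closes the component
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Toy vertical text line: two components separated by one empty pixel row.
line_image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
row_profile = projection_profile(line_image, axis=0)
components = split_on_gaps(row_profile)
```

For connected scripts, splitting only at zero-valued gaps is usually not enough; a practical scheme would look for local minima of the profile and validate candidate points, which is where the tunable segmentation parameter described in the abstract would come in.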
Hefnawy, Alaa; Mashali, Samia A.; Rashwan, Mohsen; Fikri, Magdi
This paper introduces a fast and efficient indexing approach for both 2D and 3D model-based object recognition in the presence of rotation, translation, and scale variations of objects. The indexing entries are computed after preprocessing the data by Haar wavelet decomposition. The scheme is based on a unified image feature detection approach based on Zernike moments. A set of low-level features, e.g., high-precision edges and gray-level corners, is estimated by a set of orthogonal Zernike moments calculated locally around every image point. High-dimensional, highly descriptive indexing entries are then calculated based on the correlation of these local features and employed for fast access to the model database to generate hypotheses. A list of the top candidate models is then produced by evaluating the hypotheses. Experimental results are included to demonstrate the effectiveness of the proposed indexing approach.
Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying
In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions, and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called Target Sample Re-Generator (TSRG). By using TSRG, we are able to re-generate the samples from the target micro-expression database so that the re-generated target samples share the same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned from the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments based on the SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.
The research and development of pattern recognition have proven to be of importance in science, technology, and human activity. Many useful concepts and tools from different disciplines have been employed in pattern recognition. Among them is string matching, which receives much theoretical and practical attention. String matching is also an important topic in combinatorial optimization. This book is devoted to recent advances in pattern recognition and string matching. It consists of twenty-eight chapters written by different authors, addressing a broad range of topics such as those from classification, matching, mining, feature selection, and applications. Each chapter is self-contained, and presents either novel methodological approaches or applications of existing theories and techniques. The aim, intent, and motivation for publishing this book is to provide a reference tool for the increasing number of readers who depend upon pattern recognition or string matching in some way. This includes student...
Scriba, Gerhard K E
Chiral recognition phenomena play an important role in nature as well as in the analytical separation sciences. In separation sciences such as chromatography and capillary electrophoresis, enantiospecific interactions between the enantiomers of an analyte and the chiral selector are required in order to observe enantioseparations. Due to the large structural variety of the chiral selectors applied, different mechanisms and structural features contribute to the chiral recognition process. This chapter briefly illustrates the current models of enantiospecific recognition on the structural basis of various chiral selectors.