WorldWideScience

Sample records for supervised learning algorithm

  1. Comparison of Supervised and Unsupervised Learning Algorithms for Pattern Classification

    Directory of Open Access Journals (Sweden)

    R. Sathya

    2013-02-01

    This paper presents a comparative account of unsupervised and supervised learning models and their pattern classification evaluations as applied to the higher education scenario. Classification plays a vital role in machine-learning algorithms. In the present study, we found that, although the error back-propagation learning algorithm provided by the supervised learning model is very efficient for a number of non-linear real-time problems, the KSOM of the unsupervised learning model offers an efficient solution and classification.

  2. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Ardjan Zwartjes

    2016-10-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network lifetime. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning-based algorithms using sampled data. An important issue, however, is the training phase of these learning-based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning at the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  3. Supervised learning algorithms for visual object categorization

    NARCIS (Netherlands)

    bin Abdullah, A.

    2010-01-01

    This thesis presents novel techniques for image recognition systems for better understanding image content. More specifically, it looks at the algorithmic aspects and experimental verification to demonstrate the capability of the proposed algorithms. These techniques aim to improve the three major

  4. A new supervised learning algorithm for spiking neurons.

    Science.gov (United States)

    Xu, Yan; Zeng, Xiaoqin; Zhong, Shuiming

    2013-06-01

    The purpose of supervised learning with temporal encoding for spiking neurons is to make the neurons emit a specific spike train encoded by the precise firing times of spikes. If only running time is considered, supervised learning for a spiking neuron is equivalent to distinguishing the times of desired output spikes from all other times during the running process of the neuron by adjusting synaptic weights, which can be regarded as a classification problem. Based on this idea, this letter proposes a new supervised learning method for spiking neurons with temporal encoding; it first transforms the supervised learning into a classification problem and then solves the problem by using the perceptron learning rule. The experimental results show that the proposed method has higher learning accuracy and efficiency than the existing learning methods, so it is more powerful for solving complex and real-time problems.
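
    The abstract above relies on the perceptron learning rule. As a minimal sketch of that generic rule only (not the authors' spiking-neuron method), the example below trains a perceptron on toy data. The feature vectors, labels, and the function names `train_perceptron` and `predict` are hypothetical and chosen purely for illustration; label 1 stands in for "desired spike time" and label 0 for "any other time".

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Generic perceptron rule: w <- w + lr * (target - output) * x."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if activation > 0 else 0
            delta = target - output
            if delta != 0:
                # Update only when the sample is misclassified.
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:
            break  # every training sample is classified correctly
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the weighted sum exceeds zero, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical, linearly separable toy data (not taken from the paper).
samples = [[0.9, 0.8], [0.8, 1.0], [0.1, 0.2], [0.2, 0.0]]
labels = [1, 1, 0, 0]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

    On linearly separable data such as this toy set, the rule is guaranteed to stop making mistakes after a finite number of updates; how the paper maps spike times to feature vectors is not specified in the abstract and is not modeled here.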

  5. An AdaBoost algorithm for multiclass semi-supervised learning

    NARCIS (Netherlands)

    Tanha, J.; van Someren, M.; Afsarmanesh, H.; Zaki, M.J.; Siebes, A.; Yu, J.X.; Goethals, B.; Webb, G.; Wu, X.

    2012-01-01

    We present an algorithm for multiclass Semi-Supervised learning which is learning from a limited amount of labeled data and plenty of unlabeled data. Existing semi-supervised algorithms use approaches such as one-versus-all to convert the multiclass problem to several binary classification problems

  6. Online semi-supervised learning: algorithm and application in metagenomics

    NARCIS (Netherlands)

    S. Imangaliyev; B. Keijser; W. Crielaard; E. Tsivtsivadze

    2013-01-01

    As the amount of metagenomic data grows rapidly, online statistical learning algorithms are poised to play a key role in metagenome analysis tasks. Frequently, data are only partially labeled, that is, the dataset contains partial information about the problem of interest. This work presents an algorithm an

  7. Online Semi-Supervised Learning: Algorithm and Application in Metagenomics

    NARCIS (Netherlands)

    Imangaliyev, S.; Keijser, B.J.F.; Crielaard, W.; Tsivtsivadze, E.

    2013-01-01

    As the amount of metagenomic data grows rapidly, online statistical learning algorithms are poised to play a key role in metagenome analysis tasks. Frequently, data are only partially labeled, that is, the dataset contains partial information about the problem of interest. This work presents an algorithm and

  8. Predicting incomplete gene microarray data with the use of supervised learning algorithms

    CSIR Research Space (South Africa)

    Twala, B

    2010-10-01

    ... of many well-established supervised learning (SL) algorithms in an attempt to provide more accurate and automatic diagnosis class (cancer/non cancer) prediction. Virtually all research on SL addresses the task of learning to classify complete domain...

  9. Fall detection using supervised machine learning algorithms: A comparative study

    KAUST Repository

    Zerrouki, Nabil

    2017-01-05

    Fall incidents are considered the leading cause of disability and even mortality among older adults. To address this problem, the fields of fall detection and prevention have received a lot of attention over the past years and attracted many research efforts. We present in the current study an overall performance comparison between fall detection systems using the most popular machine learning approaches: Naïve Bayes, K nearest neighbor, neural network, and support vector machine. The analysis of the classification power associated with these widely utilized algorithms is conducted on two fall detection databases, namely FDD and URFD. Since the performance of a classification algorithm is inherently dependent on the features, we extracted and used the same features for all classifiers. The classification evaluation is conducted using different state-of-the-art statistical measures such as the overall accuracy, the F-measure coefficient, and the area under the ROC curve (AUC) value.
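
    The three evaluation measures named above (overall accuracy, F-measure, AUC) can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the study's code: the labels and scores are made-up toy values, the 0.5 decision threshold is arbitrary, and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f_measure(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fall, 0 = no fall.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # arbitrary threshold

acc = accuracy(y_true, y_pred)
f1 = f_measure(y_true, y_pred)
area = auc(y_true, scores)
```

    Accuracy and F-measure depend on the chosen threshold, while AUC is computed from the raw scores and summarizes ranking quality across all thresholds.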

  10. Cost-conscious comparison of supervised learning algorithms over multiple data sets

    OpenAIRE

    Ulaş, Aydın; Yıldız, Olcay Taner; Alpaydın, Ahmet İbrahim Ethem

    2012-01-01

    In the literature, there exist statistical tests to compare supervised learning algorithms on multiple data sets in terms of accuracy, but they do not always generate an ordering. We propose Multi2Test, a generalization of our previous work, for ordering multiple learning algorithms on multiple data sets from "best" to "worst" where our goodness measure is composed of a prior cost term additional to generalization error. Our simulations show that Multi2Test generates orderings using pairwise...

  11. Semi-supervised prediction of gene regulatory networks using machine learning algorithms

    Indian Academy of Sciences (India)

    Nihir Patel; T L Wang

    2015-10-01

    Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.

  12. Inductive Supervised Quantum Learning

    Science.gov (United States)

    Monràs, Alex; Sentís, Gael; Wittek, Peter

    2017-05-01

    In supervised learning, an inductive learning algorithm extracts general rules from observed training instances, then the rules are applied to test instances. We show that this splitting of training and application arises naturally, in the classical setting, from a simple independence requirement with a physical interpretation of being nonsignaling. Thus, two seemingly different definitions of inductive learning happen to coincide. This follows from the properties of classical information that break down in the quantum setup. We prove a quantum de Finetti theorem for quantum channels, which shows that in the quantum case, the equivalence holds in the asymptotic setting, that is, for large numbers of test instances. This reveals a natural analogy between classical learning protocols and their quantum counterparts, justifying a similar treatment, and allowing us to inquire about standard elements in computational learning theory, such as structural risk minimization and sample complexity.

  13. Automated Quality Assessment of Structural Magnetic Resonance Brain Images Based on a Supervised Machine Learning Algorithm

    Directory of Open Access Journals (Sweden)

    Ricardo Andres Pizarro

    2016-12-01

    High-resolution three-dimensional magnetic resonance imaging (3D-MRI) is being increasingly used to delineate morphological changes underlying neuropsychiatric disorders. Unfortunately, artifacts frequently compromise the utility of 3D-MRI, yielding irreproducible results from both type I and type II errors. It is therefore critical to screen 3D-MRIs for artifacts before use. Currently, quality assessment involves slice-wise visual inspection of 3D-MRI volumes, a procedure that is both subjective and time consuming. Automating the quality rating of 3D-MRI could improve the efficiency and reproducibility of the procedure. The present study is one of the first efforts to apply a support vector machine (SVM) algorithm in the quality assessment of structural brain images, using global and region of interest (ROI) automated image quality features developed in-house. SVM is a supervised machine-learning algorithm that can predict the category of test datasets based on the knowledge acquired from a learning dataset. The performance (accuracy) of the automated SVM approach was assessed by comparing the SVM-predicted quality labels to investigator-determined quality labels. The accuracy for classifying 1457 3D-MRI volumes from our database using the SVM approach is around 80%. These results are promising and illustrate the possibility of using SVM as an automated quality assessment tool for 3D-MRI.

  14. How to measure metallicity from five-band photometry with supervised machine learning algorithms

    CERN Document Server

    Acquaviva, Viviana

    2015-01-01

    We demonstrate that it is possible to measure metallicity from the SDSS five-band photometry to better than 0.1 dex using supervised machine learning algorithms. Using spectroscopic estimates of metallicity as ground truth, we build, optimize and train several estimators to predict metallicity. We use the observed photometry, as well as derived quantities such as stellar mass and photometric redshift, as features, and we build two sample data sets at median redshifts of 0.103 and 0.218 and median r-band magnitude of 17.5 and 18.3 respectively. We find that ensemble methods, such as Random Forests of Trees and Extremely Randomized Trees, and Support Vector Machines all perform comparably well and can measure metallicity with a Root Mean Square Error (RMSE) of 0.081 and 0.090 for the two data sets when all objects are included. The fraction of outliers (objects for which the difference between true and predicted metallicity is larger than 0.2 dex) is only 2.2 and 3.9% respectively, and the RMSE decreases to 0.0...
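
    The two error measures quoted above, the Root Mean Square Error and the fraction of outliers (objects whose predicted metallicity differs from the true value by more than 0.2 dex), can be computed as in the sketch below. The metallicity values are invented for illustration and are not from the paper; the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def outlier_fraction(y_true, y_pred, threshold=0.2):
    """Share of objects with |true - predicted| above the threshold (dex)."""
    n_out = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) > threshold)
    return n_out / len(y_true)


# Hypothetical spectroscopic ("true") and predicted metallicities.
true_z = [8.70, 8.90, 9.00, 8.50]
pred_z = [8.75, 8.80, 9.30, 8.50]

error = rmse(true_z, pred_z)
outliers = outlier_fraction(true_z, pred_z)
```

    Reporting both numbers is useful because a small RMSE can hide a handful of large misses, which the outlier fraction exposes directly.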

  15. A Comparison of Supervised Machine Learning Algorithms and Feature Vectors for MS Lesion Segmentation Using Multimodal Structural MRI

    Science.gov (United States)

    Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.

    2014-01-01

    Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953

  16. Comparison of supervised machine learning algorithms for waterborne pathogen detection using mobile phone fluorescence microscopy

    Science.gov (United States)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Benien, Parul; Ozcan, Aydogan

    2017-06-01

    Giardia lamblia is a waterborne parasite that affects millions of people every year worldwide, causing a diarrheal illness known as giardiasis. Timely detection of the presence of the cysts of this parasite in drinking water is important to prevent the spread of the disease, especially in resource-limited settings. Here we provide extended experimental testing and evaluation of the performance and repeatability of a field-portable and cost-effective microscopy platform for automated detection and counting of Giardia cysts in water samples, including tap water, non-potable water, and pond water. This compact platform is based on our previous work, and is composed of a smartphone-based fluorescence microscope, a disposable sample processing cassette, and a custom-developed smartphone application. Our mobile phone microscope has a large field of view of 0.8 cm² and weighs only 180 g, excluding the phone. A custom-developed smartphone application provides a user-friendly graphical interface, guiding the users to capture a fluorescence image of the sample filter membrane and analyze it automatically at our servers using an image processing algorithm and training data, consisting of >30,000 images of cysts and >100,000 images of other fluorescent particles that are captured, including, e.g. dust. The total time that it takes from sample preparation to automated cyst counting is less than an hour for each 10 ml of water sample that is tested. We compared the sensitivity and the specificity of our platform using multiple supervised classification models, including support vector machines and nearest neighbors, and demonstrated that a bootstrap aggregating (i.e. bagging) approach using raw image file format provides the best performance for automated detection of Giardia cysts. We evaluated the performance of this machine learning enabled pathogen detection device with water samples taken from different sources (e.g. tap water, non-potable water, pond water) and achieved a

  17. Comparison of supervised machine learning algorithms for waterborne pathogen detection using mobile phone fluorescence microscopy

    Directory of Open Access Journals (Sweden)

    Ceylan Koydemir Hatice

    2017-06-01

    Giardia lamblia is a waterborne parasite that affects millions of people every year worldwide, causing a diarrheal illness known as giardiasis. Timely detection of the presence of the cysts of this parasite in drinking water is important to prevent the spread of the disease, especially in resource-limited settings. Here we provide extended experimental testing and evaluation of the performance and repeatability of a field-portable and cost-effective microscopy platform for automated detection and counting of Giardia cysts in water samples, including tap water, non-potable water, and pond water. This compact platform is based on our previous work, and is composed of a smartphone-based fluorescence microscope, a disposable sample processing cassette, and a custom-developed smartphone application. Our mobile phone microscope has a large field of view of ~0.8 cm² and weighs only ~180 g, excluding the phone. A custom-developed smartphone application provides a user-friendly graphical interface, guiding the users to capture a fluorescence image of the sample filter membrane and analyze it automatically at our servers using an image processing algorithm and training data, consisting of >30,000 images of cysts and >100,000 images of other fluorescent particles that are captured, including, e.g. dust. The total time that it takes from sample preparation to automated cyst counting is less than an hour for each 10 ml of water sample that is tested. We compared the sensitivity and the specificity of our platform using multiple supervised classification models, including support vector machines and nearest neighbors, and demonstrated that a bootstrap aggregating (i.e. bagging) approach using raw image file format provides the best performance for automated detection of Giardia cysts. We evaluated the performance of this machine learning enabled pathogen detection device with water samples taken from different sources (e.g. tap water, non-potable water, pond

  18. Comparison of supervised machine learning algorithms for waterborne pathogen detection using mobile phone fluorescence microscopy

    KAUST Repository

    Ceylan Koydemir, Hatice

    2017-06-14

    Giardia lamblia is a waterborne parasite that affects millions of people every year worldwide, causing a diarrheal illness known as giardiasis. Timely detection of the presence of the cysts of this parasite in drinking water is important to prevent the spread of the disease, especially in resource-limited settings. Here we provide extended experimental testing and evaluation of the performance and repeatability of a field-portable and cost-effective microscopy platform for automated detection and counting of Giardia cysts in water samples, including tap water, non-potable water, and pond water. This compact platform is based on our previous work, and is composed of a smartphone-based fluorescence microscope, a disposable sample processing cassette, and a custom-developed smartphone application. Our mobile phone microscope has a large field of view of ~0.8 cm² and weighs only ~180 g, excluding the phone. A custom-developed smartphone application provides a user-friendly graphical interface, guiding the users to capture a fluorescence image of the sample filter membrane and analyze it automatically at our servers using an image processing algorithm and training data, consisting of >30,000 images of cysts and >100,000 images of other fluorescent particles that are captured, including, e.g. dust. The total time that it takes from sample preparation to automated cyst counting is less than an hour for each 10 ml of water sample that is tested. We compared the sensitivity and the specificity of our platform using multiple supervised classification models, including support vector machines and nearest neighbors, and demonstrated that a bootstrap aggregating (i.e. bagging) approach using raw image file format provides the best performance for automated detection of Giardia cysts. We evaluated the performance of this machine learning enabled pathogen detection device with water samples taken from different sources (e.g. tap water, non-potable water, pond water) and achieved

  19. Supervised Learning in Multilayer Spiking Neural Networks

    CERN Document Server

    Sporea, Ioana

    2012-01-01

    The current article introduces a supervised learning algorithm for multilayer spiking neural networks. The algorithm presented here overcomes some limitations of existing learning algorithms, as it can be applied to neurons firing multiple spikes and it can in principle be applied to any linearisable neuron model. The algorithm is applied successfully to various benchmarks, such as the XOR problem and the Iris data set, as well as complex classification problems. The simulations also show the flexibility of this supervised learning algorithm, which permits different encodings of the spike timing patterns, including precise spike train encoding.

  20. Algorithm of Supervised Learning on Outlier Manifold

    Institute of Scientific and Technical Information of China (English)

    黄添强; 李凯; 郑之

    2011-01-01

    Manifold learning algorithms are an important tool in the field of dimension reduction and data visualization. Improving their efficiency and robustness is of positive significance to practical application. Classical manifold learning algorithms are generally sensitive to noise points, and existing improved algorithms still have shortcomings. This paper presents a robust manifold learning algorithm based on supervised learning and kernel functions. It introduces kernel methods and supervised learning into the dimensionality reduction process and takes full advantage of the labels of known data and the properties of the kernel function. The proposed algorithm draws samples of the same class closer together and spreads samples of different classes apart, thus improving the effect of the subsequent classification task and reducing the algorithm's sensitivity to noise on the manifold. Experiments on UCI data and Raman spectral data of leukemia show that the improved algorithm has better noise immunity.

  1. Novel Approaches for Diagnosing Melanoma Skin Lesions Through Supervised and Deep Learning Algorithms.

    Science.gov (United States)

    Premaladha, J; Ravichandran, K S

    2016-04-01

    Dermoscopy is a technique used to capture images of the skin, and these images are useful for analyzing different types of skin diseases. Malignant melanoma is a kind of skin cancer that can be severe enough to cause death. Earlier detection of melanoma prevents death, and clinicians can treat patients to increase the chances of survival. Only a few machine learning algorithms have been developed to detect melanoma using its features. This paper proposes a Computer Aided Diagnosis (CAD) system equipped with efficient algorithms to classify and predict melanoma. The images are enhanced using the Contrast Limited Adaptive Histogram Equalization (CLAHE) technique and a median filter. A new segmentation algorithm called Normalized Otsu's Segmentation (NOS) is implemented to segment the affected skin lesion from the normal skin, which overcomes the problem of variable illumination. Fifteen features are derived and extracted from the segmented images and fed into the proposed classification techniques, namely Deep Learning based Neural Networks and Hybrid Adaboost-Support Vector Machine (SVM) algorithms. The proposed system is tested and validated with nearly 992 images (malignant & benign lesions) and provides a high classification accuracy of 93 %. The proposed CAD system can assist dermatologists in confirming the diagnosis and avoiding excisional biopsies.

  2. Evaluation of supervised machine-learning algorithms to distinguish between inflammatory bowel disease and alimentary lymphoma in cats.

    Science.gov (United States)

    Awaysheh, Abdullah; Wilcke, Jeffrey; Elvinger, François; Rees, Loren; Fan, Weiguo; Zimmerman, Kurt L

    2016-11-01

    Inflammatory bowel disease (IBD) and alimentary lymphoma (ALA) are common gastrointestinal diseases in cats. The very similar clinical signs and histopathologic features of these diseases make the distinction between them diagnostically challenging. We tested the use of supervised machine-learning algorithms to differentiate between the 2 diseases using data generated from noninvasive diagnostic tests. Three prediction models were developed using 3 machine-learning algorithms: naive Bayes, decision trees, and artificial neural networks. The models were trained and tested on data from complete blood count (CBC) and serum chemistry (SC) results for the following 3 groups of client-owned cats: normal, inflammatory bowel disease (IBD), or alimentary lymphoma (ALA). Naive Bayes and artificial neural networks achieved higher classification accuracy (sensitivities of 70.8% and 69.2%, respectively) than the decision tree algorithm (63%, p machine learning provided a method for distinguishing between ALA-IBD, ALA-normal, and IBD-normal. The naive Bayes and artificial neural networks classifiers used 10 and 4 of the CBC and SC variables, respectively, to outperform the C4.5 decision tree, which used 5 CBC and SC variables in classifying cats into the 3 classes. These models can provide another noninvasive diagnostic tool to assist clinicians with differentiating between IBD and ALA, and between diseased and nondiseased cats. © 2016 The Author(s).

  3. Incremental Supervised Subspace Learning for Face Recognition

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Subspace learning algorithms have been well studied in face recognition. Among them, linear discriminant analysis (LDA) is one of the most widely used supervised subspace learning methods. Due to the difficulty of designing an incremental solution of the eigen decomposition on the product of matrices, there is little work on computing LDA incrementally. To avoid this limitation, an incremental supervised subspace learning (ISSL) algorithm was proposed, which incrementally learns an adaptive subspace by optimizing the maximum margin criterion (MMC). With dynamically added face images, ISSL can effectively constrain the computational cost. Feasibility of the new algorithm has been successfully tested on different face data sets.

  4. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    In this paper, a new supervised clustering and classification method is proposed. First, the application of discriminant partial least squares (DPLS) for the selection of a minimum number of key genes is applied on a gene expression microarray data set. Second, supervised hierarchical clustering ...

  5. Application of supervised machine learning algorithms for the classification of regulatory RNA riboswitches.

    Science.gov (United States)

    Singh, Swadha; Singh, Raghvendra

    2016-04-03

    Riboswitches, the small structured RNA elements, were discovered about a decade ago. It has been the subject of intense interest to identify riboswitches, understand their mechanisms of action and use them in genetic engineering. The accumulation of genome and transcriptome sequence data and comparative genomics provide unprecedented opportunities to identify riboswitches in the genome. In the present study, we have evaluated the following six machine learning algorithms for their effectiveness in classifying riboswitches: J48, BayesNet, Naïve Bayes, Multilayer Perceptron, sequential minimal optimization, and hidden Markov model (HMM). To determine the most effective classifier, the algorithms were compared on the statistical measures of specificity, sensitivity, accuracy, F-measure and receiver operating characteristic (ROC) plot analysis. The classifier Multilayer Perceptron achieved the best performance, with the highest specificity, sensitivity, F-score and accuracy, and with the largest area under the ROC curve, whereas HMM was the poorest performer. At present, the available tools for the prediction and classification of riboswitches are based on covariance model, support vector machine and HMM. The present study determines Multilayer Perceptron as a better classifier for genome-wide riboswitch searches.

  6. Classification models for clear cell renal carcinoma stage progression, based on tumor RNAseq expression trained supervised machine learning algorithms.

    Science.gov (United States)

    Jagga, Zeenia; Gupta, Dinesh

    2014-01-01

    Clear-cell Renal Cell Carcinoma (ccRCC) is the most prevalent, chemotherapy-resistant and lethal adult kidney cancer. There is a need for novel diagnostic and prognostic biomarkers for ccRCC, due to its heterogeneous molecular profiles and asymptomatic early stage. This study aims to develop classification models to distinguish early stage and late stage of ccRCC based on gene expression profiles. We employed supervised learning algorithms (J48, Random Forest, SMO and Naïve Bayes), with model learning enriched by fast correlation based feature selection, to develop classification models trained on sequencing-based gene expression data of RNAseq experiments, obtained from The Cancer Genome Atlas. Different models developed in the study were evaluated on the basis of 10-fold cross-validation and independent dataset testing. The Random Forest based prediction model performed best amongst the models developed in the study, with a sensitivity of 89%, accuracy of 77% and area under the Receiver Operating Characteristic curve of 0.8. We anticipate that the prioritized subset of 62 genes and prediction models developed in this study will aid experimental oncologists to expedite understanding of the molecular mechanisms of stage progression and discovery of prognostic factors for ccRCC tumors.
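
    The abstract above evaluates models with 10-fold cross-validation. A minimal sketch of how the sample indices can be partitioned for such an evaluation follows. It assumes a simple contiguous split without shuffling or class stratification (real studies often add both), and the function name `k_fold_indices` and the sample count of 25 are hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Folds are contiguous and their sizes differ by at most one sample.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Hypothetical data set of 25 samples split into 10 folds.
folds = list(k_fold_indices(25, k=10))
```

    Each sample lands in exactly one test fold, so averaging a metric over the ten folds uses every sample for testing once and for training nine times.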

  7. Results of Evolution Supervised by Genetic Algorithms

    CERN Document Server

    Jäntschi, Lorentz; Bălan, Mugur C; Sestraş, Radu E

    2010-01-01

    A series of results of evolution supervised by genetic algorithms, of interest to the agricultural and horticultural fields, is reviewed. New original results obtained from the use of genetic algorithms on structure-activity relationships are reported.

  8. Multi-Instance Learning from Supervised View

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Zhou

    2006-01-01

    In multi-instance learning, the training set comprises labeled bags that are composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper studies multi-instance learning from the view of supervised learning. First, by analyzing some representative learning algorithms, this paper shows that multi-instance learners can be derived from supervised learners by shifting their focus from discrimination on the instances to discrimination on the bags. Second, considering that ensemble learning paradigms can effectively enhance supervised learners, this paper proposes to build multi-instance ensembles to solve multi-instance problems. Experiments on a real-world benchmark test show that ensemble learning paradigms can significantly enhance multi-instance learners.
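
    One simple way to move from instance-level to bag-level discrimination is the standard multi-instance assumption that a bag is positive when at least one of its instances is positive. The sketch below illustrates only that idea; it is not the ensemble method proposed in the paper, and `predict_bag`, `toy_scorer` and the threshold of 0.5 are hypothetical names and values chosen for illustration.

```python
def predict_bag(bag, instance_scorer, threshold=0.5):
    """Label a bag positive when its highest-scoring instance is positive."""
    return max(instance_scorer(x) for x in bag) >= threshold


# Hypothetical instance-level scorer: any supervised model that maps an
# instance to a score in [0, 1] could be plugged in here instead.
def toy_scorer(instance):
    return 1.0 if instance > 10 else 0.0


# Each bag is a list of unlabeled instances; only the bag gets a label.
bags = [[1, 2, 3], [4, 50, 6], [11]]
print([predict_bag(b, toy_scorer) for b in bags])
```

    Taking the maximum over instance scores is one possible aggregation rule; other rules (for example, averaging) correspond to different assumptions about how bag labels arise.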

  9. Supervised Dictionary Learning

    Science.gov (United States)

    2008-11-01

    recently led to state-of-the-art results for numerous low-level image processing tasks such as denoising [2], showing that sparse models are well... denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. IP, 54(12), 2006. [3] K. Huang and S. Aviyente. Sparse...2006. [19] M. Aharon, M. Elad, and A. M. Bruckstein. The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representations

  10. Learning Dynamics in Doctoral Supervision

    DEFF Research Database (Denmark)

    Kobayashi, Sofie

    This doctoral research explores doctoral supervision within life science research in a Danish university. From one angle it investigates doctoral students’ experiences with strengthening the relationship with their supervisors through a structured meeting with the supervisor, prepared as part...... investigates learning opportunities in supervision with multiple supervisors. This was investigated through observations and recording of supervision, and subsequent analysis of transcripts. The analyses used different perspectives on learning; learning as participation, positioning theory and variation theory....... The research illuminates how learning opportunities are created in the interaction through the scientific discussions. It also shows how multiple supervisors can contribute to supervision by providing new perspectives and opinions that have a potential for creating new understandings. The combination...

  11. Algorithms for Reinforcement Learning

    CERN Document Server

    Szepesvari, Csaba

    2010-01-01

    Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms'

  12. Graph-based semi-supervised learning

    CERN Document Server

    Subramanya, Amarnag

    2014-01-01

    While labeled data is expensive to prepare, ever increasing amounts of unlabeled data is becoming widely available. In order to adapt to this phenomenon, several semi-supervised learning (SSL) algorithms, which learn from labeled as well as unlabeled data, have been developed. In a separate line of work, researchers have started to realize that graphs provide a natural way to represent data in a variety of domains. Graph-based SSL algorithms, which bring together these two lines of work, have been shown to outperform the state-of-the-art in many applications in speech processing, computer visi

  13. Current Directions in Supervised Learning Research

    Institute of Scientific and Technical Information of China (English)

    蒋艳凰; 周海芳; 杨学军

    2003-01-01

    Supervised learning is very important in the machine learning area and has been making great progress in many directions. This article summarizes three of these directions, which are hot problems in the supervised learning field: (a) improving classification accuracy by learning ensembles of classifiers, (b) methods for scaling up supervised learning algorithms, and (c) extracting understandable rules from classifiers.

  14. Supervision Learning as Conceptual Threshold Crossing: When Supervision Gets "Medieval"

    Science.gov (United States)

    Carter, Susan

    2016-01-01

    This article presumes that supervision is a category of teaching, and that we all "learn" how to teach better. So it enquires into what novice supervisors need to learn. An anonymised digital questionnaire sought data from supervisors [n = 226] on their experiences of supervision to find out what was difficult, and supervisor interviews…

  16. Supervised Dictionary Learning

    CERN Document Server

    Mairal, Julien; Ponce, Jean; Sapiro, Guillermo; Zisserman, Andrew

    2008-01-01

    It is now well established that sparse signal models are well suited to restoration tasks and can effectively be learned from audio, image, and video data. Recent research has been aimed at learning discriminative sparse models instead of purely reconstructive ones. This paper proposes a new step in that direction, with a novel sparse representation for signals belonging to different classes in terms of a shared dictionary and multiple class-decision functions. The linear variant of the proposed model admits a simple probabilistic interpretation, while its most general variant admits an interpretation in terms of kernels. An optimization framework for learning all the components of the proposed model is presented, along with experimental results on standard handwritten digit and texture classification tasks.

  17. Supervised Speech Separation Based on Deep Learning: An Overview

    OpenAIRE

    Wang, DeLiang; Chen, Jitong

    2017-01-01

    Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised separation algorithms have been put forward. In particular, the recent introduction of deep learning ...

  18. A Supervised Classification Algorithm for Note Onset Detection

    Directory of Open Access Journals (Sweden)

    Douglas Eck

    2007-01-01

    This paper presents a novel approach to detecting onsets in music audio files. We use a supervised learning algorithm to classify spectrogram frames extracted from digital audio as being onsets or non-onsets. Frames classified as onsets are then treated with a simple peak-picking algorithm based on a moving average. We present two versions of this approach. The first version uses a single neural network classifier. The second version combines the predictions of several networks trained using different hyperparameters. We describe the details of the algorithm and summarize the performance of both variants on several datasets. We also examine our choice of hyperparameters by describing results of cross-validation experiments done on a custom dataset. We conclude that a supervised learning approach to note onset detection performs well and warrants further investigation.
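
    The peak-picking step can be illustrated with a minimal sketch. The abstract does not give the exact rule, so the version below is an assumption: a frame is kept when it is a local maximum of the classifier output and exceeds the moving average of its neighbourhood by a margin. The names `pick_onsets`, `window` and `delta`, and the sample values, are hypothetical.

```python
def pick_onsets(activation, window=5, delta=0.1):
    """Return frame indices that look like onsets.

    A frame is kept when it is the maximum of its neighbourhood and
    exceeds the moving average of that neighbourhood by at least `delta`.
    """
    onsets = []
    half = window // 2
    for i, value in enumerate(activation):
        lo = max(0, i - half)
        hi = min(len(activation), i + half + 1)
        neighbourhood = activation[lo:hi]
        moving_avg = sum(neighbourhood) / len(neighbourhood)
        is_local_max = value == max(neighbourhood)
        if is_local_max and value - moving_avg >= delta:
            onsets.append(i)
    return onsets


# Hypothetical per-frame classifier outputs (probability of "onset").
activation = [0.0, 0.1, 0.9, 0.2, 0.1, 0.0, 0.1, 0.8, 0.7, 0.1, 0.0]
print(pick_onsets(activation))  # frames 2 and 7
```

    Comparing against a moving average rather than a fixed threshold lets the detector adapt to passages where the classifier output is uniformly high or low.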

  19. Semi-Supervised Transfer Learning Text Classification Algorithms Based on SVM

    Institute of Scientific and Technical Information of China (English)

    谭建平; 刘波; 肖燕珊

    2016-01-01

    With the rapid development of the Internet, the volume of text information has become huge, and large-scale text processing has become a challenge. An important text processing technique is classification, but traditional SVM-based text classification algorithms can no longer keep up with the rapid growth of text. How to transfer outdated historical text data (source task data) to help classify newly generated text data is therefore especially important. This article proposes a semi-supervised SVM transfer learning algorithm (Semi-supervised TL_SVM) for text classification. First, transfer learning is introduced into the semi-supervised SVM model to build the classification model. Second, an iterative method is used to solve the optimization objective, yielding a classifier for the target domain. Experiments show that the semi-supervised SVM transfer learning classifier achieves higher accuracy than traditional classifiers.

  20. Supervised Machine Learning Algorithms Can Classify Open-Text Feedback of Doctor Performance With Human-Level Accuracy.

    Science.gov (United States)

    Gibbons, Chris; Richards, Suzanne; Valderas, Jose Maria; Campbell, John

    2017-03-15

    Machine learning techniques may be an effective and efficient way to classify open-text reports on doctors' activity for the purposes of quality assurance, safety, and continuing professional development. The objective of the study was to evaluate the accuracy of machine learning algorithms trained to classify open-text reports of doctor performance and to assess the potential for classifications to identify significant differences in doctors' professional performance in the United Kingdom. We used 1636 open-text comments (34,283 words) relating to the performance of 548 doctors collected from a survey of clinicians' colleagues using the General Medical Council Colleague Questionnaire (GMC-CQ). We coded 77.75% (1272/1636) of the comments into 5 global themes (innovation, interpersonal skills, popularity, professionalism, and respect) using a qualitative framework. We trained 8 machine learning algorithms to classify comments and assessed their performance using several training samples. We evaluated doctor performance using the GMC-CQ and compared scores between doctors with different classifications using t tests. Individual algorithm performance was high (range F score=.68 to .83). Interrater agreement between the algorithms and the human coder was highest for the "popular" (recall=.97), "innovator" (recall=.98), and "respected" (recall=.87) codes and was lower for the "interpersonal" (recall=.80) and "professional" (recall=.82) codes. A 10-fold cross-validation demonstrated similar performance in each analysis. When combined together into an ensemble of multiple algorithms, mean human-computer interrater agreement was .88. Comments that were classified as "respected," "professional," and "interpersonal" related to higher doctor scores on the GMC-CQ compared with comments that were not classified (P<...) ... doctors who were rated as popular or innovative and those who were not rated at all (P>.05).
Machine learning algorithms can classify open-text feedback

  1. Supervised Multi-Manifold Learning Algorithm Based on ISOMAP

    Institute of Scientific and Technical Information of China (English)

    邵超; 万春红

    2014-01-01

    Existing supervised multi-manifold learning algorithms adjust the distances between data points according to their class labels, and hence the multiple manifolds can be classified successfully. However, the poor generalization ability of these algorithms results in unfaithful display of the intrinsic geometric structure of some manifolds. A supervised multi-manifold learning algorithm based on isometric mapping (ISOMAP) is proposed. A shortest path algorithm suited to the multi-manifold structure is used to compute shortest path distances that effectively approximate the corresponding geodesic distances even in the multi-manifold structure. Then, Sammon mapping is used to further preserve shorter distances in the low-dimensional embedding space. Consequently, the intrinsic geometric structure of each manifold can be faithfully displayed. Moreover, the manifold to which a new data point belongs can be judged precisely from the similarities between neighboring local tangent spaces, according to the local Euclidean nature of the manifold, and thus the proposed algorithm obtains a good generalization ability. The effectiveness of the proposed algorithm is verified by experimental results.

  2. SLEAS: Supervised Learning using Entropy as Attribute Selection Measure

    Directory of Open Access Journals (Sweden)

    Kishor Kumar Reddy C

    2014-10-01

    Scaling up widely used decision tree learning algorithms to huge datasets is of emerging importance. Although many diverse methodologies have been proposed, a fast tree-growing algorithm without a substantial decrease in accuracy or a substantial increase in space complexity is still needed. This paper aims at improving the performance of the SLIQ (Supervised Learning in Quest) decision tree algorithm for classification in data mining. In the present research, we adopt entropy as the attribute selection measure, which overcomes the problems encountered with the Gini index. The classification accuracy of the proposed supervised learning using entropy as attribute selection measure (SLEAS) algorithm is compared with that of the existing SLIQ algorithm on twelve datasets taken from the UCI Machine Learning Repository, and the results show that SLEAS outperforms the SLIQ decision tree. Further, the error rate is also computed, and the results clearly show that the SLEAS algorithm gives a lower error rate than the SLIQ decision tree.
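
    The difference between the two attribute selection measures can be seen in a small sketch. The code below computes the standard entropy and Gini impurity of a label set and the impurity reduction of a candidate split; it is a generic illustration, not the SLEAS or SLIQ implementation, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity reduction obtained by splitting `parent` into two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Toy split: 8 records, 4 per class, divided 3:1 and 1:3 by some attribute.
parent = ["yes"] * 4 + ["no"] * 4
left = ["yes"] * 3 + ["no"]
right = ["yes"] + ["no"] * 3
print(split_gain(parent, left, right, entropy))
print(split_gain(parent, left, right, gini))
```

    A tree grower would evaluate `split_gain` for every candidate attribute and threshold and keep the split with the largest reduction under the chosen measure.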

  3. Benchmarking protein classification algorithms via supervised cross-validation.

    Science.gov (United States)

    Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

    2008-04-24

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. A combination of supervised and
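
    The idea of selecting test and train sets according to known subtypes can be sketched as a leave-one-subtype-out split. This is a simplified assumption about how such splits might be built (it handles only the members of each class and ignores negative examples and the graph-theoretic distance mentioned above); `supervised_splits` and the sample records are hypothetical.

```python
from collections import defaultdict


def supervised_splits(records):
    """Yield (class, held_out_subtype, train_ids, test_ids) tuples.

    `records` is an iterable of (item_id, class_label, subtype_label).
    Each subtype is held out in turn, so the test items always come from
    a subtype that the training set has never seen.
    """
    by_class = defaultdict(lambda: defaultdict(list))
    for item_id, cls, subtype in records:
        by_class[cls][subtype].append(item_id)

    for cls, subtypes in by_class.items():
        if len(subtypes) < 2:
            continue  # nothing left to train on if the only subtype is held out
        for held_out, test_ids in subtypes.items():
            train_ids = [i for s, ids in subtypes.items() if s != held_out for i in ids]
            yield cls, held_out, train_ids, test_ids


# Hypothetical records: (protein id, class, subtype within the class).
records = [
    ("p1", "kinase", "A"), ("p2", "kinase", "A"),
    ("p3", "kinase", "B"), ("p4", "kinase", "C"),
    ("p5", "globin", "X"),
]
for split in supervised_splits(records):
    print(split)
```

    In contrast to random k-fold splitting, close relatives of a test protein never appear in the training set, which is why such schemes tend to give lower performance estimates.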

  4. Document Classification Using Expectation Maximization with Semi Supervised Learning

    CERN Document Server

    Nigam, Bhawna; Salve, Sonal; Vamney, Swati

    2011-01-01

    As the number of online documents increases, the demand for document classification to aid the analysis and management of documents is increasing. Text is cheap, but information, in the form of knowing which classes a document belongs to, is expensive. The main purpose of this paper is to explain the expectation maximization technique of data mining for classifying documents and to learn how to improve accuracy using a semi-supervised approach. The expectation maximization algorithm is applied with both supervised and semi-supervised approaches. It is found that the semi-supervised approach is more accurate and effective. The main advantage of the semi-supervised approach is "Dynamically Generation of New Class". The algorithm first trains a classifier using the labeled documents and then probabilistically classifies the unlabeled documents. The car dataset used for evaluation was collected from the UCI repository, with some changes made by the authors.

  5. Integrating the Supervised Information into Unsupervised Learning

    Directory of Open Access Journals (Sweden)

    Ping Ling

    2013-01-01

    This paper presents an assembling unsupervised learning framework that adopts information coming from the supervised learning process, and gives the corresponding implementation algorithm. The algorithm consists of two phases: first extracting and clustering data representatives (DRs) to obtain labeled training data, and then classifying non-DRs based on the labeled DRs. The implementation algorithm is called SDSN since it employs the tuning-scaled support vector domain description to collect DRs, uses a spectrum-based method to cluster DRs, and adopts the nearest neighbor classifier to label non-DRs. The validity of the first-phase clustering procedure is analyzed theoretically. A new data-dependent metric is defined in the second phase to allow the nearest neighbor classifier to work with the informed information. A fast training approach for DR extraction is provided to improve efficiency. Experimental results on synthetic and real datasets verify the correctness and performance of the proposed idea and show that SDSN is more useful in practice than the traditional pure clustering procedure.

  6. Learning Dynamics in Doctoral Supervision

    DEFF Research Database (Denmark)

    Kobayashi, Sofie

    This doctoral research explores doctoral supervision within life science research in a Danish university. From one angle it investigates doctoral students’ experiences with strengthening the relationship with their supervisors through a structured meeting with the supervisor, prepared as part...... of an introduction course for new doctoral students. This study showed how the course provides an effective way to build supervisee agency and strengthen supervisory relationships through clarification and alignment of expectations and sharing goals about doctoral studies. From the other angle the research...

  7. Quality of Service Routing Strategy Using Supervised Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    WANG Zhaoxia; SUN Yugeng; WANG Zhiyong; SHEN Huayu

    2007-01-01

    A supervised genetic algorithm (SGA) is proposed to solve quality of service (QoS) routing problems in computer networks. The supervised rules of an intelligent concept are introduced into genetic algorithms (GAs) to solve the constrained optimization problem. One of the main characteristics of SGA is that its search space can be limited to feasible regions rather than infeasible ones. The superiority of SGA over other GAs lies in the supervised search rules incorporated into it, which draw their information from the problem itself. The simulation results show that SGA improves the ability to find an optimum solution and accelerates convergence by up to 20 times.

  8. Action learning in undergraduate engineering thesis supervision

    Directory of Open Access Journals (Sweden)

    Brad Stappenbelt

    2017-03-01

    In the present action learning implementation, twelve action learning sets were conducted over eight years. The action learning sets consisted of students involved in undergraduate engineering research thesis work. The concurrent study accompanying this initiative investigated the influence of the action learning environment on student approaches to learning and any accompanying academic, learning and personal benefits realised. The influence of preferred learning styles on set function and student adoption of the action learning process was also examined. The action learning environment had a measurable, significant positive effect on student academic performance, their ability to cope with the stresses associated with conducting a research thesis, the depth of learning, the development of autonomous learners and student perception of the research thesis experience. The present study acts as an addendum to a smaller scale implementation of this action learning approach, applied to supervision of third and fourth year research projects and theses, published in 2010.

  9. Supervised Learning with Complex-valued Neural Networks

    CERN Document Server

    Suresh, Sundaram; Savitha, Ramasamy

    2013-01-01

    Recent advancements in the fields of telecommunications, medical imaging and signal processing deal with signals that are inherently time varying, nonlinear and complex-valued. The time varying, nonlinear characteristics of these signals can be effectively analyzed using artificial neural networks. Furthermore, to efficiently preserve the physical characteristics of these complex-valued signals, it is important to develop complex-valued neural networks and derive their learning algorithms to represent these signals at every step of the learning process. This monograph comprises a collection of new supervised learning algorithms along with novel architectures for complex-valued neural networks. Meta-cognition equipped with self-regulated learning is known to be the best human learning strategy. In this monograph, the principles of meta-cognition are introduced for complex-valued neural networks in both the batch and sequential learning modes. For applications where the computati...

  10. Balancing Design Project Supervision and Learning Facilitation

    DEFF Research Database (Denmark)

    Nielsen, Louise Møller

    2012-01-01

    set of demands to the design lecturer. On one hand she is the facilitator of the learning process, where the students are in charge of their own projects, and where learning happens through the students’ own experiences, successes and mistakes and on the other hand she is a supervisor, who uses her...... experiences and expertise to guide the students’ decisions in relation to the design project. This paper focuses on project supervision in the context of design education – and more specifically on how this supervision is unfolded in a Problem Based Learning culture. The paper explores the supervisor......In design there is a long tradition for apprenticeship, as well as tradition for learning through design projects. Today many design educations are positioned within the University context, and have to be aligned with the learning culture and structure, which they represent. This raises a specific...

  11. A review of supervised machine learning applied to ageing research.

    Science.gov (United States)

    Fabris, Fabio; Magalhães, João Pedro de; Freitas, Alex A

    2017-04-01

    Broadly speaking, supervised machine learning is the computational task of learning correlations between variables in annotated data (the training set), and using this information to create a predictive model capable of inferring annotations for new data, whose annotations are not known. Ageing is a complex process that affects nearly all animal species. This process can be studied at several levels of abstraction, in different organisms and with different objectives in mind. Not surprisingly, the diversity of the supervised machine learning algorithms applied to answer biological questions reflects the complexities of the underlying ageing processes being studied. Many works using supervised machine learning to study the ageing process have been published recently, so it is timely to review these works and to discuss their main findings and weaknesses. In summary, the main findings of the reviewed papers are: the link between specific types of DNA repair and ageing; ageing-related proteins tend to be highly connected and seem to play a central role in molecular pathways; ageing/longevity is linked with autophagy and apoptosis, nutrient receptor genes, and copper and iron ion transport. Additionally, several biomarkers of ageing were found by machine learning. Despite some interesting machine learning results, we also identified a weakness of current works on this topic: only one of the reviewed papers has corroborated the computational results of machine learning algorithms through wet-lab experiments. In conclusion, supervised machine learning has contributed to advancing our knowledge and has provided novel insights on ageing, yet future work should place a greater emphasis on validating the predictions.

  12. Subsampled Hessian Newton Methods for Supervised Learning.

    Science.gov (United States)

    Wang, Chien-Chih; Huang, Chun-Heng; Lin, Chih-Jen

    2015-08-01

    Newton methods can be applied in many supervised learning approaches. However, for large-scale data, the use of the whole Hessian matrix can be time-consuming. Recently, subsampled Newton methods have been proposed to reduce the computational time by using only a subset of data for calculating an approximation of the Hessian matrix. Unfortunately, we find that in some situations, the running speed is worse than the standard Newton method because cheaper but less accurate search directions are used. In this work, we propose some novel techniques to improve the existing subsampled Hessian Newton method. The main idea is to solve a two-dimensional subproblem per iteration to adjust the search direction to better minimize the second-order approximation of the function value. We prove the theoretical convergence of the proposed method. Experiments on logistic regression, linear SVM, maximum entropy, and deep networks indicate that our techniques significantly reduce the running time of the subsampled Hessian Newton method. The resulting algorithm becomes a compelling alternative to the standard Newton method for large-scale data classification.
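
    The two-dimensional subproblem can be written out as follows. This is a sketch under stated assumptions rather than the paper's exact formulation: $w_k$ is the current iterate, $H_S$ the subsampled Hessian, $d_k$ the approximate Newton direction, and $\bar{d}$ a second direction (assumed here to be, for example, the previous step); all symbols are introduced for illustration.

```latex
% q_k: second-order model of f around w_k, built with the subsampled Hessian H_S
q_k(d) = \nabla f(w_k)^{\top} d + \tfrac{1}{2}\, d^{\top} H_S\, d

% two-dimensional subproblem over the coefficients (\beta_1, \beta_2)
\min_{\beta_1,\beta_2}\; q_k\!\left(\beta_1 d_k + \beta_2 \bar{d}\right)

% setting the gradient with respect to (\beta_1, \beta_2) to zero
% gives a 2x2 linear system
\begin{bmatrix}
d_k^{\top} H_S d_k & d_k^{\top} H_S \bar{d} \\
\bar{d}^{\top} H_S d_k & \bar{d}^{\top} H_S \bar{d}
\end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}
= -
\begin{bmatrix} \nabla f(w_k)^{\top} d_k \\ \nabla f(w_k)^{\top} \bar{d} \end{bmatrix}
```

    The system has a unique solution when $H_S$ is positive definite and the two directions are linearly independent; the adjusted step is then $\beta_1 d_k + \beta_2 \bar{d}$, which costs only a few extra Hessian-vector products per iteration.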

  13. Balancing Design Project Supervision and Learning Facilitation

    DEFF Research Database (Denmark)

    Nielsen, Louise Møller

    2012-01-01

    experiences and expertise to guide the students’ decisions in relation to the design project. This paper focuses on project supervision in the context of design education – and more specifically on how this supervision is unfolded in a Problem Based Learning culture. The paper explores the supervisor......’s balance between the roles: 1) Design Project Supervisor – and 2) Learning Facilitator – with the aim to understand when to apply the different roles, and what to be aware of when doing so. This paper represents the first pilot-study of a larger research effort. It is based on a Lego Serious Play workshop......In design there is a long tradition for apprenticeship, as well as tradition for learning through design projects. Today many design educations are positioned within the University context, and have to be aligned with the learning culture and structure, which they represent. This raises a specific...

  14. Equality of Opportunity in Supervised Learning

    OpenAIRE

    Hardt, Moritz; Price, Eric; Srebro, Nathan

    2016-01-01

    We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to...
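
    The abstract is truncated before it states the criterion. One common reading of equality of opportunity is that the true positive rate should be the same in every group; the sketch below only measures that quantity and does not implement the adjustment procedure of the paper. The function names and toy data are hypothetical.

```python
def true_positive_rate(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


def tpr_by_group(y_true, y_pred, group):
    """True positive rate computed separately for each group label.

    Assumes every group contains at least one actual positive.
    """
    rates = {}
    for g in set(group):
        idx = [i for i, gi in enumerate(group) if gi == g]
        rates[g] = true_positive_rate([y_true[i] for i in idx],
                                      [y_pred[i] for i in idx])
    return rates


# Toy data: binary targets, binary predictions, and a protected attribute.
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1]
group = ["a", "a", "a", "b", "b", "b"]
print(tpr_by_group(y_true, y_pred, group))
```

    A large gap between the per-group rates suggests that qualified members of one group are being missed more often than those of another.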

  15. Missing Data Imputation for Supervised Learning

    OpenAIRE

    Poulos, Jason; Valle, Rafael

    2016-01-01

    This paper compares methods for imputing missing categorical data for supervised learning tasks. The ability of researchers to accurately fit a model and yield unbiased estimates may be compromised by missing data, which are prevalent in survey-based social science research. We experiment on two machine learning benchmark datasets with missing categorical data, comparing classifiers trained on non-imputed (i.e., one-hot encoded) or imputed data with different degrees of missing-data perturbat...
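
    The two treatments of missing categorical values that the abstract contrasts can be sketched as follows. Mode imputation is used here as an assumed example of an imputation method, since the abstract does not list the methods compared; `impute_mode`, `one_hot` and the sample column are hypothetical.

```python
from collections import Counter


def impute_mode(column, missing=None):
    """Replace missing entries with the most frequent observed category."""
    observed = [v for v in column if v != missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v == missing else v for v in column]


def one_hot(column, missing=None):
    """One-hot encode a column; a missing entry becomes an all-zero row."""
    categories = sorted({v for v in column if v != missing})
    return [[1 if v == c else 0 for c in categories] for v in column]


# Toy survey column with one missing answer.
column = ["red", None, "blue", "red"]
print(impute_mode(column))
print(one_hot(column))
```

    With imputation the classifier never sees that a value was missing, whereas the non-imputed encoding leaves the gap visible as an all-zero row.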

  16. Developing a supervised training algorithm for limited precision feed-forward spiking neural networks

    CERN Document Server

    Stromatias, Evangelos

    2011-01-01

    Spiking neural networks have been referred to as the third generation of artificial neural networks, in which information is coded as the timing of the spikes. There are a number of different spiking neuron models available, and they are categorized based on their level of abstraction. In addition, there are two known learning methods, unsupervised and supervised learning. This thesis focuses on supervised learning, where a new algorithm based on genetic algorithms is proposed. The proposed algorithm is able to train both synaptic weights and delays and also allows each neuron to emit multiple spikes, thus taking full advantage of the spatial-temporal coding power of the spiking neurons. In addition, limited synaptic precision is applied; only six bits are used to describe and train a synapse, three bits for the weights and three bits for the delays. Two limited precision schemes are investigated. The proposed algorithm is tested on the XOR classification problem where it produces better results for even smaller netwo...
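
    The six-bit synapse description can be illustrated with a small encoding sketch. The value ranges (weights in [-1, 1], delays in [1, 8]) and the uniform quantization are assumptions made for illustration; the thesis investigates two precision schemes that are not detailed in the abstract.

```python
WEIGHT_BITS = 3
DELAY_BITS = 3


def quantize(value, lo, hi, bits):
    """Map a real value in [lo, hi] to an integer code with `bits` bits."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, bits):
    """Map an integer code back to the real value it represents."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)


def pack_synapse(weight, delay, w_range=(-1.0, 1.0), d_range=(1.0, 8.0)):
    """Pack a weight and a delay into one 6-bit integer (weight in the high bits)."""
    w_code = quantize(weight, *w_range, WEIGHT_BITS)
    d_code = quantize(delay, *d_range, DELAY_BITS)
    return (w_code << DELAY_BITS) | d_code


def unpack_synapse(code, w_range=(-1.0, 1.0), d_range=(1.0, 8.0)):
    """Recover the quantized (weight, delay) pair from a 6-bit code."""
    w_code = code >> DELAY_BITS
    d_code = code & ((1 << DELAY_BITS) - 1)
    return (dequantize(w_code, *w_range, WEIGHT_BITS),
            dequantize(d_code, *d_range, DELAY_BITS))


code = pack_synapse(0.3, 4.2)
print(code, unpack_synapse(code))
```

    A fixed-width integer code of this kind is convenient for a genetic algorithm, because crossover and mutation can operate directly on the bits of each synapse.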

  17. Semi-supervised least squares support vector machine algorithm: application to offshore oil reservoir

    Science.gov (United States)

    Luo, Wei-Ping; Li, Hong-Qi; Shi, Ning

    2016-06-01

    At the early stages of deep-water oil exploration and development, wells are fewer and farther apart than in onshore oilfields. Supervised least squares support vector machine algorithms are used to predict the reservoir parameters, but the prediction accuracy is low. We combined the least squares support vector machine (LSSVM) algorithm with semi-supervised learning and established a semi-supervised regression model, which we call the semi-supervised least squares support vector machine (SLSSVM) model. Iterative matrix inversion is also introduced to improve the training ability and training time of the model. We use the UCI data to test the generalization of the semi-supervised and supervised LSSVM models. The test results suggest that the generalization performance of the LSSVM model greatly improves and with decreasing training samples the generalization performance is better. Moreover, for small-sample models, the SLSSVM method has higher precision than the semi-supervised K-nearest neighbor (SKNN) method. The new semi-supervised LSSVM algorithm was used to predict the distribution of porosity and sandstone in the Jingzhou study area.

  18. The Supervised Learning Gaussian Mixture Model

    Institute of Scientific and Technical Information of China (English)

    马继涌; 高文

    1998-01-01

    The traditional Gaussian Mixture Model (GMM) for pattern recognition is an unsupervised learning method. The parameters in the model are derived only from the training samples in one class, without taking into account the effect of the sample distributions of other classes; hence, its recognition accuracy is sometimes not ideal. This paper introduces an approach for estimating the parameters in GMM in a supervised way. The Supervised Learning Gaussian Mixture Model (SLGMM) improves the recognition accuracy of the GMM. An experimental example has shown its effectiveness. The experimental results have shown that the recognition accuracy derived by the approach is higher than those obtained by the Vector Quantization (VQ) approach, the Radial Basis Function (RBF) network model, the Learning Vector Quantization (LVQ) approach and the GMM. In addition, the training time of the approach is less than that of the Multilayer Perceptron (MLP).

  19. Opportunities to Learn Scientific Thinking in Joint Doctoral Supervision

    Science.gov (United States)

    Kobayashi, Sofie; Grout, Brian W.; Rump, Camilla Østerberg

    2015-01-01

    Research into doctoral supervision has increased rapidly over the last decades, yet our understanding of how doctoral students learn scientific thinking from supervision is limited. Most studies are based on interviews with little work being reported that is based on observation of actual supervision. While joint supervision has become widely…

  20. Using Supervised Deep Learning for Human Age Estimation Problem

    Science.gov (United States)

    Drobnyh, K. A.; Polovinkin, A. N.

    2017-05-01

    Automatic facial age estimation is a challenging task that has attracted attention in recent years. In this paper, we propose using supervised deep learning features to improve the accuracy of existing age estimation algorithms. There are many approaches to the problem; an active appearance model and bio-inspired features are the two that showed the best accuracy. For experiments we chose the popular, publicly available FG-NET database, which contains 1002 images with a broad variety of light, pose, and expression. The LOPO (leave-one-person-out) method was used to estimate the accuracy. Experiments demonstrated that adding supervised deep learning features improved accuracy for some basic models. For example, adding the features to an active appearance model gave a 4% gain (the error decreased from 4.59 to 4.41).
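The quoted 4% gain can be read as a relative reduction of the error. A quick check of that reading, using only the two figures from the abstract (the variable names are illustrative, and the abstract does not state the error unit):

```python
# Relative error reduction implied by the two figures quoted in the abstract.
baseline_error = 4.59  # active appearance model alone
improved_error = 4.41  # same model with supervised deep learning features added

relative_gain = (baseline_error - improved_error) / baseline_error
print(f"{relative_gain:.1%}")  # prints 3.9%, which rounds to the quoted 4%
```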

  1. Semi-supervised Learning with Density Based Distances

    CERN Document Server

    Bijral, Avleen S; Srebro, Nathan

    2012-01-01

    We present a simple, yet effective, approach to Semi-Supervised Learning. Our approach is based on estimating density-based distances (DBD) using a shortest path calculation on a graph. These Graph-DBD estimates can then be used in any distance-based supervised learning method, such as Nearest Neighbor methods and SVMs with RBF kernels. In order to apply the method to very large data sets, we also present a novel algorithm which integrates nearest neighbor computations into the shortest path search and can find exact shortest paths even in extremely large dense graphs. Significant runtime improvement over the commonly used Laplacian regularization method is then shown on a large scale dataset.
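The idea of a density-based distance computed by shortest paths can be illustrated with a small sketch. It assumes edge costs of the form Euclidean distance raised to a power q > 1 on a complete graph, so that paths made of many short hops through dense regions become cheaper than one long hop. That edge weighting, the function name and the parameter q are assumptions for illustration; the abstract does not specify them, and the sketch uses plain Dijkstra rather than the paper's accelerated algorithm.

```python
import heapq
import math

def density_based_distances(points, source, q=2.0):
    """Dijkstra over a complete graph whose edge cost is (Euclidean distance) ** q."""
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry, a shorter path was already found
        for j in range(n):
            if j == i:
                continue
            cost = math.dist(points[i], points[j]) ** q
            if d + cost < dist[j]:
                dist[j] = d + cost
                heapq.heappush(heap, (dist[j], j))
    return dist

# A chain of closely spaced points plus one isolated point.
# Points 3 and 4 are both at Euclidean distance 3 from point 0.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
dbd = density_based_distances(points, source=0)
print(dbd[3], dbd[4])  # the point reached through the dense chain is "closer"
```

The resulting distances could then be passed to any distance-based learner, such as a nearest-neighbor classifier, in place of the Euclidean ones.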

  2. Active semi-supervised learning method with hybrid deep belief networks.

    Science.gov (United States)

    Zhou, Shusen; Chen, Qingcai; Wang, Xiaolong

    2014-01-01

    In this paper, we develop a novel semi-supervised learning algorithm called active hybrid deep belief networks (AHD), to address the semi-supervised sentiment classification problem with deep learning. First, we construct the first several hidden layers using restricted Boltzmann machines (RBM), which can reduce the dimension and abstract the information of the reviews quickly. Second, we construct the following hidden layers using convolutional restricted Boltzmann machines (CRBM), which can abstract the information of reviews effectively. Third, the constructed deep architecture is fine-tuned by gradient-descent based supervised learning with an exponential loss function. Finally, an active learning method is combined with the proposed deep architecture. We ran several experiments on five sentiment classification datasets and show that AHD is competitive with previous semi-supervised learning algorithms. Experiments are also conducted to verify the effectiveness of our proposed method with different numbers of labeled and unlabeled reviews.

  3. Semi-Supervised Clustering Fingerprint Positioning Algorithm Based on Distance Constraints

    Institute of Scientific and Technical Information of China (English)

    Ying Xia; Zhongzhao Zhang; Lin Ma; Yao Wang

    2015-01-01

    With the rapid development of WLAN (Wireless Local Area Network) technology, an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation. This paper proposes a novel fingerprint positioning algorithm known as semi-supervised affinity propagation clustering (APC) based on distance function constraints. We show that, by employing affinity propagation techniques, it is able to use a fraction of labeled data to adjust the similarity matrix of the signal space and cluster reference points with high accuracy. The semi-supervised APC uses a combination of machine learning, clustering analysis and fingerprinting algorithms. By collecting data and testing our algorithm in a realistic indoor WLAN environment, the experimental results indicate that the proposed algorithm can improve positioning accuracy while reducing the online localization computation, compared with the widely used K nearest neighbor and maximum likelihood estimation algorithms.

  4. Co-Training Semi-Supervised Active Learning Algorithm with Noise Filter

    Institute of Scientific and Technical Information of China (English)

    詹永照; 陈亚必

    2009-01-01

    The classification performance of a classifier based on semi-supervised learning degrades when training on unlabeled samples introduces noise. To overcome this disadvantage, a co-training semi-supervised active learning algorithm with a noise filter is presented. In this algorithm, three fuzzy buried Markov models perform semi-supervised learning cooperatively. Human-computer interaction is actively introduced at appropriate times to supply class labels, which avoids rejected decisions when the classifiers disagree and wrong decisions when the initial weak classifiers all agree on an incorrect label. Meanwhile, a noise filter is used to filter out samples that were labeled automatically by the computer and may be noise. The proposed algorithm is applied to facial expression recognition. The experimental results show that the algorithm effectively improves the utilization of unlabeled samples, reduces the noise introduced by semi-supervised learning, and raises the accuracy of expression recognition.

  5. Robust head pose estimation via supervised manifold learning.

    Science.gov (United States)

    Wang, Chao; Song, Xubo

    2014-05-01

    Head poses can be automatically estimated using manifold learning algorithms, with the assumption that with the pose being the only variable, the face images should lie in a smooth and low-dimensional manifold. However, this estimation approach is challenging due to other appearance variations related to identity, head location in image, background clutter, facial expression, and illumination. To address the problem, we propose to incorporate supervised information (pose angles of training samples) into the process of manifold learning. The process has three stages: neighborhood construction, graph weight computation and projection learning. For the first two stages, we redefine inter-point distance for neighborhood construction as well as graph weight by constraining them with the pose angle information. For Stage 3, we present a supervised neighborhood-based linear feature transformation algorithm to keep the data points with similar pose angles close together but the data points with dissimilar pose angles far apart. The experimental results show that our method has higher estimation accuracy than the other state-of-art algorithms and is robust to identity and illumination variations.

  6. Research of Plant-Leaves Classification Algorithm Based on Supervised LLE

    Directory of Open Access Journals (Sweden)

    Yan Qing

    2013-06-01

    Full Text Available A new supervised LLE method based on the Fisher projection is proposed in this paper and combined with a new classification algorithm based on manifold learning to recognize plant leaves. First, the method uses the Fisher projection distance to replace the sample's geodesic distance, yielding a new supervised LLE algorithm. Then, a classification algorithm is adopted that uses the manifold reconstruction error directly to classify samples. This algorithm makes better use of the category information and effectively improves the recognition rate. It also has the advantage of easy parameter estimation. Experimental results on real-world plant leaf databases show an average recognition accuracy of up to 95.17%.

  7. Semi-Supervised Learning Based on Manifold in BCI

    Institute of Scientific and Technical Information of China (English)

    Ji-Ying Zhong; Xu Lei; De-Zhong Yao

    2009-01-01

    A Laplacian support vector machine (LapSVM) algorithm, a semi-supervised learning based on manifold, is introduced to brain-computer interface (BCI) to raise the classification precision and reduce the subjects' training complexity. The data are collected from three subjects in a three-task mental imagery experiment. LapSVM and transductive SVM (TSVM) are trained with a few labeled samples and a large number of unlabeled samples. The results confirm that LapSVM has a much better classification than TSVM.

  8. Semi-supervised learning and domain adaptation in natural language processing

    CERN Document Server

    Søgaard, Anders

    2013-01-01

    This book introduces basic supervised learning algorithms applicable to natural language processing (NLP) and shows how the performance of these algorithms can often be improved by exploiting the marginal distribution of large amounts of unlabeled data. One reason for that is data sparsity, i.e., the limited amounts of data we have available in NLP. However, in most real-world NLP applications our labeled data is also heavily biased. This book introduces extensions of supervised learning algorithms to cope with data sparsity and different kinds of sampling bias. This book is intended to be both

  9. Towards designing an email classification system using multi-view based semi-supervised learning

    NARCIS (Netherlands)

    Li, Wenjuan; Meng, Weizhi; Tan, Zhiyuan; Xiang, Yang

    2014-01-01

    The goal of email classification is to classify user emails into spam and legitimate ones. Many supervised learning algorithms have been invented in this domain to accomplish the task, and these algorithms require a large number of labeled training data. However, data labeling is a labor intensive t

  10. Genetic classification of populations using supervised learning.

    LENUS (Irish Health Repository)

    Bridges, Michael

    2011-01-01

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies.

  11. Semi-supervised Learning with Deep Generative Models

    NARCIS (Netherlands)

    Kingma, D.P.; Rezende, D.J.; Mohamed, S.; Welling, M.

    2014-01-01

    The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and

  12. Online Semi-Supervised Learning on Quantized Graphs

    CERN Document Server

    Valko, Michal; Huang, Ling; Ting, Daniel

    2012-01-01

    In this paper, we tackle the problem of online semi-supervised learning (SSL). When data arrive in a stream, the dual problems of computation and data storage arise for any SSL method. We propose a fast approximate online SSL algorithm that solves for the harmonic solution on an approximate graph. We show, both empirically and theoretically, that good behavior can be achieved by collapsing nearby points into a set of local "representative points" that minimize distortion. Moreover, we regularize the harmonic solution to achieve better stability properties. We apply our algorithm to face recognition and optical character recognition applications to show that we can take advantage of the manifold structure to outperform the previous methods. Unlike previous heuristic approaches, we show that our method yields provable performance bounds.
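For orientation, the harmonic solution referred to above assigns each unlabeled node the weighted average of its neighbors' values while labeled nodes stay fixed. The following is a minimal sketch on a small fixed graph, computed by repeated averaging. The function name, the toy graph and the iteration count are illustrative assumptions; the sketch does not implement the paper's online algorithm, its graph quantization or its regularization.

```python
def harmonic_solution(weights, labels, iterations=500):
    """Approximate the harmonic function on a graph by Gauss-Seidel averaging.

    weights: symmetric adjacency matrix as a list of lists (zero diagonal).
    labels:  dict mapping labeled node index -> fixed value.
    """
    n = len(weights)
    f = [labels.get(i, 0.5) for i in range(n)]  # unlabeled nodes start at 0.5
    for _ in range(iterations):
        for i in range(n):
            if i in labels:
                continue  # labeled nodes keep their given value
            total = sum(weights[i])
            if total > 0:
                # weighted average of the current neighbor values
                f[i] = sum(w * f[j] for j, w in enumerate(weights[i])) / total
    return f

# Path graph 0 - 1 - 2 - 3 - 4 with the two endpoints labeled 0 and 1.
weights = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
f = harmonic_solution(weights, labels={0: 0.0, 4: 1.0})
print([round(v, 3) for v in f])  # interior values interpolate between the labels
```

Thresholding the interior values at 0.5 would turn the scores into class predictions for the unlabeled nodes.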

  13. Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario

    CERN Document Server

    Yan, Yan; Fung, Glenn; Dy, Jennifer

    2012-01-01

    Learning algorithms normally assume that there is at most one annotation or label per data point. However, in some scenarios, such as medical diagnosis and on-line collaboration, multiple annotations may be available. In either case, obtaining labels for data points can be expensive and time-consuming (in some circumstances ground-truth may not exist). Semi-supervised learning approaches have shown that utilizing the unlabeled data is often beneficial in these cases. This paper presents a probabilistic semi-supervised model and algorithm that allows for learning from both unlabeled and labeled data in the presence of multiple annotators. We assume that it is known what annotator labeled which data points. The proposed approach produces annotator models that allow us to provide (1) estimates of the true label and (2) annotator variable expertise for both labeled and unlabeled data. We provide numerical comparisons under various scenarios and with respect to standard semi-supervised learning. Experiments showed ...

  14. Prototype Vector Machine for Large Scale Semi-Supervised Learning

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Kai; Kwok, James T.; Parvin, Bahram

    2009-04-29

    Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation of both the graph-based regularizer and the model representation. The choice of prototypes is grounded upon two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.

  15. The Learning Alliance: Ethics in Doctoral Supervision

    Science.gov (United States)

    Halse, Christine; Bansel, Peter

    2012-01-01

    This paper is concerned with the ethics of relationships in doctoral supervision. We give an overview of four paradigms of doctoral supervision that have endured over the past 25 years and elucidate some of their strengths and limitations, contextualise them historically and consider their implications for doctoral supervision in the contemporary…

  16. Supervised Filter Learning for Representation Based Face Recognition.

    Directory of Open Access Journals (Sweden)

    Chao Bi

    Full Text Available Representation based classification methods, such as Sparse Representation Classification (SRC) and Linear Regression Classification (LRC), have been developed successfully for the face recognition problem. However, most of these methods use the original face images without any preprocessing for recognition. Thus, their performance may be affected by some problematic factors (such as illumination and expression variances) in the face images. In order to overcome this limitation, a novel supervised filter learning algorithm is proposed for representation based face recognition in this paper. The underlying idea of our algorithm is to learn a filter so that the within-class representation residuals of the faces' Local Binary Pattern (LBP) features are minimized and the between-class representation residuals of the faces' LBP features are maximized. Therefore, the LBP features of filtered face images are more discriminative for representation based classifiers. Furthermore, we also extend our algorithm to the heterogeneous face recognition problem. Extensive experiments are carried out on five databases and the experimental results verify the efficacy of the proposed algorithm.
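As background for the LBP features mentioned above, a basic 3x3 Local Binary Pattern code compares each of a pixel's eight neighbors with the center pixel and packs the results into one byte. The sketch below shows only that feature. The neighbor ordering and the `>=` comparison are common conventions chosen here as assumptions, and the abstract does not say which LBP variant the authors use; the learned filter itself is not shown.

```python
def lbp_code(image, row, col):
    """Basic 3x3 LBP code of an interior pixel of a 2D grayscale image.

    Neighbors are visited clockwise from the top-left corner; a neighbor
    that is >= the center sets the corresponding bit (lowest bit first).
    """
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
print(lbp_code(patch, 1, 1))  # neighbors 9, 7 and 8 exceed the center 6 -> 26
```

A histogram of such codes over an image region is the usual LBP feature vector that a representation based classifier would then consume.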

  17. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine

    Science.gov (United States)

    Gao, Fei; Mei, Jingyuan; Sun, Jinping; Wang, Jun; Yang, Erfu; Hussain, Amir

    2015-01-01

    For current computational intelligence techniques, a major challenge is how to learn new concepts in changing environment. Traditional learning schemes could not adequately address this problem due to a lack of dynamic data selection mechanism. In this paper, inspired by human learning process, a novel classification algorithm based on incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of prediction confidence of samples and data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computation complexity is reduced effectively. In addition, for the possible appearance of some new labeled samples in the learning process, a detailed analysis is also carried out. The results show that our algorithm does not rely on the model of sample distribution, has an extremely low rate of introducing wrong semi-labeled samples and can effectively make use of the unlabeled samples to enrich the knowledge system of classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome the concept drift in a changing environment. PMID:26275294

  18. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine.

    Directory of Open Access Journals (Sweden)

    Fei Gao

    Full Text Available For current computational intelligence techniques, a major challenge is how to learn new concepts in changing environment. Traditional learning schemes could not adequately address this problem due to a lack of dynamic data selection mechanism. In this paper, inspired by human learning process, a novel classification algorithm based on incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of prediction confidence of samples and data distribution in a changing environment, a "soft-start" approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Noticeably, with the ingenious design procedure of our proposed algorithm, the computation complexity is reduced effectively. In addition, for the possible appearance of some new labeled samples in the learning process, a detailed analysis is also carried out. The results show that our algorithm does not rely on the model of sample distribution, has an extremely low rate of introducing wrong semi-labeled samples and can effectively make use of the unlabeled samples to enrich the knowledge system of classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome the concept drift in a changing environment.

  19. Supervised Learning of Logical Operations in Layered Spiking Neural Networks with Spike Train Encoding

    CERN Document Server

    Grüning, André

    2011-01-01

    Few algorithms for supervised training of spiking neural networks exist that can deal with patterns of multiple spikes, and their computational properties are largely unexplored. We demonstrate in a set of simulations that the ReSuMe learning algorithm can be successfully applied to layered neural networks. Input and output patterns are encoded as spike trains of multiple precisely timed spikes, and the network learns to transform the input trains into target output trains. This is done by combining the ReSuMe learning algorithm with multiplicative scaling of the connections of downstream neurons. We show in particular that layered networks with one hidden layer can learn the basic logical operations, including Exclusive-Or, while networks without hidden layer cannot, mirroring an analogous result for layered networks of rate neurons. While supervised learning in spiking neural networks is not yet fit for technical purposes, exploring computational properties of spiking neural networks advances our understand...

  20. Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression.

    Science.gov (United States)

    Zhen, Xiantong; Yu, Mengyang; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo

    2016-06-08

    Multioutput regression has recently shown great ability to solve challenging problems in both computer vision and medical image analysis. However, due to the huge image variability and ambiguity, it is fundamentally challenging to handle the highly complex input-target relationship of multioutput regression, especially with indiscriminate high-dimensional representations. In this paper, we propose a novel supervised descriptor learning (SDL) algorithm for multioutput regression, which can establish discriminative and compact feature representations to improve the multivariate estimation performance. The SDL is formulated as generalized low-rank approximations of matrices with a supervised manifold regularization. The SDL is able to simultaneously extract discriminative features closely related to multivariate targets and remove irrelevant and redundant information by transforming raw features into a new low-dimensional space aligned to targets. The resulting discriminative yet compact descriptor largely reduces the variability and ambiguity for multioutput regression, which enables more accurate and efficient multivariate estimation. We conduct extensive evaluation of the proposed SDL on both synthetic data and real-world multioutput regression tasks for both computer vision and medical image analysis. Experimental results have shown that the proposed SDL can achieve high multivariate estimation accuracy on all tasks and largely outperforms the state-of-the-art algorithms. Our method establishes a novel SDL framework for multioutput regression, which can be widely used to boost the performance in different applications.

  1. Integrative gene network construction to analyze cancer recurrence using semi-supervised learning.

    Directory of Open Access Journals (Sweden)

    Chihyun Park

    Full Text Available BACKGROUND: The prognosis of cancer recurrence is an important research area in bioinformatics and is challenging due to the small sample sizes compared to the vast number of genes. There have been several attempts to predict cancer recurrence. Most studies employed a supervised approach, which uses only a few labeled samples. Semi-supervised learning can be a great alternative to solve this problem. There have been few attempts based on manifold assumptions to reveal the detailed roles of identified cancer genes in recurrence. RESULTS: In order to predict cancer recurrence, we proposed a novel semi-supervised learning algorithm based on a graph regularization approach. We transformed the gene expression data into a graph structure for semi-supervised learning and integrated protein interaction data with the gene expression data to select functionally-related gene pairs. Then, we predicted the recurrence of cancer by applying a regularization approach to the constructed graph containing both labeled and unlabeled nodes. CONCLUSIONS: The average improvement rate of accuracy for three different cancer datasets was 24.9% compared to existing supervised and semi-supervised methods. We performed functional enrichment on the gene networks used for learning. We identified that those gene networks are significantly associated with cancer-recurrence-related biological functions. Our algorithm was developed with standard C++ and is available in Linux and MS Windows formats in the STL library. The executable program is freely available at: http://embio.yonsei.ac.kr/~Park/ssl.php.

  2. Collaborative Supervised Learning for Sensor Networks

    Science.gov (United States)

    Wagstaff, Kiri L.; Rebbapragada, Umaa; Lane, Terran

    2011-01-01

    Collaboration methods for distributed machine-learning algorithms involve the specification of communication protocols for the learners, which can query other learners and/or broadcast their findings preemptively. Each learner incorporates information from its neighbors into its own training set, and they are thereby able to bootstrap each other to higher performance. Each learner resides at a different node in the sensor network and makes observations (collects data) independently of the other learners. After being seeded with an initial labeled training set, each learner proceeds to learn in an iterative fashion. New data is collected and classified. The learner can then either broadcast its most confident classifications for use by other learners, or can query neighbors for their classifications of its least confident items. As such, collaborative learning combines elements of both passive (broadcast) and active (query) learning. It also uses ideas from ensemble learning to combine the multiple responses to a given query into a single useful label. This approach has been evaluated against current non-collaborative alternatives, including training a single classifier and deploying it at all nodes with no further learning possible, and permitting learners to learn from their own most confident judgments, absent interaction with their neighbors. On several data sets, it has been consistently found that active collaboration is the best strategy for a distributed learner network. The main advantages include the ability for learning to take place autonomously by collaboration rather than by requiring intervention from an oracle (usually human), and also the ability to learn in a distributed environment, permitting decisions to be made in situ and to yield faster response time.

  3. Weakly supervised visual dictionary learning by harnessing image attributes.

    Science.gov (United States)

    Gao, Yue; Ji, Rongrong; Liu, Wei; Dai, Qionghai; Hua, Gang

    2014-12-01

    Bag-of-features (BoF) representation has been extensively applied to deal with various computer vision applications. To extract discriminative and descriptive BoF, one important step is to learn a good dictionary to minimize the quantization loss between local features and codewords. While most existing visual dictionary learning approaches are engaged with unsupervised feature quantization, the latest trend has turned to supervised learning by harnessing the semantic labels of images or regions. However, such labels are typically too expensive to acquire, which restricts the scalability of supervised dictionary learning approaches. In this paper, we propose to leverage image attributes to weakly supervise the dictionary learning procedure without requiring any actual labels. As a key contribution, our approach establishes a generative hidden Markov random field (HMRF), which models the quantized codewords as the observed states and the image attributes as the hidden states, respectively. Dictionary learning is then performed by supervised grouping of the observed states, where the supervised information stems from the hidden states of the HMRF. In such a way, the proposed dictionary learning approach incorporates the image attributes to learn a semantic-preserving BoF representation without any genuine supervision. Experiments in large-scale image retrieval and classification tasks corroborate that our approach significantly outperforms the state-of-the-art unsupervised dictionary learning approaches.

  4. Contributions to unsupervised and supervised learning with applications in digital image processing

    OpenAIRE

    2012-01-01

    311 p. : il. [EN] This Thesis covers a broad period of research activities with a common thread: learning processes and their application to image processing. The two main categories of learning algorithms, supervised and unsupervised, have been touched across these years. The main body of initial works was devoted to unsupervised learning neural architectures, especially the Self Organizing Map. Our aim was to study its convergence properties from empirical and analytical viewpoints. From the digita...

  5. Contributions to unsupervised and supervised learning with applications in digital image processing

    OpenAIRE

    González Acuña, Ana Isabel

    2014-01-01

    311 p. : il. [EN] This Thesis covers a broad period of research activities with a common thread: learning processes and their application to image processing. The two main categories of learning algorithms, supervised and unsupervised, have been touched across these years. The main body of initial works was devoted to unsupervised learning neural architectures, especially the Self Organizing Map. Our aim was to study its convergence properties from empirical and analytical viewpoints. From the digita...

  6. CFSO3: A New Supervised Swarm-Based Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Antonino Laudani

    2013-01-01

    Full Text Available We present CFSO3, an optimization heuristic within the class of swarm intelligence, based on a synergy among three different features of the Continuous Flock-of-Starlings Optimization. One of the main novelties is that this optimizer is no longer a classical numerical algorithm, since it can now be seen as a continuous dynamic system, which can be treated by using all the mathematical instruments available for managing state equations. In addition, CFSO3 allows passing from stochastic approaches to supervised deterministic ones, since the random updating of parameters, a typical feature of numerical swarm-based optimization algorithms, is now fully substituted by a supervised strategy: in CFSO3 the tuning of parameters is designed a priori to obtain both exploration and exploitation. Indeed the exploration, that is, the escaping from a local minimum, as well as the convergence and the refinement to a solution, can be designed simply by managing the eigenvalues of the CFSO state equations. Virtually, in CFSO3 just the initial values of positions and velocities of the swarm members have to be randomly assigned. Both standard and parallel versions of CFSO3, together with validations on classical benchmarks, are presented.

  7. Opportunities to learn scientific thinking in joint doctoral supervision

    DEFF Research Database (Denmark)

    Kobayashi, Sofie; Grout, Brian William Wilson; Rump, Camilla Østerberg

    2015-01-01

    Research into doctoral supervision has increased rapidly over the last decades, yet our understanding of how doctoral students learn scientific thinking from supervision is limited. Most studies are based on interviews with little work being reported that is based on observation of actual supervi...

  8. Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data

    OpenAIRE

    Kurth, Thorsten; Zhang, Jian; Satish, Nadathur; Mitliagkas, Ioannis; Racah, Evan; Patwary, Mostofa Ali; Malas, Tareq; Sundaram, Narayanan; Bhimji, Wahid; Smorkalov, Mikhail; Deslippe, Jack; Shiryaev, Mikhail; Sridharan, Srinivas; Prabhat; Dubey, Pradeep

    2017-01-01

    This paper presents the first 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation obtains $\sim$2 TFLOP/s on a single Cori Phase-II Xeon-Phi node. We use a hybrid strategy employin...

  9. Unsupervised/supervised learning concept for 24-hour load forecasting

    Energy Technology Data Exchange (ETDEWEB)

    Djukanovic, M. (Electrical Engineering Inst. 'Nikola Tesla', Belgrade (Yugoslavia)); Babic, B. (Electrical Power Industry of Serbia, Belgrade (Yugoslavia)); Sobajic, D.J.; Pao, Y.-H. (Case Western Reserve Univ., Cleveland, OH (United States). Dept. of Electrical Engineering and Computer Science)

    1993-07-01

    An application of artificial neural networks in short-term load forecasting is described. An algorithm using an unsupervised/supervised learning concept and historical relationship between the load and temperature for a given season, day type and hour of the day to forecast hourly electric load with a lead time of 24 hours is proposed. An additional approach using functional link net, temperature variables, average load and last one-hour load of previous day is introduced and compared with the ANN model with one hidden layer load forecast. In spite of limited available weather variables (maximum, minimum and average temperature for the day) quite acceptable results have been achieved. The 24-hour-ahead forecast errors (absolute average) ranged from 2.78% for Saturdays and 3.12% for working days to 3.54% for Sundays. (Author)
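
    The percentage errors quoted above are described only as "absolute average" errors. A minimal sketch of one common reading, the mean absolute percentage error over the forecast hours, is given below; treating the reported figures as MAPE is an assumption, and the function name is hypothetical.

```python
def mean_absolute_percentage_error(actual, forecast):
    """Average of |actual - forecast| / |actual| over all hours, expressed in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)
```

    For example, hourly loads of 100 and 200 MW forecast as 97 and 206 MW are each off by 3%, so the function returns approximately 3.0.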

  10. Multiclass semi-supervised learning for animal behavior recognition from accelerometer data

    NARCIS (Netherlands)

    Tanha, J.; van Someren, M.; de Bakker, M.; Bouten, W.; Shamoun-Baranes, J.; Afsarmanesh, H.

    2012-01-01

    In this paper we present a new Multiclass semi-supervised learning algorithm that uses a base classifier in combination with a similarity function applied to all data to find a classifier that maximizes the margin and consistency over all data. A novel multiclass loss function is presented and used

  11. An efficient flow-based botnet detection using supervised machine learning

    DEFF Research Database (Denmark)

    Stevanovic, Matija; Pedersen, Jens Myrup

    2014-01-01

    Botnet detection represents one of the most crucial prerequisites of successful botnet neutralization. This paper explores how accurate and timely detection can be achieved by using supervised machine learning as the tool of inferring about malicious botnet traffic. In order to do so, the paper introduces a novel flow-based detection system that relies on supervised machine learning for identifying botnet network traffic. For use in the system we consider eight highly regarded machine learning algorithms, indicating the best performing one. Furthermore, the paper evaluates how much traffic needs ... to accurately and timely detect botnet traffic using purely flow-based traffic analysis and supervised machine learning. Additionally, the results show that in order to achieve accurate detection traffic flows need to be monitored for only a limited time period and number of packets per flow. This indicates...

  12. Out-of-Sample Generalizations for Supervised Manifold Learning for Classification

    Science.gov (United States)

    Vural, Elif; Guillemot, Christine

    2016-03-01

    Supervised manifold learning methods for data classification map data samples residing in a high-dimensional ambient space to a lower-dimensional domain in a structure-preserving way, while enhancing the separation between different classes in the learned embedding. Most nonlinear supervised manifold learning methods compute the embedding of the manifolds only at the initially available training points, while the generalization of the embedding to novel points, known as the out-of-sample extension problem in manifold learning, becomes especially important in classification applications. In this work, we propose a semi-supervised method for building an interpolation function that provides an out-of-sample extension for general supervised manifold learning algorithms studied in the context of classification. The proposed algorithm computes a radial basis function (RBF) interpolator that minimizes an objective function consisting of the total embedding error of unlabeled test samples, defined as their distance to the embeddings of the manifolds of their own class, as well as a regularization term that controls the smoothness of the interpolation function in a direction-dependent way. The class labels of test data and the interpolation function parameters are estimated jointly with a progressive procedure. Experimental results on face and object images demonstrate the potential of the proposed out-of-sample extension algorithm for the classification of manifold-modeled data sets.
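
    To make the idea of an RBF out-of-sample extension concrete, the sketch below fits a plain Gaussian RBF interpolator to one embedding coordinate of the training points and evaluates it at a new point. It is a simplified stand-in under stated assumptions: it uses a small ridge term rather than the direction-dependent regularizer of the paper, it does not estimate test labels jointly, and the names `gaussian_kernel`, `solve`, `fit_rbf`, and `extend` are hypothetical.

```python
import math


def gaussian_kernel(a, b, width=1.0):
    """Gaussian RBF between two points given as sequences of floats."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * width ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_rbf(train_x, train_embed, width=1.0, ridge=1e-8):
    """Return RBF weights so the interpolator reproduces one embedding coordinate."""
    n = len(train_x)
    K = [[gaussian_kernel(train_x[i], train_x[j], width) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, train_embed)


def extend(x_new, train_x, weights, width=1.0):
    """Evaluate the interpolator at a point that was not in the training set."""
    return sum(w * gaussian_kernel(x_new, xi, width) for w, xi in zip(weights, train_x))
```

    A multi-dimensional embedding would be handled by fitting one weight vector per embedding coordinate; a classifier in the embedded space could then be applied to the extended points.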

  13. Supervised learning of semantic classes for image annotation and retrieval.

    Science.gov (United States)

    Carneiro, Gustavo; Chan, Antoni B; Moreno, Pedro J; Vasconcelos, Nuno

    2007-03-01

    A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.

  14. Unsupervised learning algorithms

    CERN Document Server

    Aydin, Kemal

    2016-01-01

    This book summarizes the state-of-the-art in unsupervised learning. The contributors discuss how with the proliferation of massive amounts of unlabeled data, unsupervised learning algorithms, which can automatically discover interesting and useful patterns in such data, have gained popularity among researchers and practitioners. The authors outline how these algorithms have found numerous applications including pattern recognition, market basket analysis, web mining, social network analysis, information retrieval, recommender systems, market research, intrusion detection, and fraud detection. They present how the difficulty of developing theoretically sound approaches that are amenable to objective evaluation have resulted in the proposal of numerous unsupervised learning algorithms over the past half-century. The intended audience includes researchers and practitioners who are increasingly using unsupervised learning algorithms to analyze their data. Topics of interest include anomaly detection, clustering,...

  15. A Semi-Supervised WLAN Indoor Localization Method Based on ℓ1-Graph Algorithm

    Institute of Scientific and Technical Information of China (English)

    Liye Zhang; Lin Ma; Yubin Xu

    2015-01-01

    For indoor location estimation based on received signal strength (RSS) in wireless local area networks (WLAN), a large number of RSS samples should be collected in the offline phase in order to reduce the influence of noise on positioning accuracy. Collecting training data with positioning information is therefore time consuming, which becomes the bottleneck of WLAN indoor localization. In this paper, the traditional semi-supervised learning methods based on k-NN and ε-NN graphs for reducing the collection workload of the offline phase are analyzed, and the results show that the k-NN and ε-NN graphs are sensitive to data noise, which limits the performance of semi-supervised learning WLAN indoor localization systems. To address this problem, the paper proposes an ℓ1-graph-algorithm-based semi-supervised learning (LG-SSL) indoor localization method in which the graph is built by an ℓ1-norm algorithm. The system first labels the unlabeled data using LG-SSL and labeled data to build the Radio Map in the offline training phase, and then uses LG-SSL to estimate the user's location in the online phase. Extensive experimental results show that, benefiting from the robustness to noise and the sparsity of the ℓ1-graph, LG-SSL exhibits superior performance by effectively reducing the collection workload in the offline phase and improving localization accuracy in the online phase.

  16. Customers Behavior Modeling by Semi-Supervised Learning in Customer Relationship Management

    CERN Document Server

    Emtiyaz, Siavash; 10.4156/AISS.vol3.issue9.31

    2012-01-01

    Leveraging the power of increasing amounts of data to analyze customer base for attracting and retaining the most valuable customers is a major problem facing companies in this information age. Data mining technologies extract hidden information and knowledge from large data stored in databases or data warehouses, thereby supporting the corporate decision making process. CRM uses data mining (one of the elements of CRM) techniques to interact with customers. This study investigates the use of a technique, semi-supervised learning, for the management and analysis of customer-related data warehouse and information. The idea of semi-supervised learning is to learn not only from the labeled training data, but to exploit also the structural information in additionally available unlabeled data. The proposed semi-supervised method is a model by means of a feed-forward neural network trained by a back propagation algorithm (multi-layer perceptron) in order to predict the category of an unknown customer (potential cus...

  17. Enhancing Adult Learning in Clinical Supervision

    Science.gov (United States)

    Goldman, Stuart

    2011-01-01

    Objective/Background: For decades, across almost every training site, clinical supervision has been considered "central to the development of skills" in psychiatry. The crucial supervisor/supervisee relationship has been described extensively in the literature, most often framed as a clinical apprenticeship of the novice to the master craftsman.…

  18. Semi-supervised Eigenvectors for Locally-biased Learning

    DEFF Research Database (Denmark)

    Hansen, Toke Jansen; Mahoney, Michael W.

    2012-01-01

    In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks "nearby" that pre-specified target region. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned on being well-correlated with an input seed set of nodes...

  19. Action Learning in Undergraduate Engineering Thesis Supervision

    Science.gov (United States)

    Stappenbelt, Brad

    2017-01-01

    In the present action learning implementation, twelve action learning sets were conducted over eight years. The action learning sets consisted of students involved in undergraduate engineering research thesis work. The concurrent study accompanying this initiative investigated the influence of the action learning environment on student approaches…

  20. Multiclass Semi-Supervised Boosting and Similarity Learning

    NARCIS (Netherlands)

    Tanha, J.; Saberian, M.J.; van Someren, M.; Xiong, H.; Karypis, G.; Thuraisingham, B.; Cook, D.; Wu, X.

    2013-01-01

    In this paper, we consider the multiclass semi-supervised classification problem. A boosting algorithm is proposed to solve the multiclass problem directly. The proposed multiclass approach uses a new multiclass loss function, which includes two terms. The first term is the cost of the multiclass ma

  1. Improving Semi-Supervised Learning with Auxiliary Deep Generative Models

    DEFF Research Database (Denmark)

    Maaløe, Lars; Sønderby, Casper Kaae; Sønderby, Søren Kaae

    Deep generative models based upon continuous variational distributions parameterized by deep networks give state-of-the-art performance. In this paper we propose a framework for extending the latent representation with extra auxiliary variables in order to make the variational distribution more expressive for semi-supervised learning. By utilizing the stochasticity of the auxiliary variable we demonstrate how to train discriminative classifiers resulting in state-of-the-art performance within semi-supervised learning exemplified by a 0.96% error on MNIST using 100 labeled data points. Furthermore...

  2. Online Pairwise Learning Algorithms.

    Science.gov (United States)

    Ying, Yiming; Zhou, Ding-Xuan

    2016-04-01

    Pairwise learning usually refers to a learning task that involves a loss function depending on pairs of examples, among which the most notable ones are bipartite ranking, metric learning, and AUC maximization. In this letter we study an online algorithm for pairwise learning with a least-square loss function in an unconstrained setting of a reproducing kernel Hilbert space (RKHS) that we refer to as the Online Pairwise lEaRning Algorithm (OPERA). In contrast to existing works (Kar, Sriperumbudur, Jain, & Karnick, 2013; Wang, Khardon, Pechyony, & Jones, 2012), which require that the iterates are restricted to a bounded domain or the loss function is strongly convex, OPERA is associated with a non-strongly convex objective function and learns the target function in an unconstrained RKHS. Specifically, we establish a general theorem that guarantees the almost sure convergence for the last iterate of OPERA without any assumptions on the underlying distribution. Explicit convergence rates are derived under the condition of polynomially decaying step sizes. We also establish an interesting property for a family of widely used kernels in the setting of pairwise learning and illustrate the convergence results using such kernels. Our methodology mainly depends on the characterization of RKHSs using their associated integral operators and probability inequalities for random variables with values in a Hilbert space.

  3. Semi-supervised Eigenvectors for Locally-biased Learning

    DEFF Research Database (Denmark)

    Hansen, Toke Jansen; Mahoney, Michael W.

    2012-01-01

    Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph...

  4. Supervised learning with decision tree-based methods in computational and systems biology.

    Science.gov (United States)

    Geurts, Pierre; Irrthum, Alexandre; Wehenkel, Louis

    2009-12-01

    At the intersection between artificial intelligence and statistics, supervised learning allows algorithms to automatically build predictive models from just observations of a system. During the last twenty years, supervised learning has been a tool of choice to analyze the always increasing and complexifying data generated in the context of molecular biology, with successful applications in genome annotation, function prediction, or biomarker discovery. Among supervised learning methods, decision tree-based methods stand out as non parametric methods that have the unique feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent accuracy. The goal of this paper is to provide an accessible and comprehensive introduction to this class of methods. The first part of the review is devoted to an intuitive but complete description of decision tree-based methods and a discussion of their strengths and limitations with respect to other supervised learning methods. The second part of the review provides a survey of their applications in the context of computational and systems biology.

  5. SPATIALLY ADAPTIVE SEMI-SUPERVISED LEARNING WITH GAUSSIAN PROCESSES FOR HYPERSPECTRAL DATA ANALYSIS

    Data.gov (United States)

    National Aeronautics and Space Administration — SPATIALLY ADAPTIVE SEMI-SUPERVISED LEARNING WITH GAUSSIAN PROCESSES FOR HYPERSPECTRAL DATA ANALYSIS GOO JUN * AND JOYDEEP GHOSH* Abstract. A semi-supervised learning...

  6. Transfer learning improves supervised image segmentation across imaging protocols

    DEFF Research Database (Denmark)

    van Opbroek, Annegreet; Ikram, M. Arfan; Vernooij, Meike W.;

    2015-01-01

    The variation between images obtained with different scanners or different imaging protocols presents a major challenge in automatic segmentation of biomedical images. This variation especially hampers the application of otherwise successful supervised-learning techniques which, in order to perform well, often require a large amount of labeled training data that is exactly representative of the target data. We therefore propose to use transfer learning for image segmentation. Transfer-learning techniques can cope with differences in distributions between training and target data, and therefore may improve performance over supervised learning for segmentation across scanners and scan protocols. We present four transfer classifiers that can train a classification scheme with only a small amount of representative training data, in addition to a larger amount of other training data...

  7. Combining Unsupervised and Supervised Learning for Discovering Disease Subclasses

    OpenAIRE

    Tucker, A; Bosoni, P; Bellazzi, R.; Nihtyanova, S; Denton, C.

    2016-01-01

    Diseases are often umbrella terms for many subcategories of disease. The identification of these subcategories is vital if we are to develop personalised treatments that are better focussed on individual patients. In this short paper, we explore the use of a combination of unsupervised learning to identify potential subclasses, and supervised learning to build models for better predicting a number of different health outcomes for patients that suffer from systemic sclerosis, a rare chronic co...

  8. Effects of coaching supervision, mentoring supervision and abusive supervision on talent development among trainee doctors in public hospitals: moderating role of clinical learning environment.

    Science.gov (United States)

    Subramaniam, Anusuiya; Silong, Abu Daud; Uli, Jegak; Ismail, Ismi Arif

    2015-08-13

    Effective talent development requires robust supervision. However, the effects of supervisory styles (coaching, mentoring and abusive supervision) on talent development and the moderating effects of clinical learning environment in the relationship between supervisory styles and talent development among public hospital trainee doctors have not been thoroughly researched. In this study, we aim to achieve the following: (1) identify the extent to which supervisory styles (coaching, mentoring and abusive supervision) can facilitate talent development among trainee doctors in public hospitals and (2) examine whether coaching, mentoring and abusive supervision are moderated by clinical learning environment in predicting talent development among trainee doctors in public hospitals. A questionnaire-based critical survey was conducted among trainee doctors undergoing housemanship at six public hospitals in the Klang Valley, Malaysia. Prior permission was obtained from the Ministry of Health Malaysia to conduct the research in the identified public hospitals. The survey yielded 355 responses. The results were analysed using SPSS 20.0 and SEM with AMOS 20.0. The findings of this research indicate that coaching and mentoring supervision are positively associated with talent development, and that there is no significant relationship between abusive supervision and talent development. The findings also support the moderating role of clinical learning environment on the relationships between coaching supervision-talent development, mentoring supervision-talent development and abusive supervision-talent development among public hospital trainee doctors. Overall, the proposed model indicates a 26% variance in talent development. This study provides an improved understanding of the role of the supervisory styles (coaching and mentoring supervision) in facilitating talent development among public hospital trainee doctors. Furthermore, this study extends the literature to better

  9. Enhancing fieldwork learning using blended learning, GIS and remote supervision

    Science.gov (United States)

    Marra, Wouter A.; Alberti, Koko; Karssenberg, Derek

    2015-04-01

    Fieldwork is an important part of education in geosciences and essential to put theoretical knowledge into an authentic context. Fieldwork as teaching tool can take place in various forms, such as field-tutorial, excursion, or supervised research. Current challenges with fieldwork in education are to incorporate state-of-the art methods for digital data collection, on-site GIS-analysis and providing high-quality feedback to large groups of students in the field. We present a case on first-year earth-sciences fieldwork with approximately 80 students in the French Alps focused on geological and geomorphological mapping. Here, students work in couples and each couple maps their own fieldwork area to reconstruct the formative history. We present several major improvements for this fieldwork using a blended-learning approach, relying on open source software only. An important enhancement to the French Alps fieldwork is improving students' preparation. In a GIS environment, students explore their fieldwork areas using existing remote sensing data, a digital elevation model and derivatives to formulate testable hypotheses before the actual fieldwork. The advantage of this is that the students already know their area when arriving in the field, have started to apply the empirical cycle prior to their field visit, and are therefore eager to investigate their own research questions. During the fieldwork, students store and analyze their field observations in the same GIS environment. This enables them to get a better overview of their own collected data, and to integrate existing data sources also used in the preparation phase. This results in a quicker and enhanced understanding by the students. To enable remote access to observational data collected by students, the students synchronize their data daily with a webserver running a web map application. Supervisors can review students' progress remotely, examine and evaluate their observations in a GIS, and provide

  10. Facial nerve image enhancement from CBCT using supervised learning technique.

    Science.gov (United States)

    Ping Lu; Barazzetti, Livia; Chandran, Vimal; Gavaghan, Kate; Weber, Stefan; Gerber, Nicolas; Reyes, Mauricio

    2015-08-01

    Facial nerve segmentation plays an important role in surgical planning of cochlear implantation. Clinically available CBCT images are used for surgical planning. However, its relatively low resolution renders the identification of the facial nerve difficult. In this work, we present a supervised learning approach to enhance facial nerve image information from CBCT. A supervised learning approach based on multi-output random forest was employed to learn the mapping between CBCT and micro-CT images. Evaluation was performed qualitatively and quantitatively by using the predicted image as input for a previously published dedicated facial nerve segmentation, and cochlear implantation surgical planning software, OtoPlan. Results show the potential of the proposed approach to improve facial nerve image quality as imaged by CBCT and to leverage its segmentation using OtoPlan.

  11. Modeling Time Series Data for Supervised Learning

    Science.gov (United States)

    Baydogan, Mustafa Gokce

    2012-01-01

    Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provides a high-dimensional data vector that challenges the learning…

  12. Non-Supervised Learning for Spread Spectrum Signal Pseudo-Noise Sequence Acquisition

    Institute of Scientific and Technical Information of China (English)

    Hao Cheng; Na Yu; Tai-Jun Wang

    2015-01-01

    Abstract: An idea for estimating the pseudo-noise (PN) sequence of a direct sequence spread spectrum (DSSS) signal is presented. Without a priori knowledge about the DSSS signal in the non-cooperative condition, we propose a self-organizing feature map (SOFM) neural network algorithm to detect and identify the PN sequence. A non-supervised learning algorithm is proposed according to the Kohonen rule in SOFM. The blind algorithm can also estimate the PN sequence at a low signal-to-noise ratio (SNR), and computer simulation demonstrates that the algorithm is effective. Compared with the traditional correlation algorithm based on slip-correlation, the proposed algorithm's bit error rate (BER) and complexity are lower.
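
    The Kohonen rule mentioned in this abstract moves the best-matching unit of the map, and its neighbors, toward each input vector. A minimal sketch of one such update on a 1-D map follows; the function name, the fixed `learning_rate`, and the hard neighborhood `radius` are assumptions for illustration, not the configuration used in the paper.

```python
def kohonen_step(weights, x, learning_rate=0.5, radius=1):
    """Apply one Kohonen update in place on a 1-D map and return the winning unit's index.

    weights: list of weight vectors (lists of floats), one per map unit.
    x: input vector (list of floats) with the same length as each weight vector.
    """
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    # Best-matching unit: the map unit whose weight vector is closest to x.
    bmu = min(range(len(weights)), key=lambda i: sq_dist(weights[i]))

    # Move the winner and its neighbors within `radius` toward the input.
    lo = max(0, bmu - radius)
    hi = min(len(weights), bmu + radius + 1)
    for i in range(lo, hi):
        weights[i] = [wi + learning_rate * (xi - wi) for wi, xi in zip(weights[i], x)]
    return bmu
```

    In practice the learning rate and neighborhood radius are usually decreased over training, which this single-step sketch leaves to the caller.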

  13. Musical Instrument Classification Based on Nonlinear Recurrence Analysis and Supervised Learning

    Directory of Open Access Journals (Sweden)

    R. Rui

    2013-04-01

    Full Text Available In this paper, the phase space reconstruction of time series produced by different instruments is discussed based on nonlinear dynamic theory. The dense ratio, a novel quantitative recurrence parameter, is proposed to describe the difference between wind instruments, stringed instruments and keyboard instruments in the phase space by analyzing the recursive property of every instrument. Furthermore, a novel supervised learning algorithm for automatic classification of individual musical instrument signals is addressed, deriving from the idea of the supervised non-negative matrix factorization (NMF) algorithm. In our approach, the orthogonal basis matrix can be obtained without updating the matrix iteratively, which NMF is unable to do. The experimental results indicate that the accuracy of the proposed method is improved by 3% compared with the conventional features in individual instrument classification.

  14. Hierarchical Wireless Multimedia Sensor Networks for Collaborative Hybrid Semi-Supervised Classifier Learning

    Directory of Open Access Journals (Sweden)

    Liang Ding

    2007-11-01

    Full Text Available Wireless multimedia sensor networks (WMSN) have recently emerged as one of the most important technologies, driven by the powerful multimedia signal acquisition and processing abilities. Target classification is an important research issue addressed in WMSN, which has strict requirements in robustness, quickness and accuracy. This paper proposes a collaborative semi-supervised classifier learning algorithm to achieve durative online learning for support vector machine (SVM) based robust target classification. The proposed algorithm incrementally carries out the semi-supervised classifier learning process in hierarchical WMSN, with the collaboration of multiple sensor nodes in a hybrid computing paradigm. For decreasing the energy consumption and improving the performance, some metrics are introduced to evaluate the effectiveness of the samples in specific sensor nodes, and a sensor node selection strategy is also proposed to reduce the impact of inevitable missing detection and false detection. With the ant optimization routing, the learning process is implemented with the selected sensor nodes, which can decrease the energy consumption. Experimental results demonstrate that the collaborative hybrid semi-supervised classifier learning algorithm can effectively implement target classification in hierarchical WMSN. It has outstanding performance in terms of energy efficiency and time cost, which verifies the effectiveness of the sensor nodes selection and ant optimization routing.

  15. Semi-supervised learning for ordinal Kernel Discriminant Analysis.

    Science.gov (United States)

    Pérez-Ortiz, M; Gutiérrez, P A; Carbonero-Ruz, M; Hervás-Martínez, C

    2016-12-01

    Ordinal classification considers those classification problems where the labels of the variable to predict follow a given order. Naturally, labelled data is scarce or difficult to obtain in this type of problems because, in many cases, ordinal labels are given by a user or expert (e.g. in recommendation systems). Firstly, this paper develops a new strategy for ordinal classification where both labelled and unlabelled data are used in the model construction step (a scheme which is referred to as semi-supervised learning). More specifically, the ordinal version of kernel discriminant learning is extended for this setting considering the neighbourhood information of unlabelled data, which is proposed to be computed in the feature space induced by the kernel function. Secondly, a new method for semi-supervised kernel learning is devised in the context of ordinal classification, which is combined with our developed classification strategy to optimise the kernel parameters. The experiments conducted compare 6 different approaches for semi-supervised learning in the context of ordinal classification in a battery of 30 datasets, showing (1) the good synergy of the ordinal version of discriminant analysis and the use of unlabelled data and (2) the advantage of computing distances in the feature space induced by the kernel function.

  16. Biomedical data analysis by supervised manifold learning.

    Science.gov (United States)

    Alvarez-Meza, A M; Daza-Santacoloma, G; Castellanos-Dominguez, G

    2012-01-01

    Biomedical data analysis is usually carried out by assuming that the information structure embedded in the biomedical recordings is linear, but that assumption does not actually correspond to the real behavior of the extracted features. In order to improve the accuracy of an automatic diagnostic support system, and to reduce the computational complexity of the employed classifiers, we propose a nonlinear dimensionality reduction methodology based on manifold learning with multiple kernel representations, which learns the underlying data structure of biomedical information. Moreover, our approach can be used as a tool that allows the specialist to do a visual analysis and interpretation of the studied variables describing the health condition. Obtained results show how our approach maps the original high dimensional features into an embedding space where simple and straightforward classification strategies achieve a suitable system performance.

  17. Automated training for algorithms that learn from genomic data.

    Science.gov (United States)

    Cilingir, Gokcen; Broschat, Shira L

    2015-01-01

    Supervised machine learning algorithms are used by life scientists for a variety of objectives. Expert-curated public gene and protein databases are major resources for gathering data to train these algorithms. While these data resources are continuously updated, generally, these updates are not incorporated into published machine learning algorithms which thereby can become outdated soon after their introduction. In this paper, we propose a new model of operation for supervised machine learning algorithms that learn from genomic data. By defining these algorithms in a pipeline in which the training data gathering procedure and the learning process are automated, one can create a system that generates a classifier or predictor using information available from public resources. The proposed model is explained using three case studies on SignalP, MemLoci, and ApicoAP in which existing machine learning models are utilized in pipelines. Given that the vast majority of the procedures described for gathering training data can easily be automated, it is possible to transform valuable machine learning algorithms into self-evolving learners that benefit from the ever-changing data available for gene products and to develop new machine learning algorithms that are similarly capable.

  18. Assessing Miniaturized Sensor Performance using Supervised Learning, with Application to Drug and Explosive Detection

    DEFF Research Database (Denmark)

    Alstrøm, Tommy Sonne

    of sensors, as the sensors are designed to provide robust and reliable measurements. That means, the sensors are designed to have repeated measurement clusters. Sensor fusion is presented for the sensor based on chemoselective compounds. An array of color changing compounds are handled and in unity they make......This Ph.D. thesis titled “Assessing Miniaturized Sensor Performance using Supervised Learning, with Application to Drug and Explosive Detection” is a part of the strategic research project “Miniaturized sensors for explosives detection in air” funded by the Danish Agency for Science and Technology...... before the sensor responses can be applied to supervised learning algorithms. The technologies used for sensing consist of Calorimetry, Cantilevers, Chemoselective compounds, Quartz Crystal Microbalance and Surface Enhanced Raman Scattering. Each of the sensors have their own strength and weaknesses...

  19. Very Short Literature Survey From Supervised Learning To Surrogate Modeling

    CERN Document Server

    Brusan, Altay

    2012-01-01

    The past century was the era of linear systems. Either systems (especially industrial ones) were simple and (quasi)linear, or linear approximations were accurate enough. In addition, computing devices became widely available only in the closing decades of the century; before then, a lack of computational resources made it difficult to evaluate the available nonlinear system studies. Both conditions have now changed: systems are highly complex, and abundant computing power is cheap and easy to obtain. In this recent era, a new branch of supervised learning known as surrogate modeling (meta-modeling, surface modeling) has been devised to answer the new needs of the modeling realm. This short literature survey introduces surrogate modeling to readers who are familiar with the concepts of supervised learning. The necessity, challenges and visions of the topic are considered.

  20. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam emails are invading users without their consent and filling their mail boxes. They consume more network capacity as well as time in checking and deleting spam mails. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income to spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Also, when the countermeasures are oversensitive, even legitimate emails will be eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  1. Recent advances on techniques and theories of feedforward networks with supervised learning

    Science.gov (United States)

    Xu, Lei; Klasa, Stan

    1992-07-01

    The rediscovery and popularization of the back propagation training technique for multilayer perceptrons as well as the invention of the Boltzmann Machine learning algorithm have given a new boost to the study of supervised learning networks. In recent years, besides the widely spread applications and the various further improvements of the classical back propagation technique, many new supervised learning models, techniques as well as theories, have also been proposed in a vast number of publications. This paper tries to give a rather systematical review on the recent advances on supervised learning techniques and theories for static feedforward networks. We summarize a great number of developments into four aspects: (1) Various improvements and variants made on the classical back propagation techniques for multilayer (static) perceptron nets, for speeding up training, avoiding local minima, increasing the generalization ability, as well as for many other interesting purposes. (2) A number of other learning methods for training multilayer (static) perceptron, such as derivative estimation by perturbation, direct weight update by perturbation, genetic algorithms, recursive least square estimate and extended Kalman filter, linear programming, the policy of fixing one layer while updating another, constructing networks by converting decision tree classifiers, and others. (3) Various other feedforward models which are also able to implement function approximation, probability density estimation and classification, including various models of basis function expansion (e.g., radial basis functions, restricted coulomb energy, multivariate adaptive regression splines, trigonometric and polynomial bases, projection pursuit, basis function tree, and many others), and several other supervised learning models. (4) Models with complex structures, e.g., modular architecture, hierarchy architecture, and others. (5) A number of theoretical issues involving the universal

  2. Pulsar Search Using Supervised Machine Learning

    Science.gov (United States)

    Ford, John M.

    2017-05-01

    Pulsars are rapidly rotating neutron stars which emit a strong beam of energy through mechanisms that are not entirely clear to physicists. These very dense stars are used by astrophysicists to study many basic physical phenomena, such as the behavior of plasmas in extremely dense environments, behavior of pulsar-black hole pairs, and tests of general relativity. Many of these tasks require a large ensemble of pulsars to provide enough statistical information to answer the scientific questions posed by physicists. In order to provide more pulsars to study, there are several large-scale pulsar surveys underway, which are generating a huge backlog of unprocessed data. Searching for pulsars is a very labor-intensive process, currently requiring skilled people to examine and interpret plots of data output by analysis programs. An automated system for screening the plots will speed up the search for pulsars by a very large factor. Research to date on using machine learning and pattern recognition has not yielded a completely satisfactory system, as systems with the desired near 100% recall have false positive rates that are higher than desired, causing more manual labor in the classification of pulsars. This work proposed to research, identify, propose and develop methods to overcome the barriers to building an improved classification system with a false positive rate of less than 1% and a recall of near 100% that will be useful for the current and next generation of large pulsar surveys. The results show that it is possible to generate classifiers that perform as needed from the available training data. While a false positive rate of 1% was not reached, recall of over 99% was achieved with a false positive rate of less than 2%. Methods of mitigating the imbalanced training and test data were explored and found to be highly effective in enhancing classification accuracy.
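
    The two figures quoted in the abstract above, recall and false positive rate, can be computed from confusion-matrix counts. The sketch below is only an illustration: the function name and the candidate counts are hypothetical and are not taken from the thesis.

```python
def recall_and_fpr(tp, fn, fp, tn):
    """Return (recall, false positive rate) from confusion-matrix counts."""
    recall = tp / (tp + fn)  # fraction of true pulsars that were flagged
    fpr = fp / (fp + tn)     # fraction of non-pulsars wrongly flagged
    return recall, fpr


# Hypothetical counts for an imbalanced test set: 100 pulsars, 10000 non-pulsars.
recall, fpr = recall_and_fpr(tp=99, fn=1, fp=150, tn=9850)
print(f"recall={recall:.3f}, fpr={fpr:.3f}")
```

    With a heavily imbalanced candidate set, even a small false positive rate can leave more false alarms than true detections (150 against 99 in this made-up case), which is why both numbers are reported together.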

  3. Automated labeling of cancer textures in larynx histopathology slides using quasi-supervised learning.

    Science.gov (United States)

    Onder, Devrim; Sarioglu, Sulen; Karacali, Bilge

    2014-12-01

    To evaluate the performance of a quasi-supervised statistical learning algorithm, operating on datasets having normal and neoplastic tissues, to identify larynx squamous cell carcinomas. Furthermore, cancer texture separability measures against normal tissues are to be developed and compared either for colorectal or larynx tissues. Light microscopic digital images from histopathological sections were obtained from laryngectomy materials including squamous cell carcinoma and nonneoplastic regions. The texture features were calculated by using co-occurrence matrices and local histograms. The texture features were input to the quasi-supervised learning algorithm. Larynx regions containing squamous cell carcinomas were accurately identified, having false and true positive rates up to 21% and 87%, respectively. Larynx squamous cell carcinoma versus normal tissue texture separability measures were higher than colorectal adenocarcinoma versus normal textures for the colorectal database. Furthermore, the resultant labeling performances for all larynx datasets are higher than or equal to that of colorectal datasets. The results in larynx datasets, in comparison with the former colorectal study, suggested that quasi-supervised texture classification is to be a helpful method in histopathological image classification and analysis.

  4. Semi-supervised Learning for Photometric Supernova Classification

    CERN Document Server

    Richards, Joseph W; Freeman, Peter E; Schafer, Chad M; Poznanski, Dovi

    2011-01-01

    We present a semi-supervised method for photometric supernova typing. Our approach is to first use the nonlinear dimension reduction technique diffusion map to detect structure in a database of supernova light curves and subsequently employ random forest classification on a spectroscopically confirmed training set to learn a model that can predict the type of each newly observed supernova. We demonstrate that this is an effective method for supernova typing. As supernova numbers increase, our semi-supervised method efficiently utilizes this information to improve classification, a property not enjoyed by template based methods. Applied to supernova data simulated by Kessler et al. (2010b) to mimic those of the Dark Energy Survey, our methods achieve (cross-validated) 96% Type Ia purity and 86% Type Ia efficiency on the spectroscopic sample, but only 56% Type Ia purity and 48% efficiency on the photometric sample due to their spectroscopic followup strategy. To improve the performance on the photometric sample...

  5. Baccalaureate nursing students' perceptions of learning and supervision in the clinical environment.

    Science.gov (United States)

    Dimitriadou, Maria; Papastavrou, Evridiki; Efstathiou, Georgios; Theodorou, Mamas

    2015-06-01

    This study is an exploration of nursing students' experiences within the clinical learning environment (CLE) and supervision provided in hospital settings. A total of 357 second-year nurse students from all universities in Cyprus participated in the study. Data were collected using the Clinical Learning Environment, Supervision and Nurse Teacher instrument. The dimension "supervisory relationship (mentor)", as well as the frequency of individualized supervision meetings, were found to be important variables in the students' clinical learning. However, no statistically-significant connection was established between successful mentor relationship and team supervision. The majority of students valued their mentor's supervision more highly than a nurse teacher's supervision toward the fulfillment of learning outcomes. The dimensions "premises of nursing care" and "premises of learning" were highly correlated, indicating that a key component of a quality clinical learning environment is the quality of care delivered. The results suggest the need to modify educational strategies that foster desirable learning for students in response to workplace demands.

  6. Polyceptron: A Polyhedral Learning Algorithm

    CERN Document Server

    Manwani, Naresh

    2011-01-01

    In this paper we propose a new algorithm for learning polyhedral classifiers, which we call Polyceptron. It is a Perceptron-like algorithm which updates the parameters only when the current classifier misclassifies a training example. We give both batch and online versions of the Polyceptron algorithm. Finally, we give experimental results to show the effectiveness of our approach.
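
    The mistake-driven update that the abstract refers to can be illustrated with the classical single-hyperplane perceptron. This is a minimal sketch of the standard perceptron only, not of Polyceptron itself, whose polyhedral update rule is not given in the abstract; the function names and toy data are assumptions made for illustration.

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Classical online perceptron for labels in {-1, +1}."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # update only on a misclassified example
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: stop early
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy, linearly separable data (hypothetical).
X = [(2.0, 1.0), (3.0, 2.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])
```

    Correctly classified examples leave the parameters untouched, which is the property the abstract highlights.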

  7. Function approximation using combined unsupervised and supervised learning.

    Science.gov (United States)

    Andras, Peter

    2014-03-01

    Function approximation is one of the core tasks that are solved using neural networks in the context of many engineering problems. However, good approximation results need good sampling of the data space, which usually requires exponentially increasing volume of data as the dimensionality of the data increases. At the same time, often the high-dimensional data is arranged around a much lower dimensional manifold. Here we propose the breaking of the function approximation task for high-dimensional data into two steps: (1) the mapping of the high-dimensional data onto a lower dimensional space corresponding to the manifold on which the data resides and (2) the approximation of the function using the mapped lower dimensional data. We use over-complete self-organizing maps (SOMs) for the mapping through unsupervised learning, and single hidden layer neural networks for the function approximation through supervised learning. We also extend the two-step procedure by considering support vector machines and Bayesian SOMs for the determination of the best parameters for the nonlinear neurons in the hidden layer of the neural networks used for the function approximation. We compare the approximation performance of the proposed neural networks using a set of functions and show that indeed the neural networks using combined unsupervised and supervised learning outperform in most cases the neural networks that learn the function approximation using the original high-dimensional data.

  8. Efficient supervised learning in networks with binary synapses

    CERN Document Server

    Baldassi, Carlo; Brunel, Nicolas; Zecchina, Riccardo

    2007-01-01

    Recent experimental studies indicate that synaptic changes induced by neuronal activity are discrete jumps between a small number of stable states. Learning in systems with discrete synapses is known to be a computationally hard problem. Here, we study a neurobiologically plausible on-line learning algorithm that derives from Belief Propagation algorithms. We show that it performs remarkably well in a model neuron with binary synapses, and a finite number of `hidden' states per synapse, that has to learn a random classification task. Such system is able to learn a number of associations close to the theoretical limit, in time which is sublinear in system size. This is to our knowledge the first on-line algorithm that is able to achieve efficiently a finite number of patterns learned per binary synapse. Furthermore, we show that performance is optimal for a finite number of hidden states which becomes very small for sparse coding. The algorithm is similar to the standard `perceptron' learning algorithm, with a...

  9. Generalization of Supervised Learning for Binary Mask Estimation

    DEFF Research Database (Denmark)

    May, Tobias; Gerkmann, Timo

    2014-01-01

    This paper addresses the problem of speech segregation by estimating the ideal binary mask (IBM) from noisy speech. Two methods will be compared, one supervised learning approach that incorporates a priori knowledge about the feature distribution observed during training. The second method...... solely relies on a frame-based speech presence probability (SPP) estimation, and therefore, does not depend on the acoustic condition seen during training. We investigate the influence of mismatches between the acoustic conditions used for training and testing on the IBM estimation performance...
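
    For readers unfamiliar with the target being estimated, the ideal binary mask is commonly defined per time-frequency unit as 1 where the local signal-to-noise ratio exceeds a local criterion and 0 elsewhere. The sketch below assumes that common definition; the function name, the 0 dB criterion and the toy power values are illustrative assumptions, not details from the paper.

```python
import math


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return a 0/1 mask: 1 where the local SNR in dB exceeds lc_db."""
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10(s / n)  # assumes s > 0 and n > 0
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Hypothetical 2x2 grids of speech and noise power (rows: frames, columns: bands).
speech = [[4.0, 0.5], [1.0, 9.0]]
noise = [[1.0, 2.0], [1.0, 3.0]]
mask = ideal_binary_mask(speech, noise)
print(mask)
```

    The mask is "ideal" because it needs the speech and noise powers separately, which are available only when the mixture is constructed; at test time it has to be estimated from the noisy signal.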

  10. Learning outcomes using video in supervision and peer feedback during clinical skills training

    DEFF Research Database (Denmark)

    Lauridsen, Henrik Hein; Toftgård, Rie Castella; Nørgaard, Cita

    supervision of clinical skills (formative assessment). Demonstrations of these principles will be presented as video podcasts during the session. The learning outcomes of video supervision and peer-feedback were assessed in an online questionnaire survey. Results Results of the supervision showed large self...

  11. Supervised neural network modeling: an empirical investigation into learning from imbalanced data with labeling errors.

    Science.gov (United States)

    Khoshgoftaar, Taghi M; Van Hulse, Jason; Napolitano, Amri

    2010-05-01

    Neural network algorithms such as multilayer perceptrons (MLPs) and radial basis function networks (RBFNets) have been used to construct learners which exhibit strong predictive performance. Two data related issues that can have a detrimental impact on supervised learning initiatives are class imbalance and labeling errors (or class noise). Imbalanced data can make it more difficult for the neural network learning algorithms to distinguish between examples of the various classes, and class noise can lead to the formulation of incorrect hypotheses. Both class imbalance and labeling errors are pervasive problems encountered in a wide variety of application domains. Many studies have been performed to investigate these problems in isolation, but few have focused on their combined effects. This study presents a comprehensive empirical investigation using neural network algorithms to learn from imbalanced data with labeling errors. In particular, the first component of our study investigates the impact of class noise and class imbalance on two common neural network learning algorithms, while the second component considers the ability of data sampling (which is commonly used to address the issue of class imbalance) to improve their performances. Our results, for which over two million models were trained and evaluated, show that conclusions drawn using the more commonly studied C4.5 classifier may not apply when using neural networks.
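
    Data sampling, mentioned above as a common remedy for class imbalance, can take several forms. The sketch below shows one of the simplest, random oversampling of the minority class; it is an assumption for illustration and is not necessarily one of the techniques evaluated in the study. The function name and toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(examples, labels, seed=0):
    """Duplicate randomly chosen minority examples until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical imbalanced data: six examples of class 0, two of class 1.
examples = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.5]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
bal_x, bal_y = random_oversample(examples, labels)
counts = Counter(bal_y)
print(counts)
```

    Oversampling duplicates examples together with their labels, so any mislabeled minority example is duplicated too; this is one reason class noise and class imbalance are worth studying together.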

  12. ZeitZeiger: supervised learning for high-dimensional data from an oscillatory system

    National Research Council Canada - National Science Library

    Hughey, Jacob J; Hastie, Trevor; Butte, Atul J

    2016-01-01

    Numerous biological systems oscillate over time or space. Despite these oscillators' importance, data from an oscillatory system is problematic for existing methods of regularized supervised learning...

  13. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitive high performance in the gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of the MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match with the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  14. Benchmarking protein classification algorithms via supervised cross-validation

    NARCIS (Netherlands)

    Kertész-Farkas, A.; Dhir, S.; Sonego, P.; Pacurar, M.; Netoteia, S.; Nijveen, H.; Kuzniar, A.; Leunissen, J.A.M.; Kocsor, A.; Pongor, S.

    2008-01-01

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-o

  15. Algorithm and Implementation of the Blog-Post Supervision Process

    CERN Document Server

    Biswas, Kamanashis; Harun, S A M

    2010-01-01

    A web log, or blog for short, is a trendy way to share personal entries with others through a website. A typical blog may consist of text, images, audio, video, etc. Most blogs work as personal online diaries, while others may focus on a specific interest such as photographs (photoblog), art (artblog), travel (tourblog), IT (techblog), etc. Another type of blogging, called microblogging, which consists of very short posts, is also well known nowadays. As in developed countries, the number of blog users is gradually increasing in developing countries, e.g. Bangladesh. Because blogs are open to all users, some people misuse them to spread fake news to achieve individual or political goals. Some also post vulgar material that creates an embarrassing situation for other bloggers. Sometimes it even damages the reputation of the victim. The only way to overcome this problem is to bring all posts under the supervision of the blog moderator. But this totally contradicts the concept of blogging. In thi...

  16. Path Control Experiment of Mobile Robot Based on Supervised Learning

    Directory of Open Access Journals (Sweden)

    Gao Chi

    2013-07-01

    Full Text Available To address the weak adaptability and low control accuracy of robots operating in complex working conditions, a path control method based on driving experience and supervised learning is proposed. According to the geometric characteristics of sloped roads, a model of the path control method for ramp pavement and a control structure based on monitoring and self-learning are established. Experiments were carried out using the Global Navigation Satellite System. The test data show that when the running speed is not greater than 5 m/s, the transverse vertical deviation of the straight-line trajectory path is within ±20 cm, which demonstrates that the control method is highly feasible.

  17. SUPERVISED LEARNING METHODS FOR BANGLA WEB DOCUMENT CATEGORIZATION

    Directory of Open Access Journals (Sweden)

    Ashis Kumar Mandal

    2014-09-01

    Full Text Available This paper explores the use of machine learning approaches, or more specifically, four supervised learning methods, namely Decision Tree (C4.5), K-Nearest Neighbour (KNN), Naïve Bayes (NB), and Support Vector Machine (SVM), for categorization of Bangla web documents. This is the task of automatically sorting a set of documents into categories from a predefined set. Whereas a wide range of methods have been applied to English text categorization, relatively few studies have been conducted on Bangla language text categorization. Hence, we attempt to analyze the efficiency of those four methods for categorization of Bangla documents. For validation, a Bangla corpus from various websites has been developed and used as examples for the experiment. For Bangla, empirical results show that all four methods produce satisfactory performance, with SVM attaining good results on high-dimensional and relatively noisy document feature vectors.
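
    Of the four methods named above, Naïve Bayes is the simplest to show end to end. The sketch below is a generic multinomial Naïve Bayes with add-one smoothing, not the authors' implementation; the function names are hypothetical, and the English tokens and category labels merely stand in for tokenised Bangla documents.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count documents per class and words per class from tokenised documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify_nb(model, tokens):
    """Return the class with the highest log-posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Placeholder corpus (hypothetical); real input would be tokenised Bangla text.
docs = [
    "match goal team win".split(),
    "team player goal score".split(),
    "election vote party minister".split(),
    "party vote parliament".split(),
]
labels = ["sports", "sports", "politics", "politics"]
model = train_nb(docs, labels)
print(classify_nb(model, "goal team".split()))
print(classify_nb(model, "vote minister".split()))
```

    Working in log space avoids numerical underflow when documents contain many tokens, which matters for the high-dimensional feature vectors the abstract mentions.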

  18. Mining visual collocation patterns via self-supervised subspace learning.

    Science.gov (United States)

    Yuan, Junsong; Wu, Ying

    2012-04-01

    Traditional text data mining techniques are not directly applicable to image data which contain spatial information and are characterized by high-dimensional visual features. It is not a trivial task to discover meaningful visual patterns from images because the content variations and spatial dependence in visual data greatly challenge most existing data mining methods. This paper presents a novel approach to coping with these difficulties for mining visual collocation patterns. Specifically, the novelty of this work lies in the following new contributions: 1) a principled solution to the discovery of visual collocation patterns based on frequent itemset mining and 2) a self-supervised subspace learning method to refine the visual codebook by feeding back discovered patterns via subspace learning. The experimental results show that our method can discover semantically meaningful patterns efficiently and effectively.

  19. Multicultural supervision: lessons learned about an ongoing struggle.

    Science.gov (United States)

    Christiansen, Abigail Tolhurst; Thomas, Volker; Kafescioglu, Nilufer; Karakurt, Gunnur; Lowe, Walter; Smith, William; Wittenborn, Andrea

    2011-01-01

    This article examines the experiences of seven diverse therapists in a supervision course as they wrestled with the real-world application of multicultural supervision. Existing literature on multicultural supervision does not address the difficulties that arise in addressing multicultural issues in the context of the supervision relationship. The experiences of six supervisory candidates and one mentoring supervisor in addressing multicultural issues in supervision are explored. Guidelines for conversations regarding multicultural issues are provided.

  20. Synthesis of supervised classification algorithm using intelligent and statistical tools

    Directory of Open Access Journals (Sweden)

    Ali Douik

    2009-09-01

    Full Text Available A fundamental task in detecting foreground objects in both static and dynamic scenes is to choose the best color system representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football match. Indeed, segmentation by pixel concerns many applications, and the method proved robust in detecting objects, even in the presence of strong shadows and highlights. On the other hand, to refine their playing strategy in sports such as football, handball, volleyball and rugby, coaches need as much technical and tactical information as possible about the progress of the game and the players. We propose in this paper a range of algorithms that resolve many problems appearing in the automated process of team identification, where each player is assigned to his corresponding team based on visual data. The developed system was tested on a match of the Tunisian national competition. This work is relevant to many subsequent computer vision studies, as detailed in this study.

  1. Synthesis of supervised classification algorithm using intelligent and statistical tools

    CERN Document Server

    Douik, Ali

    2009-01-01

    A fundamental task in detecting foreground objects in both static and dynamic scenes is to choose the best color system representation and an efficient technique for background modeling. We propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images from a football match. Pixel-wise segmentation concerns many applications, and the method proved robust in detecting objects, even in the presence of strong shadows and highlights. On the other hand, to refine their playing strategy in sports such as football, handball, volleyball, rugby..., coaches need as much technical and tactical information as possible about the course of the game and the players. We propose in this paper a range of algorithms that resolve many problems arising in the automated process of team identification, where each player is assigned to his corresponding team based on visual data. The developed system was tested on a match of the Tunisian national c...

  2. A supervised machine learning estimator for the non-linear matter power spectrum - SEMPS

    CERN Document Server

    Mohammed, Irshad

    2015-01-01

    In this article, we argue that models based on machine learning (ML) can be very effective in estimating the non-linear matter power spectrum ($P(k)$). We employ the prediction ability of the supervised ML algorithms to build an estimator for the $P(k)$. The estimator is trained on a set of cosmological models, and redshifts for which the $P(k)$ is known, and it learns to predict $P(k)$ for any other set. We review three ML algorithms -- Random Forest, Gradient Boosting Machines, and K-Nearest Neighbours -- and investigate their prime parameters to optimize the prediction accuracy of the estimator. We also compute an optimal size of the training set, which is realistic enough, and still yields high accuracy. We find that, employing the optimal values of the internal parameters, a set of $50-100$ cosmological models is enough to train the estimator that can predict the $P(k)$ for a wide range of cosmological models, and redshifts. Using this configuration, we build a blackbox -- Supervised Estimator for Matter...
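    As a rough illustration of the supervised-estimator idea in this abstract (train on inputs for which $P(k)$ is known, then predict it for unseen inputs), the sketch below implements a very small K-Nearest Neighbours regressor in plain Python. The function name `knn_predict`, the toy inputs, and the toy targets are all hypothetical and chosen only for illustration; this is not the SEMPS code, and real use would involve many more parameters and tuning of `k`.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical toy data: each input stands in for a (cosmological parameter,
# redshift) pair, and each target for a power-spectrum value at one fixed k.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_y = [1.0, 2.0, 2.0, 3.0, 10.0]

# The three nearest neighbours of (0.1, 0.1) are the first three points.
prediction = knn_predict(train_x, train_y, query=(0.1, 0.1), k=3)
```

    The number of neighbours `k` is the kind of internal parameter the abstract refers to when it mentions optimizing prediction accuracy.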

  3. On Training Targets for Supervised Speech Separation

    OpenAIRE

    Wang, Yuxuan; Narayanan, Arun; Wang, DeLiang

    2014-01-01

    Formulation of speech separation as a supervised learning problem has shown considerable promise. In its simplest form, a supervised learning algorithm, typically a deep neural network, is trained to learn a mapping from noisy features to a time-frequency representation of the target of interest. Traditionally, the ideal binary mask (IBM) is used as the target because of its simplicity and large speech intelligibility gains. The supervised learning framework, however, is not restricted to the...
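    The ideal binary mask mentioned above is commonly defined per time-frequency unit: the unit is kept (1) when the local signal-to-noise ratio exceeds a local criterion and discarded (0) otherwise. The sketch below is a minimal version of that definition, assuming pre-computed speech and noise energies and a 0 dB criterion; the function name, the `lc_db` parameter, and the toy values are illustrative assumptions, not taken from the paper.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask: 1 where the local SNR in dB exceeds lc_db.

    Both inputs are 2-D lists (time frames x frequency channels) of
    strictly positive energies; zero energies are not handled here.
    """
    mask = []
    for speech_row, noise_row in zip(speech_energy, noise_energy):
        mask.append([
            1 if 10.0 * math.log10(s / n) > lc_db else 0
            for s, n in zip(speech_row, noise_row)
        ])
    return mask

# Hypothetical 2x2 time-frequency energies for clean speech and noise.
speech = [[4.0, 0.1], [1.0, 9.0]]
noise = [[1.0, 1.0], [2.0, 3.0]]
mask = ideal_binary_mask(speech, noise)
```

    In the supervised setting described in the abstract, a mask like this would serve as the training target that the network learns to predict from noisy features.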

  4. Towards harmonized seismic analysis across Europe using supervised machine learning approaches

    Science.gov (United States)

    Zaccarelli, Riccardo; Bindi, Dino; Cotton, Fabrice; Strollo, Angelo

    2017-04-01

    In the framework of the Thematic Core Services for Seismology of EPOS-IP (European Plate Observing System-Implementation Phase), a service for disseminating a regionalized logic-tree of ground motion models for Europe is under development. For the Mediterranean area, the large availability of strong motion data, qualified and disseminated through the Engineering Strong Motion database (ESM-EPOS), supports the development of both selection criteria and ground motion models; for the low-to-moderate seismic regions of continental Europe, the development of ad-hoc models using weak motion recordings of moderate earthquakes is unavoidable. The aim of this work is to present a platform for creating application-oriented earthquake databases by retrieving information from EIDA (European Integrated Data Archive) and applying supervised learning models for earthquake record selection and processing suitable for any specific application of interest. Supervised learning models, i.e. the task of inferring a function from labelled training data, have been extensively used in several fields such as spam detection, speech and image recognition, and pattern recognition in general. Their suitability for detecting anomalies and performing semi- to fully-automated filtering on large waveform data sets, easing the effort of (or replacing) human expertise, is therefore straightforward. Because supervised learning algorithms can learn from a relatively small training set to predict and categorize unseen data, their advantage when processing large amounts of data is crucial. Moreover, their intrinsic ability to make data-driven predictions makes them suitable (and preferable) in those cases where explicit algorithms for detection might be unfeasible or too heuristic. In this study, we consider relatively simple statistical classifiers (e.g., Naive Bayes, Logistic Regression, Random Forest, SVMs) where labels are assigned to waveform data based on "recognized classes" needed for our use case

  5. Identification of Village Building via Google Earth Images and Supervised Machine Learning Methods

    Directory of Open Access Journals (Sweden)

    Zhiling Guo

    2016-03-01

    Full Text Available In this study, a method based on supervised machine learning is proposed to identify village buildings from open high-resolution remote sensing images. We select Google Earth (GE) RGB images to perform the classification in order to examine its suitability for village mapping, and investigate the feasibility of using machine learning methods to provide automatic classification in such fields. By analyzing the characteristics of GE images, we design different features on the basis of two kinds of supervised machine learning methods for classification: adaptive boosting (AdaBoost) and convolutional neural networks (CNN). To recognize village buildings via their color and texture information, the RGB color features and a large number of Haar-like features in a local window are utilized in the AdaBoost method; with multilayer trained networks based on gradient descent algorithms and back propagation, CNN perform the identification by mining deeper information from buildings and their neighborhood. Experimental results from the testing area at Savannakhet province in Laos show that our proposed AdaBoost method achieves an overall accuracy of 96.22% and the CNN method is also competitive with an overall accuracy of 96.30%.

  6. Phenotype classification of zebrafish embryos by supervised learning.

    Directory of Open Access Journals (Sweden)

    Nathalie Jeanray

    Full Text Available Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3 days old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  7. Detection of money laundering groups using supervised learning in networks

    CERN Document Server

    Savage, David; Chou, Pauline; Zhang, Xiuzhen; Yu, Xinghuo

    2016-01-01

    Money laundering is a major global problem, enabling criminal organisations to hide their ill-gotten gains and to finance further operations. Prevention of money laundering is seen as a high priority by many governments; however, detection of money laundering without prior knowledge of predicate crimes remains a significant challenge. Previous detection systems have tended to focus on individuals, considering transaction histories and applying anomaly detection to identify suspicious behaviour. However, money laundering involves groups of collaborating individuals, and evidence of money laundering may only be apparent when the collective behaviour of these groups is considered. In this paper we describe a detection system that is capable of analysing group behaviour, using a combination of network analysis and supervised learning. This system is designed for real-world application and operates on networks consisting of millions of interacting parties. Evaluation of the system using real-world data indicates th...

  8. Using Supervised Learning to Improve Monte Carlo Integral Estimation

    CERN Document Server

    Tracey, Brendan; Alonso, Juan J

    2011-01-01

    Monte Carlo (MC) techniques are often used to estimate integrals of a multivariate function using randomly generated samples of the function. In light of the increasing interest in uncertainty quantification and robust design applications in aerospace engineering, the calculation of expected values of such functions (e.g. performance measures) becomes important. However, MC techniques often suffer from high variance and slow convergence as the number of samples increases. In this paper we present Stacked Monte Carlo (StackMC), a new method for post-processing an existing set of MC samples to improve the associated integral estimate. StackMC is based on the supervised learning techniques of fitting functions and cross validation. It should reduce the variance of any type of Monte Carlo integral estimate (simple sampling, importance sampling, quasi-Monte Carlo, MCMC, etc.) without adding bias. We report on an extensive set of experiments confirming that the StackMC estimate of an integral is more accurate than ...
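    The idea of combining a fitted function with cross validation to post-process MC samples can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' StackMC implementation: the integrand is one-dimensional on [0, 1], samples are uniform, the fitted function is a straight line, and the names `fit_line` and `stacked_estimate` are hypothetical. For each fold, a line is fitted on the remaining samples, its exact integral is taken, and the mean residual on the held-out samples corrects it; the published method may contain further elements that the abstract does not describe.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def stacked_estimate(xs, ys, n_folds=5):
    """Estimate the integral of f over [0, 1] from uniform samples (xs, ys)."""
    samples = list(zip(xs, ys))
    fold_estimates = []
    for fold in range(n_folds):
        train = [s for i, s in enumerate(samples) if i % n_folds != fold]
        held_out = [s for i, s in enumerate(samples) if i % n_folds == fold]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        integral_of_fit = a + b / 2.0  # exact integral of a + b*x on [0, 1]
        # Correct the fit's integral with the mean residual on held-out data.
        residual = sum(y - (a + b * x) for x, y in held_out) / len(held_out)
        fold_estimates.append(integral_of_fit + residual)
    return sum(fold_estimates) / n_folds

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.random() for _ in range(200)]
ys = [x * x for x in xs]  # toy integrand f(x) = x^2, true integral 1/3

plain_mc = sum(ys) / len(ys)       # ordinary sample-mean estimate
stacked = stacked_estimate(xs, ys)  # fit-plus-correction estimate
```

    Because the residual is evaluated only on samples the fit has not seen, the correction does not bias the estimate; the variance reduction depends on how well the fitted function tracks the integrand.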

  9. Phenotype classification of zebrafish embryos by supervised learning.

    Science.gov (United States)

    Jeanray, Nathalie; Marée, Raphaël; Pruvot, Benoist; Stern, Olivier; Geurts, Pierre; Wehenkel, Louis; Muller, Marc

    2015-01-01

    Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3 days old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  10. Supervised dictionary learning for inferring concurrent brain networks.

    Science.gov (United States)

    Zhao, Shijie; Han, Junwei; Lv, Jinglei; Jiang, Xi; Hu, Xintao; Zhao, Yu; Ge, Bao; Guo, Lei; Liu, Tianming

    2015-10-01

    Task-based fMRI (tfMRI) has been widely used to explore functional brain networks via a predefined stimulus paradigm in the fMRI scan. Traditionally, the general linear model (GLM) has been a dominant approach to detect task-evoked networks. However, GLM focuses on task-evoked or event-evoked brain responses and possibly ignores the intrinsic brain functions. In comparison, dictionary learning and sparse coding methods have attracted much attention recently, and these methods have shown the promise of automatically and systematically decomposing fMRI signals into meaningful task-evoked and intrinsic concurrent networks. Nevertheless, two notable limitations of current data-driven dictionary learning methods are that the prior knowledge of the task paradigm is not sufficiently utilized and that establishing correspondences among dictionary atoms in different brains has been challenging. In this paper, we propose a novel supervised dictionary learning and sparse coding method for inferring functional networks from tfMRI data, which combines the advantages of model-driven and data-driven methods. The basic idea is to fix the task stimulus curves as predefined model-driven dictionary atoms and only optimize the other portion of data-driven dictionary atoms. Application of this novel methodology to the publicly available human connectome project (HCP) tfMRI datasets has achieved promising results.

  11. Dynamical transitions in the evolution of learning algorithms by selection

    CERN Document Server

    Neirotti, Juan Pablo; Caticha, Nestor

    2002-01-01

    We study the evolution of artificial learning systems by means of selection. Genetic programming is used to generate a sequence of populations of algorithms which can be used by neural networks for supervised learning of a rule that generates examples. In opposition to concentrating on final results, which would be the natural aim while designing good learning algorithms, we study the evolution process and pay particular attention to the temporal order of appearance of functional structures responsible for the improvements in the learning process, as measured by the generalization capabilities of the resulting algorithms. The effect of such appearances can be described as dynamical phase transitions. The concepts of phenotypic and genotypic entropies, which serve to describe the distribution of fitness in the population and the distribution of symbols respectively, are used to monitor the dynamics. In different runs the phase transitions might be present or not, with the system finding out good solutions, or ...

  12. Supervised orthogonal discriminant subspace projects learning for face recognition.

    Science.gov (United States)

    Chen, Yu; Xu, Xiao-Hong

    2014-02-01

    In this paper, a new linear dimension reduction method called supervised orthogonal discriminant subspace projection (SODSP) is proposed, which addresses high-dimensionality of data and the small sample size problem. More specifically, given a set of data points in the ambient space, a novel weight matrix that describes the relationship between the data points is first built. And in order to model the manifold structure, the class information is incorporated into the weight matrix. Based on the novel weight matrix, the local scatter matrix as well as non-local scatter matrix is defined such that the neighborhood structure can be preserved. In order to enhance the recognition ability, we impose an orthogonal constraint into a graph-based maximum margin analysis, seeking to find a projection that maximizes the difference, rather than the ratio between the non-local scatter and the local scatter. In this way, SODSP naturally avoids the singularity problem. Further, we develop an efficient and stable algorithm for implementing SODSP, especially, on high-dimensional data set. Moreover, the theoretical analysis shows that LPP is a special instance of SODSP by imposing some constraints. Experiments on the ORL, Yale, Extended Yale face database B and FERET face database are performed to test and evaluate the proposed algorithm. The results demonstrate the effectiveness of SODSP.

  13. Novel Newton's learning algorithm of neural networks

    Institute of Scientific and Technical Information of China (English)

    Long Ning; Zhang Fengli

    2006-01-01

    Newton's learning algorithm for neural networks (NN) is presented and implemented. In theory, the convergence rate of an NN learning algorithm based on Newton's method must be faster than that of BP and other learning algorithms, because the gradient method converges linearly while Newton's method has a second-order convergence rate. A fast algorithm for computing the Hessian matrix of the NN cost function is proposed, and it is the theoretical basis for the improvement of Newton's learning algorithm. Simulation results show that the convergence rate of Newton's learning algorithm is high and clearly faster than that of the traditional BP method, and that Newton's learning algorithm is also more robust than the BP method.
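    The difference between linear and second-order convergence can be seen on a toy problem. The sketch below is not the paper's neural network algorithm or its fast Hessian computation; it assumes a one-dimensional cost f(w) = exp(w) - 2w (minimum at w = ln 2), a gradient-descent learning rate of 0.1, and hypothetical helper names, and simply counts the iterations each update rule needs to reach the same gradient tolerance.

```python
import math

def grad(w):
    """First derivative of the toy cost f(w) = exp(w) - 2*w."""
    return math.exp(w) - 2.0

def hess(w):
    """Second derivative of the same toy cost."""
    return math.exp(w)

def minimize(step, w=0.0, tol=1e-10, max_iter=10_000):
    """Apply w <- w - step(w) until |gradient| < tol; return (w, iterations)."""
    for iteration in range(max_iter):
        if abs(grad(w)) < tol:
            return w, iteration
        w -= step(w)
    return w, max_iter

# Gradient descent: fixed learning rate times the gradient.
w_gd, iters_gd = minimize(lambda w: 0.1 * grad(w))

# Newton's method: gradient scaled by the inverse second derivative.
w_newton, iters_newton = minimize(lambda w: grad(w) / hess(w))
```

    On this toy cost, Newton's update reaches the tolerance in far fewer iterations than gradient descent, at the price of evaluating the second derivative, which is why a fast Hessian computation matters for the neural network case.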

  14. A Semi-supervised Heat Kernel Pagerank MBO Algorithm for Data Classification

    Science.gov (United States)

    2016-07-01

    closed-form expression for the class of each node is derived. Moreover, the authors of [50] describe a semi-supervised method for classifying data using...manifold smoothing and image denoising. In addition to image processing, methods involving spectral graph theory [17,56], based on a graphical setting...pagerank and Section 3 presents a model using heat kernel pagerank directly as a classifier. Section 4 formulates the new algorithm as well as provides

  15. I’m just thinking - How learning opportunities are created in doctoral supervision

    DEFF Research Database (Denmark)

    Kobayashi, Sofie; Berge, Maria; Grout, Brian William Wilson

    With this paper we aim to contribute towards an understanding of learning dynamics in doctoral supervision by analysing how learning opportunities are created in the interaction. We analyse interaction between supervisors and doctoral students using the notion of experiencing variation as a key...... for learning. Earlier research into doctoral supervision has been rather vague on how doctoral students learn to carry out research. Empirically, we have based the study on four cases each with one doctoral student and their supervisors. The supervision sessions were captured on video and audio to provide...

  16. How Supervisor Experience Influences Trust, Supervision, and Trainee Learning: A Qualitative Study.

    Science.gov (United States)

    Sheu, Leslie; Kogan, Jennifer R; Hauer, Karen E

    2017-09-01

    Appropriate trust and supervision facilitate trainees' growth toward unsupervised practice. The authors investigated how supervisor experience influences trust, supervision, and subsequently trainee learning. In a two-phase qualitative inductive content analysis, phase one entailed reviewing 44 internal medicine resident and attending supervisor interviews from two institutions (July 2013 to September 2014) for themes on how supervisor experience influences trust and supervision. Three supervisor exemplars (early, developing, experienced) were developed and shared in phase two focus groups at a single institution, wherein 23 trainees validated the exemplars and discussed how each impacted learning (November 2015). Phase one: Four domains of trust and supervision varying with experience emerged: data, approach, perspective, clinical. Early supervisors were detail oriented and determined trust depending on task completion (data), were rule based (approach), drew on their experiences as trainees to guide supervision (perspective), and felt less confident clinically compared with more experienced supervisors (clinical). Experienced supervisors determined trust holistically (data), checked key aspects of patient care selectively and covertly (approach), reflected on individual experiences supervising (perspective), and felt comfortable managing clinical problems and gauging trainee abilities (clinical). Phase two: Trainees felt the exemplars reflected their experiences, described their preferences and learning needs shifting over time, and emphasized the importance of supervisor flexibility to match their learning needs. With experience, supervisors differ in their approach to trust and supervision. Supervisors need to trust themselves before being able to trust others. Trainees perceive these differences and seek supervision approaches that align with their learning needs.

  17. Supervised learning classification models for prediction of plant virus encoded RNA silencing suppressors.

    Directory of Open Access Journals (Sweden)

    Zeenia Jagga

    Full Text Available Viral encoded RNA silencing suppressor proteins interfere with the host RNA silencing machinery, facilitating viral infection by evading host immunity. In plant hosts, the viral proteins have several basic science implications and biotechnology applications. However, in silico identification of these proteins is limited by their high sequence diversity. In this study we developed supervised learning based classification models for plant viral RNA silencing suppressor proteins in plant viruses. We developed four classifiers based on supervised learning algorithms: J48, Random Forest, LibSVM and Naïve Bayes algorithms, with enriched model learning by correlation based feature selection. Structural and physicochemical features calculated for experimentally verified primary protein sequences were used to train the classifiers. The training features include amino acid composition; auto correlation coefficients; composition, transition, and distribution of various physicochemical properties; and pseudo amino acid composition. Performance analysis of predictive models based on 10-fold cross-validation and independent data testing revealed that the Random Forest based model was the best and achieved 86.11% overall accuracy and 86.22% balanced accuracy with a remarkably high area under the Receiver Operating Characteristic curve of 0.95 to predict viral RNA silencing suppressor proteins. The prediction models for plant viral RNA silencing suppressors can potentially aid identification of novel viral RNA silencing suppressors, which will provide valuable insights into the mechanism of RNA silencing and could be further explored as potential targets for designing novel antiviral therapeutics. Also, the key subset of identified optimal features may help in determining compositional patterns in the viral proteins which are important determinants for RNA silencing suppressor activities. The best prediction model developed in the study is available as a

  18. The Practice of Supervision for Professional Learning: The Example of Future Forensic Specialists

    Science.gov (United States)

    Köpsén, Susanne; Nyström, Sofia

    2015-01-01

    Supervision intended to support learning is of great interest in professional knowledge development. No single definition governs the implementation and enactment of supervision because of different conditions, intentions, and pedagogical approaches. Uncertainty exists at a time when knowledge and methods are undergoing constant development. This…

  20. Semi-supervised learning of machine learning algorithms (Částečně řízené učení algoritmů strojového učení)

    OpenAIRE

    Burda, Karel

    2014-01-01

    The final thesis summarizes in its theoretical part basic knowledge of machine learning algorithms that involves supervised, semi-supervised, and unsupervised learning. Experiments with textual data in natural spoken language involving different machine learning methods and parameterization are carried out in its practical part. Conclusions made in the thesis may be of use to individuals that are at least slightly interested in this domain.

  1. Semi-Supervised Learning for Classification of Protein Sequence Data

    Directory of Open Access Journals (Sweden)

    Brian R. King

    2008-01-01

    Full Text Available Protein sequence data continue to become available at an exponential rate. Annotation of functional and structural attributes of these data lags far behind, with only a small fraction of the data understood and labeled by experimental methods. Classification methods that are based on semi-supervised learning can increase the overall accuracy of classifying partly labeled data in many domains, but very few methods exist that have shown their effect on protein sequence classification. We show how proven methods from text classification can be applied to protein sequence data, as we consider both existing and novel extensions to the basic methods, and demonstrate restrictions and differences that must be considered. We demonstrate comparative results against the transductive support vector machine, and show superior results on the most difficult classification problems. Our results show that large repositories of unlabeled protein sequence data can indeed be used to improve predictive performance, particularly in situations where there are fewer labeled protein sequences available, and/or the data are highly unbalanced in nature.

  2. A novel supervised trajectory segmentation algorithm identifies distinct types of human adenovirus motion in host cells.

    Science.gov (United States)

    Helmuth, Jo A; Burckhardt, Christoph J; Koumoutsakos, Petros; Greber, Urs F; Sbalzarini, Ivo F

    2007-09-01

    Biological trajectories can be characterized by transient patterns that may provide insight into the interactions of the moving object with its immediate environment. The accurate and automated identification of trajectory motifs is important for the understanding of the underlying mechanisms. In this work, we develop a novel trajectory segmentation algorithm based on supervised support vector classification. The algorithm is validated on synthetic data and applied to the identification of trajectory fingerprints of fluorescently tagged human adenovirus particles in live cells. In virus trajectories on the cell surface, periods of confined motion, slow drift, and fast drift are efficiently detected. Additionally, directed motion is found for viruses in the cytoplasm. The algorithm enables the linking of microscopic observations to molecular phenomena that are critical in many biological processes, including infectious pathogen entry and signal transduction.

  3. Combining theories to reach multi-faceted insights into learning opportunities in doctoral supervision

    DEFF Research Database (Denmark)

    Kobayashi, Sofie; Rump, Camilla Østerberg

    The aim of this paper is to illustrate how theories can be combined to explore opportunities for learning in doctoral supervision. While our earlier research into learning dynamics in doctoral supervision in life science research (Kobayashi, 2014) has focused on illustrating learning opportunities...... this paper focuses on the methodological advantages and potential criticism of combining theories. Learning in doctoral education, as in classroom learning, can be analysed from different perspectives. Zembylas (2005) suggests three perspectives with the aim of linking the cognitive and the emotional...

  4. SimNest: Social Media Nested Epidemic Simulation via Online Semi-supervised Deep Learning.

    Science.gov (United States)

    Zhao, Liang; Chen, Jiangzhuo; Chen, Feng; Wang, Wei; Lu, Chang-Tien; Ramakrishnan, Naren

    2015-11-01

    Infectious disease epidemics such as influenza and Ebola pose a serious threat to global public health. It is crucial to characterize the disease and the evolution of the ongoing epidemic efficiently and accurately. Computational epidemiology can model the disease progress and underlying contact network, but suffers from the lack of real-time and fine-grained surveillance data. Social media, on the other hand, provides timely and detailed disease surveillance, but is insensible to the underlying contact network and disease model. This paper proposes a novel semi-supervised deep learning framework that integrates the strengths of computational epidemiology and social media mining techniques. Specifically, this framework learns the social media users' health states and intervention actions in real time, which are regularized by the underlying disease model and contact network. Conversely, the learned knowledge from social media can be fed into computational epidemic model to improve the efficiency and accuracy of disease diffusion modeling. We propose an online optimization algorithm to substantialize the above interactive learning process iteratively to achieve a consistent stage of the integration. The extensive experimental results demonstrated that our approach can effectively characterize the spatio-temporal disease diffusion, outperforming competing methods by a substantial margin on multiple metrics.

  5. Semi-Supervised Learning Techniques in AO Applications: A Novel Approach To Drift Counteraction

    Science.gov (United States)

    De Vito, S.; Fattoruso, G.; Pardo, M.; Tortorella, F.; Di Francia, G.

    2011-11-01

    In this work we proposed and tested the use of semi-supervised learning (SSL) techniques in the artificial olfaction (AO) domain. The SSL characteristics have been exploited to reduce the need for costly supervised samples and the effects of time-dependent drift on state-of-the-art statistical learning approaches. For this purpose, a one-year-long atmospheric pollution dataset recorded in the field has been used. The semi-supervised approach benefitted from the use of updated unlabeled samples, adapting its knowledge to the slowly changing drift effects. We expect that semi-supervised learning can provide significant advantages to the performance of sensor fusion subsystems in artificial olfaction, exhibiting an interesting drift counteraction effect.

  6. Using Optimal Ratio Mask as Training Target for Supervised Speech Separation

    OpenAIRE

    Xia, Shasha; Li, Hao; Zhang, Xueliang

    2017-01-01

    Supervised speech separation uses supervised learning algorithms to learn a mapping from an input noisy signal to an output target. With the fast development of deep learning, supervised separation has become the most important direction in speech separation area in recent years. For the supervised algorithm, training target has a significant impact on the performance. Ideal ratio mask is a commonly used training target, which can improve the speech intelligibility and quality of the separate...
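    For readers unfamiliar with the ideal ratio mask named above, a commonly used definition assigns each time-frequency unit the ratio of speech energy to total (speech plus noise) energy, raised to an exponent that is often set to 0.5. The sketch below follows that common definition; the function name, the `beta` parameter, and the toy energies are assumptions for illustration, and it does not implement the optimal ratio mask that the paper itself proposes.

```python
def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Return a soft mask in [0, 1] for each time-frequency unit.

    Inputs are 2-D lists (time frames x frequency channels) of
    non-negative energies; units with no energy at all get a mask of 0.
    """
    return [
        [
            (s / (s + n)) ** beta if (s + n) > 0.0 else 0.0
            for s, n in zip(speech_row, noise_row)
        ]
        for speech_row, noise_row in zip(speech_energy, noise_energy)
    ]

# Hypothetical energies for one time frame with three frequency channels.
speech = [[9.0, 1.0, 0.0]]
noise = [[7.0, 3.0, 0.0]]
irm = ideal_ratio_mask(speech, noise)
```

    Unlike a binary mask, the values vary continuously between 0 and 1, so a network trained on this target regresses a soft gain per unit rather than a keep-or-discard decision.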

  7. Generation of a Supervised Classification Algorithm for Time-Series Variable Stars with an Application to the LINEAR Dataset

    CERN Document Server

    Johnston, Kyle B

    2016-01-01

    With the advent of digital astronomy, new benefits and new problems have been presented to the modern-day astronomer. While data can be captured in a more efficient and accurate manner using digital means, the efficiency of data retrieval has led to an overload of scientific data for processing and storage. This paper will focus on the construction and application of a supervised pattern classification algorithm for the identification of variable stars. Given the reduction of a survey of stars into a standard feature space, the problem of using prior patterns to identify new observed patterns can be reduced to time-tested classification methodologies and algorithms. Such supervised methods, so-called because the user trains the algorithms prior to application using patterns with known classes or labels, provide a means to probabilistically determine the estimated class type of new observations. This paper will demonstrate the construction and application of a supervised classification algorithm on variable sta...

  8. Supervised learning of short and high-dimensional temporal sequences for life science measurements

    CERN Document Server

    Schleif, F -M; Hammer, B

    2011-01-01

    Analyses of physiological processes over time are often given by spectrometric or gene expression profiles with only a few time points but a large number of measured variables. The analysis of such temporal sequences is challenging and only a few methods have been proposed. The information can be encoded time-independently, by means of classical expression differences for a single time point, or in expression profiles over time. Available methods are limited to unsupervised and semi-supervised settings. The predictive variables can be identified only by means of wrapper or post-processing techniques. This is complicated by the small number of samples in such studies. Here, we present a supervised learning approach, termed Supervised Topographic Mapping Through Time (SGTM-TT). It learns a supervised mapping of the temporal sequences onto a low-dimensional grid. We utilize a hidden Markov model (HMM) to account for the time domain and relevance learning to identify the relevant feature dimensions mo...

  9. A supervised contextual classifier based on a region-growth algorithm

    DEFF Research Database (Denmark)

    Lira, Jorge; Maletti, Gabriela Mariel

    2002-01-01

    A supervised classification scheme to segment optical multi-spectral images has been developed. In this classifier, an automated region-growth algorithm delineates the training sets. This algorithm handles three parameters: an initial pixel seed, a window size and a threshold for each class. A suitable pixel seed is manually implanted through visual inspection of the image classes. The best value for the window and the threshold are obtained from a spectral distance and heuristic criteria. This distance is calculated from a mathematical model of spectral separability. A pixel is incorporated into a region if a spectral homogeneity criterion is satisfied in the pixel-centered window for a given threshold. The homogeneity criterion is obtained from the model of spectral distance. The set of pixels forming a region represents a statistically valid sample of a defined class signaled by the initial...
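
    The three parameters named in this abstract (seed, window size, threshold) can be illustrated with a small region-growth sketch. The homogeneity test used here, comparing the mean of a pixel-centered window with the mean of the seed window, is only a stand-in assumption: the abstract derives its criterion from a spectral distance model that is not given. The function names and the single-band toy image are likewise hypothetical.

```python
from collections import deque


def window_mean(image, row, col, half):
    """Mean intensity of the window centered on (row, col), clipped at the borders."""
    rows, cols = len(image), len(image[0])
    values = [image[r][c]
              for r in range(max(0, row - half), min(rows, row + half + 1))
              for c in range(max(0, col - half), min(cols, col + half + 1))]
    return sum(values) / len(values)


def grow_region(image, seed, window_size=3, threshold=10.0):
    """Grow a 4-connected region from `seed`.

    A pixel joins the region when the mean of its window differs from the
    mean of the seed window by at most `threshold` (assumed criterion).
    """
    half = window_size // 2
    reference = window_mean(image, seed[0], seed[1], half)
    region = {seed}
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n_row, n_col = row + d_row, col + d_col
            if not (0 <= n_row < len(image) and 0 <= n_col < len(image[0])):
                continue
            if (n_row, n_col) in region:
                continue
            if abs(window_mean(image, n_row, n_col, half) - reference) <= threshold:
                region.add((n_row, n_col))
                queue.append((n_row, n_col))
    return region


# Toy single-band image: a dark class in the upper left, a bright class elsewhere.
image = [[10, 11, 50],
         [12, 10, 52],
         [11, 51, 50]]
# window_size=1 makes the window a single pixel, which keeps the toy case easy to check.
training_set = grow_region(image, seed=(0, 0), window_size=1, threshold=5.0)
```

    The grown set of pixel coordinates would then serve as the training sample for the class marked by the seed.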

  10. A supervised contextual classifier based on a region-growth algorithm

    DEFF Research Database (Denmark)

    Lira, Jorge; Maletti, Gabriela Mariel

    2002-01-01

    A supervised classification scheme to segment optical multi-spectral images has been developed. In this classifier, an automated region-growth algorithm delineates the training sets. This algorithm handles three parameters: an initial pixel seed, a window size and a threshold for each class. A suitable pixel seed is manually implanted through visual inspection of the image classes. The best value for the window and the threshold are obtained from a spectral distance and heuristic criteria. This distance is calculated from a mathematical model of spectral separability. A pixel is incorporated... pixel seed. The grown regions therefore constitute suitable training sets for each class. Comparing the statistical behavior of the pixel population of a sliding window with that of each class performs the classification. For region-growth, a window size is employed for each class. For classification...

  11. Tuning, Diagnostics & Data Preparation for Generalized Linear Models Supervised Algorithm in Data Mining Technologies

    Directory of Open Access Journals (Sweden)

    Sachin Bhaskar

    2015-07-01

    Full Text Available Data mining techniques are the result of a long process of research and product development. Data mining searches large amounts of data to find trends and patterns that go beyond simple analysis. Complex mathematical algorithms are used to segment the data and to evaluate the probability of future events. Each data mining model is produced by a specific algorithm, and some data mining problems are best solved by using more than one algorithm. Data mining technologies can be used through Oracle. The Generalized Linear Models (GLM) algorithm is used in the Regression and Classification functions of Oracle Data Mining. GLM is one of the popular statistical techniques for linear modelling. Oracle Data Mining implements GLM for regression and binary classification. GLM provides row diagnostics as well as model statistics and extensive coefficient statistics, and it also supports confidence bounds. This paper outlines and analyses the GLM algorithm as a guide to understanding the tuning, diagnostics and data preparation process and the importance of the supervised Regression and Classification functions of Oracle Data Mining, which are utilized in marketing, time series prediction, financial forecasting, overall business planning, trend analysis, environmental modelling, biomedical and drug response modelling, etc.

  12. Combining theories to reach multi-faceted insights into learning opportunities in doctoral supervision

    DEFF Research Database (Denmark)

    Kobayashi, Sofie; Rump, Camilla Østerberg

    The aim of this paper is to illustrate how theories can be combined to explore opportunities for learning in doctoral supervision. While our earlier research into learning dynamics in doctoral supervision in life science research (Kobayashi, 2014) has focused on illustrating learning opportunities... in science learning; conceptual change, socio-constructivism and post-structuralism. In the present study we employ variation theory (Marton & Tsui, 2004) to study the individual acquisition perspective, what Zembylas terms conceptual change. As for the post-structural perspective we employ positioning... -another when intertwining the analyses to get a multi-faceted insight into the phenomenon of learning to be a life science researcher. The data was derived from four observations of supervision of doctoral students in life science, each with a doctoral student and two supervisors. The storylines hypothesized...

  13. Response monitoring using quantitative ultrasound methods and supervised dictionary learning in locally advanced breast cancer

    Science.gov (United States)

    Gangeh, Mehrdad J.; Fung, Brandon; Tadayyon, Hadi; Tran, William T.; Czarnota, Gregory J.

    2016-03-01

    A non-invasive computer-aided-theragnosis (CAT) system was developed for the early assessment of responses to neoadjuvant chemotherapy in patients with locally advanced breast cancer. The CAT system was based on quantitative ultrasound spectroscopy methods comprising several modules including feature extraction, a metric to measure the dissimilarity between "pre-" and "mid-treatment" scans, and a supervised learning algorithm for the classification of patients to responders/non-responders. One major requirement for the successful design of a high-performance CAT system is to accurately measure the changes in parametric maps before treatment onset and during the course of treatment. To this end, a unified framework based on Hilbert-Schmidt independence criterion (HSIC) was used for the design of feature extraction from parametric maps and the dissimilarity measure between the "pre-" and "mid-treatment" scans. For the feature extraction, HSIC was used to design a supervised dictionary learning (SDL) method by maximizing the dependency between the scans taken from "pre-" and "mid-treatment" with "dummy labels" given to the scans. For the dissimilarity measure, an HSIC-based metric was employed to effectively measure the changes in parametric maps as an indication of treatment effectiveness. The HSIC-based feature extraction and dissimilarity measure used a kernel function to nonlinearly transform input vectors into a higher dimensional feature space and computed the population means in the new space, where enhanced group separability was ideally obtained. The results of the classification using the developed CAT system indicated an improvement of performance compared to a CAT system with basic features using histogram of intensity.
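
    The abstract does not state which HSIC estimator was used. As a minimal sketch of the dependence measure it relies on, the code below computes a widely used biased empirical estimate, tr(KHLH)/(n-1)^2, where K and L are kernel matrices of the two sample sets and H is the centering matrix. The Gaussian kernel, the bandwidth `sigma`, the function names and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_kernel_matrix(points, sigma=1.0):
    """Pairwise Gaussian kernel values for a list of equal-length vectors."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                      / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]


def center(matrix):
    """Double-center a square matrix, i.e. compute H M H with H = I - (1/n) 11^T."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    k_centered = center(gaussian_kernel_matrix(xs, sigma))
    l = gaussian_kernel_matrix(ys, sigma)
    # tr(K H L H) equals tr((H K H) L) by the cyclic property of the trace.
    trace = sum(k_centered[i][j] * l[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy one-dimensional samples (hypothetical values).
xs = [[0.0], [1.0], [2.0], [3.0]]
constant = [[5.0], [5.0], [5.0], [5.0]]
dependent_score = hsic(xs, xs)           # strongly dependent: clearly positive
independent_score = hsic(xs, constant)   # a constant carries no information: about 0
```

    A larger value indicates stronger statistical dependence between the two sets, which is the property exploited when HSIC is maximized or used as a dissimilarity measure.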

  14. Summarizing Relational Data Using Semi-Supervised Genetic Algorithm-Based Clustering Techniques

    Directory of Open Access Journals (Sweden)

    Rayner Alfred

    2010-01-01

    Full Text Available Problem statement: In solving a classification problem in relational data mining, traditional methods, for example, the C4.5 and its variants, usually require data transformations from datasets stored in multiple tables into a single table. Unfortunately, we may lose some information when we join tables with a high degree of one-to-many association. Therefore, data transformation becomes tedious trial-and-error work and the classification result is often not very promising, especially when the number of tables and the degree of one-to-many association are large. Approach: We proposed a genetic semi-supervised clustering technique as a means of aggregating data stored in multiple tables to facilitate the task of solving a classification problem in a relational database. This algorithm is suitable for classification of datasets with a high degree of one-to-many associations. It can be used in two ways. One is user-controlled clustering, where the user may control the result of clustering by varying the compactness of the spherical cluster. The other is automatic clustering, where a non-overlap clustering strategy is applied. In this study, we use the latter method to dynamically cluster multiple instances, as a means of aggregating them, and illustrate the effectiveness of this method using the semi-supervised genetic algorithm-based clustering technique. Results: It was shown in the experimental results that using the reciprocal of the Davies-Bouldin Index for cluster dispersion and the reciprocal of the Gini Index for cluster purity as the fitness function in the Genetic Algorithm (GA) finds solutions with much greater accuracy. The results obtained in this study showed that automatic clustering (seeding), by optimizing the cluster dispersion or cluster purity alone using GA, provides one with good results compared to the traditional k-means clustering. However, the best result can be achieved by optimizing the combination values of both the cluster

  15. Weakly supervised learning of a classifier for unusual event detection.

    Science.gov (United States)

    Jäger, Mark; Knoll, Christian; Hamprecht, Fred A

    2008-09-01

    In this paper, we present an automatic classification framework combining appearance based features and hidden Markov models (HMM) to detect unusual events in image sequences. One characteristic of the classification task is that anomalies are rare. This reflects the situation in the quality control of industrial processes, where error events are scarce by nature. As an additional restriction, class labels are only available for the complete image sequence, since frame-wise manual scanning of the recorded sequences for anomalies is too expensive and should, therefore, be avoided. The proposed framework reduces the feature space dimension of the image sequences by employing subspace methods and encodes characteristic temporal dynamics using continuous hidden Markov models (CHMMs). The applied learning procedure is as follows. 1) A generative model for the regular sequences is trained (one-class learning). 2) The regular sequence model (RSM) is used to locate potentially unusual segments within error sequences by means of a change detection algorithm (outlier detection). 3) Unusual segments are used to expand the RSM to an error sequence model (ESM). The complexity of the ESM is controlled by means of the Bayesian Information Criterion (BIC). The likelihood ratio of the data given the ESM and the RSM is used for the classification decision. This ratio is close to one for sequences without error events and increases for sequences containing error events. Experimental results are presented for image sequences recorded from industrial laser welding processes. We demonstrate that the learning procedure can significantly reduce the user interaction and that sequences with error events can be found with a small false positive rate. It has also been shown that a modeling of the temporal dynamics is necessary to reach these low error rates.

  16. Efficient Learning Algorithms with Limited Information

    Science.gov (United States)

    De, Anindya

    2013-01-01

    The thesis explores efficient learning algorithms in settings which are more restrictive than the PAC model of learning (Valiant) in one of the following two senses: (i) The learning algorithm has a very weak access to the unknown function, as in, it does not get labeled samples for the unknown function (ii) The error guarantee required from the…

  17. Supervised Learning Detection of Sixty Non-transiting Hot Jupiter Candidates

    Science.gov (United States)

    Millholland, Sarah; Laughlin, Gregory

    2017-09-01

    The optical full-phase photometric variations of a short-period planet provide a unique view of the planet’s atmospheric composition and dynamics. The number of planets with optical phase curve detections, however, is currently too small to study them as an aggregate population, motivating an extension of the search to non-transiting planets. Here we present an algorithm for the detection of non-transiting short-period giant planets in the Kepler field. The procedure uses the phase curves themselves as evidence for the planets’ existence. We employ a supervised learning algorithm to recognize the salient time-dependent properties of synthetic phase curves; we then search for detections of signals that match these properties. After demonstrating the algorithm’s capabilities, we classify 142,630 FGK Kepler stars without confirmed planets or Kepler Objects of Interest, and for each one, we assign a probability of a phase curve of a non-transiting planet being present. We identify 60 high-probability non-transiting hot Jupiter candidates. We also derive constraints on the candidates’ albedos and offsets of the phase curve maxima. These targets are strong candidates for follow-up radial velocity confirmation and characterization. Once confirmed, the atmospheric information content in the phase curves may be studied in yet greater detail.

  18. Semi-supervised learning for detecting text-lines in noisy document images

    Science.gov (United States)

    Liu, Zongyi; Zhou, Hanning

    2010-01-01

    Document layout analysis is a key step in document image understanding with wide applications in document digitization and reformatting. Identifying correct layout from noisy scanned images is especially challenging. In this paper, we introduce a semi-supervised learning framework to detect text-lines from noisy document images. Our framework consists of three steps. The first step is the initial segmentation that extracts text-lines and images using simple morphological operations. The second step is a grouping-based layout analysis that identifies text-lines, image zones, column separators and vertical border noise. It is able to efficiently remove the vertical border noise from multi-column pages. The third step is an online classifier that is trained with the high-confidence line detection results from Step Two, and filters out noise from low-confidence lines. The classifier effectively removes speckle noise embedded inside the content zones. We compare the performance of our algorithm to the state-of-the-art work in the field on the UW-III database. We choose the results reported by the Image Understanding Pattern Recognition Research (IUPR) and Scansoft Omnipage SDK 15.5. We evaluate the performance at both the page frame level and the text-line level. The results show that our system has a much lower false-alarm rate while maintaining a similar content detection rate. In addition, we also show that our online training model generalizes better than algorithms depending on offline training.

  19. Learning to Teach: Teaching Internships in Counselor Education and Supervision

    Science.gov (United States)

    Hunt, Brandon; Gilmore, Genevieve Weber

    2011-01-01

    In an effort to ensure the efficacy of preparing emerging counselors in the field, CACREP standards require that by 2013 all core faculty at accredited universities have a doctorate in Counselor Education and Supervision. However, literature suggests that a disparity may exist in the preparation of counselor educators and the actual…

  20. Data integration modeling applied to drill hole planning through semi-supervised learning: A case study from the Dalli Cu-Au porphyry deposit in the central Iran

    Science.gov (United States)

    Fatehi, Moslem; Asadi, Hooshang H.

    2017-04-01

    In this study, the application of a transductive support vector machine (TSVM), an innovative semi-supervised learning algorithm, has been proposed for mapping the potential drill targets at a detailed exploration stage. The semi-supervised learning method is a hybrid of supervised and unsupervised learning approaches that simultaneously uses both training and non-training data to design a classifier. By using the TSVM algorithm, exploration layers at the Dalli porphyry Cu-Au deposit in central Iran were integrated to locate the boundary of the Cu-Au mineralization for further drilling. By applying this algorithm to the non-training (unlabeled) and limited training (labeled) Dalli exploration data, the study area was classified into two domains of Cu-Au ore and waste. Then, the results were validated against the earlier block models created using the available borehole and trench data. In addition to TSVM, the support vector machine (SVM) algorithm was also implemented on the study area for comparison. Thirty percent of the labeled exploration data was used to evaluate the performance of these two algorithms. The results revealed 87 percent correct recognition accuracy for the TSVM algorithm and 82 percent for the SVM algorithm. The deepest inclined borehole, recently drilled in the western part of the Dalli deposit, indicated that the boundary of Cu-Au mineralization, as identified by the TSVM algorithm, was only 15 m off from the actual boundary intersected by this borehole. According to the results of the TSVM algorithm, six new boreholes were suggested for further drilling at the Dalli deposit. This study showed that the TSVM algorithm could be a useful tool for enhancing the mineralization zones and, consequently, ensuring more accurate drill hole planning.

  1. Interval-fitness interactive genetic algorithms with varying population size based on semi-supervised learning

    Institute of Scientific and Technical Information of China (English)

    孙晓燕; 任洁; 巩敦卫

    2011-01-01

    In order to alleviate user fatigue and improve the performance of interactive genetic algorithms (IGAs) in exploration, we present interval-fitness interactive genetic algorithms with varying population size based on co-training semi-supervised learning (CSSL). According to the clustering results of a large population, we develop the strategy for selecting unlabeled samples and labeled samples. Based on the approximation precision of two co-training learners, an efficient strategy for selecting highly reliable unlabeled samples for labeling is given. Then, the CSSL mechanism is employed to train two radial basis function (RBF) neural networks in order to establish a surrogate model with high precision and good generalization ability. In the subsequent evolution, the surrogate model is used to estimate the fitness of an individual; in turn, the surrogate model is updated based on its estimation error. The proposed algorithm is analyzed and applied to a fashion evolutionary design system. The experimental results show its efficacy.

  2. A comparative evaluation of supervised and unsupervised representation learning approaches for anaplastic medulloblastoma differentiation

    Science.gov (United States)

    Cruz-Roa, Angel; Arevalo, John; Basavanhally, Ajay; Madabhushi, Anant; González, Fabio

    2015-01-01

    Learning data representations directly from the data itself is an approach that has shown great success in different pattern recognition problems, outperforming state-of-the-art feature extraction schemes for different tasks in computer vision, speech recognition and natural language processing. Representation learning applies unsupervised and supervised machine learning methods to large amounts of data to find building-blocks that better represent the information in it. Digitized histopathology images represent a very good testbed for representation learning since they involve large amounts of highly complex, visual data. This paper presents a comparative evaluation of different supervised and unsupervised representation learning architectures to specifically address open questions on which type of learning architecture (deep or shallow) and which type of learning (unsupervised or supervised) is optimal. In this paper we limit ourselves to addressing these questions in the context of distinguishing between anaplastic and non-anaplastic medulloblastomas from routine haematoxylin and eosin stained images. The unsupervised approaches evaluated were sparse autoencoders and topographic reconstruct independent component analysis, and the supervised approach was convolutional neural networks. Experimental results show that shallow architectures with more neurons are better than deeper architectures without taking into account local space invariances and that topographic constraints provide useful invariant features in scale and rotations for efficient tumor differentiation.

  3. Exploitation of linkage learning in evolutionary algorithms

    CERN Document Server

    Chen, Ying-ping

    2010-01-01

    The exploitation of linkage learning is enhancing the performance of evolutionary algorithms. This monograph examines recent progress in linkage learning, with a series of focused technical chapters that cover developments and trends in the field.

  4. Semi-supervised eigenvectors for large-scale locally-biased learning

    DEFF Research Database (Denmark)

    Hansen, Toke Jansen; Mahoney, Michael W.

    2014-01-01

    In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks nearby that prespecified target region. For example, one might... -based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities, thus limiting the applicability of eigenvector-based methods in situations where one is interested in very local properties of the data. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned...

  5. Generalized SMO algorithm for SVM-based multitask learning.

    Science.gov (United States)

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n^3) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up in comparison with general-purpose optimization routines.

  6. Determining effects of non-synonymous SNPs on protein-protein interactions using supervised and semi-supervised learning.

    Directory of Open Access Journals (Sweden)

    Nan Zhao

    2014-05-01

    Full Text Available Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such an nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained based on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed, resulting in a promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of the SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. The accurate and balanced performance of the SNP-IN tool makes it readily available to study the

  7. An Algorithm for Learning the Essential Graph

    CERN Document Server

    Noble, John M

    2010-01-01

    This article presents an algorithm for learning the essential graph of a Bayesian network. The basis of the algorithm is the Maximum Minimum Parents and Children algorithm developed by previous authors, with three substantial modifications. The MMPC algorithm is the first stage of the Maximum Minimum Hill Climbing algorithm for learning the directed acyclic graph of a Bayesian network, introduced by previous authors. The MMHC algorithm runs in two phases; firstly, the MMPC algorithm to locate the skeleton and secondly an edge orientation phase. The computationally expensive part is the edge orientation phase. The first modification introduced to the MMPC algorithm, which requires little additional computational cost, is to obtain the immoralities and hence the essential graph. This renders the edge orientation phase, the computationally expensive part, unnecessary, since the entire Markov structure that can be derived from data is present in the essential graph. Secondly, the MMPC algorithm can accept indepen...
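
    The role of immoralities can be illustrated on a known graph. An immorality (v-structure) is a pair of non-adjacent parents sharing a child, and two directed acyclic graphs are Markov equivalent exactly when they have the same skeleton and the same immoralities. The sketch below only enumerates immoralities of a DAG that is already given; it does not learn anything from data and is not the modified MMPC algorithm of the article. The function name and the toy graphs are hypothetical.

```python
from itertools import combinations


def immoralities(parents):
    """Return the set of immoralities (a, child, b) in a DAG.

    `parents` maps every node to the set of its parent nodes. A triple is an
    immorality when a -> child <- b and a, b are not joined by an edge.
    """
    def adjacent(a, b):
        return a in parents.get(b, set()) or b in parents.get(a, set())

    found = set()
    for child, node_parents in parents.items():
        for a, b in combinations(sorted(node_parents), 2):
            if not adjacent(a, b):
                found.add((a, child, b))
    return found


# A -> C <- B with A and B non-adjacent, plus C -> D.
dag = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
v_structures = immoralities(dag)

# Adding the edge A -> B "marries" the parents, so the immorality disappears.
moral_dag = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
no_v_structures = immoralities(moral_dag)
```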

  8. No Free Lunch versus Occam's Razor in Supervised Learning

    CERN Document Server

    Lattimore, Tor

    2011-01-01

    The No Free Lunch theorems are often used to argue that domain specific knowledge is required to design successful algorithms. We use algorithmic information theory to argue the case for a universal bias allowing an algorithm to succeed in all interesting problem domains. Additionally, we give a new algorithm for off-line classification, inspired by Solomonoff induction, with good performance on all structured problems under reasonable assumptions. This includes a proof of the efficacy of the well-known heuristic of randomly selecting training data in the hope of reducing misclassification rates.

  9. Learning Intelligent Genetic Algorithms Using Japanese Nonograms

    Science.gov (United States)

    Tsai, Jinn-Tsong; Chou, Ping-Yi; Fang, Jia-Cen

    2012-01-01

    An intelligent genetic algorithm (IGA) is proposed to solve Japanese nonograms and is used as a method in a university course to learn evolutionary algorithms. The IGA combines the global exploration capabilities of a canonical genetic algorithm (CGA) with effective condensed encoding, improved fitness function, and modified crossover and…

  11. Learning from nature: Nature-inspired algorithms

    DEFF Research Database (Denmark)

    Albeanu, Grigore; Madsen, Henrik; Popentiu-Vladicescu, Florin

    2016-01-01

    During the last decade, nature has inspired researchers to develop new algorithms. The largest collection of nature-inspired algorithms is biology-inspired: swarm intelligence (particle swarm optimization, ant colony optimization, cuckoo search, bees' algorithm, bat algorithm, firefly algorithm etc... on collective social behaviour of organisms, researchers have developed optimization strategies taking into account not only the individuals, but also groups and environment. However, learning from nature, new classes of approaches can be identified, tested and compared against already available algorithms. This work reviews the most effective nature-inspired algorithms and describes learning strategies based on nature oriented thinking. Examples and the benefits obtained from applying nature-inspired strategies in test generation, learners group optimization, and artificial immune systems for learning...

  12. Classification of autism spectrum disorder using supervised learning of brain connectivity measures extracted from synchrostates

    Science.gov (United States)

    Jamal, Wasifa; Das, Saptarshi; Oprescu, Ioana-Anastasia; Maharatna, Koushik; Apicella, Fabio; Sicca, Federico

    2014-08-01

    Objective. The paper investigates the presence of autism using the functional brain connectivity measures derived from electro-encephalogram (EEG) of children during face perception tasks. Approach. Phase synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase synchronized states or synchrostates temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) for analyzing their EEG while processing fearful, happy and neutral faces. The minimally and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we here explored discriminant analysis and support vector machines, both with polynomial kernels, for the classification task. Main results. The leave-one-out cross-validation of the classification algorithm gives 94.7% accuracy as the best performance, with corresponding sensitivity and specificity values of 85.7% and 100% respectively. Significance. The proposed method gives high classification accuracies and outperforms other contemporary research results. The effectiveness of the proposed method for classification of autistic and typical children suggests the possibility of using it on a larger population to validate it for clinical practice.
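
    The evaluation protocol named above, leave-one-out cross-validation with accuracy, sensitivity and specificity, can be sketched as follows. A 1-nearest-neighbour rule stands in for the discriminant analysis and support vector machine classifiers of the study, and the one-dimensional toy features are invented for illustration; neither is taken from the paper.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Label of the training point closest to `query` (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]


def leave_one_out(features, labels, positive):
    """Hold out each sample once, train on the rest, and tally the outcomes.

    Assumes both classes are present, so the denominators are non-zero.
    """
    tp = tn = fp = fn = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = nearest_neighbor_predict(train_x, train_y, features[i])
        if labels[i] == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / len(features),
        "sensitivity": tp / (tp + fn),   # fraction of positives found
        "specificity": tn / (tn + fp),   # fraction of negatives found
    }


# Toy connectivity feature per subject (hypothetical values).
features = [[1.0], [1.2], [0.9], [3.6], [3.0], [3.2], [2.9], [3.3]]
labels = ["asd", "asd", "asd", "asd", "typical", "typical", "typical", "typical"]
metrics = leave_one_out(features, labels, positive="asd")
```

    In this toy run one ASD subject lies closer to the typical group and is misclassified, so sensitivity drops while specificity stays at 1.0, mirroring the kind of asymmetry reported in the abstract.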

  13. A Model for Detecting Tor Encrypted Traffic using Supervised Machine Learning

    Directory of Open Access Journals (Sweden)

    Alaeddin Almubayed

    2015-06-01

    Full Text Available Tor is a low-latency anonymity tool and one of the most widely used open source tools for anonymizing TCP traffic on the Internet, used by around 500,000 people every day. Tor protects users' privacy against surveillance and censorship by making it extremely difficult for an observer to correlate visited websites on the Internet with the user's real-world identity. Tor accomplishes this by protecting its traffic against traffic analysis and feature extraction techniques. Further, Tor counters website fingerprinting by implementing defences such as TLS encryption, padding, and packet relaying. In this paper, however, an analysis is performed against Tor from the position of a local observer in order to bypass Tor's protections; the method consists of feature extraction from a local network dataset. The analysis shows that it is still possible for a local observer to fingerprint the top monitored sites on Alexa, and that Tor traffic can be distinguished from other HTTPS traffic in the network despite Tor's protections. In the experiment, several supervised machine-learning algorithms were employed. The attack assumes a local observer on a local network fingerprinting the top 100 sites on Alexa; the results improve on previous work, achieving an accuracy of 99.64% and a false positive rate of 0.01%.

  14. Ensemble Learning for Free with Evolutionary Algorithms?

    CERN Document Server

    Gagné, Christian; Schoenauer, Marc; Tomassini, Marco

    2007-01-01

    Evolutionary Learning proceeds by evolving a population of classifiers, from which it generally returns (with some notable exceptions) the single best-of-run classifier as final result. In the meanwhile, Ensemble Learning, one of the most efficient approaches in supervised Machine Learning for the last decade, proceeds by building a population of diverse classifiers. Ensemble Learning with Evolutionary Computation thus receives increasing attention. The Evolutionary Ensemble Learning (EEL) approach presented in this paper features two contributions. First, a new fitness function, inspired by co-evolution and enforcing the classifier diversity, is presented. Further, a new selection criterion based on the classification margin is proposed. This criterion is used to extract the classifier ensemble from the final population only (Off-line) or incrementally along evolution (On-line). Experiments on a set of benchmark problems show that Off-line outperforms single-hypothesis evolutionary learning and state-of-art ...

  15. Reinforcement Learning Algorithms in Humanoid Robotics

    OpenAIRE

    Katic, Dusko; Vukobratovic, Miomir

    2007-01-01

    This study considers optimal solutions for the application of reinforcement learning in humanoid robotics. Humanoid robotics is a very challenging domain for reinforcement learning. Reinforcement learning control algorithms represent a general framework to take traditional robotics towards true autonomy and versatility. The reinforcement learning paradigm described above has been successfully implemented for some special types of humanoid robots in the last 10 years. Reinforcement learning is well...

  16. GEOLOGICAL MAPPING USING MACHINE LEARNING ALGORITHMS

    Directory of Open Access Journals (Sweden)

    A. S. Harvey

    2016-06-01

    Full Text Available Remotely sensed spectral imagery, geophysical (magnetic and gravity), and geodetic (elevation) data are useful in a variety of Earth science applications such as environmental monitoring and mineral exploration. Using these data with Machine Learning Algorithms (MLA), which are widely used in image analysis and statistical pattern recognition applications, may enhance preliminary geological mapping and interpretation. This approach contributes towards a rapid and objective means of geological mapping in contrast to conventional field expedition techniques. In this study, four supervised MLAs (naïve Bayes, k-nearest neighbour, random forest, and support vector machines) are compared in order to assess their performance for correctly identifying geological rocktypes in an area with complete ground validation information. Geological maps of the Sudbury region are used for calibration and validation. Percent of correct classifications was used as indicators of performance. Results show that random forest is the best approach. As expected, MLA performance improves with more calibration clusters, i.e. a more uniform distribution of calibration data over the study region. Performance is generally low, though geological trends that correspond to a ground validation map are visualized. Low performance may be the result of poor spectral images of bare rock which can be covered by vegetation or water. The distribution of calibration clusters and MLA input parameters affect the performance of the MLAs. Generally, performance improves with more uniform sampling, though this increases required computational effort and time. With the achievable performance levels in this study, the technique is useful in identifying regions of interest and identifying general rocktype trends. In particular, phase I geological site investigations will benefit from this approach and lead to the selection of sites for advanced surveys.

  17. Geological Mapping Using Machine Learning Algorithms

    Science.gov (United States)

    Harvey, A. S.; Fotopoulos, G.

    2016-06-01

    Remotely sensed spectral imagery, geophysical (magnetic and gravity), and geodetic (elevation) data are useful in a variety of Earth science applications such as environmental monitoring and mineral exploration. Using these data with Machine Learning Algorithms (MLA), which are widely used in image analysis and statistical pattern recognition applications, may enhance preliminary geological mapping and interpretation. This approach contributes towards a rapid and objective means of geological mapping in contrast to conventional field expedition techniques. In this study, four supervised MLAs (naïve Bayes, k-nearest neighbour, random forest, and support vector machines) are compared in order to assess their performance for correctly identifying geological rocktypes in an area with complete ground validation information. Geological maps of the Sudbury region are used for calibration and validation. Percent of correct classifications was used as indicators of performance. Results show that random forest is the best approach. As expected, MLA performance improves with more calibration clusters, i.e. a more uniform distribution of calibration data over the study region. Performance is generally low, though geological trends that correspond to a ground validation map are visualized. Low performance may be the result of poor spectral images of bare rock which can be covered by vegetation or water. The distribution of calibration clusters and MLA input parameters affect the performance of the MLAs. Generally, performance improves with more uniform sampling, though this increases required computational effort and time. With the achievable performance levels in this study, the technique is useful in identifying regions of interest and identifying general rocktype trends. In particular, phase I geological site investigations will benefit from this approach and lead to the selection of sites for advanced surveys.
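
    The performance indicator used in this study, percent of correct classifications on validation data, can be illustrated with one of the four compared methods. The sketch below uses k-nearest neighbour only because it is the shortest to write out (the abstract reports random forest as the best performer); the feature vectors, rock-type labels and function names are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k calibration samples nearest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def percent_correct(train, validation, k=3):
    """Percent of validation samples whose predicted label matches the known label."""
    hits = sum(knn_predict(train, feats, k) == label for feats, label in validation)
    return 100.0 * hits / len(validation)


# Hypothetical calibration samples: (feature vector, rock type).
calibration = [
    ((0.10, 0.20), "granite"), ((0.20, 0.10), "granite"), ((0.15, 0.25), "granite"),
    ((0.80, 0.90), "gabbro"), ((0.90, 0.80), "gabbro"), ((0.85, 0.95), "gabbro"),
]
# The third validation sample lies in the "granite" cluster but is labelled
# "gabbro", so it is deliberately misclassified.
validation = [
    ((0.12, 0.18), "granite"),
    ((0.88, 0.86), "gabbro"),
    ((0.20, 0.20), "gabbro"),
]
score = percent_correct(calibration, validation)
```

    In this toy setup two of the three validation samples are classified correctly, so the score is about 66.7 percent.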

  18. Re/Learning Student Teaching Supervision: A Co/Autoethnographic Self-Study

    Science.gov (United States)

    Butler, Brandon M.; Diacopoulos, Mark M.

    2016-01-01

    This article documents the critical friendship of an experienced teacher educator and a doctoral student through our joint exploration of student teaching supervision. By adopting a co/autoethnographic approach, we learned from biographical and contemporaneous critical incidents that informed short- and long-term practices. In particular, we…

  19. Undergraduate Internship Supervision in Psychology Departments: Use of Experiential Learning Best Practices

    Science.gov (United States)

    Bailey, Sarah F.; Barber, Larissa K.; Nelson, Videl L.

    2017-01-01

    This study examined trends in how psychology internships are supervised compared to current experiential learning best practices in the literature. We sent a brief online survey to relevant contact persons for colleges/universities with psychology departments throughout the United States (n = 149 responded). Overall, the majority of institutions…

  20. Social media research: The application of supervised machine learning in organizational communication research

    NARCIS (Netherlands)

    van Zoonen, W.; van der Meer, T.G.L.A.

    2016-01-01

    Despite the online availability of data, analysis of this information in academic research is arduous. This article explores the application of supervised machine learning (SML) to overcome challenges associated with online data analysis. In SML, classifiers are used to categorize and code binary dat

  1. Supervised learning for neural manifold using spatiotemporal brain activity

    Science.gov (United States)

    Kuo, Po-Chih; Chen, Yong-Sheng; Chen, Li-Fen

    2015-12-01

    Objective. Determining the means by which perceived stimuli are compactly represented in the human brain is a difficult task. This study aimed to develop techniques for the construction of the neural manifold as a representation of visual stimuli. Approach. We propose a supervised locally linear embedding method to construct the embedded manifold from brain activity, taking into account similarities between corresponding stimuli. In our experiments, photographic portraits were used as visual stimuli and brain activity was calculated from magnetoencephalographic data using a source localization method. Main results. The results of 10 × 10-fold cross-validation revealed a strong correlation between manifolds of brain activity and the orientation of faces in the presented images, suggesting that high-level information related to image content can be revealed in the brain responses represented in the manifold. Significance. Our experiments demonstrate that the proposed method is applicable to investigation into the inherent patterns of brain activity.

  2. Semi-supervised Machine Learning for Analysis of Hydrogeochemical Data and Models

    Science.gov (United States)

    Vesselinov, Velimir; O'Malley, Daniel; Alexandrov, Boian; Moore, Bryan

    2017-04-01

    Data- and model-based analyses such as uncertainty quantification, sensitivity analysis, and decision support using complex physics models with numerous model parameters typically require a huge number of model evaluations (on the order of 10^6). Furthermore, model simulations of complex physics may require substantial computational time. For example, accounting for simultaneously occurring physical processes such as fluid flow and biogeochemical reactions in a heterogeneous porous medium may require several hours of wall-clock computational time. To address these issues, we have developed a novel methodology for semi-supervised machine learning based on Non-negative Matrix Factorization (NMF) coupled with customized k-means clustering. The algorithm allows for automated, robust Blind Source Separation (BSS) of groundwater types (contamination sources) based on model-free analyses of observed hydrogeochemical data. We have also developed reduced order modeling tools, which couple support vector regression (SVR), genetic algorithms (GA), and artificial and convolutional neural networks (ANN/CNN). SVR is applied to predict the model behavior within prior uncertainty ranges associated with the model parameters. ANN and CNN procedures are applied to upscale heterogeneity of the porous medium. In the upscaling process, fine-scale high-resolution models of heterogeneity are applied to inform coarse-resolution models, which have improved computational efficiency while capturing the impact of fine-scale effects at the coarse scale of interest. These techniques are tested independently on a series of synthetic problems. We also present a decision analysis related to contaminant remediation where the developed reduced order models are applied to reproduce groundwater flow and contaminant transport in a synthetic heterogeneous aquifer. The tools are coded in Julia and are a part of the MADS high-performance computational framework (https://github.com/madsjulia/Mads.jl).

  3. Learning theory of distributed spectral algorithms

    Science.gov (United States)

    Guo, Zheng-Chu; Lin, Shao-Bo; Zhou, Ding-Xuan

    2017-07-01

    Spectral algorithms have been widely used and studied in learning theory and inverse problems. This paper is concerned with distributed spectral algorithms, for handling big data, based on a divide-and-conquer approach. We present a learning theory for these distributed kernel-based learning algorithms in a regression framework including nice error bounds and optimal minimax learning rates achieved by means of a novel integral operator approach and a second order decomposition of inverse operators. Our quantitative estimates are given in terms of regularity of the regression function, effective dimension of the reproducing kernel Hilbert space, and qualification of the filter function of the spectral algorithm. They do not need any eigenfunction or noise conditions and are better than the existing results even for the classical family of spectral algorithms.

  4. Developing a practice of supervision in university as a collective learning process

    DEFF Research Database (Denmark)

    Lund, Birthe; Jensen, Annie Aarup

    2009-01-01

    of the framework surrounding the supervision process, both as regards the students and the teachers; to de-privatize the problems encountered by the individual teacher during the supervision; to ensure that students would be able to graduate within the timeframe of the education (the institutional economic...... of creating a transformation in the sense that it may change from being a top-down project (instigated by the Faculty) and develop into being a bottom-up project. It may hold the potential for developing collective learning processes assuming that good structures and frameworks can be created, as well...

  5. Kernel learning algorithms for face recognition

    CERN Document Server

    Li, Jun-Bao; Pan, Jeng-Shyang

    2013-01-01

    Kernel Learning Algorithms for Face Recognition covers the framework of kernel based face recognition. This book discusses advanced kernel learning algorithms and their application to face recognition. This book also focuses on the theoretical derivation, the system framework and experiments involving kernel based face recognition. Included within are algorithms of kernel based face recognition, and also the feasibility of the kernel based face recognition method. This book provides researchers in pattern recognition and machine learning area with advanced face recognition methods and its new

  6. Novel Approach to Unsupervised Change Detection Based on a Robust Semi-Supervised FCM Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Pan Shao

    2016-03-01

    Full Text Available This study presents a novel approach for unsupervised change detection in multitemporal remotely sensed images. This method addresses the problem of the analysis of the difference image by proposing a novel and robust semi-supervised fuzzy C-means (RSFCM) clustering algorithm. The advantage of the RSFCM over existing change detection methods is that it additionally introduces pseudolabels from the difference image; existing methods mainly use difference intensity levels and spatial context. First, the patterns with a high probability of belonging to the changed or unchanged class are identified by selectively thresholding the difference image histogram. Second, the pseudolabels of these nearly certain pixel-patterns are jointly exploited with the intensity levels and spatial information in the properly defined RSFCM classifier in order to discriminate the changed pixels from the unchanged pixels. Specifically, labeling knowledge is used to guide the RSFCM clustering process to enhance the change information and obtain a more accurate membership; information on spatial context helps to lower the effect of noise and outliers by modifying the membership. RSFCM can detect more changes and provide noise immunity by the synergistic exploitation of pseudolabels and spatial context. The two main contributions of this study are as follows: (1) it proposes the idea of combining the three information types from the difference image, namely, (a) intensity levels, (b) labels, and (c) spatial context; and (2) it develops the novel RSFCM algorithm for image segmentation and forms the proposed change detection framework. The proposed method is effective and efficient for change detection, as confirmed by six experimental results of this study.
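
    As a rough illustration of how pseudolabels can guide fuzzy C-means clustering of a difference image, the sketch below runs standard two-cluster fuzzy C-means on scalar intensities and clamps the memberships of pseudolabelled pixels. It is not the RSFCM algorithm of the paper: the clamping rule is an assumption made here for brevity, the spatial-context term is omitted, and all names and values are hypothetical.

```python
def fuzzy_c_means(values, pseudolabels, m=2.0, iterations=50):
    """Two-cluster fuzzy C-means on scalar difference intensities.

    pseudolabels maps a pixel index to 0 (unchanged) or 1 (changed). Those
    memberships are clamped to the given class, a simplification assumed here.
    Returns (centers, memberships), memberships[k] = [u_unchanged, u_changed].
    """
    centers = [min(values), max(values)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for k, v in enumerate(values):
            u = [0.0, 0.0]
            if k in pseudolabels:
                u[pseudolabels[k]] = 1.0
            else:
                d = [abs(v - c) for c in centers]
                if 0.0 in d:
                    # The pixel coincides with a center: full membership there.
                    u[d.index(0.0)] = 1.0
                else:
                    # Standard FCM membership update.
                    u = [1.0 / sum((d[i] / d[j]) ** exponent for j in range(2))
                         for i in range(2)]
            memberships.append(u)
        # Standard FCM center update, weighted by memberships raised to m.
        centers = [
            sum(u[i] ** m * v for u, v in zip(memberships, values))
            / sum(u[i] ** m for u in memberships)
            for i in range(2)
        ]
    return centers, memberships


# Made-up difference intensities; pixels 2 and 5 carry pseudolabels.
values = [2, 3, 1, 4, 40, 42, 38, 5, 41]
centers, memberships = fuzzy_c_means(values, pseudolabels={2: 0, 5: 1})
labels = [0 if u[0] >= u[1] else 1 for u in memberships]
```

    Each pixel is finally assigned to the class with the larger membership, so low intensities end up as unchanged and high intensities as changed.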

  7. Generation of a supervised classification algorithm for time-series variable stars with an application to the LINEAR dataset

    Science.gov (United States)

    Johnston, K. B.; Oluseyi, H. M.

    2017-04-01

    With the advent of digital astronomy, new benefits and new problems have been presented to the modern day astronomer. While data can be captured in a more efficient and accurate manner using digital means, the efficiency of data retrieval has led to an overload of scientific data for processing and storage. This paper will focus on the construction and application of a supervised pattern classification algorithm for the identification of variable stars. Given the reduction of a survey of stars into a standard feature space, the problem of using prior patterns to identify new observed patterns can be reduced to time-tested classification methodologies and algorithms. Such supervised methods, so called because the user trains the algorithms prior to application using patterns with known classes or labels, provide a means to probabilistically determine the estimated class type of new observations. This paper will demonstrate the construction and application of a supervised classification algorithm on variable star data. The classifier is applied to a set of 192,744 LINEAR data points. Of the original samples, 34,451 unique stars were classified with high confidence (high level of probability of being the true class).

  8. Developing a practice of supervision in university as a collective learning process

    DEFF Research Database (Denmark)

    Lund, Birthe; Jensen, Annie Aarup

    2009-01-01

    of the framework surrounding the supervision process, both as regards the students and the teachers; to de-privatize the problems encountered by the individual teacher during the supervision; to ensure that students would be able to graduate within the timeframe of the education (the institutional economic......The point of departure of the paper is a university pedagogical course established with the purpose of strengthening the university teachers’ competence regarding the supervision of students working on their master’s thesis. The purpose of the course is furthermore to ensure the improvement...... of creating a transformation in the sense that it may change from being a top-down project (instigated by the Faculty) and develop into being a bottom-up project. It may hold the potential for developing collective learning processes assuming that good structures and frameworks can be created, as well...

  9. Gene classification using parameter-free semi-supervised manifold learning.

    Science.gov (United States)

    Huang, Hong; Feng, Hailiang

    2012-01-01

    A new manifold learning method, called parameter-free semi-supervised local Fisher discriminant analysis (pSELF), is proposed to map the gene expression data into a low-dimensional space for tumor classification. Motivated by the fact that semi-supervised and parameter-free are two desirable and promising characteristics for dimension reduction, a new difference-based optimization objective function with unlabeled samples has been designed. The proposed method preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The semi-supervised method has an analytic form of the globally optimal solution, which can be computed efficiently by eigen decomposition. Experimental results on synthetic data and SRBCT, DLBCL, and Brain Tumor gene expression data sets demonstrate the effectiveness of the proposed method.

  10. A new decision tree learning algorithm

    Institute of Scientific and Technical Information of China (English)

    FANG Yong; QI Fei-hu

    2005-01-01

    In order to improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. Based on statistical learning theory the generalization performance of binary decision trees is analyzed, and the assessment rule is proposed. Under the direction of the assessment rule, the MMDT algorithm is implemented. The algorithm maps training examples from an original space to a high dimension feature space, and constructs a decision tree in it. In the feature space, a new decision node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine, to improve the generalization performance. Experimental results show that the new learning algorithm is much superior to others such as C4.5 and OC1.

  11. Lane Detection Based on Machine Learning Algorithm

    National Research Council Canada - National Science Library

    Chao Fan; Jingbo Xu; Shuai Di

    2013-01-01

    In order to improve accuracy and robustness of the lane detection in complex conditions, such as the shadows and illumination changing, a novel detection algorithm was proposed based on machine learning...

  12. A Semi-Supervised Learning Approach to Enhance Health Care Community–Based Question Answering: A Case Study in Alcoholism

    Science.gov (United States)

    Klabjan, Diego; Jonnalagadda, Siddhartha Reddy

    2016-01-01

    Background: Community-based question answering (CQA) sites play an important role in addressing health information needs. However, a significant number of posted questions remain unanswered. Automatically answering the posted questions can provide a useful source of information for Web-based health communities. Objective: In this study, we developed an algorithm to automatically answer health-related questions based on past questions and answers (QA). We also aimed to understand which information embedded within Web-based health content provides good features for identifying valid answers. Methods: Our proposed algorithm uses information retrieval techniques to identify candidate answers from resolved QA. To rank these candidates, we implemented a semi-supervised learning algorithm that extracts the best answer to a question. We assessed this approach on a curated corpus from Yahoo! Answers and compared it against a rule-based string similarity baseline. Results: On our dataset, the semi-supervised learning algorithm has an accuracy of 86.2%. Unified medical language system–based (health related) features used in the model enhance the algorithm’s performance by approximately 8%. A reasonably high rate of accuracy is obtained given that the data are considerably noisy. Important features distinguishing a valid answer from an invalid answer include text length, number of stop words contained in a test question, a distance between the test question and other questions in the corpus, and a number of overlapping health-related terms between questions. Conclusions: Overall, our automated QA system based on historical QA pairs is shown to be effective according to the dataset in this case study. It is developed for general use in the health care domain, which can also be applied to other CQA sites. PMID:27485666

  13. Assessing Electronic Cigarette-Related Tweets for Sentiment and Content Using Supervised Machine Learning.

    Science.gov (United States)

    Cole-Lewis, Heather; Varghese, Arun; Sanders, Amy; Schwarz, Mary; Pugatch, Jillian; Augustson, Erik

    2015-08-25

    Electronic cigarettes (e-cigarettes) continue to be a growing topic among social media users, especially on Twitter. The ability to analyze conversations about e-cigarettes in real-time can provide important insight into trends in the public's knowledge, attitudes, and beliefs surrounding e-cigarettes, and subsequently guide public health interventions. Our aim was to establish a supervised machine learning algorithm to build predictive classification models that assess Twitter data for a range of factors related to e-cigarettes. Manual content analysis was conducted for 17,098 tweets. These tweets were coded for five categories: e-cigarette relevance, sentiment, user description, genre, and theme. Machine learning classification models were then built for each of these five categories, and word groupings (n-grams) were used to define the feature space for each classifier. Predictive performance scores for classification models indicated that the models correctly labeled the tweets with the appropriate variables between 68.40% and 99.34% of the time, and the percentage of maximum possible improvement over a random baseline that was achieved by the classification models ranged from 41.59% to 80.62%. Classifiers with the highest performance scores that also achieved the highest percentage of the maximum possible improvement over a random baseline were Policy/Government (performance: 0.94; % improvement: 80.62%), Relevance (performance: 0.94; % improvement: 75.26%), Ad or Promotion (performance: 0.89; % improvement: 72.69%), and Marketing (performance: 0.91; % improvement: 72.56%). The most appropriate word-grouping unit (n-gram) was 1 for the majority of classifiers. Performance continued to marginally increase with the size of the training dataset of manually annotated data, but eventually leveled off. Even at low dataset sizes of 4000 observations, performance characteristics were fairly sound. Social media outlets like Twitter can uncover real-time snapshots of

  14. Visualizing output for a data learning algorithm

    Science.gov (United States)

    Carson, Daniel; Graham, James; Ternovskiy, Igor

    2016-05-01

    This paper details the process we went through to visualize the output for our data learning algorithm. We have been developing a hierarchical self-structuring learning algorithm based around the general principles of the LaRue model. One example of a proposed application of this algorithm would be traffic analysis, chosen because it is conceptually easy to follow and there is a significant amount of already existing data and related research material with which to work. While we chose the tracking of vehicles for our initial approach, it is by no means the only target of our algorithm. Flexibility is the end goal; however, we still need somewhere to start. To that end, this paper details our creation of the visualization GUI for our algorithm, the features we included and the initial results we obtained from our algorithm running a few of the traffic based scenarios we designed.

  15. Facilitating the Learning Process in Design-Based Learning Practices: An Investigation of Teachers' Actions in Supervising Students

    Science.gov (United States)

    Gómez Puente, S. M.; van Eijck, M.; Jochems, W.

    2013-01-01

    Background: In research on design-based learning (DBL), inadequate attention is paid to the role the teacher plays in supervising students in gathering and applying knowledge to design artifacts, systems, and innovative solutions in higher education. Purpose: In this study, we examine whether teacher actions we previously identified in the DBL…

  16. Emotional Literacy Support Assistants' Views on Supervision Provided by Educational Psychologists: What EPs Can Learn from Group Supervision

    Science.gov (United States)

    Osborne, Cara; Burton, Sheila

    2014-01-01

    The Educational Psychology Service in this study has responsibility for providing group supervision to Emotional Literacy Support Assistants (ELSAs) working in schools. To date, little research has examined this type of inter-professional supervision arrangement. The current study used a questionnaire to examine ELSAs' views on the supervision…

  17. Generating a Spanish Affective Dictionary with Supervised Learning Techniques

    Science.gov (United States)

    Bermudez-Gonzalez, Daniel; Miranda-Jiménez, Sabino; García-Moreno, Raúl-Ulises; Calderón-Nepamuceno, Dora

    2016-01-01

    Nowadays, machine learning techniques are being used in several Natural Language Processing (NLP) tasks such as Opinion Mining (OM). OM is used to analyse and determine the affective orientation of texts. Usually, OM approaches use affective dictionaries in order to conduct sentiment analysis. These lexicons are labeled manually with affective…

  18. Top Tagging by Deep Learning Algorithm

    CERN Document Server

    Akil, Ali

    2015-01-01

    In this report I will show the application of a deep learning algorithm on a Monte Carlo simulation sample to test its performance in tagging hadronic decays of boosted top quarks and compare what we get with the results of the application of some other algorithms.

  19. Parallelization of TMVA Machine Learning Algorithms

    CERN Document Server

    Hajili, Mammad

    2017-01-01

    This report reflects my work on the parallelization of TMVA machine learning algorithms integrated into the ROOT Data Analysis Framework during a summer internship at CERN. The report consists of 4 important parts: the data set used in training and validation, the algorithms to which multiprocessing was applied, the parallelization techniques, and the results on how execution time changes with the number of workers.

  20. LEARNING ALGORITHM OF STAGE CONTROL NBP NETWORK

    Institute of Scientific and Technical Information of China (English)

    Yan Lixiang; Qin Zheng

    2003-01-01

    This letter analyzes the reasons why the known Neural Back Promulgation (NBP) network learning algorithm has slower speed and greater sample error. Based on the analysis and experiment, the training group descending Enhanced Combination Algorithm (ECA) is proposed. The analysis of the generalized property and sample error shows that the ECA can heighten the study speed and reduce individual error.

  1. Extended apprenticeship learning in doctoral training and supervision - moving beyond 'cookbook recipes'

    DEFF Research Database (Denmark)

    Tanggaard, Lene; Wegener, Charlotte

    An apprenticeship perspective on learning in academia sheds light on the potential for mutual learning and production, and also reveals the diverse range of learning resources beyond the formal novice-expert relationship. Although apprenticeship is a well-known concept in educational research......, in this case apprenticeship offers an innovative perspective on future practice and research in academia allowing more students access to high-quality research training and giving supervisors a chance to combine their own research with their supervision obligations....

  2. Automating parallel implementation of neural learning algorithms.

    Science.gov (United States)

    Rana, O F

    2000-06-01

    Neural learning algorithms generally involve a number of identical processing units, which are fully or partially connected, and involve an update function, such as a ramp, a sigmoid or a Gaussian function. Some variations also exist, where units can be heterogeneous, or where an alternative update technique is employed, such as a pulse stream generator. Associated with connections are numerical values that must be adjusted using a learning rule, and are dictated by parameters that are learning rule specific, such as momentum, a learning rate, or a temperature, amongst others. Usually, neural learning algorithms involve local updates, and a global interaction between units is often discouraged, except in instances where units are fully connected, or involve synchronous updates. In all of these instances, concurrency within a neural algorithm cannot be fully exploited without a suitable implementation strategy. A design scheme is described for translating a neural learning algorithm from inception to implementation on a parallel machine using PVM or MPI libraries, or onto programmable logic such as FPGAs. A designer must first describe the algorithm using a specialised Neural Language, from which a Petri net (PN) model is constructed automatically for verification and for building a performance model. The PN model can be used to study issues such as synchronisation points, resource sharing and concurrency within a learning rule. Specialised constructs are provided to enable a designer to express various aspects of a learning rule, such as the number and connectivity of neural nodes, the interconnection strategies, and information flows required by the learning algorithm. A scheduling and mapping strategy is then used to translate this PN model onto a multiprocessor template. We demonstrate our technique using Kohonen and backpropagation learning rules, implemented on a loosely coupled workstation cluster, and a dedicated parallel machine, with PVM libraries.

  3. Neural Gen Feature Selection for Supervised Learning Classifier

    Directory of Open Access Journals (Sweden)

    Mohammed Hasan Abdulameer

    2014-04-01

    Full Text Available Face recognition has received significant attention during the past few years. Many face recognition techniques have been developed, such as PSO-SVM and LDA-SVM. However, inefficient features may lead to inadequate recognition results. Hence, a new face recognition system based on a Genetic Algorithm and the FFBNN technique is proposed. The proposed system first performs feature extraction, and the optimal features are passed to the recognition process. In the feature extraction, the optimal features are extracted from the face image database by a Genetic Algorithm (GA) with FFBNN, and the computed optimal features are given to the FFBNN technique to carry out the training and testing process. The optimal features from the feature database are fed to the FFBNN to accomplish the training process. The well-trained FFBNN with the optimal features provides the recognition result. The optimal features selected for the FFBNN by the GA perform the face recognition process efficiently. The human face dataset called YALE is used to analyze the performance of the proposed GA-FFBNN technique, and GA-FFBNN is also compared with standard SVM and PSO-SVM techniques.

  4. Algorithmic learning in a random world

    CERN Document Server

    Vovk, Vladimir; Shafer, Glenn

    2005-01-01

    A new scientific monograph developing significant new algorithmic foundations in machine learning theory. Researchers and postgraduates in CS, statistics, and A.I. will find the book an authoritative and formal presentation of some of the most promising theoretical developments in machine learning.

  5. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity.

    Science.gov (United States)

    Heck, Gabriela S; Pintro, Val O; Pereira, Richard R; de Ávila, Mauricio B; Levin, Nayara M B; de Azevedo, Walter F

    2017-01-01

    Calculation of ligand-binding affinity is an open problem in computational medicinal chemistry. The ability to computationally predict affinities has a beneficial impact in the early stages of drug development, since it allows a mathematical model to assess protein-ligand interactions. Due to the availability of structural and binding information, machine learning methods have been applied to generate scoring functions with good predictive power. Our goal here is to review recent developments in the application of machine learning methods to predict ligand-binding affinity. We focus our review on the application of computational methods to predict binding affinity for protein targets. In addition, we also describe the major available databases for experimental binding constants and protein structures. Furthermore, we explain the most successful methods to evaluate the predictive power of scoring functions. Association of structural information with ligand-binding affinity makes it possible to generate scoring functions targeted to a specific biological system. Through regression analysis, this data can be used as a base to generate mathematical models to predict ligand-binding affinities, such as inhibition constant, dissociation constant and binding energy. Experimental biophysical techniques were able to determine the structures of over 120,000 macromolecules. Considering also the evolution of binding affinity information, we may say that we have a promising scenario for development of scoring functions, making use of machine learning techniques. Recent developments in this area indicate that building scoring functions targeted to the biological systems of interest shows superior predictive performance, when compared with other approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  6. Learning Algorithms of Multilayer Neural Networks

    OpenAIRE

    Fujiki, Sumiyoshi; FUJIKI, Nahomi, M.

    1996-01-01

    A positive reinforcement type learning algorithm is formulated for a stochastic feed-forward multilayer neural network, with far interlayer synaptic connections, and we obtain a learning rule similar to that of the Boltzmann machine on the same multilayer structure. By applying a mean field approximation to the stochastic feed-forward neural network, the generalized error back-propagation learning rule is derived for a deterministic analog feed-forward multilayer network with the far interlay...

  7. Stochastic Descent Analysis of Representation Learning Algorithms

    OpenAIRE

    Golden, Richard M.

    2014-01-01

    Although stochastic approximation learning methods have been widely used in the machine learning literature for over 50 years, formal theoretical analyses of specific machine learning algorithms are less common because stochastic approximation theorems typically possess assumptions which are difficult to communicate and verify. This paper presents a new stochastic approximation theorem for state-dependent noise with easily verifiable assumptions applicable to the analysis and design of import...

  8. A Learning Algorithm for Multimodal Grammar Inference.

    Science.gov (United States)

    D'Ulizia, A; Ferri, F; Grifoni, P

    2011-12-01

    The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.

  9. Integrating learning assessment and supervision in a competency framework for clinical workplace education.

    Science.gov (United States)

    Embo, M; Driessen, E; Valcke, M; van der Vleuten, C P M

    2015-02-01

    Although competency-based education is well established in health care education, research shows that the competencies do not always match the reality of clinical workplaces. Therefore, there is a need to design feasible and evidence-based competency frameworks that fit the workplace reality. This theoretical paper outlines a competency-based framework, designed to facilitate learning, assessment and supervision in clinical workplace education. Integration is the cornerstone of this holistic competency framework.

  10. Clinical learning environment, supervision and nurse teacher evaluation scale: psychometric evaluation of the Swedish version.

    Science.gov (United States)

    Johansson, Unn-Britt; Kaila, Päivi; Ahlner-Elmqvist, Marianne; Leksell, Janeth; Isoaho, Hannu; Saarikoski, Mikko

    2010-09-01

    This article is a report of the development and psychometric testing of the Swedish version of the Clinical Learning Environment, Supervision and Nurse Teacher evaluation scale. To achieve quality assurance, collaboration between the healthcare and nursing systems is a pre-requisite. Therefore, it is important to develop a tool that can measure the quality of clinical education. The Clinical Learning Environment, Supervision and Nurse Teacher evaluation scale is a previously validated instrument, currently used in several universities across Europe. The instrument has been suggested for use as part of quality assessment and evaluation of nursing education. The scale was translated into Swedish from the English version. Data were collected between March 2008 and May 2009 among nursing students from three university colleges, with 324 students completing the questionnaire. Exploratory factor analysis was performed on the 34-item scale to determine construct validity and Cronbach's alpha was used to measure the internal consistency. The five sub-dimensions identified in the original scale were replicated in the exploratory factor analysis. The five factors had explanation percentages of 60.2%, which is deemed sufficient. Cronbach's alpha coefficient for the total scale was 0.95, and varied between 0.96 and 0.75 within the five sub-dimensions. The Swedish version of Clinical Learning Environment, Supervision and Nurse Teacher evaluation scale has satisfactory psychometric properties and could be a useful quality instrument in nursing education. However, further investigation is required to develop and evaluate the questionnaire.
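
    As a rough illustration of the internal-consistency statistic reported above, Cronbach's alpha can be computed from a respondents-by-items score table using the standard textbook formula. The function name and the toy scores below are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of respondents, each a list of k item scores
    (hypothetical layout, assumed for this sketch).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: two items that always agree give the maximum alpha of 1.
toy_scores = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

    Values near 1 indicate that the items vary together, which is the sense in which the abstract reports 0.95 for the total scale.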

  11. Clinical learning environment and supervision of international nursing students: A cross-sectional study.

    Science.gov (United States)

    Mikkonen, Kristina; Elo, Satu; Miettunen, Jouko; Saarikoski, Mikko; Kääriäinen, Maria

    2017-05-01

    Previously, it has been shown that the clinical learning environment causes challenges for international nursing students, but there is a lack of empirical evidence relating to the background factors explaining and influencing the outcomes. To describe international and national students' perceptions of their clinical learning environment and supervision, and explain the related background factors. An explorative cross-sectional design was used in a study conducted in eight universities of applied sciences in Finland during September 2015-May 2016. All nursing students studying English language degree programs were invited to answer a self-administered questionnaire based on both the clinical learning environment, supervision and nurse teacher scale and Cultural and Linguistic Diversity scale with additional background questions. Participants (n=329) included international (n=231) and Finnish (n=98) nursing students. Binary logistic regression was used to identify background factors relating to the clinical learning environment and supervision. International students at a beginner level in Finnish perceived the pedagogical atmosphere as worse than native speakers. In comparison to native speakers, these international students generally needed greater support from the nurse teacher at their university. Students at an intermediate level in Finnish reported two times fewer negative encounters in cultural diversity at their clinical placement than the beginners. To facilitate a successful learning experience, international nursing students require a sufficient level of competence in the native language when conducting clinical placements. Educational interventions in language education are required to test causal effects on students' success in the clinical learning environment. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. PADDLE: Proximal Algorithm for Dual Dictionaries LEarning

    CERN Document Server

    Basso, Curzio; Verri, Alessandro; Villa, Silvia

    2010-01-01

    Recently, considerable research efforts have been devoted to the design of methods to learn overcomplete dictionaries for sparse coding from data. However, learned dictionaries require the solution of an optimization problem for coding new data. In order to overcome this drawback, we propose an algorithm aimed at learning both a dictionary and its dual: a linear mapping directly performing the coding. By leveraging proximal methods, our algorithm jointly minimizes the reconstruction error of the dictionary and the coding error of its dual; the sparsity of the representation is induced by an $\ell_1$-based penalty on its coefficients. The results obtained on synthetic data and real images show that the algorithm is capable of recovering the expected dictionaries. Furthermore, on a benchmark dataset, we show that the image features obtained from the dual matrix yield state-of-the-art classification performance while being much less computationally intensive.
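
    Proximal methods handle an $\ell_1$ penalty through its proximal operator, which is elementwise soft-thresholding. The sketch below shows only that generic building block, with a hypothetical function name; it is not the PADDLE algorithm itself, whose update steps are not given in the abstract.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1, applied elementwise.

    Each coefficient is shrunk toward zero by lam, and coefficients
    with magnitude at most lam are set exactly to zero, which is
    what produces sparse representations.
    """
    out = []
    for v in x:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


# Toy coefficients (hypothetical values).
shrunk = soft_threshold([3.0, -0.5, -2.0, 0.2], 1.0)
```

    In a proximal gradient scheme, a step of this kind would typically follow each gradient step on the smooth part of the objective.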

  13. Paradigms for Realizing Machine Learning Algorithms.

    Science.gov (United States)

    Agneeswaran, Vijay Srinivas; Tonpay, Pranay; Tiwary, Jayati

    2013-12-01

    The article explains the three generations of machine learning algorithms, with all three trying to operate on big data. The first generation tools are SAS, SPSS, etc., while second generation realizations include Mahout and RapidMiner (that work over Hadoop), and the third generation paradigms include Spark and GraphLab, among others. The essence of the article is that for a number of machine learning algorithms, it is important to look beyond Hadoop's Map-Reduce paradigm in order to make them work on big data. A number of promising contenders have emerged in the third generation that can be exploited to realize deep analytics on big data.

  14. Semi-supervised Learning for Classification of Polarimetric SAR Images Based on SVM-Wishart

    Directory of Open Access Journals (Sweden)

    Hua Wen-qiang

    2015-02-01

    Full Text Available In this study, we propose a new semi-supervised classification method for Polarimetric SAR (PolSAR) images, aimed at handling the case in which the training set is small. First, considering the scattering characteristics of PolSAR data, this method extracts multiple scattering features using a target decomposition approach. Second, a semi-supervised learning model is established based on a co-training framework and Support Vector Machine (SVM). Both labeled and unlabeled data are utilized in this model to obtain high classification accuracy. Third, a recovery scheme based on the Wishart classifier is proposed to improve the classification performance. The experiments conducted in this study show that the proposed method performs more effectively than other traditional methods when the training set is small.

  15. Exploiting Attribute Correlations: A Novel Trace Lasso-Based Weakly Supervised Dictionary Learning Method.

    Science.gov (United States)

    Wu, Lin; Wang, Yang; Pan, Shirui

    2016-10-04

    It is now well established that sparse representation models work effectively for many visual recognition tasks, and have pushed forward the success of dictionary learning therein. Recent studies on dictionary learning focus on learning discriminative atoms instead of purely reconstructive ones. However, the existence of intraclass diversities (i.e., data objects within the same category that exhibit large visual dissimilarities) and interclass similarities (i.e., data objects from distinct classes that share many visual similarities) makes it challenging to learn effective recognition models. To this end, a large number of labeled data objects are required to learn models which can effectively characterize these subtle differences. However, access to labeled data objects is always limited, making it difficult to learn a monolithic dictionary that is discriminative enough. To address the above limitations, in this paper, we propose a weakly-supervised dictionary learning method to automatically learn a discriminative dictionary by fully exploiting visual attribute correlations rather than label priors. In particular, the intrinsic attribute correlations are deployed as a critical cue to guide the process of object categorization, and then a set of subdictionaries are jointly learned with respect to each category. The resulting dictionary is highly discriminative and leads to intraclass diversity aware sparse representations. Extensive experiments on image classification and object recognition are conducted to show the effectiveness of our approach.

  16. New supervised learning theory applied to cerebellar modeling for suppression of variability of saccade end points.

    Science.gov (United States)

    Fujita, Masahiko

    2013-06-01

    A new supervised learning theory is proposed for a hierarchical neural network with a single hidden layer of threshold units, which can approximate any continuous transformation, and applied to a cerebellar function to suppress the end-point variability of saccades. In motor systems, feedback control can reduce noise effects if the noise is added in a pathway from a motor center to a peripheral effector; however, it cannot reduce noise effects if the noise is generated in the motor center itself: a new control scheme is necessary for such noise. The cerebellar cortex is well known as a supervised learning system, and a novel theory of cerebellar cortical function developed in this study can explain the capability of the cerebellum to feedforwardly reduce noise effects, such as end-point variability of saccades. This theory assumes that a Golgi-granule cell system can encode the strength of a mossy fiber input as the state of neuronal activity of parallel fibers. By combining these parallel fiber signals with appropriate connection weights to produce a Purkinje cell output, an arbitrary continuous input-output relationship can be obtained. By incorporating such flexible computation and learning ability in a process of saccadic gain adaptation, a new control scheme in which the cerebellar cortex feedforwardly suppresses the end-point variability when it detects a variation in saccadic commands can be devised. Computer simulation confirmed the efficiency of such learning and showed a reduction in the variability of saccadic end points, similar to results obtained from experimental data.

  17. Enhanced low-rank representation via sparse manifold adaption for semi-supervised learning.

    Science.gov (United States)

    Peng, Yong; Lu, Bao-Liang; Wang, Suhang

    2015-05-01

    Constructing an informative and discriminative graph plays an important role in various pattern recognition tasks such as clustering and classification. Among the existing graph-based learning models, low-rank representation (LRR) is a very competitive one, which has been extensively employed in spectral clustering and semi-supervised learning (SSL). In SSL, the graph is composed of both labeled and unlabeled samples, where the edge weights are calculated based on the LRR coefficients. However, most existing LRR-related approaches fail to consider the geometrical structure of data, which has been shown to be beneficial for discriminative tasks. In this paper, we propose an enhanced LRR via sparse manifold adaption, termed manifold low-rank representation (MLRR), to learn low-rank data representation. MLRR can explicitly take the local manifold structure of the data into consideration, which can be identified by the geometric sparsity idea; specifically, the local tangent space of each data point is sought by solving a sparse representation objective. Therefore, the graph depicting the relationship of data points can be built once the manifold information is obtained. We incorporate a regularizer into LRR to make the learned coefficients preserve the geometric constraints revealed in the data space. As a result, MLRR combines both the global information emphasized by the low-rank property and the local information emphasized by the identified manifold structure. Extensive experimental results on semi-supervised classification tasks demonstrate that MLRR is an excellent method in comparison with several state-of-the-art graph construction approaches.

  18. DL-ReSuMe: A Delay Learning-Based Remote Supervised Method for Spiking Neurons.

    Science.gov (United States)

    Taherkhani, Aboozar; Belatreche, Ammar; Li, Yuhua; Maguire, Liam P

    2015-12-01

    Recent research has shown the potential capability of spiking neural networks (SNNs) to model complex information processing in the brain. There is biological evidence to prove the use of the precise timing of spikes for information coding. However, the exact learning mechanism in which the neuron is trained to fire at precise times remains an open problem. The majority of the existing learning methods for SNNs are based on weight adjustment. However, there is also biological evidence that the synaptic delay is not constant. In this paper, a learning method for spiking neurons, called delay learning remote supervised method (DL-ReSuMe), is proposed to merge the delay shift approach and ReSuMe-based weight adjustment to enhance the learning performance. DL-ReSuMe uses more biologically plausible properties, such as delay learning, and needs less weight adjustment than ReSuMe. Simulation results have shown that the proposed DL-ReSuMe approach achieves learning accuracy and learning speed improvements compared with ReSuMe.

  19. Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization

    CERN Document Server

    Garcia-Cardona, Cristina; Percus, Allon G

    2013-01-01

    We present a graph-based variational algorithm for classification of high-dimensional data, generalizing the binary diffuse interface model to the case of multiple classes. Motivated by total variation techniques, the method involves minimizing an energy functional made up of three terms. The first two terms promote a stepwise continuous classification function with sharp transitions between classes, while preserving symmetry among the class labels. The third term is a data fidelity term, allowing us to incorporate prior information into the model in a semi-supervised framework. The performance of the algorithm on synthetic data, as well as on the COIL and MNIST benchmark datasets, is competitive with state-of-the-art graph-based multiclass segmentation methods.

  20. Virtual Calibration of Cosmic Ray Sensor: Using Supervised Ensemble Machine Learning

    Directory of Open Access Journals (Sweden)

    Ritaban Dutta

    2013-09-01

    Full Text Available In this paper, an ensemble of supervised machine learning methods is investigated to virtually and dynamically calibrate cosmic ray sensors measuring area-wise bulk soil moisture. The main focus of this study was to find an alternative to the currently available field calibration method, which is based on an expensive and time-consuming soil sample collection methodology. Data from the Australian Water Availability Project (AWAP) database were used as independent soil moisture ground truth, and results were compared against the conventionally estimated soil moisture using a Hydroinnova CRS-1000 cosmic ray probe deployed in Tullochgorum, Australia. The prediction performance of a complementary ensemble of four supervised estimators, namely a Sugano-type Adaptive Neuro-Fuzzy Inference System (S-ANFIS), a Cascade Forward Neural Network (CFNN), an Elman Neural Network (ENN) and a Learning Vector Quantization Neural Network (LVQN), was evaluated using training and testing paradigms. An AWAP-trained ensemble of the four estimators was able to predict bulk soil moisture directly from cosmic ray neutron counts with a best accuracy of 94.4%. The ensemble approach outperformed the individual networks. This result proved that an ensemble machine learning based paradigm could be a valuable alternative data-driven calibration method for cosmic ray sensors, compared with the current expensive field calibration method based on hydrological assumptions.
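
    The abstract does not say how the outputs of the four estimators are combined. Purely as an assumption for illustration, the sketch below averages the predictions of several already-fitted regressors; the estimator callables and the input value are hypothetical stand-ins, not the S-ANFIS, CFNN, ENN or LVQN models.

```python
from statistics import mean


def ensemble_predict(estimators, x):
    """Average the predictions of several fitted estimators for input x.

    estimators: callables mapping an input to a numeric prediction
    (hypothetical interface, assumed for this sketch).
    """
    return mean(est(x) for est in estimators)


# Four toy estimators whose individual errors cancel on average.
toy_estimators = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 1.0,
    lambda x: 2.0 * x - 1.0,
    lambda x: 2.0 * x,
]
combined = ensemble_predict(toy_estimators, 3.0)
```

    Averaging is only one combination rule; weighted averaging or a learned combiner would fit the same interface.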

  1. Test-retest reliability of the Clinical Learning Environment, Supervision and Nurse Teacher (CLES + T) scale.

    Science.gov (United States)

    Gustafsson, Margareta; Blomberg, Karin; Holmefur, Marie

    2015-07-01

    The Clinical Learning Environment, Supervision and Nurse Teacher (CLES + T) scale evaluates the student nurses' perception of the learning environment and supervision within the clinical placement. It has never been tested in a replication study. The aim of the present study was to evaluate the test-retest reliability of the CLES + T scale. The CLES + T scale was administered twice to a group of 42 student nurses, with a one-week interval. Test-retest reliability was determined by calculations of Intraclass Correlation Coefficients (ICCs) and weighted Kappa coefficients. Standard Error of Measurements (SEM) and Smallest Detectable Difference (SDD) determined the precision of individual scores. Bland-Altman plots were created for analyses of systematic differences between the test occasions. The results of the study showed that the stability over time was good to excellent (ICC 0.88-0.96) in the sub-dimensions "Supervisory relationship", "Pedagogical atmosphere on the ward" and "Role of the nurse teacher". Measurements of "Premises of nursing on the ward" and "Leadership style of the manager" had lower but still acceptable stability (ICC 0.70-0.75). No systematic differences occurred between the test occasions. This study supports the usefulness of the CLES + T scale as a reliable measure of the student nurses' perception of the learning environment within the clinical placement at a hospital.
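
    The precision measures named above are often derived from the ICC with the conventional formulas SEM = SD * sqrt(1 - ICC) and SDD = 1.96 * sqrt(2) * SEM. The abstract does not state which formulas the authors used, so the sketch below is an assumption, and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard Error of Measurement from the score SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def sdd_from_sem(sem, z=1.96):
    """Smallest Detectable Difference at the confidence level implied by z.

    The sqrt(2) factor accounts for measurement error on both test
    occasions.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs: score SD of 2.0 and ICC of 0.75.
sem = sem_from_icc(2.0, 0.75)
sdd = sdd_from_sem(sem)
```

    Under these formulas, a change in an individual score smaller than the SDD cannot be distinguished from measurement error.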

  2. Evaluation of the impact of convolution masks on algorithm to supervise scenery changes at space vehicle integration pads

    Directory of Open Access Journals (Sweden)

    Francisco Carlos P. Bizarria

    2009-06-01

    Full Text Available The Satellite Launch Vehicle developed in Brazil employs a specialized unit at the launch center known as the Movable Integration Tower. On that tower, fixed and movable work floors are installed for use by specialists, at predefined periods of time, to carry out tests mainly related to the pre-launch phase of that vehicle. Outside of those periods it is necessary to detect unexpected movements of platforms and unauthorized people on the site. Within that context, this work presents an evaluation of different resolutions of convolution mask and tolerances in the efficiency of a proposed algorithm to supervise scenery changes on these work floors. The results obtained from this evaluation are satisfactory and show that the proposed algorithm is suitable for the purpose for which it is intended.

  3. Multi-Agent Reinforcement Learning Algorithm Based on Action Prediction

    Institute of Scientific and Technical Information of China (English)

    TONG Liang; LU Ji-lian

    2006-01-01

    Multi-agent reinforcement learning algorithms are studied. A prediction-based multi-agent reinforcement learning algorithm is presented for multi-robot cooperation tasks. A multi-robot cooperation experiment based on a multi-agent inverted pendulum is carried out to test the efficiency of the new algorithm, and the experimental results show that the new algorithm can achieve the cooperation strategy much faster than the primitive multi-agent reinforcement learning algorithm.

  4. An Experimental Method for the Active Learning of Greedy Algorithms

    Science.gov (United States)

    Velazquez-Iturbide, J. Angel

    2013-01-01

    Greedy algorithms constitute an apparently simple algorithm design technique, but its learning goals are not simple to achieve. We present a didactic method aimed at promoting active learning of greedy algorithms. The method is focused on the concept of selection function, and is based on explicit learning goals. It mainly consists of an…
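
    To make the notion of a selection function concrete, the sketch below runs the same greedy scheme on an interval-scheduling toy problem with two different selection functions. The problem, the function names and the data are illustrative assumptions and are not taken from the article.

```python
def greedy_select(activities, selection_key):
    """Generic greedy scheme over (start, finish) intervals.

    Candidates are visited in the order given by selection_key, and a
    candidate is kept only if it does not overlap anything already chosen.
    """
    chosen = []
    for start, finish in sorted(activities, key=selection_key):
        # Feasibility test: half-open intervals [start, finish) must not overlap.
        if all(start >= f or finish <= s for s, f in chosen):
            chosen.append((start, finish))
    return chosen


# Toy activities (hypothetical data).
toy_activities = [(0, 6), (1, 4), (5, 7), (8, 11)]

# Selection function 1: earliest finish time first.
by_finish = greedy_select(toy_activities, lambda a: a[1])

# Selection function 2: earliest start time first.
by_start = greedy_select(toy_activities, lambda a: a[0])
```

    On this toy input the earliest-finish rule schedules three activities while the earliest-start rule schedules only two, which shows how the choice of selection function determines the quality of the greedy result.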

  6. Multi-Modal Curriculum Learning for Semi-Supervised Image Classification.

    Science.gov (United States)

    Gong, Chen; Tao, Dacheng; Maybank, Stephen J; Liu, Wei; Kang, Guoliang; Yang, Jie

    2016-07-01

    Semi-supervised image classification aims to classify a large quantity of unlabeled images by typically harnessing scarce labeled images. Existing semi-supervised methods often suffer from inadequate classification accuracy when encountering difficult yet critical images, such as outliers, because they treat all unlabeled images equally and conduct classifications in an imperfectly ordered sequence. In this paper, we employ the curriculum learning methodology by investigating the difficulty of classifying every unlabeled image. The reliability and the discriminability of these unlabeled images are particularly investigated for evaluating their difficulty. As a result, an optimized image sequence is generated during the iterative propagations, and the unlabeled images are logically classified from simple to difficult. Furthermore, since images are usually characterized by multiple visual feature descriptors, we associate each kind of features with a teacher, and design a multi-modal curriculum learning (MMCL) strategy to integrate the information from different feature modalities. In each propagation, each teacher analyzes the difficulties of the currently unlabeled images from its own modality viewpoint. A consensus is subsequently reached among all the teachers, determining the currently simplest images (i.e., a curriculum), which are to be reliably classified by the multi-modal learner. This well-organized propagation process leveraging multiple teachers and one learner enables our MMCL to outperform five state-of-the-art methods on eight popular image data sets.

  7. AcceleRater: a web application for supervised learning of behavioral modes from acceleration measurements.

    Science.gov (United States)

    Resheff, Yehezkel S; Rotics, Shay; Harel, Roi; Spiegel, Orr; Nathan, Ran

    2014-01-01

    The study of animal movement is experiencing rapid progress in recent years, forcefully driven by technological advancement. Biologgers with Acceleration (ACC) recordings are becoming increasingly popular in the fields of animal behavior and movement ecology, for estimating energy expenditure and identifying behavior, with prospects for other potential uses as well. Supervised learning of behavioral modes from acceleration data has shown promising results in many species, and for a diverse range of behaviors. However, broad implementation of this technique in movement ecology research has been limited due to technical difficulties and complicated analysis, deterring many practitioners from applying this approach. This highlights the need to develop a broadly applicable tool for classifying behavior from acceleration data. Here we present a free-access python-based web application called AcceleRater, for rapidly training, visualizing and using models for supervised learning of behavioral modes from ACC measurements. We introduce AcceleRater, and illustrate its successful application for classifying vulture behavioral modes from acceleration data obtained from free-ranging vultures. The seven models offered in the AcceleRater application achieved overall accuracy of between 77.68% (Decision Tree) and 84.84% (Artificial Neural Network), with a mean overall accuracy of 81.51% and standard deviation of 3.95%. Notably, variation in performance was larger between behavioral modes than between models. AcceleRater provides the means to identify animal behavior, offering a user-friendly tool for ACC-based behavioral annotation, which will be dynamically upgraded and maintained.

  8. A Decomposition Algorithm for Learning Bayesian Network Structures from Data

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Cordero Hernandez, Jorge

    2008-01-01

    Learning a large Bayesian network from a small data set is a challenging task. Most conventional structural learning approaches run into computational as well as statistical problems. We propose a decomposition algorithm for the structure construction without having to learn the complete network. The new learning algorithm first finds local components from the data, and then recovers the complete network by joining the learned components. We show the empirical performance of the decomposition algorithm in several benchmark networks....

  9. Rho-learning: a robotics oriented reinforcement learning algorithm

    OpenAIRE

    Porta Pleite, Josep Maria

    2000-01-01

    We present a new reinforcement learning system more suitable to be used in robotics than existing ones. Existing reinforcement learning algorithms are not specifically tailored for robotics and so they do not take advantage of the robotic perception characteristics as well as of the expected complexity of the task that robots are likely to face. In a robot, the information about the environment comes from a set of qualitatively different sensors and in the main part of tasks small subsets of t...

  10. Semi-supervised eigenvectors for large-scale locally-biased learning

    DEFF Research Database (Denmark)

    Hansen, Toke Jansen; Mahoney, Michael W.

    2014-01-01

    In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks nearby that prespecified target region. For example, one might be interested in the clustering structure of a data graph near a prespecified seed set of nodes, or one might be interested in finding partitions in an image that are near a prespecified ground truth set of pixels. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities, thus limiting the applicability of eigenvector-based methods in situations where one is interested in very local properties of the data. In this paper, we address this issue by providing...

  12. Image Recovery Algorithm Based on Learned Dictionary

    Directory of Open Access Journals (Sweden)

    Xinghui Zhu

    2014-01-01

    Full Text Available We propose a recovery scheme for image deblurring. The scheme works within the framework of sparse representation and makes three main contributions. First, considering the sparsity of natural images, nonlocal overcomplete dictionaries are learned for image patches. Then, the patches in each nonlocal cluster are coded with the corresponding learned dictionary to recover the whole latent image. In addition, for some practical applications, we also propose a method to estimate the blur kernel so that the algorithm can be used in blind image recovery. The experimental results demonstrate that the proposed scheme is competitive with some current state-of-the-art methods.

  13. Semi-supervised analysis of human brain tumours from partially labeled MRS information, using manifold learning models.

    Science.gov (United States)

    Cruz-Barbosa, Raúl; Vellido, Alfredo

    2011-02-01

    Medical diagnosis can often be understood as a classification problem. In oncology, this typically involves differentiating between tumour types and grades, or some type of discrete outcome prediction. From the viewpoint of computer-based medical decision support, this classification requires the availability of accurate diagnoses of past cases as training target examples. The availability of such labeled databases is scarce in most areas of oncology, and especially so in neuro-oncology. In such context, semi-supervised learning oriented towards classification can be a sensible data modeling choice. In this study, semi-supervised variants of Generative Topographic Mapping, a model of the manifold learning family, are applied to two neuro-oncology problems: the diagnostic discrimination between different brain tumour pathologies, and the prediction of outcomes for a specific type of aggressive brain tumours. Their performance compared favorably with those of the alternative Laplacian Eigenmaps and Semi-Supervised SVM for Manifold Learning models in most of the experiments.

  14. Local learning semi-supervised multi-class classifier

    Institute of Scientific and Technical Information of China (English)

    吕佳; 邓乃扬; 田英杰; 邵元海; 杨新民

    2013-01-01

    Semi-supervised multi-class classification is an active research topic in machine learning and pattern recognition. At present, most multi-class classification algorithms decompose the problem into a set of binary classification problems. Two class-label representation methods are proposed to avoid solving multiple binary classification problems: one represents class labels on the unit circle, and the other represents them as binary strings. Building on the good learning properties of local learning in binary classification problems, a semi-supervised multi-class classifier based on local learning is presented. Experiments on benchmark datasets, with comparisons against other related algorithms, show that the proposed classifier has a lower misclassification rate and higher stability.

  15. Lane Detection Based on Machine Learning Algorithm

    Directory of Open Access Journals (Sweden)

    Chao Fan

    2013-09-01

    Full Text Available In order to improve the accuracy and robustness of lane detection in complex conditions, such as shadows and changing illumination, a novel detection algorithm based on machine learning was proposed. After pretreatment, a set of Haar-like filters was used to calculate the eigenvalue in the gray image f(x,y) and edge e(x,y). These features were then trained using an improved boosting algorithm to obtain the final class function g(x), which was used to judge whether the point x belongs to the lane or not. To avoid the overfitting of traditional boosting, Fisher discriminant analysis was used to initialize the weights of the samples. Tests on many roads under varied conditions showed that the algorithm offers good robustness and real-time performance in recognizing lanes under challenging conditions.

  16. Ozone ensemble forecast with machine learning algorithms

    OpenAIRE

    Mallet, Vivien; Stoltz, Gilles; Mauricette, Boris

    2009-01-01

    We apply machine learning algorithms to perform sequential aggregation of ozone forecasts. The latter rely on a multimodel ensemble built for ozone forecasting with the modeling system Polyphemus. The ensemble simulations are obtained by changes in the physical parameterizations, the numerical schemes, and the input data to the models. The simulations are carried out for summer 2001 over western Europe in order to forecast ozone daily peaks and ozone hourly concentrati...

  17. Learning algorithms for perceptrons from statistical physics

    Science.gov (United States)

    Gordon, Mirta B.; Peretto, Pierre; Berchier, Dominique

    1993-02-01

    Learning algorithms for perceptrons are deduced from statistical mechanics. Thermodynamical quantities are used as cost functions which may be extremalized by gradient dynamics to find the synaptic efficacies that store the learning set of patterns. The learning rules so obtained are classified in two categories, following the statistics used to derive the cost functions, namely, Boltzmann statistics, and Fermi statistics. In the limits of zero or infinite temperatures some of the rules behave like already known algorithms, but new strategies for learning are obtained at finite temperatures, which minimize the number of errors on the training set.

  18. A new machine learning algorithm for removal of salt and pepper noise

    Science.gov (United States)

    Wang, Yi; Adhami, Reza; Fu, Jian

    2015-07-01

    Supervised machine learning algorithms have been extensively studied and applied to different fields of image processing in past decades. This paper proposes a new machine learning algorithm, called margin setting (MS), for restoring images that are corrupted by salt and pepper impulse noise. Margin setting generates a decision surface to classify the noise pixels and non-noise pixels. After the noise pixels are detected, a modified ranked order mean (ROM) filter is used to replace the corrupted pixels for image reconstruction. The margin setting algorithm is tested with grayscale and color images for different noise densities. The experimental results are compared with those of the support vector machine (SVM) and standard median filter (SMF). The results show that margin setting outperforms these methods with higher Peak Signal-to-Noise Ratio (PSNR), lower mean square error (MSE), higher image enhancement factor (IEF) and higher Structural Similarity Index (SSIM).
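
    For reference, the following is a minimal sketch of the standard median filter (SMF) baseline and of the MSE and PSNR metrics named in this abstract. It does not implement margin setting or the modified ROM filter; the 3x3 window, the border handling, the peak value of 255, and all function names are assumptions made for illustration.

```python
import math
from statistics import median


def median_filter_3x3(img):
    """Standard 3x3 median filter on a 2-D list of gray levels.

    Border pixels use only the neighbors that exist (an assumed convention).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = median(window)
    return out


def mse(a, b):
    """Mean square error between two equally sized 2-D lists."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


# Toy example: a flat gray image with one "salt" and one "pepper" pixel.
clean = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = 255  # salt
noisy[1][3] = 0    # pepper
restored = median_filter_3x3(noisy)
```

    On this toy image the median filter removes both impulses, so the PSNR of the restored image exceeds that of the noisy one. Real images with high noise density are harder, which is the regime the abstract addresses.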

  19. Hearing in a shoe-box : binaural source position and wall absorption estimation using virtually supervised learning

    OpenAIRE

    Kataria, Saurabh; Gaultier, Clément; Deleforge, Antoine

    2016-01-01

    This paper introduces a new framework for supervised sound source localization referred to as virtually-supervised learning. An acoustic shoe-box room simulator is used to generate a large number of binaural single-source audio scenes. These scenes are used to build a dataset of spatial binaural features annotated with acoustic properties such as the 3D source position and the walls' absorption coefficients. A probabilistic high- to low-dimensional regression framework is used to learn a mapp...

  20. AN IMPROVED ALGORITHM FOR SUPERVISED FUZZY C-MEANS CLUSTERING OF REMOTELY SENSED DATA

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    This paper describes an improved algorithm for fuzzy c-means clustering of remotely sensed data, by which the degree of fuzziness of the resultant classification is decreased compared with that of a conventional algorithm; that is, the classification accuracy is increased. This is achieved by incorporating covariance matrices at the level of individual classes rather than assuming a global one. Empirical results from a fuzzy classification of an Edinburgh suburban land cover confirmed the improved performance of the new algorithm for fuzzy c-means clustering, in particular when fuzziness is also accommodated in the assumed reference data.
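
    As a rough illustration of how a class-specific spread enters the fuzzy c-means membership computation, the sketch below uses the textbook membership formula with a squared distance scaled by a per-class variance. This is a one-dimensional analogue of a per-class covariance and is an assumption for illustration; the abstract does not give the paper's exact distance or update rules, and the function name and parameters are hypothetical.

```python
def fcm_memberships(x, centers, variances, m=2.0):
    """Fuzzy c-means memberships of a scalar x in each class.

    The squared distance to class i is (x - c_i)^2 / var_i, so each class
    has its own scale. Passing equal variances reproduces a single global
    scale. m > 1 is the usual fuzziness exponent.
    """
    d2 = [(x - c) ** 2 / v for c, v in zip(centers, variances)]
    if any(d == 0 for d in d2):
        # x coincides with one or more centers: share full membership among them.
        hits = [1.0 if d == 0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


# Same point and centers, first with a global variance, then with per-class variances.
global_u = fcm_memberships(1.0, [0.0, 4.0], [1.0, 1.0])
per_class_u = fcm_memberships(1.0, [0.0, 4.0], [1.0, 9.0])
```

    The memberships always sum to one; changing the per-class variances changes how membership is split between classes for the same point.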

  1. A new semi-supervised classification strategy combining active learning and spectral unmixing of hyperspectral data

    Science.gov (United States)

    Sun, Yanli; Zhang, Xia; Plaza, Antonio; Li, Jun; Dópido, Inmaculada; Liu, Yi

    2016-10-01

    Hyperspectral remote sensing allows for the detailed analysis of the surface of the Earth by providing high-dimensional images with hundreds of spectral bands. Hyperspectral image classification plays a significant role in hyperspectral image analysis and has been a very active research area in the last few years. In the context of hyperspectral image classification, supervised techniques (which have achieved wide acceptance) must address a difficult task due to the unbalance between the high dimensionality of the data and the limited availability of labeled training samples in real analysis scenarios. While the collection of labeled samples is generally difficult, expensive, and time-consuming, unlabeled samples can be generated in a much easier way. Semi-supervised learning offers an effective solution that can take advantage of both unlabeled and a small amount of labeled samples. Spectral unmixing is another widely used technique in hyperspectral image analysis, developed to retrieve pure spectral components and determine their abundance fractions in mixed pixels. In this work, we propose a method to perform semi-supervised hyperspectral image classification by combining the information retrieved with spectral unmixing and classification. Two kinds of samples that are highly mixed in nature are automatically selected, aiming at finding the most informative unlabeled samples. One kind is given by the samples minimizing the distance between the first two most probable classes by calculating the difference between the two highest abundances. Another kind is given by the samples minimizing the distance between the most probable class and the least probable class, obtained by calculating the difference between the highest and lowest abundances. The effectiveness of the proposed method is evaluated using a real hyperspectral data set collected by the airborne visible infrared imaging spectrometer (AVIRIS) over the Indian Pines region in Northwestern Indiana. 
In the
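
    The two sample-selection criteria described in this abstract can be sketched as follows. The sketch assumes that spectral unmixing has already produced one abundance vector per unlabeled pixel; the function name, the parameter `k`, and the toy abundances are hypothetical, and the unmixing and classification steps themselves are not shown.

```python
def select_mixed_samples(abundances, k):
    """Pick the k most 'mixed' unlabeled samples under two criteria.

    abundances: list of per-pixel abundance vectors (one fraction per class).
    Returns (by_top_two, by_range), two lists of pixel indices:
      by_top_two: smallest difference between the two highest abundances
      by_range:   smallest difference between the highest and lowest abundances
    """
    def top_two_gap(a):
        s = sorted(a, reverse=True)
        return s[0] - s[1]

    def range_gap(a):
        return max(a) - min(a)

    idx = range(len(abundances))
    by_top_two = sorted(idx, key=lambda i: top_two_gap(abundances[i]))[:k]
    by_range = sorted(idx, key=lambda i: range_gap(abundances[i]))[:k]
    return by_top_two, by_range


# Toy abundance vectors for five pixels and three classes.
toy = [
    [0.90, 0.05, 0.05],  # nearly pure pixel
    [0.45, 0.40, 0.15],
    [0.34, 0.33, 0.33],  # almost evenly mixed
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],  # two classes tied, third absent
]
top_two, full_range = select_mixed_samples(toy, 2)
```

    The two criteria can disagree: pixel 4 is maximally ambiguous between its two leading classes but is not evenly mixed across all classes, so only the first criterion selects it.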

  2. Literature mining of protein-residue associations with graph rules learned through distant supervision

    Directory of Open Access Journals (Sweden)

    Ravikumar KE

    2012-10-01

    Full Text Available Abstract Background: We propose a method for automatic extraction of protein-specific residue mentions from the biomedical literature. The method searches text for mentions of amino acids at specific sequence positions and attempts to correctly associate each mention with a protein also named in the text. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic patterns corresponding to protein-residue pairs mentioned in the text. We finally present an approach to automated construction of relevant training and test data using the distant supervision model. Results: The performance of the method was assessed by extracting protein-residue relations from a new automatically generated test set of sentences containing high confidence examples found using distant supervision. It achieved an F-measure of 0.84 on an automatically created silver corpus and 0.79 on a manually annotated gold data set for this task, outperforming previous methods. Conclusions: The primary contributions of this work are to (1) demonstrate the effectiveness of distant supervision for automatic creation of training data for protein-residue relation extraction, substantially reducing the effort and time involved in manual annotation of a data set and (2) show that the graph-based relation extraction approach we used generalizes well to the problem of protein-residue association extraction. This work paves the way towards effective extraction of protein functional residues from the literature.

  3. Processing of rock core microtomography images: Using seven different machine learning algorithms

    Science.gov (United States)

    Chauhan, Swarup; Rühaak, Wolfram; Khan, Faisal; Enzmann, Frieder; Mielke, Philipp; Kersten, Michael; Sass, Ingo

    2016-01-01

    The abilities of machine learning algorithms to process X-ray microtomographic rock images were determined. The study focused on the use of unsupervised, supervised, and ensemble clustering techniques, to segment X-ray computer microtomography rock images and to estimate the pore spaces and pore size diameters in the rocks. The unsupervised k-means technique gave the fastest processing time and the supervised least squares support vector machine technique gave the slowest processing time. Multiphase assemblages of solid phases (minerals and finely grained minerals) and the pore phase were found on visual inspection of the images. In general, the accuracy in terms of porosity values and pore size distribution was found to be strongly affected by the feature vectors selected. The average relative porosity of 15.92±1.77% retrieved from all seven machine learning algorithms is in very good agreement with the experimental result of 17±2%, obtained using a gas pycnometer. Of the supervised techniques, the least squares support vector machine technique is superior to the feed-forward artificial neural network because of its ability to identify a generalized pattern. Among the ensemble classification techniques, the boosting technique converged faster than the bagging technique. The k-means technique outperformed the fuzzy c-means and self-organized maps techniques in terms of accuracy and speed.
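
    To make the link between segmentation and porosity concrete, here is a minimal sketch that clusters voxel gray values with a plain one-dimensional k-means and reports the fraction of voxels in the darkest cluster as porosity. It assumes that pores are the darkest phase and that gray value alone is the feature; the study's actual feature vectors, the number of phases, and all names below are not taken from the source.

```python
def kmeans_1d(values, k=2, iters=50):
    """Plain k-means on scalar values (k >= 2); centers start evenly spaced."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def porosity(values, k=2):
    """Fraction of voxels assigned to the darkest cluster (assumed to be pore space)."""
    centers, labels = kmeans_1d(values, k)
    pore = min(range(k), key=lambda j: centers[j])
    return sum(1 for lab in labels if lab == pore) / len(values)


# Toy gray values: four dark (pore) voxels and six bright (solid) voxels.
voxels = [10, 12, 11, 200, 210, 205, 198, 202, 207, 9]
phi = porosity(voxels)
```

    On the toy data the darkest cluster holds four of ten voxels, giving a porosity of 0.4. Real microtomography images need richer features, which the abstract identifies as the main driver of accuracy.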

  4. Two Linear Unmixing Algorithms to Recognize Targets Using Supervised Classification and Orthogonal Rotation in Airborne Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Michael Zheludev

    2012-02-01

    Full Text Available The goal of the paper is to detect pixels that contain targets of known spectra. The target can be present in a sub- or above pixel. Pixels without targets are classified as background pixels. Each pixel is treated via the content of its neighborhood. A pixel whose spectrum is different from its neighborhood is classified as a “suspicious point”. In each suspicious point there is a mix of target(s) and background. The main objective in a supervised detection (also called “target detection”) is to search for a specific given spectral material (target) in hyperspectral imaging (HSI) where the spectral signature of the target is known a priori from laboratory measurements. In addition, the fractional abundance of the target is computed. To achieve this we present two linear unmixing algorithms that recognize targets with known (given) spectral signatures. The CLUN is based on automatic feature extraction from the target’s spectrum. These features separate the target from the background. The ROTU algorithm is based on embedding the spectra space into a special space by random orthogonal transformation and on the statistical properties of the embedded result. Experimental results demonstrate that the targets’ locations were extracted correctly and these algorithms are robust and efficient.

  5. Semi-supervised learning of causal relations in biomedical scientific discourse

    Science.gov (United States)

    2014-01-01

    Background: The number of articles published daily in the biomedical domain has become too large for humans to handle on their own. As a result, bio-text mining technologies have been developed to ease their workload by automatically analysing the text and extracting important knowledge. Specific bio-entities, bio-events between these and facts can now be recognised with sufficient accuracy and are widely used by biomedical researchers. However, understanding how the extracted facts are connected in text is an extremely difficult task, which cannot be easily tackled by machinery. Results: In this article, we describe our method to recognise causal triggers and their arguments in biomedical scientific discourse. We introduce new features and show that a self-learning approach improves the performance obtained by supervised machine learners to 83.47% for causal triggers. Furthermore, the spans of causal arguments can be recognised to a slightly higher level than by using supervised or rule-based methods that have been employed before. Conclusion: Exploiting the large amount of unlabelled data that is already available can help improve the performance of recognising causal discourse relations in the biomedical domain. This improvement will further benefit the development of multiple tasks, such as hypothesis generation for experimental laboratories, contradiction detection, and the creation of causal networks. PMID:25559746

  6. A Novel Semi-Supervised Electronic Nose Learning Technique: M-Training

    Directory of Open Access Journals (Sweden)

    Pengfei Jia

    2016-03-01

    Full Text Available When an electronic nose (E-nose) is used to distinguish different kinds of gases, the label information of the target gas could be lost due to some fault of the operators or some other reason, although this is not expected. Another fact is that the cost of getting labeled samples is usually higher than for unlabeled ones. In most cases, the classification accuracy of an E-nose trained using labeled samples is higher than that of an E-nose trained with unlabeled ones, so gases without label information would not normally be used to train an E-nose. However, discarding them wastes resources and can even delay the progress of research. In this work a novel multi-class semi-supervised learning technique called M-training is proposed to train E-noses with both labeled and unlabeled samples. We employ M-training to train an E-nose which is used to distinguish three indoor pollutant gases (benzene, toluene and formaldehyde). Data processing results show that the classification accuracy of an E-nose trained by semi-supervised techniques (tri-training and M-training) is higher than that of an E-nose trained only with labeled samples, and the performance of M-training is better than that of tri-training because more base classifiers can be employed by M-training.

  7. Using distant supervised learning to identify protein subcellular localizations from full-text scientific articles.

    Science.gov (United States)

    Zheng, Wu; Blake, Catherine

    2015-10-01

    Databases of curated biomedical knowledge, such as the protein-locations reflected in the UniProtKB database, provide an accurate and useful resource to researchers and decision makers. Our goal is to augment the manual efforts currently used to curate knowledge bases with automated approaches that leverage the increased availability of full-text scientific articles. This paper describes experiments that use distant supervised learning to identify protein subcellular localizations, which are important to understand protein function and to identify candidate drug targets. Experiments consider Swiss-Prot, the manually annotated subset of the UniProtKB protein knowledge base, and 43,000 full-text articles from the Journal of Biological Chemistry that contain just under 11.5 million sentences. The system achieves 0.81 precision and 0.49 recall at sentence level and an accuracy of 57% on held-out instances in a test set. Moreover, the approach identifies 8210 instances that are not in the UniProtKB knowledge base. Manual inspection of the 50 most likely relations showed that 41 (82%) were valid. These results have immediate benefit to researchers interested in protein function, and suggest that distant supervision should be explored to complement other manual data curation efforts.
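
    The sentence-level precision and recall quoted in this abstract can be combined into a single F-score, which the abstract itself does not report. The short sketch below does that arithmetic and also reproduces the 82% figure from the manual inspection; the function name is hypothetical and the derived F-score is an illustration, not a result claimed by the authors.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures quoted in the abstract.
sentence_f1 = f_score(0.81, 0.49)  # about 0.61
valid_fraction = 41 / 50           # manually inspected top-50 relations
```

    The harmonic mean sits closer to the lower of the two values, which is why high precision with modest recall yields an F1 of only about 0.61.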

  8. A new learning algorithm with reference-following variables

    Institute of Scientific and Technical Information of China (English)

    Baiquan LÜ; Yuan CAO

    2004-01-01

    A learning algorithm for neural networks is presented in which the learning trajectory converges without over-learning, by changing the topological construction of the algorithm near local minimum points of the learning error. Because the topological construction of the usual BP method is not convergent for some functions near some local minimum points, an over-learning phenomenon occurs. To avoid this phenomenon, reference-following variables are used to change the topological construction of the algorithm. The theoretical analysis and the simulation results indicate that the proposed method is simple and useful.

  9. Quality-Related Monitoring and Grading of Granulated Products by Weibull-Distribution Modeling of Visual Images with Semi-Supervised Learning

    Directory of Open Access Journals (Sweden)

    Jinping Liu

    2016-06-01

    Full Text Available The topic of online product quality inspection (OPQI) with smart visual sensors is attracting increasing interest in both the academic and industrial communities on account of the natural connection between the visual appearance of products with their underlying qualities. Visual images captured from granulated products (GPs), e.g., cereal products, fabric textiles, are comprised of a large number of independent particles or stochastically stacking locally homogeneous fragments, whose analysis and understanding remains challenging. A method of image statistical modeling-based OPQI for GP quality grading and monitoring by a Weibull distribution (WD) model with a semi-supervised learning classifier is presented. WD-model parameters (WD-MPs) of GP images’ spatial structures, obtained with omnidirectional Gaussian derivative filtering (OGDF), which were demonstrated theoretically to obey a specific WD model of integral form, were extracted as the visual features. Then, a co-training-style semi-supervised classifier algorithm, named COSC-Boosting, was exploited for semi-supervised GP quality grading, by integrating two independent classifiers with complementary nature in the face of scarce labeled samples. Effectiveness of the proposed OPQI method was verified and compared in the field of automated rice quality grading with commonly-used methods and showed superior performance, which lays a foundation for the quality control of GP on assembly lines.

  10. Quality-Related Monitoring and Grading of Granulated Products by Weibull-Distribution Modeling of Visual Images with Semi-Supervised Learning

    Science.gov (United States)

    Liu, Jinping; Tang, Zhaohui; Xu, Pengfei; Liu, Wenzhong; Zhang, Jin; Zhu, Jianyong

    2016-01-01

    The topic of online product quality inspection (OPQI) with smart visual sensors is attracting increasing interest in both the academic and industrial communities on account of the natural connection between the visual appearance of products with their underlying qualities. Visual images captured from granulated products (GPs), e.g., cereal products, fabric textiles, are comprised of a large number of independent particles or stochastically stacking locally homogeneous fragments, whose analysis and understanding remains challenging. A method of image statistical modeling-based OPQI for GP quality grading and monitoring by a Weibull distribution (WD) model with a semi-supervised learning classifier is presented. WD-model parameters (WD-MPs) of GP images’ spatial structures, obtained with omnidirectional Gaussian derivative filtering (OGDF), which were demonstrated theoretically to obey a specific WD model of integral form, were extracted as the visual features. Then, a co-training-style semi-supervised classifier algorithm, named COSC-Boosting, was exploited for semi-supervised GP quality grading, by integrating two independent classifiers with complementary nature in the face of scarce labeled samples. Effectiveness of the proposed OPQI method was verified and compared in the field of automated rice quality grading with commonly-used methods and showed superior performance, which lays a foundation for the quality control of GP on assembly lines. PMID:27367703

  11. Classification of Autism Spectrum Disorder Using Supervised Learning of Brain Connectivity Measures Extracted from Synchrostates

    CERN Document Server

    Jamal, Wasifa; Oprescu, Ioana-Anastasia; Maharatna, Koushik; Apicella, Fabio; Sicca, Federico

    2014-01-01

    Objective. The paper investigates the presence of autism using the functional brain connectivity measures derived from electro-encephalogram (EEG) of children during face perception tasks. Approach. Phase synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase synchronized states or synchrostates temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) for analyzing their EEG while processing fearful, happy and neutral faces. The minimally and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we explored discriminant analysis and the support vector machine, both with polynomial kernels, for the classification task. Main results. The leave ...

  12. Exhaustive and Efficient Constraint Propagation: A Semi-Supervised Learning Perspective and Its Applications

    CERN Document Server

    Lu, Zhiwu; Peng, Yuxin

    2011-01-01

    This paper presents a novel pairwise constraint propagation approach by decomposing the challenging constraint propagation problem into a set of independent semi-supervised learning subproblems which can be solved in quadratic time using label propagation based on k-nearest neighbor graphs. Considering that this time cost is proportional to the number of all possible pairwise constraints, our approach actually provides an efficient solution for exhaustively propagating pairwise constraints throughout the entire dataset. The resulting exhaustive set of propagated pairwise constraints are further used to adjust the similarity matrix for constrained spectral clustering. Other than the traditional constraint propagation on single-source data, our approach is also extended to more challenging constraint propagation on multi-source data where each pairwise constraint is defined over a pair of data points from different sources. This multi-source constraint propagation has an important application to cross-modal mul...

  13. Anxiety, supervision and a space for thinking: some narcissistic perils for clinical psychologists in learning psychotherapy.

    Science.gov (United States)

    Mollon, P

    1989-06-01

    The process of learning psychotherapy involves narcissistic dangers--there may be injuries to self-esteem and self-image, especially when working with certain kinds of disturbed and hostile patients. Some patients will unconsciously recreate, in the transference, representations of early damaging experiences with parents, but now reversed with the therapist as the victim. It is vital for the trainee to be helped to understand these powerful interactional pressures. There are aspects of the professional culture and ideals of clinical psychologists (and possibly of some psychiatrists and social workers as well) which may make them particularly vulnerable in work with the hostile patient. It is argued that the function of supervision is not to teach a technique directly, but to create a 'space for thinking'--a kind of thinking which is more akin to maternal reverie, as described by Bion, than problem solving.

  14. Bayesian online algorithms for learning in discrete Hidden Markov Models

    OpenAIRE

    Alamino, Roberto C.; Caticha, Nestor

    2008-01-01

    We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
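
    As a minimal sketch of the generalization measure named above, the Kullback-Leibler divergence between two discrete distributions can be computed as follows. The function name and the two example distributions are hypothetical illustrations, not taken from the paper, and the record does not say which HMM quantities the authors actually compare.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), in nats, for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical emission distributions of a "true" and a "learned" HMM state.
true_emission = [0.5, 0.3, 0.2]
learned_emission = [0.4, 0.4, 0.2]
divergence = kl_divergence(true_emission, learned_emission)
```

    Plotting such a divergence against the number of observed examples would give a learning curve of the kind the abstract describes; the divergence is zero only when the two distributions coincide.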

  15. A semi-supervised learning framework for biomedical event extraction based on hidden topics.

    Science.gov (United States)

    Zhou, Deyu; Zhong, Dayou

    2015-05-01

    Scientists have devoted decades of effort to understanding the interaction between proteins or RNA production. This information might advance current knowledge of drug reactions or the development of certain diseases. Nevertheless, because it lacks explicit structure, the life science literature, one of the most important sources of this information, is difficult for computer-based systems to access. Therefore, biomedical event extraction, which automatically acquires knowledge of molecular events from research articles, has recently attracted community-wide efforts. Most approaches are based on statistical models, which require large-scale annotated corpora to estimate the models' parameters precisely. However, such corpora are usually difficult to obtain in practice. Therefore, employing un-annotated data through semi-supervised learning for biomedical event extraction is a feasible solution and is attracting growing interest. In this paper, a semi-supervised learning framework based on hidden topics for biomedical event extraction is presented. In this framework, sentences in the un-annotated corpus are automatically assigned event annotations based on their distances to sentences in the annotated corpus. More specifically, not only the structures of the sentences but also the hidden topics embedded in the sentences are used to describe the distance. The sentences and newly assigned event annotations, together with the annotated corpus, are employed for training. Experiments were conducted on the multi-level event extraction corpus, a gold standard corpus. Experimental results show that an improvement of more than 2.2% in F-score on biomedical event extraction is achieved by the proposed framework when compared to the state-of-the-art approach. The results suggest that by incorporating un-annotated data, the proposed framework indeed improves the performance of the state-of-the-art event extraction system and the similarity between sentences might be precisely

  16. Entry-Level Technical Skills That Teachers Expected Students to Learn through Supervised Agricultural Experiences (SAEs): A Modified Delphi Study

    Science.gov (United States)

    Ramsey, Jon W.; Edwards, M. Craig

    2012-01-01

    Supervised experiences are designed to provide opportunities for the hands-on learning of skills and practices that lead to successful personal growth and future employment in an agricultural career (Talbert, Vaughn, Croom, & Lee, 2007). In the Annual Report for Agricultural Education (2005-2006), it was stated that 91% of the respondents…

  17. Just How Much Can School Pupils Learn from School Gardening? A Study of Two Supervised Agricultural Experience Approaches in Uganda

    Science.gov (United States)

    Okiror, John James; Matsiko, Biryabaho Frank; Oonyu, Joseph

    2011-01-01

    School systems in Africa are short of skills that link well with rural communities, yet arguments to vocationalize curricula remain mixed and school agriculture lacks the supervised practical component. This study, conducted in eight primary (elementary) schools in Uganda, sought to compare the learning achievement of pupils taught using…

  18. Teaching the computer to code frames in news: comparing two supervised machine learning approaches to frame analysis

    NARCIS (Netherlands)

    Burscher, B.; Odijk, D.; Vliegenthart, R.; de Rijke, M.; de Vreese, C.H.

    2014-01-01

    We explore the application of supervised machine learning (SML) to frame coding. By automating the coding of frames in news, SML facilitates the incorporation of large-scale content analysis into framing research, even if financial resources are scarce. This furthers a more integrated investigation

  20. Entry-Level Technical Skills That Teachers Expected Students to Learn through Supervised Agricultural Experiences (SAEs): A Modified Delphi Study

    Science.gov (United States)

    Ramsey, Jon W.; Edwards, M. Craig

    2012-01-01

    Supervised experiences are designed to provide opportunities for the hands-on learning of skills and practices that lead to successful personal growth and future employment in an agricultural career (Talbert, Vaughn, Croom, & Lee, 2007). In the Annual Report for Agricultural Education (2005-2006), it was stated that 91% of the respondents (i.e.,…

  1. Collective Academic Supervision: A Model for Participation and Learning in Higher Education

    Science.gov (United States)

    Nordentoft, Helle Merete; Thomsen, Rie; Wichmann-Hansen, Gitte

    2013-01-01

    Supervision of graduate students is a core activity in higher education. Previous research on graduate supervision focuses on individual and relational aspects of the supervisory relationship rather than collective, pedagogical and methodological aspects of the supervision process. In presenting a collective model we have developed for academic…

  2. Machine Learning Algorithms in Web Page Classification

    Directory of Open Access Journals (Sweden)

    W.A.AWAD

    2012-11-01

    Full Text Available In this paper we use machine learning algorithms such as SVM, KNN and GIS to compare their behavior on the web page classification problem. The experiments show that SVM, using a small number of negative documents to build the centroids, has the smallest storage requirement and the lowest online test computation cost. Almost all GIS variants, with different numbers of nearest neighbors, have an even higher storage requirement and online test computation cost than KNN. This suggests that future work should try to reduce the storage requirement and online test cost of GIS.

  3. Self-Learning Algorithm for Coiling Temperature Controlling

    Institute of Scientific and Technical Information of China (English)

    WANG Jun; WANG Guo-dong; LIU Xiang-hua; ZHANG Dian-hua

    2004-01-01

    In order to establish a mathematical model for strip laminar cooling, the self-learning algorithm was introduced with the level learning for obvious heat flux fluctuation and the pattern learning for small heat flux fluctuation. The short self-learning calculation steps of water cooling and air cooling, and the long self-learning formula were given with some results.

  4. Learning Bayesian network structure with immune algorithm

    Institute of Scientific and Technical Information of China (English)

    Zhiqiang Cai; Shubin Si; Shudong Sun; Hongyan Dui

    2015-01-01

    Finding reasonable structures in bulky data is one of the difficulties in modeling a Bayesian network (BN), and is also necessary for promoting the application of BNs. This paper proposes an immune algorithm based method (BN-IA) for learning the BN structure with the idea of vaccination. Furthermore, methods for extracting effective vaccines from the local optimal structure and root nodes are described in detail. Finally, simulation studies are carried out with the helicopter convertor BN model and the car start BN model. The comparison results show that the proposed vaccines and the BN-IA can learn the BN structure effectively and efficiently.

  5. Visual Recognition by Learning From Web Data via Weakly Supervised Domain Generalization.

    Science.gov (United States)

    Niu, Li; Li, Wen; Xu, Dong; Cai, Jianfei

    2016-06-01

    In this paper, a weakly supervised domain generalization (WSDG) method is proposed for real-world visual recognition tasks, in which we train classifiers by using Web data (e.g., Web images and Web videos) with noisy labels. In particular, two challenging problems need to be solved when learning robust classifiers, in which the first issue is to cope with the label noise of training Web data from the source domain, while the second issue is to enhance the generalization capability of learned classifiers to an arbitrary target domain. In order to handle the first problem, the training samples within each category are partitioned into clusters, where we use one bag to denote each cluster and instances to denote the samples in each cluster. Then, we identify a proportion of good training samples in each bag and train robust classifiers by using the good training samples, which leads to a multi-instance learning (MIL) problem. In order to handle the second problem, we assume that the training samples possibly form a set of hidden domains, with each hidden domain associated with a distinctive data distribution. Then, for each category and each hidden latent domain, we propose to learn one classifier by extending our MIL formulation, which leads to our WSDG approach. In the testing stage, our approach can obtain better generalization capability by effectively integrating multiple classifiers from different latent domains in each category. Moreover, our WSDG approach is further extended to utilize additional textual descriptions associated with Web data as privileged information (PI), although testing data do not have such PI. Extensive experiments on three benchmark data sets indicate that our newly proposed methods are effective for real-world visual recognition tasks by learning from Web data.

  6. Whither Supervision?

    Directory of Open Access Journals (Sweden)

    Duncan Waite

    2006-11-01

    Full Text Available This paper asks whether school supervision is in decline. Dr. Waite responds that the answer depends on the perspective from which one looks at it. He suggests considering three related elements: the field itself; the expert in the field (the professor, the theorist, the student and the administrator); and the context. A review of these three elements shows that there is no consensus about the field of supervision, although there is agreement on its importance and on its relation to improving the practice of students in the school for their benefit. Dr. Waite suggests that practice in this field is not always in harmony with what theorists affirm. Regarding the supervisor or skilled person, the author indicates that his or her perspective depends on his or her epistemological beliefs and on the way he or she conceives of learning, which is why supervision can be understood in different ways. Regarding the context, Waite suggests taking into consideration the social or external forces that influence people and society, because education is affected through them. Dr. Waite concludes that the way supervision is understood depends on the practitioner’s perspective. He answers the initial question by saying that the supervision authorities, the knowledge in this field, its practitioners and its practice are perhaps dispersed but not extinct, because supervision will always be part of the great enterprise that we call education.

  7. A Learning Algorithm based on High School Teaching Wisdom

    CERN Document Server

    Philip, Ninan Sajeeth

    2010-01-01

    A learning algorithm based on primary school teaching and learning is presented. The methodology is to continuously evaluate a student and to give them training on the examples for which they repeatedly fail, until, they can correctly answer all types of questions. This incremental learning procedure produces better learning curves by demanding the student to optimally dedicate their learning time on the failed examples. When used in machine learning, the algorithm is found to train a machine on a data with maximum variance in the feature space so that the generalization ability of the network improves. The algorithm has interesting applications in data mining, model evaluations and rare objects discovery.
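
    The train-on-failures idea described above can be sketched as a loop that repeatedly evaluates the learner and retrains only on the examples it currently gets wrong. This is an illustration under stated assumptions, not the paper's algorithm: the linear (perceptron-style) learner, the function names and the toy AND-like data set are all hypothetical.

```python
def predict(weights, bias, x):
    """Return +1 or -1 from a linear decision rule."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


def train_on_failures(samples, labels, max_rounds=100, learning_rate=1.0):
    """Evaluate all examples each round and update only on the failed ones."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_rounds):
        # "Examine the student": collect the examples answered incorrectly.
        failed = [(x, y) for x, y in zip(samples, labels)
                  if predict(weights, bias, x) != y]
        if not failed:
            break  # every example is answered correctly
        # "Give training on the failed examples" (perceptron update).
        for x, y in failed:
            weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
            bias += learning_rate * y
    return weights, bias


# Hypothetical toy task: output +1 only when both inputs are 1.
toy_samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
toy_labels = [-1, -1, -1, 1]
toy_weights, toy_bias = train_on_failures(toy_samples, toy_labels)
```

    The loop stops as soon as no example fails, so training time is spent only where the learner is still wrong, which is the behaviour the abstract attributes to the method.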

  8. Improved Bounds on Quantum Learning Algorithms

    CERN Document Server

    Atici, Alp; Servedio, Rocco A.

    2004-01-01

    In this article we give several new results on the complexity of algorithms that learn Boolean functions from quantum queries and quantum examples. Hunziker et al. conjectured that for any class C of Boolean functions, the number of quantum black-box queries which are required to exactly identify an unknown function from C is at most $O(\frac{\log |C|}{\sqrt{\hat{\gamma}^{C}}})$, where $\hat{\gamma}^{C}$ is a combinatorial parameter of the class C. We essentially resolve this conjecture in the affirmative by giving a quantum algorithm that, for any class C, identifies any unknown function from C using at most $O(\frac{\log |C| \log \log |C|}{\sqrt{\hat{\gamma}^{C}}})$ quantum black-box queries. We consider a range of natural problems intermediate between the exact learning problem (in which the learner must obtain all bits of information about the black-box function) and the usual problem of computing a predicate (in which the learner must obtain only one bit of information about the black-box function). ...

  9. Supervised Learning of Two-Layer Perceptron under the Existence of External Noise — Learning Curve of Boolean Functions of Two Variables in Tree-Like Architecture —

    Science.gov (United States)

    Uezu, Tatsuya; Kiyokawa, Shuji

    2016-06-01

    We investigate the supervised batch learning of Boolean functions expressed by a two-layer perceptron with a tree-like structure. We adopt continuous weights (spherical model) and the Gibbs algorithm. We study the Parity and And machines and two types of noise, input and output noise, together with the noiseless case. We assume that only the teacher suffers from noise. By using the replica method, we derive the saddle point equations for order parameters under the replica symmetric (RS) ansatz. We study the critical value $\alpha_C$ of the loading rate $\alpha$ above which the learning phase exists for cases with and without noise. We find that $\alpha_C$ is nonzero for the Parity machine, while it is zero for the And machine. We derive the exponents $\bar{\beta}$ of order parameters expressed as $(\alpha - \alpha_C)^{\bar{\beta}}$ when $\alpha$ is near to $\alpha_C$. Furthermore, in the Parity machine, when noise exists, we find a spin glass solution, in which the overlap between the teacher and student vectors is zero but that between student vectors is nonzero. We perform Markov chain Monte Carlo simulations by simulated annealing and also by exchange Monte Carlo simulations in both machines. In the Parity machine, we study the de Almeida-Thouless stability, and by comparing theoretical and numerical results, we find that there exist parameter regions where the RS solution is unstable, and that the spin glass solution is metastable or unstable. We also study asymptotic learning behavior for large $\alpha$ and derive the exponents $\hat{\beta}$ of order parameters expressed as $\alpha^{-\hat{\beta}}$ when $\alpha$ is large in both machines. By simulated annealing simulations, we confirm these results and conclude that learning takes place for the input noise case with any noise amplitude and for the output noise case when the probability that the teacher's output is reversed is less than one-half.

  10. Supervised Learning Approach for Spam Classification Analysis using Data Mining Tools

    Directory of Open Access Journals (Sweden)

    R.Deepa Lakshmi

    2010-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today’s Internet, bringing financial damage to companies and annoying individual users. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. This paper explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  11. Supervised Learning Approach for Spam Classification Analysis using Data Mining Tools

    Directory of Open Access Journals (Sweden)

    R.Deepa Lakshmi

    2010-11-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today’s Internet, bringing financial damage to companies and annoying individual users. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. This paper explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  12. Optimization of Evolutionary Neural Networks Using Hybrid Learning Algorithms

    OpenAIRE

    Abraham, Ajith

    2004-01-01

    Evolutionary artificial neural networks (EANNs) refer to a special class of artificial neural networks (ANNs) in which evolution is another fundamental form of adaptation in addition to learning. Evolutionary algorithms are used to adapt the connection weights, network architecture and learning algorithms according to the problem environment. Even though evolutionary algorithms are well known as efficient global search algorithms, very often they miss the best local solutions in the complex s...

  13. Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data

    Science.gov (United States)

    2015-07-01

    Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data Guy Van den Broeck∗ and Karthika Mohan∗ and Arthur Choi and Adnan... We propose a family of efficient algorithms for learning the parameters of a Bayesian network from incomplete data. Our approach is based on recent... algorithms like EM (which require inference). 1 INTRODUCTION When learning the parameters of a Bayesian network from data with missing values, the

  14. Automated detection of microaneurysms using scale-adapted blob analysis and semi-supervised learning.

    Science.gov (United States)

    Adal, Kedir M; Sidibé, Désiré; Ali, Sharib; Chaum, Edward; Karnowski, Thomas P; Mériaudeau, Fabrice

    2014-04-01

    Despite several attempts, automated detection of microaneurysms (MAs) from digital fundus images remains an open issue. This is due to the subtle nature of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are introduced to characterize these blob regions. A semi-supervised learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier that can detect true MAs. The developed system is built using only a few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques as well as the applicability of the proposed features to analyze fundus images.

  15. The effects of supervised learning on event-related potential correlates of music-syntactic processing.

    Science.gov (United States)

    Guo, Shuang; Koelsch, Stefan

    2015-11-11

    Humans process music even without conscious effort according to implicit knowledge about syntactic regularities. Whether such automatic and implicit processing is modulated by veridical knowledge has remained unknown in previous neurophysiological studies. This study investigates this issue by testing whether the acquisition of veridical knowledge of a music-syntactic irregularity (acquired through supervised learning) modulates early, partly automatic, music-syntactic processes (as reflected in the early right anterior negativity, ERAN), and/or late controlled processes (as reflected in the late positive component, LPC). Excerpts of piano sonatas with syntactically regular and less regular chords were presented repeatedly (10 times) to non-musicians and amateur musicians. Participants were informed by a cue as to whether the following excerpt contained a regular or less regular chord. Results showed that the repeated exposure to several presentations of regular and less regular excerpts did not influence the ERAN elicited by less regular chords. By contrast, amplitudes of the LPC (as well as of the P3a evoked by less regular chords) decreased systematically across learning trials. These results reveal that late controlled, but not early (partly automatic), neural mechanisms of music-syntactic processing are modulated by repeated exposure to a musical piece. This article is part of a Special Issue entitled SI: Prediction and Attention. Copyright © 2015 Elsevier B.V. All rights reserved.

  16. Semi-supervised manifold learning with affinity regularization for Alzheimer's disease identification using positron emission tomography imaging.

    Science.gov (United States)

    Lu, Shen; Xia, Yong; Cai, Tom Weidong; Feng, David Dagan

    2015-01-01

    Dementia, and Alzheimer's disease (AD) in particular, is a global problem and a big threat to the aging population. An image-based computer-aided dementia diagnosis method is needed to help doctors during medical image examination. Many machine learning based dementia classification methods using medical imaging have been proposed, and most of them achieve accurate results. However, most of these methods make use of supervised learning, which requires a fully labeled image dataset and is usually not practical in a real clinical environment. Using a large amount of unlabeled images can improve the dementia classification performance. In this study we propose a new semi-supervised dementia classification method based on random manifold learning with affinity regularization. Three groups of spatial features are extracted from positron emission tomography (PET) images to construct an unsupervised random forest, which is then used to regularize the manifold learning objective function. The proposed method, the state-of-the-art Laplacian support vector machine (LapSVM) and a supervised SVM are applied to classify AD and normal controls (NC). The experimental results show that learning with unlabeled images indeed improves the classification performance, and that our method outperforms LapSVM on the same dataset.

  17. Effective and efficient optics inspection approach using machine learning algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Abdulla, G; Kegelmeyer, L; Liao, Z; Carr, W

    2010-11-02

    The Final Optics Damage Inspection (FODI) system automatically acquires and utilizes the Optics Inspection (OI) system to analyze images of the final optics at the National Ignition Facility (NIF). During each inspection cycle up to 1000 images acquired by FODI are examined by OI to identify and track damage sites on the optics. The process of tracking growing damage sites on the surface of an optic can be made more effective by identifying and removing signals associated with debris or reflections. The manual process to filter these false sites is daunting and time consuming. In this paper we discuss the use of machine learning tools and data mining techniques to help with this task. We describe the process to prepare a data set that can be used for training and identifying hardware reflections in the image data. In order to collect training data, the images are first automatically acquired and analyzed with existing software and then relevant features such as spatial, physical and luminosity measures are extracted for each site. A subset of these sites is 'truthed' or manually assigned a class to create training data. A supervised classification algorithm is used to test if the features can predict the class membership of new sites. A suite of self-configuring machine learning tools called 'Avatar Tools' is applied to classify all sites. To verify, we used 10-fold cross-validation and found the accuracy was above 99%. This substantially reduces the number of false alarms that would otherwise be sent for more extensive investigation.
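
    The 10-fold verification step mentioned above is read here as standard k-fold cross-validation, which is an assumption about the authors' procedure. A minimal sketch follows; the single "luminosity" feature, the class names and the threshold classifier are hypothetical stand-ins, not the Avatar Tools or the NIF data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Mean held-out accuracy over k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train = [i for i in range(len(features)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def fit_threshold(xs, ys):
    """Hypothetical classifier: midpoint between the two class means."""
    damage = [x for x, y in zip(xs, ys) if y == "damage"]
    reflection = [x for x, y in zip(xs, ys) if y == "reflection"]
    return (sum(damage) / len(damage) + sum(reflection) / len(reflection)) / 2


def predict_threshold(threshold, x):
    return "reflection" if x > threshold else "damage"


# Hypothetical, well-separated luminosity values for 100 'truthed' sites.
site_features = [float(v) for v in range(50)] + [float(v) for v in range(100, 150)]
site_labels = ["damage"] * 50 + ["reflection"] * 50
mean_accuracy = cross_validate(site_features, site_labels, fit_threshold, predict_threshold)
```

    Each site is held out exactly once, so the reported accuracy is always measured on sites the classifier did not see during fitting.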

  18. A Competency-Based Guided-Learning Algorithm Applied on Adaptively Guiding E-Learning

    Science.gov (United States)

    Hsu, Wei-Chih; Li, Cheng-Hsiu

    2015-01-01

    This paper presents a new algorithm called competency-based guided-learning algorithm (CBGLA), which can be applied on adaptively guiding e-learning. Computational process analysis and mathematical derivation of competency-based learning (CBL) were used to develop the CBGLA. The proposed algorithm could generate an effective adaptively guiding…

  19. Nonlinear system identification by Gustafson-Kessel fuzzy clustering and supervised local model network learning for the drug absorption spectra process.

    Science.gov (United States)

    Teslic, Luka; Hartmann, Benjamin; Nelles, Oliver; Skrjanc, Igor

    2011-12-01

    This paper deals with the problem of fuzzy nonlinear model identification in the framework of a local model network (LMN). A new iterative identification approach is proposed, where supervised and unsupervised learning are combined to optimize the structure of the LMN. For the purpose of fitting the cluster centers to the process nonlinearity, Gustafson-Kessel (GK) fuzzy clustering, i.e., unsupervised learning, is applied. In combination with the LMN learning procedure, a new incremental method to define the number and the initial locations of the cluster centers for the GK clustering algorithm is proposed. Each data cluster corresponds to a local region of the process and is modeled with a local linear model. Since the validity functions are calculated from the fuzzy covariance matrices of the clusters, they are highly adaptable and thus the process can be described with a very small number of local models, i.e., with a parsimonious LMN model. The proposed method for constructing the LMN is finally tested on a drug absorption spectral process and compared to two other methods, namely, Lolimot and Hilomot. The comparison between the experimental results when using each method shows the usefulness of the proposed identification algorithm.

  20. Manifold regularized multitask learning for semi-supervised multilabel image classification.

    Science.gov (United States)

    Luo, Yong; Tao, Dacheng; Geng, Bo; Xu, Chao; Maybank, Stephen J

    2013-02-01

    It is a significant challenge to classify images with multiple labels by using only a small number of labeled samples. One option is to learn a binary classifier for each label and use manifold regularization to improve the classification performance by exploring the underlying geometric structure of the data distribution. However, such an approach does not perform well in practice when images from multiple concepts are represented by high-dimensional visual features. Thus, manifold regularization is insufficient to control the model complexity. In this paper, we propose a manifold regularized multitask learning (MRMTL) algorithm. MRMTL learns a discriminative subspace shared by multiple classification tasks by exploiting the common structure of these tasks. It effectively controls the model complexity because different tasks limit one another's search volume, and the manifold regularization ensures that the functions in the shared hypothesis space are smooth along the data manifold. We conduct extensive experiments, on the PASCAL VOC'07 dataset with 20 classes and the MIR dataset with 38 classes, by comparing MRMTL with popular image classification algorithms. The results suggest that MRMTL is effective for image classification.

  1. A framework to facilitate self-directed learning, assessment and supervision in midwifery practice: a qualitative study of supervisors' perceptions.

    Science.gov (United States)

    Embo, M; Driessen, E; Valcke, M; van der Vleuten, C P M

    2014-08-01

    Self-directed learning is an educational concept that has received increasing attention. The recent workplace literature, however, reports problems with the facilitation of self-directed learning in clinical practice. We developed the Midwifery Assessment and Feedback Instrument (MAFI) as a framework to facilitate self-directed learning. In the present study, we sought clinical supervisors' perceptions of the usefulness of MAFI. Interviews with fifteen clinical supervisors were audio taped, transcribed verbatim and analysed thematically using Atlas-Ti software for qualitative data analysis. Four themes emerged from the analysis. (1) The competency-based educational structure promotes the setting of realistic learning outcomes and a focus on competency development, (2) instructing students to write reflections facilitates student-centred supervision, (3) creating a feedback culture is necessary to achieve continuity in supervision and (4) integrating feedback and assessment might facilitate competency development under the condition that evidence is discussed during assessment meetings. Supervisors stressed the need for direct observation, and instruction how to facilitate a self-directed learning process. The MAFI appears to be a useful framework to promote self-directed learning in clinical practice. The effect can be advanced by creating a feedback and assessment culture where learners and supervisors share the responsibility for developing self-directed learning. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Challenges in the Verification of Reinforcement Learning Algorithms

    Science.gov (United States)

    Van Wesel, Perry; Goodloe, Alwyn E.

    2017-01-01

    Machine learning (ML) is increasingly being applied to a wide array of domains from search engines to autonomous vehicles. These algorithms, however, are notoriously complex and hard to verify. This work looks at the assumptions underlying machine learning algorithms as well as some of the challenges in trying to verify ML algorithms. Furthermore, we focus on the specific challenges of verifying reinforcement learning algorithms. These are highlighted using a specific example. Ultimately, we do not offer a solution to the complex problem of ML verification, but point out possible approaches for verification and interesting research opportunities.

  3. An Adaptive Privacy Protection Method for Smart Home Environments Using Supervised Learning

    Directory of Open Access Journals (Sweden)

    Jingsha He

    2017-03-01

    Full Text Available In recent years, smart home technologies have started to be widely used, bringing a great deal of convenience to people’s daily lives. At the same time, privacy issues have become particularly prominent. Traditional encryption methods can no longer meet the needs of privacy protection in smart home applications, since attacks can be launched even without the need for access to the cipher. Rather, attacks can be successfully realized through analyzing the frequency of radio signals, as well as the timestamp series, so that the daily activities of the residents in the smart home can be learnt. Such types of attacks can achieve a very high success rate, making them a great threat to users’ privacy. In this paper, we propose an adaptive method based on sample data analysis and supervised learning (SDASL) to hide the patterns of daily routines of residents in a way that adapts to dynamically changing network loads. Compared to some existing solutions, our proposed method exhibits advantages such as low energy consumption, low latency, strong adaptability, and effective privacy protection.

  4. Distributed multisensory integration in a recurrent network model through supervised learning

    Science.gov (United States)

    Wang, He; Wong, K. Y. Michael

    Sensory integration between different modalities has been extensively studied. It is suggested that the brain integrates signals from different modalities in a Bayesian optimal way. However, how the Bayesian rule is implemented in a neural network remains under debate. In this work we propose a biologically plausible recurrent network model, which can perform Bayesian multisensory integration after being trained by supervised learning. Our model is composed of two modules, one for each modality. We assume that each module is a recurrent network whose activity represents the posterior distribution of each stimulus. The feedforward input to each module is the likelihood of each modality. The two modules are integrated through cross-links, which are feedforward connections from the other modality, and reciprocal connections, which are recurrent connections between different modules. By stochastic gradient descent, we successfully trained the feedforward and recurrent coupling matrices simultaneously, both of which resemble a Mexican hat. We also find that more than one set of coupling matrices can approximate Bayes' theorem well. Specifically, reciprocal connections and cross-links compensate for each other if one of them is removed. Even though the network is trained with two inputs, its performance with only one input is in good accordance with what is predicted by Bayes' theorem.
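
    For orientation, the Bayes-optimal benchmark against which such a network is judged can be sketched for the simplest textbook case: two independent Gaussian cues and a flat prior. The Gaussian assumption and the function name `integrate_cues` are illustrative choices and are not taken from the abstract.

```python
def integrate_cues(mu1, var1, mu2, var2):
    """Combine two independent Gaussian cues under a flat prior.

    Each cue is weighted by its reliability (inverse variance).
    Returns the posterior mean and posterior variance.
    """
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    w1 = 1.0 / var1  # reliability of cue 1
    w2 = 1.0 / var2  # reliability of cue 2
    post_var = 1.0 / (w1 + w2)
    post_mean = post_var * (w1 * mu1 + w2 * mu2)
    return post_mean, post_var


if __name__ == "__main__":
    # Two equally reliable cues: the estimate lies halfway between them
    # and is more certain than either cue alone.
    print(integrate_cues(0.0, 1.0, 2.0, 1.0))
```

    With a single cue available, the same rule reduces to that cue's own mean and variance, which is the single-input prediction the abstract refers to.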

  5. Restricted Boltzmann machines based oversampling and semi-supervised learning for false positive reduction in breast CAD.

    Science.gov (United States)

    Cao, Peng; Liu, Xiaoli; Bao, Hang; Yang, Jinzhu; Zhao, Dazhe

    2015-01-01

    False-positive reduction (FPR) is a crucial step in computer-aided detection (CAD) systems for the breast. Imbalanced data distributions and the limited number of labeled samples complicate the classification procedure. To overcome these challenges, we propose oversampling and semi-supervised learning methods based on restricted Boltzmann machines (RBMs) to solve the classification of imbalanced data with few labeled samples. To evaluate the proposed method, we conducted a comprehensive performance study and compared its results with commonly used techniques. Experiments on the DDSM benchmark dataset demonstrate the effectiveness of the RBM-based oversampling and semi-supervised learning method in terms of geometric mean (G-mean) for false positive reduction in breast CAD.
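
    The G-mean used as the evaluation metric is commonly defined as the geometric mean of sensitivity and specificity. A minimal sketch under that assumption follows; the function name `g_mean` and the toy labels are hypothetical and do not come from the study.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # A classifier that always predicts the majority class scores 0,
    # which is why G-mean suits imbalanced data better than accuracy.
    print(g_mean([1, 0, 0, 0], [0, 0, 0, 0]))
```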

  6. Supporting and Supervising Teachers Working With Adults Learning English. CAELA Network Brief

    Science.gov (United States)

    Young, Sarah

    2009-01-01

    This brief provides an overview of the knowledge and skills that administrators need in order to support and supervise teachers of adult English language learners. It begins with a review of resources and literature related to teacher supervision in general and to adult ESL education. It continues with information on the background and…

  7. Understanding Trust as an Essential Element of Trainee Supervision and Learning in the Workplace

    Science.gov (United States)

    Hauer, Karen E.; ten Cate, Olle; Boscardin, Christy; Irby, David M.; Iobst, William; O'Sullivan, Patricia S.

    2014-01-01

    Clinical supervision requires that supervisors make decisions about how much independence to allow their trainees for patient care tasks. The simultaneous goals of ensuring quality patient care and affording trainees appropriate and progressively greater responsibility require that the supervising physician trusts the trainee. Trust allows the…

  8. Enhancing the Doctoral Journey: The Role of Group Supervision in Supporting Collaborative Learning and Creativity

    Science.gov (United States)

    Fenge, Lee-Ann

    2012-01-01

    This article explores the role of group supervision within doctoral education, offering an exploration of the experience of group supervision processes through a small-scale study evaluating both student and staff experience across three cohorts of one professional doctorate programme. There has been very little research to date exploring…

  9. Is Direct Supervision in Clinical Education for Athletic Training Students Always Necessary to Enhance Student Learning?

    Science.gov (United States)

    Scriber, Kent; Trowbridge, Cindy

    2009-01-01

    Objective: To present an alternative model of supervision within clinical education experiences. Background: Several years ago direct supervision was defined more clearly in the accreditation standards for athletic training education programs (ATEPs). Currently, athletic training students may not gain any clinical experience without their clinical…

  10. Clinical group supervision in yoga therapy: model effects, and lessons learned.

    Science.gov (United States)

    Forbes, Bo; Volpe Horii, Cassandra; Earls, Bethany; Mashek, Stephanie; Akhtar, Fiona

    2012-01-01

    Clinical supervision is an integral component of therapist training and professional development because of its capacity for fostering knowledge, self-awareness, and clinical acumen. Individual supervision is part of many yoga therapy training programs and is referenced in the IAYT Standards as "mentoring." Group supervision is not typically used in the training of yoga therapists. We propose that group supervision effectively supports the growth and development of yoga therapists-in-training. We present a model of group supervision for yoga therapist trainees developed by the New England School of Integrative Yoga Therapeutics™ (The NESIYT Model) that includes the background, structure, format, and development of our inaugural 18-month supervision group. Pre-and post-supervision surveys and analyzed case notes, which captured key didactic and process themes, are discussed. Clinical issues, such as boundaries, performance anxiety, sense of self efficacy, the therapeutic alliance, transference and counter transference, pacing of yoga therapy sessions, evaluation of client progress, and adjunct therapist interaction are reviewed. The timing and sequence of didactic and process themes and benefits for yoga therapist trainees' professional development, are discussed. The NESIYT group supervision model is offered as an effective blueprint for yoga therapy training programs.

  11. Knowledge Work Supervision: Transforming School Systems into High Performing Learning Organizations.

    Science.gov (United States)

    Duffy, Francis M.

    1997-01-01

    This article describes a new supervision model conceived to help a school system redesign its anatomy (structures), physiology (flow of information and webs of relationships), and psychology (beliefs and values). The new paradigm (Knowledge Work Supervision) was constructed by reviewing the practices of several interrelated areas: sociotechnical…

  12. Location-Aware Mobile Learning of Spatial Algorithms

    Science.gov (United States)

    Karavirta, Ville

    2013-01-01

    Learning an algorithm--a systematic sequence of operations for solving a problem with given input--is often difficult for students due to the abstract nature of the algorithms and the data they process. To help students understand the behavior of algorithms, a subfield in computing education research has focused on algorithm…

  13. A new accelerating algorithm for multi-agent reinforcement learning

    Institute of Scientific and Technical Information of China (English)

    ZHANG Ru-bo; ZHONG Yu; GU Guo-chang

    2005-01-01

    In multi-agent systems, joint actions must be employed to achieve cooperation because the evaluation of an agent's behavior often depends on the other agents' behaviors. However, joint-action reinforcement learning algorithms suffer from slow convergence because of the enormous learning space produced by joint actions. In this article, a prediction-based reinforcement learning algorithm is presented for multi-agent cooperation tasks, which requires all agents to learn to predict the probabilities of the actions that other agents may execute. A multi-robot cooperation experiment is run to test the efficacy of the new algorithm, and the experimental results show that the new algorithm can achieve the cooperation policy much faster than the primitive reinforcement learning algorithm.

  14. Clonal Selection Algorithm Based Iterative Learning Control with Random Disturbance

    Directory of Open Access Journals (Sweden)

    Yuanyuan Ju

    2013-01-01

    Full Text Available The clonal selection algorithm is improved and proposed as a method for solving optimization problems in iterative learning control, and a clonal-selection-based optimal iterative learning control algorithm with random disturbance is proposed. In the algorithm, the size of the search space is decreased and the convergence speed is increased at the same time. In addition, a model-modifying device is used in the algorithm to cope with uncertainty in the plant model. Simulations show that the convergence speed is satisfactory for nonlinear plants regardless of whether or not the plant model is precise. The simulation tests verify that the controlled system with random disturbance can reach stability under the improved iterative learning control law but not under the traditional control law.

  15. Material classification and automatic content enrichment of images using supervised learning and knowledge bases

    Science.gov (United States)

    Mallepudi, Sri Abhishikth; Calix, Ricardo A.; Knapp, Gerald M.

    2011-02-01

    In recent years there has been a rapid increase in the size of video and image databases. Effective searching and retrieving of images from these databases is a significant current research area. In particular, there is a growing interest in query capabilities based on semantic image features such as objects, locations, and materials, known as content-based image retrieval. This study investigated mechanisms for identifying materials present in an image. These capabilities provide additional information impacting conditional probabilities about images (e.g. objects made of steel are more likely to be buildings). These capabilities are useful in Building Information Modeling (BIM) and in automatic enrichment of images. I2T methodologies are a way to enrich an image by generating text descriptions based on image analysis. In this work, a learning model is trained to detect certain materials in images. To train the model, an image dataset was constructed containing single material images of bricks, cloth, grass, sand, stones, and wood. For generalization purposes, an additional set of 50 images containing multiple materials (some not used in training) was constructed. Two different supervised learning classification models were investigated: a single multi-class SVM classifier, and multiple binary SVM classifiers (one per material). Image features included Gabor filter parameters for texture, and color histogram data for RGB components. All classification accuracy scores using the SVM-based method were above 85%. The second model helped in gathering more information from the images since it assigned multiple classes to the images. A framework for the I2T methodology is presented.

  16. A neuron model with trainable activation function (TAF) and its MFNN supervised learning

    Institute of Scientific and Technical Information of China (English)

    吴佑寿; 赵明生

    2001-01-01

    This paper addresses a new kind of neuron model that has a trainable activation function (TAF) in addition to the trainable weights of the conventional M-P model. The final neuron activation function can be derived from a primitive neuron activation function by training. A BP-like learning algorithm is presented for the MFNN constructed from neurons of the TAF model. Several simulation examples are given to show the network capacity and performance advantages of the new MFNN in comparison with the conventional sigmoid MFNN.

  17. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    CERN Document Server

    Möller, A; Leloup, C; Neveu, J; Palanque-Delabrouille, N; Rich, J; Carlberg, R; Lidman, C; Pritchet, C

    2016-01-01

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high-redshifts ($0.2learning classification. We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia sa...

  18. Image-Derived Input Function Derived from a Supervised Clustering Algorithm: Methodology and Validation in a Clinical Protocol Using [11C](R)-Rolipram

    OpenAIRE

    Chul Hyoung Lyoo; Paolo Zanotti-Fregonara; Zoghbi, Sami S.; Jeih-San Liow; Rong Xu; Pike, Victor W.; Zarate, Carlos A.; Masahiro Fujita; Innis, Robert B.

    2014-01-01

    Image-derived input function (IDIF) obtained by manually drawing carotid arteries (manual-IDIF) can be reliably used in [(11)C](R)-rolipram positron emission tomography (PET) scans. However, manual-IDIF is time consuming and subject to inter- and intra-operator variability. To overcome this limitation, we developed a fully automated technique for deriving IDIF with a supervised clustering algorithm (SVCA). To validate this technique, 25 healthy controls and 26 patients with moderate to severe...

  19. Attend in groups: a weakly-supervised deep learning framework for learning from web data

    OpenAIRE

    Zhuang, Bohan; Liu, Lingqiao; Li, Yao; Shen, Chunhua; Reid, Ian

    2016-01-01

    Large-scale datasets have driven the rapid development of deep neural networks for visual recognition. However, annotating a massive dataset is expensive and time-consuming. Web images and their labels are, in comparison, much easier to obtain, but direct training on such automatically harvested images can lead to unsatisfactory performance, because the noisy labels of Web images adversely affect the learned recognition models. To address this drawback we propose an end-to-end weakly-supervis...

  20. Evaluation of Four Supervised Learning Methods for Benthic Habitat Mapping Using Backscatter from Multi-Beam Sonar

    Directory of Open Access Journals (Sweden)

    Jacquomo Monk

    2012-11-01

    Full Text Available An understanding of the distribution and extent of marine habitats is essential for the implementation of ecosystem-based management strategies. Historically this had been difficult in marine environments until the advancement of acoustic sensors. This study demonstrates the applicability of supervised learning techniques for benthic habitat characterization using angular backscatter response data. With the advancement of multibeam echo-sounder (MBES) technology, full coverage datasets of physical structure over vast regions of the seafloor are now achievable. Supervised learning methods typically applied to terrestrial remote sensing provide a cost-effective approach for habitat characterization in marine systems. However, the comparison of the relative performance of different classifiers using acoustic data is limited. Characterization of acoustic backscatter data from MBES using four different supervised learning methods to generate benthic habitat maps is presented. Maximum Likelihood Classifier (MLC), Quick, Unbiased, Efficient Statistical Tree (QUEST), Random Forest (RF) and Support Vector Machine (SVM) were evaluated to classify angular backscatter response into habitat classes using training data acquired from underwater video observations. Results for biota classifications indicated that SVM and RF produced the highest accuracies, followed by QUEST and MLC, respectively. The most important backscatter data were from the moderate incidence angles between 30° and 50°. This study presents initial results for understanding how acoustic backscatter from MBES can be optimized for the characterization of marine benthic biological habitats.

  1. Hypothetical Pattern Recognition Design Using Multi-Layer Perceptorn Neural Network For Supervised Learning

    Directory of Open Access Journals (Sweden)

    Md. Abdullah-al-mamun

    2015-08-01

    Full Text Available Humans can effortlessly identify diverse shapes and patterns in the real world because their intelligence has grown since birth through many learning processes. In the same way, we can prepare a machine with a human-like brain, called an artificial neural network, that can recognize different patterns in real-world objects. Although various techniques exist for implementing pattern recognition, artificial neural network approaches have recently received significant attention, because an artificial neural network, like a human brain, learns from different observations and makes decisions based on previously learned rules. After more than 50 years of research, pattern recognition for machine learning using artificial neural networks has achieved significant results, and many real-world problems can now be solved by modeling the pattern recognition process. The objective of this paper is to present the theoretical concept of pattern recognition design using a Multi-Layer Perceptron neural network, as an artificial intelligence algorithm, as the best possible way of utilizing available resources to make decisions with human-like performance.

  2. Unsupervised Labeling Of Data For Supervised Learning And Its Application To Medical Claims Prediction

    Directory of Open Access Journals (Sweden)

    Che Ngufor

    2013-01-01

    Full Text Available The task of identifying changes and irregularities in medical insurance claim payments is a difficult process, for which the traditional practice involves querying historical claims databases and flagging potential claims as normal or abnormal. Because what is considered a normal payment is usually unknown and may change over time, abnormal payments often pass undetected, only to be discovered when the payment period has passed. This paper presents the problem of on-line unsupervised learning from data streams when the distribution that generates the data changes or drifts over time. Automated algorithms for detecting drifting concepts in a probability distribution of the data are presented. The idea behind the presented drift detection methods is to transform the distribution of the data within a sliding window into a more convenient distribution. Then, a test statistic's p-value at a given significance level can be used to infer the drift rate, adjust the window size and decide on the status of the drift. The detected concept drifts are used to label the data, for subsequent learning of classification models by a supervised learner. The algorithms were tested on several synthetic and real medical claims data sets.
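
    The general idea of window-based drift labeling can be sketched as follows. This is not the authors' method: the abstract does not specify the transformation or the test, so the sketch substitutes a plain two-sample z-test on window means with a fixed window size. The names `drift_p_value` and `label_drift` and the defaults `window=20` and `alpha=0.01` are hypothetical.

```python
import math
from statistics import mean, variance


def drift_p_value(reference, recent):
    """Two-sided p-value for a difference in means (normal approximation)."""
    n1, n2 = len(reference), len(recent)
    if n1 < 2 or n2 < 2:
        raise ValueError("each window needs at least two observations")
    se = math.sqrt(variance(reference) / n1 + variance(recent) / n2)
    if se == 0.0:
        # Both windows are constant: drift only if the constants differ.
        return 1.0 if mean(reference) == mean(recent) else 0.0
    z = (mean(recent) - mean(reference)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def label_drift(stream, window=20, alpha=0.01):
    """Return one label per window after the first: 1 means drift flagged."""
    labels = []
    reference = stream[:window]
    for start in range(window, len(stream) - window + 1, window):
        recent = stream[start:start + window]
        drifted = drift_p_value(reference, recent) < alpha
        labels.append(1 if drifted else 0)
        if drifted:
            reference = recent  # adopt the new concept as the reference
    return labels


if __name__ == "__main__":
    # Synthetic stream whose level jumps halfway through.
    stream = [0.0, 1.0] * 20 + [10.0, 11.0] * 20
    print(label_drift(stream))
```

    The resulting 0/1 labels play the role of the automatically generated training labels that a downstream supervised learner would consume.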

  3. Supervised machine learning on a network scale: application to seismic event classification and detection

    Science.gov (United States)

    Reynen, Andrew; Audet, Pascal

    2017-09-01

    A new method using a machine learning technique is applied to event classification and detection at seismic networks. This method is applicable to a variety of network sizes and settings. The algorithm makes use of a small catalogue of known observations across the entire network. Two attributes, the polarization and frequency content, are used as input to regression. These attributes are extracted at predicted arrival times for P and S waves using only an approximate velocity model, as attributes are calculated over large time spans. This method of waveform characterization is shown to be able to distinguish between blasts and earthquakes with 99 per cent accuracy using a network of 13 stations located in Southern California. The combination of machine learning with generalized waveform features is further applied to event detection in Oklahoma, United States. The event detection algorithm makes use of a pair of unique seismic phases to locate events, with a precision directly related to the sampling rate of the generalized waveform features. Over a week of data from 30 stations in Oklahoma, United States are used to automatically detect 25 times more events than the catalogue of the local geological survey, with a false detection rate of less than 2 per cent. This method provides a highly confident way of detecting and locating events. Furthermore, a large number of seismic events can be automatically detected with low false alarm, allowing for a larger automatic event catalogue with a high degree of trust.

  4. Collegial supervision

    DEFF Research Database (Denmark)

    Andersen, Ole Dibbern; Petersson, Erling

    This publication examines how collegial supervision can be organised in an educational institution...

  5. Human resource recommendation algorithm based on ensemble learning and Spark

    Science.gov (United States)

    Cong, Zihan; Zhang, Xingming; Wang, Haoxiang; Xu, Hongjie

    2017-08-01

    Aiming at the problem of “information overload” in the human resources industry, this paper proposes a human resource recommendation algorithm based on ensemble learning. The algorithm considers the characteristics and behaviours of job seekers as well as the features of jobs in real business circumstances. First, the algorithm uses two ensemble learning methods, Bagging and Boosting. The outputs of both learning methods are then merged to form a user interest model, from which job recommendations can be derived for users. The algorithm is implemented as a parallelized recommendation system on Spark. A set of experiments has been carried out and analysed. The proposed algorithm achieves significant improvement in accuracy, recall rate and coverage compared with recommendation algorithms such as UserCF and ItemCF.

  6. An efficient learning algorithm for associative memories.

    Science.gov (United States)

    Wu, Y; Batalama, S N

    2000-01-01

    Associative memories (AMs) can be implemented using networks with or without feedback. In this paper we utilize a two-layer feedforward neural network and propose a new learning algorithm that efficiently implements the association rule of a bipolar AM. The hidden layer of the network employs p neurons where p is the number of prototype patterns. In the first layer, the input pattern activates at most one hidden layer neuron or "winner." In the second layer, the "winner" associates the input pattern to the corresponding prototype pattern. The underlying association principle is minimum Hamming distance and the proposed scheme can be viewed also as an approximately minimum Hamming distance decoder. Theoretical analysis supported by simulations indicates that, in comparison with other suboptimum minimum Hamming distance association schemes, the proposed structure exhibits the following favorable characteristics: 1) It operates in one-shot which implies no convergence-time requirements; 2) it does not require any feedback; and 3) our case studies show that it outperforms the popular linear system in a saturated mode (LSSM). The network also exhibits 4) exponential capacity and 5) easy performance assessment (no asymptotic analysis is necessary). Finally, since it does not require any hidden layer interconnections or tree-search operations, it exhibits low structural as well as operational complexity.
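
    The association principle named above, minimum Hamming distance, can be illustrated with an exact nearest-prototype lookup over bipolar (+1/-1) patterns. This is only a reference sketch of the principle, not the two-layer network or learning algorithm proposed in the paper; `hamming_distance`, `associate` and the toy prototypes are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def associate(pattern, prototypes):
    """Return the stored prototype closest to `pattern` in Hamming distance.

    Ties are resolved in favour of the prototype stored first.
    """
    if not prototypes:
        raise ValueError("at least one prototype is required")
    return min(prototypes, key=lambda p: hamming_distance(pattern, p))


if __name__ == "__main__":
    prototypes = [(1, 1, 1, 1), (-1, -1, -1, -1), (1, -1, 1, -1)]
    noisy = (1, 1, -1, 1)  # first prototype with one flipped component
    print(associate(noisy, prototypes))
```

    The lookup is one-shot and needs no feedback, which mirrors characteristics 1) and 2) in the abstract, although its cost grows linearly with the number of stored prototypes.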

  7. SU-E-J-107: Supervised Learning Model of Aligned Collagen for Human Breast Carcinoma Prognosis

    Energy Technology Data Exchange (ETDEWEB)

    Bredfeldt, J; Liu, Y; Conklin, M; Keely, P; Eliceiri, K; Mackie, T [University of Wisconsin, Madison, WI (United States)

    2014-06-01

    Purpose: Our goal is to develop and apply a set of optical and computational tools to enable large-scale investigations of the interaction between collagen and tumor cells. Methods: We have built a novel imaging system for automating the capture of whole-slide second harmonic generation (SHG) images of collagen in registry with bright field (BF) images of hematoxylin and eosin stained tissue. To analyze our images, we have integrated a suite of supervised learning tools that semi-automatically model and score collagen interactions with tumor cells via a variety of metrics, a method we call Electronic Tumor Associated Collagen Signatures (eTACS). This group of tools first segments regions of epithelial cells and collagen fibers from BF and SHG images respectively. We then associate fibers with groups of epithelial cells and finally compute features based on the angle of interaction and density of the collagen surrounding the epithelial cell clusters. These features are then processed with a support vector machine to separate cancer patients into high and low risk groups. Results: We validated our model by showing that eTACS produces classifications that have statistically significant correlation with manual classifications. In addition, our system generated classification scores that accurately predicted breast cancer patient survival in a cohort of 196 patients. Feature rank analysis revealed that TACS positive fibers are more well aligned with each other, generally lower density, and terminate within or near groups of epithelial cells. Conclusion: We are working to apply our model to predict survival in larger cohorts of breast cancer patients with a diversity of breast cancer types, predict response to treatments such as COX2 inhibitors, and to study collagen architecture changes in other cancer types. In the future, our system may be used to provide metastatic potential information to cancer patients to augment existing clinical assays.

  8. Semi-supervised sparse coding

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-07-06

    Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new representations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a semi-supervised manner, where only a few training samples are labeled. By using the manifold structure spanned by the data set of both labeled and unlabeled samples and the constraints provided by the labels of the labeled samples, we learn the variable class labels for all the samples. Furthermore, to improve the discriminative ability of the learned sparse codes, we assume that the class labels could be predicted from the sparse codes directly using a linear classifier. By solving the codebook, sparse codes, class labels and classifier parameters simultaneously in a unified objective function, we develop a semi-supervised sparse coding algorithm. Experiments on two real-world pattern recognition problems demonstrate the advantage of the proposed methods over supervised sparse coding methods on partially labeled data sets.

  9. Linkage intensity learning approach with genetic algorithm for causality diagram

    Institute of Scientific and Technical Information of China (English)

    WANG Cheng-liang; CHEN Juan-juan

    2007-01-01

    The causality diagram theory, which adopts a graphical expression of knowledge and direct intensity of causality, overcomes some shortcomings of belief networks and has evolved into a mixed causality diagram methodology for discrete and continuous variables. However, specifying the linkage intensities of a causality diagram is difficult, particularly in the many working conditions in which sampling data are limited or noisy, and the classic learning algorithm is hard to apply. We used a genetic algorithm to learn linkage intensities from limited data. The simulation results demonstrate that this algorithm is more suitable than the classic algorithm when samples are scarce, as in space shuttle fault diagnosis.

  10. A Sparse Bayesian Learning Algorithm for Longitudinal Image Data.

    Science.gov (United States)

    Sabuncu, Mert R

    2015-10-01

    Longitudinal imaging studies, where serial (multiple) scans are collected on each individual, are becoming increasingly widespread. The field of machine learning has in general neglected the longitudinal design, since many algorithms are built on the assumption that each datapoint is an independent sample. Thus, the application of general purpose machine learning tools to longitudinal image data can be sub-optimal. Here, we present a novel machine learning algorithm designed to handle longitudinal image datasets. Our approach builds on a sparse Bayesian image-based prediction algorithm. Our empirical results demonstrate that the proposed method can offer a significant boost in prediction performance with longitudinal clinical data.

  11. Learning a Markov Logic network for supervised gene regulatory network inference.

    Science.gov (United States)

    Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence

    2013-09-12

    Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, e.g. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a

  12. Investigating the control of climatic oscillations over global terrestrial evaporation using a simple supervised learning method

    Science.gov (United States)

    Martens, Brecht; Miralles, Diego; Waegeman, Willem; Dorigo, Wouter; Verhoest, Niko

    2017-04-01

    Intra-annual and multi-decadal variations in the Earth's climate are to a large extent driven by periodic oscillations in the coupled state of atmosphere and ocean. These oscillations alter not only the climate in nearby regions, but also have an important impact on the local climate in remote areas, a phenomenon that is often referred to as 'teleconnection'. Because changes in local climate immediately impact terrestrial ecosystems through a series of complex processes and feedbacks, ocean-atmospheric teleconnections are expected to influence land evaporation - i.e. the return flux of water from land to atmosphere. In this presentation, the effects of these intra-annual and multi-decadal climate oscillations on global terrestrial evaporation are analysed. To this end, we use satellite observations of different essential climate variables in combination with a simple supervised learning method, the lasso regression. A total of sixteen Climate Oscillation Indices (COIs) - which are routinely used to diagnose the major ocean-atmospheric oscillations - are selected. Multi-decadal data of terrestrial evaporation are retrieved from the Global Land Evaporation Amsterdam Model (GLEAM, www.gleam.eu). Using the lasso regression, it is shown that more than 30% of the inter-annual variations in terrestrial evaporation can be explained by ocean-atmospheric oscillations. In addition, the impact in different regions across the globe can typically be attributed to a small subset of the sixteen COIs. For instance, the dynamics in terrestrial evaporation over Australia are substantially impacted by both the El Niño Southern Oscillation (here diagnosed using the Southern Oscillation Index, SOI) and the Indian Ocean Dipole Oscillation (here diagnosed using the Indian Dipole Mode Index, DMI). 
    Subsequently, using the same learning method but regressing terrestrial evaporation on its local climatic drivers (air temperature, precipitation, radiation) allows us to discern through which
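
    As a rough illustration of how lasso regression can single out a small subset of predictors, the sketch below fits a lasso by cyclic coordinate descent on synthetic data. It is not the authors' pipeline: the data, the two "active" index columns, the penalty alpha and the function names are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z towards zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise (1/(2n))*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with the contribution of feature j added back in.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r / n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    return w

# Synthetic stand-in (assumption): 240 monthly samples of 16 oscillation indices,
# of which only columns 2 and 7 actually drive the target series.
rng = np.random.default_rng(0)
n_months, n_indices = 240, 16
X = rng.standard_normal((n_months, n_indices))
true_w = np.zeros(n_indices)
true_w[[2, 7]] = [0.8, -0.5]
y = X @ true_w + 0.1 * rng.standard_normal(n_months)

# Standardise predictors and centre the target before fitting.
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = y - y.mean()

w = lasso_cd(X, y, alpha=0.1)
selected = np.flatnonzero(w)  # indices whose coefficients survive the L1 penalty
```

    The L1 penalty sets most coefficients exactly to zero, which is what makes it possible to attribute a regional signal to a few of the sixteen indices.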

  13. Risk-sensitive reinforcement learning algorithms with generalized average criterion

    Institute of Scientific and Technical Information of China (English)

    YIN Chang-ming; WANG Han-xing; ZHAO Fei

    2007-01-01

    A new algorithm is proposed that potentially sacrifices the optimality of control policies in order to obtain robust solutions. Robustness may become a very important property for a learning system when the theoretical model does not match the practical physical system, when the practical system is not static, or when the availability of a control action changes over time. The main contribution is a set of approximation algorithms together with their convergence results. A generalized average operator, in place of the usual optimal operator max (or min), is applied to study an important class of learning algorithms, dynamic programming algorithms, and their convergence is discussed from a theoretical point of view. The purpose of this research is to improve the robustness of reinforcement learning algorithms theoretically.

  14. Multi-agent reinforcement learning using modular neural network Q-learning algorithms

    Institute of Scientific and Technical Information of China (English)

    YANG Yin-xian; FANG Kai

    2005-01-01

    Reinforcement learning is an excellent approach that is used in artificial intelligence, automatic control, etc. However, ordinary reinforcement learning algorithms, such as Q-learning with a lookup table, cannot cope with extremely complex and dynamic environments because of the huge state space. To reduce the state space, a modular neural network Q-learning algorithm is proposed, which combines the Q-learning algorithm with neural networks and the module method. A feed-forward neural network, an Elman neural network and a radial basis function neural network are separately employed to construct such an algorithm. It is revealed that the Elman neural network Q-learning algorithm has the best performance when the same neural network training method, i.e. the gradient descent error back-propagation algorithm, is applied.
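
    For reference, the lookup-table Q-learning that this abstract takes as its baseline can be sketched as below. This is a minimal sketch, not the modular neural network variant the paper proposes; the five-state chain environment, the uniformly random behaviour policy and all hyperparameters are assumptions chosen to keep the example small.

```python
import numpy as np

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on a toy chain: action 1 moves right, action 0 moves left.

    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    terminal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != terminal:
            # Q-learning is off-policy, so a uniformly random behaviour
            # policy is enough to learn the greedy values.
            a = int(rng.integers(n_actions))
            s_next = min(s + 1, terminal) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == terminal else 0.0
            # The terminal row of Q is never updated and stays at zero,
            # so the bootstrap term vanishes at the end of an episode.
            target = r + gamma * Q[s_next].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q

Q = q_learning()
greedy_policy = np.argmax(Q[:-1], axis=1)  # best action in each non-terminal state
```

    The table holds one entry per state-action pair, which is why this form becomes impractical as the state space grows and motivates replacing the table with a function approximator.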

  15. Two Novel On-policy Reinforcement Learning Algorithms based on TD(lambda)-methods

    NARCIS (Netherlands)

    Wiering, M.A.; Hasselt, H. van

    2007-01-01

    This paper describes two novel on-policy reinforcement learning algorithms, named QV(lambda)-learning and the actor critic learning automaton (ACLA). Both algorithms learn a state value-function using TD(lambda)-methods. The difference between the algorithms is that QV-learning uses the learned

  16. Teaching learning based optimization algorithm and its engineering applications

    CERN Document Server

    Rao, R Venkata

    2016-01-01

    Describing a new optimization algorithm, the “Teaching-Learning-Based Optimization (TLBO),” in a clear and lucid style, this book maximizes reader insights into how the TLBO algorithm can be used to solve continuous and discrete optimization problems involving single or multiple objectives. As the algorithm operates on the principle of teaching and learning, where teachers influence the quality of learners’ results, the elitist version of TLBO algorithm (ETLBO) is described along with applications of the TLBO algorithm in the fields of electrical engineering, mechanical design, thermal engineering, manufacturing engineering, civil engineering, structural engineering, computer engineering, electronics engineering, physics and biotechnology. The book offers a valuable resource for scientists, engineers and practitioners involved in the development and usage of advanced optimization algorithms.

  17. Information theoretic derivation of network architecture and learning algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Jones, R.D.; Barnes, C.W.; Lee, Y.C.; Mead, W.C.

    1991-01-01

    Using variational techniques, we derive a feedforward network architecture that minimizes a least squares cost function with the soft constraint that the mutual information between input and output be maximized. This permits optimum generalization for a given accuracy. A set of learning algorithms is also obtained. The network and learning algorithms are tested on a set of test problems which emphasize time series prediction. 6 refs., 1 fig.

  18. On stochastic approximation algorithms for classes of PAC learning problems

    Energy Technology Data Exchange (ETDEWEB)

    Rao, N.S.V.; Uppuluri, V.R.R.; Oblow, E.M.

    1994-03-01

    The classical stochastic approximation methods are shown to yield algorithms to solve several formulations of the PAC learning problem defined on the domain [0,1]^d. Under some assumptions on the differentiability of the probability measure functions, simple algorithms to solve some PAC learning problems are proposed based on networks of non-polynomial units (e.g. artificial neural networks). Conditions on the sizes of the samples required to ensure the error bounds are derived using martingale inequalities.

  19. Imbalanced learning foundations, algorithms, and applications

    CERN Document Server

    He, Haibo

    2013-01-01

    The first book of its kind to review the current status and future direction of the exciting new branch of machine learning/data mining called imbalanced learning. Imbalanced learning focuses on how an intelligent system can learn when it is provided with imbalanced data. Solving imbalanced learning problems is critical in numerous data-intensive networked systems, including surveillance, security, Internet, finance, biomedical, defense, and more. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles,

  1. Online learning algorithm for ensemble of decision rules

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    We describe an online learning algorithm that builds a system of decision rules for a classification problem. Rules are constructed according to the minimum description length principle by a greedy algorithm or using the dynamic programming approach. © 2011 Springer-Verlag.

  2. SelfieBoost: A Boosting Algorithm for Deep Learning

    OpenAIRE

    Shalev-Shwartz, Shai

    2014-01-01

    We describe and analyze a new boosting algorithm for deep learning called SelfieBoost. Unlike other boosting algorithms, like AdaBoost, which construct ensembles of classifiers, SelfieBoost boosts the accuracy of a single network. We prove a $\log(1/\epsilon)$ convergence rate for SelfieBoost under some "SGD success" assumption which seems to hold in practice.

  3. Interactive prostate segmentation using atlas-guided semi-supervised learning and adaptive feature selection

    Energy Technology Data Exchange (ETDEWEB)

    Park, Sang Hyun [Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599 (United States); Gao, Yaozong, E-mail: yzgao@cs.unc.edu [Department of Computer Science, Department of Radiology, and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599 (United States); Shi, Yinghuan, E-mail: syh@nju.edu.cn [State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023 (China); Shen, Dinggang, E-mail: dgshen@med.unc.edu [Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599 and Department of Brain and Cognitive Engineering, Korea University, Seoul 136-713 (Korea, Republic of)

    2014-11-01

    Purpose: Accurate prostate segmentation is necessary for maximizing the effectiveness of radiation therapy of prostate cancer. However, manual segmentation from 3D CT images is very time-consuming and often causes large intra- and interobserver variations across clinicians. Many segmentation methods have been proposed to automate this labor-intensive process, but tedious manual editing is still required because of their limited performance. In this paper, the authors propose a new interactive segmentation method that can (1) flexibly generate the editing result with a few scribbles or dots provided by a clinician, (2) quickly deliver intermediate results to the clinician, and (3) sequentially correct the segmentations from any type of automatic or interactive segmentation method. Methods: The authors formulate the editing problem as a semisupervised learning problem which can utilize a priori knowledge of training data and also the valuable information from user interactions. Specifically, from a region of interest near the given user interactions, the appropriate training labels, which are well matched with the user interactions, can be locally searched from a training set. With voting from the selected training labels, both confident prostate and background voxels, as well as unconfident voxels, can be estimated. To reflect the informative relationships between voxels, location-adaptive features are selected from the confident voxels using a regression forest and the Fisher separation criterion. Then, the manifold configuration computed in the derived feature space is enforced into the semisupervised learning algorithm. The labels of unconfident voxels are then predicted by the regularized semisupervised learning algorithm. Results: The proposed interactive segmentation method was applied to correct automatic segmentation results of 30 challenging CT images. The correction was conducted three times with different user interactions performed at different time periods, in order to

  4. Extreme learning machines 2013 algorithms and applications

    CERN Document Server

    Toh, Kar-Ann; Romay, Manuel; Mao, Kezhi

    2014-01-01

    In recent years, ELM has emerged as a revolutionary technique of computational intelligence and has attracted considerable attention. An extreme learning machine (ELM) is a learning system resembling a single-layer feed-forward neural network, whose connections from the input layer to the hidden layer are randomly generated, while the connections from the hidden layer to the output layer are learned through linear learning methods. The outstanding merits of the ELM are its fast learning speed, minimal need for human intervention and high scalability. This book contains selected papers from the International Conference on Extreme Learning Machine 2013, which was held in Beijing, China, October 15-17, 2013. The conference aims to bring together researchers and practitioners of extreme learning machines from a variety of fields, including artificial intelligence, biomedical engineering and bioinformatics, system modelling and control, and signal and image processing, to promote research and discu...
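
    The description above (random input-to-hidden connections, linearly learned hidden-to-output connections) can be sketched in a few lines. This is a minimal sketch under assumptions: the tanh activation, the 50 hidden units, the least-squares solver and the sine-fitting toy task are illustrative choices, not details taken from the book.

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, seed=0):
    """Fit an ELM: draw random hidden weights once, then solve for output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # input -> hidden, never trained
    b = rng.standard_normal(n_hidden)                # hidden biases, never trained
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    # Hidden -> output weights come from an ordinary linear least-squares fit.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer and the learned linear readout."""
    return np.tanh(X @ W + b) @ beta

# Toy regression task (assumption): fit sin(x) on [-3, 3].
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel()

W, b, beta = elm_fit(X, y)
mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

    Because only the linear readout is fitted, training reduces to a single least-squares solve, which is where the fast learning speed mentioned above comes from.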

  5. Study on Web page content classification technology based on semi-supervised learning

    Institute of Scientific and Technical Information of China (English)

    赵夫群

    2016-01-01

    To address the key issue of how to use labeled and unlabeled data for Web classification, a classifier that combines a generative model with a discriminative model is explored. Maximum likelihood estimation is applied to the unlabeled training set to construct a semi-supervised classifier with high classification performance. A Dirichlet-multinomial mixture distribution is used to model the text, and a hybrid model suitable for semi-supervised learning is proposed. Since the EM algorithm for semi-supervised learning converges too quickly and easily falls into a local optimum, two intelligent optimization methods, the simulated annealing algorithm and the genetic algorithm, are introduced to analyze and handle the problem. A new intelligent semi-supervised classification algorithm is formed by combining the two algorithms, and its feasibility is verified.

  6. Wrapped Progressive Sampling Search for Optimizing Learning Algorithm Parameters

    NARCIS (Netherlands)

    Bosch, Antal van den

    2005-01-01

    We present a heuristic meta-learning search method for finding a set of optimized algorithmic parameters for a range of machine learning algorithms. The method, wrapped progressive sampling, is a combination of classifier wrapping and progressive sampling of training data. A series of experiments

  8. Transfer Learning via Semi-supervised Boosting Method

    Institute of Scientific and Technical Information of China (English)

    洪佳明; 陈炳超; 印鉴

    2011-01-01

    Transfer learning aims at reusing existing instances from other related domains to help learn models for the target domain. Existing instance-transfer learning algorithms can easily suffer from overfitting. To address this problem, we propose to incorporate additional unlabeled instances from the target domain, so that more domain knowledge can be brought into the training process. Specifically, under the generalized framework of boosting methods, we show that a semi-supervised boosting method can be applied to help re-weight the source domain instances, making the final classifiers less sensitive to the small amount of labeled instances in the target domain. Extensive experiments on text data sets confirm the effectiveness of the new algorithm.

  9. On Training Targets for Supervised Speech Separation

    Science.gov (United States)

    Wang, Yuxuan; Narayanan, Arun; Wang, DeLiang

    2014-01-01

    Formulation of speech separation as a supervised learning problem has shown considerable promise. In its simplest form, a supervised learning algorithm, typically a deep neural network, is trained to learn a mapping from noisy features to a time-frequency representation of the target of interest. Traditionally, the ideal binary mask (IBM) is used as the target because of its simplicity and large speech intelligibility gains. The supervised learning framework, however, is not restricted to the use of binary targets. In this study, we evaluate and compare separation results by using different training targets, including the IBM, the target binary mask, the ideal ratio mask (IRM), the short-time Fourier transform spectral magnitude and its corresponding mask (FFT-MASK), and the Gammatone frequency power spectrum. Our results in various test conditions reveal that the two ratio mask targets, the IRM and the FFT-MASK, outperform the other targets in terms of objective intelligibility and quality metrics. In addition, we find that masking based targets, in general, are significantly better than spectral envelope based targets. We also present comparisons with recent methods in non-negative matrix factorization and speech enhancement, which show clear performance advantages of supervised speech separation. PMID:25599083
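
    Two of the training targets named above, the ideal binary mask (IBM) and the ideal ratio mask (IRM), are commonly defined per time-frequency unit from the speech and noise energies. The sketch below follows those common definitions; the 0 dB local criterion, the exponent 0.5, the small constant eps and the 2x2 toy energies are assumptions, not values confirmed by this abstract.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero (assumption)

def ideal_binary_mask(speech_pow, noise_pow, lc_db=0.0):
    """1 where the local SNR in dB exceeds the local criterion lc_db, else 0."""
    snr_db = 10.0 * np.log10((speech_pow + EPS) / (noise_pow + EPS))
    return (snr_db > lc_db).astype(float)

def ideal_ratio_mask(speech_pow, noise_pow, beta=0.5):
    """Soft mask in [0, 1]: share of speech energy in each unit, raised to beta."""
    return (speech_pow / (speech_pow + noise_pow + EPS)) ** beta

# Toy time-frequency energies (rows: frequency channels, columns: frames).
speech_pow = np.array([[4.0, 0.1],
                       [1.0, 9.0]])
noise_pow = np.array([[1.0, 1.0],
                      [1.0, 1.0]])

ibm = ideal_binary_mask(speech_pow, noise_pow)
irm = ideal_ratio_mask(speech_pow, noise_pow)
```

    The binary mask makes a hard keep-or-discard decision per unit, while the ratio mask keeps a graded share of each unit, which is the distinction the comparison in the abstract turns on.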

  10. Conducting Supervised Experiential Learning/Field Experiences for Students' Development and Career Reinforcement.

    Science.gov (United States)

    Leventhal, Jerome I.

    A major problem in the educational system of the United States is that a great number of students and graduates lack a career objective, and, therefore, many workers are unhappy. Offering a variety of supervised field experiences, paid or unpaid, in which students see workers in their occupations will help students identify career choices.…

  11. Don't Leave Teaching to Chance: Learning Objectives for Psychodynamic Psychotherapy Supervision

    Science.gov (United States)

    Rojas, Alicia; Arbuckle, Melissa; Cabaniss, Deborah

    2010-01-01

    Objective: The way in which the competencies for psychodynamic psychotherapy specified by the Psychiatry Residency Review Committee of the Accreditation Council for Graduate Medical Education translate into the day-to-day work of individual supervision remains unstudied and unspecified. The authors hypothesized that despite the existence of…

  12. Pre-trained Convolutional Networks and generative statistical models: a study in semi-supervised learning

    OpenAIRE

    John Michael Salgado Cebola

    2016-01-01

    A comparative study of the performance of Convolutional Networks using pretrained models and statistical generative models on image classification tasks in semi-supervised environments. Study of multiple ensembles using these techniques and data generated from estimated pdfs. Pretrained ConvNets, LDA, pLSA, Fisher Vectors, Sparse-coded SPMs and TSVMs are the key models worked upon.

  13. Fieldwork online: a GIS-based electronic learning environment for supervising fieldwork

    NARCIS (Netherlands)

    Alberti, K.; Marra, W.A.; Baarsma, R.J.; Karssenberg, D.J.

    2016-01-01

    Fieldwork comes in many forms: individual research projects in unique places, large groups of students on organized fieldtrips, and everything in between those extremes. Supervising students in often distant places can be a logistical challenge and requires a significant time investment of their

  14. Enabling Connections in Postgraduate Supervision for an Applied eLearning Professional Development Programme

    Science.gov (United States)

    Donnelly, Roisin

    2013-01-01

    This article describes the practice of postgraduate supervision on a blended professional development programme for academics, and discusses how connectivism has been a useful lens to explore a complex form of instruction. By examining the processes by which supervisors and their students on a two-year part-time masters in Applied eLearning…

  15. Learning motor skills from algorithms to robot experiments

    CERN Document Server

    Kober, Jens

    2014-01-01

    This book presents the state of the art in reinforcement learning applied to robotics both in terms of novel algorithms and applications. It discusses recent approaches that allow robots to learn motor skills and presents tasks that need to take into account the dynamic behavior of the robot and its environment, where a kinematic movement plan is not sufficient. The book illustrates a method that learns to generalize parameterized motor plans, which are obtained by imitation or reinforcement learning, by adapting a small set of global parameters, and appropriate kernel-based reinforcement learning algorithms. The presented applications explore highly dynamic tasks and exhibit a very efficient learning process. All proposed approaches have been extensively validated with benchmark tasks, in simulation, and on real robots. These tasks correspond to sports and games but the presented techniques are also applicable to more mundane household tasks. The book is based on the first author’s doctoral thesis, which wo...

  16. A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge

    Institute of Scientific and Technical Information of China (English)

    徐俊刚; 赵越; 陈健; 韩超

    2015-01-01

    Learning structure from data is one of the most important fundamental tasks of Bayesian network research. In particular, learning the optimal structure of a Bayesian network is a non-deterministic polynomial-time (NP) hard problem. To solve this problem, many heuristic algorithms have been proposed, and some of them learn Bayesian network structure with the help of different types of prior knowledge. However, the existing algorithms place some restrictions on the prior knowledge, such as quality restrictions and use restrictions. This makes it difficult to use the prior knowledge well in these algorithms. In this paper, we introduce the prior knowledge into the Markov chain Monte Carlo (MCMC) algorithm and propose an algorithm called Constrained MCMC (C-MCMC) to learn the structure of the Bayesian network. Three types of prior knowledge are defined: existence of parent node, absence of parent node, and distribution knowledge including the conditional probability distribution (CPD) of edges and the probability distribution (PD) of nodes. All of these types of prior knowledge are easily used in this algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed method C-MCMC.

  17. An Early Historical Examination of the Educational Intent of Supervised Agricultural Experiences (SAEs) and Project-Based Learning in Agricultural Education

    Science.gov (United States)

    Smith, Kasee L.; Rayfield, John

    2016-01-01

    Project-based learning has been a component of agricultural education since its inception. In light of the current call for additional emphasis of the Supervised Agricultural Experience (SAE) component of agricultural education, there is a need to revisit the roots of project-based learning. This early historical research study was conducted to…

  18. GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data.

    Science.gov (United States)

    Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E

    2016-03-11

    Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.

  19. Machine learning algorithms for datasets popularity prediction

    CERN Document Server

    Kancys, Kipras

    2016-01-01

    This report presents a continued study in which ML algorithms were used to predict dataset popularity. Three topics were covered. First, there was a discrepancy between the old and new meta-data collection procedures, so the reason for it had to be found. Second, different parameters were analysed and dropped to make the algorithms perform better. Third, it was decided to move the modelling part to Spark.

  20. A meta-learning system based on genetic algorithms

    Science.gov (United States)

    Pellerin, Eric; Pigeon, Luc; Delisle, Sylvain

    2004-04-01

    The design of an efficient machine learning process through self-adaptation is a great challenge. The goal of meta-learning is to build a self-adaptive learning system that is constantly adapting to its specific (and dynamic) environment. To that end, the meta-learning mechanism must improve its bias dynamically by updating the current learning strategy in accordance with its available experiences or meta-knowledge. We suggest using genetic algorithms as the basis of an adaptive system. In this work, we propose a meta-learning system based on a combination of the a priori and a posteriori concepts. A priori refers to input information and knowledge available at the beginning in order to build and evolve one or more sets of parameters by exploiting the context of the system's information. The self-learning component is based on genetic algorithms and neural Darwinism. A posteriori refers to the implicit knowledge discovered by estimation of the future states of parameters and is also applied to the finding of optimal parameter values. The in-progress research presented here suggests a framework for the discovery of knowledge that can support human experts in their intelligence information assessment tasks. The conclusion presents avenues for further research in genetic algorithms and their capability to learn to learn.

  1. Clinical supervision in a community setting.

    Science.gov (United States)

    Evans, Carol; Marcroft, Emma

    Clinical supervision is a formal process of professional support, reflection and learning that contributes to individual development. First Community Health and Care is committed to providing clinical supervision to nurses and allied healthcare professionals to support the provision and maintenance of high-quality care. In 2012, we developed new guidelines for nurses and AHPs on supervision, incorporating a clinical supervision framework. This offers a range of options to staff so supervision accommodates variations in work settings and individual learning needs and styles.

  2. Evaluating the Security of Machine Learning Algorithms

    Science.gov (United States)

    2008-05-20

    description of this setting and several results appear in Cesa-Bianchi and Lugosi [2006]. 2.5 Summary In this chapter we have presented a framework for...Learning Research (JMLR), 3:993–1022, 2003. ISSN 1533-7928. Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University

  3. Development of a Late-Life Dementia Prediction Index with Supervised Machine Learning in the Population-Based CAIDE Study

    Science.gov (United States)

    Pekkala, Timo; Hall, Anette; Lötjönen, Jyrki; Mattila, Jussi; Soininen, Hilkka; Ngandu, Tiia; Laatikainen, Tiina; Kivipelto, Miia; Solomon, Alina

    2016-01-01

    Background and objective: This study aimed to develop a late-life dementia prediction model using a novel validated supervised machine learning method, the Disease State Index (DSI), in the Finnish population-based CAIDE study. Methods: The CAIDE study was based on previous population-based midlife surveys. CAIDE participants were re-examined twice in late-life, and the first late-life re-examination was used as baseline for the present study. The main study population included 709 cognitively normal subjects at first re-examination who returned to the second re-examination up to 10 years later (incident dementia n = 39). An extended population (n = 1009, incident dementia 151) included non-participants/non-survivors (national registers data). DSI was used to develop a dementia index based on first re-examination assessments. Performance in predicting dementia was assessed as area under the ROC curve (AUC). Results: AUCs for DSI were 0.79 and 0.75 for main and extended populations. Included predictors were cognition, vascular factors, age, subjective memory complaints, and APOE genotype. Conclusion: The supervised machine learning method performed well in identifying comprehensive profiles for predicting dementia development up to 10 years later. DSI could thus be useful for identifying individuals who are most at risk and may benefit from dementia prevention interventions. PMID:27802228

  4. Development of a Late-Life Dementia Prediction Index with Supervised Machine Learning in the Population-Based CAIDE Study.

    Science.gov (United States)

    Pekkala, Timo; Hall, Anette; Lötjönen, Jyrki; Mattila, Jussi; Soininen, Hilkka; Ngandu, Tiia; Laatikainen, Tiina; Kivipelto, Miia; Solomon, Alina

    2017-01-01

    This study aimed to develop a late-life dementia prediction model using a novel validated supervised machine learning method, the Disease State Index (DSI), in the Finnish population-based CAIDE study. The CAIDE study was based on previous population-based midlife surveys. CAIDE participants were re-examined twice in late-life, and the first late-life re-examination was used as baseline for the present study. The main study population included 709 cognitively normal subjects at first re-examination who returned to the second re-examination up to 10 years later (incident dementia n = 39). An extended population (n = 1009, incident dementia 151) included non-participants/non-survivors (national registers data). DSI was used to develop a dementia index based on first re-examination assessments. Performance in predicting dementia was assessed as area under the ROC curve (AUC). AUCs for DSI were 0.79 and 0.75 for main and extended populations. Included predictors were cognition, vascular factors, age, subjective memory complaints, and APOE genotype. The supervised machine learning method performed well in identifying comprehensive profiles for predicting dementia development up to 10 years later. DSI could thus be useful for identifying individuals who are most at risk and may benefit from dementia prevention interventions.

  5. Dermoscopic Image Segmentation using Machine Learning Algorithm

    Directory of Open Access Journals (Sweden)

    L. P. Suresh

    2011-01-01

    Full Text Available Problem statement: Malignant melanoma is the most frequent type of skin cancer. Its incidence has been rapidly increasing over the last few decades. Medical image segmentation is an essential step in characterizing and visualizing the structure of interest in medical images. Approach: This study explains the task of segmenting skin lesions in dermoscopy images using intelligent-system clustering techniques, based on fuzzy logic and neural networks, for the early diagnosis of malignant melanoma. The clustering techniques used are the Fuzzy C Means algorithm (FCM), the Possibilistic C Means algorithm (PCM), the Hierarchical C Means algorithm (HCM), the C-means-based Fuzzy Hopfield Neural Network, the Adaline Neural Network, and the Regression Neural Network. Results: The segmented images are compared with the ground truth image using parameters such as False Positive Error (FPE), False Negative Error (FNE), coefficient of similarity, and spatial overlap, and their performance is evaluated. Conclusion: The experimental results show that the Hierarchical C Means algorithm (fuzzy) provides better segmentation than the other clustering algorithms (Fuzzy C Means, Possibilistic C Means, Adaline Neural Network, FHNN, and GRNN). The Hierarchical C Means approach can thus handle the uncertainties in the data efficiently and is useful for lesion segmentation in a computer-aided diagnosis system that assists dermatologists in clinical diagnosis.

  6. Cloud detection in all-sky images via multi-scale neighborhood features and multiple supervised learning techniques

    Science.gov (United States)

    Cheng, Hsu-Yung; Lin, Chih-Lung

    2017-01-01

    Cloud detection is important for providing necessary information such as cloud cover in many applications. Existing cloud detection methods include red-to-blue ratio thresholding and other classification-based techniques. In this paper, we propose to perform cloud detection using supervised learning techniques with multi-resolution features. One of the major contributions of this work is that the features are extracted from local image patches with different sizes to include local structure and multi-resolution information. The cloud models are learned through the training process. We consider classifiers including random forest, support vector machine, and Bayesian classifier. To take advantage of the clues provided by multiple classifiers and various levels of patch sizes, we employ a voting scheme to combine the results to further increase the detection accuracy. In the experiments, we have shown that the proposed method can distinguish cloud and non-cloud pixels more accurately compared with existing works.
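    The voting step described above can be sketched as a simple per-pixel majority vote. This is only an assumed form of the scheme: the abstract does not specify how votes from the different classifiers and patch sizes are weighted, and the names `majority_vote`, `rf`, `svm`, and `bayes` below are illustrative.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-pixel labels from several classifiers.

    `predictions` is a list of equal-length label sequences, one per
    classifier (or per classifier/patch-size combination). The label
    chosen for each pixel is the one that receives the most votes.
    """
    if not predictions:
        raise ValueError("at least one prediction list is required")
    n_pixels = len(predictions[0])
    if any(len(p) != n_pixels for p in predictions):
        raise ValueError("all prediction lists must have the same length")
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the top label
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs for four pixels; 1 = cloud, 0 = non-cloud.
rf = [1, 0, 1, 1]
svm = [1, 1, 0, 1]
bayes = [0, 0, 1, 1]
combined = majority_vote([rf, svm, bayes])
```

    With an odd number of binary voters no ties can occur; with an even number, a tie-breaking rule would have to be chosen.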

  7. Poster abstract: Water level estimation in urban ultrasonic/passive infrared flash flood sensor networks using supervised learning

    KAUST Repository

    Mousa, Mustafa

    2014-04-01

    This article describes a machine learning approach to water level estimation in a dual ultrasonic/passive infrared urban flood sensor system. We first show that an ultrasonic rangefinder alone is unable to accurately measure the level of water on a road due to thermal effects. Using additional passive infrared sensors, we show that ground temperature and local sensor temperature measurements are sufficient to correct the rangefinder readings and improve the flood detection performance. Since floods occur very rarely, we use a supervised learning approach to estimate the correction to the ultrasonic rangefinder caused by temperature fluctuations. Preliminary data shows that water level can be estimated with an absolute error of less than 2 cm. © 2014 IEEE.

  8. Cavity contour segmentation in chest radiographs using supervised learning and dynamic programming

    Energy Technology Data Exchange (ETDEWEB)

    Maduskar, Pragnya, E-mail: pragnya.maduskar@radboudumc.nl; Hogeweg, Laurens; Sánchez, Clara I.; Ginneken, Bram van [Diagnostic Image Analysis Group, Radboud University Medical Center, Nijmegen, 6525 GA (Netherlands); Jong, Pim A. de [Department of Radiology, University Medical Center Utrecht, 3584 CX (Netherlands); Peters-Bax, Liesbeth [Department of Radiology, Radboud University Medical Center, Nijmegen, 6525 GA (Netherlands); Dawson, Rodney [University of Cape Town Lung Institute, Cape Town 7700 (South Africa); Ayles, Helen [Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London WC1E 7HT (United Kingdom)

    2014-07-15

    Purpose: Efficacy of tuberculosis (TB) treatment is often monitored using chest radiography. Monitoring the size of cavities in pulmonary tuberculosis is important, as the size predicts severity of the disease and its persistence under therapy predicts relapse. The authors present a method for automatic cavity segmentation in chest radiographs. Methods: A two-stage method is proposed to segment the cavity borders, given a user-defined seed point close to the center of the cavity. First, a supervised learning approach is employed to train a pixel classifier using texture and radial features to identify the border pixels of the cavity. A likelihood value of belonging to the cavity border is assigned to each pixel by the classifier. The authors experimented with four different classifiers: k-nearest neighbor (kNN), linear discriminant analysis (LDA), GentleBoost (GB), and random forest (RF). Next, the constructed likelihood map was used as an input cost image in the polar transformed image space for dynamic programming to trace the optimal maximum cost path. This constructed path corresponds to the segmented cavity contour in image space. Results: The method was evaluated on 100 chest radiographs (CXRs) containing 126 cavities. The reference segmentation was manually delineated by an experienced chest radiologist. An independent observer (a chest radiologist) also delineated all cavities to estimate interobserver variability. The Jaccard overlap measure Ω was computed between the reference segmentation and the automatic segmentation, and between the reference segmentation and the independent observer's segmentation, for all cavities. A median overlap Ω of 0.81 (0.76 ± 0.16) and 0.85 (0.82 ± 0.11) was achieved between the reference segmentation and the automatic segmentation, and between the segmentations by the two radiologists, respectively. The best reported mean contour distance and Hausdorff distance between the reference and the automatic segmentation were

  9. A Linkage Learning Genetic Algorithm with Linkage Matrix

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    The goal of linkage learning, or building block identification, is the creation of a more effective Genetic Algorithm (GA). This paper proposes a new Linkage Learning Genetic Algorithm, named m-LLGA. With its linkage learning module and linkage-based genetic operation, m-LLGA not only learns and records the linkage information among genes without any prior knowledge of the function being optimized, but can also use the linkage information stored in the linkage matrix to guide the selection of the crossover point. Preliminary experiments on two kinds of bounded difficulty problems and a TSP problem validated the performance of m-LLGA. The m-LLGA learns the linkage of different building blocks in parallel and therefore solves these problems effectively; it also reasonably reduces the probability of building blocks being disrupted by crossover while still attending to escaping local minima.

  10. Implementing a self-structuring data learning algorithm

    Science.gov (United States)

    Graham, James; Carson, Daniel; Ternovskiy, Igor

    2016-05-01

    In this paper, we elaborate on what we did to implement our self-structuring data learning algorithm. To recap, we are working to develop a data learning algorithm that will eventually be capable of goal-driven pattern learning and extrapolation of more complex patterns from less complex ones. At this point we have developed a conceptual framework for the algorithm, but have yet to discuss our actual implementation and the considerations and shortcuts we needed to take to create it. We will elaborate on our initial setup of the algorithm and the scenarios we used to test our early-stage algorithm. While we want this to be a general algorithm, it is necessary to start with a simple scenario or two to provide a viable development and testing environment. To that end, our discussion will be geared toward what we include in our initial implementation and why, as well as what concerns we may have. In the future, we expect to be able to apply our algorithm to a more general approach, but to do so within a reasonable time, we needed to pick a place to start.

  11. Semi-Supervised Multi-View Learning in Big Data

    Institute of Scientific and Technical Information of China (English)

    蓝超; 饶泓; 浣军

    2015-01-01

    This paper introduces a promising machine-learning paradigm called semi-supervised multi-view learning. With this paradigm, information is extracted from heterogeneous and semi-supervised data sets. Lately, multi-view learning has been scaled up online and through parallelization to deal with emerging big data challenges, which makes it suitable for processing massive data. Due to its successful application in many research domains and the fact that it has been explored and used by leading companies, multi-view learning may have a future in the big-data era as a major data analytic technique. New research techniques and results from other fields should be introduced into this area to improve the theoretical system and algorithm design of semi-supervised multi-view learning, and these should be continually tested and explored in experiments and practice.

  12. Learning Bayesian networks using genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    Chen Fei; Wang Xiufeng; Rao Yimei

    2007-01-01

    A new method to evaluate the fitness of Bayesian networks according to the observed data is provided. The main advantage of this criterion is that it is suitable for both the complete and the incomplete data cases, while the others are not. Moreover, it greatly facilitates the computation. In order to reduce the search space, the notion of equivalence class proposed by David Chickering is adopted. Instead of using that method directly, the novel criterion, variable ordering, and equivalence classes are combined; moreover, the proposed method avoids some problems caused by the previous one. Then the genetic algorithm, which allows global convergence, a property lacking in most methods that search for Bayesian networks, is applied to search for a good model in this space. To speed up the convergence, the genetic algorithm is combined with the greedy algorithm. Finally, the simulation shows the validity of the proposed approach.

  13. A statistical learning algorithm for word segmentation

    CERN Document Server

    Van Aken, Jerry R

    2011-01-01

    In natural speech, the speaker does not pause between words, yet a human listener somehow perceives this continuous stream of phonemes as a series of distinct words. The detection of boundaries between spoken words is an instance of a general capability of the human neocortex to remember and to recognize recurring sequences. This paper describes a computer algorithm that is designed to solve the problem of locating word boundaries in blocks of English text from which the spaces have been removed. This problem avoids the complexities of processing speech but requires similar capabilities for detecting recurring sequences. The algorithm that is described in this paper relies entirely on statistical relationships between letters in the input stream to infer the locations of word boundaries. The source code for a C++ version of this algorithm is presented in an appendix.

  14. An introduction to machine learning with Scikit-Learn

    CERN Document Server

    CERN. Geneva

    2015-01-01

    This tutorial gives an introduction to the scientific ecosystem for data analysis and machine learning in Python. After a short introduction of machine learning concepts, we will demonstrate on High Energy Physics data how a basic supervised learning analysis can be carried out using the Scikit-Learn library. Topics covered include data loading facilities and data representation, supervised learning algorithms, pipelines, model selection and evaluation, and model introspection.

  15. Semi-supervised binary classification algorithm based on global and local regularization

    Institute of Scientific and Technical Information of China (English)

    吕佳

    2012-01-01

    In semi-supervised classification, it is difficult to obtain a good classification function for the entire input space if global learning is used alone, whereas local learning used alone can yield a good classification function on specified regions of the input space. Accordingly, a new semi-supervised binary classification algorithm based on mixed local and global regularization is presented in this paper. The algorithm integrates the benefits of a global regularizer and a local regularizer. The global regularizer, built on prior knowledge, smooths the class labels of the data so as to lessen insufficient training of the local regularizer; the local regularizer, constructed from the sample information in each neighboring region, makes the class label of each data point have the desired property. Together they form the objective function of the semi-supervised binary classification problem. Comparative experiments on benchmark binary datasets show that the average classification accuracy and the standard error of the proposed algorithm are superior to those of methods based on Laplacian regularization, regularized Laplacian regularization, and local learning regularization.

  16. Identifying presence of correlated errors in GRACE monthly harmonic coefficients using machine learning algorithms

    Science.gov (United States)

    Piretzidis, Dimitrios; Sra, Gurveer; Karantaidis, George; Sideris, Michael G.

    2017-04-01

    A new method for identifying correlated errors in Gravity Recovery and Climate Experiment (GRACE) monthly harmonic coefficients has been developed and tested. Correlated errors are present in the differences between monthly GRACE solutions, and can be suppressed using a de-correlation filter. In principle, the de-correlation filter should be implemented only on coefficient series with correlated errors to avoid losing useful geophysical information. In previous studies, two main methods of implementing the de-correlation filter have been utilized. In the first one, the de-correlation filter is implemented starting from a specific minimum order until the maximum order of the monthly solution examined. In the second one, the de-correlation filter is implemented only on specific coefficient series, the selection of which is based on statistical testing. The method proposed in the present study exploits the capabilities of supervised machine learning algorithms such as neural networks and support vector machines (SVMs). The pattern of correlated errors can be described by several numerical and geometric features of the harmonic coefficient series. The features of extreme cases of both correlated and uncorrelated coefficients are extracted and used for the training of the machine learning algorithms. The trained machine learning algorithms are later used to identify correlated errors and provide the probability of a coefficient series to be correlated. Regarding SVM algorithms, an extensive study is performed with various kernel functions in order to find the optimal training model for prediction. The selection of the optimal training model is based on the classification accuracy of the trained SVM algorithm on the same samples used for training. Results show excellent performance of all algorithms with a classification accuracy of 97% - 100% on a pre-selected set of training samples, both in the validation stage of the training procedure and in the subsequent use of

  17. Online co-regularized algorithms

    NARCIS (Netherlands)

    Ruijter, T. de; Tsivtsivadze, E.; Heskes, T.

    2012-01-01

    We propose an online co-regularized learning algorithm for classification and regression tasks. We demonstrate that by sequentially co-regularizing prediction functions on unlabeled data points, our algorithm provides improved performance in comparison to supervised methods on several UCI benchmarks.

  19. An entropy-based unsupervised anomaly detection pattern learning algorithm

    Institute of Scientific and Technical Information of China (English)

    YANG Ying-jie; MA Fan-yuan

    2005-01-01

    Currently, most anomaly detection pattern learning algorithms require a set of purely normal data from which they train their model. If the data contain some intrusions buried within the training data, the algorithm may not detect these attacks because it will assume that they are normal. In reality, it is very hard to guarantee that there are no attack items in the collected training data. Focusing on this problem, this paper first proposes a new anomaly detection measure based on the probability characteristics of intrusion instances and normal instances. Second, on the basis of this anomaly detection measure, we present a clustering-based unsupervised anomaly detection pattern learning algorithm, which can overcome the shortcoming described above. Finally, experiments are conducted to verify that the proposed algorithm is valid.

  20. Optimization of circuits using a constructive learning algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Beiu, V.

    1997-05-01

    The paper presents an application of a constructive learning algorithm to optimization of circuits. For a given Boolean function f, a fresh constructive learning algorithm builds circuits belonging to the smallest F_{n,m} class of functions (n inputs and having m groups of ones in their truth table). The constructive proofs, which show how arbitrary Boolean functions can be implemented by this algorithm, are briefly enumerated. An interesting aspect is that the algorithm can be used for generating both classical Boolean circuits and threshold gate circuits (i.e. analogue inputs and digital outputs), or a mixture of them, thus taking advantage of mixed analogue/digital technologies. One illustrative example is detailed. The size and the area of the different circuits are compared (special cost functions can be used to estimate more closely the area and the delay of VLSI implementations). Conclusions and further directions of research end the paper.

  1. Learning algorithms for feedforward networks based on finite samples

    Energy Technology Data Exchange (ETDEWEB)

    Rao, N.S.V.; Protopopescu, V.; Mann, R.C.; Oblow, E.M.; Iyengar, S.S.

    1994-09-01

    Two classes of convergent algorithms for learning continuous functions (and also regression functions) that are represented by feedforward networks, are discussed. The first class of algorithms, applicable to networks with unknown weights located only in the output layer, is obtained by utilizing the potential function methods of Aizerman et al. The second class, applicable to general feedforward networks, is obtained by utilizing the classical Robbins-Monro style stochastic approximation methods. Conditions relating the sample sizes to the error bounds are derived for both classes of algorithms using martingale-type inequalities. For concreteness, the discussion is presented in terms of neural networks, but the results are applicable to general feedforward networks, in particular to wavelet networks. The algorithms can be directly adapted to concept learning problems.
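    The Robbins-Monro stochastic approximation mentioned above can be sketched on a scalar root-finding problem. This is a generic textbook form of the iteration under assumed step sizes a_n = 1/n, not the network-training algorithm of the paper; the names `robbins_monro` and `noisy_g`, the target value 2.0, and the noise level are all illustrative.

```python
import random


def robbins_monro(noisy_g, theta0, steps):
    """Find a root of g from noisy evaluations only.

    Iterates theta_{n+1} = theta_n - a_n * Y_n, where Y_n is a noisy
    observation of g(theta_n) and a_n = 1/n, so that the step sizes
    sum to infinity while their squares have a finite sum.
    """
    theta = theta0
    for n in range(1, steps + 1):
        step = 1.0 / n
        theta -= step * noisy_g(theta)
    return theta


# Illustrative problem: g(theta) = theta - 2 observed with unit Gaussian noise.
rng = random.Random(0)  # fixed seed so the run is repeatable


def noisy_g(theta):
    return (theta - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_g, theta0=10.0, steps=10000)
```

    For this particular linear g the iteration reduces to a running average of the noisy observations, so the estimate approaches the root 2.0 at the usual 1/sqrt(n) rate.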

  2. Adaptation and validation of the instrument Clinical Learning Environment and Supervision for medical students in primary health care.

    Science.gov (United States)

    Öhman, Eva; Alinaghizadeh, Hassan; Kaila, Päivi; Hult, Håkan; Nilsson, Gunnar H; Salminen, Helena

    2016-12-01

    Clinical learning takes place in complex socio-cultural environments that are workplaces for the staff and learning places for the students. In the clinical context, the students learn by active participation and in interaction with the rest of the community at the workplace. Clinical learning occurs outside the university; it is therefore important for both the university and the student that the student is given opportunities to evaluate the clinical placements with an instrument that allows evaluation from many perspectives. The instrument Clinical Learning Environment and Supervision (CLES) was originally developed for evaluation of nursing students' clinical learning environment. The aim of this study was to adapt and validate the CLES instrument to measure medical students' perceptions of their learning environment in primary health care. In the adaptation process the face validity was tested by an expert panel of primary care physicians, who were also active clinical supervisors. The adapted CLES instrument with 25 items and six background questions was sent electronically to 1,256 medical students from one university. Answers from 394 students were eligible for inclusion. Exploratory factor analysis based on principal component methods followed by oblique rotation was used to confirm the adequate number of factors in the data. Construct validity was assessed by factor analysis. Confirmatory factor analysis was used to confirm the dimensions of the CLES instrument. The construct validity showed a clearly indicated four-factor model. The cumulative variance explanation was 0.65, and the overall Cronbach's alpha was 0.95. All items loaded similarly with the dimensions in the non-adapted CLES except for one item that loaded to another dimension. The CLES instrument in its adapted form had high construct validity and high reliability and internal consistency. CLES, in its adapted form, appears to be a valid instrument to evaluate medical students' perceptions of

  3. Adaptation and validation of the instrument Clinical Learning Environment and Supervision for medical students in primary health care

    Directory of Open Access Journals (Sweden)

    Eva Öhman

    2016-12-01

    Full Text Available Abstract Background Clinical learning takes place in complex socio-cultural environments that are workplaces for the staff and learning places for the students. In the clinical context, the students learn by active participation and in interaction with the rest of the community at the workplace. Clinical learning occurs outside the university; it is therefore important for both the university and the student that the student is given opportunities to evaluate the clinical placements with an instrument that allows evaluation from many perspectives. The instrument Clinical Learning Environment and Supervision (CLES) was originally developed for evaluation of nursing students’ clinical learning environment. The aim of this study was to adapt and validate the CLES instrument to measure medical students’ perceptions of their learning environment in primary health care. Methods In the adaptation process the face validity was tested by an expert panel of primary care physicians, who were also active clinical supervisors. The adapted CLES instrument with 25 items and six background questions was sent electronically to 1,256 medical students from one university. Answers from 394 students were eligible for inclusion. Exploratory factor analysis based on principal component methods followed by oblique rotation was used to confirm the adequate number of factors in the data. Construct validity was assessed by factor analysis. Confirmatory factor analysis was used to confirm the dimensions of the CLES instrument. Results The construct validity showed a clearly indicated four-factor model. The cumulative variance explanation was 0.65, and the overall Cronbach’s alpha was 0.95. All items loaded similarly with the dimensions in the non-adapted CLES except for one item that loaded to another dimension. The CLES instrument in its adapted form had high construct validity and high reliability and internal consistency. Conclusion CLES, in its adapted form, appears

  4. Machine-Learning Algorithms to Code Public Health Spending Accounts.

    Science.gov (United States)

    Brady, Eoghan S; Leider, Jonathon P; Resnick, Beth A; Alfonso, Y Natalia; Bishai, David

    Government public health expenditure data sets require time- and labor-intensive manipulation to summarize results that public health policy makers can use. Our objective was to compare the performances of machine-learning algorithms with manual classification of public health expenditures to determine if machines could provide a faster, cheaper alternative to manual classification. We used machine-learning algorithms to replicate the process of manually classifying state public health expenditures, using the standardized public health spending categories from the Foundational Public Health Services model and a large data set from the US Census Bureau. We obtained a data set of 1.9 million individual expenditure items from 2000 to 2013. We collapsed these data into 147 280 summary expenditure records, and we followed a standardized method of manually classifying each expenditure record as public health, maybe public health, or not public health. We then trained 9 machine-learning algorithms to replicate the manual process. We calculated recall, precision, and coverage rates to measure the performance of individual and ensembled algorithms. Compared with manual classification, the machine-learning random forests algorithm produced 84% recall and 91% precision. With algorithm ensembling, we achieved our target criterion of 90% recall by using a consensus ensemble of ≥6 algorithms while still retaining 93% coverage, leaving only 7% of the summary expenditure records unclassified. Machine learning can be a time- and cost-saving tool for estimating public health spending in the United States. It can be used with standardized public health spending categories based on the Foundational Public Health Services model to help parse public health expenditure information from other types of health-related spending, provide data that are more comparable across public health organizations, and evaluate the impact of evidence-based public health resource allocation.
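    The consensus-ensemble evaluation described above can be sketched as follows. This is a minimal illustration, assuming that a record is left unclassified when fewer than a threshold number of algorithms agree and that precision and recall are computed on the covered records only; the abstract does not state these details, and the function names, labels, and votes below are hypothetical.

```python
from collections import Counter


def consensus_labels(votes_per_record, min_agree):
    """Return one label per record, or None when fewer than
    `min_agree` algorithms agree on the most common label."""
    result = []
    for votes in votes_per_record:
        label, count = Counter(votes).most_common(1)[0]
        result.append(label if count >= min_agree else None)
    return result


def evaluate(predicted, truth, positive):
    """Precision and recall for the `positive` class on covered
    records, plus coverage (share of records that got a label)."""
    covered = [(p, t) for p, t in zip(predicted, truth) if p is not None]
    coverage = len(covered) / len(predicted)
    tp = sum(1 for p, t in covered if p == positive and t == positive)
    fp = sum(1 for p, t in covered if p == positive and t != positive)
    fn = sum(1 for p, t in covered if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, coverage


# Hypothetical votes from three algorithms on five expenditure records.
votes = [
    ("ph", "ph", "ph"),
    ("ph", "not", "ph"),
    ("ph", "maybe", "not"),  # no two algorithms agree: left unclassified
    ("not", "not", "ph"),
    ("ph", "ph", "not"),
]
truth = ["ph", "ph", "ph", "not", "not"]  # manual classification
predicted = consensus_labels(votes, min_agree=2)
precision, recall, coverage = evaluate(predicted, truth, positive="ph")
```

    Raising `min_agree` generally trades coverage for precision, which is the trade-off the study reports when it requires agreement among at least 6 of its 9 algorithms.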

  5. Neural-Network-Biased Genetic Algorithms for Materials Design: Evolutionary Algorithms That Learn.

    Science.gov (United States)

    Patra, Tarak K; Meenakshisundaram, Venkatesh; Hung, Jui-Hsiang; Simmons, David S

    2017-02-13

    Machine learning has the potential to dramatically accelerate high-throughput approaches to materials design, as demonstrated by successes in biomolecular design and hard materials design. However, in the search for new soft materials exhibiting properties and performance beyond those previously achieved, machine learning approaches are frequently limited by two shortcomings. First, because they are intrinsically interpolative, they are better suited to the optimization of properties within the known range of accessible behavior than to the discovery of new materials with extremal behavior. Second, they require large pre-existing data sets, which are frequently unavailable and prohibitively expensive to produce. Here we describe a new strategy, the neural-network-biased genetic algorithm (NBGA), for combining genetic algorithms, machine learning, and high-throughput computation or experiment to discover materials with extremal properties in the absence of pre-existing data. Within this strategy, predictions from a progressively constructed artificial neural network are employed to bias the evolution of a genetic algorithm, with fitness evaluations performed via direct simulation or experiment. In effect, this strategy gives the evolutionary algorithm the ability to "learn" and draw inferences from its experience to accelerate the evolutionary process. We test this algorithm against several standard optimization problems and polymer design problems and demonstrate that it matches and typically exceeds the efficiency and reproducibility of standard approaches including a direct-evaluation genetic algorithm and a neural-network-evaluated genetic algorithm. The success of this algorithm in a range of test problems indicates that the NBGA provides a robust strategy for employing informatics-accelerated high-throughput methods to accelerate materials design in the absence of pre-existing data.

  6. Evolutionary Pseudo-Relaxation Learning Algorithm for Bidirectional Associative Memory

    Institute of Scientific and Technical Information of China (English)

    Sheng-Zhi Du; Zeng-Qiang Chen; Zhu-Zhi Yuan

    2005-01-01

    This paper analyzes the sensitivity to noise in BAM (Bidirectional Associative Memory), and then proves that the noise immunity of BAM relates not only to the minimum absolute value of net inputs (MAV) but also to the variance of weights associated with synapse connections. In fact, it is a positive monotonically increasing function of the quotient of MAV divided by the variance of weights. In addition, the performance of the pseudo-relaxation method depends on the learning parameters (λ and ζ), but the relation between them is not linear, so it is hard to find the combination of λ and ζ that leads to the best BAM performance. Moreover, pseudo-relaxation is a local optimization method, so it cannot guarantee a globally optimal solution. In this paper, a novel learning algorithm, EPRBAM (evolutionary pseudo-relaxation learning algorithm for bidirectional associative memory), which employs a genetic algorithm and the pseudo-relaxation method, is proposed to obtain a feasible solution for the BAM weight matrix. This algorithm uses the quotient as the fitness of each individual and employs the pseudo-relaxation method to adjust an individual solution when it no longer satisfies the constraining condition after a genetic operation. Experimental results show that this algorithm greatly improves the noise immunity of BAM. At the same time, EPRBAM does not depend on the learning parameters and can obtain a globally optimal solution.

  7. An Initiative-Learning Algorithm Based on System Uncertainty

    Institute of Scientific and Technical Information of China (English)

    ZHAO Jun

    2005-01-01

    Initiative-learning algorithms are characterized by and hence advantageous for their independence of prior domain knowledge. Usually, their induced results could more objectively express the potential characteristics and patterns of information systems. Initiative-learning processes can be effectively conducted by system uncertainty, because uncertainty is an intrinsic common feature of and also an essential link between information systems and their induced results. Obviously, the effectiveness of such an initiative-learning framework is heavily dependent on the accuracy of system uncertainty measurements. Herein, a more reasonable method for measuring system uncertainty is developed based on rough set theory and the conception of information entropy; then a new algorithm is developed on the basis of the new system uncertainty measurement and Skowron's algorithm for mining propositional default decision rules. The proposed algorithm is typically initiative-learning. It is well adaptable to system uncertainty. As shown by simulation experiments, its comprehensive performances are much better than those of congeneric algorithms.

  8. Gradient Learning Algorithms for Ontology Computing

    Science.gov (United States)

    Gao, Wei; Zhu, Linli

    2014-01-01

    The gradient learning model has been attracting great attention in view of its promising applications in statistics, data dimensionality reduction, and other specific fields. In this paper, we propose a new gradient learning model for ontology similarity measuring and ontology mapping in the multidividing setting. The sample error in this setting is given by virtue of the hypothesis space and the trick of the ontology dividing operator. Finally, two experiments, in the plant and humanoid robotics fields, verify the efficiency of the new computation model for ontology similarity measure and ontology mapping applications in the multidividing setting. PMID:25530752

  9. Gradient Learning Algorithms for Ontology Computing

    Directory of Open Access Journals (Sweden)

    Wei Gao

    2014-01-01

    The gradient learning model has been attracting great attention in view of its promising applications in statistics, data dimensionality reduction, and other specific fields. In this paper, we propose a new gradient learning model for ontology similarity measuring and ontology mapping in the multidividing setting. The sample error in this setting is given by virtue of the hypothesis space and the trick of the ontology dividing operator. Finally, two experiments, in the plant and humanoid robotics fields, verify the efficiency of the new computation model for ontology similarity measure and ontology mapping applications in the multidividing setting.

  10. Optimization of deep learning algorithms for object classification

    Science.gov (United States)

    Horváth, András.

    2017-02-01

    Deep learning is currently the state-of-the-art algorithm for image classification. The complexity of these feedforward neural networks has passed a critical point, resulting in algorithmic breakthroughs in various fields. On the other hand, their complexity makes them executable only in tasks where high-throughput computing power is available. The optimization of these networks, considering computational complexity and applicability on embedded systems, has not yet been studied and investigated in detail. In this paper I show some examples of how these algorithms can be optimized and accelerated on embedded systems.

  11. Manifold learning based registration algorithms applied to multimodal images.

    Science.gov (United States)

    Azampour, Mohammad Farid; Ghaffari, Aboozar; Hamidinekoo, Azam; Fatemizadeh, Emad

    2014-01-01

    Manifold learning algorithms are proposed to be used in image processing based on their ability in preserving data structures while reducing the dimension and the exposure of data structure in lower dimension. Multi-modal images have the same structure and can be registered together as monomodal images if only structural information is shown. As a result, manifold learning is able to transform multi-modal images to mono-modal ones and subsequently do the registration using mono-modal methods. Based on this application, in this paper novel similarity measures are proposed for multi-modal images in which Laplacian eigenmaps are employed as manifold learning algorithm and are tested against rigid registration of PET/MR images. Results show the feasibility of using manifold learning as a way of calculating the similarity between multimodal images.

  12. Fieldwork online: a GIS-based electronic learning environment for supervising fieldwork

    Science.gov (United States)

    Alberti, Koko; Marra, Wouter; Baarsma, Rein; Karssenberg, Derek

    2016-04-01

    Fieldwork comes in many forms: individual research projects in unique places, large groups of students on organized fieldtrips, and everything in between those extremes. Supervising students in often distant places can be a logistical challenge and requires a significant time investment from their supervisors. We developed an online application for remote supervision of students on fieldwork. In our fieldworkonline webapp, which is accessible through a web browser, students can upload their field data in the form of a spreadsheet with coordinates (in a system of choice) and data-fields. Field data can be any combination of quantitative or qualitative data, and can contain references to photos or other documents uploaded to the app. The student's data is converted to a map with data-points that contain all the data-fields and links to photos and documents associated with that location. Supervisors can review the data of their students and provide feedback on observations, or geo-referenced feedback on the map. Similarly, students can ask geo-referenced questions to their supervisors. Furthermore, supervisors can choose different basemaps or upload their own. Fieldwork online is a useful tool for supervising students at a distant location in the field and is most suitable for first-order feedback on students' observations, can be used to guide students to interesting locations, and allows for short discussions on phenomena observed in the field. We are looking for users who would like to use this system; we are able to provide support and add new features if needed. The website is built and controlled using Flask, an open-source Python framework. The maps are generated and controlled using MapServer and OpenLayers, and the database is built in PostgreSQL with PostGIS support. Fieldworkonline and all tools used to create it are open-source. Experience fieldworkonline at our demo during this session, or online at fieldworkonline.geo.uu.nl (username: EGU2016, password: Vienna).

  13. A noise tolerant fine tuning algorithm for the Naïve Bayesian learning algorithm

    Directory of Open Access Journals (Sweden)

    Khalil El Hindi

    2014-07-01

    This work improves on the FTNB algorithm to make it more tolerant to noise. The FTNB algorithm augments the Naïve Bayesian (NB) learning algorithm with a fine-tuning stage in an attempt to find better estimations of the probability terms involved. The fine-tuning stage has proved to be effective in improving the classification accuracy of the NB; however, it makes the NB algorithm more sensitive to noise in a training set. This work presents several modifications of the fine-tuning stage to make it more tolerant to noise. Our empirical results using 47 data sets indicate that the proposed methods greatly enhance the algorithm's tolerance to noise. Furthermore, one of the proposed methods improved the performance of the fine-tuning method on many noise-free data sets.

  14. Learning Sorting Algorithms through Visualization Construction

    Science.gov (United States)

    Cetin, Ibrahim; Andrews-Larson, Christine

    2016-01-01

    Recent increased interest in computational thinking poses an important question to researchers: What are the best ways to teach fundamental computing concepts to students? Visualization is suggested as one way of supporting student learning. This mixed-method study aimed to (i) examine the effect of instruction in which students constructed…

  15. Development and psychometric testing of the Clinical Learning Environment, Supervision and Nurse Teacher evaluation scale (CLES+T): the Spanish version.

    Science.gov (United States)

    Vizcaya-Moreno, M Flores; Pérez-Cañaveras, Rosa M; De Juan, Joaquín; Saarikoski, Mikko

    2015-01-01

    The Clinical Learning Environment, Supervision and Nurse Teacher scale is a reliable and valid instrument to evaluate the quality of the clinical learning process in international nursing education contexts. This paper reports the development and psychometric testing of the Spanish version of the Clinical Learning Environment, Supervision and Nurse Teacher scale. Cross-sectional validation study of the scale. 10 public and private hospitals in the Alicante area, and the Faculty of Health Sciences (University of Alicante, Spain). 370 student nurses on clinical placement (January 2011-March 2012). The Clinical Learning Environment, Supervision and Nurse Teacher scale was translated using the modified direct translation method. Statistical analyses were performed using PASW Statistics 18 and AMOS 18.0.0 software. A multivariate analysis was conducted in order to assess construct validity. Cronbach's alpha coefficient was used to evaluate instrument reliability. An exploratory factorial analysis identified the five dimensions from the original version, and explained 66.4% of the variance. Confirmatory factor analysis supported the factor structure of the Spanish version of the instrument. Cronbach's alpha coefficient for the scale was .95, ranging from .80 to .97 for the subscales. This version of the Clinical Learning Environment, Supervision and Nurse Teacher scale instrument showed acceptable psychometric properties for use as an assessment scale in Spanish-speaking countries. Copyright © 2014 Elsevier Ltd. All rights reserved.
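
Cronbach's alpha, the reliability coefficient reported above, has a standard closed form: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), where k is the number of items. A minimal sketch follows; the scores are made up for illustration and are not data from the study, and the function name `cronbach_alpha` is our own.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four respondents to three Likert-type items.
example_scores = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [1, 2, 2]]
print(round(cronbach_alpha(example_scores), 3))
```

Values close to 1 indicate that the items vary together, which is how the .95 reported for the full scale is usually read.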

  16. A Hybrid Constrained Semi-Supervised Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    李雪梅; 王立宏; 宋宜斌

    2011-01-01

    A hybrid constrained semi-supervised clustering algorithm (HCC) is proposed based on a consistency algorithm. To get a better clustering result, both labeled data and pairwise constraints are considered during clustering, so that the two types of prior knowledge complement each other. The theoretical derivation and the algorithm are presented in detail. Experimental results show that labeled data are more effective than pairwise constraints in improving the quality of clustering. Additionally, on several indices, such as CRI, number of clusters and running time, HCC is better than the algorithms it is compared with.

  17. Learning in the Absence of Direct Supervision: Person-Dependent Scaffolding

    Science.gov (United States)

    Palesy, Debra

    2017-01-01

    Contemporary accounts of learning emphasise the importance of immediate social partners such as teachers and co-workers. Yet, much of our learning for work occurs without such experts. This paper provides an understanding of how and why new home care workers use scaffolding to learn and enact safe manual handling techniques in their workplaces,…

  18. Using animation to help students learn computer algorithms.

    Science.gov (United States)

    Catrambone, Richard; Seay, A Fleming

    2002-01-01

    This paper compares the effects of graphical study aids and animation on the problem-solving performance of students learning computer algorithms. Prior research has found inconsistent effects of animation on learning, and we believe this is partly attributable to animations not being designed to convey key information to learners. We performed an instructional analysis of the to-be-learned algorithms and designed the teaching materials based on that analysis. Participants studied stronger or weaker text-based information about the algorithm, and then some participants additionally studied still frames or an animation. Across 2 studies, learners who studied materials based on the instructional analysis tended to outperform other participants on both near and far transfer tasks. Animation also aided performance, particularly for participants who initially read the weaker text. These results suggest that animation might be added to curricula as a way of improving learning without needing revisions of existing texts and materials. Actual or potential applications of this research include the development of animations for learning complex systems as well as guidelines for determining when animations can aid learning.

  19. Online Levenberg-Marquardt algorithm for digital predistortion based on direct learning and indirect learning architectures

    Science.gov (United States)

    Chen, Limin; Liang, Yin; Wan, Guojin

    2012-04-01

    A regularization approach is introduced into the online identification of the inverse model for predistortion. It is based on a modified backpropagation Levenberg-Marquardt algorithm with a sliding window. An adaptive predistorter with feedback was identified based on direct learning and indirect learning architectures, respectively. The length of the sliding window is discussed. The algorithm is tested on the identification of an infinite impulse response Wiener predistorter and compared with the Recursive Prediction Error Method (RPEM) algorithm and the Nonlinear Filtered Least-Mean-Square (NFxLMS) algorithm. It is found that the proposed algorithm is much more efficient than either of the other techniques. The values of the parameters are also smaller than those extracted by the ordinary least-squares algorithm, since the proposed algorithm constrains the L2-norm of the parameters.

  20. Applying active learning to high-throughput phenotyping algorithms for electronic health records data.

    Science.gov (United States)

    Chen, Yukun; Carroll, Robert J; Hinz, Eugenia R McPeek; Shah, Anushi; Eyler, Anne E; Denny, Joshua C; Xu, Hua

    2013-12-01

    Generalizable, high-throughput phenotyping methods based on supervised machine learning (ML) algorithms could significantly accelerate the use of electronic health records data for clinical and translational research. However, they often require large numbers of annotated samples, which are costly and time-consuming to review. We investigated the use of active learning (AL) in ML-based phenotyping algorithms. We integrated an uncertainty sampling AL approach with support vector machines-based phenotyping algorithms and evaluated its performance using three annotated disease cohorts including rheumatoid arthritis (RA), colorectal cancer (CRC), and venous thromboembolism (VTE). We investigated performance using two types of feature sets: unrefined features, which contained at least all clinical concepts extracted from notes and billing codes; and a smaller set of refined features selected by domain experts. The performance of the AL was compared with a passive learning (PL) approach based on random sampling. Our evaluation showed that AL outperformed PL on three phenotyping tasks. When unrefined features were used in the RA and CRC tasks, AL reduced the number of annotated samples required to achieve an area under the curve (AUC) score of 0.95 by 68% and 23%, respectively. AL also achieved a reduction of 68% for VTE with an optimal AUC of 0.70 using refined features. As expected, refined features improved the performance of phenotyping classifiers and required fewer annotated samples. This study demonstrated that AL can be useful in ML-based phenotyping methods. Moreover, AL and feature engineering based on domain knowledge could be combined to develop efficient and generalizable phenotyping methods.
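
The difference between uncertainty sampling (AL) and random sampling (PL) can be sketched in a few lines. The sketch assumes a binary classifier that returns a signed decision value per unlabeled sample, as a support vector machine does, and treats samples closest to zero as the most uncertain. The function names and the example values are hypothetical; the study's actual selection criterion may differ in detail.

```python
import random


def most_uncertain(decision_values, k=1):
    """Indices of the k pool samples closest to the decision boundary."""
    ranked = sorted(range(len(decision_values)), key=lambda i: abs(decision_values[i]))
    return ranked[:k]


def random_batch(pool_size, k, rng):
    """Passive-learning baseline: k pool indices drawn uniformly at random."""
    return rng.sample(range(pool_size), k)


# Hypothetical decision values for five unlabeled records.
pool_values = [1.5, -0.2, 0.9, -2.0, 0.05]
active_choice = most_uncertain(pool_values, k=2)
passive_choice = random_batch(len(pool_values), 2, random.Random(0))
print(active_choice, passive_choice)
```

In each round the chosen samples would be sent to an annotator, added to the training set, and the classifier retrained before the next selection.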

  1. Expectation-maximization algorithms for learning a finite mixture of univariate survival time distributions from partially specified class values

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Youngrok [Iowa State Univ., Ames, IA (United States)]

    2013-05-15

    Heterogeneity exists on a data set when samples from different classes are merged into the data set. Finite mixture models can be used to represent a survival time distribution on heterogeneous patient group by the proportions of each class and by the survival time distribution within each class as well. The heterogeneous data set cannot be explicitly decomposed to homogeneous subgroups unless all the samples are precisely labeled by their origin classes; such impossibility of decomposition is a barrier to overcome for estimating finite mixture models. The expectation-maximization (EM) algorithm has been used to obtain maximum likelihood estimates of finite mixture models by soft-decomposition of heterogeneous samples without labels for a subset or the entire set of data. In medical surveillance databases we can find partially labeled data, that is, while not completely unlabeled there is only imprecise information about class values. In this study we propose new EM algorithms that take advantages of using such partial labels, and thus incorporate more information than traditional EM algorithms. We particularly propose four variants of the EM algorithm named EM-OCML, EM-PCML, EM-HCML and EM-CPCML, each of which assumes a specific mechanism of missing class values. We conducted a simulation study on exponential survival trees with five classes and showed that the advantages of incorporating substantial amount of partially labeled data can be highly significant. We also showed model selection based on AIC values fairly works to select the best proposed algorithm on each specific data set. A case study on a real-world data set of gastric cancer provided by Surveillance, Epidemiology and End Results (SEER) program showed a superiority of EM-CPCML to not only the other proposed EM algorithms but also conventional supervised, unsupervised and semi-supervised learning algorithms.
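
To make the idea of EM with partial labels concrete, here is a minimal sketch for a mixture of exponential survival-time distributions. It is a generic textbook-style EM, not any of the four named variants: each sample carries a set of classes that are still possible, responsibilities are restricted to that set, and censoring is ignored. The function name, the initialization, and the example data are assumptions made for illustration.

```python
import math


def em_exponential_mixture(times, allowed, n_classes, n_iter=50):
    """EM for a mixture of exponentials; allowed[i] is the set of candidate classes of sample i."""
    pi = [1.0 / n_classes] * n_classes
    mean_t = sum(times) / len(times)
    # Spread the initial rates so that the components are not identical.
    lam = [(k + 1) / mean_t for k in range(n_classes)]
    for _ in range(n_iter):
        # E-step: responsibilities, set to zero for classes ruled out by the partial label.
        resp = []
        for t, cand in zip(times, allowed):
            w = [pi[k] * lam[k] * math.exp(-lam[k] * t) if k in cand else 0.0
                 for k in range(n_classes)]
            s = sum(w)
            resp.append([v / s for v in w])
        # M-step: weighted maximum likelihood estimates of proportions and rates.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(times)
            if nk > 0:
                lam[k] = nk / sum(r[k] * t for r, t in zip(resp, times))
    return pi, lam


# Hypothetical data: two labeled samples per class plus two samples with unknown class.
times = [1.0, 3.0, 10.0, 30.0, 2.0, 20.0]
allowed = [{0}, {0}, {1}, {1}, {0, 1}, {0, 1}]
pi_hat, lam_hat = em_exponential_mixture(times, allowed, n_classes=2)
print(pi_hat, lam_hat)
```

When every sample is labeled with a single class, the procedure reduces to the usual per-class maximum likelihood estimate, which is a quick sanity check for the restricted E-step.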

  2. Enhancing Time Series Clustering by Incorporating Multiple Distance Measures with Semi-Supervised Learning

    Institute of Scientific and Technical Information of China (English)

    周竞; 朱山风; 黄晓地; 张彦春

    2015-01-01

    Time series clustering is widely applied in various areas. Existing research focuses mainly on distance measures between two time series, such as dynamic time warping (DTW) based methods, edit-distance based methods, and shapelets-based methods. In this work, we experimentally demonstrate, for the first time, that no single distance measure performs significantly better than others on clustering datasets of time series where spectral clustering is used. As such, a question arises as to how to choose an appropriate measure for a given dataset of time series. To answer this question, we propose an integration scheme that incorporates multiple distance measures using semi-supervised clustering. Our approach is able to integrate all the measures by extracting valuable underlying information for the clustering. To the best of our knowledge, this work demonstrates for the first time that the semi-supervised clustering method based on constraints is able to enhance time series clustering by combining multiple distance measures. Having tested our method on various time series clustering datasets, we show that it outperforms individual measures, as well as typical integration approaches.
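
As a reference point for one of the distance measures named above, the following is a minimal sketch of the classic dynamic time warping (DTW) distance between two numeric sequences, using absolute difference as the local cost and no warping window. It illustrates a single measure only; the integration scheme of the paper is not reproduced here.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] matched again
                                     cost[i][j - 1],      # b[j-1] matched again
                                     cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
print(dtw_distance([0, 0], [1, 1]))
```

A pairwise matrix of such distances could then be turned into similarities and passed to a clustering method such as spectral clustering.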

  3. Interactive Algorithms for Unsupervised Machine Learning

    Science.gov (United States)

    2015-06-01

    Copetas, and Diane Stidle who greatly enriched my life at CMU. I am thankful to Zeeshan Syed and Eu-Jin Goh who supported me during my internship at Google...for a fun and productive internship. I am looking forward to spending another year at MSR and continuing to collaborate with and learn from everyone at...the nuclear norm minimization program to exactly 1As before this could equivalently be the column space with assumption on the maximal row coherence.

  4. Cascade Error Projection: A Learning Algorithm for Hardware Implementation

    Science.gov (United States)

    Duong, Tuan A.; Daud, Taher

    1996-01-01

    In this paper, we work out a detailed mathematical analysis for a new learning algorithm termed Cascade Error Projection (CEP) and a general learning framework. This framework can be used to obtain the cascade correlation learning algorithm by choosing a particular set of parameters. Furthermore, the CEP learning algorithm operates on only one layer, whereas the other set of weights can be calculated deterministically. In association with the dynamical stepsize change concept to convert the weight update from infinite space into a finite space, the relation between the current stepsize and the previous energy level is also given, and the estimation procedure for the optimal stepsize is used for validation of our proposed technique. Weight values of zero are used to start the learning for every layer, and a single hidden unit is applied instead of a pool of candidate hidden units as in the cascade correlation scheme. Therefore, simplicity in hardware implementation is also obtained. Furthermore, this analysis allows us to select from other methods (such as conjugate gradient descent or Newton's second-order method) one that will be a good candidate for the learning technique. The choice of learning technique depends on the constraints of the problem (e.g., speed, performance, and hardware implementation); one technique may be more suitable than others. Moreover, for a discrete weight space, the theoretical analysis presents the capability of learning with limited weight quantization. Finally, 5- to 8-bit parity and chaotic time series prediction problems are investigated; the simulation results demonstrate that 4-bit or more weight quantization is sufficient for learning a neural network using CEP. In addition, it is demonstrated that this technique is able to compensate for lower-bit weight resolution by incorporating additional hidden units. However, generation result may suffer somewhat with lower bit weight quantization.

  5. Identification of wastewater operational conditions based on manifold regularization semi-supervised learning

    Institute of Scientific and Technical Information of China (English)

    赵立杰; 王海龙; 陈斌

    2016-01-01

    The wastewater treatment process is vulnerable to external shocks, which can cause sludge floating, aging, poisoning, expansion and other failure conditions, resulting in effluent deterioration and high energy consumption. It is therefore important to identify the operating conditions of the wastewater treatment process quickly and accurately. Existing supervised learning methods require all the data to be labeled, which is time consuming and expensive, whereas unlabeled data, which can be collected easily and cheaply in large quantities, also carry rich and useful information about the operating condition. To overcome the inability of supervised learning algorithms to make use of unlabeled data, a semi-supervised extreme learning machine algorithm based on manifold regularization is adopted to monitor the operating states of the biochemical wastewater treatment process. The graph Laplacian matrix is constructed from both the labeled patterns and the unlabeled patterns. The extreme learning machine algorithm is adopted to handle the semi-supervised learning task under the framework of manifold regularization. It constructs the hidden layer using random feature mapping and solves for the weights between the hidden layer and the output layer, which gives it the computational efficiency and generalization performance of a random neural network. The results of simulation experiments show that the fault identification method based on the semi-supervised extreme learning machine is superior to the basic extreme learning machine in accuracy and reliability.

  6. Teachers' Use of a Verbally Governed Algorithm and Student Learning

    Science.gov (United States)

    Keohane, Dolleen-Day; Greer, R. Douglas

    2005-01-01

    The effects of instructing teachers in the use of a verbally governed algorithm to solve students' learning problems were measured. The teachers were taught to analyze students' responses to instruction using a strategic protocol, which included a series of verbally governed questions. The study was designed to determine whether the instructional…

  7. LAHS: A novel harmony search algorithm based on learning automata

    Science.gov (United States)

    Enayatifar, Rasul; Yousefi, Moslem; Abdullah, Abdul Hanan; Darus, Amer Nordin

    2013-12-01

    This study presents a learning automata-based harmony search (LAHS) for unconstrained optimization of continuous problems. The harmony search (HS) algorithm performance strongly depends on the fine tuning of its parameters, including the harmony consideration rate (HMCR), pitch adjustment rate (PAR) and bandwidth (bw). Inspired by the spur-in-time responses in the musical improvisation process, learning capabilities are employed in the HS to select these parameters based on spontaneous reactions. An extensive numerical investigation is conducted on several well-known test functions, and the results are compared with the HS algorithm and its prominent variants, including the improved harmony search (IHS), global-best harmony search (GHS) and self-adaptive global-best harmony search (SGHS). The numerical results indicate that the LAHS is more efficient in finding optimum solutions and outperforms the existing HS algorithm variants.
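
The three parameters named above (HMCR, PAR and bw) enter the basic harmony search as follows. This is a minimal sketch of plain HS with fixed parameters, under the common textbook formulation; the learning-automata adaptation that defines LAHS is not shown, and the function name, default values, and test function are illustrative assumptions.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   iters=2000, seed=0):
    """Minimize objective over box bounds with basic harmony search."""
    rng = random.Random(seed)
    # Harmony memory: hms random solutions and their objective values.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iters):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for this dimension.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # Pitch adjustment: small shift scaled by the bandwidth.
                    value += rng.uniform(-1.0, 1.0) * bw * (hi - lo)
            else:
                # Random selection from the allowed range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))
        worst = max(range(hms), key=scores.__getitem__)
        new_score = objective(new)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def sphere(v):
    """Simple convex test function with its minimum at the origin."""
    return sum(x * x for x in v)


best_point, best_score = harmony_search(sphere, [(-5.0, 5.0)] * 2)
print(best_point, best_score)
```

An adaptive variant would replace the fixed `hmcr`, `par` and `bw` arguments with values chosen anew in each iteration.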

  8. Machine learning algorithms for mode-of-action classification in toxicity assessment.

    Science.gov (United States)

    Zhang, Yile; Wong, Yau Shu; Deng, Jian; Anton, Cristina; Gabos, Stephan; Zhang, Weiping; Huang, Dorothy Yu; Jin, Can

    2016-01-01

    Real Time Cell Analysis (RTCA) technology is used to monitor cellular changes continuously over the entire exposure period. Combined with different test concentrations, the profiles have potential for probing the mode of action (MOA) of the tested substances. In this paper, we present machine learning approaches for MOA assessment. Computational tools based on artificial neural network (ANN) and support vector machine (SVM) are developed to analyze the time-concentration response curves (TCRCs) of human cell lines responding to tested chemicals. The techniques are capable of learning from given TCRCs with known MOA information and then making MOA classifications for unknown toxicity. A novel data processing step based on the wavelet transform is introduced to extract important features from the original TCRC data. From the dose response curves, the time interval leading to a higher classification success rate can be selected as input to enhance the performance of the machine learning algorithm. This is particularly helpful when handling cases with limited and imbalanced data. The proposed method is validated by applying the supervised learning algorithm to the exposure data of the HepG2 cell line to 63 chemicals with 11 concentrations in each test case. Classification success rates in the range of 85 to 95% are obtained using SVM for MOA classification, for cases ranging from two clusters up to four clusters. The wavelet transform is capable of capturing important features of TCRCs for MOA classification. The proposed SVM scheme combined with the wavelet transform has great potential for large-scale MOA classification and high-throughput chemical screening.

  9. Target Localization in Wireless Sensor Networks Using Online Semi-Supervised Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Jaehyun Yoo

    2015-05-01

    Machine learning has been successfully used for target localization in wireless sensor networks (WSNs) due to its accurate and robust estimation against highly nonlinear and noisy sensor measurement. For efficient and adaptive learning, this paper introduces online semi-supervised support vector regression (OSS-SVR). The first advantage of the proposed algorithm is that, based on a semi-supervised learning framework, it can reduce the amount of labeled training data required while maintaining accurate estimation. Second, with an extension to online learning, the proposed OSS-SVR automatically tracks changes of the system to be learned, such as varied noise characteristics. We compare the proposed algorithm with semi-supervised manifold learning, an online Gaussian process and online semi-supervised colocalization. The algorithms are evaluated for estimating the unknown location of a mobile robot in a WSN. The experimental results show that the proposed algorithm is more accurate with a smaller amount of labeled training data and is robust to varying noise. Moreover, the suggested algorithm performs fast computation, maintaining the best localization performance in comparison with the other methods.

  10. Face detection based on multiple kernel learning algorithm

    Science.gov (United States)

    Sun, Bo; Cao, Siming; He, Jun; Yu, Lejun

    2016-09-01

    Face detection is important for face localization in face or facial expression recognition, etc. The basic idea is to determine whether there is a face in an image or not, and also its location and size. It can be seen as a binary classification problem, which can be solved well by a support vector machine (SVM). Though SVM has strong model generalization ability, it has some limitations, which are analyzed in depth in the paper. To address them, we study the principle and characteristics of Multiple Kernel Learning (MKL) and propose an MKL-based face detection algorithm. In the paper, we describe the proposed algorithm from the interdisciplinary research perspective of machine learning and image processing. After analyzing the limitation of describing a face with a single feature, we apply several features. To fuse them well, we try different kernel functions on different features. With the MKL method, the weight of each single function is determined. Thus, we obtain the face detection model, which is the kernel of the proposed method. Experiments on a public data set and real-life face images are performed. We compare the performance of the proposed algorithm with the single kernel-single feature based algorithm and the multiple kernels-single feature based algorithm. The effectiveness of the proposed algorithm is illustrated. Keywords: face detection, feature fusion, SVM, MKL

  11. Enhancing the Standard of Teaching and Learning in the 21st Century via Qualitative School-Based Supervision in Secondary Schools in Abuja Municipal Area Council (AMAC)

    Science.gov (United States)

    Ebele, Uju F.; Olofu, Paul A.

    2017-01-01

    The study focused on enhancing the standard of teaching and learning in the 21st century via qualitative school-based supervision in secondary schools in Abuja municipal area council. To guide the study, two null hypotheses were formulated. A descriptive survey research design was adopted. The sample of the study constituted of 270 secondary…

  12. Exploration of joint redundancy but not task space variability facilitates supervised motor learning.

    Science.gov (United States)

    Singh, Puneet; Jana, Sumitash; Ghosal, Ashitava; Murthy, Aditya

    2016-12-13

    The number of joints and muscles in a human arm is more than what is required for reaching to a desired point in 3D space. Although previous studies have emphasized how such redundancy and the associated flexibility may play an important role in path planning, control of noise, and optimization of motion, whether and how redundancy might promote motor learning has not been investigated. In this work, we quantify redundancy space and investigate its significance and effect on motor learning. We propose that a larger redundancy space leads to faster learning across subjects. We observed this pattern in subjects learning novel kinematics (visuomotor adaptation) and dynamics (force-field adaptation). Interestingly, we also observed differences in the redundancy space between the dominant hand and nondominant hand that explained differences in the learning of dynamics. Taken together, these results provide support for the hypothesis that redundancy aids in motor learning and that the redundant component of motor variability is not noise.

  13. Reflections on Doctoral Supervision: Drawing from the Experiences of Students with Additional Learning Needs in Two Universities

    Science.gov (United States)

    Collins, Bethan

    2015-01-01

    Supervision is an essential part of doctoral study, consisting of relationship and process aspects, underpinned by a range of values. To date there has been limited research specifically about disabled doctoral students' experiences of supervision. This paper draws on qualitative, narrative interviews about doctoral supervision with disabled…

  14. Analysed potential of big data and supervised machine learning techniques in effectively forecasting travel times from fused data

    Directory of Open Access Journals (Sweden)

    Ivana Šemanjski

    2015-12-01

    Full Text Available Travel time forecasting is an interesting topic for many ITS services. Increased availability of data collection sensors increases the availability of the predictor variables but also highlights the heavy processing demands that come with this big data. In this paper we aimed to analyse the potential of big data and supervised machine learning techniques in effectively forecasting travel times. For this purpose we used fused data from three data sources (Global Positioning System vehicle tracks, road network infrastructure data and meteorological data) and four machine learning techniques (k-nearest neighbours, support vector machines, boosting trees and random forest). To evaluate the forecasting results we compared them across different road classes in terms of absolute values, measured in minutes, and the mean squared percentage error. For the road classes with high average speeds and long road segments, machine learning techniques forecasted travel times with small relative error, while for the road classes with low average speeds and short segment lengths this was a more demanding task. All three data sources proved to have a high impact on the travel time forecast accuracy, and the best results (taking into account all road classes) were achieved for the k-nearest neighbours and random forest techniques.
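
    As a rough illustration of the evaluation described in this abstract, the sketch below pairs a tiny k-nearest-neighbours regressor with a squared percentage error. The feature layout (segment length, average speed), the toy data and the exact error formula (mean of squared relative errors) are assumptions made for the example; the record does not specify them.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mean_squared_percentage_error(actual, predicted):
    """Mean of squared relative errors (assumed definition, as a fraction)."""
    return sum(((a - p) / a) ** 2
               for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: [segment length in km, average speed in km/h] -> minutes.
train_x = [[1.0, 30.0], [1.2, 30.0], [5.0, 90.0], [5.5, 90.0]]
train_y = [2.0, 2.4, 3.3, 3.7]

prediction = knn_predict(train_x, train_y, [1.1, 30.0], k=2)
error = mean_squared_percentage_error([2.0, 4.0], [1.0, 5.0])
```

    In practice the features would need scaling before distances are computed, since speed and length are on very different scales.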

  15. Student experiences in learning person-centred care of patients with Alzheimer's disease as perceived by nursing students and supervising nurses.

    Science.gov (United States)

    Skaalvik, Mari W; Normann, Hans Ketil; Henriksen, Nils

    2010-09-01

    The aims and objectives of this paper are to illuminate and discuss the experiences and perceptions of nursing students and supervising nurses regarding the students' learning of person-centred care of patients with Alzheimer's disease in a teaching nursing home. This information is then used to develop recommendations as to how student learning could be improved. The clinical experiences of nursing students are an important part of learning person-centred care. Caring for patients with Alzheimer's disease may cause frustration, sadness, fear and empathy. Person-centred care can be learned in clinical practice. A qualitative study. The study was performed in 2006 using field work with field notes and qualitative interviews with seven fifth-semester nursing students and six supervising nurses. This study determined the variation in the perceptions of nursing students and supervising nurses with regard to the students' expertise in caring for patients with Alzheimer's disease. The nursing students experienced limited learning regarding person-centred approaches in caring for patients with Alzheimer's disease. However, the supervising nurses perceived the teaching nursing home as a site representing multiple learning opportunities in this area. Nursing students perceived limited learning outcomes because they did not observe or experience systematic person-centred approaches in caring for patients with Alzheimer's disease. It is important that measures of quality improvements in the care of patients with Alzheimer's disease are communicated and demonstrated for nursing students working in clinical practices in a teaching nursing home. Introduction of person-centred approaches is vital regarding learning outcomes for nursing students caring for patients with Alzheimer's disease. © 2010 The Authors. Journal compilation © 2010 Blackwell Publishing Ltd.

  16. Learning Algorithm for a Brachiating Robot

    Directory of Open Access Journals (Sweden)

    Hideki Kajima

    2003-01-01

    Full Text Available This paper introduces a new concept of multi-locomotion robot inspired by an animal. The robot, ‘Gorilla Robot II’, can select the appropriate locomotion (from biped locomotion, quadruped locomotion and brachiation) according to an environment or task. We consider ‘brachiation’ to be one of the most dynamic of animal motions. To develop a brachiation controller, the architecture of a hierarchical behaviour-based controller, which consists of behaviour controllers and behaviour coordinators, was used. To achieve better brachiation, an enhanced learning method for motion control, adjusting the timing of the behaviour coordination, is proposed. Finally, it is shown that the developed robot successfully performs two types of brachiation and continuous locomotion.

  17. Backpropagation Learning Algorithms for Email Classification.

    Directory of Open Access Journals (Sweden)

    David Ndumiyana and Tarirayi Mukabeta

    2016-07-01

    Full Text Available Today email has become one of the fastest and most effective forms of communication. The popularity of this mode of transmitting goods, information and services has motivated spammers to perfect their technical skills to fool spam filters. This development has worsened the problems faced by Internet users as they have to deal with email congestion, email overload and unprioritised email messages. The result has been an exponential increase in the number of email classification management tools over the past few decades. In this paper we propose a new spam classifier that uses a multilayer neural network trained with the back-propagation technique. Our contribution to the body of knowledge is the use of an improved empirical analysis to choose an optimum, novel collection of attributes of a user’s email contents that allows quick detection of the most important words in emails. We also demonstrate the effectiveness of using two equal sets of emails as training and testing data.

  18. Automatic learning rate adjustment for self-supervising autonomous robot control

    Science.gov (United States)

    Arras, Michael K.; Protzel, Peter W.; Palumbo, Daniel L.

    1992-01-01

    Described is an application in which an Artificial Neural Network (ANN) controls the positioning of a robot arm with five degrees of freedom by using visual feedback provided by two cameras. This application and the specific ANN model, local linear maps, are based on the work of Ritter, Martinetz, and Schulten. We extended their approach by generating a filtered, average positioning error from the continuous camera feedback and by coupling the learning rate to this error. When the network learns to position the arm, the positioning error decreases and so does the learning rate until the system stabilizes at a minimum error and learning rate. This abolishes the need for a predetermined cooling schedule. The automatic cooling procedure results in a closed loop control with no distinction between a learning phase and a production phase. If the positioning error suddenly starts to increase due to an internal failure such as a broken joint, or an environmental change such as a camera moving, the learning rate increases accordingly. Thus, learning is automatically activated and the network adapts to the new condition, after which the error decreases again and learning is 'shut off'. The automatic cooling is therefore a prerequisite for the autonomy and the fault tolerance of the system.
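
    The coupling between a filtered positioning error and the learning rate can be sketched in a few lines. The exponential moving average used as the filter, the proportional coupling, and the names `alpha`, `gain` and `max_rate` are all assumptions made for illustration; the record does not say which filter or coupling function the authors used.

```python
def filtered_error(prev_avg, new_error, alpha=0.1):
    """Exponential moving average of the positioning error (assumed filter)."""
    return (1.0 - alpha) * prev_avg + alpha * new_error


def learning_rate(avg_error, gain=0.5, max_rate=1.0):
    """Learning rate proportional to the filtered error, capped at max_rate."""
    return min(max_rate, gain * avg_error)


def simulate(errors, alpha=0.1, gain=0.5):
    """Return the learning rate after each observed positioning error."""
    avg = errors[0]
    rates = []
    for e in errors:
        avg = filtered_error(avg, e, alpha)
        rates.append(learning_rate(avg, gain))
    return rates


# The error decays while the arm is learning, then jumps (say, a camera moves).
errors = [0.9 ** t for t in range(50)] + [0.8] * 20
rates = simulate(errors)
```

    In this toy run the rate falls as the error decays and rises again after the jump, with no separate cooling schedule, which is the behaviour the abstract describes.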

  19. Dolanan Dance Learning on Supervising Pre-Service Teachers during Teaching Practicum Program

    Directory of Open Access Journals (Sweden)

    Nilam Cahyaningrum

    2015-01-01

    Full Text Available Taman Kanak-kanak Mekarsari (Mekarsari Kindergarten) is a school that chooses dolanan anak dance lessons, which are taught using demonstration methods. This study aims to find, understand, and describe the process and learning outcomes of dolanan anak dance in Mekarsari Kindergarten, Kandeman District of Batang. This study uses qualitative research methods with a phenomenological approach, with Mekarsari Kindergarten, Kandeman District of Batang, as the research site. The data collection techniques used were observation, interviews, and documentation. Data analysis used data reduction, data presentation, drawing conclusions, and verification. Validity was tested using triangulation of data sources, techniques, and time. Dolanan anak dance learning in Mekarsari Kindergarten consists of several components, namely teaching and learning activities, goals, teachers, students, materials, methods, media, tools and learning resources, and evaluation. Dolanan dance learning used a demonstration method implemented through three stages: pre-development activities, core activities, and closing activities. The learning outcomes of dolanan anak dance learning in Mekarsari Kindergarten were categorized into three aspects, namely cognitive, affective, and psychomotor. Cognitive aspects can be seen from the students’ ability to remember, memorize and understand the dance. Affective aspects include familiarity levels, namely learning to know friends and dance movements, responding to movements among friends, and appreciating the teacher’s explanation given to each student. Psychomotor aspects can be seen from the students’ ability to imitate the dance movements, use the concept of doing the movements and precision of movements, weave movement and exercise appropriately.

  20. Learning sorting algorithms through visualization construction

    Science.gov (United States)

    Cetin, Ibrahim; Andrews-Larson, Christine

    2016-01-01

    Recent increased interest in computational thinking poses an important question to researchers: What are the best ways to teach fundamental computing concepts to students? Visualization is suggested as one way of supporting student learning. This mixed-method study aimed to (i) examine the effect of instruction in which students constructed visualizations on students' programming achievement and students' attitudes toward computer programming, and (ii) explore how this kind of instruction supports students' learning according to their self-reported experiences in the course. The study was conducted with 58 pre-service teachers who were enrolled in their second programming class. They expect to teach information technology and computing-related courses at the primary and secondary levels. An embedded experimental model was utilized as a research design. Students in the experimental group were given instruction that required students to construct visualizations related to sorting, whereas students in the control group viewed pre-made visualizations. After the instructional intervention, eight students from each group were selected for semi-structured interviews. The results showed that the intervention based on visualization construction resulted in significantly better acquisition of sorting concepts. However, there was no significant difference between the groups with respect to students' attitudes toward computer programming. Qualitative data analysis indicated that students in the experimental group constructed necessary abstractions through their engagement in visualization construction activities. The authors of this study argue that the students' active engagement in the visualization construction activities explains only one side of students' success. The other side can be explained through the instructional approach, constructionism in this case, used to design instruction. The conclusions and implications of this study can be used by researchers and

  1. General asymmetric neural networks and structure design by genetic algorithms: A learning rule for temporal patterns

    Energy Technology Data Exchange (ETDEWEB)

    Bornholdt, S. [Heidelberg Univ., (Germany). Inst., fuer Theoretische Physik; Graudenz, D. [Lawrence Berkeley Lab., CA (United States)

    1993-07-01

    A learning algorithm based on genetic algorithms for asymmetric neural networks with an arbitrary structure is presented. It is suited for the learning of temporal patterns and leads to stable neural networks with feedback.

  2. Machine Learning for Information Retrieval: Neural Networks, Symbolic Learning, and Genetic Algorithms.

    Science.gov (United States)

    Chen, Hsinchun

    1995-01-01

    Presents an overview of artificial-intelligence-based inductive learning techniques and their use in information science research. Three methods are discussed: the connectionist Hopfield network; the symbolic ID3/ID5R; evolution-based genetic algorithms. The knowledge representations and algorithms of these methods are examined in the context of…

  4. Denoising of gravitational wave signals via dictionary learning algorithms

    Science.gov (United States)

    Torres-Forné, Alejandro; Marquina, Antonio; Font, José A.; Ibáñez, José M.

    2016-12-01

    Gravitational wave astronomy has become a reality after the historical detections accomplished during the first observing run of the two advanced LIGO detectors. In the following years, the number of detections is expected to increase significantly with the full commissioning of the advanced LIGO, advanced Virgo and KAGRA detectors. The development of sophisticated data analysis techniques to improve the opportunities of detection for low signal-to-noise-ratio events is, hence, a most crucial effort. In this paper, we present one such technique, dictionary-learning algorithms, which have been extensively developed in the last few years and successfully applied mostly in the context of image processing. However, to the best of our knowledge, such algorithms have not yet been employed to denoise gravitational wave signals. By building dictionaries from numerical relativity templates of both binary black hole mergers and bursts of rotational core collapse, we show how machine-learning algorithms based on dictionaries can also be successfully applied for gravitational wave denoising. We use a subset of signals from both catalogs, embedded in nonwhite Gaussian noise, to assess our techniques with a large sample of tests and to find the best model parameters. The application of our method to the actual signal GW150914 shows promising results. Dictionary-learning algorithms could be a complementary addition to the gravitational wave data analysis toolkit. They may be used to extract signals from noise and to infer physical parameters if the data are in good enough agreement with the morphology of the dictionary atoms.

  5. Active constraints selection based semi-supervised dimensionality in ensemble subspaces

    Institute of Scientific and Technical Information of China (English)

    Jie Zeng; Wei Nie; Yong Zhang

    2015-01-01

    Semi-supervised dimensionality reduction (SSDR) has attracted an increasing amount of attention in this big-data era. Many algorithms have been developed with a small number of pairwise constraints to achieve performances comparable to those of fully supervised methods. However, one challenging problem with semi-supervised approaches is the appropriate choice of the constraint set, including the cardinality and the composition of the constraint set, which to a large extent affects the performance of the resulting algorithm. In this work, we address the problem by incorporating ensemble subspace and active learning into dimensionality reduction and propose a new algorithm, termed global and local scatter based SSDR with active pairwise constraints selection in ensemble subspaces (SSGL-ESA). Unlike traditional methods that select the supervised information in one subspace, we pick up pairwise constraints in ensemble subspace, where a novel active learning algorithm is designed with both exploration and filtering to generate informative pairwise constraints. The automatic constraint selection approach proposed in this paper can be generalized to be used with all constraint-based semi-supervised learning algorithms. Comparative experiments are conducted on two face databases and the results validate the effectiveness of the proposed method.

  6. AN SUPERVISED METHOD FOR DETECTION MALWARE BY USING MACHINE LEARNING ALGORITHM

    OpenAIRE

    Nisha Badwaik, Vijay Bagdi

    2016-01-01

    With the explosive increase in mobile applications, more and more threats and viruses are migrating from traditional PCs to mobile devices. The information held on these devices, and the access they provide, make them attractive targets for malicious entities. To address this, we propose a probabilistic discriminative model based on regularized logistic regression for Android malware detection using decompiled source code. There are so many approaches for detection of android malware has bee...

  7. Video game for learning and metaphorization of recursive algorithms

    Directory of Open Access Journals (Sweden)

    Ricardo Inacio Alvares Silva

    2013-09-01

    Full Text Available Learning recursive algorithms in computer programming is difficult because their execution and resolution do not come naturally to the way people have been trained to think since childhood. As with other topics in algorithms, we use metaphors to draw parallels between the abstract and the concrete and so help in understanding how recursive algorithms operate. However, the classic metaphors employed in this area, such as calculating a factorial recursively and the Towers of Hanoi game, may cause further confusion or be insufficient. In this work, we produced a computer game to assist students in computing courses in learning recursive algorithms. It was designed to have regular video game characteristics, with narrative and classical gameplay elements commonly found in this kind of product. It supports education through metaphorization, in other words through experiences provided by game situations that refer to recursive algorithms. To this end, we designed and embedded in the game four valid metaphors related to the theory, along with other minor references to the subject.

  8. Comparison of machine learning algorithms for detecting coral reef

    Directory of Open Access Journals (Sweden)

    Eduardo Tusa

    2014-09-01

    Full Text Available (Received: 2014/07/31 - Accepted: 2014/09/23) This work focuses on developing a fast coral reef detector for use on an autonomous underwater vehicle (AUV). Fast detection lets the AUV stabilize with respect to an area of reef as quickly as possible and prevents devastating collisions. We use the algorithm of Purser et al. (2009) because of its precision. This detector has two parts: feature extraction, which uses Gabor wavelet filters, and feature classification, which uses machine learning based on neural networks. Because of the long running time of the neural networks, we replace them with a classification algorithm based on decision trees. We use a database of 621 images of coral reef in Belize (110 images for training and 511 images for testing). We implement the bank of Gabor wavelet filters using C++ and the OpenCV library. We compare the accuracy and running time of 9 machine learning algorithms, which led to the selection of the decision trees algorithm. Our coral detector runs in 70 ms, compared with 22 s for the algorithm of Purser et al. (2009).

  9. Computer aided lung cancer diagnosis with deep learning algorithms

    Science.gov (United States)

    Sun, Wenqing; Zheng, Bin; Qian, Wei

    2016-03-01

    Deep learning is considered a popular and powerful method in pattern recognition and classification. However, there are not many deep structured applications used in the medical imaging diagnosis area, because large datasets are not always available for medical images. In this study we tested the feasibility of using deep learning algorithms for lung cancer diagnosis with the cases from the Lung Image Database Consortium (LIDC) database. The nodules on each computed tomography (CT) slice were segmented according to marks provided by the radiologists. After down sampling and rotating we acquired 174412 samples of 52 by 52 pixels each and the corresponding truth files. Three deep learning algorithms were designed and implemented: Convolutional Neural Network (CNN), Deep Belief Networks (DBNs), and Stacked Denoising Autoencoder (SDAE). To compare the performance of deep learning algorithms with a traditional computer aided diagnosis (CADx) system, we designed a scheme with 28 image features and a support vector machine. The accuracies of CNN, DBNs, and SDAE are 0.7976, 0.8119, and 0.7929, respectively; the accuracy of our designed traditional CADx is 0.7940, which is slightly lower than CNN and DBNs. We also noticed that the mislabeled nodules using DBNs are 4% larger than those using traditional CADx; this might result from the down sampling process losing some size information of the nodules.

  10. Advanced Machine learning Algorithm Application for Rotating Machine Health Monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Kanemoto, Shigeru; Watanabe, Masaya [The University of Aizu, Aizuwakamatsu (Japan); Yusa, Noritaka [Tohoku University, Sendai (Japan)

    2014-08-15

    The present paper evaluates the applicability of conventional sound analysis techniques and modern machine learning algorithms to rotating machine health monitoring. These techniques include support vector machines, deep learning neural networks, etc. Inner ring defect and misalignment anomaly sound data measured at a rotating machine mockup test facility are used to verify these algorithms. Although we could not find a remarkable difference in anomaly discrimination performance, some methods give very interesting eigen patterns corresponding to normal and abnormal states. These results will be useful for future, more sensitive and robust anomaly monitoring technology.

  11. Developing a Learning Algorithm-Generated Empirical Relaxer

    Energy Technology Data Exchange (ETDEWEB)

    Mitchell, Wayne [Univ. of Colorado, Boulder, CO (United States). Dept. of Applied Math; Kallman, Josh [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Toreja, Allen [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Gallagher, Brian [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Laney, Dan [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-03-30

    One of the main difficulties when running Arbitrary Lagrangian-Eulerian (ALE) simulations is determining how much to relax the mesh during the Eulerian step. This determination is currently made by the user on a simulation-by-simulation basis. We present a Learning Algorithm-Generated Empirical Relaxer (LAGER) which uses a regressive random forest algorithm to automate this decision process. We also demonstrate that LAGER successfully relaxes a variety of test problems, maintains simulation accuracy, and has the potential to significantly decrease both the person-hours and computational hours needed to run a successful ALE simulation.

  12. An Educational System for Learning Search Algorithms and Automatically Assessing Student Performance

    Science.gov (United States)

    Grivokostopoulou, Foteini; Perikos, Isidoros; Hatzilygeroudis, Ioannis

    2017-01-01

    In this paper, first we present an educational system that assists students in learning and tutors in teaching search algorithms, an artificial intelligence topic. Learning is achieved through a wide range of learning activities. Algorithm visualizations demonstrate the operational functionality of algorithms according to the principles of active…

  14. HAMA-Based Semi-Supervised Hashing Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘扬; 朱明

    2014-01-01

    In massive data retrieval applications, hashing-based approximate nearest neighbor (ANN) search has become popular because of its computational and memory efficiency for online search. The semi-supervised hashing (SSH) framework minimizes the empirical error over the labeled set together with an information-theoretic regularizer over both the labeled and unlabeled sets. It thereby combines the regularization information of unsupervised hashing with the ability of supervised methods to bridge the semantic gap, and achieves good results. However, the offline training of the hash functions in this framework is very slow, because a complex training process has to be run over the entire large-scale data set. HAMA is a top-level parallel framework on Hadoop based on the Bulk Synchronous Parallel (BSP) model. In this paper, we analyze the calculation of the adjusted covariance matrix in the training process of SSH, split it into two parts, an unsupervised data variance part and a supervised pairwise labeled data part, and explore the parallelization of each. This allows the training to scale horizontally across larger commercial computing clusters and makes it usable in practical applications. Experiments show the performance and scalability of the distributed algorithm on general commercial hardware and network environments.
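
    The split described in this abstract can be illustrated with a small sketch. It assumes the adjusted covariance matrix has the form M = X_l S X_l^T + eta * X X^T used in the commonly cited SSH formulation; the record itself does not give the formula. The helper names, the toy data and the two-way partitioning are hypothetical and stand in for BSP peers, not for the HAMA API.

```python
def outer_sum(points):
    """Unsupervised part: sum of outer products x x^T over the given points."""
    d = len(points[0])
    acc = [[0.0] * d for _ in range(d)]
    for x in points:
        for i in range(d):
            for j in range(d):
                acc[i][j] += x[i] * x[j]
    return acc


def supervised_part(labeled, pairs):
    """Supervised part: sum of s * (x_a x_b^T + x_b x_a^T) over labeled pairs.

    Each pair is (a, b, s) with a != b and s = +1 (similar) or -1 (dissimilar).
    """
    d = len(labeled[0])
    acc = [[0.0] * d for _ in range(d)]
    for a, b, s in pairs:
        xa, xb = labeled[a], labeled[b]
        for i in range(d):
            for j in range(d):
                acc[i][j] += s * (xa[i] * xb[j] + xb[i] * xa[j])
    return acc


def add(m1, m2, scale=1.0):
    """Return m1 + scale * m2 for two equally sized matrices."""
    return [[u + scale * v for u, v in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


points = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]
pairs = [(0, 1, +1), (0, 2, -1)]          # indices into the labeled subset
labeled = points[:3]

# Each partition computes a partial sum independently; the partial sums are
# then added, which is what makes the unsupervised part easy to parallelize.
partials = [outer_sum(points[:2]), outer_sum(points[2:])]
unsupervised = add(partials[0], partials[1])

eta = 0.5                                  # assumed regularization weight
M = add(supervised_part(labeled, pairs), unsupervised, scale=eta)
```

    Because the unsupervised term is a plain sum over data points, the per-partition results combine by addition, so only small D-by-D matrices need to be exchanged between workers.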

  15. Assessment of work-integrated learning: comparison of the usage of a grading rubric by supervising radiographers and teachers

    Energy Technology Data Exchange (ETDEWEB)

    Kilgour, Andrew J, E-mail: akilgour@csu.edu.au [Charles Sturt University, Wagga Wagga, NSW (Australia); Kilgour, Peter W [Avondale College of Higher Education, Cooranbong, NSW (Australia); Gerzina, Tania [Dental Educational Research, Faculty of Dentistry, Jaw Function and Orofacial Pain Research Unit, Westmead Centre for Oral Health, C24- Westmead Hospital, The University of Sydney, Sydney, NSW, 2006 (Australia); Christian, Beverly [Avondale College of Higher Education, Cooranbong, NSW (Australia); Charles Sturt University, Wagga Wagga, NSW (Australia)

    2014-02-15

    Introduction: Professional work-integrated learning (WIL) that integrates the academic experience with off-campus professional experience placements is an integral part of many tertiary courses. Issues with the reliability and validity of assessment grades in these placements suggest that there is a need to strengthen the level of academic rigour of placements in these programmes. This study aims to compare the attitudes to the usage of assessment rubrics of radiographers supervising medical imaging students and teachers supervising pre-service teachers. Methods: WIL placement assessment practices in two programmes, pre-service teacher training (Avondale College of Higher Education, NSW) and medical diagnostic radiography (Faculty of Health Sciences, University of Sydney, NSW), were compared with a view to comparing assessment strategies across these two different educational domains. Educators (course coordinators) responsible for teaching professional development placements of teacher trainees and diagnostic radiography students developed a standards-based grading rubric designed to guide assessment of students’ work during WIL placement by assessors. After ∼12 months of implementation of the rubrics, assessors’ reaction to the effectiveness and usefulness of the grading rubric was determined using a specially created survey form. Data were collected over the period from March to June 2011. Quantitative and qualitative data found that assessors in both programmes considered the grading rubric to be a vital tool in the assessment process, though teacher supervisors were more positive about the benefits of its use than the radiographer supervisors. Results: Benefits of the grading rubric included accuracy and consistency of grading, ability to identify specific areas of desired development and facilitation of the provision of supervisor feedback. The use of assessment grading rubrics is of benefit to assessors in WIL placements from two very different

  16. Assessment of work-integrated learning: comparison of the usage of a grading rubric by supervising radiographers and teachers.

    Science.gov (United States)

    Kilgour, Andrew J; Kilgour, Peter W; Gerzina, Tania; Christian, Beverly

    2014-02-01

    Introduction: Professional work-integrated learning (WIL) that integrates the academic experience with off-campus professional experience placements is an integral part of many tertiary courses. Issues with the reliability and validity of assessment grades in these placements suggest that there is a need to strengthen the level of academic rigour of placements in these programmes. This study aims to compare the attitudes to the usage of assessment rubrics of radiographers supervising medical imaging students and teachers supervising pre-service teachers. Methods: WIL placement assessment practices in two programmes, pre-service teacher training (Avondale College of Higher Education, NSW) and medical diagnostic radiography (Faculty of Health Sciences, University of Sydney, NSW), were compared with a view to comparing assessment strategies across these two different educational domains. Educators (course coordinators) responsible for teaching professional development placements of teacher trainees and diagnostic radiography students developed a standards-based grading rubric designed to guide assessment of students' work during WIL placement by assessors. After ∼12 months of implementation of the rubrics, assessors' reaction to the effectiveness and usefulness of the grading rubric was determined using a specially created survey form. Data were collected over the period from March to June 2011. Quantitative and qualitative data found that assessors in both programmes considered the grading rubric to be a vital tool in the assessment process, though teacher supervisors were more positive about the benefits of its use than the radiographer supervisors. Results: Benefits of the grading rubric included accuracy and consistency of grading, ability to identify specific areas of desired development and facilitation of the provision of supervisor feedback. The use of assessment grading rubrics is of benefit to assessors in WIL placements from two very different teaching

  17. Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation.

    Directory of Open Access Journals (Sweden)

    Chao Wei

    Full Text Available Topic models and neural networks can discover meaningful low-dimensional latent representations of text corpora; as such, they have become a key technology of document representation. However, such models presume all documents are non-discriminatory, resulting in latent representation dependent upon all other documents and an inability to provide discriminative document representation. To address this problem, we propose a semi-supervised manifold-inspired autoencoder to extract meaningful latent representations of documents, taking the local perspective that the latent representation of nearby documents should be correlative. We first determine the discriminative neighbors set with Euclidean distance in observation spaces. Then, the autoencoder is trained by joint minimization of the Bernoulli cross-entropy error between input and output and the sum of the square error between neighbors of input and output. The results of two widely used corpora show that our method yields at least a 15% improvement in document clustering and a nearly 7% improvement in classification tasks compared to comparative methods. The evidence demonstrates that our method can readily capture more discriminative latent representation of new documents. Moreover, some meaningful combinations of words can be efficiently discovered by activating features that promote the comprehensibility of latent representation.

  18. Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation.

    Science.gov (United States)

    Wei, Chao; Luo, Senlin; Ma, Xincheng; Ren, Hao; Zhang, Ji; Pan, Limin

    2016-01-01

    Topic models and neural networks can discover meaningful low-dimensional latent representations of text corpora; as such, they have become a key technology of document representation. However, such models presume all documents are non-discriminatory, resulting in latent representation dependent upon all other documents and an inability to provide discriminative document representation. To address this problem, we propose a semi-supervised manifold-inspired autoencoder to extract meaningful latent representations of documents, taking the local perspective that the latent representation of nearby documents should be correlative. We first determine the discriminative neighbors set with Euclidean distance in observation spaces. Then, the autoencoder is trained by joint minimization of the Bernoulli cross-entropy error between input and output and the sum of the square error between neighbors of input and output. The results of two widely used corpora show that our method yields at least a 15% improvement in document clustering and a nearly 7% improvement in classification tasks compared to comparative methods. The evidence demonstrates that our method can readily capture more discriminative latent representation of new documents. Moreover, some meaningful combinations of words can be efficiently discovered by activating features that promote the comprehensibility of latent representation.
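    The joint objective described in this abstract can be sketched as follows. The notation is assumed here and does not appear in the abstract: $x_i \in [0,1]^d$ is a document vector, $\hat{x}_i$ its reconstruction by the autoencoder, $N(i)$ the set of discriminative neighbors found by Euclidean distance, and $\lambda$ a hypothetical trade-off weight. The exact form used by the authors may differ.

```latex
\mathcal{L} \;=\; \sum_{i=1}^{n} \Big[
  -\sum_{k=1}^{d} \big( x_{ik}\log \hat{x}_{ik} + (1 - x_{ik})\log(1 - \hat{x}_{ik}) \big)
  \;+\; \lambda \sum_{j \in N(i)} \lVert x_j - \hat{x}_i \rVert_2^2
\Big]
```

    The first term is the Bernoulli cross-entropy between input and output; the second pulls each reconstruction toward the neighbors of its input, which is what would make the latent representations of nearby documents correlated.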

  19. SAR Target Recognition via Supervised Discriminative Dictionary Learning and Sparse Representation of the SAR-HOG Feature

    Directory of Open Access Journals (Sweden)

    Shengli Song

    2016-08-01

    Full Text Available Automatic target recognition (ATR) in synthetic aperture radar (SAR) images plays an important role in both national defense and civil applications. Although many methods have been proposed, SAR ATR is still very challenging due to the complex application environment. Feature extraction and classification are key points in SAR ATR. In this paper, we first design a novel feature, which is a histogram of oriented gradients (HOG)-like feature for SAR ATR (called SAR-HOG). Then, we propose a supervised discriminative dictionary learning (SDDL) method to learn a discriminative dictionary for SAR ATR and propose a strategy to simplify the optimization problem. Finally, we propose a SAR ATR classifier based on SDDL and sparse representation (called SDDLSR), in which both the reconstruction error and the classification error are considered. Extensive experiments are performed on the MSTAR database under standard operating conditions and extended operating conditions. The experimental results show that SAR-HOG can reliably capture the structures of targets in SAR images, and SDDL can further capture subtle differences among the different classes. By virtue of the SAR-HOG feature and SDDLSR, the proposed method achieves state-of-the-art performance on the MSTAR database. Especially for the extended operating conditions (EOC) scenario “Training 17°-Testing 45°”, the proposed method improves remarkably with respect to previous works.

  20. Supervised practice in occupational therapy in a psychosocial care center: Challenges for the assistance and the teaching and learning process

    Directory of Open Access Journals (Sweden)

    Milton Carlos Mariotti

    2014-09-01

    Full Text Available The psychiatric reform in Brazil has replaced the hospital-centered model with the reintegration of users into their respective communities. The Center of Psychosocial Care (CAPS) has been the main facility in that scope. Objectives: To report the development of Supervised Practice in Occupational Therapy in a CAPS II unit in Curitiba, Parana state, Brazil. Methods: This is an experience report. It characterizes the training field and describes the stages of the teaching and learning process, which involved institutional observation, reporting and an intervention proposal, and collecting data about the users’ profile and attendance. The work focused on the non-intensive users because they are close to hospital discharge. Results: We found that users of the non-intensive system, rather than craving discharge, would like to return to the semi-intensive or intensive systems, aiming to regain sickness and transportation benefits, which are lost as users make progress. This fact denotes great contradictions in the system. We also attended to users of the intensive and semi-intensive systems. Conclusions: The students’ learning included aspects such as direct contact with the institutional reality; knowledge about the health system, its limitations and contradictions; approach to users, their families, realities, socioeconomic conditions, desires, aspirations, or lack thereof; difficulties in engaging in meaningful occupations in their territories, limitations, and social stigma; working with frustrations, reflecting about ways to change the reality; in addition to expanded clinical practice, participating in the discussions and formulation of public policies on mental healthcare and social control.

  1. Insights in reinforcement learning: formal analysis and empirical evaluation of temporal-difference learning algorithms

    NARCIS (Netherlands)

    van Hasselt, H.P.

    2011-01-01

    A key aspect of artificial intelligence is the ability to learn from experience. If examples of correct solutions exist, supervised learning techniques can be used to predict what the correct solution will be for future observations. However, often such examples are not readily available. The field

  2. Learning Probabilistic Models of Word Sense Disambiguation

    CERN Document Server

    Pedersen, Ted

    1998-01-01

    This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised case, the Naive Bayesian model is found to perform well. An explanation for this success is presented in terms of learning rates and bias-variance decompositions.
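    To make the supervised case concrete, here is a minimal sketch of a Naive Bayes word sense classifier with add-one smoothing. It is an illustration only, not the dissertation's implementation: the function names, the bag-of-context-words features and the toy training data for the ambiguous word "bank" are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count senses and context words from (sense, context_words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def predict_sense(model, words):
    """Return the sense with the highest smoothed log-posterior."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in words:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical labeled contexts for the ambiguous word "bank".
toy_data = [
    ("finance", ["deposit", "money", "loan"]),
    ("finance", ["account", "money", "interest"]),
    ("river", ["water", "shore", "fishing"]),
    ("river", ["muddy", "water", "erosion"]),
]
model = train_naive_bayes(toy_data)
predicted = predict_sense(model, ["money", "loan"])
```

    The strong independence assumption keeps the number of parameters small, which is one plausible reading of why such a simple model can do well on sparse sense-tagged data.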

  3. Isospectral Manifold Learning Algorithm

    Institute of Scientific and Technical Information of China (English)

    黄运娟; 李凡长

    2013-01-01

    Manifold learning based on spectral methods has been widely used in recent years to discover low-dimensional representations embedded in high-dimensional data spaces. Isospectral manifold learning is one of the main topics within spectral methods. It stems from the conclusion that if two manifolds have the same spectrum, their internal structures are also the same. However, the difficult part of computing the spectrum is selecting the neighborhood parameter and constructing reasonable adjacency weights. To address this, a supervised technique called the isospectral manifold learning algorithm (IMLA) is proposed. By directly modifying the sparse reconstruction weight matrix, IMLA incorporates both within-class and between-class discriminative supervision information into the adjacency graph. It thus preserves the sparse reconstruction relationships among the data while also exploiting supervision information, giving it clear advantages over PCA and other algorithms. Experimental results on three commonly used face databases (Yale, ORL and Extended Yale B) demonstrate the effectiveness of IMLA.

  4. Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables.

    Science.gov (United States)

    Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco

    2016-06-27

    This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms (k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene (XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.

  5. Self-Supervised Learning to Visually Detect Terrain Surfaces for Autonomous Robots Operating in Forested Terrain

    Science.gov (United States)

    2012-01-01

    classified. Stereo algorithms can generate 3D point clouds at relatively high frequency (several hertz). However, the resulting depth map is typically ... (Journal of Field Robotics, 2012, 10.1002/rob) ... Figure 1. Experimental robot platform, (a) lateral view and (b) top view. (c) Perception ... monocular road detection in desert terrain. In Proceedings of Robotics: Science and Systems, Philadelphia, USA. Elmqvist, M. (2002). Ground surface

  6. Alignment of Custom Standards by Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Adela Sirbu

    2010-09-01

    Full Text Available Building an efficient model for automatic alignment of terminologies would bring a significant improvement to the information retrieval process. We have developed and compared two machine-learning-based algorithms that aim to align two custom standards built on a three-level taxonomy, using kNN and SVM classifiers that work on a vector representation consisting of several similarity measures. The weights used by the kNN were optimized with an evolutionary algorithm, while the SVM classifier's hyper-parameters were optimized with a grid search algorithm. The database used for training was obtained semi-automatically using the Coma++ tool. The performance of our aligners is shown by the results obtained on the test set.

  7. An Analysis of Learning Algorithms in Complex Stochastic Environments

    Science.gov (United States)

    2007-06-01

    speakers saying vowel phrases and resulted in a significant improvement in predictions during the refinement phase when contexts were added to the...with parameters for the agent, to both take actions and write the percepts it receives to a separate file. These two programs ran in tandem for...sensations, due to the recency threshold limiting the total number of percepts. A comparison of these two learning algorithms shows contrasting styles of

  8. Towards the compression of parton densities through machine learning algorithms

    CERN Document Server

    Carrazza, Stefano

    2016-01-01

    One of the most fascinating challenges in the context of parton density functions (PDFs) is the determination of the best combined PDF uncertainty from individual PDF sets. Since 2014 multiple methodologies have been developed to achieve this goal. In these proceedings we first summarize the strategy adopted by the PDF4LHC15 recommendation and then discuss a new approach to Monte Carlo PDF compression based on clustering through machine learning algorithms.

  9. [A new algorithm for NIR modeling based on manifold learning].

    Science.gov (United States)

    Hong, Ming-Jian; Wen, Zhi-Yu; Zhang, Xiao-Hong; Wen, Quan

    2009-07-01

    Manifold learning is a new kind of algorithm originating from the field of machine learning to find the intrinsic dimensionality of numerous and complex data and to extract most important information from the raw data to develop a regression or classification model. The basic assumption of the manifold learning is that the high-dimensional data measured from the same object using some devices must reside on a manifold with much lower dimensions determined by a few properties of the object. While NIR spectra are characterized by their high dimensions and complicated band assignment, the authors may assume that the NIR spectra of the same kind of substances with different chemical concentrations should reside on a manifold with much lower dimensions determined by the concentrations, according to the above assumption. As one of the best known algorithms of manifold learning, locally linear embedding (LLE) further assumes that the underlying manifold is locally linear. So, every data point in the manifold should be a linear combination of its neighbors. Based on the above assumptions, the present paper proposes a new algorithm named least square locally weighted regression (LS-LWR), which is a kind of LWR with weights determined by the least squares instead of a predefined function. Then, the NIR spectra of glucose solutions with various concentrations are measured using a NIR spectrometer and LS-LWR is verified by predicting the concentrations of glucose solutions quantitatively. Compared with the existing algorithms such as principal component regression (PCR) and partial least squares regression (PLSR), the LS-LWR has better predictability measured by the standard error of prediction (SEP) and generates an elegant model with good stability and efficiency.
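    One way to read "weights determined by the least squares" is the reconstruction step of locally linear embedding. The following is a sketch under that assumption only; the notation is introduced here for illustration and is not taken from the paper: $x$ is a query spectrum, $x_1,\dots,x_k$ are its $k$ nearest calibration spectra, and $y_1,\dots,y_k$ are their known concentrations.

```latex
w^{*} \;=\; \arg\min_{w}\ \Big\lVert x - \sum_{j=1}^{k} w_j\, x_j \Big\rVert_2^2
\quad \text{s.t.}\ \sum_{j=1}^{k} w_j = 1,
\qquad
\hat{y} \;=\; \sum_{j=1}^{k} w^{*}_j\, y_j
```

    Under this reading, the weights come from a small constrained least-squares problem solved per query rather than from a fixed kernel of the distance, and the same weights are reused to interpolate the concentration.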

  10. Behavioral Profiling of Scada Network Traffic Using Machine Learning Algorithms

    Science.gov (United States)

    2014-03-27

    encryption [37]. As an alternative to traditional classification approaches, machine learning (ML) algorithms (e.g., Naïve Bayes) have successfully used...systems, and conducting physical security surveys of remote sites. Eliminating possible backdoor entry into a SCADA network can be a daunting task...notify the master of an issue. Furthermore, SCADA protocols generally lack authentication and encryption due to operating requirements and use of

  11. Optimization in Quaternion Dynamic Systems: Gradient, Hessian, and Learning Algorithms.

    Science.gov (United States)

    Xu, Dongpo; Xia, Yili; Mandic, Danilo P

    2016-02-01

    The optimization of real scalar functions of quaternion variables, such as the mean square error or array output power, underpins many practical applications. Solutions typically require the calculation of the gradient and Hessian. However, real functions of quaternion variables are essentially nonanalytic, which is prohibitive to the development of quaternion-valued learning systems. To address this issue, we propose new definitions of quaternion gradient and Hessian, based on the novel generalized Hamilton-real (GHR) calculus, thus making possible the efficient derivation of general optimization algorithms directly in the quaternion field, rather than using the isomorphism with the real domain, as is current practice. In addition, unlike the existing quaternion gradients, the GHR calculus allows for the product and chain rule, and for a one-to-one correspondence of the novel quaternion gradient and Hessian with their real counterparts. Properties of the quaternion gradient and Hessian relevant to numerical applications are also introduced, opening a new avenue of research in quaternion optimization and greatly simplifying the derivations of learning algorithms. The proposed GHR calculus is shown to yield the same generic algorithm forms as the corresponding real- and complex-valued algorithms. Advantages of the proposed framework are illustrated by simulations in quaternion signal processing and neural networks.

  12. Evaluation of machine learning algorithms for classification of primary biological aerosol using a new UV-LIF spectrometer

    Science.gov (United States)

    Ruske, Simon; Topping, David O.; Foot, Virginia E.; Kaye, Paul H.; Stanley, Warren R.; Crawford, Ian; Morse, Andrew P.; Gallagher, Martin W.

    2017-03-01

    Characterisation of bioaerosols has important implications within environment and public health sectors. Recent developments in ultraviolet light-induced fluorescence (UV-LIF) detectors such as the Wideband Integrated Bioaerosol Spectrometer (WIBS) and the newly introduced Multiparameter Bioaerosol Spectrometer (MBS) have allowed for the real-time collection of fluorescence, size and morphology measurements for the purpose of discriminating between bacteria, fungal spores and pollen. This new generation of instruments has enabled ever larger data sets to be compiled with the aim of studying more complex environments. In real world data sets, particularly those from an urban environment, the population may be dominated by non-biological fluorescent interferents, bringing into question the accuracy of measurements of quantities such as concentrations. It is therefore imperative that we validate the performance of different algorithms which can be used for the task of classification. For unsupervised learning we tested hierarchical agglomerative clustering with various different linkages. For supervised learning, 11 methods were tested, including decision trees, ensemble methods (random forests, gradient boosting and AdaBoost), two implementations for support vector machines (libsvm and liblinear), Gaussian methods (Gaussian naïve Bayesian, quadratic and linear discriminant analysis), the k-nearest neighbours algorithm and artificial neural networks. The methods were applied to two different data sets produced using the new MBS, which provides multichannel UV-LIF fluorescence signatures for single airborne biological particles. The first data set contained mixed PSLs and the second contained a variety of laboratory-generated aerosol. Clustering in general performs slightly worse than the supervised learning methods, correctly classifying, at best, only 67.6 and 91.1 % for the two data sets respectively. For supervised learning the gradient boosting algorithm was

  13. On-line learning algorithms for locally recurrent neural networks.

    Science.gov (United States)

    Campolucci, P; Uncini, A; Piazza, F; Rao, B D

    1999-01-01

    This paper focuses on on-line learning procedures for locally recurrent neural networks with emphasis on multilayer perceptron (MLP) with infinite impulse response (IIR) synapses and its variations which include generalized output and activation feedback multilayer networks (MLN's). We propose a new gradient-based procedure called recursive backpropagation (RBP) whose on-line version, causal recursive backpropagation (CRBP), presents some advantages with respect to the other on-line training methods. The new CRBP algorithm includes as particular cases backpropagation (BP), temporal backpropagation (TBP), backpropagation for sequences (BPS), Back-Tsoi algorithm among others, thereby providing a unifying view on gradient calculation techniques for recurrent networks with local feedback. The only learning method that has been proposed for locally recurrent networks with no architectural restriction is the one by Back and Tsoi. The proposed algorithm has better stability and higher speed of convergence with respect to the Back-Tsoi algorithm, which is supported by the theoretical development and confirmed by simulations. The computational complexity of the CRBP is comparable with that of the Back-Tsoi algorithm, e.g., less than a factor of 1.5 for usual architectures and parameter settings. The superior performance of the new algorithm, however, easily justifies this small increase in computational burden. In addition, the general paradigms of truncated BPTT and RTRL are applied to networks with local feedback and compared with the new CRBP method. The simulations show that CRBP exhibits similar performances and the detailed analysis of complexity reveals that CRBP is much simpler and easier to implement, e.g., CRBP is local in space and in time while RTRL is not local in space.

  14. Collective academic supervision

    DEFF Research Database (Denmark)

    Nordentoft, Helle Merete; Thomsen, Rie; Wichmann-Hansen, Gitte

    2013-01-01

    are interconnected. Collective Academic Supervision provides possibilities for systematic interaction between individual master students in their writing process. In this process they learn core academic competencies, such as the ability to assess theoretical and practical problems in their practice and present them...

  15. Supervision of Teachers Based on Adjusted Arithmetic Learning in Special Education

    Science.gov (United States)

    Eriksson, Gota

    2008-01-01

    This article reports on 20 children's learning in arithmetic after teaching was adjusted to their conceptual development. The report covers periods from three months up to three terms in an ongoing intervention study of teachers and children in schools for the intellectually disabled and of remedial teaching in regular schools. The researcher…

  16. New Iterative Learning Control Algorithms Based on Vector Plots Analysis

    Institute of Scientific and Technical Information of China (English)

    XIE Sheng-Li; TIAN Sen-Ping; XIE Zhen-Dong

    2004-01-01

    Based on vector plots analysis, this paper studies the geometric framework of the iterative learning control method. A new structure for iterative learning algorithms is obtained by analyzing the vector plots of some general algorithms. The structure of the new algorithm differs from those of existing algorithms, and it has faster convergence and higher accuracy. The simulations presented here illustrate the effectiveness and advantages of the new algorithm.

  17. Clinical supervision.

    Science.gov (United States)

    Goorapah, D

    1997-05-01

    The introduction of clinical supervision to a wider sphere of nursing is being considered from a professional and organizational point of view. Positive views are being expressed about adopting this concept, although there are indications to suggest that there are also strong reservations. This paper examines the potential for its success amidst the scepticism that exists. One important question raised is whether clinical supervision will replace or run alongside other support systems.

  18. Bias Modeling for Distantly Supervised Relation Extraction

    Directory of Open Access Journals (Sweden)

    Yang Xiang

    2015-01-01

    Full Text Available Distant supervision (DS) automatically annotates free text with relation mentions from existing knowledge bases (KBs), providing a way to alleviate the problem of insufficient training data for relation extraction in natural language processing (NLP). However, the heuristic annotation process does not guarantee the correctness of the generated labels, raising an active research question about how to make efficient use of the noisy training data. In this paper, we model two types of biases to reduce noise: (1) bias-dist to model the relative distance between points (instances) and classes (relation centers); (2) bias-reward to model the possibility of each heuristically generated label being incorrect. Based on the biases, we propose three noise-tolerant models: MIML-dist, MIML-dist-classify, and MIML-reward, building on top of a state-of-the-art distantly supervised learning algorithm. Experimental evaluations compared with three landmark methods on the KBP dataset validate the effectiveness of the proposed methods.
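    The heuristic annotation step, and the label noise it produces, can be illustrated with a small sketch. Everything here is hypothetical and chosen for the example (the toy knowledge base, the entity names, the `distant_label` helper and the substring matching); it is not the paper's pipeline, which works on the KBP dataset.

```python
def distant_label(sentences, kb):
    """Label a sentence with every KB relation whose entity pair it mentions."""
    labeled = []
    for text in sentences:
        for (head, tail), relation in kb.items():
            # Heuristic: co-occurrence of both entities implies the relation.
            if head in text and tail in text:
                labeled.append((text, head, tail, relation))
    return labeled


# Hypothetical knowledge base with a single fact.
KB = {("Alice", "Springfield"): "born_in"}

sentences = [
    "Alice was born in Springfield in 1970.",          # correct label
    "Alice gave a lecture in Springfield last week.",  # noisy label
    "Bob moved to Springfield.",                       # no entity pair match
]
labeled = distant_label(sentences, KB)
```

    The second sentence receives the `born_in` label although it does not express that relation, which is the kind of incorrect heuristic label that a noise model such as bias-reward is meant to account for.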

  19. Sparse kernel learning with LASSO and Bayesian inference algorithm.

    Science.gov (United States)

    Gao, Junbin; Kwan, Paul W; Shi, Daming

    2010-03-01

    Kernelized LASSO (Least Absolute Shrinkage and Selection Operator) has been investigated in two separate recent papers [Gao, J., Antolovich, M., & Kwan, P. H. (2008). L1 LASSO and its Bayesian inference. In W. Wobcke, & M. Zhang (Eds.), Lecture notes in computer science: Vol. 5360 (pp. 318-324); Wang, G., Yeung, D. Y., & Lochovsky, F. (2007). The kernel path in kernelized LASSO. In International conference on artificial intelligence and statistics (pp. 580-587). San Juan, Puerto Rico: MIT Press]. This paper is concerned with learning kernels under the LASSO formulation by adopting a generative Bayesian learning and inference approach. A new robust learning algorithm is proposed which produces a sparse kernel model with the capability of learning regularized parameters and kernel hyperparameters. A comparison with state-of-the-art methods for constructing sparse regression models such as the relevance vector machine (RVM) and the local regularization assisted orthogonal least squares regression (LROLS) is given. The new algorithm is also demonstrated to possess considerable computational advantages.

  20. Exploration Of Deep Learning Algorithms Using OpenACC Parallel Programming Model

    KAUST Repository

    Hamam, Alwaleed A.

    2017-03-13

    Deep learning is based on a set of algorithms that attempt to model high-level abstractions in data. Specifically, the Restricted Boltzmann Machine (RBM) is the deep learning algorithm used in this project, whose time performance is improved through an efficient parallel implementation with OpenACC, applying the best possible optimizations to harness the massively parallel power of NVIDIA GPUs. GPU development in the last few years has contributed to the growth of deep learning. OpenACC is a directive-based approach to computing in which directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Machine is a stochastic neural network that essentially performs a binary version of factor analysis. The RBM is a useful neural network basis for larger modern deep learning models, such as the Deep Belief Network. RBM parameters are estimated using an efficient training method called Contrastive Divergence. Parallel implementations of RBM are available using different models such as OpenMP and CUDA, but this project has been the first attempt to apply the OpenACC model to RBM.

  1. Physical Realization of a Supervised Learning System Built with Organic Memristive Synapses

    Science.gov (United States)

    Lin, Yu-Pu; Bennett, Christopher H.; Cabaret, Théo; Vodenicarevic, Damir; Chabi, Djaafar; Querlioz, Damien; Jousselme, Bruno; Derycke, Vincent; Klein, Jacques-Olivier

    2016-09-01

    Multiple modern applications of electronics call for inexpensive chips that can perform complex operations on natural data with limited energy. A vision for accomplishing this is implementing hardware neural networks, which fuse computation and memory, with low-cost organic electronics. A challenge, however, is the implementation of synapses (analog memories) composed of such materials. In this work, we introduce robust, rapidly programmable, nonvolatile organic memristive nanodevices based on electrografted redox complexes that implement synapses thanks to a wide range of accessible intermediate conductivity states. We demonstrate experimentally an elementary neural network, capable of learning functions, which combines four pairs of organic memristors as synapses and conventional electronics as neurons. Our architecture is highly resilient to issues caused by imperfect devices. It tolerates inter-device variability, and an adaptable learning rule offers immunity against asymmetries in device switching. Highly compliant with conventional fabrication processes, the system can be extended to larger computing systems capable of complex cognitive tasks, as demonstrated in complementary simulations.

  2. An augmented Lagrangian multi-scale dictionary learning algorithm

    Directory of Open Access Journals (Sweden)

    Ye Meng

    2011-01-01

    Full Text Available Learning overcomplete dictionaries for sparse signal representation has become a hot topic that has attracted many researchers in recent years, while most of the existing approaches have a serious problem in that they always lead to local minima. In this article, we present a novel augmented Lagrangian multi-scale dictionary learning algorithm (ALM-DL), which is achieved by first recasting the constrained dictionary learning problem into an AL scheme, and then updating the dictionary after each inner iteration of the scheme, during which a majorization-minimization technique is employed for solving the inner subproblem. Refining the dictionary from low scale to high makes the proposed method less dependent on the initial dictionary, hence avoiding local optima. Numerical tests for synthetic data and denoising applications on real images demonstrate the superior performance of the proposed approach.

  3. A Comparison of the Effects of K-Anonymity on Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Hayden Wimmer

    2014-11-01

    Full Text Available While research has been conducted on machine learning algorithms and on privacy-preserving data mining (PPDM), a gap exists in the literature on combining these areas to determine how PPDM affects common machine learning algorithms. The aim of this research is to narrow this gap by investigating how a common PPDM algorithm, K-Anonymity, affects common machine learning and data mining algorithms, namely neural networks, logistic regression, decision trees, and Bayesian classifiers. This applied research reveals practical implications for applying PPDM to data mining and machine learning and serves as a critical first step in learning how to apply PPDM to machine learning algorithms and in understanding the effects of PPDM on machine learning. Results indicate that certain machine learning algorithms are more suited for use with PPDM techniques.
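    For readers unfamiliar with K-Anonymity, the property can be checked in a few lines: a table is k-anonymous if every combination of quasi-identifier values is shared by at least k records. The sketch below is a minimal illustration with made-up records and hypothetical helper names (`is_k_anonymous`, `generalize_age`); it does not reproduce the anonymization procedure or the data used in the study.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if each quasi-identifier combination occurs in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a range such as '30-39' (simple generalization)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Made-up records; "age" and "zip" are treated as quasi-identifiers.
raw = [
    {"age": 31, "zip": "130**", "diagnosis": "flu"},
    {"age": 35, "zip": "130**", "diagnosis": "asthma"},
    {"age": 42, "zip": "148**", "diagnosis": "flu"},
    {"age": 47, "zip": "148**", "diagnosis": "diabetes"},
]
generalized = [generalize_age(r) for r in raw]

raw_ok = is_k_anonymous(raw, ["age", "zip"], k=2)
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

    Generalizing ages into ten-year bands makes the toy table 2-anonymous, but it also coarsens a feature that a downstream classifier might rely on, which is the trade-off the study examines.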

  4. VDES J2325-5229 a z=2.7 gravitationally lensed quasar discovered using morphology independent supervised machine learning

    CERN Document Server

    Ostrovski, Fernanda; Connolly, Andrew J; Lemon, Cameron A; Auger, Matthew W; Banerji, Manda; Hung, Johnathan M; Koposov, Sergey E; Lidman, Christopher E; Reed, Sophie L; Allam, Sahar; Benoit-Lévy, Aurélien; Bertin, Emmanuel; Brooks, David; Buckley-Geer, Elizabeth; Rosell, Aurelio Carnero; Kind, Matias Carrasco; Carretero, Jorge; Cunha, Carlos E; da Costa, Luiz N; Desai, Shantanu; Diehl, H Thomas; Dietrich, Jörg P; Evrard, August E; Finley, David A; Flaugher, Brenna; Fosalba, Pablo; Frieman, Josh; Gerdes, David W; Goldstein, Daniel A; Gruen, Daniel; Gruendl, Robert A; Gutierrez, Gaston; Honscheid, Klaus; James, David J; Kuehn, Kyler; Kuropatkin, Nikolay; Lima, Marcos; Lin, Huan; Maia, Marcio A G; Marshall, Jennifer L; Martini, Paul; Melchior, Peter; Miquel, Ramon; Ogando, Ricardo; Malagón, Andrés Plazas; Reil, Kevin; Romer, Kathy; Sanchez, Eusebio; Santiago, Basilio; Scarpine, Vic; Sevilla-Noarbe, Ignacio; Soares-Santos, Marcelle; Sobreira, Flavia; Suchyta, Eric; Tarle, Gregory; Thomas, Daniel; Tucker, Douglas L; Walker, Alistair R

    2016-01-01

    We present the discovery and preliminary characterization of a gravitationally lensed quasar with a source redshift $z_{s}=2.74$ and image separation of $2.9"$ lensed by a foreground $z_{l}=0.40$ elliptical galaxy. Since the images of gravitationally lensed quasars are the superposition of multiple point sources and a foreground lensing galaxy, we have developed a morphology independent multi-wavelength approach to the photometric selection of lensed quasar candidates based on Gaussian Mixture Models (GMM) supervised machine learning. Using this technique and $gi$ multicolour photometric observations from the Dark Energy Survey (DES), near IR $JK$ photometry from the VISTA Hemisphere Survey (VHS) and WISE mid IR photometry, we have identified a candidate system with two catalogue components with $i_{AB}=18.61$ and $i_{AB}=20.44$ comprised of an elliptical galaxy and two blue point sources. Spectroscopic follow-up with NTT and the use of an archival AAT spectrum show that the point sources can be identified as...

  5. Extendable supervised dictionary learning for exploring diverse and concurrent brain activities in task-based fMRI.

    Science.gov (United States)

    Zhao, Shijie; Han, Junwei; Hu, Xintao; Jiang, Xi; Lv, Jinglei; Zhang, Tuo; Zhang, Shu; Guo, Lei; Liu, Tianming

    2017-06-09

    Recently, a growing body of studies has demonstrated the simultaneous existence of diverse brain activities, e.g., task-evoked dominant response activities, delayed response activities and intrinsic brain activities, under specific task conditions. However, the currently dominant task-based functional magnetic resonance imaging (tfMRI) analysis approach, i.e., the general linear model (GLM), might have difficulty in sufficiently discovering those diverse and concurrent brain responses. This subtraction-based, model-driven approach focuses on the brain activities evoked directly by the task paradigm, and thus likely overlooks other concurrent brain activities evoked during information processing. To deal with this problem, we propose a novel hybrid framework, called extendable supervised dictionary learning (E-SDL), to explore diverse and concurrent brain activities under task conditions. A critical difference between the E-SDL framework and previous methods is that we systematically extend the basic task paradigm regressor into meaningful regressor groups to account for possible regressor variation during information processing in the brain. Application of the proposed framework to five independent and publicly available tfMRI datasets from the Human Connectome Project (HCP) simultaneously revealed more meaningful group-wise consistent task-evoked networks and common intrinsic connectivity networks (ICNs). These results demonstrate the advantage of the proposed framework in identifying the diversity of concurrent brain activities in tfMRI datasets.

  6. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    Science.gov (United States)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in the biological sciences for deciphering the mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens since the number of image data sets can often be in the hundreds of thousands. Reliable automated tools are thus required to analyse the fluorescence microscopy image data sets, which usually contain two or more reaction channels. The image analysis tool presented here is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels and the investigated cells can appear in different phenotypes. The main issues of the image processing task are an automatic cell segmentation, which has to be robust and accurate for all phenotypes, and a subsequent phenotype classification. The cell segmentation is done in two steps: the cell nuclei are segmented first, and a classifier-enhanced region growing on the basis of the cell nuclei is then used to segment the cells. The classification of the cells is realized by a support vector machine which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant, allowing for different staining quality, and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied to an RNAi screen containing three hundred thousand image data sets, and the SVM-extended version is designed for additional screens.

  7. A spatio-temporal latent atlas for semi-supervised learning of fetal brain segmentations and morphological age estimation.

    Science.gov (United States)

    Dittrich, Eva; Riklin Raviv, Tammy; Kasprian, Gregor; Donner, René; Brugger, Peter C; Prayer, Daniela; Langs, Georg

    2014-01-01

    Prenatal neuroimaging requires reference models that reflect the normal spectrum of fetal brain development, and summarize observations from a representative sample of individuals. Collecting a sufficiently large data set of manually annotated data to construct a comprehensive in vivo atlas of rapidly developing structures is challenging but necessary for large population studies and clinical application. We propose a method for the semi-supervised learning of a spatio-temporal latent atlas of fetal brain development, and corresponding segmentations of emerging cerebral structures, such as the ventricles or cortex. The atlas is based on the annotation of a few examples, and a large amount of imaging data without annotation. It models the morphological and developmental variability across the population. Furthermore, it serves as a basis for the estimation of a structure's morphological age, and its deviation from the nominal gestational age during the assessment of pathologies. Experimental results covering the gestational period of 20-30 gestational weeks demonstrate segmentation accuracy achievable with minimal annotation, and precision of morphological age estimation. Age estimation results on fetuses suffering from lissencephaly demonstrate that they detect significant differences in the age offset compared to a control group. Copyright © 2013. Published by Elsevier B.V.

  8. Predicting the Ecological Quality Status of Marine Environments from eDNA Metabarcoding Data Using Supervised Machine Learning.

    Science.gov (United States)

    Cordier, Tristan; Esling, Philippe; Lejzerowicz, Franck; Visco, Joana; Ouadahi, Amine; Martins, Catarina; Cedhagen, Tomas; Pawlowski, Jan

    2017-08-15

    Monitoring biodiversity is essential to assess the impacts of increasing anthropogenic activities in marine environments. Traditionally, marine biomonitoring involves the sorting and morphological identification of benthic macro-invertebrates, which is time-consuming and demands taxonomic expertise. High-throughput amplicon sequencing of environmental DNA (eDNA metabarcoding) represents a promising alternative for benthic monitoring. However, an important fraction of eDNA sequences remains unassigned or belongs to taxa of unknown ecology, which prevents their use for assessing the ecological quality status. Here, we show that supervised machine learning (SML) can be used to build robust predictive models for benthic monitoring, regardless of the taxonomic assignment of eDNA sequences. We tested three SML approaches to assess the environmental impact of marine aquaculture using eDNA of benthic foraminifera, a group of unicellular eukaryotes known to be good bioindicators, as features to infer macro-invertebrate-based biotic indices. We found an ecological status similar to that obtained from macro-invertebrate inventories. We argue that SML approaches could overcome and even bypass the costly and time-demanding morpho-taxonomic approaches in future biomonitoring.

  9. Translation and validation of the clinical learning environment, supervision and nurse teacher scale (CLES + T) in Croatian language.

    Science.gov (United States)

    Lovrić, Robert; Piškorjanac, Silvija; Pekić, Vlasta; Vujanić, Jasenka; Ratković, Karolina Kramarić; Luketić, Suzana; Plužarić, Jadranka; Matijašić-Bodalec, Dubravka; Barać, Ivana; Žvanut, Boštjan

    2016-07-01

    Clinical practice is essential to nursing education as it provides experience with patients and work environments that prepare students for future work as nurses. The aim of this study was to translate the "Clinical Learning Environment, Supervision and Nurse Teacher" questionnaire into the Croatian language and test its validity and reliability in practice. The study was performed at the Faculty of Medicine, Josip Juraj Strossmayer University of Osijek, Croatia in April 2014. The translated questionnaire was submitted to 136 nursing students: 20 males and 116 females. Our results reflected a slightly different factor structure, consisting of four factors. All translated items of the original constructs "Supervisory relationship", "Role of nurse teacher" and "Leadership style of the ward manager" loaded on factor 1. Items of "Pedagogical atmosphere on the ward" were distributed across two factors (3 and 4). The items of "Premises of nursing on the ward" loaded on factor 2. Three items were identified as problematic and iteratively removed from the analysis. The translated version of the aforementioned questionnaire has properties suitable for the evaluation of clinical practice for nursing students within a Croatian context and reflects the specifics of nursing clinical education in this country.

  10. Application of graph-based semi-supervised learning for development of cyber COP and network intrusion detection

    Science.gov (United States)

    Levchuk, Georgiy; Colonna-Romano, John; Eslami, Mohammed

    2017-05-01

    The United States increasingly relies on cyber-physical systems to conduct military and commercial operations. Attacks on these systems have increased dramatically around the globe. The attackers constantly change their methods, making state-of-the-art commercial and military intrusion detection systems ineffective. In this paper, we present a model to identify functional behavior of network devices from netflow traces. Our model includes two innovations. First, we define novel features for a host IP using detection of application graph patterns in IP's host graph constructed from 5-min aggregated packet flows. Second, we present the first application, to the best of our knowledge, of Graph Semi-Supervised Learning (GSSL) to the space of IP behavior classification. Using a cyber-attack dataset collected from NetFlow packet traces, we show that GSSL trained with only 20% of the data achieves higher attack detection rates than Support Vector Machines (SVM) and Naïve Bayes (NB) classifiers trained with 80% of data points. We also show how to improve detection quality by filtering out web browsing data, and conclude with discussion of future research directions.
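
    The idea behind graph-based semi-supervised learning can be illustrated with a minimal label-propagation sketch. This is not the GSSL model of the paper, whose features and graph construction are not given in the abstract; the function name `propagate_labels`, the toy chain graph, and the host names are all hypothetical. A few labeled nodes keep their scores fixed, and every unlabeled node repeatedly takes the average score of its neighbours.

```python
def propagate_labels(adjacency, seed_labels, n_iter=50):
    """Spread scores from a few labeled nodes to the rest of a graph.

    adjacency   : dict mapping node -> list of neighbouring nodes
    seed_labels : dict mapping labeled node -> score (1.0 = attack, 0.0 = benign)
    Returns a dict mapping every node to a score in [0, 1].
    """
    # Unlabeled nodes start at the neutral score 0.5 (an arbitrary choice).
    scores = {node: seed_labels.get(node, 0.5) for node in adjacency}
    for _ in range(n_iter):
        updated = {}
        for node, neighbours in adjacency.items():
            if node in seed_labels:
                # Labeled nodes are clamped to their known score.
                updated[node] = seed_labels[node]
            elif neighbours:
                # Unlabeled nodes take the mean score of their neighbours.
                updated[node] = sum(scores[n] for n in neighbours) / len(neighbours)
            else:
                updated[node] = scores[node]
        scores = updated
    return scores


# Toy chain of five hypothetical hosts: only the two end points are labeled.
chain = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
result = propagate_labels(chain, {"a": 1.0, "e": 0.0})
```

    On this chain the scores settle into a linear interpolation between the two labeled end points, so hosts closer to the known attacker receive higher scores.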

  11. Geographic atrophy segmentation in infrared and autofluorescent retina images using supervised learning.

    Science.gov (United States)

    Devisetti, K; Karnowski, T P; Giancardo, L; Li, Y; Chaum, E

    2011-01-01

    Geographic Atrophy (GA) of the retinal pigment epithelium (RPE) is an advanced form of atrophic age-related macular degeneration (AMD) and is responsible for about 20% of AMD-related legal blindness in the United States. Two different imaging modalities for retinas, infrared imaging and autofluorescence imaging, serve as interesting complementary technologies for highlighting GA. In this work we explore the use of neural network classifiers in performing segmentation of GA in registered infrared (IR) and autofluorescence (AF) images. Our segmentation achieved a performance level of 82.5% sensitivity and 92.9% specificity on a per-pixel basis using hold-one-out validation testing. The algorithm, feature extraction, data set and experimental results are discussed and shown.
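
    Per-pixel sensitivity and specificity, the two figures quoted above, can be computed from a predicted mask and a ground-truth mask as in the following sketch. The function name and the tiny flattened masks are illustrative assumptions, not data from the study.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for two equal-length binary masks.

    predicted, truth : sequences of 0/1 pixel labels (1 = lesion pixel).
    """
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # lesion, found
    fn = sum(1 for p, t in pairs if not p and t)      # lesion, missed
    tn = sum(1 for p, t in pairs if not p and not t)  # background, kept
    fp = sum(1 for p, t in pairs if p and not t)      # background, flagged
    sensitivity = tp / (tp + fn)  # fraction of true lesion pixels detected
    specificity = tn / (tn + fp)  # fraction of background pixels left alone
    return sensitivity, specificity


# Hypothetical flattened 8-pixel masks.
predicted_mask = [1, 1, 0, 0, 1, 0, 0, 0]
truth_mask     = [1, 1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(predicted_mask, truth_mask)
```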

  12. Anticipatory Driving for a Robot-Car Based on Supervised Learning

    DEFF Research Database (Denmark)

    Markelic, I.; Kulvicius, Tomas; Tamosiunaite, M.

    2009-01-01

    Using look-ahead information and plan making improves human driving. We therefore propose that autonomously driving systems should also have such abilities. We adapt a machine learning approach, where the system, a car-like robot, is trained by an experienced driver by correlating visual...... adapt a two-level approach, where the result of the database is combined with an additional reactive controller for robust behavior. Concerning velocity control, this paper makes a novel contribution, which is the ability of the system to react adequately to upcoming curves...

  14. The threshold EM algorithm for parameter learning in bayesian network with incomplete data

    CERN Document Server

    Lamine, Fradj Ben; Mahjoub, Mohamed Ali

    2012-01-01

    Bayesian networks (BN) are used in a wide range of applications, but parameter learning remains an issue. In real applications, training data are always incomplete or some nodes are hidden. To deal with this problem, many parameter learning algorithms have been suggested, notably the EM, Gibbs sampling and RBE algorithms. In order to limit the search space and escape from the local maxima produced by the EM algorithm, this paper presents a parameter learning algorithm that is a fusion of the EM and RBE algorithms. This algorithm incorporates the range of a parameter into the EM algorithm. The range is calculated by the first step of the RBE algorithm, allowing a regularization of each parameter in the Bayesian network after the maximization step of the EM algorithm. The threshold EM algorithm is applied to brain tumor diagnosis and shows some advantages and disadvantages over the EM algorithm.
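
    The general idea of constraining EM to a parameter range can be sketched on the simplest possible case, a single binary variable with missing observations. This toy is an assumption-laden illustration and not the authors' algorithm: the bounds are taken from the two extreme completions of the missing values (all zeros, all ones), which is only a stand-in for the range that the first step of RBE would supply, and the name `threshold_em` is hypothetical.

```python
def threshold_em(data, theta=0.5, n_iter=50):
    """Estimate P(X = 1) from 0/1 data in which None marks a missing value.

    After every M-step the estimate is clamped to [lower, upper], the range
    obtained by completing all missing values with 0 or with 1 respectively.
    """
    ones = sum(1 for v in data if v == 1)
    missing = sum(1 for v in data if v is None)
    total = len(data)
    lower = ones / total               # every missing value assumed to be 0
    upper = (ones + missing) / total   # every missing value assumed to be 1
    for _ in range(n_iter):
        expected_ones = ones + missing * theta  # E-step: expected count of ones
        theta = expected_ones / total           # M-step: maximum-likelihood update
        theta = min(max(theta, lower), upper)   # threshold step: stay in range
    return theta, (lower, upper)


# Hypothetical sample: three ones, one zero, two missing values.
estimate, (low, high) = threshold_em([1, 1, 1, 0, None, None])
```

    With data missing completely at random, the estimate converges to the observed frequency (3 of 4 here), and the clamp only matters when an iterate falls outside the admissible range, for example after a poor initial guess.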

  15. Rough Set Assisted Meta-Learning Method to Select Learning Algorithms

    Institute of Scientific and Technical Information of China (English)

    Lisa Fan; Min-xiao Lei

    2006-01-01

    In this paper, we propose a Rough Set assisted Meta-Learning method for selecting the most suitable machine-learning algorithms for a new dataset with minimal effort. A k-Nearest Neighbor (k-NN) algorithm is used to identify the most similar datasets on which all of the candidate algorithms have already been run. The performance of the candidate algorithms on these most similar datasets is then used to generate a recommendation for the user. The performance is derived from a multi-criteria evaluation measure, ARR, which combines both accuracy and time. Furthermore, by applying Rough Set theory, we can find the redundant properties of the dataset. Thus, we can speed up the ranking process and increase the accuracy by using the reduct of the meta attributes.
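
    The k-NN recommendation step can be sketched as follows. The meta-features, algorithm names and scores are invented for illustration, and a single generic score per algorithm stands in for the ARR measure, whose exact formula is not given in the abstract.

```python
import math


def recommend(new_meta, known, k=2):
    """Rank algorithms for a new dataset using its k most similar datasets.

    new_meta : tuple of meta-features describing the new dataset
    known    : list of (meta_features, {algorithm_name: score}) pairs
    Returns algorithm names ordered from most to least promising.
    """
    # Find the k known datasets closest to the new one in meta-feature space.
    nearest = sorted(known, key=lambda item: math.dist(new_meta, item[0]))[:k]
    # Sum each algorithm's score over those neighbours.
    totals = {}
    for _, scores in nearest:
        for algorithm, score in scores.items():
            totals[algorithm] = totals.get(algorithm, 0.0) + score
    return sorted(totals, key=lambda name: totals[name], reverse=True)


# Hypothetical meta-features: (number of instances, number of attributes).
history = [
    ((100, 5), {"tree": 0.90, "knn": 0.70}),
    ((120, 6), {"tree": 0.80, "knn": 0.75}),
    ((10000, 50), {"tree": 0.50, "knn": 0.95}),
]
ranking = recommend((110, 5), history, k=2)
```

    In practice the meta-features would be normalised before computing distances, and removing redundant meta-attributes (the role of the rough-set reduct in the paper) would shorten every distance computation.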

  16. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    Science.gov (United States)

    Möller, A.; Ruhlmann-Kleider, V.; Leloup, C.; Neveu, J.; Palanque-Delabrouille, N.; Rich, J.; Carlberg, R.; Lidman, C.; Pritchet, C.

    2016-12-01

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high-redshifts (0.2 Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia samples. Using the Area Under the Curve (AUC) metric, where perfect classification is given by 1, we find that our best-performing classifier (Extreme Gradient Boosting Decision Tree) has an AUC of 0.98. We show that it is possible to obtain a large photometrically selected type Ia SN sample with an estimated contamination of less than 5%. When applied to data from the first three years of SNLS, we obtain 529 events. We investigate the differences between classifying simulated SNe, and real SN survey data. In particular, we find that applying a thorough set of selection cuts to the SN sample is essential for good classification. This work demonstrates for the first time the feasibility of machine learning classification in a high-z SN survey with application to real SN data.
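
    The AUC metric mentioned above equals the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that definition follows; the scores and labels are made up for illustration and have nothing to do with the SNLS data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : classifier outputs, higher means "more likely positive"
    labels : 1 for positive examples (e.g. type Ia), 0 for negatives
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


mixed = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
perfect = auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

    The pairwise form is quadratic in the number of examples, which is fine for a sketch; sorting-based implementations are preferable for large samples.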

  17. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    Science.gov (United States)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. It is becoming clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” has emerged to address the problem of automatically determining the polarity (positive, negative, neutral, …) of textual opinions. People expressing themselves in a particular domain often use domain-specific language expressions; thus, building a classifier that performs well in different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain on sentiment classification when using machine learning techniques. In our study, three popular machine learning techniques, Support Vector Machines (SVM), Naive Bayes and K nearest neighbors (KNN), were applied to datasets collected from different domains. Experimental results show that Support Vector Machines outperform the other classifiers in all domains, achieving at least 74.75% accuracy with a standard deviation of 4.08.

  18. Q-Learning-Based Adjustable Fixed-Phase Quantum Grover Search Algorithm

    Science.gov (United States)

    Guo, Ying; Shi, Wensha; Wang, Yijun; Hu, Jiankun

    2017-02-01

    We demonstrate that the rotation phase can be suitably chosen to increase the efficiency of the phase-based quantum search algorithm, leading to a dynamic balance between iterations and success probabilities of the fixed-phase quantum Grover search algorithm with Q-learning for a given number of solutions. In this search algorithm, the proposed Q-learning algorithm, which is in essence a model-free reinforcement learning strategy, is used for performing a matching algorithm based on the fraction of marked items λ and the rotation phase α. After establishing the policy function α = π(λ), we complete the fixed-phase Grover algorithm, where the phase parameter is selected via the learned policy. Simulation results show that the Q-learning-based Grover search algorithm (QLGA) requires fewer iterations and yields higher success probabilities. Compared with the conventional Grover algorithms, it avoids local optima, thereby enabling success probabilities to approach one.
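
    For reference, the trade-off between iterations and success probability in the conventional Grover algorithm (phase fixed at π, not the learned phase α of the paper) follows a closed form: with sin θ = √λ, the success probability after k iterations is sin²((2k + 1)θ). The sketch below evaluates that textbook formula; the function names are illustrative.

```python
import math


def grover_success_probability(fraction_marked, iterations):
    """Success probability of standard Grover search after `iterations` steps."""
    theta = math.asin(math.sqrt(fraction_marked))
    return math.sin((2 * iterations + 1) * theta) ** 2


def optimal_iterations(fraction_marked):
    """Iteration count floor(pi / (4 * theta)) commonly used for standard Grover."""
    theta = math.asin(math.sqrt(fraction_marked))
    return math.floor(math.pi / (4 * theta))


# One marked item in four: a single iteration already succeeds with certainty.
p_quarter = grover_success_probability(0.25, optimal_iterations(0.25))
# One marked item in 1024: about 25 iterations are needed.
k_small = optimal_iterations(1 / 1024)
p_small = grover_success_probability(1 / 1024, k_small)
```

    Running past the optimal count makes the probability fall again, which is the overshooting behaviour that phase-adjusted variants aim to control.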

  19. Using Supervised Machine Learning to Classify Real Alerts and Artifact in Online Multisignal Vital Sign Monitoring Data.

    Science.gov (United States)

    Chen, Lujie; Dubrawski, Artur; Wang, Donghan; Fiterau, Madalina; Guillame-Bert, Mathieu; Bose, Eliezer; Kaynar, Ata M; Wallace, David J; Guttendorf, Jane; Clermont, Gilles; Pinsky, Michael R; Hravnak, Marilyn

    2016-07-01

    The use of machine-learning algorithms to classify alerts as real or artifacts in online noninvasive vital sign data streams to reduce alarm fatigue and missed true instability. Observational cohort study. Twenty-four-bed trauma step-down unit. Two thousand one hundred fifty-three patients. Noninvasive vital sign monitoring data (heart rate, respiratory rate, peripheral oximetry) recorded on all admissions at 1/20 Hz, and noninvasive blood pressure less frequently, and partitioned data into training/validation (294 admissions; 22,980 monitoring hours) and test sets (2,057 admissions; 156,177 monitoring hours). Alerts were vital sign deviations beyond stability thresholds. A four-member expert committee annotated a subset of alerts (576 in training/validation set, 397 in test set) as real or artifact selected by active learning, upon which we trained machine-learning algorithms. The best model was evaluated on test set alerts to enact online alert classification over time. The Random Forest model discriminated between real and artifact as the alerts evolved online in the test set with area under the curve performance of 0.79 (95% CI, 0.67-0.93) for peripheral oximetry at the instant the vital sign first crossed threshold and increased to 0.87 (95% CI, 0.71-0.95) at 3 minutes into the alerting period. Blood pressure area under the curve started at 0.77 (95% CI, 0.64-0.95) and increased to 0.87 (95% CI, 0.71-0.98), whereas respiratory rate area under the curve started at 0.85 (95% CI, 0.77-0.95) and increased to 0.97 (95% CI, 0.94-1.00). Heart rate alerts were too few for model development. Machine-learning models can discern clinically relevant peripheral oximetry, blood pressure, and respiratory rate alerts from artifacts in an online monitoring dataset (area under the curve > 0.87).

  20. Orthogonal least squares learning algorithm for radial basis function networks

    Energy Technology Data Exchange (ETDEWEB)

    Chen, S.; Cowan, C.F.N.; Grant, P.M. (Dept. of Electrical Engineering, Univ. of Edinburgh, Mayfield Road, Edinburgh EH9 3JL, Scotland (GB))

    1991-03-01

    The radial basis function network offers a viable alternative to the two-layer neural network in many applications of signal processing. A common learning algorithm for radial basis function networks is based on first choosing randomly some data points as radial basis function centers and then using singular value decomposition to solve for the weights of the network. Such a procedure has several drawbacks and, in particular, an arbitrary selection of centers is clearly unsatisfactory. The paper proposes an alternative learning procedure based on the orthogonal least squares method. The procedure chooses radial basis function centers one by one in a rational way until an adequate network has been constructed. The algorithm has the property that each selected center maximizes the increment to the explained variance or energy of the desired output and does not suffer numerical ill-conditioning problems. The orthogonal least squares learning strategy provides a simple and efficient means for fitting radial basis function networks, and this is illustrated using examples taken from two different signal processing applications.
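
    The forward-selection idea described in this abstract can be sketched in plain Python. This is a simplified illustration under stated assumptions (one-dimensional inputs, Gaussian basis functions of a fixed width, every data point a candidate center, classical Gram-Schmidt orthogonalisation), not the authors' implementation; all names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def select_centers(xs, targets, width=1.0, tolerance=0.01):
    """Greedily pick RBF centers that explain the most output energy.

    Returns (chosen centers, fraction of output energy left unexplained).
    """
    # One candidate regressor per data point: its Gaussian response on all inputs.
    candidates = {
        i: [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for x in xs]
        for i, c in enumerate(xs)
    }
    basis, chosen = [], []
    energy = dot(targets, targets)
    unexplained = 1.0
    while candidates and unexplained > tolerance:
        best = None
        for i, regressor in candidates.items():
            # Orthogonalise the candidate against the regressors already chosen.
            w = list(regressor)
            for q in basis:
                coeff = dot(q, regressor) / dot(q, q)
                w = [wi - coeff * qi for wi, qi in zip(w, q)]
            norm = dot(w, w)
            if norm < 1e-12:
                continue  # candidate is numerically redundant
            # Error reduction ratio: share of output energy this candidate adds.
            ratio = dot(w, targets) ** 2 / (norm * energy)
            if best is None or ratio > best[0]:
                best = (ratio, i, w)
        if best is None:
            break
        ratio, index, w = best
        basis.append(w)
        chosen.append(index)
        unexplained -= ratio
        del candidates[index]
    return [xs[i] for i in chosen], unexplained


# Toy target generated by a single Gaussian bump centred at x = 2.
inputs = [0, 1, 2, 3, 4]
outputs = [math.exp(-((x - 2) ** 2) / 2) for x in inputs]
centers, remaining = select_centers(inputs, outputs)
```

    Because the toy target is itself one of the candidate regressors, a single center at x = 2 explains essentially all of the output energy and the loop stops after one selection.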

  1. Orthogonal least squares learning algorithm for radial basis function networks.

    Science.gov (United States)

    Chen, S; Cowan, C N; Grant, P M

    1991-01-01

    The radial basis function network offers a viable alternative to the two-layer neural network in many applications of signal processing. A common learning algorithm for radial basis function networks is based on first choosing randomly some data points as radial basis function centers and then using singular-value decomposition to solve for the weights of the network. Such a procedure has several drawbacks, and, in particular, an arbitrary selection of centers is clearly unsatisfactory. The authors propose an alternative learning procedure based on the orthogonal least-squares method. The procedure chooses radial basis function centers one by one in a rational way until an adequate network has been constructed. In the algorithm, each selected center maximizes the increment to the explained variance or energy of the desired output and does not suffer numerical ill-conditioning problems. The orthogonal least-squares learning strategy provides a simple and efficient means for fitting radial basis function networks. This is illustrated using examples taken from two different signal processing applications.

  2. Supervised hub-detection for brain connectivity

    Science.gov (United States)

    Kasenburg, Niklas; Liptrot, Matthew; Reislev, Nina Linde; Garde, Ellen; Nielsen, Mads; Feragen, Aasa

    2016-03-01

    A structural brain network consists of physical connections between brain regions. Brain network analysis aims to find features associated with a parameter of interest through supervised prediction models such as regression. Unsupervised preprocessing steps like clustering are often applied, but can smooth discriminative signals in the population, degrading predictive performance. We present a novel hub-detection optimized for supervised learning that both clusters network nodes based on population level variation in connectivity and also takes the learning problem into account. The found hubs are a low-dimensional representation of the network and are chosen based on predictive performance as features for a linear regression. We apply our method to the problem of finding age-related changes in structural connectivity. We compare our supervised hub-detection (SHD) to an unsupervised hub-detection and a linear regression using the original network connections as features. The results show that the SHD is able to retain regression performance, while still finding hubs that represent the underlying variation in the population. Although here we applied the SHD to brain networks, it can be applied to any network regression problem. Further development of the presented algorithm will be the extension to other predictive models such as classification or non-linear regression.

  3. Machine learning based global particle identification algorithms at LHCb experiment

    CERN Multimedia

    Derkach, Denis; Likhomanenko, Tatiana; Rogozhnikov, Aleksei; Ratnikov, Fedor

    2017-01-01

    One of the most important aspects of data processing at LHC experiments is the particle identification (PID) algorithm. In LHCb, several different sub-detector systems provide PID information: the Ring Imaging CHerenkov (RICH) detector, the hadronic and electromagnetic calorimeters, and the muon chambers. To improve charged particle identification, several neural networks including a deep architecture and gradient boosting have been applied to data. These new approaches provide higher identification efficiencies than existing implementations for all charged particle types. It is also necessary to achieve a flat dependency between efficiencies and spectator variables such as particle momentum, in order to reduce systematic uncertainties during later stages of data analysis. For this purpose, "flat" algorithms that guarantee the flatness property for efficiencies have also been developed. This talk presents this new approach based on machine learning and its performance.

  4. Prediction of Employee Turnover in Organizations using Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Rohit Punnoose

    2016-10-01

    Full Text Available Employee turnover has been identified as a key issue for organizations because of its adverse impact on workplace productivity and long-term growth strategies. To solve this problem, organizations use machine learning techniques to predict employee turnover. Accurate predictions enable organizations to take action for retention or succession planning of employees. However, the data for this modeling problem comes from HR Information Systems (HRIS); these are typically under-funded compared to the Information Systems of other domains in the organization which are directly related to its priorities. This leads to the prevalence of noise in the data, which renders predictive models prone to over-fitting and hence inaccurate. This is the key challenge that is the focus of this paper, and one that has not been addressed historically. The novel contribution of this paper is to explore the application of the Extreme Gradient Boosting (XGBoost) technique, which is more robust because of its regularization formulation. Data from the HRIS of a global retailer is used to compare XGBoost against six historically used supervised classifiers and demonstrate its significantly higher accuracy for predicting employee turnover.

  5. Semi-supervised fuzzy clustering algorithm for intrusion detection

    Institute of Scientific and Technical Information of China (English)

    杜红乐; 樊景博

    2016-01-01

    Because labeled network behavior data are harder to collect than unlabeled data, and because network data are heterogeneous (they mix numeric and symbolic attributes), this paper proposes a semi-supervised fuzzy clustering algorithm based on heterogeneous distance and sample density and applies it to network intrusion detection. The method computes the heterogeneous distance between each sample and each class, together with the sample density of each class. It then uses the heterogeneous distance and the within-class sample density to compute the fuzzy membership of each sample in each class, labels the unlabeled samples according to these memberships, and obtains the corresponding classifier. Simulation experiments on the KDD CUP99 dataset show that the method is feasible and efficient and that it improves detection accuracy.
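
    How a distance over mixed numeric and symbolic attributes can feed a fuzzy membership is sketched below. The abstract does not give the paper's formulas, so both pieces are stand-ins chosen for illustration: a range-normalised difference for numeric attributes combined with a 0/1 overlap measure for symbolic ones, and the standard fuzzy c-means membership. Sample density, which the paper also uses, is omitted, and the attribute values are hypothetical.

```python
def heterogeneous_distance(a, b, ranges):
    """Distance between two records with mixed attribute types.

    ranges[i] is the value range of numeric attribute i, or None when
    attribute i is symbolic (compared by equality only).
    """
    total = 0.0
    for x, y, value_range in zip(a, b, ranges):
        if value_range is None:
            diff = 0.0 if x == y else 1.0    # symbolic: match or mismatch
        else:
            diff = abs(x - y) / value_range  # numeric: scaled to [0, 1]
        total += diff * diff
    return total ** 0.5


def memberships(sample, centers, ranges, fuzzifier=2.0):
    """Fuzzy c-means style membership of one sample in every class center."""
    dists = [heterogeneous_distance(sample, c, ranges) for c in centers]
    zero_hits = [d == 0.0 for d in dists]
    if any(zero_hits):
        # The sample coincides with one or more centers: share membership there.
        share = 1.0 / sum(zero_hits)
        return [share if hit else 0.0 for hit in zero_hits]
    weights = [d ** (-2.0 / (fuzzifier - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical records: (connection duration, protocol).
attribute_ranges = [100, None]
class_centers = [(10, "tcp"), (20, "udp")]
u = memberships((15, "tcp"), class_centers, attribute_ranges)
predicted_class = u.index(max(u))  # class assigned to the unlabeled sample
```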

  6. Dialectical thought about the supervision of students in institutions of higher learning

    Institute of Scientific and Technical Information of China (English)

    李宜祥; 邢大伟; 沈广元

    2001-01-01

    In the context of strengthening quality-oriented education, this paper examines the supervision of students in institutions of higher learning. It discusses the dialectical relations between student supervision and institutional self-development, between behaviour management and ideological guidance, between rational persuasion and emotional influence, and between group education and work with individuals. It argues that strengthening self-cultivation, reinforcing ideological guidance, investing more emotional commitment in the work, and attending to every individual student are important means of supervising students well under the new circumstances.

  7. How to guide group to create learning-type project supervision department

    Institute of Scientific and Technical Information of China (English)

    高春玉

    2011-01-01

    This paper expounds the importance of learning on the job, introduces methods for creating a learning-oriented project supervision department, and analyzes them from three aspects, with a view to establishing and improving the learning system and effectively improving the quality of supervisors.

  8. A survey of supervised machine learning models for mobile-phone based pathogen identification and classification

    Science.gov (United States)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan

    2017-03-01

    Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time consuming, and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited-settings.

  9. Spectral Methods for Linear and Non-Linear Semi-Supervised Dimensionality Reduction

    CERN Document Server

    Chatpatanasiri, Ratthachat

    2008-01-01

    We present a general framework of spectral methods for semi-supervised dimensionality reduction. Applying an approach called manifold regularization, our framework naturally generalizes existing supervised frameworks. Furthermore, by our two semi-supervised versions of the representer theorem, our framework can be kernelized as well. Using our framework, we give three examples of semi-supervised algorithms which are extended from three recent supervised algorithms, namely, "discriminant neighborhood embedding", "marginal Fisher analysis" and "local Fisher discriminant analysis". We also give three more semi-supervised examples of the kernel versions of these algorithms. Numerical results of the six semi-supervised algorithms compared to their supervised versions are presented.

  10. Understanding Neural Networks for Machine Learning using Microsoft Neural Network Algorithm

    National Research Council Canada - National Science Library

    Nagesh Ramprasad

    2016-01-01

    .... In this research, focus is on the Microsoft Neural System Algorithm. The Microsoft Neural System Algorithm is a simple implementation of the adaptable and popular neural networks that are used in the machine learning...

  11. A study on the performance comparison of metaheuristic algorithms on the learning of neural networks

    Science.gov (United States)

    Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline

    2017-08-01

    The learning or training process of neural networks entails the task of finding the optimal set of parameters, which include translation vectors, dilation parameters, synaptic weights, and bias terms. Apart from the traditional gradient descent-based methods, metaheuristic methods can also be used for this learning purpose. Since the inception of the genetic algorithm half a century ago, the last decade has witnessed an explosion of novel metaheuristic algorithms, such as the harmony search algorithm, the bat algorithm, and the whale optimization algorithm. Despite the proof of the no free lunch theorem in the discipline of optimization, a survey of the machine learning literature gives contrasting results. Some researchers report that certain metaheuristic algorithms are superior to others, whereas others argue that different metaheuristic algorithms give comparable performance. As such, this paper aims to investigate whether a certain metaheuristic algorithm will outperform the other algorithms. In this work, three metaheuristic algorithms, namely genetic algorithms, particle swarm optimization, and the harmony search algorithm, are considered. The algorithms are incorporated in the learning of neural networks, and their classification results on the benchmark UCI machine learning data sets are compared. It is found that all three metaheuristic algorithms give similar and comparable performance, as captured in the average overall classification accuracy. The results corroborate the findings reported in the works of previous researchers. Several recommendations are given, which include the need for statistical analysis to verify the results and further theoretical work to support the obtained empirical results.
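
    As a rough illustration of metaheuristic training, the sketch below uses a basic particle swarm optimizer to fit the three parameters of a single sigmoid neuron. Everything here is an assumption made for illustration (the toy AND data set, the function names, and the inertia and acceleration constants); it is not the network architecture, data, or parameter setting used in the paper.

```python
import math
import random

# Hypothetical toy data: the logical AND of two binary inputs.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neuron_mse(params, data):
    """Mean squared error of one sigmoid neuron; params = [w1, w2, bias]."""
    total = 0.0
    for (x1, x2), target in data:
        out = sigmoid(params[0] * x1 + params[1] * x2 + params[2])
        total += (out - target) ** 2
    return total / len(data)


def pso_minimize(loss, dim, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic global-best PSO. Returns (best_position, best_loss, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [loss(p) for p in pos]       # personal best losses
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    history = [g_val]                       # global best loss per iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
        history.append(g_val)
    return g_pos, g_val, history


weights, final_loss, history = pso_minimize(lambda p: neuron_mse(p, AND_DATA), dim=3)
print(weights, final_loss)
```

    No gradient is computed: the optimizer only evaluates the loss, which is why the same loop could in principle be swapped for a genetic algorithm or harmony search when comparing metaheuristics.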

  12. Protein sequence classification with improved extreme learning machine algorithms.

    Science.gov (United States)

    Cao, Jiuwen; Xiong, Lianglin

    2014-01-01

    Precisely classifying a protein sequence from a large biological protein sequence database plays an important role in developing competitive pharmacological products. Conventional methods, which compare the unseen sequence with all identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single hidden layer feedforward networks (SLFNs). The recent efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimal pruned ELM (OP-ELM) is first employed for protein sequence classification in this paper. To further enhance the performance, an ensemble-based SLFN structure is constructed in which multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensembles. For each ensemble, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely, the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms.
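
    The majority voting step described in this abstract can be sketched as follows. The ensemble members here are plain Python callables standing in for trained SLFNs, and the tie-breaking rule (the earliest prediction among the most frequent labels wins) is an assumption; the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the earliest prediction.

    Raises ValueError if `predictions` is empty.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def ensemble_predict(models, x):
    """Collect one prediction per model and combine them by majority vote."""
    return majority_vote([model(x) for model in models])


# Hypothetical stand-ins for three trained classifiers.
models = [
    lambda seq: "kinase" if "K" in seq else "other",
    lambda seq: "kinase" if len(seq) > 3 else "other",
    lambda seq: "other",
]
print(ensemble_predict(models, "MKVL"))
```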

  13. A MULTI-AGENT LOCAL-LEARNING ALGORITHM UNDER GROUP ENVIROMENT

    Institute of Scientific and Technical Information of China (English)

    Jiang Daoping; Yin Yixin; Ban Xiaojuan; Meng Xiangsong

    2009-01-01

    In this paper, a local-learning algorithm for multi-agent systems is presented, based on the fact that an individual agent performs local perception and local interaction in a group environment. In individual learning, the agent adopts a greedy strategy to maximize its reward when interacting with the environment. In group learning, local interaction takes place between each pair of agents. A local-learning algorithm to choose and modify agents' actions is proposed to improve the traditional learning algorithm, for zero-sum games and for general-sum games with a unique equilibrium or multiple equilibria. This local-learning algorithm is proved to be convergent, and its computational complexity is lower than the Nash. Additionally, a grid-game test indicates that, with this local-learning algorithm, the local behaviors of agents can spread globally.

  14. Application of Metamorphic Testing to Supervised Classifiers

    Science.gov (United States)

    Xie, Xiaoyuan; Ho, Joshua; Kaiser, Gail; Xu, Baowen; Chen, Tsong Yueh

    2010-01-01

    Many applications in the field of scientific computing - such as computational biology, computational linguistics, and others - depend on Machine Learning algorithms to provide important core functionality to support solutions in the particular problem domains. However, it is difficult to test such applications because often there is no “test oracle” to indicate what the correct output should be for arbitrary input. To help address the quality of such software, in this paper we present a technique for testing the implementations of supervised machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called “metamorphic testing”, which has been shown to be effective in such cases. More importantly, we demonstrate that our technique not only serves the purpose of verification, but also can be applied in validation. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas outside scientific computing, as well. PMID:21243103
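
    The idea of testing a classifier without a test oracle can be sketched with a small k-nearest-neighbour implementation and two metamorphic relations: reordering the training samples, and reordering the attributes consistently in both training and query data, should leave predictions unchanged. The classifier, the toy data, and the choice of these two relations are assumptions for illustration and are not claimed to reproduce the paper's case study.

```python
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority label among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is squared Euclidean.
    """
    ranked = sorted(
        train,
        key=lambda sample: sum((a - b) ** 2 for a, b in zip(sample[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def holds_under_sample_permutation(train, queries, seed=0):
    """Relation 1: shuffling the training set must not change any prediction."""
    shuffled = train[:]
    random.Random(seed).shuffle(shuffled)
    return all(knn_predict(train, q) == knn_predict(shuffled, q) for q in queries)


def holds_under_attribute_permutation(train, queries):
    """Relation 2: reversing the attribute order everywhere must not change predictions."""
    def flip(features):
        return tuple(reversed(features))

    flipped = [(flip(f), label) for f, label in train]
    return all(knn_predict(train, q) == knn_predict(flipped, flip(q)) for q in queries)


# Hypothetical, well-separated toy data (no distance ties near the decision).
TRAIN = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.2), "A"),
         ((6.0, 6.5), "B"), ((7.0, 6.0), "B"), ((6.5, 7.5), "B")]
QUERIES = [(1.8, 1.7), (6.2, 6.1), (3.0, 3.5)]

print(holds_under_sample_permutation(TRAIN, QUERIES))
print(holds_under_attribute_permutation(TRAIN, QUERIES))
```

    Neither check needs to know the correct label for a query; a violation alone signals a defect, for example an implementation that silently depends on input order. Data sets with exact distance ties can legitimately break the first relation, so such ties need separate handling.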

  15. Learning Outlier Ensembles

    DEFF Research Database (Denmark)

    Micenková, Barbora; McWilliams, Brian; Assent, Ira

    into the existing unsupervised algorithms. In this paper, we show how to use powerful machine learning approaches to combine labeled examples together with arbitrary unsupervised outlier scoring algorithms. We aim to get the best out of the two worlds—supervised and unsupervised. Our approach is also a viable...

  16. Using Supervised Machine Learning to Classify Real Alerts and Artifact in Online Multi-signal Vital Sign Monitoring Data

    Science.gov (United States)

    Chen, Lujie; Dubrawski, Artur; Wang, Donghan; Fiterau, Madalina; Guillame-Bert, Mathieu; Bose, Eliezer; Kaynar, Ata M.; Wallace, David J.; Guttendorf, Jane; Clermont, Gilles; Pinsky, Michael R.; Hravnak, Marilyn

    2015-01-01

    OBJECTIVE: Use machine-learning (ML) algorithms to classify alerts as real or artifacts in online noninvasive vital sign (VS) data streams to reduce alarm fatigue and missed true instability. METHODS: Using a 24-bed trauma step-down unit’s noninvasive VS monitoring data (heart rate [HR], respiratory rate [RR], peripheral oximetry [SpO2]) recorded at 1/20 Hz, and noninvasive oscillometric blood pressure [BP] recorded less frequently, we partitioned data into training/validation (294 admissions; 22,980 monitoring hours) and test sets (2,057 admissions; 156,177 monitoring hours). Alerts were VS deviations beyond stability thresholds. A four-member expert committee annotated a subset of alerts (576 in training/validation set, 397 in test set), selected by active learning, as real or artifact, upon which we trained ML algorithms. The best model was evaluated on alerts in the test set to enact online alert classification as signals evolve over time. MAIN RESULTS: The Random Forest model discriminated between real and artifact as the alerts evolved online in the test set with area under the curve (AUC) performance of 0.79 (95% CI 0.67-0.93) for SpO2 at the instant the VS first crossed threshold and increased to 0.87 (95% CI 0.71-0.95) at 3 minutes into the alerting period. BP AUC started at 0.77 (95% CI 0.64-0.95) and increased to 0.87 (95% CI 0.71-0.98), while RR AUC started at 0.85 (95% CI 0.77-0.95) and increased to 0.97 (95% CI 0.94-1.00). HR alerts were too few for model development. CONCLUSIONS: ML models can discern clinically relevant SpO2, BP and RR alerts from artifacts in an online monitoring dataset (AUC>0.87). PMID:26992068
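
    The AUC figures reported above measure how well a model's scores rank real alerts above artifacts. A minimal sketch of that quantity, computed as the fraction of (real, artifact) pairs ranked correctly with ties counted as half, is given below. The function name, the label encoding (1 = real alert, 0 = artifact), and the toy scores are assumptions; this is not the study's evaluation code and it does not compute confidence intervals.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    `labels` holds 1 for positives and 0 for negatives; `scores` are the
    model outputs, where a higher score means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four alerts: two real (1) and two artifacts (0).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

    The pairwise loop is quadratic in the number of alerts, which is acceptable for a few hundred annotated alerts; a rank-based computation would be preferable for much larger sets.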

  17. Semi Supervised Weighted K-Means Clustering for Multi Class Data Classification

    Directory of Open Access Journals (Sweden)

    Vijaya Geeta Dharmavaram

    2013-01-01

    Full Text Available Supervised learning techniques require a large number of labeled examples to train a classifier model. Research on semi-supervised learning is motivated by the abundance of unlabeled examples, even in domains with a limited number of labeled examples. In such domains, a semi-supervised classifier uses the results of clustering for classifier development, since clustering does not rely only on labeled examples: it groups objects based on their similarities. In this paper, the authors propose a new algorithm for semi-supervised classification, namely Semi Supervised Weighted K-Means (SSWKM). In this algorithm, the authors suggest using a weighted Euclidean distance metric, designed for the purpose of clustering, to estimate the proximity between a pair of points, and use it to build a semi-supervised classifier. The authors propose a new approach for estimating the feature weights by appropriately adopting the results of multiple discriminant analysis. The proposed method was then tested on benchmark datasets from the UCI repository with varied percentages of labeled examples and was found to be consistent and promising.
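
    A weighted Euclidean distance of the kind described above can be sketched as follows, together with a nearest-centroid assignment that uses it. The feature weights in the example are arbitrary placeholders; in the paper they are estimated from multiple discriminant analysis, which is not reproduced here. The function names and centroids are likewise hypothetical.

```python
import math


def weighted_euclidean(x, y, weights):
    """sqrt(sum_j w_j * (x_j - y_j)^2) for non-negative feature weights w_j."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def nearest_centroid(x, centroids, weights):
    """Return the label of the centroid closest to `x` under the weighted metric.

    `centroids` maps a cluster label to its centroid coordinates.
    """
    return min(centroids, key=lambda label: weighted_euclidean(x, centroids[label], weights))


# Hypothetical centroids and weights.
centroids = {"a": (0.0, 0.0), "b": (2.0, 3.0)}
point = (0.0, 3.0)
print(nearest_centroid(point, centroids, weights=(1.0, 1.0)))  # both features count equally
print(nearest_centroid(point, centroids, weights=(1.0, 0.0)))  # second feature ignored
```

    The two calls assign the same point to different clusters, which shows why the choice of feature weights matters for the resulting classifier.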

  18. Semi-supervised Phonetic Category Learning: Does Word-level Information Enhance the Efficacy of Distributional Learning?

    Directory of Open Access Journals (Sweden)

    Till Poppels

    2014-08-01

    Full Text Available To test whether word-level information facilitates the learning of phonetic categories, 40 adult native English speakers were exposed to a bimodal distribution of vowels embedded in non-words. Half of the subjects received phonetic categories aligned with lexical categories, while the other half received no such cue. It was hypothesized that the subjects exposed to lexically-informative training stimuli that were aligned with the target categories would outperform the control subjects on a perceptual categorization task after training. While the results revealed no such group differences, the data indicated that many subjects used the relevant dimension for categorization before having received any training. Implications regarding experimental design and suggestions for future research based on the results are discussed.

  19. Implementability of Instructional Supervision as a Contemporary Educational Supervision Model in Turkish Education System

    OpenAIRE

    2012-01-01

    In this study, implementability of instructional supervision as one of contemporary educational supervision models in Turkish Education System was evaluated. Instructional supervision which aims to develop instructional processes and increase the quality of student learning based on observation of classroom activities requires collaboration among supervisors and teachers. In this literature review, significant problems have been detected due to structural organization, structural and control-...

  20. A Locality-Constrained and Label Embedding Dictionary Learning Algorithm for Image Classification.

    Science.gov (United States)

    Zhengming Li; Zhihui Lai; Yong Xu; Jian Yang; Zhang, David

    2017-02-01

    Locality and label information of training samples play an important role in image classification. However, previous dictionary learning algorithms do not take the locality and label information of atoms into account together in the learning process, and thus their performance is limited. In this paper, a discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification. First, the locality information was preserved using the graph Laplacian matrix of the learned dictionary instead of the conventional one derived from the training samples. Then, the label embedding term was constructed using the label information of atoms instead of the classification error term, which contained discriminating information of the learned dictionary. The optimal coding coefficients derived by the locality-based and label-based reconstruction were effective for image classification. Experimental results demonstrated that the LCLE-DL algorithm can achieve better performance than some state-of-the-art algorithms.
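
    For readers unfamiliar with the graph Laplacian mentioned in this abstract, the sketch below builds the standard unnormalized Laplacian L = D - W from a symmetric similarity matrix W, where D is the diagonal matrix of row sums. The similarity values are made up, and how the LCLE-DL algorithm derives W from the learned dictionary atoms is not shown here.

```python
def graph_laplacian(similarity):
    """Return the unnormalized graph Laplacian L = D - W as a list of rows.

    `similarity` is a square matrix W (list of lists); D is diagonal with
    D[i][i] equal to the sum of row i of W.
    """
    n = len(similarity)
    laplacian = []
    for i in range(n):
        degree = sum(similarity[i])
        row = [(degree if i == j else 0.0) - similarity[i][j] for j in range(n)]
        laplacian.append(row)
    return laplacian


# Hypothetical similarities between three atoms.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
for row in graph_laplacian(W):
    print(row)
```

    Each row of L sums to zero, so a regularization term built from L penalizes differences between the codes of strongly connected items.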